[jira] [Updated] (HIVE-11138) Query fails when there isn't a comparator for an operator [Spark Branch]

2015-06-28 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-11138:
--
Attachment: HIVE-11138.1-spark.patch

> Query fails when there isn't a comparator for an operator [Spark Branch]
> 
>
> Key: HIVE-11138
> URL: https://issues.apache.org/jira/browse/HIVE-11138
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-11138.1-spark.patch
>
>
> In such a case, OperatorComparatorFactory should default to false instead of 
> throwing an exception.
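A minimal sketch of the proposed default-to-false behavior. The class layout, comparator registry, and names below are illustrative stand-ins, not Hive's actual code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;

// Sketch: a comparator factory that, for an operator class with no registered
// comparator, reports "not equal" (false) instead of throwing an exception.
public class OperatorComparatorFactorySketch {
    private static final Map<String, BiPredicate<Object, Object>> COMPARATORS = new HashMap<>();

    static {
        // Only known operator types get real comparators.
        COMPARATORS.put("TableScanOperator", (a, b) -> a.equals(b));
    }

    public static boolean compare(String operatorClass, Object a, Object b) {
        BiPredicate<Object, Object> cmp = COMPARATORS.get(operatorClass);
        // Default to false for unknown operators rather than failing the query.
        return cmp != null && cmp.test(a, b);
    }

    public static void main(String[] args) {
        System.out.println(compare("TableScanOperator", "x", "x"));  // true
        System.out.println(compare("UnknownOperator", "x", "x"));    // false, no exception
    }
}
```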



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6791) Support variable substitution for Beeline shell command

2015-06-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605184#comment-14605184
 ] 

Lefty Leverenz commented on HIVE-6791:
--

Doc note:  Linking this to HIVE-10810 for documentation.

> Support variable substitution for Beeline shell command
> -
>
> Key: HIVE-6791
> URL: https://issues.apache.org/jira/browse/HIVE-6791
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI, Clients
>Affects Versions: 0.14.0
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-6791-beeline-cli.2.patch, 
> HIVE-6791-beeline-cli.patch, HIVE-6791.3-beeline-cli.patch, 
> HIVE-6791.3-beeline-cli.patch, HIVE-6791.4-beeline-cli.patch, 
> HIVE-6791.5-beeline-cli.patch
>
>
> A follow-up task from HIVE-6694. Similar to HIVE-6570.





[jira] [Commented] (HIVE-11103) Add banker's rounding BROUND UDF

2015-06-28 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605162#comment-14605162
 ] 

Alexander Pivovarov commented on HIVE-11103:


The 2 errors are unrelated to the patch.
[~jdere], could you look at this new UDF?
This UDF is useful for financial/accounting SQL projects.
People keep asking about it, e.g.:
http://www.databasejournal.com/features/mysql/rounding-down-bankers-rounding-and-random-rounding-in-mysql.html
http://www.sqlservercentral.com/Forums/Topic246556-8-1.aspx

> Add banker's rounding BROUND UDF
> 
>
> Key: HIVE-11103
> URL: https://issues.apache.org/jira/browse/HIVE-11103
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-11103.1.patch, HIVE-11103.1.patch
>
>
> Banker's rounding: the value is rounded to the nearest even number. Also 
> known as "Gaussian rounding", and, in German, "mathematische Rundung".
> Example (rounding to 2 digits):
> {code}
>  Unrounded    "Standard" rounding    "Gaussian" rounding
>    54.1754     54.18                  54.18
>   343.2050    343.21                 343.20
>  +106.2038   +106.20                +106.20
>  =========   =======                =======
>   503.5842    503.59                 503.58
> {code}
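For illustration, the JDK's BigDecimal already implements this tie-to-even rounding mode (RoundingMode.HALF_EVEN). This is a minimal sketch of the semantics, separate from the attached patch:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Banker's rounding (round-half-even) via BigDecimal: ties go to the
// nearest even digit, which avoids the systematic upward bias of HALF_UP.
public class BankersRounding {
    public static BigDecimal bround(BigDecimal value, int scale) {
        return value.setScale(scale, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        // 343.2050 is an exact tie at 2 digits: HALF_UP gives 343.21,
        // HALF_EVEN keeps the even digit and gives 343.20.
        System.out.println(bround(new BigDecimal("343.2050"), 2));  // 343.20
        // Not a tie (0.0054 > 0.005), so both modes round up.
        System.out.println(bround(new BigDecimal("54.1754"), 2));   // 54.18
    }
}
```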





[jira] [Updated] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-28 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-7150:
--
Fix Version/s: 2.0.0
   1.3.0

> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch, HIVE-7150.3.patch, 
> HIVE-7150.4.patch
>
>
> Here is the related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.
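A sketch of the usual fix pattern, try-with-resources, which closes the stream even when the load call throws. The names mirror the snippet in the issue, but the demo below reads a plain temp file instead of a real truststore, and the helper is illustrative, not the actual patch:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Try-with-resources guarantees the FileInputStream is closed on every exit
// path, including exceptions, which is what the original code was missing.
public class TrustStoreLoad {
    public static int readFirstByte(String sslTrustStorePath) throws IOException {
        try (FileInputStream in = new FileInputStream(sslTrustStorePath)) {
            // In the real code this is where sslTrustStore.load(in, ...) runs.
            return in.read();  // stream closed automatically when the block exits
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("truststore", ".jks");
        Files.write(tmp, new byte[] {42});
        System.out.println(readFirstByte(tmp.toString()));  // 42
    }
}
```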





[jira] [Commented] (HIVE-9625) Delegation tokens for HMS are not renewed

2015-06-28 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605057#comment-14605057
 ] 

Nemon Lou commented on HIVE-9625:
-

[~xuefuz], thanks for your attention. What I propose is a workaround for the 
lack of HMS token renewal (not only for HiveServer2). The solution has been 
used in our production environment and closely follows Thejas M Nair's advice:
{quote}
I think it would be better if we can renew it from a HMS client implementation 
on a failure-retry, similar to how reloginFromKeyTab was added to the client in 
HIVE-4233. This way any client of HMS could potentially benefit from this 
change. 
{quote}
Here, "any client of HMS" can be HiveServer2, WebHCat, Impala, Spark SQL, etc., 
in my opinion.
Since HIVE-9625 already has a solution accepted by the Hive community, I think 
it's OK to fix this problem without the solution I provided.
Thanks to Brock Noland and Xuefu Zhang for working on this.
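A hypothetical sketch of the quoted failure-retry idea: wrap a metastore call, and on an authentication failure renew the delegation token once and retry. TokenRenewer and AuthException here are illustrative stand-ins, not Hive classes:

```java
import java.util.concurrent.Callable;

// Renew-on-failure-retry wrapper: any HMS client could route its calls
// through this to transparently recover from an expired delegation token.
public class RetryWithRenew {
    public static class AuthException extends Exception {}

    public interface TokenRenewer {
        void renewDelegationToken();
    }

    public static <T> T call(Callable<T> op, TokenRenewer renewer) throws Exception {
        try {
            return op.call();
        } catch (AuthException e) {
            // Token may have expired: renew once, then retry the call.
            renewer.renewDelegationToken();
            return op.call();
        }
    }

    public static void main(String[] args) throws Exception {
        // A call that succeeds immediately never triggers a renewal.
        System.out.println(call(() -> "ok", () -> {}));  // ok
    }
}
```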


> Delegation tokens for HMS are not renewed
> -
>
> Key: HIVE-9625
> URL: https://issues.apache.org/jira/browse/HIVE-9625
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-9625.1.patch, HIVE-9625.1.patch, HIVE-9625.1.patch, 
> HIVE-9625.2.patch
>
>
> AFAICT the delegation tokens stored in [HiveSessionImplwithUGI 
> |https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java#L45]
>  for HMS + Impersonation are never renewed.





[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-28 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605045#comment-14605045
 ] 

xiaowei wang commented on HIVE-11095:
-

[~brocknoland]

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Fix For: 2.0.0
>
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>
> {noformat}
> The method transformTextFromUTF8 has a bug: it invokes a bad method of 
> Text, getBytes(). getBytes() returns the raw backing array, but only data up 
> to Text.length is valid. A better way is copyBytes(), which returns an array 
> precisely the length of the data. However, copyBytes() was only added after 
> Hadoop 1.
> {noformat}
> How did I find this bug?
> When I query data from an LZO table, I found in the results that the current 
> row is always longer than the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, I execute this SQL:
> {code:sql}
> select * from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row contains the 
> content of the first row.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below, just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table-creation SQL is:
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}
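The getBytes()/copyBytes() difference can be reproduced without a Hadoop dependency. MiniText below is a simplified stand-in for org.apache.hadoop.io.Text's buffer-reuse behavior, not the real class:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Text reuses a growing backing array across set() calls: getBytes() exposes
// stale trailing bytes from a previous, longer value, while copying only
// `length` bytes (what copyBytes() does) yields exactly the current value.
public class TextReuseDemo {
    static class MiniText {
        byte[] bytes = new byte[0];
        int length;

        void set(String s) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            if (utf8.length > bytes.length) {
                bytes = new byte[utf8.length];  // buffer grows but never shrinks
            }
            System.arraycopy(utf8, 0, bytes, 0, utf8.length);
            length = utf8.length;
        }

        byte[] getBytes()  { return bytes; }                          // raw backing array
        byte[] copyBytes() { return Arrays.copyOf(bytes, length); }   // only the valid data
    }

    // Write a long row, then a short one, and read the buffer back.
    public static String read(boolean useCopy) {
        MiniText t = new MiniText();
        t.set("long first row");
        t.set("short");
        byte[] b = useCopy ? t.copyBytes() : t.getBytes();
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(read(false));  // "shortfirst row" -- previous row leaks in
        System.out.println(read(true));   // "short"
    }
}
```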





[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-28 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605043#comment-14605043
 ] 

xiaowei wang commented on HIVE-10983:
-

[~brocknoland]

> SerDeUtils bug  ,when Text is reused 
> -
>
> Key: HIVE-10983
> URL: https://issues.apache.org/jira/browse/HIVE-10983
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>  Labels: patch
> Fix For: 2.0.0
>
> Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
> HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt
>
>
> {noformat}
> The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they 
> invoke a bad method of Text, getBytes(). getBytes() returns the raw backing 
> array, but only data up to Text.length is valid. A better way is copyBytes(), 
> which returns an array precisely the length of the data. However, copyBytes() 
> was only added after Hadoop 1.
> {noformat}
> When I query data from an LZO table, I found in the results that the current 
> row is always longer than the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, I execute this SQL:
> {code:sql}
> select *   from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row contains the 
> content of the first row.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below, just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table-creation SQL is:
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
>   `line` string)
> PARTITIONED BY (
>   `logdate` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
>   'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
>   OUTPUTFORMAT 
> "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-28 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605039#comment-14605039
 ] 

xiaowei wang commented on HIVE-11095:
-

Thank you for the suggestion, [~sushant.patil]! This bug affects 0.14, 1.0, 1.1, and 1.2.

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Fix For: 2.0.0
>
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>
> {noformat}
> The method transformTextFromUTF8 has a bug: it invokes a bad method of 
> Text, getBytes(). getBytes() returns the raw backing array, but only data up 
> to Text.length is valid. A better way is copyBytes(), which returns an array 
> precisely the length of the data. However, copyBytes() was only added after 
> Hadoop 1.
> {noformat}
> How did I find this bug?
> When I query data from an LZO table, I found in the results that the current 
> row is always longer than the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, I execute this SQL:
> {code:sql}
> select * from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row contains the 
> content of the first row.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below, just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table-creation SQL is:
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-28 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605038#comment-14605038
 ] 

xiaowei wang commented on HIVE-10983:
-

Thank you for the suggestion, [~sushant.patil]! This bug affects 0.14, 1.0, 1.1, and 1.2.

> SerDeUtils bug  ,when Text is reused 
> -
>
> Key: HIVE-10983
> URL: https://issues.apache.org/jira/browse/HIVE-10983
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>  Labels: patch
> Fix For: 2.0.0
>
> Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
> HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt
>
>
> {noformat}
> The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they 
> invoke a bad method of Text, getBytes(). getBytes() returns the raw backing 
> array, but only data up to Text.length is valid. A better way is copyBytes(), 
> which returns an array precisely the length of the data. However, copyBytes() 
> was only added after Hadoop 1.
> {noformat}
> When I query data from an LZO table, I found in the results that the current 
> row is always longer than the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, I execute this SQL:
> {code:sql}
> select *   from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row contains the 
> content of the first row.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below, just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table-creation SQL is:
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
>   `line` string)
> PARTITIONED BY (
>   `logdate` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
>   'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
>   OUTPUTFORMAT 
> "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Updated] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-28 Thread xiaowei wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaowei wang updated HIVE-10983:

Fix Version/s: 2.0.0

> SerDeUtils bug  ,when Text is reused 
> -
>
> Key: HIVE-10983
> URL: https://issues.apache.org/jira/browse/HIVE-10983
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>  Labels: patch
> Fix For: 2.0.0
>
> Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
> HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt
>
>
> {noformat}
> The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they 
> invoke a bad method of Text, getBytes(). getBytes() returns the raw backing 
> array, but only data up to Text.length is valid. A better way is copyBytes(), 
> which returns an array precisely the length of the data. However, copyBytes() 
> was only added after Hadoop 1.
> {noformat}
> When I query data from an LZO table, I found in the results that the current 
> row is always longer than the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, I execute this SQL:
> {code:sql}
> select *   from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row contains the 
> content of the first row.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below, just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table-creation SQL is:
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
>   `line` string)
> PARTITIONED BY (
>   `logdate` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
>   'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
>   OUTPUTFORMAT 
> "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Updated] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-28 Thread xiaowei wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaowei wang updated HIVE-11095:

Fix Version/s: 2.0.0

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Fix For: 2.0.0
>
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>
> {noformat}
> The method transformTextFromUTF8 has a bug: it invokes a bad method of 
> Text, getBytes(). getBytes() returns the raw backing array, but only data up 
> to Text.length is valid. A better way is copyBytes(), which returns an array 
> precisely the length of the data. However, copyBytes() was only added after 
> Hadoop 1.
> {noformat}
> How did I find this bug?
> When I query data from an LZO table, I found in the results that the current 
> row is always longer than the previous row, and sometimes the current row 
> contains the contents of the previous row. For example, I execute this SQL:
> {code:sql}
> select * from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row contains the 
> content of the first row.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below, just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table-creation SQL is:
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Commented] (HIVE-9823) Load spark-defaults.conf from classpath [Spark Branch]

2015-06-28 Thread JoneZhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605031#comment-14605031
 ] 

JoneZhang commented on HIVE-9823:
-

Hi, Xuefu Zhang,
There is a sentence in the 
wiki(https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
 :  "Configure Spark-application configs for Hive.  See: 
http://spark.apache.org/docs/latest/configuration.html.  This can be done 
either by adding a file "spark-defaults.conf" with these properties to the Hive 
classpath...".

According to this issue, it's not necessary to do that manually.
Is that so?
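For reference, a minimal spark-defaults.conf of the kind the wiki describes. The property names are standard Spark configs; the values are illustrative only and depend on the cluster:

```properties
# Example spark-defaults.conf placed on the Hive classpath.
spark.master              yarn
spark.executor.memory     4g
spark.executor.cores      2
spark.serializer          org.apache.spark.serializer.KryoSerializer
```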

> Load spark-defaults.conf from classpath [Spark Branch]
> --
>
> Key: HIVE-9823
> URL: https://issues.apache.org/jira/browse/HIVE-9823
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.2.0
>
> Attachments: HIVE-9823.1-spark.patch, HIVE-9823.2-spark.patch, 
> HIVE-9823.3-spark.patch
>
>






[jira] [Commented] (HIVE-11010) Accumulo storage handler queries via HS2 fail

2015-06-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605019#comment-14605019
 ] 

Sushanth Sowmyan commented on HIVE-11010:
-

Removing fix version of 1.2.1 since this is not part of the already-released 
1.2.1 release. Please set appropriate commit version when this fix is committed.

> Accumulo storage handler queries via HS2 fail
> -
>
> Key: HIVE-11010
> URL: https://issues.apache.org/jira/browse/HIVE-11010
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0, 1.2.1
> Environment: Secure
>Reporter: Takahiko Saito
>Assignee: Josh Elser
>
> On a Kerberized cluster, the Accumulo storage handler throws an error: 
> "[username]@[principal name] is not allowed to impersonate [username]" 





[jira] [Comment Edited] (HIVE-10792) PPD leads to wrong answer when mapper scans the same table with multiple aliases

2015-06-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605017#comment-14605017
 ] 

Sushanth Sowmyan edited comment on HIVE-10792 at 6/29/15 1:15 AM:
--

Removing fix version of 1.2.1 since this is not part of the already-released 
1.2.1 release. Please set appropriate commit version when this fix is committed.


was (Author: sushanth):
Removing fix version of 1.2.1 since this is not part of the already-released 
1.2.` release. Please set appropriate commit version when this fix is committed.

> PPD leads to wrong answer when mapper scans the same table with multiple 
> aliases
> 
>
> Key: HIVE-10792
> URL: https://issues.apache.org/jira/browse/HIVE-10792
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Query Processor
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Dayue Gao
>Assignee: Dayue Gao
>Priority: Critical
> Attachments: HIVE-10792.1.patch, HIVE-10792.2.patch, 
> HIVE-10792.test.sql
>
>
> Here are the steps to reproduce the bug.
> First of all, prepare a simple ORC table with one row:
> {code}
> create table test_orc (c0 int, c1 int) stored as ORC;
> {code}
> Table: test_orc
> ||c0||c1||
> |0|1|
> The following SQL gets an empty result, which is not expected:
> {code}
> select * from test_orc t1
> union all
> select * from test_orc t2
> where t2.c0 = 1
> {code}
> Self-join is also broken:
> {code}
> set hive.auto.convert.join=false; -- force common join
> select * from test_orc t1
> left outer join test_orc t2 on (t1.c0=t2.c0 and t2.c1=0);
> {code}
> It gets an empty result, while the expected answer is:
> ||t1.c0||t1.c1||t2.c0||t2.c1||
> |0|1|NULL|NULL|
> In these cases, we push down predicates into OrcInputFormat. As a result, 
> the TableScanOperator for "t1" can't receive its rows.





[jira] [Updated] (HIVE-11010) Accumulo storage handler queries via HS2 fail

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11010:

Fix Version/s: (was: 1.2.1)

> Accumulo storage handler queries via HS2 fail
> -
>
> Key: HIVE-11010
> URL: https://issues.apache.org/jira/browse/HIVE-11010
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0, 1.2.1
> Environment: Secure
>Reporter: Takahiko Saito
>Assignee: Josh Elser
>
> On Kerberized cluster, accumulo storage handler throws an error, 
> "[usrname]@[principlaname] is not allowed to impersonate [username]" 





[jira] [Comment Edited] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.

2015-06-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605018#comment-14605018
 ] 

Sushanth Sowmyan edited comment on HIVE-4577 at 6/29/15 1:15 AM:
-

Removing fix version of 1.2.1 since this is not part of the already-released 
1.2.1 release. Please set appropriate commit version when this fix is committed.


was (Author: sushanth):
Removing fix version of 1.2.1 since this is not part of the already-released 
1.2.` release. Please set appropriate commit version when this fix is committed.

> hive CLI can't handle hadoop dfs command  with space and quotes.
> 
>
> Key: HIVE-4577
> URL: https://issues.apache.org/jira/browse/HIVE-4577
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0, 0.10.0, 0.14.0, 0.13.1, 1.2.0, 1.1.0
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch, 
> HIVE-4577.3.patch.txt, HIVE-4577.4.patch
>
>
> By design, Hive supports the hadoop dfs command in the Hive shell, like 
> hive> dfs -mkdir /user/biadmin/mydir;
> but it behaves differently from Hadoop if the path contains spaces or quotes:
> hive> dfs -mkdir "hello"; 
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:40 
> /user/biadmin/"hello"
> hive> dfs -mkdir 'world';
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:43 
> /user/biadmin/'world'
> hive> dfs -mkdir "bei jing";
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/"bei
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/jing"





[jira] [Updated] (HIVE-10792) PPD leads to wrong answer when mapper scans the same table with multiple aliases

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10792:

Fix Version/s: (was: 1.2.1)

> PPD leads to wrong answer when mapper scans the same table with multiple 
> aliases
> 
>
> Key: HIVE-10792
> URL: https://issues.apache.org/jira/browse/HIVE-10792
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Query Processor
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Dayue Gao
>Assignee: Dayue Gao
>Priority: Critical
> Attachments: HIVE-10792.1.patch, HIVE-10792.2.patch, 
> HIVE-10792.test.sql
>
>
> Here are the steps to reproduce the bug.
> First of all, prepare a simple ORC table with one row:
> {code}
> create table test_orc (c0 int, c1 int) stored as ORC;
> {code}
> Table: test_orc
> ||c0||c1||
> |0|1|
> The following SQL gets an empty result, which is not expected:
> {code}
> select * from test_orc t1
> union all
> select * from test_orc t2
> where t2.c0 = 1
> {code}
> Self-join is also broken:
> {code}
> set hive.auto.convert.join=false; -- force common join
> select * from test_orc t1
> left outer join test_orc t2 on (t1.c0=t2.c0 and t2.c1=0);
> {code}
> It gets an empty result, while the expected answer is:
> ||t1.c0||t1.c1||t2.c0||t2.c1||
> |0|1|NULL|NULL|
> In these cases, we push down predicates into OrcInputFormat. As a result, 
> the TableScanOperator for "t1" can't receive its rows.





[jira] [Commented] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.

2015-06-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605018#comment-14605018
 ] 

Sushanth Sowmyan commented on HIVE-4577:


Removing fix version of 1.2.1 since this is not part of the already-released 
1.2.1 release. Please set the appropriate commit version when this fix is committed.

> hive CLI can't handle hadoop dfs command  with space and quotes.
> 
>
> Key: HIVE-4577
> URL: https://issues.apache.org/jira/browse/HIVE-4577
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0, 0.10.0, 0.14.0, 0.13.1, 1.2.0, 1.1.0
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch, 
> HIVE-4577.3.patch.txt, HIVE-4577.4.patch
>
>
> As designed, Hive supports running hadoop dfs commands in the hive shell, like 
> hive> dfs -mkdir /user/biadmin/mydir;
> but it behaves differently from hadoop when the path contains spaces and quotes:
> hive> dfs -mkdir "hello"; 
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:40 
> /user/biadmin/"hello"
> hive> dfs -mkdir 'world';
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:43 
> /user/biadmin/'world'
> hive> dfs -mkdir "bei jing";
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/"bei
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/jing"
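A quote-aware tokenizer is the usual fix for this class of bug. The standalone sketch below is a hypothetical class, not the attached patch; it shows the expected shell-like behavior, where quotes group words and are then stripped.

```java
import java.util.ArrayList;
import java.util.List;

public class DfsArgSplitter {
    // Splits a command line into arguments, honoring single and double
    // quotes the way a shell would (quotes group words, then are removed).
    static List<String> split(String cmd) {
        List<String> args = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        char quote = 0;                 // 0 = not inside a quoted region
        boolean inToken = false;
        for (char c : cmd.toCharArray()) {
            if (quote != 0) {               // inside quotes: only the matching
                if (c == quote) quote = 0;  // quote character ends the region
                else cur.append(c);
            } else if (c == '"' || c == '\'') {
                quote = c; inToken = true;  // start quoted region, drop the quote
            } else if (Character.isWhitespace(c)) {
                if (inToken) { args.add(cur.toString()); cur.setLength(0); inToken = false; }
            } else {
                cur.append(c); inToken = true;
            }
        }
        if (inToken) args.add(cur.toString());
        return args;
    }

    public static void main(String[] args) {
        // "bei jing" stays one argument and loses its quotes, matching hadoop.
        System.out.println(split("dfs -mkdir \"bei jing\""));
    }
}
```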





[jira] [Commented] (HIVE-10792) PPD leads to wrong answer when mapper scans the same table with multiple aliases

2015-06-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605017#comment-14605017
 ] 

Sushanth Sowmyan commented on HIVE-10792:
-

Removing fix version of 1.2.1 since this is not part of the already-released 
1.2.1 release. Please set appropriate commit version when this fix is committed.

> PPD leads to wrong answer when mapper scans the same table with multiple 
> aliases
> 
>
> Key: HIVE-10792
> URL: https://issues.apache.org/jira/browse/HIVE-10792
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Query Processor
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: Dayue Gao
>Assignee: Dayue Gao
>Priority: Critical
> Attachments: HIVE-10792.1.patch, HIVE-10792.2.patch, 
> HIVE-10792.test.sql
>
>
> Here are the steps to reproduce the bug.
> First of all, prepare a simple ORC table with one row:
> {code}
> create table test_orc (c0 int, c1 int) stored as ORC;
> {code}
> Table: test_orc
> ||c0||c1||
> |0|1|
> The following SQL gets an empty result, which is not expected:
> {code}
> select * from test_orc t1
> union all
> select * from test_orc t2
> where t2.c0 = 1
> {code}
> Self join is also broken
> {code}
> set hive.auto.convert.join=false; -- force common join
> select * from test_orc t1
> left outer join test_orc t2 on (t1.c0=t2.c0 and t2.c1=0);
> {code}
> It gets an empty result, while the expected answer is:
> ||t1.c0||t1.c1||t2.c0||t2.c1||
> |0|1|NULL|NULL|
> In these cases, we push down predicates into OrcInputFormat. As a result, the 
> TableScanOperator for "t1" can't receive its rows.





[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605015#comment-14605015
 ] 

Sushanth Sowmyan commented on HIVE-10983:
-

Removing fix versions of 1.2.0 & 0.14.1 since this is not part of the 
already-released 1.2.0 and 0.14.1 releases. Please set the appropriate commit 
version (as the version this fix goes into) when this fix is committed.

> SerDeUtils bug  ,when Text is reused 
> -
>
> Key: HIVE-10983
> URL: https://issues.apache.org/jira/browse/HIVE-10983
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>  Labels: patch
> Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
> HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt
>
>
> {noformat}
> The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they 
> invoke a problematic method of Text, getBytes().
> getBytes() returns the raw backing array; however, only data up to 
> Text.length is valid. A better way is to use copyBytes() if you need the 
> returned array to be precisely the length of the data.
> But copyBytes() was only added after hadoop1. 
> {noformat}
> When I query data from an LZO table, I found that in the results the length of 
> the current row is always larger than that of the previous row, and sometimes 
> the current row contains the contents of the previous row. For example, I executed this SQL:
> {code:sql}
> select *   from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row's content contains the 
> first row's content.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below; it has just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table's create SQL is: 
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
>   `line` string)
> PARTITIONED BY (
>   `logdate` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
>   'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
>   OUTPUTFORMAT 
> "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}
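The hazard is easy to reproduce with a minimal stand-in for Hadoop's Text (the class below is hypothetical, written only to mimic the reuse behavior): after the buffer is reused for a shorter record, getBytes() still exposes stale trailing bytes, while copyBytes() returns exactly the valid data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Minimal stand-in for Hadoop's Text: a reusable buffer that grows but
// never shrinks, so getBytes() can expose stale trailing bytes.
public class ReusableText {
    private byte[] bytes = new byte[0];
    private int length = 0;

    public void set(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > bytes.length) bytes = new byte[utf8.length];
        System.arraycopy(utf8, 0, bytes, 0, utf8.length);
        length = utf8.length;            // old bytes past 'length' remain!
    }

    public byte[] getBytes()  { return bytes; }                        // raw buffer
    public byte[] copyBytes() { return Arrays.copyOf(bytes, length); } // exact length

    public static void main(String[] args) {
        ReusableText t = new ReusableText();
        t.set("a long first record");
        t.set("short");                  // reuse for a shorter record
        // getBytes() yields "short" followed by the tail of the first record,
        // which is exactly the row-bleeding symptom described above.
        System.out.println("getBytes():  " + new String(t.getBytes(), StandardCharsets.UTF_8));
        System.out.println("copyBytes(): " + new String(t.copyBytes(), StandardCharsets.UTF_8));
    }
}
```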





[jira] [Updated] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4577:
---
Fix Version/s: (was: 1.2.1)

> hive CLI can't handle hadoop dfs command  with space and quotes.
> 
>
> Key: HIVE-4577
> URL: https://issues.apache.org/jira/browse/HIVE-4577
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0, 0.10.0, 0.14.0, 0.13.1, 1.2.0, 1.1.0
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch, 
> HIVE-4577.3.patch.txt, HIVE-4577.4.patch
>
>
> As designed, Hive supports running hadoop dfs commands in the hive shell, like 
> hive> dfs -mkdir /user/biadmin/mydir;
> but it behaves differently from hadoop when the path contains spaces and quotes:
> hive> dfs -mkdir "hello"; 
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:40 
> /user/biadmin/"hello"
> hive> dfs -mkdir 'world';
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:43 
> /user/biadmin/'world'
> hive> dfs -mkdir "bei jing";
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/"bei
> drwxr-xr-x   - biadmin supergroup  0 2013-04-23 09:44 
> /user/biadmin/jing"





[jira] [Updated] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10983:

Fix Version/s: (was: 1.2.0)
   (was: 0.14.1)

> SerDeUtils bug  ,when Text is reused 
> -
>
> Key: HIVE-10983
> URL: https://issues.apache.org/jira/browse/HIVE-10983
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>  Labels: patch
> Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
> HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt
>
>
> {noformat}
> The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they 
> invoke a problematic method of Text, getBytes().
> getBytes() returns the raw backing array; however, only data up to 
> Text.length is valid. A better way is to use copyBytes() if you need the 
> returned array to be precisely the length of the data.
> But copyBytes() was only added after hadoop1. 
> {noformat}
> When I query data from an LZO table, I found that in the results the length of 
> the current row is always larger than that of the previous row, and sometimes 
> the current row contains the contents of the previous row. For example, I executed this SQL:
> {code:sql}
> select *   from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row's content contains the 
> first row's content.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below; it has just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table's create SQL is: 
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
>   `line` string)
> PARTITIONED BY (
>   `logdate` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
>   'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
>   OUTPUTFORMAT 
> "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605013#comment-14605013
 ] 

Sushanth Sowmyan commented on HIVE-11095:
-

Removing fix version of 1.2.0 since this is not part of the already-released 
1.2.0 release. Please set appropriate commit version when this fix is committed.

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>
> {noformat}
> The method transformTextFromUTF8 has a bug: it invokes a problematic method of 
> Text, getBytes().
> getBytes() returns the raw backing array; however, only data up to 
> Text.length is valid. A better way is to use copyBytes() if you need the 
> returned array to be precisely the length of the data.
> But copyBytes() was only added after hadoop1. 
> {noformat}
> How did I find this bug?
> When I query data from an LZO table, I found that in the results the length of 
> the current row is always larger than that of the previous row, and sometimes 
> the current row contains the contents of the previous row. For example, I executed this SQL:
> {code:sql}
> select * from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row's content contains the 
> first row's content.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below; it has just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table's create SQL is: 
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Updated] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11095:

Fix Version/s: (was: 1.2.0)

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>
> {noformat}
> The method transformTextFromUTF8 has a bug: it invokes a problematic method of 
> Text, getBytes().
> getBytes() returns the raw backing array; however, only data up to 
> Text.length is valid. A better way is to use copyBytes() if you need the 
> returned array to be precisely the length of the data.
> But copyBytes() was only added after hadoop1. 
> {noformat}
> How did I find this bug?
> When I query data from an LZO table, I found that in the results the length of 
> the current row is always larger than that of the previous row, and sometimes 
> the current row contains the contents of the previous row. For example, I executed this SQL:
> {code:sql}
> select * from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row's content contains the 
> first row's content.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below; it has just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by Text reuse, and I found a solution.
> Additionally, the table's create SQL is: 
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Updated] (HIVE-11048) Make test cbo_windowing robust

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11048:

Fix Version/s: 1.2.2

> Make test cbo_windowing robust
> --
>
> Key: HIVE-11048
> URL: https://issues.apache.org/jira/browse/HIVE-11048
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 1.2.2
>
> Attachments: HIVE-11048.patch
>
>
> Add partition / order by in over clause to make result set deterministic.





[jira] [Updated] (HIVE-11050) testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data creation queries

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11050:

Fix Version/s: (was: 1.2.1)
   1.2.2

> testCliDriver_vector_outer_join.* failures in Unit tests due to unstable data 
> creation queries
> --
>
> Key: HIVE-11050
> URL: https://issues.apache.org/jira/browse/HIVE-11050
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Blocker
> Fix For: 1.2.2
>
> Attachments: HIVE-11050.01.branch-1.patch, HIVE-11050.01.patch
>
>
> In some environments the Q file tests vector_outer_join\{1-4\}.q fail because 
> the data creation queries produce different input files.





[jira] [Updated] (HIVE-11066) Ensure tests don't share directories on FS

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11066:

Fix Version/s: (was: 1.2.1)
   1.2.2

> Ensure tests don't share directories on FS
> --
>
> Key: HIVE-11066
> URL: https://issues.apache.org/jira/browse/HIVE-11066
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.2.2
>
> Attachments: HIVE-11066.patch
>
>
> Tests often fail with errors like
> "Could not fully delete 
> D:\w\hv\hcatalog\hcatalog-pig-adapter\target\tmp\dfs\name1" on Windows 
> platforms.
> Attached is a prototype on avoiding these false negatives.





[jira] [Updated] (HIVE-11060) Make test windowing.q robust

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11060:

Fix Version/s: 1.2.2

> Make test windowing.q robust
> 
>
> Key: HIVE-11060
> URL: https://issues.apache.org/jira/browse/HIVE-11060
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.0.0, 1.2.2
>
> Attachments: HIVE-11060.01.patch, HIVE-11060.patch
>
>
> Add partition / order by in over clause to make result set deterministic.





[jira] [Updated] (HIVE-11059) hcatalog-server-extensions tests scope should depend on hive-exec

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11059:

Fix Version/s: (was: 1.2.1)
   1.2.2

> hcatalog-server-extensions tests scope should depend on hive-exec
> -
>
> Key: HIVE-11059
> URL: https://issues.apache.org/jira/browse/HIVE-11059
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.1
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>Priority: Minor
> Fix For: 1.2.2
>
> Attachments: HIVE-11059.patch
>
>
> (causes test failures in Windows due to the lack of WindowsPathUtil being 
> available otherwise)





[jira] [Updated] (HIVE-11076) Explicitly set hive.cbo.enable=true for some tests

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11076:

Fix Version/s: 1.2.2

> Explicitly set hive.cbo.enable=true for some tests
> --
>
> Key: HIVE-11076
> URL: https://issues.apache.org/jira/browse/HIVE-11076
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0, 1.2.2
>
> Attachments: HIVE-11076.01.patch, HIVE-11076.02.patch
>
>






[jira] [Updated] (HIVE-11083) Make test cbo_windowing robust

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11083:

Fix Version/s: 1.2.2

> Make test cbo_windowing robust
> --
>
> Key: HIVE-11083
> URL: https://issues.apache.org/jira/browse/HIVE-11083
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 2.0.0, 1.2.2
>
> Attachments: HIVE-11083.patch
>
>
> Make result set deterministic.





[jira] [Updated] (HIVE-11074) Update tests for HIVE-9302 after removing binaries

2015-06-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-11074:

Fix Version/s: (was: 1.2.1)
   1.2.2

> Update tests for HIVE-9302 after removing binaries
> --
>
> Key: HIVE-11074
> URL: https://issues.apache.org/jira/browse/HIVE-11074
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 1.2.2
>
> Attachments: HIVE-11074.patch
>
>






[jira] [Commented] (HIVE-10754) new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog

2015-06-28 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604994#comment-14604994
 ] 

Chaoyu Tang commented on HIVE-10754:


[~aihuaxu] Could you elaborate on what exactly the issue is that this patch is 
going to fix? Both Hive 2.0.0 and 1.3.0 seem to use Hadoop 2.6. Thanks.

> new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog
> -
>
> Key: HIVE-10754
> URL: https://issues.apache.org/jira/browse/HIVE-10754
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-10754.patch
>
>
> Replace all the deprecated new Job() with Job.getInstance() in HCatalog.





[jira] [Commented] (HIVE-11122) ORC should not record the timezone information when there are no timestamp columns

2015-06-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604953#comment-14604953
 ] 

Hive QA commented on HIVE-11122:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742421/HIVE-11122.2.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8990 tests executed
*Failed tests:*
{noformat}
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4429/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4429/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4429/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742421 - PreCommit-HIVE-TRUNK-Build

> ORC should not record the timezone information when there are no timestamp 
> columns
> --
>
> Key: HIVE-11122
> URL: https://issues.apache.org/jira/browse/HIVE-11122
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11122.1.patch, HIVE-11122.2.patch, HIVE-11122.patch
>
>
> Currently ORC records the time zone information in the stripe footer even 
> when there are no timestamp columns. This will not only add to the size of 
> the footer but also can cause inconsistencies (file size difference) in test 
> cases when run under different time zones.
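A sketch of the proposed behavior (hypothetical names; ORC's real footer writer is protobuf-based and not reproduced here): record the writer timezone only when the schema actually contains a timestamp column.

```java
import java.util.List;
import java.util.TimeZone;

public class StripeFooterSketch {
    // Write the writer's timezone into the stripe footer only when the
    // schema contains a TIMESTAMP column. Types are plain strings here,
    // a stand-in for ORC's real type descriptions.
    static String buildFooter(List<String> columnTypes) {
        boolean hasTimestamp = columnTypes.stream()
                .anyMatch(t -> t.equalsIgnoreCase("timestamp"));
        StringBuilder footer = new StringBuilder("stripe-footer");
        if (hasTimestamp) {
            footer.append(";tz=").append(TimeZone.getDefault().getID());
        }
        return footer.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildFooter(List.of("int", "string")));    // no tz recorded
        System.out.println(buildFooter(List.of("int", "timestamp"))); // tz recorded
    }
}
```

With the guard in place, files without timestamp columns come out byte-identical regardless of the writer's time zone, which is exactly what the flaky tests need.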





[jira] [Commented] (HIVE-6791) Support variable substition for Beeline shell command

2015-06-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604950#comment-14604950
 ] 

Xuefu Zhang commented on HIVE-6791:
---

+1

> Support variable substition for Beeline shell command
> -
>
> Key: HIVE-6791
> URL: https://issues.apache.org/jira/browse/HIVE-6791
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI, Clients
>Affects Versions: 0.14.0
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-6791-beeline-cli.2.patch, 
> HIVE-6791-beeline-cli.patch, HIVE-6791.3-beeline-cli.patch, 
> HIVE-6791.3-beeline-cli.patch, HIVE-6791.4-beeline-cli.patch, 
> HIVE-6791.5-beeline-cli.patch
>
>
> A follow-up task from HIVE-6694. Similar to HIVE-6570.





[jira] [Updated] (HIVE-11122) ORC should not record the timezone information when there are no timestamp columns

2015-06-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11122:
-
Attachment: (was: HIVE-11122.2.patch)

> ORC should not record the timezone information when there are no timestamp 
> columns
> --
>
> Key: HIVE-11122
> URL: https://issues.apache.org/jira/browse/HIVE-11122
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11122.1.patch, HIVE-11122.2.patch, HIVE-11122.patch
>
>
> Currently ORC records the time zone information in the stripe footer even 
> when there are no timestamp columns. This will not only add to the size of 
> the footer but also can cause inconsistencies (file size difference) in test 
> cases when run under different time zones.





[jira] [Updated] (HIVE-11122) ORC should not record the timezone information when there are no timestamp columns

2015-06-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11122:
-
Attachment: HIVE-11122.2.patch

Previous patch had some stray characters.

> ORC should not record the timezone information when there are no timestamp 
> columns
> --
>
> Key: HIVE-11122
> URL: https://issues.apache.org/jira/browse/HIVE-11122
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11122.1.patch, HIVE-11122.2.patch, HIVE-11122.patch
>
>
> Currently ORC records the time zone information in the stripe footer even 
> when there are no timestamp columns. This will not only add to the size of 
> the footer but also can cause inconsistencies (file size difference) in test 
> cases when run under different time zones.





[jira] [Commented] (HIVE-11122) ORC should not record the timezone information when there are no timestamp columns

2015-06-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604901#comment-14604901
 ] 

Hive QA commented on HIVE-11122:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742419/HIVE-11122.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4428/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4428/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4428/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
spark-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar
 to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Query Language 2.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-exec ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
Generating vector expression code
Generating vector expression test code
[INFO] Executed tasks
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec ---
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java
 added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean
 added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java
 added.
[INFO] 
[INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec ---
[INFO] ANTLR: Processing source directory 
/data/hive-ptest/working/apache-github-source-source/ql/src/java
ANTLR Parser Generator  Version 3.4
org/apache/hadoop/hive/ql/parse/HiveLexer.g
org/apache/hadoop/hive/ql/parse/HiveParser.g
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_MAP" using 
multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_SELECT" 
using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_SORT KW_BY" using 
multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_MAP LPAREN" using 
multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_DISTRIBUTE KW_BY" 
using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_ALL" using 
multi

[jira] [Commented] (HIVE-11043) ORC split strategies should adapt based on number of files

2015-06-28 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604902#comment-14604902
 ] 

Gopal V commented on HIVE-11043:


[~leftylev]: yes, it needs doc - I will write up a "decision" tree of the 
hybrid strategy for the docs.
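Pending the promised write-up, the decision could be sketched roughly as follows. The strategy names (BI, ETL) come from HIVE-10114; the thresholds and branch logic here are invented purely for illustration and are not the patch's actual rules:

```python
def choose_split_strategy(file_sizes, avg_threshold=32 * 1024 * 1024,
                          count_threshold=1000):
    """Pick an ORC split strategy from file statistics.

    Hypothetical rule: many small files -> BI (skip per-file footer reads);
    otherwise ETL (read footers for precise split boundaries).
    """
    if not file_sizes:
        return "ETL"
    avg = sum(file_sizes) / len(file_sizes)
    if avg < avg_threshold and len(file_sizes) >= count_threshold:
        return "BI"
    return "ETL"
```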

> ORC split strategies should adapt based on number of files
> --
>
> Key: HIVE-11043
> URL: https://issues.apache.org/jira/browse/HIVE-11043
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Fix For: 2.0.0
>
> Attachments: HIVE-11043.1.patch, HIVE-11043.2.patch, 
> HIVE-11043.3.patch
>
>
> ORC split strategies added in HIVE-10114 chose strategies based on average 
> file size. It would be beneficial to choose a different strategy based on 
> number of files as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11090) ordering issues with windows unit test runs

2015-06-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604894#comment-14604894
 ] 

Lefty Leverenz commented on HIVE-11090:
---

Nudge:  This was committed to branch-1 (1.3.0) and master (2.0.0) so the 
Status, Resolution, and Fix Version need to be updated.

Commits 440c91c979226ddc970536f70ff0769c651483c1 & 
63deec40731c709f84b23525dc68a7cec3307052.

> ordering issues with windows unit test runs
> ---
>
> Key: HIVE-11090
> URL: https://issues.apache.org/jira/browse/HIVE-11090
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-11090.01.patch, HIVE-11090.02.patch
>
>






[jira] [Commented] (HIVE-11051) Hive 1.2.0 MapJoin w/Tez - LazyBinaryArray cannot be cast to [Ljava.lang.Object;

2015-06-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604883#comment-14604883
 ] 

Lefty Leverenz commented on HIVE-11051:
---

Nudge:  This was committed to branch-1 (1.3.0) and master (2.0.0) so the 
Status, Resolution, and Fix Version need to be updated.

Commits 5351c35bffa251ba17de22bcd5ef0b9b06d134c9 & 
2a77e87e347d368a806c53df5f5ac709339a47bc.

> Hive 1.2.0  MapJoin w/Tez - LazyBinaryArray cannot be cast to 
> [Ljava.lang.Object;
> -
>
> Key: HIVE-11051
> URL: https://issues.apache.org/jira/browse/HIVE-11051
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers, Tez
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Greg Senia
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11051.01.patch, HIVE-11051.02.patch, 
> problem_table_joins.tar.gz
>
>
> I tried to apply: HIVE-10729 which did not solve the issue.
> The following exception is thrown on a Tez MapJoin with Hive 1.2.0 and Tez 
> 0.5.4/0.5.3
> {code}
> Status: Running (Executing on YARN cluster with App id 
> application_1434641270368_1038)
> 
> VERTICES  STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
> KILLED
> 
> Map 1 ..   SUCCEEDED  3  300   0  
>  0
> Map 2 ... FAILED  3  102   7  
>  0
> 
> VERTICES: 01/02  [=>>-] 66%   ELAPSED TIME: 7.39 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1434641270368_1038_2_01, 
> diagnostics=[Task failed, taskId=task_1434641270368_1038_2_01_02, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum_reason_cd":"98","sum_reason":"Exclude","crsr_master_claim_index":null,"svcrqct_cds":["
>"],"svcrqst_lupdt":"2015-04-23 
> 22:14:01.288132","crsr_lupdt":null,"cntevsds_lupdt":"2015-04-23 
> 11:54:40.740061","ignore_me":1,"notes":null}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"cnctevn_id":"002245282386","svcrqst_id":"003627217285","svcrqst_crt_dts":"2015-04-23
>  11:54:39.238357","subject_seq_no":1,"plan_component":"HMOM1 
> ","cust_segment":"RM 
> ","cnctyp_cd":"001","cnctmd_cd":"D02","cnctevs_cd":"007","svcrtyp_cd":"335","svrstyp_cd":"088","cmpltyp_cd":"
>  ","catsrsn_cd":"","apealvl_cd":" 
> ","cnstnty_cd":"001","svcrqst_asrqst_ind":"Y","svcrqst_rtnorig_in":"N","svcrqst_vwasof_dt":"null","sum

[jira] [Commented] (HIVE-11122) ORC should not record the timezone information when there are no timestamp columns

2015-06-28 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604882#comment-14604882
 ] 

Prasanth Jayachandran commented on HIVE-11122:
--

Addressed Gopal's comments and regenerated golden files for failing tests.

> ORC should not record the timezone information when there are no timestamp 
> columns
> --
>
> Key: HIVE-11122
> URL: https://issues.apache.org/jira/browse/HIVE-11122
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11122.1.patch, HIVE-11122.2.patch, HIVE-11122.patch
>
>
> Currently ORC records the time zone information in the stripe footer even 
> when there are no timestamp columns. This will not only add to the size of 
> the footer but also can cause inconsistencies (file size difference) in test 
> cases when run under different time zones.
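The intended behavior can be sketched as a conditional write; the field name below is illustrative, not ORC's actual protobuf schema:

```python
def build_stripe_footer(column_types, writer_timezone):
    """Record the writer's time zone only when a timestamp column exists,
    so footers stay byte-identical across time zones otherwise."""
    footer = {}
    if any(t == "timestamp" for t in column_types):
        footer["writerTimezone"] = writer_timezone
    return footer
```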





[jira] [Updated] (HIVE-11122) ORC should not record the timezone information when there are no timestamp columns

2015-06-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11122:
-
Attachment: HIVE-11122.2.patch

> ORC should not record the timezone information when there are no timestamp 
> columns
> --
>
> Key: HIVE-11122
> URL: https://issues.apache.org/jira/browse/HIVE-11122
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11122.1.patch, HIVE-11122.2.patch, HIVE-11122.patch
>
>
> Currently ORC records the time zone information in the stripe footer even 
> when there are no timestamp columns. This will not only add to the size of 
> the footer but also can cause inconsistencies (file size difference) in test 
> cases when run under different time zones.





[jira] [Commented] (HIVE-11083) Make test cbo_windowing robust

2015-06-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604858#comment-14604858
 ] 

Lefty Leverenz commented on HIVE-11083:
---

Not branch-1 (for 1.3.0)?

> Make test cbo_windowing robust
> --
>
> Key: HIVE-11083
> URL: https://issues.apache.org/jira/browse/HIVE-11083
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 2.0.0
>
> Attachments: HIVE-11083.patch
>
>
> Make result set deterministic.





[jira] [Commented] (HIVE-11118) Load data query should validate file formats with destination tables

2015-06-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604843#comment-14604843
 ] 

Lefty Leverenz commented on HIVE-8:
---

Nudge:  This needs to show Fix Versions 1.3.0 and 2.0.0.

(Commits 49da35903f8334d6dd0c597563c34388772914cc & 
d373962de475ea9f3ef7b2594fbc5d8488636af0.)

> Load data query should validate file formats with destination tables
> 
>
> Key: HIVE-8
> URL: https://issues.apache.org/jira/browse/HIVE-8
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-8.2.patch, HIVE-8.3.patch, 
> HIVE-8.4.patch, HIVE-8.patch
>
>
> Load data local inpath queries do not do any validation wrt file format. If 
> the destination table is ORC and we try to load files that are not ORC, the 
> load will succeed but querying such tables will result in runtime 
> exceptions. We can do some simple sanity checks to prevent loading of files 
> that do not match the destination table file format.
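One such sanity check could inspect the file's magic bytes (ORC files begin with the 3-byte sequence `ORC`). This is a sketch of the idea, not the patch's actual implementation:

```python
def looks_like_orc(path):
    """Cheap format check: every ORC file starts with the magic bytes b'ORC'."""
    with open(path, "rb") as f:
        return f.read(3) == b"ORC"
```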





[jira] [Commented] (HIVE-11043) ORC split strategies should adapt based on number of files

2015-06-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604841#comment-14604841
 ] 

Lefty Leverenz commented on HIVE-11043:
---

Does this need documentation?

Also, shouldn't Fix Version include 1.3.0 (commit 
64f8e0f069f71f82518a9280d199f790174bee33 to branch-1)?

> ORC split strategies should adapt based on number of files
> --
>
> Key: HIVE-11043
> URL: https://issues.apache.org/jira/browse/HIVE-11043
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Gopal V
> Fix For: 2.0.0
>
> Attachments: HIVE-11043.1.patch, HIVE-11043.2.patch, 
> HIVE-11043.3.patch
>
>
> ORC split strategies added in HIVE-10114 chose strategies based on average 
> file size. It would be beneficial to choose a different strategy based on 
> number of files as well.





[jira] [Updated] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-28 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10233:
--
Labels: TODOC1.3  (was: )

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
>  Labels: TODOC1.3
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
> HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
> HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
> HIVE-10233.21.patch, HIVE-10233.22.patch, HIVE-10233.23.patch, 
> HIVE-10233.24.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 





[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604833#comment-14604833
 ] 

Lefty Leverenz commented on HIVE-10233:
---

Doc note:  This adds two configuration parameters 
(*hive.tez.enable.memory.manager* & *hive.hash.table.inflation.factor*) which 
need to be documented in the wiki in Configuration Properties for release 1.3.0.

* *hive.tez.enable.memory.manager* belongs in [Configuration Properties -- Tez 
| 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez]
* *hive.hash.table.inflation.factor* belongs in [Configuration Properties -- 
Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

Is any general documentation needed for the memory manager?  Perhaps in the 
design docs?

* [Hive on Tez | https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez]
* [Hybrid Hybrid Grace Hash Join, v1.0 | 
https://cwiki.apache.org/confluence/display/Hive/Hybrid+Hybrid+Grace+Hash+Join%2C+v1.0]

Also, this jira needs updates for Status, Resolution, and Fix Version.

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
>  Labels: TODOC1.3
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
> HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
> HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
> HIVE-10233.21.patch, HIVE-10233.22.patch, HIVE-10233.23.patch, 
> HIVE-10233.24.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 





[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2015-06-28 Thread Sivanesan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604797#comment-14604797
 ] 

Sivanesan commented on HIVE-5795:
-

I agree with Prashant Kumar; I face this exact issue. I see it only when I 
use CombineHiveInputFormat and not while using HiveInputFormat. Does this 
have something to do with InputSplit? Please help.

> Hive should be able to skip header and footer rows when reading data file for 
> a table
> -
>
> Key: HIVE-5795
> URL: https://issues.apache.org/jira/browse/HIVE-5795
> Project: Hive
>  Issue Type: New Feature
>Reporter: Shuaishuai Nie
>Assignee: Shuaishuai Nie
>  Labels: TODOC13
> Fix For: 0.13.0
>
> Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, 
> HIVE-5795.4.patch, HIVE-5795.5.patch
>
>
> Hive should be able to skip header and footer lines when reading data files 
> from a table. This way, users don't need to preprocess data generated by 
> other applications with a header or footer and can use the files directly 
> for table operations.
> To implement this, the idea is to add new properties in the table description 
> that define the number of header and footer lines and skip them when reading 
> records from the record reader. A DDL example for creating a table with a 
> header and footer:
> {code}
> Create external table testtable (name string, message string) row format 
> delimited fields terminated by '\t' lines terminated by '\n' location 
> '/testtable' tblproperties ("skip.header.line.count"="1", 
> "skip.footer.line.count"="2");
> {code}
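The skip semantics that the table properties describe can be illustrated in miniature (the real implementation works inside Hive's record readers, not on a plain list of lines):

```python
def skip_header_footer(lines, header=1, footer=2):
    """Return `lines` minus `header` leading and `footer` trailing lines."""
    end = len(lines) - footer
    if end <= header:
        return []  # the file is nothing but header/footer lines
    return lines[header:end]
```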





[jira] [Commented] (HIVE-9625) Delegation tokens for HMS are not renewed

2015-06-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604791#comment-14604791
 ] 

Xuefu Zhang commented on HIVE-9625:
---

[~nemon], could you please describe what problem your proposal is addressing? 
I'm not sure if that's for the same problem here or an enhancement to the 
current solution. Please feel free to create a follow-up JIRA if necessary. 
Thanks.

> Delegation tokens for HMS are not renewed
> -
>
> Key: HIVE-9625
> URL: https://issues.apache.org/jira/browse/HIVE-9625
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-9625.1.patch, HIVE-9625.1.patch, HIVE-9625.1.patch, 
> HIVE-9625.2.patch
>
>
> AFAICT the delegation tokens stored in [HiveSessionImplwithUGI 
> |https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java#L45]
>  for HMS + Impersonation are never renewed.
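A fix would typically schedule periodic renewal for the session's token. This sketch only illustrates the scheduling shape; `renew_fn` and the interval stand in for whatever the actual HMS renewal call and token lifetime provide:

```python
import threading

class DelegationTokenRenewer:
    """Minimal sketch of a periodic token renewer (assumed API, not Hive's)."""

    def __init__(self, renew_fn, interval_s):
        self._renew = renew_fn
        self._interval = interval_s
        self._timer = None

    def start(self):
        # Renew immediately, then reschedule ourselves on a daemon timer.
        self._renew()
        self._timer = threading.Timer(self._interval, self.start)
        self._timer.daemon = True
        self._timer.start()

    def stop(self):
        if self._timer:
            self._timer.cancel()
```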





[jira] [Updated] (HIVE-9625) Delegation tokens for HMS are not renewed

2015-06-28 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9625:
--
Attachment: (was: HIVE-9625-branch-1.patch)

> Delegation tokens for HMS are not renewed
> -
>
> Key: HIVE-9625
> URL: https://issues.apache.org/jira/browse/HIVE-9625
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9625.1.patch, HIVE-9625.1.patch, HIVE-9625.1.patch, 
> HIVE-9625.2.patch
>
>
> AFAICT the delegation tokens stored in [HiveSessionImplwithUGI 
> |https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java#L45]
>  for HMS + Impersonation are never renewed.





[jira] [Commented] (HIVE-9625) Delegation tokens for HMS are not renewed

2015-06-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604786#comment-14604786
 ] 

Xuefu Zhang commented on HIVE-9625:
---

The above test failures seem rather infrastructural. Patch #2 is committed to 
both master and branch-1. Thanks to Brock and Prasad.

> Delegation tokens for HMS are not renewed
> -
>
> Key: HIVE-9625
> URL: https://issues.apache.org/jira/browse/HIVE-9625
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9625.1.patch, HIVE-9625.1.patch, HIVE-9625.1.patch, 
> HIVE-9625.2.patch
>
>
> AFAICT the delegation tokens stored in [HiveSessionImplwithUGI 
> |https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java#L45]
>  for HMS + Impersonation are never renewed.





[jira] [Comment Edited] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604704#comment-14604704
 ] 

Xuefu Zhang edited comment on HIVE-10438 at 6/28/15 5:12 PM:
-

Here are some of my high-level thoughts:

1. I don't think Hive needs to support multiple compressors at the same time. 
This is very unlikely in a real production scenario, though different users 
might choose different compression technologies (e.g. Snappy vs. LZO). For 
simplicity, we should start with just one. Thus, we need two flags on the 
server side: #1, enable/disable compression; #2, the class name (some sort of 
identifier) of the compressor.

2. The JDBC client should be able to specify whether to use result set 
compression. This can be done via a hiveconf variable specified in the 
JDBC connection string, whose format is:
{code}
jdbc:hive2://<host>:<port>/<db>;<session_confs>?<hive_confs>#<hive_vars>
{code}
An example of this variable can be "hive.client.use.resultset.compression".

3. When updating patch, please choose "update" patch instead of "add file" so 
as to make it easy to see diffs between the patches.

4. A default implementation such as via Snappy would be nice.

5. Have some testcases using the default implementation and verifying result.


was (Author: xuefuz):
Here are some of my high-level thoughts:

1. I don't think Hive needs to support multiple compressors at the same time. 
This is very unlikely in a real production scenario, though different users 
might choose different compression technologies (e.g. Snappy vs. LZO). For 
simplicity, we should start with just one. Thus, we need two flags on the 
server side: #1, enable/disable compression; #2, the class name (some sort of 
identifier) of the compressor.

2. The JDBC client should be able to specify whether to use result set 
compression. This can be done via a hiveconf variable specified in the 
JDBC connection string, whose format is:
{code}
jdbc:hive2://<host>:<port>/<db>;<session_confs>?<hive_confs>#<hive_vars>
{code}
An example of this variable can be "hive.client.use.resultset.compression".

3. When updating patch, please choose "update" patch instead of "add file" so 
as to make it easy to see diffs between the patches.


> Architecture for  ResultSet Compression via external plugin
> ---
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive, Thrift API
>Affects Versions: 1.2.0
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
>  Labels: patch
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
> Proposal-rscompressor.pdf, README.txt, 
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which 
> uses an external plugin. 
> The patch has three aspects to it: 
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality 
> 2. A container to allow everyone to write and test ResultSet compressors with 
> a query submitter (https://github.com/xiaom/hs2driver) 
> Also attaching a design document explaining the changes, experimental results 
> document, and a pdf explaining how to setup the docker container to observe 
> end-to-end functionality of ResultSet compression. 
> https://reviews.apache.org/r/35792/ Review board link. 





[jira] [Updated] (HIVE-11117) Hive external table - skip header and trailer property issue

2015-06-28 Thread Janarthanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janarthanan updated HIVE-7:
---
Environment: Production
   Priority: Critical  (was: Major)

> Hive external table - skip header and trailer property issue
> 
>
> Key: HIVE-7
> URL: https://issues.apache.org/jira/browse/HIVE-7
> Project: Hive
>  Issue Type: Bug
> Environment: Production
>Reporter: Janarthanan
>Priority: Critical
>
> I am using an external Hive table pointing to an HDFS location. The external 
> table is partitioned on year/mm/dd folders. When there is more than one 
> partition folder (e.g. /2015/01/02/file.txt & /2015/01/03/file2.txt), a 
> select on the external table skips a data record instead of skipping the 
> header/trailer record from one of the files.
> tblproperties ("skip.header.line.count"="1");
> Resolution: Enabling hive input format instead of text input format and 
> executing with the Tez engine instead of MapReduce resolved the issue.
> How can the problem be resolved without setting these parameters? I don't 
> want to run the Hive query using Tez.





[jira] [Commented] (HIVE-9625) Delegation tokens for HMS are not renewed

2015-06-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604719#comment-14604719
 ] 

Hive QA commented on HIVE-9625:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742392/HIVE-9625.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9020 tests executed
*Failed tests:*
{noformat}
TestCliDriver-protectmode2.q-authorization_create_temp_table.q-tez_self_join.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4426/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4426/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4426/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742392 - PreCommit-HIVE-TRUNK-Build

> Delegation tokens for HMS are not renewed
> -
>
> Key: HIVE-9625
> URL: https://issues.apache.org/jira/browse/HIVE-9625
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9625-branch-1.patch, HIVE-9625.1.patch, 
> HIVE-9625.1.patch, HIVE-9625.1.patch, HIVE-9625.2.patch
>
>
> AFAICT the delegation tokens stored in [HiveSessionImplwithUGI 
> |https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java#L45]
>  for HMS + Impersonation are never renewed.





[jira] [Commented] (HIVE-10438) Architecture for ResultSet Compression via external plugin

2015-06-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604704#comment-14604704
 ] 

Xuefu Zhang commented on HIVE-10438:


Here are some of my high-level thoughts:

1. I don't think Hive needs to support multiple compressors at the same time. 
This is very unlikely in a real production scenario, though different users 
might choose different compression technologies (e.g. Snappy vs. LZO). For 
simplicity, we should start with just one. Thus, we need two flags on the 
server side: #1, enable/disable compression; #2, the class name (some sort of 
identifier) of the compressor.

2. The JDBC client should be able to specify whether to use result set 
compression. This can be done via a hiveconf variable specified in the 
JDBC connection string, whose format is:
{code}
jdbc:hive2://<host>:<port>/<db>;<session_confs>?<hive_confs>#<hive_vars>
{code}
An example of this variable can be "hive.client.use.resultset.compression".

3. When updating patch, please choose "update" patch instead of "add file" so 
as to make it easy to see diffs between the patches.
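For illustration, assembling such a connection string on the client might look like this; the variable name `hive.client.use.resultset.compression` is the example proposed in the comment, not an existing Hive setting:

```python
def hive_jdbc_url(host, port, db, hive_confs=None):
    """Assemble a HiveServer2 JDBC URL with optional hiveconf variables."""
    url = f"jdbc:hive2://{host}:{port}/{db}"
    if hive_confs:
        url += "?" + ";".join(f"{k}={v}" for k, v in hive_confs.items())
    return url

url = hive_jdbc_url("hs2.example.com", 10000, "default",
                    {"hive.client.use.resultset.compression": "true"})
```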


> Architecture for  ResultSet Compression via external plugin
> ---
>
> Key: HIVE-10438
> URL: https://issues.apache.org/jira/browse/HIVE-10438
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive, Thrift API
>Affects Versions: 1.2.0
>Reporter: Rohit Dholakia
>Assignee: Rohit Dholakia
>  Labels: patch
> Attachments: HIVE-10438-1.patch, HIVE-10438.patch, 
> Proposal-rscompressor.pdf, README.txt, 
> Results_Snappy_protobuf_TBinary_TCompact.pdf, hs2ResultSetCompressor.zip, 
> hs2driver-master.zip
>
>
> This JIRA proposes an architecture for enabling ResultSet compression which 
> uses an external plugin. 
> The patch has three aspects to it: 
> 0. An architecture for enabling ResultSet compression with external plugins
> 1. An example plugin to demonstrate end-to-end functionality 
> 2. A container to allow everyone to write and test ResultSet compressors with 
> a query submitter (https://github.com/xiaom/hs2driver) 
> Also attaching a design document explaining the changes, experimental results 
> document, and a pdf explaining how to setup the docker container to observe 
> end-to-end functionality of ResultSet compression. 
> https://reviews.apache.org/r/35792/ Review board link. 





[jira] [Updated] (HIVE-9625) Delegation tokens for HMS are not renewed

2015-06-28 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9625:
--
Attachment: HIVE-9625.2.patch

> Delegation tokens for HMS are not renewed
> -
>
> Key: HIVE-9625
> URL: https://issues.apache.org/jira/browse/HIVE-9625
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9625-branch-1.patch, HIVE-9625.1.patch, 
> HIVE-9625.1.patch, HIVE-9625.1.patch, HIVE-9625.2.patch
>
>
> AFAICT the delegation tokens stored in [HiveSessionImplwithUGI 
> |https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java#L45]
>  for HMS + Impersonation are never renewed.





[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604682#comment-14604682
 ] 

Hive QA commented on HIVE-9557:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742383/HIVE-9557.2.patch

{color:green}SUCCESS:{color} +1 9039 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4425/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4425/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4425/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742383 - PreCommit-HIVE-TRUNK-Build

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, HIVE-9557.2.patch, 
> udf_cosine_similarity-v01.patch
>
>
> Algorithm description: http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> -- one word differs out of two words total
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java
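The quoted example above can be checked with a minimal term-frequency sketch of cosine similarity. This is not the attached UDF implementation or the Simmetrics reference code; the class and method names here are illustrative, and tokens are simply split on whitespace:

```java
import java.util.HashMap;
import java.util.Map;

public class CosineSimilarity {
    // Build a term-frequency vector from whitespace-separated tokens.
    static Map<String, Integer> termFreq(String s) {
        Map<String, Integer> tf = new HashMap<>();
        for (String tok : s.split("\\s+")) {
            tf.merge(tok, 1, Integer::sum);
        }
        return tf;
    }

    // cosine(a, b) = dot(a, b) / (|a| * |b|) over the term-frequency vectors.
    static double cosine(String a, String b) {
        Map<String, Integer> ta = termFreq(a), tb = termFreq(b);
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Integer> e : ta.entrySet()) {
            dot += e.getValue() * tb.getOrDefault(e.getKey(), 0);
            na += e.getValue() * e.getValue();
        }
        for (int v : tb.values()) {
            nb += v * v;
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        // Vectors {Test:1, String1:1} and {Test:1, String2:1}:
        // dot = 1, norms = sqrt(2) each, so 1 / 2 = 0.5, matching the example.
        System.out.println(cosine("Test String1", "Test String2"));
    }
}
```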



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-28 Thread Nishant Kelkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Kelkar updated HIVE-9557:
-
Attachment: HIVE-9557.2.patch

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, HIVE-9557.2.patch, 
> udf_cosine_similarity-v01.patch
>
>
> Algorithm description: http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> -- one word differs out of two words total
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9625) Delegation tokens for HMS are not renewed

2015-06-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604605#comment-14604605
 ] 

Hive QA commented on HIVE-9625:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742370/HIVE-9625.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4423/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4423/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4423/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] Excluding org.apache.hadoop:hadoop-yarn-common:jar:2.4.0 from the shaded 
jar.
[INFO] Excluding com.google.inject.extensions:guice-servlet:jar:3.0 from the 
shaded jar.
[INFO] Excluding com.google.inject:guice:jar:3.0 from the shaded jar.
[INFO] Excluding javax.inject:javax.inject:jar:1 from the shaded jar.
[INFO] Excluding aopalliance:aopalliance:jar:1.0 from the shaded jar.
[INFO] Excluding com.sun.jersey.contribs:jersey-guice:jar:1.9 from the shaded 
jar.
[INFO] Excluding org.apache.commons:commons-collections4:jar:4.0 from the 
shaded jar.
[INFO] Excluding org.apache.tez:tez-runtime-library:jar:0.5.2 from the shaded 
jar.
[INFO] Excluding org.apache.tez:tez-common:jar:0.5.2 from the shaded jar.
[INFO] Excluding org.apache.tez:tez-runtime-internals:jar:0.5.2 from the shaded 
jar.
[INFO] Excluding org.apache.tez:tez-mapreduce:jar:0.5.2 from the shaded jar.
[INFO] Excluding commons-collections:commons-collections:jar:3.2.1 from the 
shaded jar.
[INFO] Excluding org.apache.spark:spark-core_2.10:jar:1.3.1 from the shaded jar.
[INFO] Excluding com.twitter:chill_2.10:jar:0.5.0 from the shaded jar.
[INFO] Excluding com.twitter:chill-java:jar:0.5.0 from the shaded jar.
[INFO] Excluding org.apache.hadoop:hadoop-client:jar:1.2.1 from the shaded jar.
[INFO] Excluding org.apache.spark:spark-network-common_2.10:jar:1.3.1 from the 
shaded jar.
[INFO] Excluding org.apache.spark:spark-network-shuffle_2.10:jar:1.3.1 from the 
shaded jar.
[INFO] Excluding net.java.dev.jets3t:jets3t:jar:0.7.1 from the shaded jar.
[INFO] Excluding org.apache.curator:curator-recipes:jar:2.6.0 from the shaded 
jar.
[INFO] Excluding org.eclipse.jetty.orbit:javax.servlet:jar:3.0.0.v201112011016 
from the shaded jar.
[INFO] Excluding org.apache.commons:commons-math3:jar:3.1.1 from the shaded jar.
[INFO] Excluding org.slf4j:jul-to-slf4j:jar:1.7.10 from the shaded jar.
[INFO] Excluding org.slf4j:jcl-over-slf4j:jar:1.7.10 from the shaded jar.
[INFO] Excluding com.ning:compress-lzf:jar:1.0.0 from the shaded jar.
[INFO] Excluding net.jpountz.lz4:lz4:jar:1.2.0 from the shaded jar.
[INFO] Excluding org.roaringbitmap:RoaringBitmap:jar:0.4.5 from the shaded jar.
[INFO] Excluding commons-net:commons-net:jar:2.2 from the shaded jar.
[INFO] Excluding org.spark-project.akka:akka-remote_2.10:jar:2.3.4-spark from 
the shaded jar.
[INFO] Excluding org.spark-project.akka:akka-actor_2.10:jar:2.3.4-spark from 
the shaded jar.
[INFO] Excluding com.typesafe:config:jar:1.2.1 from the shaded jar.
[INFO] Excluding org.spark-project.protobuf:protobuf-java:jar:2.5.0-spark from 
the shaded jar.
[INFO] Excluding org.uncommons.maths:uncommons-maths:jar:1.2.2a from the shaded 
jar.
[INFO] Excluding org.spark-project.akka:akka-slf4j_2.10:jar:2.3.4-spark from 
the shaded jar.
[INFO] Excluding org.scala-lang:scala-library:jar:2.10.4 from the shaded jar.
[INFO] Excluding org.json4s:json4s-jackson_2.10:jar:3.2.10 from the shaded jar.
[INFO] Excluding org.json4s:json4s-core_2.10:jar:3.2.10 from the shaded jar.
[INFO] Excluding org.json4s:json4s-ast_2.10:jar:3.2.10 from the shaded jar.
[INFO] Excluding org.scala-lang:scalap:jar:2.10.0 from the shaded jar.
[INFO] Excluding org.scala-lang:scala-compiler:jar:2.10.0 from the shaded jar.
[INFO] Excluding org.apache.mesos:mesos:jar:shaded-protobuf:0.21.0 from the 
shaded jar.
[INFO] Excluding com.clearspring.analytics:stream:jar:2.7.0 from the shaded jar.
[INFO] Excluding io.dropwizard.metrics:metrics-graphite:jar:3.1.0 from the 
shaded jar.
[INFO] Excluding 
com.fasterxml.jackson.module:jackson-module-scala_2.10:jar:2.4.4 from the 
shaded jar.
[INFO] Excluding org.scala-lang:scala-reflect:jar:2.10.4 from the shaded jar.
[INFO] Excluding oro:oro:jar:2.0.8 from the shaded jar.
[INFO] Excluding org.tachyonproject:tachyon-client:jar:0.5.0 from the shaded 
jar.
[INFO] Excluding org.tachyonproject:tachyon:jar:0.5.0 from the shaded jar.
[INFO] Excluding org.spark-project:pyrolite:jar:2.0.1 from the shaded jar.
[INFO] Excluding net.sf.py4j:py4j:jar:0.8.2.1 from the shaded jar.
[INFO] Excluding org.spark-project.spark:unused:jar:1.0.0 from the shaded jar.
[INFO] Excluding org.apache.hadoop:hadoop-core:jar:1.2.

[jira] [Commented] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604602#comment-14604602
 ] 

Hive QA commented on HIVE-11131:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742358/HIVE-11131.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9033 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4422/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4422/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4422/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742358 - PreCommit-HIVE-TRUNK-Build

> Get row information on DataWritableWriter once for better writing performance
> -
>
> Key: HIVE-11131
> URL: https://issues.apache.org/jira/browse/HIVE-11131
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch
>
>
> DataWritableWriter is the class used to write Hive records to Parquet files. 
> Currently it re-fetches all the information about how to parse a record, such 
> as the schema and object inspector, every time a record is written (i.e., on 
> each write() call).
> We can make this class perform better by initializing a writer per data type 
> once, and caching the object inspectors on each writer.
> The class can assume that subsequent records share the same object inspectors 
> and schema, so no per-record checks are needed. When a new schema is written, 
> Parquet creates the DataWritableWriter again. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11103) Add banker's rounding BROUND UDF

2015-06-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604567#comment-14604567
 ] 

Hive QA commented on HIVE-11103:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742349/HIVE-11103.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9038 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4421/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4421/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4421/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742349 - PreCommit-HIVE-TRUNK-Build

> Add banker's rounding BROUND UDF
> 
>
> Key: HIVE-11103
> URL: https://issues.apache.org/jira/browse/HIVE-11103
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-11103.1.patch, HIVE-11103.1.patch
>
>
> Banker's rounding: a value exactly halfway between two candidates is rounded 
> to the one whose last kept digit is even. Also known as "Gaussian rounding" 
> and, in German, "mathematische Rundung".
> Example
> {code}
>                      2 digits              2 digits
> Unrounded            "Standard" rounding   "Gaussian" rounding
>   54.1754              54.18                 54.18
>  343.2050             343.21                343.20
> +106.2038            +106.20               +106.20
> =========            =======               =======
>  503.5842             503.59                503.58
> {code}
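The attached patch isn't reproduced here, but the JDK already ships this rounding mode, which makes the table above easy to check. A minimal sketch (BroundExample and bround are illustrative names; RoundingMode.HALF_EVEN is the real JDK API):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class BroundExample {
    // HALF_EVEN is the JDK's built-in banker's rounding: exact halves go to
    // the neighbor whose last kept digit is even; everything else rounds
    // to nearest as usual.
    static BigDecimal bround(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        System.out.println(bround("54.1754", 2));   // 54.18  (not a tie)
        System.out.println(bround("343.2050", 2));  // 343.20 (tie: 0 is even, rounds down)
        System.out.println(bround("106.2038", 2));  // 106.20 (not a tie)
        System.out.println(bround("503.5842", 2));  // 503.58 (matches the column sum)
    }
}
```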



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)