[jira] [Updated] (HIVE-11110) Enable HiveJoinAddNotNullRule in CBO

2015-06-27 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-11110:
--
Attachment: HIVE-11110.1.patch

> Enable HiveJoinAddNotNullRule in CBO
> 
>
> Key: HIVE-11110
> URL: https://issues.apache.org/jira/browse/HIVE-11110
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11110.1.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>  ,store_returns
>  ,date_dim d1
>  ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>and d1.d_date_sk = ss_sold_date_sk
>and ss_customer_sk = sr_customer_sk
>and ss_item_sk = sr_item_sk
>and ss_ticket_number = sr_ticket_number
>and sr_returned_date_sk = d2.d_date_sk
>and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used 
> in a join clause. The join clause should add a filter "filterExpr: 
> ss_sold_date_sk is not null", which should get pushed to the MetaStore when 
> fetching the stats. Currently this is not done in CBO planning, which results 
> in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in 
> the optimization phase. In particular, this increases the NDV for the join 
> columns and may result in wrong planning.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.
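A minimal sketch of the rule's effect (illustrative only, not Hive's actual HiveJoinAddNotNullRule): given the equi-join key pairs from the query above, derive the implied "IS NOT NULL" filters, which is what lets the MetaStore skip __HIVE_DEFAULT_PARTITION__ (where the partition key is NULL) when fetching stats.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: derive the not-null filters implied by equi-join
// conditions. Class and method names are hypothetical.
public class NotNullFilterSketch {
    static List<String> impliedNotNullFilters(List<String[]> equiJoinPairs) {
        // collect each join key once, preserving order
        Set<String> keys = new LinkedHashSet<>();
        for (String[] pair : equiJoinPairs) {
            keys.add(pair[0]);
            keys.add(pair[1]);
        }
        // an equi-join can never match a NULL key, so each key
        // implies a "key IS NOT NULL" filter
        List<String> filters = new ArrayList<>();
        for (String k : keys) {
            filters.add(k + " IS NOT NULL");
        }
        return filters;
    }
}
```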



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604054#comment-14604054
 ] 

Hive QA commented on HIVE-11131:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742287/HIVE-11131.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9030 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_schema_evolution
org.apache.hadoop.hive.ql.io.parquet.TestDataWritableWriter.testStructType
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4406/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4406/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4406/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742287 - PreCommit-HIVE-TRUNK-Build

> Get row information on DataWritableWriter once for better writing performance
> -
>
> Key: HIVE-11131
> URL: https://issues.apache.org/jira/browse/HIVE-11131
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11131.2.patch
>
>
> DataWritableWriter is a class used to write Hive records to Parquet files. 
> This class currently gathers all the information about how to parse a record, 
> such as the schema and object inspector, every time a record is written (i.e., 
> every time write() is called).
> We can make this class perform better by initializing one writer per data 
> type once, and saving the object inspectors on each writer.
> The class expects that subsequent records will have the same object 
> inspectors and schema, so there is no need to re-check them on every call. 
> When a new schema is written, Parquet creates a new DataWritableWriter. 
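The idea can be sketched like this (hypothetical names; not Hive's actual DataWritableWriter API): resolve how each column is written once, in the constructor, so write() only walks a list of pre-built per-type writers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch of the caching idea: one field writer per data type, built once.
public class CachedRecordWriter {
    private final List<Function<Object, String>> fieldWriters = new ArrayList<>();

    // Resolve per-type handling once, instead of on every write() call.
    CachedRecordWriter(List<String> fieldTypes) {
        for (String t : fieldTypes) {
            switch (t) {
                case "int":
                    fieldWriters.add(v -> Integer.toString((Integer) v));
                    break;
                case "string":
                    fieldWriters.add(v -> (String) v);
                    break;
                default:
                    throw new IllegalArgumentException("unsupported type: " + t);
            }
        }
    }

    // write() assumes every record matches the schema cached above,
    // so there are no per-record schema checks.
    String write(List<?> record) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < record.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(fieldWriters.get(i).apply(record.get(i)));
        }
        return sb.toString();
    }
}
```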





[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Nishant Kelkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604068#comment-14604068
 ] 

Nishant Kelkar commented on HIVE-9557:
--

Figured out the issue. I made a dummy HADOOP_HOME variable point to HIVE_HOME. 
Also, I removed the commented-out queries from the udf_cosine_similarity.q 
clientpositive file. I'll upload a patch with an RB link soon.

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java
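A plain-Java sketch of word-level cosine similarity matching the worked example above (this approximates the proposed str_sim_cosine UDF; it is not the patch's implementation): each string becomes a term-frequency vector, and the result is the dot product over the product of the vector norms.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: cosine similarity between two strings treated as bags of words.
public class CosineSimilaritySketch {
    static double similarity(String a, String b) {
        Map<String, Integer> va = termFreq(a), vb = termFreq(b);
        double dot = 0;
        for (Map.Entry<String, Integer> e : va.entrySet()) {
            dot += e.getValue() * vb.getOrDefault(e.getKey(), 0);
        }
        return dot / (norm(va) * norm(vb));
    }

    // word -> occurrence count
    static Map<String, Integer> termFreq(String s) {
        Map<String, Integer> m = new HashMap<>();
        for (String w : s.split("\\s+")) m.merge(w, 1, Integer::sum);
        return m;
    }

    // Euclidean length of the term-frequency vector
    static double norm(Map<String, Integer> v) {
        double sum = 0;
        for (int c : v.values()) sum += (double) c * c;
        return Math.sqrt(sum);
    }
}
```

For 'Test String1' vs 'Test String2', only "Test" is shared, so the dot product is 1 and each norm is sqrt(2), giving 0.5 as in the example.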





[jira] [Updated] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Nishant Kelkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Kelkar updated HIVE-9557:
-
Attachment: HIVE-9557.1.patch

Attached first revision on cosine similarity UDF.

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java





[jira] [Commented] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604074#comment-14604074
 ] 

Hive QA commented on HIVE-7150:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742291/HIVE-7150.3.patch

{color:green}SUCCESS:{color} +1 9030 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4407/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4407/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4407/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742291 - PreCommit-HIVE-TRUNK-Build

> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>  Labels: jdbc
> Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch, HIVE-7150.3.patch
>
>
> Here is related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.
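A hedged sketch of the fix direction (not the exact patch): wrapping the stream in try-with-resources guarantees it is closed on both the success and the failure path of KeyStore.load. Variable names follow the snippet above; the method shape is hypothetical.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;

// Sketch: load a truststore while guaranteeing the stream is closed.
public class TrustStoreLoader {
    static void loadTrustStore(KeyStore sslTrustStore, String sslTrustStorePath,
                               String sslTrustStorePassword) throws Exception {
        try (InputStream in = new FileInputStream(sslTrustStorePath)) {
            sslTrustStore.load(in, sslTrustStorePassword.toCharArray());
        } // in.close() runs here even if load() throws
    }
}
```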





[jira] [Updated] (HIVE-11128) Stats annotation should consider select star same as select without column list

2015-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11128:

Attachment: HIVE-11128.3.patch

> Stats annotation should consider select star same as select without column 
> list
> ---
>
> Key: HIVE-11128
> URL: https://issues.apache.org/jira/browse/HIVE-11128
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 1.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11128.2.patch, HIVE-11128.3.patch, HIVE-11128.patch
>
>






[jira] [Commented] (HIVE-11123) Fix how to confirm the RDBMS product name at Metastore.

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604124#comment-14604124
 ] 

Hive QA commented on HIVE-11123:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742290/HIVE-11123.1.patch

{color:green}SUCCESS:{color} +1 9030 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4408/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4408/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4408/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742290 - PreCommit-HIVE-TRUNK-Build

> Fix how to confirm the RDBMS product name at Metastore.
> ---
>
> Key: HIVE-11123
> URL: https://issues.apache.org/jira/browse/HIVE-11123
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0
> Environment: PostgreSQL
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HIVE-11123.1.patch
>
>
> I use PostgreSQL for the Hive Metastore, and I saw the following messages in 
> the PostgreSQL log.
> {code}
> < 2015-06-26 10:58:15.488 JST >ERROR:  syntax error at or near "@@" at 
> character 5
> < 2015-06-26 10:58:15.488 JST >STATEMENT:  SET @@session.sql_mode=ANSI_QUOTES
> < 2015-06-26 10:58:15.489 JST >ERROR:  relation "v$instance" does not exist 
> at character 21
> < 2015-06-26 10:58:15.489 JST >STATEMENT:  SELECT version FROM v$instance
> < 2015-06-26 10:58:15.490 JST >ERROR:  column "version" does not exist at 
> character 10
> < 2015-06-26 10:58:15.490 JST >STATEMENT:  SELECT @@version
> {code}
> When the Hive CLI or Beeline in embedded mode is run, these messages are 
> output to the PostgreSQL log.
> The queries are issued from MetaStoreDirectSql#determineDbType. If we use 
> MetaStoreDirectSql#getProductName instead, we do not need to issue these queries.
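The direction can be sketched as follows (illustrative, not Hive's actual code): instead of probing with vendor-specific SQL such as `SELECT @@version` or `SELECT version FROM v$instance`, map the JDBC product name string (as a method like MetaStoreDirectSql#getProductName would surface it) to a DB type. The enum and matching rules here are assumptions for illustration.

```java
import java.util.Locale;

// Sketch: determine the RDBMS flavor from the JDBC product name string
// rather than by issuing vendor-specific probe queries.
public class DbTypeSketch {
    enum DbType { MYSQL, POSTGRES, ORACLE, DERBY, OTHER }

    static DbType fromProductName(String productName) {
        String p = productName.toLowerCase(Locale.ROOT);
        if (p.contains("mysql")) return DbType.MYSQL;
        if (p.contains("postgresql")) return DbType.POSTGRES;
        if (p.contains("oracle")) return DbType.ORACLE;
        if (p.contains("derby")) return DbType.DERBY;
        return DbType.OTHER;
    }
}
```

In a real metastore this string would come from `Connection.getMetaData().getDatabaseProductName()`, so no query ever reaches the backing database.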





[jira] [Commented] (HIVE-7180) BufferedReader is not closed in MetaStoreSchemaInfo ctor

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604142#comment-14604142
 ] 

Hive QA commented on HIVE-7180:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742293/HIVE-7180.3.patch

{color:green}SUCCESS:{color} +1 9030 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4409/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4409/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4409/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742293 - PreCommit-HIVE-TRUNK-Build

> BufferedReader is not closed in MetaStoreSchemaInfo ctor
> 
>
> Key: HIVE-7180
> URL: https://issues.apache.org/jira/browse/HIVE-7180
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>Priority: Minor
>  Labels: patch
> Attachments: HIVE-7180.3.patch, HIVE-7180.patch, HIVE-7180_001.patch
>
>
> Here is related code:
> {code}
>   BufferedReader bfReader =
> new BufferedReader(new FileReader(upgradeListFile));
>   String currSchemaVersion;
>   while ((currSchemaVersion = bfReader.readLine()) != null) {
> upgradeOrderList.add(currSchemaVersion.trim());
> {code}
> BufferedReader / FileReader should be closed upon return from ctor.
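A sketch of the fix (hypothetical method shape, extracted from the ctor for clarity): try-with-resources closes the BufferedReader, and with it the underlying FileReader, whether the read loop finishes normally or throws.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch: read the schema upgrade order while guaranteeing the reader closes.
public class UpgradeListReader {
    static List<String> readUpgradeOrder(String upgradeListFile) throws IOException {
        List<String> upgradeOrderList = new ArrayList<>();
        try (BufferedReader bfReader =
                 new BufferedReader(new FileReader(upgradeListFile))) {
            String currSchemaVersion;
            while ((currSchemaVersion = bfReader.readLine()) != null) {
                upgradeOrderList.add(currSchemaVersion.trim());
            }
        } // reader (and FileReader) closed here, even on IOException
        return upgradeOrderList;
    }
}
```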





[jira] [Commented] (HIVE-11134) HS2 should log open session failure

2015-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604160#comment-14604160
 ] 

ASF GitHub Bot commented on HIVE-11134:
---

GitHub user thejasmn opened a pull request:

https://github.com/apache/hive/pull/43

HIVE-11134 - HS2 should log open session failure



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thejasmn/hive HIVE-11134

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/43.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #43


commit 4c2fe6c877dcc4c387207edc5a4c0ff5e2dcaa71
Author: Thejas Nair 
Date:   2015-06-27T14:04:40Z

HIVE-11134 - HS2 should log open session failure




> HS2 should log open session failure
> ---
>
> Key: HIVE-11134
> URL: https://issues.apache.org/jira/browse/HIVE-11134
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>
> HiveServer2 should log OpenSession failures.  If beeline is not running with 
> "--verbose=true", the stack trace information is not available for later 
> debugging, because it is not currently logged on the server side.
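The requested pattern can be sketched like this (hypothetical names; java.util.logging is used only to keep the sketch self-contained, Hive has its own logging): record the full stack trace on the server before propagating the failure, so debugging no longer depends on the client's --verbose flag.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch: log-and-rethrow so the failure is captured server-side
// while the client still receives the error.
public class OpenSessionSketch {
    private static final Logger LOG =
        Logger.getLogger(OpenSessionSketch.class.getName());

    static String openSession(boolean fail) {
        try {
            if (fail) throw new IllegalStateException("auth failed");
            return "session-1";
        } catch (RuntimeException e) {
            // full stack trace lands in the server log
            LOG.log(Level.WARNING, "Failed to open session", e);
            throw e; // propagate so the client sees the failure too
        }
    }
}
```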





[jira] [Updated] (HIVE-11134) HS2 should log open session failure

2015-06-27 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-11134:
-
Attachment: HIVE-11134.1.patch

> HS2 should log open session failure
> ---
>
> Key: HIVE-11134
> URL: https://issues.apache.org/jira/browse/HIVE-11134
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-11134.1.patch
>
>
> HiveServer2 should log OpenSession failures.  If beeline is not running with 
> "--verbose=true", the stack trace information is not available for later 
> debugging, because it is not currently logged on the server side.





[jira] [Commented] (HIVE-11122) ORC should not record the timezone information when there are no timestamp columns

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604164#comment-14604164
 ] 

Hive QA commented on HIVE-11122:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742295/HIVE-11122.1.patch

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 9028 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_full
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial_ndv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.ql.io.orc.TestColumnStatistics.testHasNull
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testCombinationInputFormatWithAcid
org.apache.hadoop.hive.ql.io.orc.TestJsonFileDump.testJsonDump
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4410/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4410/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4410/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742295 - PreCommit-HIVE-TRUNK-Build

> ORC should not record the timezone information when there are no timestamp 
> columns
> --
>
> Key: HIVE-11122
> URL: https://issues.apache.org/jira/browse/HIVE-11122
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11122.1.patch, HIVE-11122.patch
>
>
> Currently ORC records the time zone information in the stripe footer even 
> when there are no timestamp columns. This will not only add to the size of 
> the footer but also can cause inconsistencies (file size difference) in test 
> cases when run under different time zones.
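The proposed behavior can be sketched as follows (illustrative only; this is not ORC's writer API, and the footer is modeled as a simple map): the writer time zone is recorded only when the schema actually contains a timestamp column.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TimeZone;

// Sketch: include the time zone in the stripe footer only when needed.
public class StripeFooterSketch {
    static Map<String, String> buildFooter(List<String> columnTypes) {
        Map<String, String> footer = new LinkedHashMap<>();
        footer.put("columnCount", Integer.toString(columnTypes.size()));
        // record the writer time zone only for schemas with timestamps,
        // so footer size (and test output) is stable across time zones
        if (columnTypes.contains("timestamp")) {
            footer.put("writerTimezone", TimeZone.getDefault().getID());
        }
        return footer;
    }
}
```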





[jira] [Commented] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-27 Thread Gabor Liptak (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604186#comment-14604186
 ] 

Gabor Liptak commented on HIVE-7150:


[~apivovarov] Thank you for extending the patch


> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>  Labels: jdbc
> Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch, HIVE-7150.3.patch
>
>
> Here is related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.





[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604214#comment-14604214
 ] 

Hive QA commented on HIVE-10233:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742296/HIVE-10233.24.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9030 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4411/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4411/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4411/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742296 - PreCommit-HIVE-TRUNK-Build

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
> HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
> HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
> HIVE-10233.21.patch, HIVE-10233.22.patch, HIVE-10233.23.patch, 
> HIVE-10233.24.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 





[jira] [Updated] (HIVE-11135) Fix the Beeline set and save command in order to avoid the NullPointerException

2015-06-27 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HIVE-11135:
--
Attachment: HIVE-11135.1.patch

> Fix the Beeline set and save command in order to avoid the 
> NullPointerException
> ---
>
> Key: HIVE-11135
> URL: https://issues.apache.org/jira/browse/HIVE-11135
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HIVE-11135.1.patch
>
>
> When I run the set and save commands in Beeline in my environment, a 
> NullPointerException occurs as follows.
> {code}
> [root@hive ~]# /usr/local/hive/bin/beeline
> Beeline version 2.0.0-SNAPSHOT by Apache Hive
> beeline> !set
> java.lang.NullPointerException
> beeline> !save
> Saving preferences to: /root/.beeline/beeline.properties
> java.lang.NullPointerException
> {code}
> This problem occurs because the following method call in 
> BeeLineOpts#toProperties returns null.
> {code}
> beeLine.getReflector().invoke(this, "get" + names[i], new 
> Object[0]).toString()
> {code}
> Therefore, I modified it to avoid the NullPointerException.
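A null-safe sketch of the fix (hypothetical shape, not BeeLineOpts itself): `Objects.toString(value, "")` substitutes an empty string when a getter returns null, so dumping the options for !set and !save no longer throws.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Properties;

// Sketch: build Properties from option values without NPEing on nulls.
public class NullSafePropertiesSketch {
    static Properties toProperties(Map<String, Object> optionValues) {
        Properties props = new Properties();
        for (Map.Entry<String, Object> e : optionValues.entrySet()) {
            // value.toString() would NPE for unset options; default to ""
            props.setProperty(e.getKey(), Objects.toString(e.getValue(), ""));
        }
        return props;
    }
}
```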





[jira] [Commented] (HIVE-11135) Fix the Beeline set and save command in order to avoid the NullPointerException

2015-06-27 Thread Shinichi Yamashita (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604230#comment-14604230
 ] 

Shinichi Yamashita commented on HIVE-11135:
---

I attached a patch file; with it, !set now outputs the following.

{code}
beeline> !set
authtype
autocommit  false
autosavefalse
color   false
delimiterfordsv |
entirelineascommand false
fastconnect true
force   false
headerinterval  100
historyfile /root/.beeline/history
hiveconfvariables   {}
hivevariables   {}
incremental false
initfile
isolation   TRANSACTION_REPEATABLE_READ
maxcolumnwidth  15
maxheight   63
maxwidth237
nullemptystring false
nullstring  NULL
numberformatdefault
outputformattable
propertiesfile  /root/.beeline/beeline.properties
scriptfile
showelapsedtime true
showheader  true
shownestederrs  false
showwarningsfalse
timeout -1
trimscripts true
truncatetable   false
verbose false
{code}

> Fix the Beeline set and save command in order to avoid the 
> NullPointerException
> ---
>
> Key: HIVE-11135
> URL: https://issues.apache.org/jira/browse/HIVE-11135
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HIVE-11135.1.patch
>
>
> When I run the set and save commands in Beeline in my environment, a 
> NullPointerException occurs as follows.
> {code}
> [root@hive ~]# /usr/local/hive/bin/beeline
> Beeline version 2.0.0-SNAPSHOT by Apache Hive
> beeline> !set
> java.lang.NullPointerException
> beeline> !save
> Saving preferences to: /root/.beeline/beeline.properties
> java.lang.NullPointerException
> {code}
> This problem occurs because the following method call in 
> BeeLineOpts#toProperties returns null.
> {code}
> beeLine.getReflector().invoke(this, "get" + names[i], new 
> Object[0]).toString()
> {code}
> Therefore, I modified it to avoid the NullPointerException.





[jira] [Updated] (HIVE-11071) FIx the output of beeline dbinfo command

2015-06-27 Thread Shinichi Yamashita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shinichi Yamashita updated HIVE-11071:
--
Attachment: HIVE-11071.1.patch

I attach the patch file again to kick off the Hive test run.

> FIx the output of beeline dbinfo command
> 
>
> Key: HIVE-11071
> URL: https://issues.apache.org/jira/browse/HIVE-11071
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HIVE-11071-001-output.txt, HIVE-11071-001.patch, 
> HIVE-11071.1.patch
>
>
> When !dbinfo is executed in Beeline, the output is displayed as follows. 
> {code}
> 0: jdbc:hive2://localhost:10001/> !dbinfo
> Error: Method not supported (state=,code=0)
> allTablesAreSelectabletrue
> Error: Method not supported (state=,code=0)
> Error: Method not supported (state=,code=0)
> Error: Method not supported (state=,code=0)
> getCatalogSeparator   .
> getCatalogTerminstance
> getDatabaseProductNameApache Hive
> getDatabaseProductVersion 2.0.0-SNAPSHOT
> getDefaultTransactionIsolation0
> getDriverMajorVersion 1
> getDriverMinorVersion 1
> getDriverName Hive JDBC
> ...
> {code}
> It is impossible to tell which method each Error line refers to. I will fix this output.





[jira] [Commented] (HIVE-10233) Hive on tez: memory manager for grace hash join

2015-06-27 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604268#comment-14604268
 ] 

Vikram Dixit K commented on HIVE-10233:
---

+1 for the latest iteration.

> Hive on tez: memory manager for grace hash join
> ---
>
> Key: HIVE-10233
> URL: https://issues.apache.org/jira/browse/HIVE-10233
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap, 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Gunther Hagleitner
> Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
> HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch, 
> HIVE-10233-WIP-7.patch, HIVE-10233-WIP-8.patch, HIVE-10233.08.patch, 
> HIVE-10233.09.patch, HIVE-10233.10.patch, HIVE-10233.11.patch, 
> HIVE-10233.12.patch, HIVE-10233.13.patch, HIVE-10233.14.patch, 
> HIVE-10233.15.patch, HIVE-10233.16.patch, HIVE-10233.17.patch, 
> HIVE-10233.18.patch, HIVE-10233.19.patch, HIVE-10233.20.patch, 
> HIVE-10233.21.patch, HIVE-10233.22.patch, HIVE-10233.23.patch, 
> HIVE-10233.24.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across 
> threads. 





[jira] [Commented] (HIVE-11110) Enable HiveJoinAddNotNullRule in CBO

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604269#comment-14604269
 ] 

Hive QA commented on HIVE-11110:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742303/HIVE-11110.1.patch

{color:red}ERROR:{color} -1 due to 131 failed/errored test(s), 9030 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_self_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join34
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_merge_multi_expressions
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_random
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regex_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCli

[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604315#comment-14604315
 ] 

Hive QA commented on HIVE-9557:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742310/HIVE-9557.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9034 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_cosine_similarity_error_2
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4413/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4413/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4413/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742310 - PreCommit-HIVE-TRUNK-Build

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java
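The quoted formula boils down to the Simmetrics-style token-set cosine, |A ∩ B| / (sqrt(|A|) * sqrt(|B|)). A minimal standalone sketch of that computation (class and method names here are invented for illustration, this is not the patch; it also guards the "both empty strings" case raised later in the thread):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; not the actual UDF code from the patch.
public class CosineSim {
    // Token-set cosine similarity as in the Simmetrics reference:
    // |A ∩ B| / (sqrt(|A|) * sqrt(|B|))
    public static double strSimCosine(String s1, String s2) {
        // Guard the "both empty strings" case discussed in the thread.
        if (s1.trim().isEmpty() || s2.trim().isEmpty()) {
            return 0.0;
        }
        Set<String> a = new HashSet<>(Arrays.asList(s1.split("\\s+")));
        Set<String> b = new HashSet<>(Arrays.asList(s2.split("\\s+")));
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size() / (Math.sqrt(a.size()) * Math.sqrt(b.size()));
    }

    public static void main(String[] args) {
        // One shared token out of two per string: 1 / (sqrt(2)*sqrt(2)), ~0.5
        System.out.println(strSimCosine("Test String1", "Test String2"));
    }
}
```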



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Nishant Kelkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604318#comment-14604318
 ] 

Nishant Kelkar commented on HIVE-9557:
--

I'm not handling the "both empty strings" case. Will upload an updated patch.

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604327#comment-14604327
 ] 

Alexander Pivovarov commented on HIVE-9557:
---

Hi Nishant, if I click on the RB link I get this
{code}
You don't have access to this review request.

This review request is private. You must be a requested reviewer, either 
directly or on a requested group, and have permission to access the repository 
in order to view this review request.
{code}

I recommend using RBTools to upload and update patches.
For the initial upload:
{code}
rbt post -g yes
{code}
To update:
{code}
rbt post -u -g yes 
{code}
https://www.reviewboard.org/downloads/rbtools/


> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Nishant Kelkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604335#comment-14604335
 ] 

Nishant Kelkar commented on HIVE-9557:
--

Hey Alexander,
Hmmm, in the review settings, I've added the group 'hive' and the user 
'apivovarov'. 

I used rbt to create and upload the ticket to the Apache server.

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7180) BufferedReader is not closed in MetaStoreSchemaInfo ctor

2015-06-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-7180:
--
Attachment: HIVE-7180.4.patch

patch #4
- using java 7 try-with-resources

> BufferedReader is not closed in MetaStoreSchemaInfo ctor
> 
>
> Key: HIVE-7180
> URL: https://issues.apache.org/jira/browse/HIVE-7180
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>Priority: Minor
>  Labels: patch
> Attachments: HIVE-7180.3.patch, HIVE-7180.4.patch, HIVE-7180.patch, 
> HIVE-7180_001.patch
>
>
> Here is related code:
> {code}
>   BufferedReader bfReader =
> new BufferedReader(new FileReader(upgradeListFile));
>   String currSchemaVersion;
>   while ((currSchemaVersion = bfReader.readLine()) != null) {
> upgradeOrderList.add(currSchemaVersion.trim());
> {code}
> BufferedReader / FileReader should be closed upon return from ctor.
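The fix direction (Java 7 try-with-resources, per patch #4) applied to the quoted snippet looks roughly like this; the enclosing class and method names are invented for illustration, since the real change lives in the MetaStoreSchemaInfo constructor:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative standalone version of the fix, not the actual patch.
public class UpgradeListLoader {
    public static List<String> readUpgradeOrder(String upgradeListFile) throws IOException {
        List<String> upgradeOrderList = new ArrayList<>();
        // try-with-resources closes the BufferedReader (and the wrapped
        // FileReader) on both the normal and the exception path.
        try (BufferedReader bfReader = new BufferedReader(new FileReader(upgradeListFile))) {
            String currSchemaVersion;
            while ((currSchemaVersion = bfReader.readLine()) != null) {
                upgradeOrderList.add(currSchemaVersion.trim());
            }
        } // bfReader.close() is called automatically here
        return upgradeOrderList;
    }
}
```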



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-7150:
--
Attachment: HIVE-7150.4.patch

patch #4
- using try-with-resources

> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>  Labels: jdbc
> Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch, HIVE-7150.3.patch, 
> HIVE-7150.4.patch
>
>
> Here is related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.
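One way to close the stream is the same try-with-resources pattern used in the sibling patches; the standalone class below is an illustrative sketch (the real code sits inside HiveConnection#getHttpClient and is not reproduced here):

```java
import java.io.FileInputStream;
import java.security.KeyStore;

// Illustrative sketch, not the actual patch: the trust-store stream is opened
// in try-with-resources so it is closed even if load() throws.
public class TrustStoreLoader {
    public static KeyStore loadTrustStore(String sslTrustStorePath,
                                          String sslTrustStorePassword) throws Exception {
        KeyStore sslTrustStore = KeyStore.getInstance(KeyStore.getDefaultType());
        try (FileInputStream fis = new FileInputStream(sslTrustStorePath)) {
            sslTrustStore.load(fis, sslTrustStorePassword.toCharArray());
        } // fis.close() runs here, on both the normal and the exception path
        return sslTrustStore;
    }
}
```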



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604346#comment-14604346
 ] 

Alexander Pivovarov commented on HIVE-9557:
---

The review request should be public.
https://www.reviewboard.org/docs/manual/2.5/admin/configuration/access-control/#access-control

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11103) Add banker's rounding BROUND UDF

2015-06-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-11103:
---
Attachment: HIVE-11103.1.patch

attaching patch #1 again

> Add banker's rounding BROUND UDF
> 
>
> Key: HIVE-11103
> URL: https://issues.apache.org/jira/browse/HIVE-11103
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-11103.1.patch, HIVE-11103.1.patch
>
>
> Banker's rounding: the value is rounded to the nearest even number. Also 
> known as "Gaussian rounding", and, in German, "mathematische Rundung".
> Example
> {code}
> Unrounded    "Standard" rounding    "Gaussian" rounding
>              (2 digits)             (2 digits)
>   54.1754      54.18                  54.18
>  343.2050     343.21                 343.20
> +106.2038    +106.20                +106.20
> =========    =======                =======
>  503.5842     503.59                 503.58
> {code}
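For illustration, the JDK already implements this rounding mode as java.math.RoundingMode.HALF_EVEN; a minimal sketch of what a BROUND-style helper computes (the standalone class is an assumption for demonstration, not the patch):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative sketch of banker's rounding at a given scale.
public class Bround {
    public static double bround(double x, int scale) {
        // HALF_EVEN rounds exact halves toward the nearest even digit.
        return BigDecimal.valueOf(x)
                .setScale(scale, RoundingMode.HALF_EVEN)
                .doubleValue();
    }

    public static void main(String[] args) {
        // 343.205 is exactly halfway: HALF_EVEN keeps the even digit (343.20),
        // while HALF_UP would give 343.21.
        System.out.println(bround(343.205, 2));
    }
}
```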



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-7150:
--
Component/s: JDBC

> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>  Labels: jdbc
> Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch, HIVE-7150.3.patch, 
> HIVE-7150.4.patch
>
>
> Here is related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Nishant Kelkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604349#comment-14604349
 ] 

Nishant Kelkar commented on HIVE-9557:
--

Done. Could you please test for access now?

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7180) BufferedReader is not closed in MetaStoreSchemaInfo ctor

2015-06-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-7180:
--
Labels:   (was: patch)

> BufferedReader is not closed in MetaStoreSchemaInfo ctor
> 
>
> Key: HIVE-7180
> URL: https://issues.apache.org/jira/browse/HIVE-7180
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-7180.3.patch, HIVE-7180.4.patch, HIVE-7180.patch, 
> HIVE-7180_001.patch
>
>
> Here is related code:
> {code}
>   BufferedReader bfReader =
> new BufferedReader(new FileReader(upgradeListFile));
>   String currSchemaVersion;
>   while ((currSchemaVersion = bfReader.readLine()) != null) {
> upgradeOrderList.add(currSchemaVersion.trim());
> {code}
> BufferedReader / FileReader should be closed upon return from ctor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-7150:
--
Labels:   (was: jdbc)

> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
> Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch, HIVE-7150.3.patch, 
> HIVE-7150.4.patch
>
>
> Here is related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7180) BufferedReader is not closed in MetaStoreSchemaInfo ctor

2015-06-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-7180:
--
Component/s: Metastore

> BufferedReader is not closed in MetaStoreSchemaInfo ctor
> 
>
> Key: HIVE-7180
> URL: https://issues.apache.org/jira/browse/HIVE-7180
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-7180.3.patch, HIVE-7180.4.patch, HIVE-7180.patch, 
> HIVE-7180_001.patch
>
>
> Here is related code:
> {code}
>   BufferedReader bfReader =
> new BufferedReader(new FileReader(upgradeListFile));
>   String currSchemaVersion;
>   while ((currSchemaVersion = bfReader.readLine()) != null) {
> upgradeOrderList.add(currSchemaVersion.trim());
> {code}
> BufferedReader / FileReader should be closed upon return from ctor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11134) HS2 should log open session failure

2015-06-27 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604359#comment-14604359
 ] 

Alexander Pivovarov commented on HIVE-11134:


+1

> HS2 should log open session failure
> ---
>
> Key: HIVE-11134
> URL: https://issues.apache.org/jira/browse/HIVE-11134
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-11134.1.patch
>
>
> HiveServer2 should log OpenSession failures.  If Beeline is not run with 
> "--verbose=true", the stack trace information is not available for later 
> debugging, as it is not currently logged on the server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo

2015-06-27 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604360#comment-14604360
 ] 

Alexander Pivovarov commented on HIVE-9557:
---

I can open the RB now.
I noticed that some people now use GitHub PRs as an alternative to RB, e.g. 
https://issues.apache.org/jira/browse/HIVE-11134

> create UDF to measure strings similarity using Cosine Similarity algo
> -
>
> Key: HIVE-9557
> URL: https://issues.apache.org/jira/browse/HIVE-9557
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Nishant Kelkar
>  Labels: CosineSimilarity, SimilarityMetric, UDF
> Attachments: HIVE-9557.1.patch, udf_cosine_similarity-v01.patch
>
>
> algo description http://en.wikipedia.org/wiki/Cosine_similarity
> {code}
> --one word different, total 2 words
> str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f
> {code}
> reference implementation:
> https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10647) Hive on LLAP: Limit HS2 from overwhelming LLAP

2015-06-27 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604361#comment-14604361
 ] 

Lefty Leverenz commented on HIVE-10647:
---

Doc note:  This adds configuration parameter 
*hive.server2.llap.concurrent.queries* so I'm linking to HIVE-9850 
(documentation for llap).  When the llap branch merges to trunk, this parameter 
will need to be documented in the HiveServer2 section of Configuration 
Properties:

* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]
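For reference, once the branch merges the parameter would be set in hive-site.xml along these lines (the value shown is purely illustrative; check the merged patch for the actual default):

```xml
<property>
  <name>hive.server2.llap.concurrent.queries</name>
  <!-- Illustrative value: cap on queries HS2 allows to run concurrently on LLAP -->
  <value>4</value>
</property>
```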

> Hive on LLAP: Limit HS2 from overwhelming LLAP
> --
>
> Key: HIVE-10647
> URL: https://issues.apache.org/jira/browse/HIVE-10647
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: llap
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: llap
>
> Attachments: HIVE-10647.1.patch, HIVE-10647.2.patch, 
> HIVE-10647.3.patch, HIVE-10647.4.patch
>
>
> We want to restrict the number of queries that flow through LLAP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11128) Stats annotation should consider select star same as select without column list

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604365#comment-14604365
 ] 

Hive QA commented on HIVE-11128:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742313/HIVE-11128.3.patch

{color:green}SUCCESS:{color} +1 9030 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4414/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4414/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4414/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742313 - PreCommit-HIVE-TRUNK-Build

> Stats annotation should consider select star same as select without column 
> list
> ---
>
> Key: HIVE-11128
> URL: https://issues.apache.org/jira/browse/HIVE-11128
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 1.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11128.2.patch, HIVE-11128.3.patch, HIVE-11128.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11128) Stats annotation should consider select star same as select without column list

2015-06-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604375#comment-14604375
 ] 

Prasanth Jayachandran commented on HIVE-11128:
--

LGTM, +1

> Stats annotation should consider select star same as select without column 
> list
> ---
>
> Key: HIVE-11128
> URL: https://issues.apache.org/jira/browse/HIVE-11128
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 1.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11128.2.patch, HIVE-11128.3.patch, HIVE-11128.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11122) ORC should not record the timezone information when there are no timestamp columns

2015-06-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604378#comment-14604378
 ] 

Prasanth Jayachandran commented on HIVE-11122:
--

[~gopalv] That run is for the HIVE-11112 patch and not for this one.

> ORC should not record the timezone information when there are no timestamp 
> columns
> --
>
> Key: HIVE-11122
> URL: https://issues.apache.org/jira/browse/HIVE-11122
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11122.1.patch, HIVE-11122.patch
>
>
> Currently ORC records the time zone information in the stripe footer even 
> when there are no timestamp columns. This will not only add to the size of 
> the footer but also can cause inconsistencies (file size difference) in test 
> cases when run under different time zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11123) Fix how to confirm the RDBMS product name at Metastore.

2015-06-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604380#comment-14604380
 ] 

Ashutosh Chauhan commented on HIVE-11123:
-

Have you tested this with other DBs like MySQL, MS SQL Server and Oracle?

> Fix how to confirm the RDBMS product name at Metastore.
> ---
>
> Key: HIVE-11123
> URL: https://issues.apache.org/jira/browse/HIVE-11123
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0
> Environment: PostgreSQL
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HIVE-11123.1.patch
>
>
> I use PostgreSQL for the Hive Metastore, and I saw the following messages in 
> the PostgreSQL log.
> {code}
> < 2015-06-26 10:58:15.488 JST >ERROR:  syntax error at or near "@@" at 
> character 5
> < 2015-06-26 10:58:15.488 JST >STATEMENT:  SET @@session.sql_mode=ANSI_QUOTES
> < 2015-06-26 10:58:15.489 JST >ERROR:  relation "v$instance" does not exist 
> at character 21
> < 2015-06-26 10:58:15.489 JST >STATEMENT:  SELECT version FROM v$instance
> < 2015-06-26 10:58:15.490 JST >ERROR:  column "version" does not exist at 
> character 10
> < 2015-06-26 10:58:15.490 JST >STATEMENT:  SELECT @@version
> {code}
> When Hive CLI or Beeline embedded mode is run, these messages are written to 
> the PostgreSQL log.
> These queries are issued from MetaStoreDirectSql#determineDbType. If we use 
> MetaStoreDirectSql#getProductName instead, we do not need to issue these queries.
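The getProductName approach amounts to reading DatabaseMetaData.getDatabaseProductName() from the JDBC connection instead of firing vendor-specific probe queries, so PostgreSQL never sees MySQL or Oracle syntax. A minimal sketch of the name-to-type mapping (the class name and matching strings below are assumptions, not the Hive code):

```java
import java.util.Locale;

// Illustrative sketch: classify a value returned by
// Connection.getMetaData().getDatabaseProductName(); no SQL is executed.
public class DbTypeResolver {
    public enum DbType { MYSQL, POSTGRES, ORACLE, MSSQL, DERBY, OTHER }

    public static DbType fromProductName(String name) {
        String n = name.toLowerCase(Locale.ROOT);
        if (n.contains("mysql")) return DbType.MYSQL;
        if (n.contains("postgres")) return DbType.POSTGRES;
        if (n.contains("oracle")) return DbType.ORACLE;
        if (n.contains("microsoft") || n.contains("sql server")) return DbType.MSSQL;
        if (n.contains("derby")) return DbType.DERBY;
        return DbType.OTHER;
    }
}
```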



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Updated] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2015-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11097:

Assignee: Wan Chang

> HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
> -
>
> Key: HIVE-11097
> URL: https://issues.apache.org/jira/browse/HIVE-11097
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0
> Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Critical
> Attachments: HIVE-11097.1.patch
>
>
> Say we have a sql as
> {code}
> create table if not exists test_orc_src (a int, b int, c int) stored as orc;
> create table if not exists test_orc_src2 (a int, b int, d int) stored as orc;
> insert overwrite table test_orc_src select 1,2,3 from src limit 1;
> insert overwrite table test_orc_src2 select 1,2,4 from src limit 1;
> set hive.auto.convert.join = false;
> set hive.execution.engine=mr;
> select
>   tb.c
> from test.test_orc_src tb
> join (select * from test.test_orc_src2) tm
> on tb.a = tm.a
> where tb.b = 2
> {code}
> The correct result is 3 but it produced no result.
> I find that in HiveInputFormat.pushProjectionsAndFilters
> {code}
> match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key);
> {code}
> It uses startsWith to match alias paths against the split path, so tm will 
> match two aliases in this case.
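The over-matching described above can be reproduced with plain strings: a split path under test_orc_src2 also starts with the test_orc_src prefix. A boundary-aware check is one way to avoid it (paths and method names below are illustrative, not the actual HiveInputFormat fix):

```java
// Illustrative sketch of the startsWith pitfall and a boundary-aware check.
public class PathAliasMatch {
    public static boolean naiveMatch(String splitPath, String key) {
        return splitPath.startsWith(key);
    }

    // Require either an exact match or a '/' component boundary after the key.
    public static boolean boundaryMatch(String splitPath, String key) {
        return splitPath.equals(key) || splitPath.startsWith(key + "/");
    }

    public static void main(String[] args) {
        String split = "/warehouse/test.db/test_orc_src2/000000_0";
        String key = "/warehouse/test.db/test_orc_src";
        // true: the test_orc_src2 split wrongly matches the test_orc_src key
        System.out.println(naiveMatch(split, key));
        // false: the boundary check rejects the partial table-name match
        System.out.println(boundaryMatch(split, key));
    }
}
```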



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604384#comment-14604384
 ] 

Ashutosh Chauhan commented on HIVE-11095:
-

This one seems to be the same issue as HIVE-11112. If so, we should close this as a 
dupe, since the one on HIVE-11112 has a patch which contains a test case.

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Fix For: 1.2.0
>
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>
> {noformat}
> The method transformTextFromUTF8 has a bug: it invokes a problematic method of 
> Text, getBytes().
> The getBytes method of Text returns the raw backing bytes; however, only data 
> up to Text.length is valid. A better way is to use copyBytes() if you need the 
> returned array to be precisely the length of the data.
> But copyBytes() was only added after hadoop1. 
> {noformat}
> How did I find this bug?
> When I queried data from an LZO table, I found in the results that the length 
> of the current row is always larger than that of the previous row, and sometimes 
> the current row contains the contents of the previous row. For example, I executed this SQL:
> {code:sql}
> select * from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row contains the 
> first row's content.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below, just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by the Text reuse, and I found a solution.
> Additionally, the table create SQL is:
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
> `line` string)
> PARTITIONED BY (
> `logdate` string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '
> U'
> WITH SERDEPROPERTIES (
> 'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
> OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
> 'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}
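The getBytes()/copyBytes() distinction described above can be sketched with a simplified stand-in for Hadoop's Text class (ReusedText below is a hypothetical mock, not the real org.apache.hadoop.io.Text; the real class likewise grows its backing buffer without shrinking it):

```java
// ReusedText is a hypothetical, simplified mock of org.apache.hadoop.io.Text
// reuse semantics: the backing array grows but never shrinks, so getBytes()
// can expose stale bytes from a previous, longer value.
class ReusedText {
    private byte[] bytes = new byte[0];
    private int length = 0;

    void set(byte[] data) {
        if (bytes.length < data.length) {
            bytes = new byte[data.length]; // grow, never shrink
        }
        System.arraycopy(data, 0, bytes, 0, data.length);
        length = data.length;
    }

    byte[] getBytes() { return bytes; } // raw buffer; may be longer than length

    byte[] copyBytes() {                // safe: exactly length bytes
        byte[] out = new byte[length];
        System.arraycopy(bytes, 0, out, 0, length);
        return out;
    }
}

public class TextReuseDemo {
    public static void main(String[] args) {
        ReusedText t = new ReusedText();
        t.set("a long first row".getBytes());
        t.set("short".getBytes());
        // Decoding the raw buffer picks up the tail of the previous row
        // ("shortg first row"), just like the mixed-up log lines above.
        System.out.println(new String(t.getBytes()));
        // Copying only the valid prefix yields the correct row ("short").
        System.out.println(new String(t.copyBytes()));
    }
}
```

This is why decoding the result of getBytes() in full, as transformTextFromUTF8 does, appends fragments of longer earlier rows.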



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11128) Stats Annotation misses extracting stats for cols in some cases

2015-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11128:

Summary: Stats Annotation misses extracting stats for cols in some cases  
(was: Stats annotation should consider select star same as select without 
column list)

> Stats Annotation misses extracting stats for cols in some cases
> ---
>
> Key: HIVE-11128
> URL: https://issues.apache.org/jira/browse/HIVE-11128
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 1.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 2.0.0
>
> Attachments: HIVE-11128.2.patch, HIVE-11128.3.patch, HIVE-11128.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11052) Unify HiveSessionBase#getusername method

2015-06-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604396#comment-14604396
 ] 

Ashutosh Chauhan commented on HIVE-11052:
-

+1

> Unify HiveSessionBase#getusername method
> 
>
> Key: HIVE-11052
> URL: https://issues.apache.org/jira/browse/HIVE-11052
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
>Priority: Minor
> Attachments: HIVE-11052-001.patch
>
>
> Current HiveSessionBase has two methods, getUserName() and getUsername().
> These two methods should be unified into getUserName().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11037) HiveOnTez: make explain user level = true as default

2015-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11037:

Component/s: Diagnosability

> HiveOnTez: make explain user level = true as default
> 
>
> Key: HIVE-11037
> URL: https://issues.apache.org/jira/browse/HIVE-11037
> Project: Hive
>  Issue Type: Improvement
>  Components: Diagnosability
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11037.01.patch, HIVE-11037.02.patch, 
> HIVE-11037.03.patch, HIVE-11037.04.patch, HIVE-11037.05.patch, 
> HIVE-11037.06.patch, HIVE-11037.07.patch, HIVE-11037.08.patch
>
>
> In HIVE-9780, we introduced a new level of explain for Hive on Tez. We would 
> like to make it run by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10996) Aggregation / Projection over Multi-Join Inner Query producing incorrect results

2015-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10996:

Component/s: (was: Hive)
 Query Planning

> Aggregation / Projection over Multi-Join Inner Query producing incorrect 
> results
> 
>
> Key: HIVE-10996
> URL: https://issues.apache.org/jira/browse/HIVE-10996
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0
>Reporter: Gautam Kowshik
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Fix For: 2.0.0, 1.2.2
>
> Attachments: HIVE-10996.01.patch, HIVE-10996.02.patch, 
> HIVE-10996.03.patch, HIVE-10996.04.patch, HIVE-10996.05.patch, 
> HIVE-10996.06.patch, HIVE-10996.07.patch, HIVE-10996.08.patch, 
> HIVE-10996.09.patch, HIVE-10996.patch, explain_q1.txt, explain_q2.txt
>
>
> We see the following problem on 1.1.0 and 1.2.0, but not on 0.13, which seems 
> like a regression.
> The following query (Q1) produces no results:
> {code}
> select s
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> {code}
> While this one (Q2) does produce results :
> {code}
> select *
> from (
>   select last.*, action.st2, action.n
>   from (
> select purchase.s, purchase.timestamp, max (mevt.timestamp) as 
> last_stage_timestamp
> from (select * from purchase_history) purchase
> join (select * from cart_history) mevt
> on purchase.s = mevt.s
> where purchase.timestamp > mevt.timestamp
> group by purchase.s, purchase.timestamp
>   ) last
>   join (select * from events) action
>   on last.s = action.s and last.last_stage_timestamp = action.timestamp
> ) list;
> 1 21  20  Bob 1234
> 1 31  30  Bob 1234
> 3 51  50  Jeff1234
> {code}
> The setup to test this is:
> {code}
> create table purchase_history (s string, product string, price double, 
> timestamp int);
> insert into purchase_history values ('1', 'Belt', 20.00, 21);
> insert into purchase_history values ('1', 'Socks', 3.50, 31);
> insert into purchase_history values ('3', 'Belt', 20.00, 51);
> insert into purchase_history values ('4', 'Shirt', 15.50, 59);
> create table cart_history (s string, cart_id int, timestamp int);
> insert into cart_history values ('1', 1, 10);
> insert into cart_history values ('1', 2, 20);
> insert into cart_history values ('1', 3, 30);
> insert into cart_history values ('1', 4, 40);
> insert into cart_history values ('3', 5, 50);
> insert into cart_history values ('4', 6, 60);
> create table events (s string, st2 string, n int, timestamp int);
> insert into events values ('1', 'Bob', 1234, 20);
> insert into events values ('1', 'Bob', 1234, 30);
> insert into events values ('1', 'Bob', 1234, 25);
> insert into events values ('2', 'Sam', 1234, 30);
> insert into events values ('3', 'Jeff', 1234, 50);
> insert into events values ('4', 'Ted', 1234, 60);
> {code}
> I realize select * and select s are not all that interesting in this context, 
> but what led us to this issue was that select count(distinct s) was not 
> returning results. The above queries are the simplified queries that produce 
> the issue. I will note that if I convert the inner join to a table and select 
> from that, the issue does not appear.
> Update: Found that turning off  hive.optimize.remove.identity.project fixes 
> this issue. This optimization was introduced in 
> https://issues.apache.org/jira/browse/HIVE-8435



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11112) ISO-8859-1 text output has fragments of previous longer rows appended

2015-06-27 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-11112:

Description: 
If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query results 
for a string column are incorrect for any row that was preceded by a row 
containing a longer string.

Example steps to reproduce:

1. Create a table using ISO 8859-1 encoding:
{code:sql}
CREATE TABLE person_lat1 (name STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH 
SERDEPROPERTIES ('serialization.encoding'='ISO8859_1');
{code}
2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder 
in HDFS. I'll attach an example file containing the following text: 
{noformat}
Müller,Thomas
Jørgensen,Jørgen
Peña,Andrés
Nåm,Fæk
{noformat}
3. Execute {{SELECT * FROM person_lat1}}

Result - The following output appears:
{noformat}
+---+--+
| person_lat1.name |
+---+--+
| Müller,Thomas |
| Jørgensen,Jørgen |
| Peña,Andrésørgen |
| Nåm,Fækdrésørgen |
+---+--+
{noformat}

  was:
If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query results 
for a string column are incorrect for any row that was preceded by a row 
containing a longer string.

Example steps to reproduce:

1. Create a table using ISO 8859-1 encoding:

CREATE TABLE person_lat1 (name STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH 
SERDEPROPERTIES ('serialization.encoding'='ISO8859_1');

2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder 
in HDFS. I'll attach an example file containing the following text: 

Müller,Thomas
Jørgensen,Jørgen
Peña,Andrés
Nåm,Fæk

3. Execute SELECT * FROM person_lat1

Result - The following output appears:

+---+--+
| person_lat1.name |
+---+--+
| Müller,Thomas |
| Jørgensen,Jørgen |
| Peña,Andrésørgen |
| Nåm,Fækdrésørgen |
+---+--+


> ISO-8859-1 text output has fragments of previous longer rows appended
> -
>
> Key: HIVE-11112
> URL: https://issues.apache.org/jira/browse/HIVE-11112
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-11112.1.patch
>
>
> If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query 
> results for a string column are incorrect for any row that was preceded by a 
> row containing a longer string.
> Example steps to reproduce:
> 1. Create a table using ISO 8859-1 encoding:
> {code:sql}
> CREATE TABLE person_lat1 (name STRING)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH 
> SERDEPROPERTIES ('serialization.encoding'='ISO8859_1');
> {code}
> 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder 
> in HDFS. I'll attach an example file containing the following text: 
> {noformat}
> Müller,Thomas
> Jørgensen,Jørgen
> Peña,Andrés
> Nåm,Fæk
> {noformat}
> 3. Execute {{SELECT * FROM person_lat1}}
> Result - The following output appears:
> {noformat}
> +---+--+
> | person_lat1.name |
> +---+--+
> | Müller,Thomas |
> | Jørgensen,Jørgen |
> | Peña,Andrésørgen |
> | Nåm,Fækdrésørgen |
> +---+--+
> {noformat}
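The fix pattern implied above can be sketched in plain Java (RowDecode and its method names are hypothetical, not Hive's LazySimpleSerDe code): when a byte buffer is reused across rows, only the valid prefix of the buffer may be decoded.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: decodeWhole() reproduces the bug by decoding a reused
// buffer in full; decodeValid() decodes only the valid prefix of the row.
class RowDecode {
    static String decodeWhole(byte[] buf) {
        return new String(buf, StandardCharsets.ISO_8859_1); // stale tail leaks through
    }

    static String decodeValid(byte[] buf, int validLen) {
        return new String(buf, 0, validLen, StandardCharsets.ISO_8859_1); // valid bytes only
    }
}

public class RowDecodeDemo {
    public static void main(String[] args) {
        // The buffer previously held a longer row; only the first 11 bytes
        // belong to the current row.
        byte[] buf = "Pena,Andresorgen".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(RowDecode.decodeWhole(buf));     // garbled, like "Peña,Andrésørgen"
        System.out.println(RowDecode.decodeValid(buf, 11)); // correct row only
    }
}
```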



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11134) HS2 should log open session failure

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604403#comment-14604403
 ] 

Hive QA commented on HIVE-11134:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742322/HIVE-11134.1.patch

{color:green}SUCCESS:{color} +1 9030 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4416/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4416/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4416/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742322 - PreCommit-HIVE-TRUNK-Build

> HS2 should log open session failure
> ---
>
> Key: HIVE-11134
> URL: https://issues.apache.org/jira/browse/HIVE-11134
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-11134.1.patch
>
>
> HiveServer2 should log OpenSession failure.  If beeline is not running with 
> "--verbose=true" all stack trace information is not available for later 
> debugging, as it is not currently logged in server side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-27 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604411#comment-14604411
 ] 

xiaowei wang commented on HIVE-11095:
-

This one is not the same as HIVE-11112. In HIVE-11112, the patch is for the 
transformTextToUTF8 method; in my patch, it is for the transformTextFromUTF8 
method.


> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Fix For: 1.2.0
>
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reopened HIVE-11095:
-

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Fix For: 1.2.0
>
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-27 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604425#comment-14604425
 ] 

xiaowei wang commented on HIVE-10983:
-

The unit test has passed.
[~chengxiang li], [~xuefuz], [~ctang.ma]

> SerDeUtils bug  ,when Text is reused 
> -
>
> Key: HIVE-10983
> URL: https://issues.apache.org/jira/browse/HIVE-10983
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>  Labels: patch
> Fix For: 0.14.1, 1.2.0
>
> Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
> HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt
>
>
> {noformat}
> The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they 
> invoke a bad method of Text, getBytes()!
> The getBytes method of Text returns the raw bytes; however, only data up to 
> Text.length is valid. A better way is to use copyBytes() if you need the 
> returned array to be precisely the length of the data.
> But copyBytes() was only added after Hadoop 1.
> {noformat}
> When I query data from an LZO table, I found in the results that the length of 
> the current row is always larger than that of the previous row, and sometimes 
> the current row contains the contents of the previous row. For example, I 
> execute a SQL query,
> {code:sql}
> select *   from web_searchhub where logdate=2015061003
> {code}
> The result of the SQL is shown below. Notice that the second row's content 
> contains the first row's content.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below; it is just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by the Text reuse, and I found the solution.
> Additionally, the table create SQL is:
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
>   `line` string)
> PARTITIONED BY (
>   `logdate` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
>   'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
>   OUTPUTFORMAT 
> "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10687) AvroDeserializer fails to deserialize evolved union fields

2015-06-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604426#comment-14604426
 ] 

Ashutosh Chauhan commented on HIVE-10687:
-

+1

> AvroDeserializer fails to deserialize evolved union fields
> --
>
> Key: HIVE-10687
> URL: https://issues.apache.org/jira/browse/HIVE-10687
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Attachments: HIVE-10687.1.patch
>
>
> Consider the union field:
> {noformat}
> union {int, string}
> {noformat}
> and now this field evolves to
> {noformat}
> union {null, int, string}.
> {noformat}
> Running it through the avro schema compatibility check [1], they are actually 
> compatible, which means that the latter could be used to deserialize data 
> written with the former. However, the avro deserializer fails to do that, 
> mainly because of the way it reads the tags from the reader schema and then 
> reads the corresponding data from the writer schema. [2]
> [1] http://pastebin.cerner.corp/31078
> [2] 
> https://github.com/cloudera/hive/blob/cdh5.4.0-release/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L354
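The tag mismatch described above can be illustrated with a toy model (a hypothetical simplification; the lists and method names below are illustrative, not the actual AvroDeserializer logic): once the evolved union gains a leading null branch, positional tags no longer line up between reader and writer schemas, so branches must be matched by name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of union branch resolution under schema evolution.
class UnionResolution {
    // Writer schema: union {int, string}; reader schema: union {null, int, string}.
    static final List<String> WRITER = Arrays.asList("int", "string");
    static final List<String> READER = Arrays.asList("null", "int", "string");

    // Buggy: treat the reader-side tag as a positional index into the writer schema.
    static String buggyBranch(int readerTag) {
        return WRITER.get(readerTag); // picks the wrong branch once positions shift
    }

    // Correct: resolve the branch by name, not by position.
    static String resolvedBranch(int readerTag) {
        String name = READER.get(readerTag);
        int idx = WRITER.indexOf(name);
        return idx >= 0 ? WRITER.get(idx) : name;
    }
}
```

With reader tag 1 (an int in the evolved union), the positional lookup lands on the writer's string branch, while name-based resolution correctly finds int.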



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10687) AvroDeserializer fails to deserialize evolved union fields

2015-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10687:

Component/s: File Formats

> AvroDeserializer fails to deserialize evolved union fields
> --
>
> Key: HIVE-10687
> URL: https://issues.apache.org/jira/browse/HIVE-10687
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Serializers/Deserializers
>Reporter: Swarnim Kulkarni
>Assignee: Swarnim Kulkarni
> Fix For: 2.0.0
>
> Attachments: HIVE-10687.1.patch
>
>
> Consider the union field:
> {noformat}
> union {int, string}
> {noformat}
> and now this field evolves to
> {noformat}
> union {null, int, string}.
> {noformat}
> Running it through the avro schema compatibility check [1], they are actually 
> compatible, which means that the latter could be used to deserialize data 
> written with the former. However, the avro deserializer fails to do that, 
> mainly because of the way it reads the tags from the reader schema and then 
> reads the corresponding data from the writer schema. [2]
> [1] http://pastebin.cerner.corp/31078
> [2] 
> https://github.com/cloudera/hive/blob/cdh5.4.0-release/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java#L354



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11095) SerDeUtils another bug ,when Text is reused

2015-06-27 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604433#comment-14604433
 ] 

xiaowei wang commented on HIVE-11095:
-

[~ashutoshc]

> SerDeUtils  another bug ,when Text is reused
> 
>
> Key: HIVE-11095
> URL: https://issues.apache.org/jira/browse/HIVE-11095
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
> Fix For: 1.2.0
>
> Attachments: HIVE-11095.1.patch.txt, HIVE-11095.2.patch.txt
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (HIVE-11135) Fix the Beeline set and save command in order to avoid the NullPointerException

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604448#comment-14604448
 ] 

Hive QA commented on HIVE-11135:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742330/HIVE-11135.1.patch

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9031 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_assert_true
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_assert_true2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4417/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4417/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4417/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742330 - PreCommit-HIVE-TRUNK-Build

> Fix the Beeline set and save command in order to avoid the 
> NullPointerException
> ---
>
> Key: HIVE-11135
> URL: https://issues.apache.org/jira/browse/HIVE-11135
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HIVE-11135.1.patch
>
>
> When I run the set and save commands in Beeline in my environment, a 
> NullPointerException occurred as follows.
> {code}
> [root@hive ~]# /usr/local/hive/bin/beeline
> Beeline version 2.0.0-SNAPSHOT by Apache Hive
> beeline> !set
> java.lang.NullPointerException
> beeline> !save
> Saving preferences to: /root/.beeline/beeline.properties
> java.lang.NullPointerException
> {code}
> This problem occurs because the following method call in 
> BeeLineOpts#toProperties returns null.
> {code}
> beeLine.getReflector().invoke(this, "get" + names[i], new 
> Object[0]).toString()
> {code}
> Therefore it is modified to avoid the NullPointerException.
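A hedged sketch of the null guard (hypothetical class and property names, not the actual HIVE-11135 patch): skip properties whose getter returns null instead of calling toString() on the result.

```java
import java.lang.reflect.Method;
import java.util.Properties;

public class SafeToProperties {
    // Stand-in for BeeLineOpts: one getter legitimately returns null.
    public static class Opts {
        public String getColor() { return "red"; }
        public String getDelimiter() { return null; }  // would NPE in the buggy code
    }

    public static Properties toProperties(Object bean, String... names) throws Exception {
        Properties props = new Properties();
        for (String name : names) {
            Method getter = bean.getClass().getMethod(
                "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1));
            Object value = getter.invoke(bean);
            if (value != null) {               // the fix: guard against null results
                props.setProperty(name, value.toString());
            }
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        Properties p = toProperties(new Opts(), "color", "delimiter");
        System.out.println(p);  // prints {color=red}; no NullPointerException
    }
}
```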



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance

2015-06-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-11131:
---
Attachment: HIVE-11131.3.patch

> Get row information on DataWritableWriter once for better writing performance
> -
>
> Key: HIVE-11131
> URL: https://issues.apache.org/jira/browse/HIVE-11131
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 1.2.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch
>
>
> DataWritableWriter is a class used to write Hive records to Parquet files. 
> This class currently fetches all the information about how to parse a record, 
> such as the schema and object inspector, every time a record is written (i.e., 
> every time write() is called).
> We can make this class perform better by initializing some writers per data
> type once, and saving all object inspectors on each writer.
> The class expects that the next records written will have the same object 
> inspectors and schema, so there is no need to have conditions for that. When 
> a new schema is written, DataWritableWriter is created again by Parquet. 
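The caching idea can be sketched generically (illustrative names only, not the actual HIVE-11131 patch): build one writer per column type once from the schema, then reuse those writers for every record on the hot path.

```java
import java.util.ArrayList;
import java.util.List;

public class CachedRecordWriter {
    interface ValueWriter { String write(Object value); }

    private final List<ValueWriter> columnWriters = new ArrayList<>();

    // Called once per schema, not once per record: type dispatch happens here.
    CachedRecordWriter(List<String> columnTypes) {
        for (String type : columnTypes) {
            switch (type) {
                case "int":    columnWriters.add(v -> "i:" + v); break;
                case "string": columnWriters.add(v -> "s:" + v); break;
                default: throw new IllegalArgumentException("unsupported: " + type);
            }
        }
    }

    // Hot path: no schema inspection, just reuse the prebuilt writers.
    String writeRecord(List<Object> row) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < columnWriters.size(); i++) {
            if (i > 0) out.append(',');
            out.append(columnWriters.get(i).write(row.get(i)));
        }
        return out.toString();
    }
}
```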





[jira] [Commented] (HIVE-11071) FIx the output of beeline dbinfo command

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604466#comment-14604466
 ] 

Hive QA commented on HIVE-11071:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742331/HIVE-11071.1.patch

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9033 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_assert_true
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_assert_true2
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4418/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4418/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4418/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742331 - PreCommit-HIVE-TRUNK-Build

> FIx the output of beeline dbinfo command
> 
>
> Key: HIVE-11071
> URL: https://issues.apache.org/jira/browse/HIVE-11071
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Attachments: HIVE-11071-001-output.txt, HIVE-11071-001.patch, 
> HIVE-11071.1.patch
>
>
> When dbinfo is executed in Beeline, the output is displayed as follows. 
> {code}
> 0: jdbc:hive2://localhost:10001/> !dbinfo
> Error: Method not supported (state=,code=0)
> allTablesAreSelectabletrue
> Error: Method not supported (state=,code=0)
> Error: Method not supported (state=,code=0)
> Error: Method not supported (state=,code=0)
> getCatalogSeparator   .
> getCatalogTerminstance
> getDatabaseProductNameApache Hive
> getDatabaseProductVersion 2.0.0-SNAPSHOT
> getDefaultTransactionIsolation0
> getDriverMajorVersion 1
> getDriverMinorVersion 1
> getDriverName Hive JDBC
> ...
> {code}
> It is impossible to tell which method each Error line refers to. I will fix this output.





[jira] [Commented] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified

2015-06-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604478#comment-14604478
 ] 

Xuefu Zhang commented on HIVE-10616:


Agreed with Alex, I don't think there is any issue. Metadata always comes with 
two parameters; even if the user omits some, the defaults are filled in. As 
noted in the comments, the only case in which you might see no parameters at 
all is metadata migrated from versions prior to decimal precision/scale 
support, where no parameters are stored. I believe that it's impossible to have 
a case where only one parameter (precision) is stored in the metadata.

Please provide a repro case otherwise.

> TypeInfoUtils doesn't handle DECIMAL with just precision specified
> --
>
> Key: HIVE-10616
> URL: https://issues.apache.org/jira/browse/HIVE-10616
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10616.1.patch
>
>
> The parseType method in TypeInfoUtils doesn't handle decimal types with just 
> precision specified, although that's a valid type definition. 
> As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return 
> decimal(10,0) for any decimal() string. 
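A minimal sketch of the intended parsing behavior (illustrative code, not Hive's actual TypeInfoUtils): a lone precision should yield scale 0 rather than falling back to the default decimal(10,0).

```java
// Parses "decimal", "decimal(p)" and "decimal(p,s)" type strings,
// defaulting scale to 0 when only the precision is given.
public class DecimalTypeParser {
    static int[] parseDecimal(String type) {
        int open = type.indexOf('(');
        if (open < 0) {
            return new int[] {10, 0};   // bare "decimal": Hive's defaults
        }
        String[] parts = type.substring(open + 1, type.length() - 1).split(",");
        int precision = Integer.parseInt(parts[0].trim());
        // Only precision given: scale defaults to 0, not decimal(10,0).
        int scale = parts.length > 1 ? Integer.parseInt(parts[1].trim()) : 0;
        return new int[] {precision, scale};
    }
}
```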





[jira] [Commented] (HIVE-11100) Beeline should escape semi-colon in queries

2015-06-27 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604479#comment-14604479
 ] 

Chaoyu Tang commented on HIVE-11100:


[~xuefuz] CLI uses ";" to terminate a command or to delimit multiple commands 
in a single line. If ";" is used in a query, it has to be escaped with "\". 
Beeline uses ";" to terminate a command; HIVE-9877 enhanced Beeline to support 
multiple commands separated by ";" in a single line, and this JIRA adds support 
for escaping ";" in a query with "\".
So both JIRAs are trying to make Beeline more compatible with CLI in terms of 
the use of ";". Their implementations are also similar.

> Beeline should escape semi-colon in queries
> ---
>
> Key: HIVE-11100
> URL: https://issues.apache.org/jira/browse/HIVE-11100
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0, 1.1.0
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-11100.patch
>
>
> Beeline should escape the semicolon in queries. For example, queries like 
> the following:
> CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY ';' LINES TERMINATED BY '\n';
> or 
> CREATE TABLE beeline_tb (c1 int, c2 string) ROW FORMAT DELIMITED FIELDS 
> TERMINATED BY '\;' LINES TERMINATED BY '\n';
> both fail in Beeline.
> But the 2nd query, with the semicolon escaped with "\", works in CLI.
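The escaping rule being requested can be sketched as follows (an illustrative splitter, not Beeline's actual implementation): an unescaped ";" ends a command, while "\;" stands for a literal semicolon inside a command.

```java
import java.util.ArrayList;
import java.util.List;

public class SemicolonSplitter {
    static List<String> splitCommands(String line) {
        List<String> commands = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length() && line.charAt(i + 1) == ';') {
                current.append(';');   // "\;" becomes a literal semicolon
                i++;                   // skip the escaped character
            } else if (c == ';') {
                commands.add(current.toString());  // unescaped ";" ends a command
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            commands.add(current.toString());
        }
        return commands;
    }
}
```

With this rule, the second CREATE TABLE above would reach the server as a single command containing a literal ";" in its delimiter clause.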





[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604488#comment-14604488
 ] 

Xuefu Zhang commented on HIVE-10983:


Patch looks good to me. However, I'm wondering if it's possible to construct a 
test case for this. Forget it if it's too much work. Thanks.
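The underlying Text-reuse pitfall can be reproduced without Hadoop (a minimal stand-in class; Hadoop Text's real grow-but-never-shrink buffer behavior is assumed to match): getBytes() returns the whole backing array, so reading it in full after a shorter value is written leaves a stale tail from the previous record.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TextReuseDemo {
    // Stand-in for org.apache.hadoop.io.Text's buffer reuse.
    static class ReusableText {
        private byte[] buf = new byte[0];
        private int len = 0;

        void set(String s) {
            byte[] b = s.getBytes(StandardCharsets.UTF_8);
            if (b.length > buf.length) {
                buf = new byte[b.length];      // grow, never shrink
            }
            System.arraycopy(b, 0, buf, 0, b.length);
            len = b.length;
        }

        byte[] getBytes() { return buf; }      // raw buffer: may hold a stale tail
        int getLength() { return len; }
        byte[] copyBytes() {                   // safe: exactly len bytes
            return Arrays.copyOf(buf, len);
        }
    }

    public static void main(String[] args) {
        ReusableText t = new ReusableText();
        t.set("a long first record");
        t.set("short");
        // Buggy pattern: decodes the whole raw buffer, including the stale tail.
        System.out.println(new String(t.getBytes(), StandardCharsets.UTF_8));
        // Fixed pattern: honor getLength(), as copyBytes() does.
        System.out.println(new String(t.copyBytes(), StandardCharsets.UTF_8)); // prints "short"
    }
}
```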

> SerDeUtils bug  ,when Text is reused 
> -
>
> Key: HIVE-10983
> URL: https://issues.apache.org/jira/browse/HIVE-10983
> Project: Hive
>  Issue Type: Bug
>  Components: API, CLI
>Affects Versions: 0.14.0, 1.0.0, 1.2.0
> Environment: Hadoop 2.3.0-cdh5.0.0
> Hive 0.14
>Reporter: xiaowei wang
>Assignee: xiaowei wang
>  Labels: patch
> Fix For: 0.14.1, 1.2.0
>
> Attachments: HIVE-10983.1.patch.txt, HIVE-10983.2.patch.txt, 
> HIVE-10983.3.patch.txt, HIVE-10983.4.patch.txt, HIVE-10983.5.patch.txt
>
>
> {noformat}
> The methods transformTextToUTF8 and transformTextFromUTF8 have a bug: they
> invoke a bad method of Text, getBytes()!
> The getBytes method of Text returns the raw bytes; however, only data up to
> Text.length is valid. A better way is to use copyBytes() if you need the
> returned array to be precisely the length of the data.
> But copyBytes() was only added after hadoop1.
> {noformat}
> When I query data from an LZO table, I find in the results that the length of
> the current row is always larger than that of the previous row, and sometimes
> the current row contains the contents of the previous row. For example, I
> execute a SQL statement,
> {code:sql}
> select *   from web_searchhub where logdate=2015061003
> {code}
> and the result is shown below. Notice that the second row's content contains
> the first row's content.
> {noformat}
> INFO [03:00:05.589] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42098,session=3151,thread=254 2015061003
> INFO [03:00:05.594] <18941e66-9962-44ad-81bc-3519f47ba274> 
> session=901,thread=223ession=3151,thread=254 2015061003
> {noformat}
> The content of the original LZO file is shown below, just 2 rows.
> {noformat}
> INFO [03:00:05.635]  
> session=3148,thread=285
> INFO [03:00:05.635] HttpFrontServer::FrontSH 
> msgRecv:Remote=/10.13.193.68:42095,session=3148,thread=285
> {noformat}
> I think this error is caused by the Text reuse, and I found the solution.
> Additionally, the table create SQL is: 
> {code:sql}
> CREATE EXTERNAL TABLE `web_searchhub`(
>   `line` string)
> PARTITIONED BY (
>   `logdate` string)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\\U'
> WITH SERDEPROPERTIES (
>   'serialization.encoding'='GBK')
> STORED AS INPUTFORMAT  "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
>   OUTPUTFORMAT 
> "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/raw.db/web/web_searchhub' 
> {code}





[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-27 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604490#comment-14604490
 ] 

xiaowei wang commented on HIVE-10983:
-

Haha 



[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604491#comment-14604491
 ] 

Xuefu Zhang commented on HIVE-10983:


Actually, this issue is probably the same as HIVE-11112. The fix here covers 
more cases. However, the patch in HIVE-11112 contains a test case. It would be 
great if we can consolidate.



[jira] [Commented] (HIVE-11112) ISO-8859-1 text output has fragments of previous longer rows appended

2015-06-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604493#comment-14604493
 ] 

Xuefu Zhang commented on HIVE-11112:


[~ychena], could you check whether the patch in HIVE-10983 fixes the problem 
here? I like the test case here, though.

> ISO-8859-1 text output has fragments of previous longer rows appended
> -
>
> Key: HIVE-11112
> URL: https://issues.apache.org/jira/browse/HIVE-11112
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-2.1.patch
>
>
> If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query 
> results for a string column are incorrect for any row that was preceded by a 
> row containing a longer string.
> Example steps to reproduce:
> 1. Create a table using ISO 8859-1 encoding:
> {code:sql}
> CREATE TABLE person_lat1 (name STRING)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH 
> SERDEPROPERTIES ('serialization.encoding'='ISO8859_1');
> {code}
> 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder 
> in HDFS. I'll attach an example file containing the following text: 
> {noformat}
> Müller,Thomas
> Jørgensen,Jørgen
> Peña,Andrés
> Nåm,Fæk
> {noformat}
> 3. Execute {{SELECT * FROM person_lat1}}
> Result - The following output appears:
> {noformat}
> +---+--+
> | person_lat1.name |
> +---+--+
> | Müller,Thomas |
> | Jørgensen,Jørgen |
> | Peña,Andrésørgen |
> | Nåm,Fækdrésørgen |
> +---+--+
> {noformat}





[jira] [Commented] (HIVE-10983) SerDeUtils bug ,when Text is reused

2015-06-27 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604497#comment-14604497
 ] 

xiaowei wang commented on HIVE-10983:
-

I can merge HIVE-11112 into my patch. How about that? 



[jira] [Commented] (HIVE-11135) Fix the Beeline set and save command in order to avoid the NullPointerException

2015-06-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604501#comment-14604501
 ] 

Xuefu Zhang commented on HIVE-11135:


+1



[jira] [Commented] (HIVE-11112) ISO-8859-1 text output has fragments of previous longer rows appended

2015-06-27 Thread xiaowei wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604502#comment-14604502
 ] 

xiaowei wang commented on HIVE-11112:
-

[~ychena] provided a very good test case!
Can I merge the test case of this patch into HIVE-10983?



[jira] [Commented] (HIVE-10673) Dynamically partitioned hash join for Tez

2015-06-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604508#comment-14604508
 ] 

Xuefu Zhang commented on HIVE-10673:


Sorry if this is obvious, but could we have details about the problem and the 
proposed solution? From the description I'm not quite sure whether this is a 
feature addition or a bug fix. Thanks.

> Dynamically partitioned hash join for Tez
> -
>
> Key: HIVE-10673
> URL: https://issues.apache.org/jira/browse/HIVE-10673
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-10673.1.patch, HIVE-10673.2.patch, 
> HIVE-10673.3.patch, HIVE-10673.4.patch, HIVE-10673.5.patch
>
>
> Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the 
> reducer are unsorted.





[jira] [Updated] (HIVE-9625) Delegation tokens for HMS are not renewed

2015-06-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9625:
--
Attachment: HIVE-9625.1.patch

> Delegation tokens for HMS are not renewed
> -
>
> Key: HIVE-9625
> URL: https://issues.apache.org/jira/browse/HIVE-9625
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-9625-branch-1.patch, HIVE-9625.1.patch, 
> HIVE-9625.1.patch, HIVE-9625.1.patch
>
>
> AFAICT the delegation tokens stored in [HiveSessionImplwithUGI 
> |https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java#L45]
>  for HMS + Impersonation are never renewed.





[jira] [Commented] (HIVE-7180) BufferedReader is not closed in MetaStoreSchemaInfo ctor

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604510#comment-14604510
 ] 

Hive QA commented on HIVE-7180:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742347/HIVE-7180.4.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9033 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_assert_true
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_assert_true2
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4419/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4419/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4419/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742347 - PreCommit-HIVE-TRUNK-Build

> BufferedReader is not closed in MetaStoreSchemaInfo ctor
> 
>
> Key: HIVE-7180
> URL: https://issues.apache.org/jira/browse/HIVE-7180
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-7180.3.patch, HIVE-7180.4.patch, HIVE-7180.patch, 
> HIVE-7180_001.patch
>
>
> Here is related code:
> {code}
>   BufferedReader bfReader =
> new BufferedReader(new FileReader(upgradeListFile));
>   String currSchemaVersion;
>   while ((currSchemaVersion = bfReader.readLine()) != null) {
> upgradeOrderList.add(currSchemaVersion.trim());
> {code}
> BufferedReader / FileReader should be closed upon return from ctor.
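The fix is the standard try-with-resources pattern; a sketch under the assumption that the surrounding logic looks like the snippet above (the method and field names here are illustrative):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class UpgradeListLoader {
    static List<String> readUpgradeOrder(String upgradeListFile) throws IOException {
        List<String> upgradeOrderList = new ArrayList<>();
        // try-with-resources closes the BufferedReader (and the underlying
        // FileReader) even if readLine() throws.
        try (BufferedReader bfReader = new BufferedReader(new FileReader(upgradeListFile))) {
            String currSchemaVersion;
            while ((currSchemaVersion = bfReader.readLine()) != null) {
                upgradeOrderList.add(currSchemaVersion.trim());
            }
        }
        return upgradeOrderList;
    }
}
```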





[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2015-06-27 Thread Premchandra Preetham Kukillaya (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604514#comment-14604514
 ] 

Premchandra Preetham Kukillaya commented on HIVE-5837:
--

Hi Thejas,
In the Spark Thrift Server, SQL Standard Authorization does not get triggered 
with the following configuration; is this a new defect?


--hiveconf 
hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator
 --hiveconf 
hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
 --hiveconf hive.server2.enable.doAs=false 

> SQL standard based secure authorization for hive
> 
>
> Key: HIVE-5837
> URL: https://issues.apache.org/jira/browse/HIVE-5837
> Project: Hive
>  Issue Type: New Feature
>  Components: Authorization, SQLStandardAuthorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: SQL standard authorization hive.pdf
>
>
> The current default authorization is incomplete and not secure. The 
> alternative of storage based authorization provides security but does not 
> provide fine grained authorization.
> The proposal is to support secure fine grained authorization in hive using 
> SQL standard based authorization model.





[jira] [Commented] (HIVE-7180) BufferedReader is not closed in MetaStoreSchemaInfo ctor

2015-06-27 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604536#comment-14604536
 ] 

Alexander Pivovarov commented on HIVE-7180:
---

The errors are unrelated. The previous build (4418) had the same 10 errors.

> BufferedReader is not closed in MetaStoreSchemaInfo ctor
> 
>
> Key: HIVE-7180
> URL: https://issues.apache.org/jira/browse/HIVE-7180
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-7180.3.patch, HIVE-7180.4.patch, HIVE-7180.patch, 
> HIVE-7180_001.patch
>
>
> Here is related code:
> {code}
>   BufferedReader bfReader =
> new BufferedReader(new FileReader(upgradeListFile));
>   String currSchemaVersion;
>   while ((currSchemaVersion = bfReader.readLine()) != null) {
> upgradeOrderList.add(currSchemaVersion.trim());
> {code}
> BufferedReader / FileReader should be closed upon return from ctor.
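
A minimal sketch of the suggested fix using try-with-resources (Java 7+). This is not the actual patch; the method name `readUpgradeOrder` and the standalone class are hypothetical, standing in for the ctor logic quoted above:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class UpgradeListReader {

    // Reads one schema version per line, trimming whitespace.
    // try-with-resources closes bfReader (and the wrapped FileReader)
    // even if readLine() throws, which the original ctor did not do.
    static List<String> readUpgradeOrder(String upgradeListFile) throws IOException {
        List<String> upgradeOrderList = new ArrayList<>();
        try (BufferedReader bfReader =
                 new BufferedReader(new FileReader(upgradeListFile))) {
            String currSchemaVersion;
            while ((currSchemaVersion = bfReader.readLine()) != null) {
                upgradeOrderList.add(currSchemaVersion.trim());
            }
        } // bfReader.close() is invoked automatically here
        return upgradeOrderList;
    }

    public static void main(String[] args) throws IOException {
        // Demo with a temp file; the real code reads the metastore upgrade list.
        java.nio.file.Path tmp =
            java.nio.file.Files.createTempFile("upgrade", ".order");
        java.nio.file.Files.write(tmp,
            java.util.Arrays.asList(" 0.12.0 ", "0.13.0"));
        System.out.println(readUpgradeOrder(tmp.toString()));
    }
}
```

Because `BufferedReader.close()` closes the underlying `FileReader` as well, a single try-with-resources declaration covers both.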



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604543#comment-14604543
 ] 

Xuefu Zhang commented on HIVE-7150:
---

+1

> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
> Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch, HIVE-7150.3.patch, 
> HIVE-7150.4.patch
>
>
> Here is related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.
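
A sketch of one way to fix it, again with try-with-resources; this is not Hive's actual `getHttpClient()` code, and the helper `loadTrustStore` is hypothetical — it only isolates the keystore-loading step quoted above:

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.security.KeyStore;

public class TrustStoreLoader {

    // Loads the SSL trust store so the FileInputStream is always closed,
    // even when KeyStore.load() throws.
    static KeyStore loadTrustStore(String sslTrustStorePath,
                                   String sslTrustStorePassword) throws Exception {
        KeyStore sslTrustStore = KeyStore.getInstance(KeyStore.getDefaultType());
        try (FileInputStream fis = new FileInputStream(sslTrustStorePath)) {
            sslTrustStore.load(fis, sslTrustStorePassword.toCharArray());
        } // fis.close() runs here on both the normal and exceptional paths
        return sslTrustStore;
    }

    public static void main(String[] args) throws Exception {
        // Demo: create an empty keystore file, then reload it via the helper.
        String path = java.nio.file.Files.createTempFile("trust", ".jks").toString();
        KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
        ks.load(null, null); // initialize an empty store
        try (FileOutputStream fos = new FileOutputStream(path)) {
            ks.store(fos, "changeit".toCharArray());
        }
        KeyStore reloaded = loadTrustStore(path, "changeit");
        System.out.println(reloaded.size());
    }
}
```

The same pattern applies wherever `getHttpClient()` opens the stream inline: bind the `FileInputStream` in a try-with-resources header instead of passing `new FileInputStream(...)` directly to `load()`.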



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7180) BufferedReader is not closed in MetaStoreSchemaInfo ctor

2015-06-27 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604544#comment-14604544
 ] 

Xuefu Zhang commented on HIVE-7180:
---

+1

> BufferedReader is not closed in MetaStoreSchemaInfo ctor
> 
>
> Key: HIVE-7180
> URL: https://issues.apache.org/jira/browse/HIVE-7180
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-7180.3.patch, HIVE-7180.4.patch, HIVE-7180.patch, 
> HIVE-7180_001.patch
>
>
> Here is related code:
> {code}
>   BufferedReader bfReader =
> new BufferedReader(new FileReader(upgradeListFile));
>   String currSchemaVersion;
>   while ((currSchemaVersion = bfReader.readLine()) != null) {
> upgradeOrderList.add(currSchemaVersion.trim());
> {code}
> BufferedReader / FileReader should be closed upon return from ctor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-06-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604547#comment-14604547
 ] 

Hive QA commented on HIVE-7150:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742348/HIVE-7150.4.patch

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9033 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_assert_true
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_assert_true2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4420/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4420/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4420/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742348 - PreCommit-HIVE-TRUNK-Build

> FileInputStream is not closed in HiveConnection#getHttpClient()
> ---
>
> Key: HIVE-7150
> URL: https://issues.apache.org/jira/browse/HIVE-7150
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Ted Yu
>Assignee: Alexander Pivovarov
> Attachments: HIVE-7150.1.patch, HIVE-7150.2.patch, HIVE-7150.3.patch, 
> HIVE-7150.4.patch
>
>
> Here is related code:
> {code}
> sslTrustStore.load(new FileInputStream(sslTrustStorePath),
> sslTrustStorePassword.toCharArray());
> {code}
> The FileInputStream is not closed upon returning from the method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)