[jira] [Resolved] (HIVE-23800) Add hooks when HiveServer2 stops due to OutOfMemoryError

2020-10-14 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-23800.
-
Fix Version/s: 4.0.0
 Assignee: Zhihua Deng
   Resolution: Fixed

merged into master. Thank you [~dengzh]!

> Add hooks when HiveServer2 stops due to OutOfMemoryError
> 
>
> Key: HIVE-23800
> URL: https://issues.apache.org/jira/browse/HIVE-23800
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Make the OOM hook an interface of HiveServer2, so users can implement the 
> hook to do something before HS2 stops, such as dumping the heap or alerting 
> the devops team.
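The hook mechanism described above can be sketched as a minimal registry. The names here (OomHook, Server, stopOnOom) are illustrative assumptions for this sketch, not Hive's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a pluggable OOM hook, loosely modeled on the idea in
// HIVE-23800. OomHook and Server are hypothetical names, not Hive's real API.
public class OomHookSketch {
    /** Hook invoked right before the server stops on OutOfMemoryError. */
    interface OomHook {
        void run(Throwable oom);
    }

    static class Server {
        private final List<OomHook> oomHooks = new ArrayList<>();
        final List<String> log = new ArrayList<>();

        void addOomHook(OomHook hook) {
            oomHooks.add(hook);
        }

        // Simulates the shutdown path taken when an OutOfMemoryError is caught:
        // run every registered hook (e.g. heap dump, ops alert), then stop.
        void stopOnOom(Throwable oom) {
            for (OomHook hook : oomHooks) {
                hook.run(oom);
            }
            log.add("stopped");
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        server.addOomHook(oom -> server.log.add("heap-dump"));
        server.addOomHook(oom -> server.log.add("alert-devops"));
        server.stopOnOom(new OutOfMemoryError("simulated"));
        System.out.println(server.log); // hooks run before "stopped"
    }
}
```

The point of making the hook an interface is exactly this shape: the server only knows the callback contract, and deployments plug in their own heap-dump or alerting behavior.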



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23800) Add hooks when HiveServer2 stops due to OutOfMemoryError

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23800?focusedWorklogId=500511&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500511
 ]

ASF GitHub Bot logged work on HIVE-23800:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 07:47
Start Date: 14/Oct/20 07:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #1205:
URL: https://github.com/apache/hive/pull/1205


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500511)
Time Spent: 6h 10m  (was: 6h)

> Add hooks when HiveServer2 stops due to OutOfMemoryError
> 
>
> Key: HIVE-23800
> URL: https://issues.apache.org/jira/browse/HIVE-23800
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Make the OOM hook an interface of HiveServer2, so users can implement the 
> hook to do something before HS2 stops, such as dumping the heap or alerting 
> the devops team.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=500517&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500517
 ]

ASF GitHub Bot logged work on HIVE-24106:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 07:51
Start Date: 14/Oct/20 07:51
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #1456:
URL: https://github.com/apache/hive/pull/1456


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500517)
Time Spent: 1h 50m  (was: 1h 40m)

> Abort polling on the operation state when the current thread is interrupted
> ---
>
> Key: HIVE-24106
> URL: https://issues.apache.org/jira/browse/HIVE-24106
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If a HiveStatement is run asynchronously as a task, for example in a thread 
> or a future, and we interrupt the task, the HiveStatement continues to poll 
> on the operation state until it finishes. It may be better to provide a way 
> to abort the execution in such a case.
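The interrupt-aware polling loop the issue asks for can be sketched as follows. PollSketch, Operation, and pollUntilDone are illustrative names for this sketch, not the actual HiveStatement internals.

```java
// Minimal sketch of interrupt-aware polling: instead of looping until the
// operation finishes, the loop also stops when the calling thread is
// interrupted, either via the interrupt flag or an InterruptedException.
public class PollSketch {
    interface Operation {
        boolean isDone();
    }

    /** Returns true if the operation finished, false if polling was aborted. */
    static boolean pollUntilDone(Operation op, long intervalMillis) {
        while (!op.isDone()) {
            if (Thread.currentThread().isInterrupted()) {
                return false; // abort polling instead of spinning until finish
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag for callers
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        Thread poller = new Thread(() -> {
            boolean finished = pollUntilDone(() -> false, 10);
            System.out.println("finished=" + finished);
        });
        poller.start();
        poller.interrupt();   // cancel the task; the loop exits promptly
        poller.join(1000);
    }
}
```

Restoring the interrupt flag before returning matters: the caller (a future or executor) also checks it to decide whether the task was cancelled.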



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted

2020-10-14 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-24106.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

merged into master. Thank you [~dengzh]!

> Abort polling on the operation state when the current thread is interrupted
> ---
>
> Key: HIVE-24106
> URL: https://issues.apache.org/jira/browse/HIVE-24106
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If a HiveStatement is run asynchronously as a task, for example in a thread 
> or a future, and we interrupt the task, the HiveStatement continues to poll 
> on the operation state until it finishes. It may be better to provide a way 
> to abort the execution in such a case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23712) metadata-only queries return incorrect results with empty acid partition

2020-10-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved HIVE-23712.
-
Resolution: Fixed

> metadata-only queries return incorrect results with empty acid partition
> 
>
> Key: HIVE-23712
> URL: https://issues.apache.org/jira/browse/HIVE-23712
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Similarly to HIVE-15397, queries can return incorrect results for 
> metadata-only queries; here is a repro scenario which affects master:
> {code}
> set hive.support.concurrency=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.optimize.metadataonly=true;
> create table test1 (id int, val string) partitioned by (val2 string) STORED 
> AS ORC TBLPROPERTIES ('transactional'='true');
> describe formatted test1;
> alter table test1 add partition (val2='foo');
> alter table test1 add partition (val2='bar');
> insert into test1 partition (val2='foo') values (1, 'abc');
> select distinct val2, current_timestamp from test1;
> insert into test1 partition (val2='bar') values (1, 'def');
> delete from test1 where val2 = 'bar';
> select '--> hive.optimize.metadataonly=true';
> select distinct val2, current_timestamp from test1;
> set hive.optimize.metadataonly=false;
> select '--> hive.optimize.metadataonly=false';
> select distinct val2, current_timestamp from test1;
> select current_timestamp, * from test1;
> {code}
> in this case 2 rows are returned instead of 1 after a delete with the 
> metadata-only optimization:
> https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23712) metadata-only queries return incorrect results with empty acid partition

2020-10-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213716#comment-17213716
 ] 

László Bodor commented on HIVE-23712:
-

forgot about this one, but now it's merged to master, thanks for the review 
[~mustafaiman], [~ashutoshc]!

> metadata-only queries return incorrect results with empty acid partition
> 
>
> Key: HIVE-23712
> URL: https://issues.apache.org/jira/browse/HIVE-23712
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Similarly to HIVE-15397, queries can return incorrect results for 
> metadata-only queries; here is a repro scenario which affects master:
> {code}
> set hive.support.concurrency=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.optimize.metadataonly=true;
> create table test1 (id int, val string) partitioned by (val2 string) STORED 
> AS ORC TBLPROPERTIES ('transactional'='true');
> describe formatted test1;
> alter table test1 add partition (val2='foo');
> alter table test1 add partition (val2='bar');
> insert into test1 partition (val2='foo') values (1, 'abc');
> select distinct val2, current_timestamp from test1;
> insert into test1 partition (val2='bar') values (1, 'def');
> delete from test1 where val2 = 'bar';
> select '--> hive.optimize.metadataonly=true';
> select distinct val2, current_timestamp from test1;
> set hive.optimize.metadataonly=false;
> select '--> hive.optimize.metadataonly=false';
> select distinct val2, current_timestamp from test1;
> select current_timestamp, * from test1;
> {code}
> in this case 2 rows are returned instead of 1 after a delete with the 
> metadata-only optimization:
> https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23712) metadata-only queries return incorrect results with empty acid partition

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23712?focusedWorklogId=500526&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500526
 ]

ASF GitHub Bot logged work on HIVE-23712:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 08:04
Start Date: 14/Oct/20 08:04
Worklog Time Spent: 10m 
  Work Description: abstractdog merged pull request #1182:
URL: https://github.com/apache/hive/pull/1182


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500526)
Time Spent: 50m  (was: 40m)

> metadata-only queries return incorrect results with empty acid partition
> 
>
> Key: HIVE-23712
> URL: https://issues.apache.org/jira/browse/HIVE-23712
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Similarly to HIVE-15397, queries can return incorrect results for 
> metadata-only queries; here is a repro scenario which affects master:
> {code}
> set hive.support.concurrency=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.optimize.metadataonly=true;
> create table test1 (id int, val string) partitioned by (val2 string) STORED 
> AS ORC TBLPROPERTIES ('transactional'='true');
> describe formatted test1;
> alter table test1 add partition (val2='foo');
> alter table test1 add partition (val2='bar');
> insert into test1 partition (val2='foo') values (1, 'abc');
> select distinct val2, current_timestamp from test1;
> insert into test1 partition (val2='bar') values (1, 'def');
> delete from test1 where val2 = 'bar';
> select '--> hive.optimize.metadataonly=true';
> select distinct val2, current_timestamp from test1;
> set hive.optimize.metadataonly=false;
> select '--> hive.optimize.metadataonly=false';
> select distinct val2, current_timestamp from test1;
> select current_timestamp, * from test1;
> {code}
> in this case 2 rows are returned instead of 1 after a delete with the 
> metadata-only optimization:
> https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23712) metadata-only queries return incorrect results with empty acid partition

2020-10-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-23712:

Fix Version/s: 4.0.0

> metadata-only queries return incorrect results with empty acid partition
> 
>
> Key: HIVE-23712
> URL: https://issues.apache.org/jira/browse/HIVE-23712
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Similarly to HIVE-15397, queries can return incorrect results for 
> metadata-only queries; here is a repro scenario which affects master:
> {code}
> set hive.support.concurrency=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.optimize.metadataonly=true;
> create table test1 (id int, val string) partitioned by (val2 string) STORED 
> AS ORC TBLPROPERTIES ('transactional'='true');
> describe formatted test1;
> alter table test1 add partition (val2='foo');
> alter table test1 add partition (val2='bar');
> insert into test1 partition (val2='foo') values (1, 'abc');
> select distinct val2, current_timestamp from test1;
> insert into test1 partition (val2='bar') values (1, 'def');
> delete from test1 where val2 = 'bar';
> select '--> hive.optimize.metadataonly=true';
> select distinct val2, current_timestamp from test1;
> set hive.optimize.metadataonly=false;
> select '--> hive.optimize.metadataonly=false';
> select distinct val2, current_timestamp from test1;
> select current_timestamp, * from test1;
> {code}
> in this case 2 rows are returned instead of 1 after a delete with the 
> metadata-only optimization:
> https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24221) Use vectorizable expression to combine multiple columns in semijoin bloom filters

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24221?focusedWorklogId=500527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500527
 ]

ASF GitHub Bot logged work on HIVE-24221:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 08:10
Start Date: 14/Oct/20 08:10
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1544:
URL: https://github.com/apache/hive/pull/1544#discussion_r504482010



##
File path: ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
##
@@ -233,6 +235,23 @@ public static ExprNodeGenericFuncDesc and(List<ExprNodeDesc> exps) {
     return new ExprNodeGenericFuncDesc(TypeInfoFactory.booleanTypeInfo, new GenericUDFOPAnd(), "and", flatExps);
   }
 
+  /**
+   * Create an expression for computing a hash by recursively hashing given expressions by two:
+   *
+   * Input: HASH(A, B, C, D)
+   * Output: HASH(HASH(HASH(A,B),C),D)
+   *
+   */
+  public static ExprNodeGenericFuncDesc hash(List<ExprNodeDesc> exps) {
+    assert exps.size() >= 2;
+    ExprNodeDesc hashExp = exps.get(0);
+    for (int i = 1; i < exps.size(); i++) {
+      List<ExprNodeDesc> hArgs = Arrays.asList(hashExp, exps.get(i));
+      hashExp = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo, new GenericUDFMurmurHash(), "hash", hArgs);

Review comment:
   it seems like we have some inconsistency around `GenericUDFMurmurHash`: it 
is registered as `murmur_hash` in the `FunctionRegistry`, but the UDF's 
annotation only has `hash`, and here we also simply use "hash".
   a change like this will most likely cause a lot of q.out changes - could you 
file a follow-up ticket?
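The pairwise rewrite in the hunk above, hash(A, B, C, D) becoming hash(hash(hash(A, B), C), D), is a left fold of a binary hash. A self-contained sketch with plain ints, where hash2 is a stand-in mixing function and not Hive's murmur UDF:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the pairwise combination the patch builds: an n-ary hash is
// rewritten as a left fold of binary hashes, so only two-argument
// (vectorizable) calls remain in the expression tree.
public class HashFoldSketch {
    /** Binary combine step (placeholder for the two-argument hash UDF). */
    static int hash2(int a, int b) {
        int h = 31 * a + b;
        h ^= h >>> 16;
        return h;
    }

    /** Folds n inputs into one value using only binary hash calls. */
    static int hashAll(List<Integer> values) {
        if (values.size() < 2) {
            throw new IllegalArgumentException("need at least two inputs");
        }
        int acc = values.get(0);
        for (int i = 1; i < values.size(); i++) {
            acc = hash2(acc, values.get(i)); // same shape as the ExprNodeDesc loop
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> in = Arrays.asList(1, 2, 3, 4);
        // identical to hash2(hash2(hash2(1, 2), 3), 4)
        System.out.println(hashAll(in) == hash2(hash2(hash2(1, 2), 3), 4));
    }
}
```

The fold keeps every node in the expression tree binary, which is what lets each step map onto an existing vectorized two-argument operator.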
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500527)
Time Spent: 20m  (was: 10m)

> Use vectorizable expression to combine multiple columns in semijoin bloom 
> filters
> -
>
> Key: HIVE-24221
> URL: https://issues.apache.org/jira/browse/HIVE-24221
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
> Environment: 
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, multi-column semijoin reducers use an n-ary call to 
> GenericUDFMurmurHash to combine multiple values into one, which is used as an 
> entry to the bloom filter. However, there are no vectorized operators that 
> treat n-ary inputs. The same goes for the vectorized implementation of 
> GenericUDFMurmurHash introduced in HIVE-23976. 
> The goal of this issue is to choose an alternative way to combine multiple 
> values into one to pass in the bloom filter comprising only vectorized 
> operators.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted

2020-10-14 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213735#comment-17213735
 ] 

Zhihua Deng commented on HIVE-24106:


Thanks a lot [~kgyrtkirk] for the help and reviews!

> Abort polling on the operation state when the current thread is interrupted
> ---
>
> Key: HIVE-24106
> URL: https://issues.apache.org/jira/browse/HIVE-24106
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If a HiveStatement is run asynchronously as a task, for example in a thread 
> or a future, and we interrupt the task, the HiveStatement continues to poll 
> on the operation state until it finishes. It may be better to provide a way 
> to abort the execution in such a case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23800) Add hooks when HiveServer2 stops due to OutOfMemoryError

2020-10-14 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213742#comment-17213742
 ] 

Zhihua Deng commented on HIVE-23800:


Many thanks for your comments and reviews, [~kgyrtkirk]!

> Add hooks when HiveServer2 stops due to OutOfMemoryError
> 
>
> Key: HIVE-23800
> URL: https://issues.apache.org/jira/browse/HIVE-23800
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Make the OOM hook an interface of HiveServer2, so users can implement the 
> hook to do something before HS2 stops, such as dumping the heap or alerting 
> the devops team.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24266) Committed rows in hflush'd ACID files may be missing from query result

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24266?focusedWorklogId=500547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500547
 ]

ASF GitHub Bot logged work on HIVE-24266:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 09:29
Start Date: 14/Oct/20 09:29
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1576:
URL: https://github.com/apache/hive/pull/1576#discussion_r504534912



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
##
@@ -1156,13 +1157,36 @@ public BISplitStrategy(Context context, FileSystem fs, Path dir,
       } else {
         TreeMap<Long, BlockLocation> blockOffsets = SHIMS.getLocationsWithOffset(fs, fileStatus);
         for (Map.Entry<Long, BlockLocation> entry : blockOffsets.entrySet()) {
-          if (entry.getKey() + entry.getValue().getLength() > logicalLen) {
+          long blockOffset = entry.getKey();
+          long blockLength = entry.getValue().getLength();
+          if (blockOffset > logicalLen) {
             //don't create splits for anything past logical EOF
-            continue;
+            //map is ordered, thus any possible entry in the iteration after this is bound to be > logicalLen
+            break;
           }
-          OrcSplit orcSplit = new OrcSplit(fileStatus.getPath(), fileKey, entry.getKey(),
-            entry.getValue().getLength(), entry.getValue().getHosts(), null, isOriginal, true,
-            deltas, -1, logicalLen, dir, offsetAndBucket);
+          long splitLength = blockLength;
+
+          long blockEndOvershoot = (blockOffset + blockLength) - logicalLen;
+          if (blockEndOvershoot > 0) {
+            // if logicalLen is placed within a block, we should make (this last) split out of the part of this block
+            // -> we should read less than block end
+            splitLength -= blockEndOvershoot;
+          } else if (blockOffsets.lastKey() == blockOffset && blockEndOvershoot < 0) {
+            // This is the last block but it ends before logicalLen
+            // This can happen with HDFS if hflush was called and blocks are not persisted to disk yet, but content
+            // is otherwise available for readers, as DNs have these buffers in memory at this time.
+            // -> we should read more than (persisted) block end, but surely not more than the whole block
+            if (fileStatus instanceof HdfsLocatedFileStatus) {
+              HdfsLocatedFileStatus hdfsFileStatus = (HdfsLocatedFileStatus)fileStatus;
+              if (hdfsFileStatus.getLocatedBlocks().isUnderConstruction()) {
+                // blockEndOvershoot is negative here...
+                splitLength = Math.min(splitLength - blockEndOvershoot, hdfsFileStatus.getBlockSize());

Review comment:
   Maybe this is just a theoretical problem, but if `blockOffset + 
hdfsFileStatus.getBlockSize()` is greater than `logicalLen` in the last block, 
then we should throw an exception, and then we do not need the min here
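The split-length arithmetic under discussion can be restated as a small pure function. This is an illustration of the logic in the hunk under stated assumptions, not the actual OrcInputFormat code; the parameter names are mine.

```java
// Sketch of the split-length rules: a split normally covers one block, is
// truncated when logicalLen ends inside the block, and for the last block of
// an under-construction (hflush'd) file is extended toward logicalLen but
// never past the nominal block size.
public class SplitLengthSketch {
    static long splitLength(long blockOffset, long blockLength, long logicalLen,
                            boolean isLastBlock, boolean underConstruction,
                            long blockSize) {
        long len = blockLength;
        long overshoot = (blockOffset + blockLength) - logicalLen;
        if (overshoot > 0) {
            len -= overshoot;                 // read less than the block end
        } else if (isLastBlock && overshoot < 0 && underConstruction) {
            // hflush case: data past the persisted block end is readable from
            // DN memory, so extend the split, capped at the full block size
            len = Math.min(len - overshoot, blockSize);
        }
        return len;
    }

    public static void main(String[] args) {
        // logical EOF inside the block: offset 100, length 50, logicalLen 120
        System.out.println(splitLength(100, 50, 120, true, false, 128)); // 20
        // hflush'd last block: persisted length 50, logicalLen far beyond,
        // extension capped at the 128-byte block size
        System.out.println(splitLength(100, 50, 260, true, true, 128)); // 128
    }
}
```

The cap in the last branch is the `Math.min` the reviewer questions: it guards against logicalLen pointing past the end of the block being written.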





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500547)
Time Spent: 50m  (was: 40m)

> Committed rows in hflush'd ACID files may be missing from query result
> --
>
> Key: HIVE-24266
> URL: https://issues.apache.org/jira/browse/HIVE-24266
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In an HDFS environment, if a writer uses hflush to write ORC ACID files 
> during a transaction commit, the committed rows may appear to be missing 
> when the table is read before the file is completely persisted (synced) to 
> disk.
> This is because hflush does not persist the new buffers to disk; it only 
> ensures that new readers can see the new content. As a result the block 
> information, which BISplitStrategy relies on, is incomplete. Although the 
> side file (_flush_length) tracks the proper end of the file being written, 
> this information is neglected in favour of the block information, and we may 
> end up generating a very short split instead of the larger, available 
> length.
> When ETLSplitStrategy is used there is not even an attempt to rely on the 
> ACID side file when calcu

[jira] [Work logged] (HIVE-24266) Committed rows in hflush'd ACID files may be missing from query result

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24266?focusedWorklogId=500555&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500555
 ]

ASF GitHub Bot logged work on HIVE-24266:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 09:51
Start Date: 14/Oct/20 09:51
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #1576:
URL: https://github.com/apache/hive/pull/1576#discussion_r504549035



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
##
@@ -1156,13 +1157,36 @@ public BISplitStrategy(Context context, FileSystem fs, Path dir,
       } else {
         TreeMap<Long, BlockLocation> blockOffsets = SHIMS.getLocationsWithOffset(fs, fileStatus);
         for (Map.Entry<Long, BlockLocation> entry : blockOffsets.entrySet()) {
-          if (entry.getKey() + entry.getValue().getLength() > logicalLen) {
+          long blockOffset = entry.getKey();
+          long blockLength = entry.getValue().getLength();
+          if (blockOffset > logicalLen) {
             //don't create splits for anything past logical EOF
-            continue;
+            //map is ordered, thus any possible entry in the iteration after this is bound to be > logicalLen
+            break;
           }
-          OrcSplit orcSplit = new OrcSplit(fileStatus.getPath(), fileKey, entry.getKey(),
-            entry.getValue().getLength(), entry.getValue().getHosts(), null, isOriginal, true,
-            deltas, -1, logicalLen, dir, offsetAndBucket);
+          long splitLength = blockLength;
+
+          long blockEndOvershoot = (blockOffset + blockLength) - logicalLen;
+          if (blockEndOvershoot > 0) {
+            // if logicalLen is placed within a block, we should make (this last) split out of the part of this block
+            // -> we should read less than block end
+            splitLength -= blockEndOvershoot;
+          } else if (blockOffsets.lastKey() == blockOffset && blockEndOvershoot < 0) {
+            // This is the last block but it ends before logicalLen
+            // This can happen with HDFS if hflush was called and blocks are not persisted to disk yet, but content
+            // is otherwise available for readers, as DNs have these buffers in memory at this time.
+            // -> we should read more than (persisted) block end, but surely not more than the whole block
+            if (fileStatus instanceof HdfsLocatedFileStatus) {
+              HdfsLocatedFileStatus hdfsFileStatus = (HdfsLocatedFileStatus)fileStatus;
+              if (hdfsFileStatus.getLocatedBlocks().isUnderConstruction()) {
+                // blockEndOvershoot is negative here...
+                splitLength = Math.min(splitLength - blockEndOvershoot, hdfsFileStatus.getBlockSize());

Review comment:
   okay, that makes sense





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500555)
Time Spent: 1h  (was: 50m)

> Committed rows in hflush'd ACID files may be missing from query result
> --
>
> Key: HIVE-24266
> URL: https://issues.apache.org/jira/browse/HIVE-24266
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In an HDFS environment, if a writer uses hflush to write ORC ACID files 
> during a transaction commit, the committed rows may appear to be missing 
> when the table is read before the file is completely persisted (synced) to 
> disk.
> This is because hflush does not persist the new buffers to disk; it only 
> ensures that new readers can see the new content. As a result the block 
> information, which BISplitStrategy relies on, is incomplete. Although the 
> side file (_flush_length) tracks the proper end of the file being written, 
> this information is neglected in favour of the block information, and we may 
> end up generating a very short split instead of the larger, available 
> length.
> When ETLSplitStrategy is used there is not even an attempt to rely on the 
> ACID side file when calculating the file length, so that needs to be fixed 
> too.
> Moreover, the newly committed rows might not appear due to OrcTail caching 
> in ETLSplitStrategy. For now I'm just going to recommend turning tha

[jira] [Assigned] (HIVE-24272) LLAP: local directories should be cleaned up on ContainerRunnerImpl.queryFailed

2020-10-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-24272:
---

Assignee: László Bodor

> LLAP: local directories should be cleaned up on 
> ContainerRunnerImpl.queryFailed
> ---
>
> Key: HIVE-24272
> URL: https://issues.apache.org/jira/browse/HIVE-24272
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Currently, QueryTracker.cleanupLocalDirs is only called on 
> [QueryTracker.queryComplete|https://github.com/apache/hive/blob/eeffb0e4e7feab7cea0dba9e7a2b63808b2023f7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryTracker.java#L308].
>  We need to investigate what happens to local files (shuffle intermediate 
> files) on query failures. I guess ContainerRunnerImpl.queryFailed could be a 
> caller for that.
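The cleanup the ticket proposes can be sketched with a self-contained analogue: the failure path triggers the same local-directory cleanup as normal completion. QueryTrackerSketch and its methods are hypothetical stand-ins for the LLAP classes named in the description, using a plain temp directory instead of LLAP's local dirs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative analogue of HIVE-24272: scratch directories are removed on the
// failure path as well as on normal completion, so failed queries do not leak
// shuffle intermediate files.
public class QueryTrackerSketch {
    private final Path localDir;

    QueryTrackerSketch(Path localDir) {
        this.localDir = localDir;
    }

    void queryComplete() throws IOException {
        cleanupLocalDirs();
    }

    // The proposed addition: the failure path also triggers cleanup.
    void queryFailed() throws IOException {
        cleanupLocalDirs();
    }

    void cleanupLocalDirs() throws IOException {
        if (Files.exists(localDir)) {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(localDir)) {
                for (Path f : files) {
                    Files.delete(f);
                }
            }
            Files.delete(localDir);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("llap-scratch");
        Files.createFile(dir.resolve("shuffle.data"));
        QueryTrackerSketch tracker = new QueryTrackerSketch(dir);
        tracker.queryFailed();                 // failure path cleans up too
        System.out.println(Files.exists(dir)); // directory is gone
    }
}
```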



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24272) LLAP: local directories should be cleaned up on ContainerRunnerImpl.queryFailed

2020-10-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24272:

Description: Currently, QueryTracker.cleanupLocalDirs is only called on 
[QueryTracker.queryComplete|https://github.com/apache/hive/blob/eeffb0e4e7feab7cea0dba9e7a2b63808b2023f7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryTracker.java#L308].
 We need to investigate what happens to local files (shuffle intermediate 
files) on query failures. I guess ContainerRunnerImpl.queryFailed could be a 
caller for that.

> LLAP: local directories should be cleaned up on 
> ContainerRunnerImpl.queryFailed
> ---
>
> Key: HIVE-24272
> URL: https://issues.apache.org/jira/browse/HIVE-24272
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Priority: Major
>
> Currently, QueryTracker.cleanupLocalDirs is only called on 
> [QueryTracker.queryComplete|https://github.com/apache/hive/blob/eeffb0e4e7feab7cea0dba9e7a2b63808b2023f7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryTracker.java#L308].
>  We need to investigate what happens to local files (shuffle intermediate 
> files) on query failures. I guess ContainerRunnerImpl.queryFailed could be a 
> caller for that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24273) grouping key is case sensitive

2020-10-14 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong reassigned HIVE-24273:
---


> grouping  key is case sensitive
> ---
>
> Key: HIVE-24273
> URL: https://issues.apache.org/jira/browse/HIVE-24273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: zhaolong
>Assignee: zhaolong
>Priority: Major
>
> grouping key is case sensitive; the following steps reproduce it:
> 1. create table testaa(name string, age int);
> 2. select GROUPING(name) from testaa group by name;
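The behavior being asked for is case-insensitive identifier resolution: GROUPING(name) and GROUPING(NAME) should resolve to the same group-by column. One simple way to get this is a map with a case-insensitive comparator; this sketch is an illustration, not Hive's actual resolver.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of case-insensitive column resolution: the column is registered
// once, and any case variant of the identifier resolves to the same entry.
public class CaseInsensitiveLookup {
    static Map<String, Integer> groupByIndex() {
        Map<String, Integer> index = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        index.put("name", 0); // column registered in lower case
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> index = groupByIndex();
        // both spellings resolve to the same grouping column
        System.out.println(index.get("NAME"));
        System.out.println(index.get("name"));
    }
}
```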



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24273) grouping key is case sensitive

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24273?focusedWorklogId=500609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500609
 ]

ASF GitHub Bot logged work on HIVE-24273:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 12:21
Start Date: 14/Oct/20 12:21
Worklog Time Spent: 10m 
  Work Description: fsilent opened a new pull request #1579:
URL: https://github.com/apache/hive/pull/1579


   
   ### What changes were proposed in this pull request?
   fix HIVE-24273 grouping key is case sensitive
   
   
   ### Why are the changes needed?
   I think this function should be case insensitive.
   
   
   ### How was this patch tested?
   In a 4.0.0 test environment.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 500609)
Remaining Estimate: 0h
Time Spent: 10m

> grouping  key is case sensitive
> ---
>
> Key: HIVE-24273
> URL: https://issues.apache.org/jira/browse/HIVE-24273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: zhaolong
>Assignee: zhaolong
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> grouping key is case sensitive; the following steps can reproduce it:
> 1.create table testaa(name string, age int);
> 2.select GROUPING(name) from testaa group by name;





[jira] [Updated] (HIVE-24273) grouping key is case sensitive

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24273:
--
Labels: pull-request-available  (was: )

> grouping  key is case sensitive
> ---
>
> Key: HIVE-24273
> URL: https://issues.apache.org/jira/browse/HIVE-24273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: zhaolong
>Assignee: zhaolong
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> grouping key is case sensitive; the following steps can reproduce it:
> 1.create table testaa(name string, age int);
> 2.select GROUPING(name) from testaa group by name;





[jira] [Updated] (HIVE-24273) grouping key is case sensitive

2020-10-14 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24273:

   Attachment: 0001-fix-HIVE-24273-grouping-key-is-case-sensitive.patch
Fix Version/s: 4.0.0
   Status: Patch Available  (was: Open)

> grouping  key is case sensitive
> ---
>
> Key: HIVE-24273
> URL: https://issues.apache.org/jira/browse/HIVE-24273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: zhaolong
>Assignee: zhaolong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: 0001-fix-HIVE-24273-grouping-key-is-case-sensitive.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> grouping key is case sensitive; the following steps can reproduce it:
> 1.create table testaa(name string, age int);
> 2.select GROUPING(name) from testaa group by name;





[jira] [Updated] (HIVE-24273) grouping key is case sensitive

2020-10-14 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong updated HIVE-24273:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> grouping  key is case sensitive
> ---
>
> Key: HIVE-24273
> URL: https://issues.apache.org/jira/browse/HIVE-24273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: zhaolong
>Assignee: zhaolong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: 0001-fix-HIVE-24273-grouping-key-is-case-sensitive.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> grouping key is case sensitive; the following steps can reproduce it:
> 1.create table testaa(name string, age int);
> 2.select GROUPING(name) from testaa group by name;





[jira] [Reopened] (HIVE-24273) grouping key is case sensitive

2020-10-14 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong reopened HIVE-24273:
-

[~chinnalalam] Hi Chinna, please help review this patch.

> grouping  key is case sensitive
> ---
>
> Key: HIVE-24273
> URL: https://issues.apache.org/jira/browse/HIVE-24273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: zhaolong
>Assignee: zhaolong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: 0001-fix-HIVE-24273-grouping-key-is-case-sensitive.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> grouping key is case sensitive; the following steps can reproduce it:
> 1.create table testaa(name string, age int);
> 2.select GROUPING(name) from testaa group by name;





[jira] [Work started] (HIVE-24273) grouping key is case sensitive

2020-10-14 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24273 started by zhaolong.
---
> grouping  key is case sensitive
> ---
>
> Key: HIVE-24273
> URL: https://issues.apache.org/jira/browse/HIVE-24273
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 4.0.0
>Reporter: zhaolong
>Assignee: zhaolong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: 0001-fix-HIVE-24273-grouping-key-is-case-sensitive.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> grouping key is case sensitive; the following steps can reproduce it:
> 1.create table testaa(name string, age int);
> 2.select GROUPING(name) from testaa group by name;





[jira] [Work logged] (HIVE-24221) Use vectorizable expression to combine multiple columns in semijoin bloom filters

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24221?focusedWorklogId=500618&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500618
 ]

ASF GitHub Bot logged work on HIVE-24221:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 12:33
Start Date: 14/Oct/20 12:33
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #1544:
URL: https://github.com/apache/hive/pull/1544#discussion_r504638460



##
File path: ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
##
@@ -233,6 +235,23 @@ public static ExprNodeGenericFuncDesc and(List<ExprNodeDesc> exps) {
     return new ExprNodeGenericFuncDesc(TypeInfoFactory.booleanTypeInfo, new GenericUDFOPAnd(), "and", flatExps);
   }
 
+  /**
+   * Create an expression for computing a hash by recursively hashing given expressions by two:
+   *
+   * Input: HASH(A, B, C, D)
+   * Output: HASH(HASH(HASH(A,B),C),D)
+   */
+  public static ExprNodeGenericFuncDesc hash(List<ExprNodeDesc> exps) {
+    assert exps.size() >= 2;
+    ExprNodeDesc hashExp = exps.get(0);
+    for (int i = 1; i < exps.size(); i++) {
+      List<ExprNodeDesc> hArgs = Arrays.asList(hashExp, exps.get(i));
+      hashExp = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo, new GenericUDFMurmurHash(), "hash", hArgs);

Review comment:
   Good catch @kgyrtkirk ! I've never noticed that we have two different 
UDFs for hashing. Indeed, having the same annotation can create quite some 
confusion and difficult-to-debug problems. I guess your suggestion is to change 
the annotation of GenericUDFMurmurHash to murmur_hash, right?
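The recursive pairwise rewrite discussed here can be sketched outside of Hive with a stand-in combine function (a minimal illustration; `combine` below is an arbitrary integer mixer, not Hive's murmur implementation):

```java
import java.util.List;

public class HashFold {
    // Stand-in for a two-argument hash UDF. Assumption: any deterministic
    // int combine function works for the illustration; Hive would use
    // GenericUDFMurmurHash here.
    static int combine(int a, int b) {
        int h = 31 * a + b;      // simple mixing, for illustration only
        h ^= (h >>> 16);
        return h;
    }

    // HASH(A, B, C, D) rewritten as HASH(HASH(HASH(A, B), C), D):
    // fold left over the values, two at a time.
    static int fold(List<Integer> values) {
        assert values.size() >= 2;
        int acc = values.get(0);
        for (int i = 1; i < values.size(); i++) {
            acc = combine(acc, values.get(i));
        }
        return acc;
    }

    public static void main(String[] args) {
        int nested = combine(combine(combine(1, 2), 3), 4);
        System.out.println(fold(List.of(1, 2, 3, 4)) == nested); // prints true
    }
}
```

Because each step is a plain binary call, every intermediate expression stays vectorizable, which is the point of the rewrite.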







Issue Time Tracking
---

Worklog Id: (was: 500618)
Time Spent: 0.5h  (was: 20m)

> Use vectorizable expression to combine multiple columns in semijoin bloom 
> filters
> -
>
> Key: HIVE-24221
> URL: https://issues.apache.org/jira/browse/HIVE-24221
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
> Environment: 
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, multi-column semijoin reducers use an n-ary call to 
> GenericUDFMurmurHash to combine multiple values into one, which is used as an 
> entry to the bloom filter. However, there are no vectorized operators that 
> treat n-ary inputs. The same goes for the vectorized implementation of 
> GenericUDFMurmurHash introduced in HIVE-23976. 
> The goal of this issue is to choose an alternative way to combine multiple 
> values into one to pass in the bloom filter comprising only vectorized 
> operators.





[jira] [Updated] (HIVE-24272) LLAP: local directories should be cleaned up on ContainerRunnerImpl.queryFailed

2020-10-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24272:

Attachment: HIVE-24272.WIP.patch

> LLAP: local directories should be cleaned up on 
> ContainerRunnerImpl.queryFailed
> ---
>
> Key: HIVE-24272
> URL: https://issues.apache.org/jira/browse/HIVE-24272
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-24272.WIP.patch
>
>
> Currently, QueryTracker.cleanupLocalDirs is only called on 
> [QueryTracker.queryComplete|https://github.com/apache/hive/blob/eeffb0e4e7feab7cea0dba9e7a2b63808b2023f7/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/QueryTracker.java#L308].
>  We need to investigate what happens to local files (shuffle intermediate 
> files) on query failures. I guess ContainerRunnerImpl.queryFailed could be a 
> caller for that.





[jira] [Work logged] (HIVE-19253) HMS ignores tableType property for external tables

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19253?focusedWorklogId=500623&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500623
 ]

ASF GitHub Bot logged work on HIVE-19253:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 12:46
Start Date: 14/Oct/20 12:46
Worklog Time Spent: 10m 
  Work Description: szehonCriteo commented on pull request #1537:
URL: https://github.com/apache/hive/pull/1537#issuecomment-708377965


   Hi Vihang, thanks for taking a look at the review.
   
   I looked into the failing test TestHiveMetastoreTransformation. I'm not at all familiar with the new feature, and I looked for its spec but could not find it, so I spent a bit of time on it.
   
   What I found is that the test was previously testing the wrong thing: it set the tableType to EXTERNAL, but DefaultMetastoreTransformer treated the table as MANAGED because of the very issue addressed by this PR. So I updated the test to check the actual expected behavior of EXTERNAL tables (EXT_READ and EXT_WRITE capabilities). The other test set transactional, which is not allowed for EXTERNAL tables, so I changed that as well.
   
   I could not reproduce the other error, in TransactionalKafkaWriterTest.
   
   Submitting again to run the tests (although they have been failing lately for a test-framework reason, I think).





Issue Time Tracking
---

Worklog Id: (was: 500623)
Time Spent: 0.5h  (was: 20m)

> HMS ignores tableType property for external tables
> --
>
> Key: HIVE-19253
> URL: https://issues.apache.org/jira/browse/HIVE-19253
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0, 4.0.0
>Reporter: Alex Kolbasov
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: newbie, pull-request-available
> Attachments: HIVE-19253.01.patch, HIVE-19253.02.patch, 
> HIVE-19253.03.patch, HIVE-19253.03.patch, HIVE-19253.04.patch, 
> HIVE-19253.05.patch, HIVE-19253.06.patch, HIVE-19253.07.patch, 
> HIVE-19253.08.patch, HIVE-19253.09.patch, HIVE-19253.10.patch, 
> HIVE-19253.11.patch, HIVE-19253.12.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When someone creates a table using the Thrift API, they may think that setting 
> tableType to {{EXTERNAL_TABLE}} creates an external table. And boom - their 
> table is gone later, because HMS will silently change it to a managed table.
> Here is the offending code:
> {code:java}
>   private MTable convertToMTable(Table tbl) throws InvalidObjectException,
>   MetaException {
> ...
> // If the table has property EXTERNAL set, update table type
> // accordingly
> String tableType = tbl.getTableType();
> boolean isExternal = 
> Boolean.parseBoolean(tbl.getParameters().get("EXTERNAL"));
> if (TableType.MANAGED_TABLE.toString().equals(tableType)) {
>   if (isExternal) {
> tableType = TableType.EXTERNAL_TABLE.toString();
>   }
> }
> if (TableType.EXTERNAL_TABLE.toString().equals(tableType)) {
>   if (!isExternal) { // Here!
> tableType = TableType.MANAGED_TABLE.toString();
>   }
> }
> {code}
> So if the EXTERNAL parameter is not set, the table type is changed to managed 
> even if it was external in the first place - which is wrong.
> Moreover, in some places the code looks at the table type property to decide the 
> table type, and in other places it looks at the parameter. HMS should really make 
> up its mind which one to use.





[jira] [Commented] (HIVE-24257) Wrong check constraint naming in Hive metastore

2020-10-14 Thread Sankar Hariappan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213897#comment-17213897
 ] 

Sankar Hariappan commented on HIVE-24257:
-

[~ashish-kumar-sharma], 
I think this change breaks backward compatibility, and the HMS client interface 
will be impacted. We cannot change it now.

> Wrong check constraint naming in Hive metastore
> ---
>
> Key: HIVE-24257
> URL: https://issues.apache.org/jira/browse/HIVE-24257
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Current 
> struct SQLCheckConstraint {
>   1: string catName, // catalog name
>   2: string table_db,// table schema
>   3: string table_name,  // table name
>   4: string column_name, // column name
>   5: string check_expression,// check expression
>   6: string dc_name, // default name
>   7: bool enable_cstr,   // Enable/Disable
>   8: bool validate_cstr, // Validate/No validate
>   9: bool rely_cstr  // Rely/No Rely
> }
> Naming for CheckConstraint is wrong; it should be cc_name instead of dc_name.





[jira] [Work logged] (HIVE-24257) Wrong check constraint naming in Hive metastore

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24257?focusedWorklogId=500641&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500641
 ]

ASF GitHub Bot logged work on HIVE-24257:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 13:07
Start Date: 14/Oct/20 13:07
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #1570:
URL: https://github.com/apache/hive/pull/1570#issuecomment-708389751


   I think this change breaks backward compatibility, and the HMS client interface will be impacted. We cannot change it now.





Issue Time Tracking
---

Worklog Id: (was: 500641)
Time Spent: 0.5h  (was: 20m)

> Wrong check constraint naming in Hive metastore
> ---
>
> Key: HIVE-24257
> URL: https://issues.apache.org/jira/browse/HIVE-24257
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Current 
> struct SQLCheckConstraint {
>   1: string catName, // catalog name
>   2: string table_db,// table schema
>   3: string table_name,  // table name
>   4: string column_name, // column name
>   5: string check_expression,// check expression
>   6: string dc_name, // default name
>   7: bool enable_cstr,   // Enable/Disable
>   8: bool validate_cstr, // Validate/No validate
>   9: bool rely_cstr  // Rely/No Rely
> }
> Naming for CheckConstraint is wrong; it should be cc_name instead of dc_name.





[jira] [Work logged] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24253?focusedWorklogId=500681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500681
 ]

ASF GitHub Bot logged work on HIVE-24253:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 14:30
Start Date: 14/Oct/20 14:30
Worklog Time Spent: 10m 
  Work Description: yongzhi opened a new pull request #1580:
URL: https://github.com/apache/hive/pull/1580


   …esides JKS by config
   
   HiveServer2:
Add new properties:
 hive.server2.keystore.type
JDBC:
 Remove hard-coded SSL_TRUST_STORE_TYPE(JKS) from jdbc connection for HS2
 Add trustStoreType param for jdbc connection
   Hive MetaStore:
 New properties for service and client:
  metastore.keystore.type
  metastore.truststore.type
 Add bcfks to the metastore.dbaccess.ssl.truststore.type StringValidator
   Tests:
Unit tests for HS2 and HMS
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 500681)
Remaining Estimate: 0h
Time Spent: 10m

> HMS and HS2 needs to support keystore/truststores types besides JKS by config
> -
>
> Key: HIVE-24253
> URL: https://issues.apache.org/jira/browse/HIVE-24253
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should make the 
> keystore type configurable, defaulting to the keystore type specified for the 
> JDK, rather than always using JKS. As HIVE-23958 did for Hive, HMS should 
> support setting additional keystore/truststore types used by different 
> applications, e.g. for FIPS crypto algorithms.
> Also, make the Hive keystore type and algorithm configurable.
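A connection URL exercising the new JDBC parameter might be assembled like this (a sketch: the `trustStoreType` parameter comes from the PR summary, while the host, store path, and the BCFKS type are placeholder assumptions):

```java
public class Hs2Url {
    // Build an HS2 JDBC URL with SSL trust-store settings. trustStoreType is
    // the parameter this change adds (per the PR summary); all concrete
    // values below are placeholders, and BCFKS availability depends on a
    // FIPS-capable security provider being installed.
    static String sslUrl(String host, int port, String db,
                         String storePath, String storeType) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db
                + ";ssl=true"
                + ";sslTrustStore=" + storePath
                + ";trustStoreType=" + storeType;
    }

    public static void main(String[] args) {
        System.out.println(sslUrl("hs2.example.com", 10000, "default",
                "/etc/hive/truststore.bcfks", "BCFKS"));
    }
}
```

Before this change, the JDBC driver hard-coded JKS as the trust-store type, so a URL like this had no way to select a different store format.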





[jira] [Updated] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24253:
--
Labels: pull-request-available  (was: )

> HMS and HS2 needs to support keystore/truststores types besides JKS by config
> -
>
> Key: HIVE-24253
> URL: https://issues.apache.org/jira/browse/HIVE-24253
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should make the 
> keystore type configurable, defaulting to the keystore type specified for the 
> JDK, rather than always using JKS. As HIVE-23958 did for Hive, HMS should 
> support setting additional keystore/truststore types used by different 
> applications, e.g. for FIPS crypto algorithms.
> Also, make the Hive keystore type and algorithm configurable.





[jira] [Assigned] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-10-14 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reassigned HIVE-24274:
-


> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. If 
> we find a match, the original query's logical plan can be replaced by a scan on 
> the materialized view.
> - Only materialized views which are enabled for rewrite can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object by 
> adding a lookup method by query text.
> - There might be more than one materialized view with the same query 
> text. In this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*
> - The scope of this first patch is rewriting only queries whose entire text can 
> be matched.
> - Use the expanded query text (fully qualified column and table names) for 
> comparison.
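The lookup described above amounts to a registry keyed by expanded query text; a toy sketch (illustrative only — real lookups would go through HiveMaterializedViewsRegistry, and the method names here are assumptions):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class MvTextRegistry {
    // Toy registry keyed by expanded query text (fully qualified column and
    // table names), as the proposal describes.
    private final Map<String, String> byQueryText = new HashMap<>();

    void register(String expandedQueryText, String mvName) {
        // First registration wins, matching "choose the first valid one".
        byQueryText.putIfAbsent(expandedQueryText, mvName);
    }

    // Returns the MV to scan instead of planning the query, if the whole
    // expanded text matches; validity checks are omitted in this sketch.
    Optional<String> rewrite(String expandedQueryText) {
        return Optional.ofNullable(byQueryText.get(expandedQueryText));
    }

    public static void main(String[] args) {
        MvTextRegistry r = new MvTextRegistry();
        r.register("select `db`.`t`.`a` from `db`.`t`", "db.mv1");
        System.out.println(r.rewrite("select `db`.`t`.`a` from `db`.`t`")
                .orElse("no match")); // prints db.mv1
    }
}
```

Keying on the expanded text is what makes the match safe: two queries that differ only in unqualified names resolve to the same key only if they reference the same objects.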





[jira] [Updated] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config

2020-10-14 Thread Yongzhi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-24253:

Component/s: HiveServer2

> HMS and HS2 needs to support keystore/truststores types besides JKS by config
> -
>
> Key: HIVE-24253
> URL: https://issues.apache.org/jira/browse/HIVE-24253
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should make the 
> keystore type configurable, defaulting to the keystore type specified for the 
> JDK, rather than always using JKS. As HIVE-23958 did for Hive, HMS should 
> support setting additional keystore/truststore types used by different 
> applications, e.g. for FIPS crypto algorithms.
> Also, make the Hive keystore type and algorithm configurable.





[jira] [Commented] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config

2020-10-14 Thread Yongzhi Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214083#comment-17214083
 ] 

Yongzhi Chen commented on HIVE-24253:
-

[~ngangam], [~thejas] could you review the PR?
https://github.com/apache/hive/pull/1580 ?

Thanks

> HMS and HS2 needs to support keystore/truststores types besides JKS by config
> -
>
> Key: HIVE-24253
> URL: https://issues.apache.org/jira/browse/HIVE-24253
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should make the 
> keystore type configurable, defaulting to the keystore type specified for the 
> JDK, rather than always using JKS. As HIVE-23958 did for Hive, HMS should 
> support setting additional keystore/truststore types used by different 
> applications, e.g. for FIPS crypto algorithms.
> Also, make the Hive keystore type and algorithm configurable.





[jira] [Updated] (HIVE-24275) Configurations to delay the deletion of obsolete files by the Cleaner

2020-10-14 Thread Kishen Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishen Das updated HIVE-24275:
--
Summary: Configurations to delay the deletion of obsolete files by the 
Cleaner  (was: Introduce a configuration to delay the deletion of obsolete 
files by the Cleaner)

> Configurations to delay the deletion of obsolete files by the Cleaner
> -
>
> Key: HIVE-24275
> URL: https://issues.apache.org/jira/browse/HIVE-24275
> Project: Hive
>  Issue Type: New Feature
>Reporter: Kishen Das
>Priority: Major
>
> Whenever compaction happens, the cleaner immediately deletes older obsolete 
> files. In certain cases it would be beneficial to retain these for a certain 
> period. For example: if you are serving the file metadata from a cache and 
> don't want to invalidate the cache during compaction for performance 
> reasons.
> For this purpose we should introduce a configuration 
> hive.compactor.delayed.cleanup.enabled, which if enabled will delay the 
> cleaning up obsolete files. There should be a separate configuration 
> CLEANER_RETENTION_TIME to specify the duration till which we should retain 
> these older obsolete files. 
> It might be beneficial to have one more configuration to decide whether to 
> retain files involved in an aborted transaction 
> hive.compactor.aborted.txn.delayed.cleanup.enabled . 





[jira] [Updated] (HIVE-24275) Configurations to delay the deletion of obsolete files by the Cleaner

2020-10-14 Thread Kishen Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishen Das updated HIVE-24275:
--
Description: 
Whenever compaction happens, the cleaner immediately deletes older obsolete 
files. In certain cases it would be beneficial to retain these for a certain 
period. For example: if you are serving the file metadata from a cache and don't 
want to invalidate the cache during compaction for performance reasons. 

For this purpose we should introduce a configuration 
hive.compactor.delayed.cleanup.enabled, which if enabled will delay the 
cleaning up obsolete files. There should be a separate configuration 
CLEANER_RETENTION_TIME to specify the duration till which we should retain 
these older obsolete files. 

It might be beneficial to have one more configuration to decide whether to 
retain files involved in an aborted transaction 
hive.compactor.aborted.txn.delayed.cleanup.enabled . 

  was:
Whenever compaction happens, the cleaner immediately deletes older obsolete 
files. In certain cases it would be beneficial to retain these for certain 
period. For example : if you are serving the file metadata from cache and don't 
want to invalidate the cache during compaction because of performance reasons. 

For this purpose we should introduce a configuration 
hive.compactor.delayed.cleanup.enabled, which if enabled will delay the 
cleaning up obsolete files. There should be a separate configuration 
CLEANER_RETENTION_TIME to specific the duration till which we should retain 
these older obsolete files. 

It might be beneficial to have one more configuration to decide whether to 
retain files involved in an aborted transaction 
hive.compactor.aborted.txn.delayed.cleanup.enabled . 


> Configurations to delay the deletion of obsolete files by the Cleaner
> -
>
> Key: HIVE-24275
> URL: https://issues.apache.org/jira/browse/HIVE-24275
> Project: Hive
>  Issue Type: New Feature
>Reporter: Kishen Das
>Priority: Major
>
> Whenever compaction happens, the cleaner immediately deletes older obsolete 
> files. In certain cases it would be beneficial to retain these for a certain 
> period. For example: if you are serving the file metadata from a cache and 
> don't want to invalidate the cache during compaction for performance 
> reasons.
> For this purpose we should introduce a configuration 
> hive.compactor.delayed.cleanup.enabled, which if enabled will delay the 
> cleaning up obsolete files. There should be a separate configuration 
> CLEANER_RETENTION_TIME to specify the duration till which we should retain 
> these older obsolete files. 
> It might be beneficial to have one more configuration to decide whether to 
> retain files involved in an aborted transaction 
> hive.compactor.aborted.txn.delayed.cleanup.enabled . 
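The proposed retention check could be as simple as the following sketch (method and parameter names here are illustrative; only the configuration names quoted above come from the proposal):

```java
public class CleanerDelay {
    // Sketch of the proposed delayed-cleanup decision: obsolete files become
    // eligible for deletion only after a retention window has elapsed since
    // the compaction that obsoleted them. delayedCleanupEnabled stands for
    // the proposed hive.compactor.delayed.cleanup.enabled flag, retentionMs
    // for the proposed CLEANER_RETENTION_TIME value.
    static boolean eligibleForCleanup(boolean delayedCleanupEnabled,
                                      long compactedAtMs, long nowMs,
                                      long retentionMs) {
        if (!delayedCleanupEnabled) {
            return true;   // current behavior: clean immediately
        }
        return nowMs - compactedAtMs >= retentionMs;
    }

    public static void main(String[] args) {
        System.out.println(eligibleForCleanup(true, 0, 1_000, 5_000)); // false
        System.out.println(eligibleForCleanup(true, 0, 6_000, 5_000)); // true
    }
}
```

A cache serving file metadata would then have at least the retention window to refresh itself before the old files disappear.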





[jira] [Work logged] (HIVE-24266) Committed rows in hflush'd ACID files may be missing from query result

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24266?focusedWorklogId=500816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500816
 ]

ASF GitHub Bot logged work on HIVE-24266:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 19:21
Start Date: 14/Oct/20 19:21
Worklog Time Spent: 10m 
  Work Description: szlta merged pull request #1576:
URL: https://github.com/apache/hive/pull/1576


   





Issue Time Tracking
---

Worklog Id: (was: 500816)
Time Spent: 1h 10m  (was: 1h)

> Committed rows in hflush'd ACID files may be missing from query result
> --
>
> Key: HIVE-24266
> URL: https://issues.apache.org/jira/browse/HIVE-24266
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In an HDFS environment, if a writer is using hflush to write ORC ACID files 
> during a transaction commit, the results might be seen as missing when 
> reading the table before this file is completely persisted to disk (thus 
> synced).
> This is because hflush does not persist the new buffers to disk; it only 
> ensures that new readers can see the new content. This causes the block 
> information, on which BISplitStrategy relies, to be incomplete. Although 
> the side file (_flush_length) tracks the proper end of the file that is being 
> written, this information is neglected in favour of the block information, 
> and we may end up generating a very short split instead of the larger, 
> available length.
> When ETLSplitStrategy is used, there is no attempt to rely on the ACID side 
> file when calculating the file length, so that needs to be fixed too.
> Moreover, newly committed rows might not appear due to OrcTail 
> caching in ETLSplitStrategy. For now I'm just going to recommend turning that 
> cache off to anyone who wants real-time row updates to be read in:
> {code:java}
> set hive.orc.cache.stripe.details.mem.size=0;  {code}
> ...as tweaking that code would probably open a can of worms.
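The split-length fix described above boils down to preferring the side-file length when one exists; a minimal sketch (method and parameter names are illustrative, not Hive's actual API):

```java
public class AcidSplitLen {
    // Sketch of the idea: block metadata under-reports the length of a file
    // that was hflush'd but not yet synced, while the _flush_length side
    // file tracks the true committed end. When a side file exists, its
    // length should win so the split covers all committed rows.
    static long effectiveLength(long blockReportedLen, long sideFileLen,
                                boolean hasSideFile) {
        return hasSideFile ? sideFileLen : blockReportedLen;
    }

    public static void main(String[] args) {
        // Block metadata says 100 bytes, but the side file records 4096
        // committed bytes: the split should span 4096.
        System.out.println(effectiveLength(100, 4_096, true));  // 4096
        System.out.println(effectiveLength(100, 0, false));     // 100
    }
}
```

The same preference would need to apply in both BISplitStrategy and ETLSplitStrategy, since the description notes the latter never consults the side file at all.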





[jira] [Resolved] (HIVE-24266) Committed rows in hflush'd ACID files may be missing from query result

2020-10-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita resolved HIVE-24266.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Committed to master. Thanks for the review [~pvary]

> Committed rows in hflush'd ACID files may be missing from query result
> --
>
> Key: HIVE-24266
> URL: https://issues.apache.org/jira/browse/HIVE-24266
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In an HDFS environment, if a writer uses hflush to write ORC ACID files 
> during a transaction commit, the committed rows may appear to be missing 
> when the table is read before the file is completely persisted (synced) to 
> disk.
> This is because hflush does not persist the new buffers to disk; it only 
> ensures that new readers can see the new content. As a result the block 
> information, which BISplitStrategy relies on, is incomplete. Although the 
> side file (_flush_length) tracks the proper end of the file being written, 
> that information is neglected in favour of the block information, and we 
> may end up generating a very short split instead of the larger, available 
> length.
> When ETLSplitStrategy is used there is no attempt at all to rely on the 
> ACID side file when calculating the file length, so that needs to be fixed 
> too.
> Moreover, newly committed rows might not appear because of OrcTail caching 
> in ETLSplitStrategy. For now I'm just going to recommend turning that cache 
> off to anyone who wants real-time row updates to be readable:
> {code:java}
> set hive.orc.cache.stripe.details.mem.size=0;  {code}
> ..as tweaking that code would probably open a can of worms..





[jira] [Assigned] (HIVE-24276) HiveServer2 loggerconf jsp Cross-Site Scripting (XSS) Vulnerability

2020-10-14 Thread Rajkumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh reassigned HIVE-24276:
-


> HiveServer2 loggerconf jsp Cross-Site Scripting (XSS) Vulnerability 
> 
>
> Key: HIVE-24276
> URL: https://issues.apache.org/jira/browse/HIVE-24276
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>






[jira] [Commented] (HIVE-24276) HiveServer2 loggerconf jsp Cross-Site Scripting (XSS) Vulnerability

2020-10-14 Thread Rajkumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214235#comment-17214235
 ] 

Rajkumar Singh commented on HIVE-24276:
---

Pull request created: https://github.com/apache/hive/pull/1581

> HiveServer2 loggerconf jsp Cross-Site Scripting (XSS) Vulnerability 
> 
>
> Key: HIVE-24276
> URL: https://issues.apache.org/jira/browse/HIVE-24276
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>






[jira] [Commented] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config

2020-10-14 Thread Thejas Nair (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214244#comment-17214244
 ] 

Thejas Nair commented on HIVE-24253:


cc [~krisden]

 

> HMS and HS2 needs to support keystore/truststores types besides JKS by config
> -
>
> Key: HIVE-24253
> URL: https://issues.apache.org/jira/browse/HIVE-24253
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should make 
> the keystore type configurable and default to the keystore type specified 
> for the JDK, instead of always using JKS. As HIVE-23958 did for Hive, HMS 
> should support setting additional keystore/truststore types used by 
> different applications, e.g. for FIPS crypto algorithms.
> Also, make the Hive keystore type and algorithm configurable.
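The fallback behaviour the description asks for can be sketched like this. The class and method names are hypothetical illustrations; `KeyStore.getDefaultType()` is the real JDK call:

```java
import java.security.KeyStore;

// Minimal sketch of the requested fallback (class and method names are
// hypothetical; KeyStore.getDefaultType() is the real JDK API): an empty
// configured keystore type falls back to the JDK default instead of JKS.
public class KeystoreTypeSketch {

    static String resolveType(String configured) {
        String trimmed = configured == null ? "" : configured.trim();
        // getDefaultType() honours the "keystore.type" security property
        // ("pkcs12" on recent JDKs), which a FIPS provider can override
        return trimmed.isEmpty() ? KeyStore.getDefaultType() : trimmed;
    }

    public static void main(String[] args) {
        System.out.println(resolveType(" JKS "));  // prints JKS
        System.out.println(resolveType(null));     // JDK default type
    }
}
```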





[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=500843&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500843
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 20:50
Start Date: 14/Oct/20 20:50
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r504964132



##
File path: standalone-metastore/metastore-server/src/main/resources/package.jdo
##
@@ -1549,6 +1549,83 @@
 
   
 
+  (the ~80 added lines of XML class and field definitions are not shown: the element tags were stripped by the mail archiver and only indentation survives)

Review comment:
   Thanks @zeroflag for working on this, it is very cool.
   
   There are a few challenges with keeping redundant information outside of the 
procedure text. One of them is that while the semantics of the procedure 
definition may be well defined, the representation of the other fields may not 
be defined as clearly. Another usual challenge is that if there are any changes 
in the future, you will have to ensure backwards compatibility for those fields 
too.
   
   Going through the specific implementation of type handling, it seems you are 
keeping length and scale in different fields. This is not done when we store 
types in HMS for Hive; any reason for that? Also, checking the documentation, 
it seems HPL/SQL can apply some transformations to the field type. Are those 
transformations applied before storing the definition, or later on?
   
   Based on that, I think keeping a lean representation in HMS has its 
advantages, as @kgyrtkirk mentioned, specifically if those fields are not 
actually being used for the time being. It's preferable to add those fields 
later on if needed, rather than backtracking and removing fields.







Issue Time Tracking
---

Worklog Id: (was: 500843)
Time Spent: 1h 50m  (was: 1h 40m)

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.





[jira] [Work logged] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24253?focusedWorklogId=500852&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500852
 ]

ASF GitHub Bot logged work on HIVE-24253:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 21:17
Start Date: 14/Oct/20 21:17
Worklog Time Spent: 10m 
  Work Description: risdenk commented on a change in pull request #1580:
URL: https://github.com/apache/hive/pull/1580#discussion_r504976602



##
File path: 
service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java
##
@@ -136,6 +136,12 @@ public void onClosed(Connection connection) {
   ConfVars.HIVE_SERVER2_SSL_KEYSTORE_PATH.varname 
   + " Not configured for SSL connection");
 }
+String keyStoreType = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_SSL_KEYSTORE_TYPE).trim();
+if (keyStoreType.isEmpty()) {
+  keyStoreType = KeyStore.getDefaultType();
+}
+String keyStoreAlgorithm = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_SSL_KEYSTORE_ALGORITHM).trim();

Review comment:
   `keyStoreAlgorithm` looks unused? Might have to be used around line 153?







Issue Time Tracking
---

Worklog Id: (was: 500852)
Time Spent: 20m  (was: 10m)

> HMS and HS2 needs to support keystore/truststores types besides JKS by config
> -
>
> Key: HIVE-24253
> URL: https://issues.apache.org/jira/browse/HIVE-24253
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should make 
> the keystore type configurable and default to the keystore type specified 
> for the JDK, instead of always using JKS. As HIVE-23958 did for Hive, HMS 
> should support setting additional keystore/truststore types used by 
> different applications, e.g. for FIPS crypto algorithms.
> Also, make the Hive keystore type and algorithm configurable.





[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=500883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500883
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 22:56
Start Date: 14/Oct/20 22:56
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r505043441



##
File path: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
##
@@ -1852,6 +1863,9 @@ public void close() throws IOException {
   closeSparkSession();
   registry.closeCUDFLoaders();
   dropSessionPaths(sessionConf);
+  if (pathCleaner != null) {
+pathCleaner.shutdown();

Review comment:
   I don't see how this can happen. Everything that needs to be deleted is 
guaranteed to be in PathCleaner's deleteActions list by the time shutdown is 
called, and PathCleaner goes over everything in the list before exiting. See: 
https://github.com/apache/hive/pull/1577/files#diff-80bcef5e921dc6ac67a42ad691b771943f0ffa608454dae389c76b29139c61f9R65
   
   I didn't get your point about the 1024 entries either. That is just the 
initial capacity of the intermediate ArrayList, not an upper bound.







Issue Time Tracking
---

Worklog Id: (was: 500883)
Time Spent: 0.5h  (was: 20m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In a cloud environment, scratchdir cleanup at the end of a query may take a 
> long time. This causes the client to hang for up to a minute even after the 
> results have been streamed back; during this time the client just waits for 
> the cleanup to finish. The cleanup can instead take place in the background 
> in HiveServer.





[jira] [Work logged] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24253?focusedWorklogId=500888&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500888
 ]

ASF GitHub Bot logged work on HIVE-24253:
-

Author: ASF GitHub Bot
Created on: 14/Oct/20 23:53
Start Date: 14/Oct/20 23:53
Worklog Time Spent: 10m 
  Work Description: yongzhi commented on a change in pull request #1580:
URL: https://github.com/apache/hive/pull/1580#discussion_r505082133



##
File path: 
service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java
##
@@ -136,6 +136,12 @@ public void onClosed(Connection connection) {
   ConfVars.HIVE_SERVER2_SSL_KEYSTORE_PATH.varname 
   + " Not configured for SSL connection");
 }
+String keyStoreType = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_SSL_KEYSTORE_TYPE).trim();
+if (keyStoreType.isEmpty()) {
+  keyStoreType = KeyStore.getDefaultType();
+}
+String keyStoreAlgorithm = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_SSL_KEYSTORE_ALGORITHM).trim();

Review comment:
   I will add







Issue Time Tracking
---

Worklog Id: (was: 500888)
Time Spent: 0.5h  (was: 20m)

> HMS and HS2 needs to support keystore/truststores types besides JKS by config
> -
>
> Key: HIVE-24253
> URL: https://issues.apache.org/jira/browse/HIVE-24253
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should make 
> the keystore type configurable and default to the keystore type specified 
> for the JDK, instead of always using JKS. As HIVE-23958 did for Hive, HMS 
> should support setting additional keystore/truststore types used by 
> different applications, e.g. for FIPS crypto algorithms.
> Also, make the Hive keystore type and algorithm configurable.





[jira] [Work logged] (HIVE-23955) Classification of Error Codes in Replication

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23955?focusedWorklogId=500903&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500903
 ]

ASF GitHub Bot logged work on HIVE-23955:
-

Author: ASF GitHub Bot
Created on: 15/Oct/20 00:55
Start Date: 15/Oct/20 00:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1358:
URL: https://github.com/apache/hive/pull/1358


   





Issue Time Tracking
---

Worklog Id: (was: 500903)
Time Spent: 2h  (was: 1h 50m)

> Classification of Error Codes in Replication
> 
>
> Key: HIVE-23955
> URL: https://issues.apache.org/jira/browse/HIVE-23955
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23955.01.patch, HIVE-23955.02.patch, 
> HIVE-23955.03.patch, HIVE-23955.04.patch, Retry Logic for Replication.pdf
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23811) deleteReader SARG rowId/bucketId are not getting validated properly

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23811?focusedWorklogId=500904&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500904
 ]

ASF GitHub Bot logged work on HIVE-23811:
-

Author: ASF GitHub Bot
Created on: 15/Oct/20 00:55
Start Date: 15/Oct/20 00:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1214:
URL: https://github.com/apache/hive/pull/1214


   





Issue Time Tracking
---

Worklog Id: (was: 500904)
Time Spent: 40m  (was: 0.5h)

> deleteReader SARG rowId/bucketId are not getting validated properly
> ---
>
> Key: HIVE-23811
> URL: https://issues.apache.org/jira/browse/HIVE-23811
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Though we iterate over the min/max stripe indexes, we always seem to pick 
> the ColumnStats from the first stripe:
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java#L596]
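The fix can be illustrated with a toy per-stripe check (the names here are hypothetical, not the actual VectorizedOrcAcidRowBatchReader code): the point is that the stats of stripe i, not stripe 0, must be consulted on each iteration.

```java
// Hypothetical sketch of validating a SARG rowId against per-stripe column
// stats: every stripe in the [firstStripe, lastStripe] range is checked with
// its own stats, instead of always reading the stats of stripe 0.
public class StripeStatsSketch {

    // minRowId[i] / maxRowId[i] are the rowId bounds recorded for stripe i
    static boolean mayContainRow(long[] minRowId, long[] maxRowId,
                                 int firstStripe, int lastStripe, long rowId) {
        for (int i = firstStripe; i <= lastStripe; i++) {
            // use stripe i's stats, not stripe 0's
            if (rowId >= minRowId[i] && rowId <= maxRowId[i]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] min = {0, 100, 200};
        long[] max = {99, 199, 299};
        // rowId 250 falls inside stripe 2's bounds
        System.out.println(mayContainRow(min, max, 0, 2, 250)); // prints true
    }
}
```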





[jira] [Work logged] (HIVE-24042) Fix typo in MetastoreConf.java

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24042?focusedWorklogId=500902&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500902
 ]

ASF GitHub Bot logged work on HIVE-24042:
-

Author: ASF GitHub Bot
Created on: 15/Oct/20 00:55
Start Date: 15/Oct/20 00:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1406:
URL: https://github.com/apache/hive/pull/1406#issuecomment-708794434


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 500902)
Remaining Estimate: 0h
Time Spent: 10m

> Fix typo in MetastoreConf.java
> --
>
> Key: HIVE-24042
> URL: https://issues.apache.org/jira/browse/HIVE-24042
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: mmuuooaa
>Priority: Trivial
> Attachments: HIVE-24042.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Fix a typo in MetastoreConf.java: correct the word "riven" in the package 
> name to "hadoop.hive.metastore".





[jira] [Work logged] (HIVE-24043) Retain original path info in Warehouse.makeSpecFromName()'s logger

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24043?focusedWorklogId=500901&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500901
 ]

ASF GitHub Bot logged work on HIVE-24043:
-

Author: ASF GitHub Bot
Created on: 15/Oct/20 00:55
Start Date: 15/Oct/20 00:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1407:
URL: https://github.com/apache/hive/pull/1407#issuecomment-708794402


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 500901)
Time Spent: 20m  (was: 10m)

> Retain original path info in Warehouse.makeSpecFromName()'s logger
> --
>
> Key: HIVE-24043
> URL: https://issues.apache.org/jira/browse/HIVE-24043
> Project: Hive
>  Issue Type: Improvement
>Reporter: mmuuooaa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24043.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The warn logger in Warehouse.makeSpecFromName() does not retain the 
> original path info. For example:
> {code:java}
> 20/08/07 14:32:28 WARN warehouse: Cannot create partition spec from 
> hdfs://nameservice/; missing keys [dt1]
> {code}
> The log was expected to contain the full HDFS path, but only 
> 'hdfs://nameservice' was printed.
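A toy version of the parsing involved (illustrative names only, not the real Warehouse.makeSpecFromName) shows why keeping the full path in the warning matters when no key=value components are found:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of makeSpecFromName-style parsing: walk "key=value"
// path components; when none are found, warn with the *full* original path
// rather than just the filesystem root, so the log stays debuggable.
public class PartSpecSketch {

    static Map<String, String> makeSpec(String path) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String component : path.split("/")) {
            int eq = component.indexOf('=');
            if (eq > 0) {
                spec.put(component.substring(0, eq), component.substring(eq + 1));
            }
        }
        if (spec.isEmpty()) {
            // keep the original path in the log line
            System.out.println("WARN: cannot create partition spec from " + path);
        }
        return spec;
    }

    public static void main(String[] args) {
        System.out.println(makeSpec("hdfs://nameservice/warehouse/t/dt1=2020-08-07"));
        // prints {dt1=2020-08-07}
    }
}
```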





[jira] [Updated] (HIVE-24042) Fix typo in MetastoreConf.java

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24042:
--
Labels: pull-request-available  (was: )

> Fix typo in MetastoreConf.java
> --
>
> Key: HIVE-24042
> URL: https://issues.apache.org/jira/browse/HIVE-24042
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: mmuuooaa
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: HIVE-24042.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Fix a typo in MetastoreConf.java: correct the word "riven" in the package 
> name to "hadoop.hive.metastore".





[jira] [Comment Edited] (HIVE-22344) I can't run hive in command line

2020-10-14 Thread wagnkai (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214354#comment-17214354
 ] 

wagnkai edited comment on HIVE-22344 at 10/15/20, 1:23 AM:
---

Incompatibility between Hive's Guava version and Hadoop's.


was (Author: wank125):
 
 
 Incompatibility between hive's guava version and Hadoop

> I can't run hive in command line
> 
>
> Key: HIVE-22344
> URL: https://issues.apache.org/jira/browse/HIVE-22344
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 3.1.2
> Environment: hive: 3.1.2
> hadoop 3.2.1
>  
>Reporter: Smith Cruise
>Priority: Blocker
>
> I can't run hive from the command line. It tells me:
> {code:java}
> [hadoop@master lib]$ hive
> which: no hbase in 
> (/home/hadoop/apache-hive-3.1.2-bin/bin:{{pwd}}/bin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/home/hadoop/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/home/hadoop/hadoop3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:536)
> at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:554)
> at org.apache.hadoop.mapred.JobConf.(JobConf.java:448)
> at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5141)
> at org.apache.hadoop.hive.conf.HiveConf.(HiveConf.java:5099)
> at 
> org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:97)
> at 
> org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> {code}
> I don't know what's wrong with it.





[jira] [Commented] (HIVE-22344) I can't run hive in command line

2020-10-14 Thread wagnkai (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214354#comment-17214354
 ] 

wagnkai commented on HIVE-22344:


Incompatibility between Hive's Guava version and Hadoop's.

> I can't run hive in command line
> 
>
> Key: HIVE-22344
> URL: https://issues.apache.org/jira/browse/HIVE-22344
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 3.1.2
> Environment: hive: 3.1.2
> hadoop 3.2.1
>  
>Reporter: Smith Cruise
>Priority: Blocker
>
> I can't run hive from the command line. It tells me:
> {code:java}
> [hadoop@master lib]$ hive
> which: no hbase in 
> (/home/hadoop/apache-hive-3.1.2-bin/bin:{{pwd}}/bin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/home/hadoop/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/home/hadoop/hadoop3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
> at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:536)
> at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:554)
> at org.apache.hadoop.mapred.JobConf.(JobConf.java:448)
> at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5141)
> at org.apache.hadoop.hive.conf.HiveConf.(HiveConf.java:5099)
> at 
> org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:97)
> at 
> org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> {code}
> I don't know what's wrong with it.





[jira] [Work logged] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config

2020-10-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24253?focusedWorklogId=500929&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-500929
 ]

ASF GitHub Bot logged work on HIVE-24253:
-

Author: ASF GitHub Bot
Created on: 15/Oct/20 02:45
Start Date: 15/Oct/20 02:45
Worklog Time Spent: 10m 
  Work Description: yongzhi commented on a change in pull request #1580:
URL: https://github.com/apache/hive/pull/1580#discussion_r505131640



##
File path: 
service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java
##
@@ -136,6 +136,12 @@ public void onClosed(Connection connection) {
   ConfVars.HIVE_SERVER2_SSL_KEYSTORE_PATH.varname 
   + " Not configured for SSL connection");
 }
+String keyStoreType = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_SSL_KEYSTORE_TYPE).trim();
+if (keyStoreType.isEmpty()) {
+  keyStoreType = KeyStore.getDefaultType();
+}
+String keyStoreAlgorithm = 
hiveConf.getVar(ConfVars.HIVE_SERVER2_SSL_KEYSTORE_ALGORITHM).trim();

Review comment:
   For the MetastoreConf changes in the tests: because the previous HiveConf 
metastore-related properties are all deprecated, I just switched to the 
corresponding properties in MetastoreConf.







Issue Time Tracking
---

Worklog Id: (was: 500929)
Time Spent: 40m  (was: 0.5h)

> HMS and HS2 needs to support keystore/truststores types besides JKS by config
> -
>
> Key: HIVE-24253
> URL: https://issues.apache.org/jira/browse/HIVE-24253
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When HiveMetaStoreClient connects to HMS with SSL enabled, HMS should make 
> the keystore type configurable and default to the keystore type specified 
> for the JDK, instead of always using JKS. As HIVE-23958 did for Hive, HMS 
> should support setting additional keystore/truststore types used by 
> different applications, e.g. for FIPS crypto algorithms.
> Also, make the Hive keystore type and algorithm configurable.





[jira] [Assigned] (HIVE-24277) Temporary table with constraints is persisted in HMS

2020-10-14 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-24277:
--


> Temporary table with constraints is persisted in HMS
> 
>
> Key: HIVE-24277
> URL: https://issues.apache.org/jira/browse/HIVE-24277
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Run below in a session
> {noformat}
> 0: jdbc:hive2://zk1-nikhil.q5dzd3jj30bupgln50> create temporary table ttemp 
> (id int default 0);
> INFO  : Compiling 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb): 
> create temporary table ttemp (id int default 0)
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb); 
> Time taken: 0.625 seconds
> INFO  : Executing 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb): 
> create temporary table ttemp (id int default 0)
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb); 
> Time taken: 4.02 seconds
> INFO  : OK
> No rows affected (5.32 seconds)
> {noformat}
> Running "show tables" in another session will return that temporary table in 
> output
> {noformat}
> 0: jdbc:hive2://zk1-nikhil.q5dzd3jj30bupgln50> show tables
> . . . . . . . . . . . . . . . . . . . . . . .> ;
> INFO  : Compiling 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7): 
> show tables
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from 
> deserializer)], properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7); 
> Time taken: 0.065 seconds
> INFO  : Executing 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7): 
> show tables
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7); 
> Time taken: 0.057 seconds
> INFO  : OK
> +--+
> | tab_name |
> +--+
> | ttemp|
> +--+
> {noformat}
>  


