[jira] [Work logged] (HIVE-25086) Create Ranger Deny Policy for replication db in all cases if hive.repl.ranger.target.deny.policy is set to true.

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25086?focusedWorklogId=601019&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-601019
 ]

ASF GitHub Bot logged work on HIVE-25086:
-

Author: ASF GitHub Bot
Created on: 24/May/21 05:33
Start Date: 24/May/21 05:33
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2240:
URL: https://github.com/apache/hive/pull/2240#discussion_r637698171



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -420,8 +458,8 @@ boolean checkConnectionPlain(String url, HiveConf hiveConf) 
{
   }
 
   @Override
-  public List addDenyPolicies(List rangerPolicies, 
String rangerServiceName,
-String sourceDb, String targetDb) 
throws SemanticException {
+  public RangerPolicy getDenyPolicyForReplicatedDb(String rangerServiceName,

Review comment:
   yes, while defining ranger policy name.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 601019)
Time Spent: 2h 10m  (was: 2h)

> Create Ranger Deny Policy for replication db in all cases if 
> hive.repl.ranger.target.deny.policy is set to true.
> 
>
> Key: HIVE-25086
> URL: https://issues.apache.org/jira/browse/HIVE-25086
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25086) Create Ranger Deny Policy for replication db in all cases if hive.repl.ranger.target.deny.policy is set to true.

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25086?focusedWorklogId=601018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-601018
 ]

ASF GitHub Bot logged work on HIVE-25086:
-

Author: ASF GitHub Bot
Created on: 24/May/21 05:32
Start Date: 24/May/21 05:32
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2240:
URL: https://github.com/apache/hive/pull/2240#discussion_r637697872



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -444,7 +482,7 @@ boolean checkConnectionPlain(String url, HiveConf hiveConf) 
{
 List denyExceptionsPolicyItemAccesses 
= new ArrayList();
 
-resourceNameList.add(sourceDb);
+resourceNameList.add(targetDb);
 resourceNameList.add("dummy");

Review comment:
   It is required to prevent the new deny policy from overriding the existing 
policy. Ref. HIVE-24371.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 601018)
Time Spent: 2h  (was: 1h 50m)

> Create Ranger Deny Policy for replication db in all cases if 
> hive.repl.ranger.target.deny.policy is set to true.
> 
>
> Key: HIVE-25086
> URL: https://issues.apache.org/jira/browse/HIVE-25086
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24663) Batch process in ColStatsProcessor for partitions.

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24663?focusedWorklogId=601013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-601013
 ]

ASF GitHub Bot logged work on HIVE-24663:
-

Author: ASF GitHub Bot
Created on: 24/May/21 05:12
Start Date: 24/May/21 05:12
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on a change in pull request #2266:
URL: https://github.com/apache/hive/pull/2266#discussion_r637691746



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/info/desc/DescTableOperation.java
##
@@ -213,11 +214,14 @@ private void getColumnDataColPathSpecified(Table table, 
Partition part, List partitions = new ArrayList();
-  partitions.add(part.getName());
+  // The partition name is converted to lowercase before generating the 
stats. So we should use the same

Review comment:
   The lowercase conversion applies only to the column name; the value part is 
not converted.
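
A minimal, stand-alone illustration of that point (hypothetical `normalize` helper, not the DescTableOperation code): only the column-name part of a `col=value` partition spec is lowercased, the value is left untouched.

{code:java}
import java.util.Locale;

// Hedged sketch: lowercase only the column-name part of a "col=value" partition spec,
// leaving the value as-is, mirroring the behaviour described in the comment above.
public class PartitionSpecCaseDemo {
  static String normalize(String partSpec) {
    int eq = partSpec.indexOf('=');
    String col = partSpec.substring(0, eq).toLowerCase(Locale.ROOT); // column name lowercased
    String val = partSpec.substring(eq + 1);                          // value untouched
    return col + "=" + val;
  }

  public static void main(String[] args) {
    System.out.println(normalize("Country=US")); // prints country=US
  }
}
{code}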




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 601013)
Time Spent: 2h 10m  (was: 2h)

> Batch process in ColStatsProcessor for partitions.
> --
>
> Key: HIVE-24663
> URL: https://issues.apache.org/jira/browse/HIVE-24663
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When a large number of partitions (>20K) is processed, ColStatsProcessor runs 
> into DB issues. 
> {{db.setPartitionColumnStatistics(request);}} gets stuck for hours, and in 
> some cases Postgres stops processing. 
> It would be good to introduce small batches for stats gathering in 
> ColStatsProcessor instead of a single bulk update.
> Ref: 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L181
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L199
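
The gist of the proposed change can be sketched in plain Java; the batch size and the `submit` callback (standing in for a call like db.setPartitionColumnStatistics on a smaller request) are illustrative assumptions, not the actual patch.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hedged sketch: push column-stats updates in fixed-size batches instead of one
// bulk call, so the metastore backend never has to absorb a huge single request.
public class StatsBatchingSketch {
  static <T> void processInBatches(List<T> items, int batchSize, Consumer<List<T>> submit) {
    List<T> batch = new ArrayList<>(batchSize);
    for (T item : items) {
      batch.add(item);
      if (batch.size() == batchSize) {
        submit.accept(batch);           // e.g. one setPartitionColumnStatistics call per batch
        batch = new ArrayList<>(batchSize);
      }
    }
    if (!batch.isEmpty()) {
      submit.accept(batch);             // flush the remainder
    }
  }

  public static void main(String[] args) {
    List<Integer> fakePartitionStats = new ArrayList<>();
    for (int i = 0; i < 25; i++) {
      fakePartitionStats.add(i);
    }
    processInBatches(fakePartitionStats, 10,
        batch -> System.out.println("submitting " + batch.size() + " stats entries"));
  }
}
{code}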



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24663) Batch process in ColStatsProcessor for partitions.

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24663?focusedWorklogId=601012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-601012
 ]

ASF GitHub Bot logged work on HIVE-24663:
-

Author: ASF GitHub Bot
Created on: 24/May/21 05:11
Start Date: 24/May/21 05:11
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on a change in pull request #2266:
URL: https://github.com/apache/hive/pull/2266#discussion_r637691386



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -5390,6 +5406,493 @@ public void countOpenTxns() throws MetaException {
 }
   }
 
+  private void cleanOldStatsFromPartColStatTable(Map 
statsPartInfoMap,
+ Map 
newStatsMap,
+ Connection dbConn) throws 
SQLException {
+PreparedStatement statementDelete = null;
+int numRows = 0;
+int maxNumRows = MetastoreConf.getIntVar(conf, 
ConfVars.DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE);
+String delete = "DELETE FROM \"PART_COL_STATS\" where \"PART_ID\" = ? AND 
\"COLUMN_NAME\" = ?";
+
+try {
+  statementDelete = dbConn.prepareStatement(delete);
+  for (Map.Entry entry : newStatsMap.entrySet()) {
+// If the partition does not exist (deleted/removed by some other 
task), no need to update the stats.
+if (!statsPartInfoMap.containsKey(entry.getKey())) {
+  continue;
+}
+
+ColumnStatistics colStats = (ColumnStatistics) entry.getValue();
+for (ColumnStatisticsObj statisticsObj : colStats.getStatsObj()) {
+  statementDelete.setLong(1, 
statsPartInfoMap.get(entry.getKey()).partitionId);
+  statementDelete.setString(2, statisticsObj.getColName());
+  numRows++;
+  statementDelete.addBatch();
+  if (numRows == maxNumRows) {
+statementDelete.executeBatch();
+numRows = 0;
+LOG.info("Executed delete " + delete + " for numRows " + numRows);
+  }
+}
+  }
+
+  if (numRows != 0) {
+statementDelete.executeBatch();
+  }
+} finally {
+  closeStmt(statementDelete);
+}
+  }
+
+  private long getMaxCSId(Connection dbConn) throws SQLException {
+Statement stmtInt = null;
+ResultSet rsInt = null;
+long maxCsId = 0;
+try {
+  stmtInt = dbConn.createStatement();
+  while (maxCsId == 0) {
+String query = "SELECT \"NEXT_VAL\" FROM \"SEQUENCE_TABLE\" WHERE 
\"SEQUENCE_NAME\"= "

Review comment:
   done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 601012)
Time Spent: 2h  (was: 1h 50m)

> Batch process in ColStatsProcessor for partitions.
> --
>
> Key: HIVE-24663
> URL: https://issues.apache.org/jira/browse/HIVE-24663
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When a large number of partitions (>20K) is processed, ColStatsProcessor runs 
> into DB issues. 
> {{db.setPartitionColumnStatistics(request);}} gets stuck for hours, and in 
> some cases Postgres stops processing. 
> It would be good to introduce small batches for stats gathering in 
> ColStatsProcessor instead of a single bulk update.
> Ref: 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L181
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L199



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?focusedWorklogId=601011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-601011
 ]

ASF GitHub Bot logged work on HIVE-25117:
-

Author: ASF GitHub Bot
Created on: 24/May/21 04:55
Start Date: 24/May/21 04:55
Worklog Time Spent: 10m 
  Work Description: ramesh0201 commented on a change in pull request #2286:
URL: https://github.com/apache/hive/pull/2286#discussion_r637686832



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java
##
@@ -4962,9 +4969,8 @@ private static void createVectorPTFDesc(Operator ptfOp,
 evaluatorWindowFrameDefs,
 evaluatorInputExprNodeDescLists);
 
-TypeInfo[] reducerBatchTypeInfos = vContext.getAllTypeInfos();
-
 vectorPTFDesc.setReducerBatchTypeInfos(reducerBatchTypeInfos);
+
vectorPTFDesc.setReducerBatchDataTypePhysicalVariations(reducerBatchDataTypePhysicalVariations);

Review comment:
   Resolved, thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 601011)
Time Spent: 50m  (was: 40m)

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Only reproduces when there is at least 1 buffered batch, so 2 rows with 
> 1 row/batch are needed:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}
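
The failure mode in the stack trace can be reproduced in isolation with the column-vector classes from hive-storage-api (the constructor arguments below are illustrative): a column materialized as DecimalColumnVector cannot be downcast to LongColumnVector, which is what happens when the Decimal64 physical variation of a column is not tracked for the buffered batches.

{code:java}
import org.apache.hadoop.hive.ql.exec.vector.ColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;

// Hedged sketch (assumes hive-storage-api on the classpath): casting a column that
// actually holds HiveDecimal values to the long-backed Decimal64 representation fails
// with the same ClassCastException shown above.
public class Decimal64CastDemo {
  public static void main(String[] args) {
    ColumnVector col = new DecimalColumnVector(1024, 10, 2); // what the batch really holds
    LongColumnVector asLong = (LongColumnVector) col;        // throws ClassCastException
    System.out.println(asLong.vector.length);
  }
}
{code}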



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?focusedWorklogId=601008&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-601008
 ]

ASF GitHub Bot logged work on HIVE-25117:
-

Author: ASF GitHub Bot
Created on: 24/May/21 04:44
Start Date: 24/May/21 04:44
Worklog Time Spent: 10m 
  Work Description: ramesh0201 commented on pull request #2286:
URL: https://github.com/apache/hive/pull/2286#issuecomment-846728695


   None of the vector PTF decimal operators has decimal64 support; we need 
corresponding decimal64 versions of VectorPTFEvaluatorDecimalSum and similar 
classes. I will create a jira to handle this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 601008)
Time Spent: 40m  (was: 0.5h)

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Only reproduces when there is at least 1 buffered batch, so 2 rows with 
> 1 row/batch are needed:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25086) Create Ranger Deny Policy for replication db in all cases if hive.repl.ranger.target.deny.policy is set to true.

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25086?focusedWorklogId=601006&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-601006
 ]

ASF GitHub Bot logged work on HIVE-25086:
-

Author: ASF GitHub Bot
Created on: 24/May/21 04:11
Start Date: 24/May/21 04:11
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #2240:
URL: https://github.com/apache/hive/pull/2240#discussion_r637674090



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -79,6 +74,7 @@
   private static final String RANGER_REST_URL_EXPORTJSONFILE = 
"service/plugins/policies/exportJson";
   private static final String RANGER_REST_URL_IMPORTJSONFILE =
   "service/plugins/policies/importPoliciesFromFile";
+  private static final String RANGER_REST_URL_DELETEPOLICY = 
"service/public/v2/api/policy";

Review comment:
   is this not available as part of service/plugins/policies?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java
##
@@ -127,6 +127,7 @@ public int execute() {
   if (shouldLoadAtlasMetadata()) {
 addAtlasLoadTask();
   }
+  initiateRangerDenytask();

Review comment:
   You can do this only if the deny config is enabled.
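
A minimal sketch of what the reviewer is asking for, with java.util.Properties standing in for HiveConf and the property name taken from the JIRA title (the actual patch may read a HiveConf enum instead):

{code:java}
import java.util.Properties;

// Hedged sketch: only schedule the Ranger deny task when the target-deny config is on.
public class DenyTaskGateSketch {
  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty("hive.repl.ranger.target.deny.policy", "true");

    boolean addDenyPolicy =
        Boolean.parseBoolean(conf.getProperty("hive.repl.ranger.target.deny.policy", "false"));
    if (addDenyPolicy) {
      System.out.println("scheduling RangerDenyTask for the replicated db");
    } else {
      System.out.println("deny policy disabled, skipping RangerDenyTask");
    }
  }
}
{code}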

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -420,8 +458,8 @@ boolean checkConnectionPlain(String url, HiveConf hiveConf) 
{
   }
 
   @Override
-  public List addDenyPolicies(List rangerPolicies, 
String rangerServiceName,
-String sourceDb, String targetDb) 
throws SemanticException {
+  public RangerPolicy getDenyPolicyForReplicatedDb(String rangerServiceName,

Review comment:
   Are you using the source db param?

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java
##
@@ -97,7 +97,6 @@ public void targetAndSourceHaveDifferentEncryptionZoneKeys() 
throws Throwable {
 
 WarehouseInstance replica = new WarehouseInstance(LOG, miniDFSCluster,
 new HashMap() {{
-  put(HiveConf.ConfVars.HIVE_IN_TEST.varname, "false");

Review comment:
   why is this removed?

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -444,7 +482,7 @@ boolean checkConnectionPlain(String url, HiveConf hiveConf) 
{
 List denyExceptionsPolicyItemAccesses 
= new ArrayList();
 
-resourceNameList.add(sourceDb);
+resourceNameList.add(targetDb);
 resourceNameList.add("dummy");

Review comment:
   is this dummy needed?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/RangerDenyTask.java
##
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.exec.repl;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.commons.collections.CollectionUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.utils.SecurityUtils;
+import org.apache.hadoop.hive.ql.ErrorMsg;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.RangerRestClient;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.RangerRestClientImpl;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.NoOpRangerRestClient;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.RangerPolicy;
+import org.apache.hadoop.hive.ql.exec.repl.ranger.RangerExportPolicyList;
+import org.apache.hadoop.hive.ql.exec.repl.util.ReplUtils;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.repl.metric.event.Status;
+import org.apache.hadoop.hive.ql.plan.api.StageType;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Serializable;
+import java.net.URL;
+import java.util.ArrayList;
+/**
+ * RangerDenyTask.
+ *
+ * Task to add Ranger Deny 

[jira] [Work logged] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?focusedWorklogId=601005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-601005
 ]

ASF GitHub Bot logged work on HIVE-25117:
-

Author: ASF GitHub Bot
Created on: 24/May/21 03:45
Start Date: 24/May/21 03:45
Worklog Time Spent: 10m 
  Work Description: ramesh0201 commented on a change in pull request #2286:
URL: https://github.com/apache/hive/pull/2286#discussion_r637668639



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFOperator.java
##
@@ -250,13 +253,16 @@ protected VectorizedRowBatch setupOverflowBatch() throws 
HiveException {
 for (int i = 0; i < outputProjectionColumnMap.length; i++) {
   int outputColumn = outputProjectionColumnMap[i];
   String typeName = outputTypeInfos[i].getTypeName();
-  allocateOverflowBatchColumnVector(overflowBatch, outputColumn, typeName);
+  allocateOverflowBatchColumnVector(overflowBatch, outputColumn, typeName, 
outputDataTypePhysicalVariations[i]);
 }
 
 // Now, add any scratch columns needed for children operators.
 int outputColumn = initialColumnCount;
+DataTypePhysicalVariation[] dataTypePhysicalVariations = 
vOutContext.getScratchDataTypePhysicalVariations();
 for (String typeName : vOutContext.getScratchColumnTypeNames()) {
-  allocateOverflowBatchColumnVector(overflowBatch, outputColumn++, 
typeName);
+  allocateOverflowBatchColumnVector(overflowBatch, outputColumn, typeName,
+  dataTypePhysicalVariations[outputColumn-initialColumnCount]);

Review comment:
   Actually based on the code here, I think we index the scratch column 
based on the outputColumnNum
   
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java#L525
   
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java#L800
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 601005)
Time Spent: 0.5h  (was: 20m)

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Only reproduces when there is at least 1 buffered batch, so 2 rows with 
> 1 row/batch are needed:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24883) Support ARRAY/STRUCT types in equality SMB and Common merge join

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera resolved HIVE-24883.

Resolution: Fixed

> Support ARRAY/STRUCT  types in equality SMB and Common merge join
> -
>
> Key: HIVE-24883
> URL: https://issues.apache.org/jira/browse/HIVE-24883
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Hive fails to execute joins on array type columns as the comparison functions 
> are not able to handle array and struct type columns.   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24883) Support ARRAY/STRUCT types in equality SMB and Common merge join

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24883:
---
Parent: HIVE-20962
Issue Type: Sub-task  (was: Bug)

> Support ARRAY/STRUCT  types in equality SMB and Common merge join
> -
>
> Key: HIVE-24883
> URL: https://issues.apache.org/jira/browse/HIVE-24883
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Hive fails to execute joins on array type columns as the comparison functions 
> are not able to handle array and struct type columns.   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-2508) Join on union type fails

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-2508:
--
Parent: HIVE-20962
Issue Type: Sub-task  (was: Bug)

> Join on union type fails
> 
>
> Key: HIVE-2508
> URL: https://issues.apache.org/jira/browse/HIVE-2508
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Priority: Major
>  Labels: uniontype
>
> {code}
> hive> CREATE TABLE DEST1(key UNIONTYPE, value BIGINT) STORED 
> AS TEXTFILE;
> OK
> Time taken: 0.076 seconds
> hive> CREATE TABLE DEST2(key UNIONTYPE, value BIGINT) STORED 
> AS TEXTFILE;
> OK
> Time taken: 0.034 seconds
> hive> SELECT * FROM DEST1 JOIN DEST2 on (DEST1.key = DEST2.key);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25042) Add support for map data type in Common merge join and SMB Join

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25042:
---
Parent: HIVE-20962
Issue Type: Sub-task  (was: Bug)

> Add support for map data type in Common merge join and SMB Join
> ---
>
> Key: HIVE-25042
> URL: https://issues.apache.org/jira/browse/HIVE-25042
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Priority: Major
>
> Merge join results depend on the underlying sorter used by the mapper task, 
> as we need to judge the direction after each key comparison. So the 
> comparison done during the join has to match the way the records are sorted 
> by the mapper. As per the sorter used by the mapper task (PipelinedSorter), 
> hash maps with the same key-value pairs in a different order are not equal, 
> so the merge join behaves the same way, whereas map join treats them as 
> equal. We have to modify the pipelined sorter code to handle the map 
> datatype, and then add support for map types in the join code.
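
The ordering problem described above can be shown with a small stand-alone example (plain Java, not Hive's serialization path): two maps with the same key/value pairs inserted in different orders are semantically equal, but any comparison based on iteration/serialized order, which is effectively what a sort-merge key comparison sees, treats them as different.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class MapOrderDemo {
  public static void main(String[] args) {
    Map<String, Integer> a = new LinkedHashMap<>();
    a.put("x", 1);
    a.put("y", 2);

    Map<String, Integer> b = new LinkedHashMap<>();
    b.put("y", 2);
    b.put("x", 1);

    // Semantic equality ignores order.
    System.out.println("Map.equals       : " + a.equals(b));                        // true
    // Order-sensitive comparison (what a serialized/byte-wise comparison would see).
    System.out.println("iteration order  : " + a.toString().equals(b.toString()));  // false
  }
}
{code}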



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25042) Add support for map data type in Common merge join and SMB Join

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25042:
---
Parent: (was: HIVE-24883)
Issue Type: Bug  (was: Sub-task)

> Add support for map data type in Common merge join and SMB Join
> ---
>
> Key: HIVE-25042
> URL: https://issues.apache.org/jira/browse/HIVE-25042
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Priority: Major
>
> Merge join results depend on the underlying sorter used by the mapper task, 
> as we need to judge the direction after each key comparison. So the 
> comparison done during the join has to match the way the records are sorted 
> by the mapper. As per the sorter used by the mapper task (PipelinedSorter), 
> hash maps with the same key-value pairs in a different order are not equal, 
> so the merge join behaves the same way, whereas map join treats them as 
> equal. We have to modify the pipelined sorter code to handle the map 
> datatype, and then add support for map types in the join code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24995) Add support for complex type operator in Join with non equality condition

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24995:
---
Parent: HIVE-20962
Issue Type: Sub-task  (was: Bug)

> Add support for complex type operator in Join with non equality condition 
> --
>
> Key: HIVE-24995
> URL: https://issues.apache.org/jira/browse/HIVE-24995
> Project: Hive
>  Issue Type: Sub-task
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> This subtask is specifically to support non-equality comparisons (greater 
> than, smaller than, etc.) as join conditions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24995) Add support for complex type operator in Join with non equality condition

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24995:
---
Parent: (was: HIVE-24883)
Issue Type: Bug  (was: Sub-task)

> Add support for complex type operator in Join with non equality condition 
> --
>
> Key: HIVE-24995
> URL: https://issues.apache.org/jira/browse/HIVE-24995
> Project: Hive
>  Issue Type: Bug
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> This subtask is specifically to support non-equality comparisons (greater 
> than, smaller than, etc.) as join conditions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24883) Support ARRAY/STRUCT types in equality SMB and Common merge join

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24883:
---
Description: Hive fails to execute joins on array type columns as the 
comparison functions are not able to handle array and struct type columns.     
(was: Hive fails to execute joins on array type columns as the comparison 
functions are not able to handle array type columns.   )

> Support ARRAY/STRUCT  types in equality SMB and Common merge join
> -
>
> Key: HIVE-24883
> URL: https://issues.apache.org/jira/browse/HIVE-24883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Hive fails to execute joins on array type columns as the comparison functions 
> are not able to handle array and struct type columns.   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24883) Support ARRAY/STRUCT types in equality SMB and Common merge join

2021-05-23 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24883:
---
Summary: Support ARRAY/STRUCT  types in equality SMB and Common merge join  
(was: Add support for complex types columns in Hive Joins)

> Support ARRAY/STRUCT  types in equality SMB and Common merge join
> -
>
> Key: HIVE-24883
> URL: https://issues.apache.org/jira/browse/HIVE-24883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Hive fails to execute joins on array type columns as the comparison functions 
> are not able to handle array type columns.   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24810) Use JDK 8 String Switch in TruncDateFromTimestamp

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24810?focusedWorklogId=600990&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600990
 ]

ASF GitHub Bot logged work on HIVE-24810:
-

Author: ASF GitHub Bot
Created on: 24/May/21 00:35
Start Date: 24/May/21 00:35
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #2002:
URL: https://github.com/apache/hive/pull/2002#issuecomment-846656268


   @pgaref If you are available for a quick review :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600990)
Time Spent: 1h  (was: 50m)

> Use JDK 8 String Switch in TruncDateFromTimestamp
> -
>
> Key: HIVE-24810
> URL: https://issues.apache.org/jira/browse/HIVE-24810
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25151) Remove Unused Interner from HiveMetastoreChecker

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25151?focusedWorklogId=600989&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600989
 ]

ASF GitHub Bot logged work on HIVE-25151:
-

Author: ASF GitHub Bot
Created on: 24/May/21 00:34
Start Date: 24/May/21 00:34
Worklog Time Spent: 10m 
  Work Description: belugabehr merged pull request #2309:
URL: https://github.com/apache/hive/pull/2309


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600989)
Time Spent: 1h  (was: 50m)

> Remove Unused Interner from HiveMetastoreChecker
> 
>
> Key: HIVE-25151
> URL: https://issues.apache.org/jira/browse/HIVE-25151
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {code:java|title=HiveMetastoreChecker}
>   for (int i = 0; i < getPartitionSpec(table, partition).size(); i++) {
> Path qualifiedPath = partPath.makeQualified(fs);
> pathInterner.intern(qualifiedPath);
> partPaths.add(qualifiedPath);
> partPath = partPath.getParent();
>   }
> {code}
>  
> The items are being "interned" and then the returned values are ignored.  
> This is wrong and makes the {{Interner}} useless.
> For now, simply remove this stuff.
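
For contrast, a minimal sketch of what effective interning looks like with a Guava Interner: the canonical instance returned by intern() has to be kept; calling intern() and discarding the result, as in the snippet above, saves nothing.

{code:java}
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;

public class InternerDemo {
  public static void main(String[] args) {
    Interner<String> interner = Interners.newWeakInterner();

    String a = new String("warehouse/db/tbl/part=1");
    String b = new String("warehouse/db/tbl/part=1");

    // Wrong: return values ignored, a and b remain distinct objects.
    interner.intern(a);
    interner.intern(b);
    System.out.println(a == b);   // false

    // Right: keep the canonical instance returned by intern().
    String a2 = interner.intern(a);
    String b2 = interner.intern(b);
    System.out.println(a2 == b2); // true
  }
}
{code}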



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25152) Remove Superfluous Logging Code

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25152?focusedWorklogId=600988&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600988
 ]

ASF GitHub Bot logged work on HIVE-25152:
-

Author: ASF GitHub Bot
Created on: 24/May/21 00:33
Start Date: 24/May/21 00:33
Worklog Time Spent: 10m 
  Work Description: belugabehr merged pull request #2310:
URL: https://github.com/apache/hive/pull/2310


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600988)
Time Spent: 3h 40m  (was: 3.5h)

> Remove Superfluous Logging Code
> ---
>
> Key: HIVE-25152
> URL: https://issues.apache.org/jira/browse/HIVE-25152
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> A lot of logging code can be removed to lessen the amount of code in the 
> project (and perhaps yield some small performance gains).
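
A hedged illustration of the kind of cleanup meant here (hypothetical method names, standard SLF4J API): the guarded, string-concatenating form collapses into a single parameterized call, which is both less code and avoids building the message when the level is disabled.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingCleanupDemo {
  private static final Logger LOG = LoggerFactory.getLogger(LoggingCleanupDemo.class);

  // Before: guard plus string concatenation.
  static void before(String table, long rows) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Scanned table " + table + " with " + rows + " rows");
    }
  }

  // After: one parameterized call; the message is only built if DEBUG is enabled.
  static void after(String table, long rows) {
    LOG.debug("Scanned table {} with {} rows", table, rows);
  }

  public static void main(String[] args) {
    before("web_logs", 42L);
    after("web_logs", 42L);
  }
}
{code}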



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24810) Use JDK 8 String Switch in TruncDateFromTimestamp

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24810?focusedWorklogId=600963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600963
 ]

ASF GitHub Bot logged work on HIVE-24810:
-

Author: ASF GitHub Bot
Created on: 23/May/21 18:36
Start Date: 23/May/21 18:36
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #2002:
URL: https://github.com/apache/hive/pull/2002#issuecomment-846606241


   @miklosgergely  :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600963)
Time Spent: 50m  (was: 40m)

> Use JDK 8 String Switch in TruncDateFromTimestamp
> -
>
> Key: HIVE-24810
> URL: https://issues.apache.org/jira/browse/HIVE-24810
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25141) Review Error Level Logging in HMS Module

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25141?focusedWorklogId=600962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600962
 ]

ASF GitHub Bot logged work on HIVE-25141:
-

Author: ASF GitHub Bot
Created on: 23/May/21 18:34
Start Date: 23/May/21 18:34
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #2299:
URL: https://github.com/apache/hive/pull/2299#issuecomment-846605840


   @miklosgergely I made a couple of changes to get tests to pass.  Can you 
please take a quick look to validate the change?  Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600962)
Time Spent: 40m  (was: 0.5h)

> Review Error Level Logging in HMS Module
> 
>
> Key: HIVE-25141
> URL: https://issues.apache.org/jira/browse/HIVE-25141
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> * Remove "log *and* throw" (it should be one or the other)
>  * Remove superfluous code
>  * Ensure the stack traces are being logged (and not just the Exception 
> message) to ease troubleshooting
>  * Remove double-printing the Exception message (SLF4J dictates that the 
> Exception message will be printed as part of the logger's formatting); see 
> the sketch below
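
A short SLF4J sketch of those review points (hypothetical method and message names): either log or throw, not both; pass the Throwable so the stack trace is preserved; and do not append e.getMessage() to the log text, since the logger prints it with the stack trace anyway.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HmsErrorLoggingDemo {
  private static final Logger LOG = LoggerFactory.getLogger(HmsErrorLoggingDemo.class);

  // Anti-pattern: logs and rethrows, and double-prints the exception message.
  static void logAndThrow(Exception e) throws Exception {
    LOG.error("Failed to drop partition: " + e.getMessage(), e);
    throw e;
  }

  // Preferred when this layer owns the handling: log once, with the Throwable
  // so the full stack trace is kept.
  static void logOnly(Exception e) {
    LOG.error("Failed to drop partition", e);
  }

  public static void main(String[] args) {
    logOnly(new Exception("boom"));
  }
}
{code}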



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25151) Remove Unused Interner from HiveMetastoreChecker

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25151?focusedWorklogId=600961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600961
 ]

ASF GitHub Bot logged work on HIVE-25151:
-

Author: ASF GitHub Bot
Created on: 23/May/21 18:33
Start Date: 23/May/21 18:33
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request #2309:
URL: https://github.com/apache/hive/pull/2309


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600961)
Time Spent: 50m  (was: 40m)

> Remove Unused Interner from HiveMetastoreChecker
> 
>
> Key: HIVE-25151
> URL: https://issues.apache.org/jira/browse/HIVE-25151
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code:java|title=HiveMetastoreChecker}
>   for (int i = 0; i < getPartitionSpec(table, partition).size(); i++) {
> Path qualifiedPath = partPath.makeQualified(fs);
> pathInterner.intern(qualifiedPath);
> partPaths.add(qualifiedPath);
> partPath = partPath.getParent();
>   }
> {code}
>  
> The items are being "interned" and then the returned values are ignored.  
> This is wrong and makes the {{Interner}} useless.
> For now, simply remove this stuff.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25151) Remove Unused Interner from HiveMetastoreChecker

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25151?focusedWorklogId=600960&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600960
 ]

ASF GitHub Bot logged work on HIVE-25151:
-

Author: ASF GitHub Bot
Created on: 23/May/21 18:31
Start Date: 23/May/21 18:31
Worklog Time Spent: 10m 
  Work Description: belugabehr closed pull request #2309:
URL: https://github.com/apache/hive/pull/2309


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600960)
Time Spent: 40m  (was: 0.5h)

> Remove Unused Interner from HiveMetastoreChecker
> 
>
> Key: HIVE-25151
> URL: https://issues.apache.org/jira/browse/HIVE-25151
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java|title=HiveMetastoreChecker}
>   for (int i = 0; i < getPartitionSpec(table, partition).size(); i++) {
> Path qualifiedPath = partPath.makeQualified(fs);
> pathInterner.intern(qualifiedPath);
> partPaths.add(qualifiedPath);
> partPath = partPath.getParent();
>   }
> {code}
>  
> The items are being "interned" and then the returned values are ignored.  
> This is wrong and makes the {{Interner}} useless.
> For now, simply remove this stuff.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23931) Send ValidWriteIdList and tableId to get_*_constraints HMS APIs

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23931?focusedWorklogId=600939&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600939
 ]

ASF GitHub Bot logged work on HIVE-23931:
-

Author: ASF GitHub Bot
Created on: 23/May/21 16:01
Start Date: 23/May/21 16:01
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on pull request #2211:
URL: https://github.com/apache/hive/pull/2211#issuecomment-846585770


   @kgyrtkirk Adding
   
   struct UniqueConstraintsRequest {
   1: TableReference table;
   }
   
   The above change will not be compatible with older HMS clients. Should I go 
ahead and implement it as part of this PR, or should I create an epic for 
cleaning up the deprecated APIs in the metastore and moving all APIs to the 
request model, where we can have common models for multiple APIs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600939)
Time Spent: 1h 50m  (was: 1h 40m)

> Send ValidWriteIdList and tableId to get_*_constraints HMS APIs
> ---
>
> Key: HIVE-23931
> URL: https://issues.apache.org/jira/browse/HIVE-23931
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Send ValidWriteIdList and tableId to get_*_constraints HMS APIs. This would 
> be required in order to decide whether the response should be served from the 
> Cache or backing DB.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23931) Send ValidWriteIdList and tableId to get_*_constraints HMS APIs

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23931?focusedWorklogId=600940&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600940
 ]

ASF GitHub Bot logged work on HIVE-23931:
-

Author: ASF GitHub Bot
Created on: 23/May/21 16:01
Start Date: 23/May/21 16:01
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma edited a comment on pull request 
#2211:
URL: https://github.com/apache/hive/pull/2211#issuecomment-846585770


   @kgyrtkirk 
   
   struct UniqueConstraintsRequest {
   1: TableReference table;
   }
   
   The above change will not be compatible with older HMS clients. Should I go 
ahead and implement it as part of this PR, or should I create an epic for 
cleaning up the deprecated APIs in the metastore and moving all APIs to the 
request model, where we can have common models for multiple APIs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600940)
Time Spent: 2h  (was: 1h 50m)

> Send ValidWriteIdList and tableId to get_*_constraints HMS APIs
> ---
>
> Key: HIVE-23931
> URL: https://issues.apache.org/jira/browse/HIVE-23931
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Send ValidWriteIdList and tableId to get_*_constraints HMS APIs. This would 
> be required in order to decide whether the response should be served from the 
> Cache or backing DB.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-25093) date_format() UDF is returning values in UTC time zone only

2021-05-23 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350053#comment-17350053
 ] 

Ashish Sharma edited comment on HIVE-25093 at 5/23/21, 3:53 PM:


[~zabetak]

shuser@hn0-testja:~$ *timedatectl*
*Local time: Thu 2021-05-06 12:03:32 IST*
*Universal time: Thu 2021-05-06 06:33:32 UTC*
RTC time: Thu 2021-05-06 06:33:32
Time zone: Asia/Kolkata (IST, +0530)
Network time on: yes
NTP synchronized: yes
RTC in local TZ: no

Please note the local time and UTC time values reported by "timedatectl".

0: jdbc:hive2://zk0-testja.e0mrrixnyxde5h1suy> *select date_format(current_timestamp,"yyyy-MM-dd HH:mm:ss.SSS z");*
+--------------------------------------------------+
|                       _c0                        |
+--------------------------------------------------+
| *{color:red}2021-05-06 12:08:15.118 UTC{color}*  |
+--------------------------------------------------+
1 row selected (1.074 seconds)

So the output of the query is *2021-05-06 12:08:15.118 UTC*. The problem is this: if _date_format_ is aligned with UTC time, the output should be 
*2021-05-06 06:33:59.078 UTC*; if it is aligned with "hive.local.time.zone", the result should be *2021-05-06 12:08:15.118 IST*. Not a combination of both.

The output of "current_timestamp" depends on the config "hive.local.time.zone", which is set to IST in the given example. So the input given to date_format is 
*2021-05-06 12:08:15.118 IST*; if the result is rendered in UTC, it should be *2021-05-06 06:33:59.078 UTC*, not *2021-05-06 12:08:15.118 UTC*.

Also, the interpretation of 'z' in "yyyy-MM-dd HH:mm:ss.SSS z" is the local time zone. _date_format_ uses SimpleDateFormat; please check out the official documentation:
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
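
A minimal, stand-alone sketch (plain JDK code, not Hive's date_format implementation) of how SimpleDateFormat treats 'z': the zone suffix and the rendered wall-clock time always move together, which is exactly the consistency the comment above is asking for.

{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class ZPatternDemo {
  public static void main(String[] args) {
    Date now = new Date();

    // 'z' prints the formatter's time zone, which defaults to the JVM's local zone.
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS z");
    System.out.println(fmt.format(now));

    // Forcing UTC shifts the wall-clock value and the zone suffix together.
    fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
    System.out.println(fmt.format(now));
  }
}
{code}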



was (Author: ashish-kumar-sharma):
[~zabetak]

shuser@hn0-testja:~$ *timedatectl*
*Local time: Thu 2021-05-06 12:03:32 IST*
*Universal time: Thu 2021-05-06 06:33:32 UTC*
RTC time: Thu 2021-05-06 06:33:32
Time zone: Asia/Kolkata (IST, +0530)
Network time on: yes
NTP synchronized: yes
RTC in local TZ: no

Please checkout the local time and UTC time value in "timedatectl"


0: jdbc:hive2://zk0-testja.e0mrrixnyxde5h1suy> *select 
date_format(current_timestamp,"-MM-dd HH:mm:ss.SSS z");*
--

_c0
--

*{color:red}2021-05-06 12:08:15.118 UTC{color}*
--
1 row selected (1.074 seconds)


So the output of the query is *2021-05-06 12:08:15.118 UTC* . So the problem 
here is if _date_format _is compliment to UTC time then output should be 
*2021-05-06 06:33:59.078 UTC* or if it compliment to "hive.local.time.zone" 
then result should be *2021-05-06 12:08:15.118 IST*. Not the combination of 
both.

Output of "current_timestamp" depends upon config "hive.local.time.zone" which 
is set of IST in the given example. So the input to date_format is given is 
*2021-05-06 12:08:15.118 IST* if the result is present in UTC then result 
should be *2021-05-06 06:33:59.078 UTC* not *2021-05-06 12:08:15.118 UTC*.


Also interpretation of 'z' in  "-MM-dd HH:mm:ss.SSS z" is local time zone. 
_date_format _ use SimpleDateFormat. Please check out the official 
documentation 
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html 


> date_format() UDF is returning values in UTC time zone only 
> 
>
> Key: HIVE-25093
> URL: https://issues.apache.org/jira/browse/HIVE-25093
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *HIVE - 1.2*
> sshuser@hn0-dateti:~$ *timedatectl*
>   Local time: Thu 2021-05-06 11:56:08 IST
>   Universal time: Thu 2021-05-06 06:26:08 UTC
> RTC time: Thu 2021-05-06 06:26:08
>Time zone: Asia/Kolkata (IST, +0530)
>  Network time on: yes
> NTP synchronized: yes
>  RTC in local TZ: no
> sshuser@hn0-dateti:~$ beeline
> 0: jdbc:hive2://localhost:10001/default> *select 
> date_format(current_timestamp,"yyyy-MM-dd HH:mm:ss.SSS z");*
> +--+--+
> | _c0  |
> +--+--+
> | 2021-05-06 11:58:53.760 IST  |
> +--+--+
> 1 row selected (1.271 seconds)
> *HIVE - 3.1.0*
> sshuser@hn0-testja:~$ *timedatectl*
>   Local time: Thu 2021-05-06 12:03:32 IST
>   Universal time: Thu 2021-05-06 06:33:32 UTC
> RTC time: Thu 2021-05-06 06:33:32
>Time zone: Asia/Kolkata (IST, +0530)
>  Network time on: yes
> NTP synchronized: yes
>  RTC in local TZ: no
> sshuser@hn0-testja:~$ beeline
> 0: jdbc:hive2://zk0-testja.e0mrrixnyxde5h1suy> *select 
> date_format(current_timestamp,"yyyy-MM-dd HH:mm:ss.SSS 

[jira] [Comment Edited] (HIVE-25093) date_format() UDF is returning values in UTC time zone only

2021-05-23 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350053#comment-17350053
 ] 

Ashish Sharma edited comment on HIVE-25093 at 5/23/21, 3:01 PM:


[~zabetak]

shuser@hn0-testja:~$ *timedatectl*
*Local time: Thu 2021-05-06 12:03:32 IST*
*Universal time: Thu 2021-05-06 06:33:32 UTC*
RTC time: Thu 2021-05-06 06:33:32
Time zone: Asia/Kolkata (IST, +0530)
Network time on: yes
NTP synchronized: yes
RTC in local TZ: no

Please checkout the local time and UTC time value in "timedatectl"


0: jdbc:hive2://zk0-testja.e0mrrixnyxde5h1suy> *select 
date_format(current_timestamp,"-MM-dd HH:mm:ss.SSS z");*
--

_c0
--

*{color:red}2021-05-06 12:08:15.118 UTC{color}*
--
1 row selected (1.074 seconds)


So the output of the query is *2021-05-06 12:08:15.118 UTC* . So the problem 
here is if _date_format _is compliment to UTC time then output should be 
*2021-05-06 06:33:59.078 UTC* or if it compliment to "hive.local.time.zone" 
then result should be *2021-05-06 12:08:15.118 IST*. Not the combination of 
both.

Output of "current_timestamp" depends upon config "hive.local.time.zone" which 
is set of IST in the given example. So the input to date_format is given is 
*2021-05-06 12:08:15.118 IST* if the result is present in UTC then result 
should be *2021-05-06 06:33:59.078 UTC* not *2021-05-06 12:08:15.118 UTC*.


Also interpretation of 'z' in  "-MM-dd HH:mm:ss.SSS z" is local time zone. 
_date_format _ use SimpleDateFormat. Please check out the official 
documentation 
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html 




> date_format() UDF is returning values in UTC time zone only 
> 
>
> Key: HIVE-25093
> URL: https://issues.apache.org/jira/browse/HIVE-25093
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *HIVE - 1.2*
> sshuser@hn0-dateti:~$ *timedatectl*
>   Local time: Thu 2021-05-06 11:56:08 IST
>   Universal time: Thu 2021-05-06 06:26:08 UTC
> RTC time: Thu 2021-05-06 06:26:08
>Time zone: Asia/Kolkata (IST, +0530)
>  Network time on: yes
> NTP synchronized: yes
>  RTC in local TZ: no
> sshuser@hn0-dateti:~$ beeline
> 0: jdbc:hive2://localhost:10001/default> *select 
> date_format(current_timestamp,"yyyy-MM-dd HH:mm:ss.SSS z");*
> +--+--+
> | _c0  |
> +--+--+
> | 2021-05-06 11:58:53.760 IST  |
> +--+--+
> 1 row selected (1.271 seconds)
> *HIVE - 3.1.0*
> sshuser@hn0-testja:~$ *timedatectl*
>   Local time: Thu 2021-05-06 12:03:32 IST
>   Universal time: Thu 2021-05-06 06:33:32 UTC
> RTC time: Thu 2021-05-06 06:33:32
>Time zone: Asia/Kolkata (IST, +0530)
>  Network time on: yes
> NTP synchronized: yes
>  RTC in local TZ: no
> sshuser@hn0-testja:~$ beeline
> 

[jira] [Commented] (HIVE-25093) date_format() UDF is returning values in UTC time zone only

2021-05-23 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350053#comment-17350053
 ] 

Ashish Sharma commented on HIVE-25093:
--

[~zabetak]

sshuser@hn0-testja:~$ *timedatectl*
*Local time: Thu 2021-05-06 12:03:32 IST*
*Universal time: Thu 2021-05-06 06:33:32 UTC*
RTC time: Thu 2021-05-06 06:33:32
Time zone: Asia/Kolkata (IST, +0530)
Network time on: yes
NTP synchronized: yes
RTC in local TZ: no

Please check out the local time and UTC time values in the "timedatectl" output.


0: jdbc:hive2://zk0-testja.e0mrrixnyxde5h1suy> *select 
date_format(current_timestamp,"yyyy-MM-dd HH:mm:ss.SSS z");*
+------------------------------+
|             _c0              |
+------------------------------+
| *{color:red}2021-05-06 12:08:15.118 UTC{color}*  |
+------------------------------+
1 row selected (1.074 seconds)


So the output of the query is *{color:red}2021-05-06 12:08:15.118 UTC{color}*. 
The problem here is that if _date_format_ is aligned with UTC, the output should 
be *{color:red}2021-05-06 06:33:59.078 UTC{color}*, and if it follows 
"hive.local.time.zone", the result should be 
*{color:red}2021-05-06 12:08:15.118 IST{color}*. It should not be a combination 
of both.

The output of "current_timestamp" depends on the config "hive.local.time.zone", 
which is set to IST in this example. So the input given to date_format is 
*{color:red}2021-05-06 12:08:15.118 IST{color}*; if the result is rendered in 
UTC, it should be *{color:red}2021-05-06 06:33:59.078 UTC{color}*, not 
*{color:red}2021-05-06 12:08:15.118 UTC{color}*.


Also, 'z' in "yyyy-MM-dd HH:mm:ss.SSS z" is interpreted as the local time zone 
name. _date_format_ uses SimpleDateFormat; please check the official 
documentation: 
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html 


> date_format() UDF is returning values in UTC time zone only 
> 
>
> Key: HIVE-25093
> URL: https://issues.apache.org/jira/browse/HIVE-25093
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *HIVE - 1.2*
> sshuser@hn0-dateti:~$ *timedatectl*
>   Local time: Thu 2021-05-06 11:56:08 IST
>   Universal time: Thu 2021-05-06 06:26:08 UTC
> RTC time: Thu 2021-05-06 06:26:08
>Time zone: Asia/Kolkata (IST, +0530)
>  Network time on: yes
> NTP synchronized: yes
>  RTC in local TZ: no
> sshuser@hn0-dateti:~$ beeline
> 0: jdbc:hive2://localhost:10001/default> *select 
> date_format(current_timestamp,"yyyy-MM-dd HH:mm:ss.SSS z");*
> +--+--+
> | _c0  |
> +--+--+
> | 2021-05-06 11:58:53.760 IST  |
> +--+--+
> 1 row selected (1.271 seconds)
> *HIVE - 3.1.0*
> sshuser@hn0-testja:~$ *timedatectl*
>   Local time: Thu 2021-05-06 12:03:32 IST
>   Universal time: Thu 2021-05-06 06:33:32 UTC
> RTC time: Thu 2021-05-06 06:33:32
>Time zone: Asia/Kolkata (IST, +0530)
>  Network time on: yes
> NTP synchronized: yes
>  RTC in local TZ: no
> sshuser@hn0-testja:~$ beeline
> 0: jdbc:hive2://zk0-testja.e0mrrixnyxde5h1suy> *select 
> date_format(current_timestamp,"yyyy-MM-dd HH:mm:ss.SSS z");*
> +--+
> | _c0  |
> +--+
> | *2021-05-06 06:33:59.078 UTC*  |
> +--+
> 1 row selected (13.396 seconds)
> 0: jdbc:hive2://zk0-testja.e0mrrixnyxde5h1suy> *set 
> hive.local.time.zone=Asia/Kolkata;*
> No rows affected (0.025 seconds)
> 0: jdbc:hive2://zk0-testja.e0mrrixnyxde5h1suy> *select 
> date_format(current_timestamp,"yyyy-MM-dd HH:mm:ss.SSS z");*
> +--+
> | _c0  |
> +--+
> | *{color:red}2021-05-06 12:08:15.118 UTC{color}*  | 
> +--+
> 1 row selected (1.074 seconds)
> expected result was *2021-05-06 12:08:15.118 IST*
> As part of HIVE-12192 it was decided to have a common time zone, "UTC", for all 
> computation, due to which the date_format() function was hardcoded to "UTC".
> But later, in HIVE-21039, it was decided that the user session time zone value 
> should be the default, not UTC. 
> date_format() was not fixed as part of HIVE-21039.
> What should be the ideal time zone value of date_format()?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24852) Add support for Snapshots during external table replication

2021-05-23 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HIVE-24852:

Attachment: Design Doc HDFS Snapshots for External Table Replication-02.pdf

> Add support for Snapshots during external table replication
> ---
>
> Key: HIVE-24852
> URL: https://issues.apache.org/jira/browse/HIVE-24852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Design Doc HDFS Snapshots for External Table 
> Replication-01.pdf, Design Doc HDFS Snapshots for External Table 
> Replication-02.pdf
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Add support for use of snapshot diff for external table replication.
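
For background, a minimal sketch of the HDFS snapshot-diff API that this kind of 
replication can build on (the namenode URI, table path, and snapshot names below 
are hypothetical and not taken from the attached design doc):

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;

public class SnapshotDiffSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        DistributedFileSystem dfs =
                (DistributedFileSystem) FileSystem.get(URI.create("hdfs://nn:8020"), conf);
        Path tableDir = new Path("/warehouse/external/sales");

        dfs.allowSnapshot(tableDir);               // one-time admin step: make the dir snapshottable
        dfs.createSnapshot(tableDir, "repl_s1");   // snapshot taken at the previous replication cycle
        // ... table data changes between cycles ...
        dfs.createSnapshot(tableDir, "repl_s2");   // snapshot taken at the current cycle

        // Only the delta between the two snapshots has to be copied to the target.
        SnapshotDiffReport report = dfs.getSnapshotDiffReport(tableDir, "repl_s1", "repl_s2");
        for (SnapshotDiffReport.DiffReportEntry entry : report.getDiffList()) {
            System.out.println(entry);             // CREATE / MODIFY / DELETE / RENAME entries
        }
    }
}
```

The appeal over copying the whole directory each cycle is that only the entries 
in this diff report need to be transferred to the target.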



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25152) Remove Superfluous Logging Code

2021-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25152?focusedWorklogId=600919=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-600919
 ]

ASF GitHub Bot logged work on HIVE-25152:
-

Author: ASF GitHub Bot
Created on: 23/May/21 10:58
Start Date: 23/May/21 10:58
Worklog Time Spent: 10m 
  Work Description: pgaref commented on pull request #2310:
URL: https://github.com/apache/hive/pull/2310#issuecomment-846543650


   > ... continued...
   > 
   > That is to say:
   > 
   > ```java
   > LOG.info("New Final Path: FS " + fsp.finalPaths[filesIdx]);
   > LOG.info("New Final Path: FS {}", fsp.finalPaths[filesIdx]);
   > ```
   > 
   > These two statements will always produce the same output (as INFO is on by 
   > default in every production environment under the sun). However, the second 
   > one always has the overhead of finding the anchor in the format string and 
   > replacing it with the value. In the first example there is simply a string 
   > concatenation. This will be faster, and I don't find this log particularly 
   > hard to read.
   
   Hey @belugabehr thanks for the details! I agree that adding anchors to all INFO 
statements won't add any perf value here, considering that INFO is the default 
log level -- I was mostly thinking about consistency when posting the comments, 
eventually converging on a single logging style, but maybe that's out of the 
scope of this PR.
   
   Happy to +1 as is and maybe discuss this as part of a new ticket
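
For reference, a minimal standalone sketch of the trade-off discussed above 
(plain SLF4J; the path value and the expensive helper below are made up for 
illustration, not taken from the PR):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingStyles {
    private static final Logger LOG = LoggerFactory.getLogger(LoggingStyles.class);

    public static void main(String[] args) {
        String finalPath = "/warehouse/tbl/000000_0";  // hypothetical value

        // At INFO (enabled almost everywhere) both forms end up building the full
        // message, so concatenation vs. anchor substitution changes very little.
        LOG.info("New Final Path: FS " + finalPath);
        LOG.info("New Final Path: FS {}", finalPath);

        // At DEBUG (commonly disabled in production) the anchor form pays off:
        // the concatenated string is built even when the message is dropped,
        // while the parameterized call skips the final formatting step.
        LOG.debug("Computed description " + describe(finalPath));  // always concatenates
        LOG.debug("Computed description {}", finalPath);           // formatting deferred

        // Note: arguments are still evaluated before the call, so if producing the
        // argument itself is expensive, only a level guard avoids that cost.
        if (LOG.isDebugEnabled()) {
            LOG.debug("Computed description {}", describe(finalPath));
        }
    }

    // Stand-in for a costly toString()/serialization step.
    private static String describe(String path) {
        return new StringBuilder(path).reverse().toString();
    }
}
```

This matches the point above: at the default INFO level the two styles are 
effectively equivalent, and picking one is mostly a consistency question.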


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 600919)
Time Spent: 3.5h  (was: 3h 20m)

> Remove Superfluous Logging Code
> ---
>
> Key: HIVE-25152
> URL: https://issues.apache.org/jira/browse/HIVE-25152
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> So much logging code can be removed to lessen the amount of code in the 
> project (and perhaps some small performance gains).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25155) Bump ORC to 1.6.8

2021-05-23 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25155:
-


> Bump ORC to 1.6.8
> -
>
> Key: HIVE-25155
> URL: https://issues.apache.org/jira/browse/HIVE-25155
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>
>  https://orc.apache.org/news/2021/05/21/ORC-1.6.8/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-22186) Update ORC to version 1.6

2021-05-23 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-22186.
---
Resolution: Duplicate

Dup of HIVE-23553

> Update ORC to version 1.6
> -
>
> Key: HIVE-22186
> URL: https://issues.apache.org/jira/browse/HIVE-22186
> Project: Hive
>  Issue Type: Improvement
>Reporter: Bartłomiej Tartanus
>Priority: Major
>
> [https://orc.apache.org/docs/releases.html]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)