[jira] [Created] (HIVE-24368) Optimise AcidUtils::getAcidFilesForStats for ACID tables

2020-11-10 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-24368:
---

 Summary: Optimise AcidUtils::getAcidFilesForStats for ACID tables
 Key: HIVE-24368
 URL: https://issues.apache.org/jira/browse/HIVE-24368
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Rajesh Balamohan


After an insert, Hive gathers statistics for the ACID table. This becomes 
expensive over time because of the growing number of delta folders that must be scanned.

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L2648]

 
{noformat}
public static List getAcidFilesForStats(
    Table table, Path dir, Configuration jc, FileSystem fs) throws IOException {
  ...
  Directory acidInfo = AcidUtils.getAcidState(fs, dir, jc, idList, null, false, hdfsDirSnapshots);
  ...
  ..+ other calls
  ...
}

 {noformat}
 

Runtime keeps increasing as more deltas are generated. 
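
To make the growth described above concrete, here is a minimal sketch of a cost model. It is not Hive code: the class and method names are hypothetical, and it assumes that every insert adds exactly one delta folder, that no compaction runs, and that the stats pass after each insert lists every folder present at that point.

```java
public class StatsListingCost {

    // Hypothetical cost model, not Hive code: after the k-th insert there are
    // k delta folders, and the stats pass is assumed to list all k of them.
    static long totalEntriesListed(int inserts) {
        long total = 0;
        for (int deltas = 1; deltas <= inserts; deltas++) {
            total += deltas; // the stats pass after this insert lists `deltas` folders
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " inserts: " + totalEntriesListed(n) + " folder entries listed in total");
        }
    }
}
```

Under these assumptions the total listing work grows roughly with the square of the number of inserts, which matches the observation that runtime keeps increasing as more deltas are generated.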



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24367) Explore whether HiveAlterHandler::alterTable can be optimised for non-partitioned tables

2020-11-10 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-24367:
---

 Summary: Explore whether HiveAlterHandler::alterTable can be 
optimised for non-partitioned tables
 Key: HIVE-24367
 URL: https://issues.apache.org/jira/browse/HIVE-24367
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Rajesh Balamohan


Writing many deltas to a non-partitioned table causes runtime issues once a 
large number of delta folders are present.

The following code in HiveAlterHandler is invoked for every insert operation. 
It calls {{updateTableStatsSlow}} on every insert, causing runtime delays.

{noformat}
if (MetaStoreUtils.requireCalStats(null, null, newt, environmentContext) &&
    !isPartitionedTable) {
  Database db = msdb.getDatabase(catName, newDbName);
  assert(isReplicated == HiveMetaStore.HMSHandler.isDbReplicationTarget(db));
  // Update table stats. For partitioned table, we update stats in alterPartition()
  MetaStoreUtils.updateTableStatsSlow(db, newt, wh, false, true, environmentContext);
}
{noformat}
It would be good to explore whether only the newly added delta can be listed 
when computing stats. This would avoid a huge listing call during stats 
collection.

Example queries to reproduce:
{noformat}
CREATE TABLE IF NOT EXISTS test (name String, value int);
INSERT INTO test VALUES('K1',1);
INSERT INTO test VALUES('K2',2);
..
..
..
INSERT INTO test VALUES('K2',2);

 {noformat}
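
The idea above of listing only the newly added delta can be sketched as a simple selection over directory names. This is not the Hive implementation: the class, the method names, and the notion of a "last seen write ID" are hypothetical. It assumes delta folders follow the `delta_<minWriteId>_<maxWriteId>[_<statementId>]` naming used by ACID tables, and that `lastSeenWriteId` is non-negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NewDeltaFilter {

    // Returns the max write ID encoded in a name such as
    // delta_0000003_0000003_0000, or -1 if the name is not an insert delta
    // (for example base_... or delete_delta_... folders).
    static long maxWriteId(String dirName) {
        if (!dirName.startsWith("delta_")) {
            return -1;
        }
        String[] parts = dirName.split("_");
        if (parts.length < 3) {
            return -1;
        }
        try {
            return Long.parseLong(parts[2]);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Keeps only the delta folders written after lastSeenWriteId
    // (a hypothetical marker, assumed to be >= 0).
    static List<String> deltasNewerThan(List<String> dirNames, long lastSeenWriteId) {
        List<String> result = new ArrayList<>();
        for (String name : dirNames) {
            if (maxWriteId(name) > lastSeenWriteId) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
            "base_0000002",
            "delta_0000003_0000003_0000",
            "delta_0000004_0000004_0000");
        System.out.println(deltasNewerThan(dirs, 3));
    }
}
```

The sketch only shows the selection logic and still takes the full list of names as input. An actual saving would require not listing the older folders at all, for example by deriving the new delta's path from the write ID of the insert; whether that is feasible in HiveAlterHandler is exactly what the issue proposes to explore.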
 





[jira] [Created] (HIVE-24366) changeMarker value sent to atlas export API is set to 0 in the 2nd repl dump call

2020-11-10 Thread Arko Sharma (Jira)
Arko Sharma created HIVE-24366:
--

 Summary: changeMarker value sent to atlas export API is set to 0 
in the 2nd repl dump call
 Key: HIVE-24366
 URL: https://issues.apache.org/jira/browse/HIVE-24366
 Project: Hive
  Issue Type: Bug
Reporter: Arko Sharma
Assignee: Arko Sharma








[jira] [Created] (HIVE-24365) SWO should not create complex and redundant filter expressions

2020-11-10 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24365:
---

 Summary: SWO should not create complex and redundant filter 
expressions
 Key: HIVE-24365
 URL: https://issues.apache.org/jira/browse/HIVE-24365
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


For q88 we get complex and mostly unreadable filter expressions, because 
the TS filter expression is pushed into a FIL operator before the 2 branches are merged.







What is the difference between varname and hivename in MetastoreConf.java?

2020-11-10 Thread qq
What is the difference between varname and hivename in the picture below?

Fw:How is the Hive MetaStore authenticated for accessing HDFS?

2020-11-10 Thread qq
-- Original --
From: "user" <987626...@qq.com>
Date: Tue, Nov 10, 2020 09:22 PM
To: "user"

[jira] [Created] (HIVE-24364) Fix flakiness of TestLlapExtClientWithCloudDeploymentConfigs

2020-11-10 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24364:
---

 Summary: Fix flakiness of 
TestLlapExtClientWithCloudDeploymentConfigs
 Key: HIVE-24364
 URL: https://issues.apache.org/jira/browse/HIVE-24364
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


It has failed a few times recently in some of my PRs.
Flaky check:
http://ci.hive.apache.org/job/hive-flaky-check/139/


