[jira] [Created] (HIVE-27142) Map Join not working as expected when joining non-native tables with native tables

2023-03-15 Thread Syed Shameerur Rahman (Jira)
Syed Shameerur Rahman created HIVE-27142:


 Summary:  Map Join not working as expected when joining non-native 
tables with native tables
 Key: HIVE-27142
 URL: https://issues.apache.org/jira/browse/HIVE-27142
 Project: Hive
  Issue Type: Bug
Affects Versions: All Versions
Reporter: Syed Shameerur Rahman
Assignee: Syed Shameerur Rahman
 Fix For: 4.0.0


*1. Issue :*

When *_hive.auto.convert.join=true_* and a query joins a large non-native Hive 
table with a small native Hive table, the map join happens on the wrong side, 
i.e. in the map tasks that process the small native Hive table. This can lead 
to OOM when the non-native table is really large and only a few map tasks are 
spawned to scan the small native Hive table.

 

*2. Why is this happening?*

This happens due to improper stats collection/computation for non-native Hive 
tables. The data of a non-native Hive table is actually stored in a different 
location that Hive does not know of. Only a temporary path, which does not 
hold the actual data, is visible to Hive when the non-native table is created. 
As a result, the stats collection logic tends to underestimate the data 
size/row count, which causes the map join to happen on the wrong side.

 

*3. Potential Solutions*

 3.1 Turn off automatic conversion by setting *_hive.auto.convert.join=false_*. 
This can have a negative impact on the query if the same query performs 
multiple joins, i.e. one join involving non-native tables and another join 
where both tables are native.

 3.2 Compute stats for the non-native table by running the ANALYZE TABLE <> 
command before joining native and non-native tables. The user may or may not 
choose to do it.

 3.3 Don't collect/estimate stats for non-native hive tables by default 
(Preferred solution)
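
To make the reasoning in sections 2 and 3.3 concrete, here is a minimal sketch of a size-based choice of the in-memory (hash) side of a map join. It is a simplified model, not Hive's actual planner code: the class and method names (JoinSideSketch, TableEstimate, pickHashSide, withoutNonNativeStats), the table names, and the 10 MB threshold are all hypothetical. It only illustrates how an estimate taken from an empty temporary path can make the large non-native table look like the smallest input, and how treating that estimate as unknown avoids the wrong choice.

```java
final class JoinSideSketch {

    /** Size estimate for one join input; a negative size means "unknown". */
    static final class TableEstimate {
        final String name;
        final boolean nativeTable;
        final long estimatedBytes;

        TableEstimate(String name, boolean nativeTable, long estimatedBytes) {
            this.name = name;
            this.nativeTable = nativeTable;
            this.estimatedBytes = estimatedBytes;
        }
    }

    /**
     * Returns the input to load into memory as the hash table, or null when
     * no input qualifies (i.e. keep a regular shuffle join). An input
     * qualifies only if its estimate is known and within the threshold;
     * among qualifying inputs the smallest estimate wins.
     */
    static TableEstimate pickHashSide(TableEstimate a, TableEstimate b, long thresholdBytes) {
        TableEstimate best = null;
        for (TableEstimate t : new TableEstimate[] {a, b}) {
            boolean known = t.estimatedBytes >= 0;
            if (known && t.estimatedBytes <= thresholdBytes
                    && (best == null || t.estimatedBytes < best.estimatedBytes)) {
                best = t;
            }
        }
        return best;
    }

    /** Models solution 3.3: drop the path-based estimate of a non-native table. */
    static TableEstimate withoutNonNativeStats(TableEstimate t) {
        return t.nativeTable ? t : new TableEstimate(t.name, false, -1L);
    }

    public static void main(String[] args) {
        long threshold = 10L * 1024 * 1024; // hypothetical small-table threshold

        TableEstimate dim = new TableEstimate("native_dim", true, 5L * 1024 * 1024);
        // The real data is very large, but the temporary path Hive sees is empty.
        TableEstimate fact = new TableEstimate("non_native_fact", false, 0L);

        // Underestimated stats: the large non-native table is picked as hash side.
        System.out.println(pickHashSide(dim, fact, threshold).name);

        // Stats treated as unknown: the small native table is picked instead.
        System.out.println(pickHashSide(dim, withoutNonNativeStats(fact), threshold).name);
    }
}
```

In this model the first call selects non_native_fact, which corresponds to the OOM scenario described in section 1, while the second call selects native_dim.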



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27143) Improve HCatStorer move task

2023-03-15 Thread Yi Zhang (Jira)
Yi Zhang created HIVE-27143:
---

 Summary: Improve HCatStorer move task
 Key: HIVE-27143
 URL: https://issues.apache.org/jira/browse/HIVE-27143
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Affects Versions: 3.1.3
Reporter: Yi Zhang


moveTask in HCatalog is inefficient: it makes two passes (a dry run, then the 
actual execution) and runs sequentially. This can be improved.
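
One possible shape of such an improvement is sketched below. It is not the HCatalog implementation: it uses java.nio on a local filesystem instead of the Hadoop FileSystem API, the names ParallelMoveSketch and moveAll are hypothetical, and it assumes the dry run exists mainly to detect conflicts such as an already-existing destination file. Under that assumption the conflict check can happen as part of the move itself, and the moves can run on a thread pool instead of one after another.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ParallelMoveSketch {

    /**
     * Moves every regular file directly under srcDir into dstDir using a
     * fixed-size thread pool, and returns the number of files moved.
     * Files.move is called without REPLACE_EXISTING, so an existing
     * destination fails that move instead of needing a separate dry run.
     */
    static int moveAll(Path srcDir, Path dstDir, int threads) throws Exception {
        Files.createDirectories(dstDir);

        List<Path> files;
        try (Stream<Path> listing = Files.list(srcDir)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Path file : files) {
                // Returning a value makes this lambda a Callable, which may throw IOException.
                pending.add(pool.submit(() -> {
                    Files.move(file, dstDir.resolve(file.getFileName()));
                    return null;
                }));
            }
            for (Future<?> result : pending) {
                result.get(); // rethrows a failed move as ExecutionException
            }
            return files.size();
        } finally {
            pool.shutdown();
        }
    }
}
```

A single pass gives up the all-or-nothing behaviour a separate dry run can provide: if one move fails, others may already have completed, so a real change would need to decide how to handle such partial failures.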





[jira] [Created] (HIVE-27144) Alter table partitions need not DBNotificationListener for external tables

2023-03-15 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-27144:
---

 Summary: Alter table partitions need not DBNotificationListener 
for external tables
 Key: HIVE-27144
 URL: https://issues.apache.org/jira/browse/HIVE-27144
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Rajesh Balamohan


DBNotificationListener for external tables may not be needed. 

Even "analyze table blah compute statistics for columns" on an external 
partitioned table invokes DBNotificationListener for every partition.


{noformat}
  at org.datanucleus.store.query.Query.execute(Query.java:1726)
  at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374)
  at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216)
  at org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:11774)
  at jdk.internal.reflect.GeneratedMethodAccessor135.invoke(Unknown Source)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.18/DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(java.base@11.0.18/Method.java:566)
  at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
  at com.sun.proxy.$Proxy33.addNotificationEvent(Unknown Source)
  at org.apache.hive.hcatalog.listener.DbNotificationListener.process(DbNotificationListener.java:1308)
  at org.apache.hive.hcatalog.listener.DbNotificationListener.onAlterPartition(DbNotificationListener.java:458)
  at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$14.notify(MetaStoreListenerNotifier.java:161)
  at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:328)
  at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:390)
  at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:863)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_with_environment_context(HiveMetaStore.java:6253)
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_req(HiveMetaStore.java:6201)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.18/Native Method)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.18/NativeMethodAccessorImpl.java:62)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.18/DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(java.base@11.0.18/Method.java:566)
  at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
  at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
  at com.sun.proxy.$Proxy34.alter_partitions_req(Unknown Source)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions_req.getResult(ThriftHiveMetastore.java:21532)
  at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions_req.getResult(ThriftHiveMetastore.java:21511)
  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
  at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
  at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:652)
  at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:647)
  at java.security.AccessController.doPrivileged(java.base@11.0.18/Native Method)
{noformat}


