[ 
https://issues.apache.org/jira/browse/HIVE-29690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Ahuja updated HIVE-29690:
--------------------------------
    Issue Type: Bug  (was: Improvement)

> HMS: StackOverflowError when calling get_table_objects_by_name_req API with 
> both tablenames and pattern
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-29690
>                 URL: https://issues.apache.org/jira/browse/HIVE-29690
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.2.0
>            Reporter: Vikram Ahuja
>            Assignee: Vikram Ahuja
>            Priority: Major
>
> When the HMSHandler.get_table_objects_by_name_req(req) HMS API is called on a 
> database with a large number of tables (~5000) and both a pattern and 
> tableNames are set, the following exception is thrown on the HMS side:
> {code:java}
> 2026-06-29 16:03:47,778 ERROR handler.AbstractRequestHandler: GetTableHandler [fb57da2e-2590-4b03-ae11-c4d14f01eb6f-10645] Failed
> java.lang.StackOverflowError
>     at org.datanucleus.store.query.expression.ExpressionCompiler.isOperator(ExpressionCompiler.java:894)
>     at org.datanucleus.store.query.expression.ExpressionCompiler.compileOrAndExpression(ExpressionCompiler.java:202)
>     at org.datanucleus.store.query.expression.ExpressionCompiler.compileExpression(ExpressionCompiler.java:191)
> 2026-06-29 16:03:47,803 ERROR metastore.RetryingHMSHandler: MetaException(message:GetTableHandler [fb57da2e-2590-4b03-ae11-c4d14f01eb6f-10645] failed with null)
>     at org.apache.hadoop.hive.metastore.handler.AbstractRequestHandler.getRequestStatus(AbstractRequestHandler.java:245)
>     at org.apache.hadoop.hive.metastore.handler.AbstractRequestHandler.getResult(AbstractRequestHandler.java:288)
>     at org.apache.hadoop.hive.metastore.handler.GetTableHandler.getTables(GetTableHandler.java:582)
>     at org.apache.hadoop.hive.metastore.HMSHandler.get_table_objects_by_name_req(HMSHandler.java:1472)
>     at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:91)
>     at org.apache.hadoop.hive.metastore.AbstractHMSHandlerProxy.invoke(AbstractHMSHandlerProxy.java:82)
>     at jdk.proxy2/jdk.proxy2.$Proxy31.get_table_objects_by_name_req(Unknown Source)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_objects_by_name_req.getResult(ThriftHiveMetastore.java:21162)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_objects_by_name_req.getResult(ThriftHiveMetastore.java:21141)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>     at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:103)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>     at java.base/java.lang.Thread.run(Thread.java:1583)
> {code}
> This error occurs only when both tablesPattern and tableNames are passed. If 
> either of them is null, the issue does not occur.
>  
> A sample program to reproduce this issue:
> {code:java}
> import org.apache.hadoop.hive.metastore.api.GetTablesRequest;
> import org.apache.hadoop.hive.metastore.api.GetTablesResult;
> import org.apache.hadoop.hive.metastore.api.Table;
> import org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore;
> import org.apache.thrift.protocol.TBinaryProtocol;
> import org.apache.thrift.transport.TSocket;
> import org.apache.thrift.transport.TTransport;
> import java.util.List;
> public class DirectHMSHandlerClient {
>     public static void main(String[] args) throws Exception {
>         String host = "localhost";
>         int port = 9083;
>         // 1. Open Thrift transport to the running HMS
>         TTransport transport = new TSocket(host, port);
>         transport.open();
>         ThriftHiveMetastore.Client client =
>                 new ThriftHiveMetastore.Client(new TBinaryProtocol(transport));
>         try {
>             // 2. List all databases
>             List<String> databases = client.get_all_databases();
>             System.out.println("=== Databases ===");
>             databases.forEach(System.out::println);
>             // 3. List all tables in 'mydb1'
>             List<String> tables = client.get_all_tables("mydb1");
>             System.out.println("\n=== Tables in 'mydb1' ===");
>             System.out.println("Total tables: " + tables.size());
>             // 4. Call get_table_objects_by_name_req directly
>             if (!tables.isEmpty()) {
>                 GetTablesRequest req = new GetTablesRequest("mydb1");
>                 req.setTablesPattern(".*");
>                 System.out.println("\n=== get_table_objects_by_name_req : with tablesPattern '.*' and names = null ===");
>                 GetTablesResult result = client.get_table_objects_by_name_req(req);
>                 req = new GetTablesRequest("mydb1");
>                 req.setTblNames(tables);
>                 System.out.println("\n=== get_table_objects_by_name_req : with tablesPattern 'null' and names not null ===");
>                 result = client.get_table_objects_by_name_req(req);
>                 req = new GetTablesRequest("mydb1");
>                 req.setTblNames(tables);
>                 req.setTablesPattern(".*");
>                 System.out.println("\n=== get_table_objects_by_name_req : with tablesPattern '.*' and names not null ===");
>                 // The call below triggers the StackOverflowError on the HMS side
>                 result = client.get_table_objects_by_name_req(req);
>             }
>         } finally {
>             transport.close();
>         }
>     }
> }
> {code}
>  
> Compile Command:
> {code}
> HADOOP_LIB=/Users/vikram/hadoop-3.3.6.3.3.6.2-0/share/hadoop
> HIVE_LIB=/Users/vikram/opensource/hive/packaging/target/apache-hive-4.3.0-SNAPSHOT-bin/apache-hive-4.3.0-SNAPSHOT-bin/lib
> CP=$(echo $HIVE_LIB/*.jar | tr ' ' ':')
> CP="$CP:$(echo $HADOOP_LIB/common/*.jar | tr ' ' ':')"
> CP="$CP:$(echo $HADOOP_LIB/common/lib/*.jar | tr ' ' ':')"
> javac -proc:none -cp "$CP" /Users/vikram/opensource/hms-client-test/DirectHMSHandlerClient.java -d /Users/vikram/opensource/hms-client-test/
> {code}
> Run Command:
> {code}
> HADOOP_LIB=/Users/vikram/Work/hadoop-3.3.6.3.3.6.2-0/share/hadoop
> HIVE_LIB=/Users/vikram/opensource/hive/packaging/target/apache-hive-4.3.0-SNAPSHOT-bin/apache-hive-4.3.0-SNAPSHOT-bin/lib
> CP=/Users/vikram/opensource/hms-client-test
> CP="$CP:$(echo $HIVE_LIB/*.jar | tr ' ' ':')"
> CP="$CP:$(echo $HADOOP_LIB/common/*.jar | tr ' ' ':')"
> CP="$CP:$(echo $HADOOP_LIB/common/lib/*.jar | tr ' ' ':')"
> java -cp "$CP" DirectHMSHandlerClient
> {code}
>  
> *Root Cause: DataNucleus ExpressionCompiler Stack Overflow*
> In HIVE-24769, getTableObjectsByName was refactored to use a flexible 
> dynamic filter builder (appendSimpleCondition). The original JDOQL contains() 
> approach was replaced with an explicit OR chain:
>  
> Old (Hive 3):   "... && tbl_names.contains(tableName)"
>                   → SQL: WHERE tableName IN ('t1', 't2', ..., 'tN')
>  
> New (Hive 4):   "tableName == :p1 || tableName == :p2 || ... || tableName == :pN"
>                   → one OR operator per table name
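
To make the difference in filter shape concrete, the following is a minimal sketch that builds both filter strings and counts the || operators in each. It is plain string handling only and does not call Hive or DataNucleus; the class and method names (FilterShapeSketch, orChainFilter, containsFilter, countOrOperators) are hypothetical and chosen for illustration, and the filter text is copied from the description above rather than from the Hive source.

```java
import java.util.StringJoiner;

final class FilterShapeSketch {
    // Hive 4 shape as described above: one equality test per table name, joined by ||.
    static String orChainFilter(int tableCount) {
        StringJoiner filter = new StringJoiner(" || ");
        for (int i = 1; i <= tableCount; i++) {
            filter.add("tableName == :p" + i);
        }
        return filter.toString();
    }

    // Hive 3 shape as described above: a single contains() over a bound collection.
    static String containsFilter() {
        return "tbl_names.contains(tableName)";
    }

    // Counts non-overlapping occurrences of "||" in the filter text.
    static int countOrOperators(String filter) {
        int count = 0;
        int idx = filter.indexOf("||");
        while (idx >= 0) {
            count++;
            idx = filter.indexOf("||", idx + 2);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("OR operators, Hive 4 shape: " + countOrOperators(orChainFilter(5000)));
        System.out.println("OR operators, Hive 3 shape: " + countOrOperators(containsFilter()));
    }
}
```

The contains() form keeps the filter text the same size regardless of how many names are bound, while the OR chain grows linearly with the name list.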
>  
> DataNucleus (the JDO/ORM layer used by HMS) parses JDOQL filter strings at 
> query creation time using a recursive descent parser. The relevant methods 
> are mutually recursive:
>  
>   compileExpression()
>     └── compileOrAndExpression()
>           └── compileExpression()        ← recurses back
>                 └── compileOrAndExpression()
>                       ...
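
The mutual recursion sketched above can be modelled with a toy parser. This is not DataNucleus code: it is a simplified right-recursive parser, written under the assumption that each || sends the right-hand side back through the same pair of methods, and it only records the deepest nesting reached. The class name RecursionDepthSketch and the helper maxDepthFor are hypothetical; the two method names are borrowed from the stack trace purely for readability.

```java
final class RecursionDepthSketch {
    private final String[] tokens;
    private int pos = 0;
    private int depth = 0;
    private int maxDepth = 0;

    private RecursionDepthSketch(String filter) {
        this.tokens = filter.trim().split("\\s+");
    }

    // First half of the mutually recursive pair.
    private void compileExpression() {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        compileOrAndExpression();
        depth--;
    }

    // Second half: parses one "tableName == :pN" term, then recurses on "||".
    private void compileOrAndExpression() {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        pos += 3; // consume the three tokens of "tableName == :pN"
        if (pos < tokens.length && tokens[pos].equals("||")) {
            pos++;               // consume "||"
            compileExpression(); // right-hand side is parsed one level deeper
        }
        depth--;
    }

    // Builds an OR chain over tableCount names and returns the deepest nesting reached.
    static int maxDepthFor(int tableCount) {
        StringBuilder filter = new StringBuilder();
        for (int i = 1; i <= tableCount; i++) {
            if (i > 1) {
                filter.append(" || ");
            }
            filter.append("tableName == :p").append(i);
        }
        RecursionDepthSketch parser = new RecursionDepthSketch(filter.toString());
        parser.compileExpression();
        return parser.maxDepth;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 100, 1000}) {
            System.out.println(n + " names -> max depth " + maxDepthFor(n));
        }
    }
}
```

In this toy model the depth grows linearly with the number of names, two frames per term, which is the growth pattern the report attributes to the real compiler.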
>  
> Each || operator in the filter string deepens this recursion and consumes 
> additional Java stack frames. With 5000 tables:
>  - getTableObjectsByName builds a filter with 4999 || operators and 10,000 
> bound parameters
>  - DataNucleus attempts to parse this at pm.newQuery(...) time
>  - ~10,000 stack frames are consumed → StackOverflowError
>  
> When tablePattern is null, batching is applied by default; when tablePattern 
> is not null, batching is not applied. As a result, the code skips batching 
> when both tablePattern and tableNames are non-null, which causes this issue.
>  
> {*}Fix{*}: Apply batching based on tableNames rather than tablePattern.
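
As a rough illustration of the batching idea, the sketch below splits a name list into fixed-size chunks so that each per-batch filter contains at most batchSize - 1 || operators. It is not the actual patch: the class NameBatchingSketch, the partition helper, and the batch size of 1000 are assumptions made for the example, and the real change would go through whatever batching mechanism HMS already uses.

```java
import java.util.ArrayList;
import java.util.List;

final class NameBatchingSketch {
    // Splits items into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> tableNames = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            tableNames.add("t" + i);
        }
        int batchSize = 1000; // hypothetical value, not taken from Hive configuration
        List<List<String>> batches = partition(tableNames, batchSize);
        System.out.println("batches: " + batches.size());
        // Each batch would be queried separately, keeping every filter short.
        for (List<String> batch : batches) {
            System.out.println("names in batch: " + batch.size()
                    + ", || operators in its filter: " + (batch.size() - 1));
        }
    }
}
```

Keying the decision on the name list, as the fix proposes, bounds the filter size whether or not a pattern is also supplied.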



