[jira] [Updated] (FLINK-33871) Reduce getTable call for hive client and optimize graph generation time

ASF GitHub Bot (Jira) Mon, 18 Dec 2023 00:33:44 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-33871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


ASF GitHub Bot updated FLINK-33871:
-----------------------------------
    Labels: pull-request-available  (was: )

> Reduce getTable call for hive client and optimize graph generation time
> -----------------------------------------------------------------------
>
>                 Key: FLINK-33871
>                 URL: https://issues.apache.org/jira/browse/FLINK-33871
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: hehuiyuan
>            Priority: Major
>              Labels: pull-request-available
>
> The HiveCatalog.getHiveTable method wastes a lot of time during graph
> generation because it is called so many times.
> I have a SQL job with over 2000 rows in which HiveCatalog.getHiveTable
> is called 4879 times, although only six Hive tables are used.
> ![image](https://github.com/apache/flink/assets/18002496/d5f0daf3-f80a-4790-ae21-4e75dff9cfd7)
> The client.getTable method costs a lot of time.  
> ![image](https://github.com/apache/flink/assets/18002496/be0d176f-3915-4b92-a177-f1cfaf6d2927)
> The following estimates how long the JobManager spends interacting with Hive
> during graph generation.
> If one call takes approximately 50 milliseconds, the total is
> 4879 * 50 ms = 243950 ms = 243.95 s, or about 4 minutes.
> With a cache, client.getTable needs to be called only six times.
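
The caching idea in the description can be illustrated with a minimal sketch. This is not the patch attached to the issue and not Flink's API: `TableLookupCache`, `TableLookupCacheDemo`, and the string "metadata" values are hypothetical names made up for illustration, and the lambda merely stands in for the expensive client.getTable call. It assumes lookups can be keyed by database and table name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper (not part of Flink): memoizes table lookups by "db.table" key. */
final class TableLookupCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> loader;

    TableLookupCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Invokes the loader only on the first request for a given table. */
    T get(String databaseName, String tableName) {
        return cache.computeIfAbsent(databaseName + "." + tableName, loader);
    }

    /** Drops a cached entry, e.g. after the table is altered or dropped. */
    void invalidate(String databaseName, String tableName) {
        cache.remove(databaseName + "." + tableName);
    }
}

final class TableLookupCacheDemo {
    /** Issues 'lookups' requests spread over 'tables' distinct tables; returns the loader call count. */
    static int countLoaderCalls(int lookups, int tables) {
        AtomicInteger remoteCalls = new AtomicInteger();
        TableLookupCache<String> cache = new TableLookupCache<>(key -> {
            remoteCalls.incrementAndGet(); // stands in for the expensive client.getTable call
            return "metadata-for-" + key;
        });
        for (int i = 0; i < lookups; i++) {
            cache.get("default", "t" + (i % tables));
        }
        return remoteCalls.get();
    }

    public static void main(String[] args) {
        // 4879 lookups over six tables, as in the numbers reported above.
        System.out.println(countLoaderCalls(4879, 6));
    }
}
```

With the figures from the description, the loader runs once per distinct table rather than once per lookup. A real implementation would also need to decide when cached entries become stale; the `invalidate` method only hints at that.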



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
