[ https://issues.apache.org/jira/browse/FLINK-14243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16940619#comment-16940619 ]
jackylau commented on FLINK-14243:
----------------------------------

[~ykt836] The original UDF uses a cache, as you can see in Hive system UDFs such as get_json_object.

> Flink Hive UDFs need a thread-safety check when they use a cache
> -----------------------------------------------------------------
>
>                 Key: FLINK-14243
>                 URL: https://issues.apache.org/jira/browse/FLINK-14243
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive, Table SQL / Planner
>    Affects Versions: 1.9.0
>            Reporter: jackylau
>            Priority: Major
>             Fix For: 1.10.0
>
>
> Flink 1.9 brings in the Hive connector, but problems can arise when the
> original Hive UDF uses a cache. Hive parallelizes at the process level
> (each task runs in its own JVM), while Flink/Spark parallelize at the
> task (thread) level within a shared JVM. If Flink simply calls such a
> Hive UDF, thread-safety problems can occur when the cache is accessed
> concurrently. So the Hive UDF code may need to be checked, and if it is
> not thread-safe, the Flink parallelism should be set to 1.


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
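To make the race concrete, below is a minimal, self-contained Java sketch of the caching pattern the issue describes. It is not the real Hive get_json_object (UDFJson) code; the class CachedUdfSketch, its unsynchronized LRU cache, and the two-thread driver are all hypothetical, meant only to show how a per-JVM cache that is safe under Hive's one-process-per-task model can race once several Flink task threads in the same TaskManager JVM call the UDF concurrently.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch of the caching pattern discussed in FLINK-14243.
 * NOT the actual Hive UDF code; it only illustrates why a cache that is
 * safe under Hive's process-level parallelism can break under Flink's
 * thread-level parallelism, where task threads share one TaskManager JVM.
 */
public class CachedUdfSketch {

    // A plain access-ordered LinkedHashMap used as an LRU cache. It is not
    // synchronized, so concurrent access from multiple threads is unsafe:
    // even get() mutates the internal linked list in access-order mode.
    private static final int MAX_ENTRIES = 16;
    private static final Map<String, String> CACHE =
            new LinkedHashMap<String, String>(MAX_ENTRIES, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > MAX_ENTRIES;
                }
            };

    // Stand-in for the expensive work the cache is meant to avoid,
    // e.g. parsing a JSON document.
    private static String expensiveParse(String input) {
        return input.toUpperCase();
    }

    // UDF-style entry point: look up the cache, compute and store on a miss.
    public static String evaluate(String input) {
        String cached = CACHE.get(input);   // unsynchronized read (and relink)
        if (cached == null) {
            cached = expensiveParse(input);
            CACHE.put(input, cached);       // unsynchronized write
        }
        return cached;
    }

    public static void main(String[] args) throws InterruptedException {
        // Two "task threads" in the same JVM, as in a Flink TaskManager
        // running the UDF with parallelism > 1.
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                evaluate("key-" + (i % 64));
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("finished without the cache corrupting itself (not guaranteed)");
    }
}
{code}

Running something like this can lose cache entries, throw exceptions from inside the map, or loop forever in the map's internal relinking code, which is exactly the class of failure the issue asks Flink to guard against, either by verifying that a Hive UDF's cache is thread-safe or by forcing parallelism to 1 for that operator.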