[ https://issues.apache.org/jira/browse/SPARK-44700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
jiahong.li updated SPARK-44700:
-------------------------------
    Description:
SQL like the following:
```
select tmp.*
 from
 (select
        device_id, ads_id,
        from_json(regexp_replace(device_personas, '(?<=(\\\\{|,))"device_', '"user_device_'), ${device_schema}) as tmp
        from input )
-- ${device_schema} has more than 100 fields
```
If the rule OptimizeCsvJsonExprs is applied, the regexp_replace expression is invoked many times, which costs a lot of time.

  was:
SQL like the following:
```
select tmp.*
 from
 (select
        device_id, ads_id,
        from_json(regexp_replace(device_personas, '(?<=(\\\{|,))"device_', '"user_device_'), ${device_schema}) as tmp
        from input )
-- ${device_schema} has more than 100 fields
```
If the rule OptimizeCsvJsonExprs is applied, the regexp_replace expression is invoked many times, which costs a lot of time

> Rule OptimizeCsvJsonExprs should not be applied to expression like from_json(regexp_replace)
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-44700
>                 URL: https://issues.apache.org/jira/browse/SPARK-44700
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0, 3.4.1
>            Reporter: jiahong.li
>            Priority: Minor
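The cost described in the report can be illustrated with a toy model. The sketch below is plain Python, not Spark code: `regexp_replace`, `from_json`, `eval_single_parse`, `eval_per_field_parse` and `count_calls` are hypothetical stand-ins written for this example. It assumes, as the report suggests, that after the rule runs each selected field ends up with its own from_json over the same regexp_replace input, so the replacement runs once per field instead of once per row.

```python
import json
import re

# Toy model only: counts how often the inner replacement runs per row.
_calls = {"regexp_replace": 0}


def regexp_replace(text):
    """Stand-in for the SQL regexp_replace call used in the report."""
    _calls["regexp_replace"] += 1
    return re.sub(r'(?<=(\{|,))"device_', '"user_device_', text)


def from_json(text, fields):
    """Stand-in for from_json: parse the text and keep only `fields`."""
    parsed = json.loads(text)
    return {name: parsed.get(name) for name in fields}


def eval_single_parse(row, fields):
    """One from_json over the full schema, then plain field extraction."""
    struct = from_json(regexp_replace(row), fields)
    return [struct[name] for name in fields]


def eval_per_field_parse(row, fields):
    """Assumed shape after the rule: one single-field from_json per column."""
    return [from_json(regexp_replace(row), [name])[name] for name in fields]


def count_calls(evaluator, row, fields):
    """Run `evaluator` once and report how many replacements it triggered."""
    _calls["regexp_replace"] = 0
    values = evaluator(row, fields)
    return values, _calls["regexp_replace"]
```

In this model, a row with three fields triggers one replacement under `eval_single_parse` and three under `eval_per_field_parse`. With a schema of more than 100 fields, the per-field shape would run the replacement more than 100 times per row, which matches the slowdown the report describes.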