[ 
https://issues.apache.org/jira/browse/SPARK-44700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jiahong.li updated SPARK-44700:
-------------------------------
    Description: 
Consider a _SQL_ query like the one below:

```

select tmp.* 
 from
 (select
        device_id, ads_id, 
        from_json(regexp_replace(device_personas, '(?<=(\\\\{|,))"device_', 
'"user_device_'), ${device_schema}) as tmp
        from input )
```

${device_schema} has more than 100 fields.

If the rule OptimizeCsvJsonExprs is applied, the regexp_replace expression is 
invoked many times, which costs a lot of time.
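
Until the rule is changed, one possible session-level workaround is to switch the optimization off. This is only a sketch and has not been verified against this query. It assumes that the `spark.sql.optimizer.enableJsonExpressionOptimization` setting gates the JSON rewrite in the affected versions, and that the rule can be excluded under the fully qualified name `org.apache.spark.sql.catalyst.optimizer.OptimizeCsvJsonExprs`.

```sql
-- Option 1 (assumption: this flag controls the JSON part of OptimizeCsvJsonExprs)
SET spark.sql.optimizer.enableJsonExpressionOptimization=false;

-- Option 2 (assumption: the rule is excludable and this is its full class name)
SET spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.OptimizeCsvJsonExprs;
```

Either setting affects every JSON expression in the session, not only from_json(regexp_replace(...)). Comparing the `EXPLAIN` output of the query before and after the change is a way to check whether the plan still differs.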

 

  was:
Consider a _SQL_ query like the one below:

```

select tmp.* 
 from
 (select
        device_id, ads_id, 
        from_json(regexp_replace(device_personas, '(?<=(\\\{|,))"device_', 
'"user_device_'), ${device_schema}) as tmp
        from input )
```

${device_schema} has more than 100 fields

If the rule OptimizeCsvJsonExprs is applied, the regexp_replace expression is 
invoked many times, which costs a lot of time

 


> Rule OptimizeCsvJsonExprs should not be applied to expression like 
> from_json(regexp_replace)
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-44700
>                 URL: https://issues.apache.org/jira/browse/SPARK-44700
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0, 3.4.1
>            Reporter: jiahong.li
>            Priority: Minor
>
> Consider a _SQL_ query like the one below:
> ```
> select tmp.* 
>  from
>  (select
>         device_id, ads_id, 
>         from_json(regexp_replace(device_personas, '(?<=(\\\\{|,))"device_', 
> '"user_device_'), ${device_schema}) as tmp
>         from input )
> ```
> ${device_schema} has more than 100 fields.
> If the rule OptimizeCsvJsonExprs is applied, the regexp_replace expression is 
> invoked many times, which costs a lot of time.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

