[ 
https://issues.apache.org/jira/browse/FLINK-16627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benchao Li reassigned FLINK-16627:
----------------------------------

    Assignee: yisha zhou

> Support only generate non-null values when serializing into JSON
> ----------------------------------------------------------------
>
>                 Key: FLINK-16627
>                 URL: https://issues.apache.org/jira/browse/FLINK-16627
>             Project: Flink
>          Issue Type: New Feature
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Planner
>    Affects Versions: 1.10.0
>            Reporter: jackray wang
>            Assignee: yisha zhou
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> auto-unassigned, pull-request-available, sprint
>
> {code:java}
> //sql
> CREATE TABLE sink_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //sql
> CREATE TABLE source_kafka ( subtype STRING , svt STRING ) WITH (……)
> {code}
>  
> {code:java}
> //scala udf
> class ScalaUpper extends ScalarFunction {    
> def eval(str: String) : String= { 
>        if(str == null){
>            return ""
>        }else{
>            return str
>        }
>     }
>     
> }
> btenv.registerFunction("scala_upper", new ScalaUpper())
> {code}
>  
> {code:java}
> //sql
> insert into sink_kafka select subtype, scala_upper(svt)  from source_kafka
> {code}
>  
>  
> ----
> Sometimes the svt's value is null, inert into kafkas json like  
> \{"subtype":"qin","svt":null}
> If the amount of data is small, it is acceptable,but we process 10TB of data 
> every day, and there may be many nulls in the json, which affects the 
> efficiency. If you can add a parameter to remove the null key when defining a 
> sinktable, the performance will be greatly improved
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to