Thank you, Francesco

> On 3 Feb 2022, at 18:21, Francesco Guardiani <france...@ververica.com> wrote:
> 
> Hi,
> 
> I think the more stable option would be the first one, as it also gives you 
> more flexibility. Reading the row as a string and then parsing it in a query 
> definitely costs more, and makes it less straightforward to use the other 
> schema features of a table, such as watermark definitions, primary keys, etc.
> 
> I guess you can implement it straightforwardly by subclassing the existing 
> JSON format provided by Flink, in particular JsonRowDataDeserializationSchema.
> 
> A third solution would be to create a SplitFunction, like the one you 
> created, which directly performs the parsing, outputting rows rather than 
> strings. This removes the double parsing issue, but still creates problems 
> when interacting with other schema features.
> 
> Hope it helps,
> FG
> 
> On Thu, Feb 3, 2022 at 3:56 PM Илья Соин <ilya.soin...@gmail.com> wrote:
> Hi, 
> 
> I’m using the Table / SQL API. 
> 
> I have a stream of strings, where each message contains several JSON strings 
> separated by "\n". 
> For example:
> {"timestamp": "2021-01-01T00:00:00", "person": {"name": "Vasya"}}\n 
> {"timestamp": "2021-01-01T01:00:00", "person": {"name": "Max"}}
> 
> I would like to split each message by "\n", parse each string as a JSON 
> object and get some of the fields. 
> 
> AFAIK there are 2 ways to do it:
> 
> 1) Write a custom deserialiser and reference it in the source table DDL, i.e. 
> CREATE TABLE source (
>     `timestamp` STRING,
>     person ROW(name STRING)
> )
> WITH ('format' = 'multiline-json', …);
> 
> 2) Use 'format' = 'raw' and extract the needed fields using .jsonValue, i.e.
> 
> CREATE TABLE source (
>     `row` STRING
> );
> 
> env.from("source")
>         .joinLateral(
>             call(SplitFunction.class, $("row"), "\n").as("msg")
>         )
>         .select(
>              $("msg").jsonValue("$.timestamp", DataTypes.STRING()),
>              $("msg").jsonValue("$.person.name", DataTypes.STRING()).as("name")
>        );
> 
> In 2), will each call of .jsonValue parse the string all over again or will 
> it reuse the same JsonNode object internally? Which option better fits my 
> problem?
> 
> __
> Best, Ilya
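
For reference, the splitting step behind option 1 in the thread above could look roughly like the sketch below. It is plain Java with no Flink dependency, and the class and method names (MultilineJsonSplitter, splitRecords) are hypothetical, not part of Flink. It assumes every JSON document sits on a single line, i.e. newlines inside string values are escaped. In a real custom format, the natural place to call something like this would presumably be DeserializationSchema#deserialize(byte[] message, Collector<T> out), handing each piece to the existing JSON deserializer and emitting one row per piece.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: splits one raw message into JSON records.
final class MultilineJsonSplitter {

    /**
     * Returns the non-blank lines of the message, one JSON document per line.
     * Assumes UTF-8 input and that no JSON document spans several lines.
     */
    static List<String> splitRecords(byte[] message) {
        List<String> records = new ArrayList<>();
        if (message == null) {
            return records; // treat a null payload as an empty message
        }
        String text = new String(message, StandardCharsets.UTF_8);
        for (String line : text.split("\n")) {
            String trimmed = line.trim(); // also drops a trailing '\r'
            if (!trimmed.isEmpty()) {
                records.add(trimmed);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String message =
                "{\"timestamp\": \"2021-01-01T00:00:00\", \"person\": {\"name\": \"Vasya\"}}\n"
                        + "{\"timestamp\": \"2021-01-01T01:00:00\", \"person\": {\"name\": \"Max\"}}";
        // Each record would be handed to the JSON deserializer in a real format.
        for (String record : splitRecords(message.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(record);
        }
    }
}
```

With this approach each JSON string is parsed only once, by the format itself, and the table keeps a typed schema, so watermarks and primary keys can be declared on it as usual.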
