Unsubscribe

2024-02-26 Thread 18679131354
Unsubscribe

Email from 杨作青

2024-02-26 Thread 杨作青
Unsubscribe

Re: Schema Evolution & Json Schemas

2024-02-26 Thread Salva Alcántara
Awesome Andrew, thanks a lot for the info!

On Sun, Feb 25, 2024 at 4:37 PM Andrew Otto wrote:

> > the following code generator
> Oh, and FWIW we avoid code generation and POJOs, and instead rely on
> Flink's Row or RowData abstractions.
>
> On Sun, Feb 25, 2024 at 10:35 AM Andrew Otto wrote:
>
>> Hi!
>>
>> I'm not sure if this is totally relevant for you, but we use JSONSchema
>> and JSON with Flink at the Wikimedia Foundation.
>> We explicitly disallow the use of additionalProperties, unless it is to
>> define Map type fields (where additionalProperties itself is a schema).
>>
>> We have JSONSchema converters and JSON Serdes to be able to use our
>> JSONSchemas and JSON records with both the DataStream API (as Row) and
>> Table API (as RowData).
>>
>> See:
>> -
>> https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/formats/json
>> -
>> https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/#managing-a-object
>>
>> State schema evolution is supported via the EventRowTypeInfo wrapper.
>>
>> Less directly about Flink: I gave a talk at Confluent's Current conf in
>> 2022 about why we use JSONSchema.
>> See also this blog post series if you are interested!
>>
>> -Andrew Otto
>>  Wikimedia Foundation
>>
>>
>> On Fri, Feb 23, 2024 at 1:58 AM Salva Alcántara wrote:
>>
>>> I'm facing some issues related to schema evolution in combination with
>>> the usage of Json Schemas and I was just wondering whether there are any
>>> recommended best practices.
>>>
>>> In particular, I'm using the following code generator:
>>>
>>> - https://github.com/joelittlejohn/jsonschema2pojo
>>>
>>> The main gotchas so far relate to the `additionalProperties` field. When
>>> setting that to true, the resulting POJO is not valid according to Flink
>>> rules because the generated getter/setter methods don't follow the Java
>>> Beans naming conventions, e.g., see here:
>>>
>>> - https://github.com/joelittlejohn/jsonschema2pojo/issues/1589
>>>
>>> This means that the Kryo fallback is used for serialization purposes,
>>> which is not only bad for performance but also breaks state schema
>>> evolution.
>>>
>>> So, because of that, setting `additionalProperties` to `false` looks
>>> like a good idea but then your job will break if an upstream/producer
>>> service adds a property to the messages you are reading. To solve this
>>> problem, the POJOs for your job (as a reader) can be generated to ignore
>>> the `additionalProperties` field (via the `@JsonIgnore` Jackson
>>> annotation). This seems to be a good overall solution to the problem, but
>>> looks a bit convoluted to me / didn't come without some trial & error (=
>>> pain & frustration).
>>>
>>> Is there anyone here facing similar issues? It would be good to hear
>>> your thoughts on this!
>>>
>>> BTW, this is a very interesting article that touches on the
>>> above-mentioned difficulties:
>>> -
>>> https://www.creekservice.org/articles/2024/01/09/json-schema-evolution-part-2.html
>>>
>>>
>>>
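
The POJO problem described in this thread can be illustrated with a small, standard-library-only sketch. It is a simplified approximation of the getter/setter rule only, not Flink's actual type extraction logic. `PojoRuleCheck`, `GeneratedEvent`, and `fieldsBreakingBeanRules` are hypothetical names, and `GeneratedEvent` merely imitates the shape of generator output discussed above (a map getter paired with a two-argument "any setter").

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified approximation of the bean-style accessor rule; NOT Flink's real implementation.
class PojoRuleCheck {

    // Hypothetical class imitating generated code with additionalProperties enabled.
    public static class GeneratedEvent {
        private String id;
        private Map<String, Object> additionalProperties = new HashMap<>();

        public GeneratedEvent() {}

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        public Map<String, Object> getAdditionalProperties() { return additionalProperties; }

        // Two-argument setter with a singular name: there is no setAdditionalProperties(Map).
        public void setAdditionalProperty(String name, Object value) {
            additionalProperties.put(name, value);
        }
    }

    /** Lists fields that are neither public nor covered by a public getX/isX and setX(type) pair. */
    public static List<String> fieldsBreakingBeanRules(Class<?> type) {
        List<String> offenders = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || field.isSynthetic()) {
                continue; // such fields are not part of the serialized state
            }
            if (Modifier.isPublic(mods)) {
                continue; // public fields need no accessors
            }
            String name = field.getName();
            String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            boolean hasGetter = hasPublicMethod(type, "get" + suffix)
                    || hasPublicMethod(type, "is" + suffix);
            boolean hasSetter = hasPublicMethod(type, "set" + suffix, field.getType());
            if (!hasGetter || !hasSetter) {
                offenders.add(name);
            }
        }
        return offenders;
    }

    private static boolean hasPublicMethod(Class<?> type, String name, Class<?>... params) {
        try {
            type.getMethod(name, params); // getMethod only finds public methods
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fieldsBreakingBeanRules(GeneratedEvent.class));
    }
}
```

A check of this kind could run in a unit test over generated classes, so that a field which would push a type onto the Kryo fallback is caught before deployment.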


Re: How can I obtain job lineage for a Flink DataStream job?

2024-02-26 Thread Feng Jin
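From the JobGraph you can get the transformation information and, from it, the
specific Source or Doris Sink; you can then use reflection to read the
properties they hold.

You can refer to the OpenLineage[1] implementation.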


1.
https://github.com/OpenLineage/OpenLineage/blob/main/integration/flink/shared/src/main/java/io/openlineage/flink/visitor/wrapper/FlinkKafkaConsumerWrapper.java


Best,
Feng


On Mon, Feb 26, 2024 at 6:20 PM casel.chen wrote:

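> A Flink DataStream job consumes from MySQL CDC, processes the data, and writes
> it to Apache Doris. Is there a way (from the JobGraph/StreamGraph) to obtain the
> source/sink connector information, including the connection string, database
> name, and table name?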
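
The reflection step suggested in the reply above can be sketched with the Java standard library alone. This is only an illustration under stated assumptions: `BaseSource`, `MySource`, the field name `properties`, and the keys `url` and `table` are hypothetical stand-ins, and real connectors store their configuration under different class and field names that have to be looked up per connector version.

```java
import java.lang.reflect.Field;
import java.util.Properties;

class ConnectorPropertiesReader {

    // Hypothetical stand-ins for a connector class hierarchy; not real Flink or Doris classes.
    static class BaseSource {
        private final Properties properties = new Properties();

        BaseSource(String url, String table) {
            properties.setProperty("url", url);
            properties.setProperty("table", table);
        }
    }

    static class MySource extends BaseSource {
        MySource(String url, String table) {
            super(url, table);
        }
    }

    /** Walks up the class hierarchy until a field with the given name is found, then reads it. */
    public static <T> T readField(Object target, String fieldName, Class<T> type) {
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field field = c.getDeclaredField(fieldName);
                field.setAccessible(true); // the field is private in the declaring class
                return type.cast(field.get(target));
            } catch (NoSuchFieldException e) {
                // not declared on this class; continue with the superclass
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + fieldName, e);
            }
        }
        throw new IllegalArgumentException("No field named " + fieldName);
    }

    public static void main(String[] args) {
        Object source = new MySource("jdbc:mysql://localhost:3306/shop", "orders");
        Properties props = readField(source, "properties", Properties.class);
        System.out.println(props.getProperty("url") + " / " + props.getProperty("table"));
    }
}
```

Reading private fields this way is fragile: a field rename in a connector release breaks lineage extraction silently, so each lookup is best isolated in a small per-connector wrapper.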


How can I obtain job lineage for a Flink DataStream job?

2024-02-26 Thread casel.chen
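A Flink DataStream job consumes from MySQL CDC, processes the data, and writes it
to Apache Doris. Is there a way (from the JobGraph/StreamGraph) to obtain the
source/sink connector information, including the connection string, database
name, and table name?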

Re: Flink Scala Positions in India or USA !

2024-02-26 Thread Martijn Visser
Hi,

Please don't use the mailing list for this purpose.

Best regards,

Martijn

On Wed, Feb 21, 2024 at 4:08 PM sri hari kali charan Tummala wrote:
>
> Hi Folks,
>
> I am currently seeking full-time Flink Scala positions (non-consulting) in
> India or the USA, specifically at the Principal or Staff level.
>
> I require an H-1B transfer and assistance with relocation from India; my I-140
> is approved.
>
> Thanks & Regards
> Sri Tummala
>