Hi Etienne,

Thanks for reaching out! I think your list already looks very appealing.

> - metrics (https://github.com/apache/flink/pull/14510): it was
>   dealing with delimiters. I think it is a bit low level for a blog post ?

I am also unsure whether this is a good fit to present. I can only imagine showing 
what kind of use case it supports.


> 
> - migration of pipelines from DataSet API to DataStream API: it is
>   already discussed in the flink website

This is definitely something I’d like to see; in my opinion it could even become a 
series, because the topic has a lot of aspects. If you want to write a 
post about it, it would be great to show the migration of a more complex 
pipeline (i.e. old formats, incompatible types, …). Many users will 
eventually face this, so it has a big impact. FYI, Flink 1.13 is probably 
the last version with full DataSet support.
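Just to illustrate what such a post could show, here is a rough, made-up word-count
sketch of the migration; the relevant parts are RuntimeExecutionMode.BATCH and
keyBy() replacing groupBy() (everything else is hypothetical):

    import org.apache.flink.api.common.RuntimeExecutionMode;
    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class WordCountMigration {
        public static void main(String[] args) throws Exception {
            // Old DataSet version (roughly):
            //   ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            //   env.fromElements("a", "b", "a")
            //      .map(w -> Tuple2.of(w, 1)).returns(Types.TUPLE(Types.STRING, Types.INT))
            //      .groupBy(0).sum(1).print();

            // DataStream version, executed in batch mode on the same bounded input.
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setRuntimeMode(RuntimeExecutionMode.BATCH);

            env.fromElements("a", "b", "a")
               .map(w -> Tuple2.of(w, 1))
               .returns(Types.TUPLE(Types.STRING, Types.INT))
               .keyBy(t -> t.f0)   // groupBy(0) becomes keyBy(...)
               .sum(1)
               .print();

            env.execute("wordcount-migration");
        }
    }

A trivial sketch like this would of course not cover the interesting cases (old input 
formats, incompatible types, …), which is exactly why a post, or a series, would be 
valuable.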

> 
> - accumulators (https://github.com/apache/flink/pull/14558): it was
>   about an asynchronous get, once again a bit too low level for a blog
>   post ?

To me, accumulators are a kind of internal concept, but maybe you can present the 
use case that drove this change? Even explaining their semantics is probably 
already complicated.
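For context, this is roughly what the user-facing side of accumulators looks like (a
minimal sketch; the class and the accumulator name are made up). Showing something
like this next to the concrete use case behind the asynchronous get might work:

    import org.apache.flink.api.common.accumulators.IntCounter;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    // Counts processed records and exposes the count as a job-wide accumulator.
    public class CountingMapper extends RichMapFunction<String, String> {

        private final IntCounter numRecords = new IntCounter();

        @Override
        public void open(Configuration parameters) {
            // Register the accumulator under a (made-up) name.
            getRuntimeContext().addAccumulator("num-records", numRecords);
        }

        @Override
        public String map(String value) {
            numRecords.add(1);
            return value;
        }
    }

    // Once the job has finished, the aggregated value is available on the
    // JobExecutionResult, e.g.:
    //   Integer count = env.execute("my-job").getAccumulatorResult("num-records");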


> 
> - FileInputFormat mainly parquet improvements and fixes
>   (https://github.com/apache/flink/pull/15725,
>   https://github.com/apache/flink/pull/15172,
>   https://github.com/apache/flink/pull/15156): interesting but as this
>   API is being decommissioned, it might not be a good subject ?

You have already summarized it: the API is being deprecated, and the much more 
interesting topic is the migration from DataSet to the DataStream API in 
case these old formats are used.
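If the post touches these old formats, a short before/after snippet would probably be
enough. A rough sketch of the "after" side with the unified FileSource (the path is
made up, and the exact name of the line-reading StreamFormat differs between
releases; TextLineInputFormat in recent ones):

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.connector.file.src.FileSource;
    import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class NewFileSourceExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Replacement for the old FileInputFormat-based readers: the unified FileSource.
            FileSource<String> source = FileSource
                    .forRecordStreamFormat(new TextLineInputFormat(), new Path("/tmp/input"))
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source")
               .print();

            env.execute("file-source-example");
        }
    }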


> 
> - doing a manual join in DataStream API in batch mode with
>   KeyedCoProcessFunction (https://issues.apache.org/jira/browse/FLINK-22587).
>   As the target is more Flink table/SQL for these kind of things, the
>   same deprecation comment as above applies.
> 

I tend not to cover this topic, because my recommendation would be to use the 
Table API directly rather than building your own join in the DataStream API ;)
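For comparison, the same kind of join is only a few lines with the Table API / SQL.
A rough sketch (the Orders/Customers tables, their columns, and their registration
via CREATE TABLE are assumed):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;

    public class BatchJoinExample {
        public static void main(String[] args) {
            // Batch Table environment.
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().inBatchMode().build());

            // Assumes "Orders" and "Customers" were registered beforehand,
            // e.g. via tEnv.executeSql("CREATE TABLE Orders (...) WITH (...)").
            Table joined = tEnv.sqlQuery(
                    "SELECT o.order_id, c.name "
                            + "FROM Orders AS o "
                            + "JOIN Customers AS c ON o.customer_id = c.id");

            joined.execute().print();
        }
    }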

> => maybe a blog post on back pressure in checkpointing 
> (https://github.com/apache/flink/pull/13040). WDYT ?
> 

This is also an interesting topic, but we are constantly working on improving the 
situation, so I am unsure whether such a blog post would still be up to date by 
the time it is released.


Please let me know what you think. I am also happy to give more detailed feedback 
on any of the topics if you need it.

Best,
Fabian
