Re:Re: [DISCUSS] Release 2.0 Work Items

2023-07-18 Thread Wencong Liu
Hi Chesnay,
Thanks for the reply. I think it is reasonable to remove the configuration
argument in AbstractUdfStreamOperator#open if it is consistently empty.
I'll propose a discussion about the specific actions in FLINK-6912 at a
later time.
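
For context, a minimal sketch of the call Chesnay is referring to (simplified,
not verbatim Flink source): AbstractUdfStreamOperator#open hands the user
function a freshly created Configuration, so the parameter received by
RichFunction#open never carries any user settings.

    // Sketch of AbstractUdfStreamOperator#open (simplified):
    @Override
    public void open() throws Exception {
        super.open();
        // a brand-new, empty Configuration is passed on to RichFunction#open
        FunctionUtils.openFunction(userFunction, new Configuration());
    }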


Best,
Wencong Liu

At 2023-07-18 16:38:59, "Chesnay Schepler"  wrote:
>On 18/07/2023 10:33, Wencong Liu wrote:
>> For FLINK-6912:
>>
>>  There are three implementations of RichFunction that actually use
>> the Configuration parameter in RichFunction#open:
>>  1. ContinuousFileMonitoringFunction#open: It uses the configuration
>> to configure the FileInputFormat. [1]
>>  2. OutputFormatSinkFunction#open: It uses the configuration
>> to configure the OutputFormat. [2]
>>  3. InputFormatSourceFunction#open: It uses the configuration
>>   to configure the InputFormat. [3]
>
>And none of them should have any effect since the configuration is empty.
>
>See org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#open.


Re:Re: [DISCUSS] Release 2.0 Work Items

2023-07-18 Thread Wencong Liu
Thanks Xintong Song and Matthias for the insightful discussion!


I have double-checked the Jira tickets that belong to the
"Need action in 1.18" section and have some input to share.

For FLINK-4675:

The StreamExecutionEnvironment argument of WindowAssigner.getDefaultTrigger()
is not used by any implementation of WindowAssigner and is no longer needed.
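
For illustration, a rough sketch of the change shape Xintong suggested
(deprecate the old method, add a non-argument variant); this is hypothetical
and the exact API would be settled in the FLIP:

    // today (simplified): the env argument is what FLINK-4675 would drop
    // public abstract Trigger<T, W> getDefaultTrigger(StreamExecutionEnvironment env);

    // possible 1.18 shape: keep the old method as a deprecated delegate and
    // introduce an argument-free variant (names/shape not final)
    @Deprecated
    public Trigger<T, W> getDefaultTrigger(StreamExecutionEnvironment env) {
        return getDefaultTrigger();
    }

    public abstract Trigger<T, W> getDefaultTrigger();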

For FLINK-6912:

There are three implementations of RichFunction that actually use 
the Configuration parameter in RichFunction#open:
1. ContinuousFileMonitoringFunction#open: It uses the configuration 
to configure the FileInputFormat. [1]
2. OutputFormatSinkFunction#open: It uses the configuration 
to configure the OutputFormat. [2]
3. InputFormatSourceFunction#open: It uses the configuration
to configure the InputFormat. [3]
I think RichFunction#open should still take a Configuration 
instance as an argument.
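
To make the dependency concrete, the pattern in the three open() methods above
is roughly the following (simplified paraphrase, not verbatim Flink source):

    @Override
    public void open(Configuration parameters) throws Exception {
        // the Configuration handed to RichFunction#open is forwarded to the
        // FileInputFormat / OutputFormat / InputFormat held by the function
        format.configure(parameters);
        // ...remaining setup omitted
    }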

For FLINK-5336:

There are three classes that de/serialize the Path through the
IOReadableWritable interface:
1. FileSourceSplitSerializer: It de/serializes the Path during the process 
of de/serializing FileSourceSplit. [4]
2. TestManagedSinkCommittableSerializer: It de/serializes the Path during 
the process of de/serializing TestManagedCommittable. [5]
3. TestManagedFileSourceSplitSerializer: It de/serializes the Path during 
the process of de/serializing TestManagedIterableSourceSplit. [6]
I think the Path should still implement the IOReadableWritable interface.
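
For reference, the usage pattern in those serializers is roughly as follows
(simplified sketch; the DataOutputSerializer / DataInputDeserializer helpers
are only assumed here to keep the example self-contained):

    // Path implements IOReadableWritable, so it can be written to and read
    // from Flink's DataOutputView / DataInputView directly (throws IOException).
    DataOutputSerializer out = new DataOutputSerializer(64);
    path.write(out);                                   // serialize the Path

    Path copy = new Path();
    copy.read(new DataInputDeserializer(out.getCopyOfBuffer()));   // read it back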


I plan to propose a discussion about removing the argument in FLINK-4675 and
to comment with the conclusions on FLINK-6912 and FLINK-5336. WDYT?

[1] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/ContinuousFileMonitoringFunction.java#L199
[2] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/OutputFormatSinkFunction.java#L63
[3] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/InputFormatSourceFunction.java#L64C2-L64C2
[4] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSourceSplitSerializer.java#L67
[5] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-table/flink-table-common/src/test/java/org/apache/flink/table/connector/sink/TestManagedSinkCommittableSerializer.java#L113
[6] 
https://github.com/apache/flink/blob/9c3c8afbd9325b5df8291bd831da2d9f8785b30a/flink-table/flink-table-common/src/test/java/org/apache/flink/table/connector/source/TestManagedFileSourceSplitSerializer.java#L56

At 2023-07-17 12:23:51, "Xintong Song"  wrote:
>Hi Matthias,
>
>How's it going with the summary of existing 2.0.0 jira tickets?
>
>I have gone through everything listed under FLINK-3957[1], and will
>continue with other Jira tickets whose fix-version is 2.0.0.
>
>Here are my 2-cents on the FLINK-3957 subtasks. Hope this helps on your
>summary.
>
>I'd suggest going ahead with the following tickets.
>
>   - Need action in 1.18
>  - FLINK-4675: Double-check whether the argument is indeed not used.
>  Introduce a new non-argument API, and mark the original one as
>  `@Deprecated`. FLIP needed.
>  - FLINK-6912: Double-check whether the argument is indeed not used.
>  Introduce a new non-argument API, and mark the original one as
>  `@Deprecated`. FLIP needed.
>  - FLINK-5336: Double-check whether `IOReadableWritable` is indeed not
>  needed for `Path`. Mark methods from `IOReadableWritable` as
>`@Deprecated`
>  in `Path`. FLIP needed.
>   - Need no action in 1.18
>  - FLINK-4602/14068: Already listed in the release 2.0 wiki [2]
>  - FLINK-3986/3991/3992/4367/5130/7691: Subsumed by "Deprecated
>  methods/fields/classes in DataStream" in the release 2.0 wiki [2]
>  - FLINK-6375: Change the hashCode behavior of `LongValue` (and other
>  numeric types).
>
>I'd suggest not doing the following tickets.
>
>   - FLINK-4147/4330/9529/14658: These changes are non-trivial for both
>   developers and users. Also, we are taking them into consideration when designing
>   the new ProcessFunction API. I'd be in favor of letting users migrate to
>   the ProcessFunction API directly once it's ready, rather than forcing users
>   to adapt to the breaking changes twice.
>   - FLINK-3610: Only affects Scala API, which will soon be removed.
>
>I don't have strong opinions on whether to work on the following tickets or
>not. Some of them are not very clear to me based on the description and
>conversation on the ticket, others may require further investigation and
>evaluation to decide. Unless someone volunteers to look into them, I'd be
>slightly