Having looked at the proposed set of methods to remove I've noticed that
some of them are actually annotated with @Public. According to our
stability guarantees, only major releases (1.0, 2.0, etc.) can break APIs
with this annotation. Hence, I believe that we cannot simply remove them
unless the community decides to change the stability guarantees we give or
by making the next release a major release (Flink 2.0).

Cheers,
Till

On Tue, Aug 18, 2020 at 5:57 AM Yun Gao <yungao...@aliyun.com> wrote:

> +1 for removing the methods that are deprecated for a while & have
> alternative methods.
>
> One specific thing is that if we remove the DataStream#split, do we
> consider enabling side-output in more operators in the future ? Currently
> it should be only available in ProcessFunctions, but not available to other
> commonly used UDF like Source or AsyncFunction[1].
>
> One temporary solution occurs to me is to add a ProcessFunction after the
> operators want to use side-output. But I think the solution is not very
> direct to come up with and if it really works we might add it to the
> document of side-output.
>
> [1] https://issues.apache.org/jira/browse/FLINK-7954
>
> Best,
>  Yun
>
> ------------------Original Mail ------------------
> *Sender:*Kostas Kloudas <kklou...@gmail.com>
> *Send Date:*Tue Aug 18 03:52:44 2020
> *Recipients:*Dawid Wysakowicz <dwysakow...@apache.org>
> *CC:*dev <d...@flink.apache.org>, user <user@flink.apache.org>
> *Subject:*Re: [DISCUSS] Removing deprecated methods from DataStream API
>
>> +1 for removing them.
>>
>>
>>
>> From a quick look, most of them (not all) have been deprecated a long
>> time ago.
>>
>>
>>
>> Cheers,
>>
>> Kostas
>>
>>
>>
>> On Mon, Aug 17, 2020 at 9:37 PM Dawid Wysakowicz wrote:
>>
>> >
>>
>> > @David Yes, my idea was to remove any use of fold method and all
>> related classes including WindowedStream#fold
>>
>> >
>>
>> > @Klou Good idea to also remove the deprecated enableCheckpointing() &
>> StreamExecutionEnvironment#readFile and alike. I did another pass over some
>> of the classes and thought we could also drop:
>>
>> >
>>
>> > ExecutionConfig#set/getCodeAnalysisMode
>>
>> > ExecutionConfig#disable/enableSysoutLogging
>>
>> > ExecutionConfig#set/isFailTaskOnCheckpointError
>>
>> > ExecutionConfig#isLatencyTrackingEnabled
>>
>> >
>>
>> > As for the `forceCheckpointing` I am not fully convinced to doing it.
>> As far as I know iterations still do not participate in checkpointing
>> correctly. Therefore it still might make sense to force it. In other words
>> there is no real alternative to that method. Unless we only remove the
>> methods from StreamExecutionEnvironment and redirect to the setter in
>> CheckpointConfig. WDYT?
>>
>> >
>>
>> > An updated list of methods I suggest to remove:
>>
>> >
>>
>> > ExecutionConfig#set/getCodeAnalysisMode (deprecated in 1.9)
>>
>> > ExecutionConfig#disable/enableSysoutLogging (deprecated in 1.10)
>>
>> > ExecutionConfig#set/isFailTaskOnCheckpointError (deprecated in 1.9)
>>
>> > ExecutionConfig#isLatencyTrackingEnabled (deprecated in 1.7)
>>
>> > ExecutionConfig#(get)/setNumberOfExecutionRetries() (deprecated in 1.1?)
>>
>> >
>> StreamExecutionEnvironment#readFile,readFileStream(...),socketTextStream(...),socketTextStream(...)
>> (deprecated in 1.2)
>>
>> > RuntimeContext#getAllAccumulators (deprecated in 0.10)
>>
>> > DataStream#fold and all related classes and methods such as
>> FoldFunction, FoldingState, FoldingStateDescriptor ... (deprecated in
>> 1.3/1.4)
>>
>> > StreamExecutionEnvironment#setStateBackend(AbstractStateBackend)
>> (deprecated in 1.5)
>>
>> > DataStream#split (deprecated in 1.8)
>>
>> > Methods in (Connected)DataStream that specify keys as either indices or
>> field names such as DataStream#keyBy, DataStream#partitionCustom,
>> ConnectedStream#keyBy, .... (deprecated in 1.11)
>>
>> >
>>
>> > Bear in mind that majority of the options listed above in
>> ExecutionConfig take no effect. They were left there purely to satisfy the
>> binary compatibility. Personally I don't see any benefit of leaving a
>> method and silently dropping the underlying feature. The only configuration
>> that is respected is setting the number of execution retries.
>>
>> >
>>
>> > I also wanted to make it explicit that most of the changes above would
>> result in a binary incompatibility and require additional exclusions in the
>> japicmp. Those are:
>>
>> >
>>
>> > ExecutionConfig#set/getCodeAnalysisMode (deprecated in 1.9)
>>
>> > ExecutionConfig#disable/enableSysoutLogging (deprecated in 1.10)
>>
>> > ExecutionConfig#set/isFailTaskOnCheckpointError (deprecated in 1.9)
>>
>> > ExecutionConfig#isLatencyTrackingEnabled (deprecated in 1.7)
>>
>> > ExecutionConfig#(get)/setNumberOfExecutionRetries() (deprecated in 1.1?)
>>
>> > DataStream#fold and all related classes and methods such as
>> FoldFunction, FoldingState, FoldingStateDescriptor ... (deprecated in
>> 1.3/1.4)
>>
>> > DataStream#split (deprecated in 1.8)
>>
>> > Methods in (Connected)DataStream that specify keys as either indices or
>> field names such as DataStream#keyBy, DataStream#partitionCustom,
>> ConnectedStream#keyBy, .... (deprecated in 1.11)
>>
>> >
>> StreamExecutionEnvironment#readFile,readFileStream(...),socketTextStream(...),socketTextStream(...)
>> (deprecated in 1.2)
>>
>> >
>>
>> > Looking forward to more opinions on the issue.
>>
>> >
>>
>> > Best,
>>
>> >
>>
>> > Dawid
>>
>> >
>>
>> >
>>
>> > On 17/08/2020 12:49, Kostas Kloudas wrote:
>>
>> >
>>
>> > Thanks a lot for starting this Dawid,
>>
>> >
>>
>> > Big +1 for the proposed clean-up, and I would also add the deprecated
>>
>> > methods of the StreamExecutionEnvironment like:
>>
>> >
>>
>> > enableCheckpointing(long interval, CheckpointingMode mode, boolean
>> force)
>>
>> > enableCheckpointing()
>>
>> > isForceCheckpointing()
>>
>> >
>>
>> > readFile(FileInputFormat inputFormat,String
>>
>> > filePath,FileProcessingMode watchType,long interval, FilePathFilter
>>
>> > filter)
>>
>> > readFileStream(...)
>>
>> >
>>
>> > socketTextStream(String hostname, int port, char delimiter, long
>> maxRetry)
>>
>> > socketTextStream(String hostname, int port, char delimiter)
>>
>> >
>>
>> > There are more, like the (get)/setNumberOfExecutionRetries() that were
>>
>> > deprecated long ago, but I have not investigated to see if they are
>>
>> > actually easy to remove.
>>
>> >
>>
>> > Cheers,
>>
>> > Kostas
>>
>> >
>>
>> > On Mon, Aug 17, 2020 at 10:53 AM Dawid Wysakowicz
>>
>> > wrote:
>>
>> >
>>
>> > Hi devs and users,
>>
>> >
>>
>> > I wanted to ask you what do you think about removing some of the
>> deprecated APIs around the DataStream API.
>>
>> >
>>
>> > The APIs I have in mind are:
>>
>> >
>>
>> > RuntimeContext#getAllAccumulators (deprecated in 0.10)
>>
>> > DataStream#fold and all related classes and methods such as
>> FoldFunction, FoldingState, FoldingStateDescriptor ... (deprecated in
>> 1.3/1.4)
>>
>> > StreamExecutionEnvironment#setStateBackend(AbstractStateBackend)
>> (deprecated in 1.5)
>>
>> > DataStream#split (deprecated in 1.8)
>>
>> > Methods in (Connected)DataStream that specify keys as either indices or
>> field names such as DataStream#keyBy, DataStream#partitionCustom,
>> ConnectedStream#keyBy, .... (deprecated in 1.11)
>>
>> >
>>
>> > I think the first three should be straightforward. They are long
>> deprecated. The getAccumulators method is not used very often in my
>> opinion. The same applies to the DataStream#fold which additionally is not
>> very performant. Lastly the setStateBackend has an alternative with a class
>> from the AbstractStateBackend hierarchy, therefore it will be still code
>> compatible. Moreover if we remove the
>> #setStateBackend(AbstractStateBackend) we will get rid off warnings users
>> have right now when setting a statebackend as the correct method cannot be
>> used without an explicit casting.
>>
>> >
>>
>> > As for the DataStream#split I know there were some objections against
>> removing the #split method in the past. I still believe the output tags can
>> replace the split method already.
>>
>> >
>>
>> > The only problem in the last set of methods I propose to remove is that
>> they were deprecated only in the last release and those method were only
>> partially deprecated. Moreover some of the methods were not deprecated in
>> ConnectedStreams. Nevertheless I'd still be inclined to remove the methods
>> in this release.
>>
>> >
>>
>> > Let me know what do you think about it.
>>
>> >
>>
>> > Best,
>>
>> >
>>
>> > Dawid
>
>

Reply via email to