About Deprecating split/select for DataStream API

2019-06-15 Thread Xingcan Cui
Hi all, Recently, I noticed that the split/select methods in DataStream API have been marked as deprecated since 1.7.2 and 1.8.0 (the related JIRA issue FLINK-11084 ). Although the two methods can be replaced by the more powerful side output f

Re: About Deprecating split/select for DataStream API

2019-06-16 Thread SHI Xiaogang
Hi Xingcan, Thanks for bringing it up for discussion. I agree with you that we should not deprecate the split/select methods. Their semantics are very clear and they are widely adopted by Flink users. We should fix these problems instead of simply deprecating the methods. Regards, Xiaogang Xingc

Re: About Deprecating split/select for DataStream API

2019-06-16 Thread vino yang
Hi, I also think it is valuable and reasonable to keep the split/select APIs. They are very convenient and widely used in our platform. I think they are also used in other users' jobs. If the community has doubts about this, IMHO, it would be better to start a user survey. Best, Vino SHI Xiaogan

Re: About Deprecating split/select for DataStream API

2019-06-16 Thread Jark Wu
+1 to keep the split/select API. I think if there are some problems with the API, it's better to fix them instead of deprecating them. And select/split are straightforward and convenient APIs. It's worth having them. Regards, Jark On Mon, 17 Jun 2019 at 14:46, vino yang wrote: > Hi, > > I also

Re: About Deprecating split/select for DataStream API

2019-06-17 Thread Dawid Wysakowicz
Hi all, Thank you for starting the discussion. To start with I have to say I am not entirely against leaving them. On the other hand I totally disagree that the semantics are clearly defined. Actually the design is fundamentally flawed. 1. We use String as a selector for elements. This is not th

Re: About Deprecating split/select for DataStream API

2019-06-17 Thread SHI Xiaogang
Hi Dawid, As the select method is only allowed on SplitStreams, it's impossible to construct the example ds.split().select("a", "b").select("c", "d"). Do you mean ds.split().select("a", "b").split().select("c", "d")? If so, then the tagging in the first split operation should not affect the s
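
For readers following the chained example above, here is a minimal in-memory sketch of the semantics being argued for: each split tags only the elements it receives, and a later split computes its tags afresh. This is a toy model in plain Java, not the Flink API; the class SplitSelectSketch, the nested Split type, and the split/select helpers are hypothetical names introduced only for illustration, and the sketch makes no claim about how Flink actually implements these operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Toy in-memory model of split/select. NOT the Flink API; all names are hypothetical. */
public class SplitSelectSketch {

    /** Result of a split: remembers the input and the function that tags each element. */
    public static final class Split<T> {
        private final List<T> input;
        private final Function<T, List<String>> selector;

        Split(List<T> input, Function<T, List<String>> selector) {
            this.input = input;
            this.selector = selector;
        }

        /** Keeps each element that carries at least one of the requested tags. */
        public List<T> select(String... names) {
            Set<String> wanted = new HashSet<>(Arrays.asList(names));
            List<T> out = new ArrayList<>();
            for (T element : input) {
                for (String tag : selector.apply(element)) {
                    if (wanted.contains(tag)) {
                        out.add(element);
                        break; // emit once even if several tags match
                    }
                }
            }
            return out;
        }
    }

    /** Tags every element of the input with the names returned by the selector. */
    public static <T> Split<T> split(List<T> input, Function<T, List<String>> selector) {
        return new Split<>(input, selector);
    }

    public static void main(String[] args) {
        List<Integer> ds = Arrays.asList(1, 2, 3, 4, 5, 6);

        // First stage: tag by parity and keep the "a" (even) elements.
        List<Integer> first =
                split(ds, x -> Arrays.asList(x % 2 == 0 ? "a" : "b")).select("a");

        // Second stage: tags are computed afresh on the first stage's output only.
        List<Integer> second =
                split(first, x -> Arrays.asList(x > 3 ? "c" : "d")).select("c");

        System.out.println(first);  // prints [2, 4, 6]
        System.out.println(second); // prints [4, 6]
    }
}
```

In this model the tags "a" and "b" from the first stage are simply gone by the time the second stage runs, which is the behaviour the message above describes as expected for ds.split().select(...).split().select(...).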

Re: About Deprecating split/select for DataStream API

2019-06-17 Thread Dawid Wysakowicz
Yes you are correct. The problem I described applies to the split not select as I wrote in the first email. Sorry for that. I will try to prepare a correct example. Let's have a look at this example:     val splitted1 = ds.split(if (1) then "a")     val splitted2 = ds.split(if (!=1) then "a") I

Re: About Deprecating split/select for DataStream API

2019-06-17 Thread SHI Xiaogang
Hi Dawid, Thanks a lot for your example. I think most users will expect splitted1 to be empty in the example. The unexpected results, in my opinion, are due to our problematic implementation rather than to confusing semantics. We can fix the problem if we add a SELECT operator to filter

RE: About Deprecating split/select for DataStream API

2019-06-17 Thread xingcanc
t based on side output? (like the implementation for join on coGroup) Any feedback is welcome : ) Best, Xingcan -Original Message- From: SHI Xiaogang Sent: Monday, June 17, 2019 8:08 AM To: Dawid Wysakowicz Cc: dev@flink.apache.org Subject: Re: About Deprecating split/select for Dat

Re: About Deprecating split/select for DataStream API

2019-06-17 Thread Dian Fu
blic > API. If we come to a consensus on that, how about rewriting it based on side > output? (like the implementation for join on coGroup) > > Any feedback is welcome : ) > > Best, > Xingcan > > -Original Message- > From: SHI Xiaogang > Sent: Monday,

Re: About Deprecating split/select for DataStream API

2019-06-17 Thread Dawid Wysakowicz
> output? (like the implementation for join on coGroup) >> >> Any feedback is welcome : ) >> >> Best, >> Xingcan >> >> -Original Message- >> From: SHI Xiaogang >> Sent: Monday, June 17, 2019 8:08 AM >> To: Dawid Wysakowicz >> Cc: de

Re: About Deprecating split/select for DataStream API

2019-06-18 Thread Xingcan Cui
union does not support different >>> data types either. >>> 3. We need a complete and easy-to-use transformation set for DataStream >>> API. Enabling side output for flatMap may not be an ultimate solution. >>> >>> To summarize, maybe we should not e

Re: About Deprecating split/select for DataStream API

2019-06-18 Thread SHI Xiaogang
> > >>> 1. The split/select may have been widely used without touching the > broken part. > >>> 2. Though restricted compared with side output, the semantics for > split/select itself is acceptable since union does not support different > data types either. > >>> 3.