Re: Starting to make changes for Spark 3 -- what can we delete?

2018-10-17 Thread DB Tsai
I'll +1 on removing the legacy mllib code. Many users are confused about the APIs, and some of them have weird behaviors (for example, in gradient descent, the intercept is regularized, which it is not supposed to be). DB Tsai | Siri Open Source Technologies [not a contribution] | Apple, Inc
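
To make the intercept point concrete, here is a minimal plain-Python sketch of one gradient-descent step for L2-regularized least squares. It is not the mllib implementation; the function name, parameters, and the `regularize_intercept` flag are hypothetical and exist only to contrast the two behaviors.

```python
def l2_gradient_step(weights, intercept, xs, ys, step_size, reg_param,
                     regularize_intercept=False):
    """One batch gradient-descent step for L2-regularized least squares.

    Illustrative only: all names here are assumptions, not Spark APIs.
    """
    n = len(xs)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in zip(xs, ys):
        # Prediction error for this example.
        err = sum(w * xi for w, xi in zip(weights, x)) + intercept - y
        for j, xi in enumerate(x):
            grad_w[j] += err * xi / n
        grad_b += err / n

    # The L2 penalty always applies to the feature weights.
    grad_w = [g + reg_param * w for g, w in zip(grad_w, weights)]

    # Penalizing the intercept as well pulls it toward zero even when the
    # data are fit perfectly; this is the behavior described as surprising.
    if regularize_intercept:
        grad_b += reg_param * intercept

    new_weights = [w - step_size * g for w, g in zip(weights, grad_w)]
    new_intercept = intercept - step_size * grad_b
    return new_weights, new_intercept
```

With a constant target of 5.0 and an intercept already at 5.0, the step leaves the intercept unchanged when it is excluded from the penalty, and shrinks it toward zero when it is included.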

Re: Starting to make changes for Spark 3 -- what can we delete?

2018-10-17 Thread Erik Erlandson
My understanding was that the legacy mllib api was frozen, with all new dev going to ML, but it was not going to be removed. Although removing it would get rid of a lot of `OldXxx` shims. On Wed, Oct 17, 2018 at 12:55 AM Marco Gaido wrote: > Hi all, > > I think a very big topic on this would

Re: Starting to make changes for Spark 3 -- what can we delete?

2018-10-17 Thread Marco Gaido
Hi all, I think a very big topic on this would be: what do we want to do with the old mllib API? For a long time I have been told that it was going to be removed in 3.0. Is this still the plan? Thanks, Marco On Wed, Oct 17, 2018 at 03:11, Marcelo Vanzin wrote: > Might be good to take

Re: Starting to make changes for Spark 3 -- what can we delete?

2018-10-16 Thread Marcelo Vanzin
Might be good to take a look at things marked "@DeveloperApi" and whether they should stay that way. e.g. I was looking at SparkHadoopUtil and I've always wanted to just make it private to Spark. I don't see why apps would need any of those methods. On Tue, Oct 16, 2018 at 10:18 AM Sean Owen

Starting to make changes for Spark 3 -- what can we delete?

2018-10-16 Thread Sean Owen
There was already agreement to delete deprecated things like Flume and Kafka 0.8 support in master. I've got several more on my radar, and wanted to highlight them and solicit general opinions on where we should accept breaking changes. For example, how about removing accumulator v1?