[ https://issues.apache.org/jira/browse/SPARK-27683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16837967#comment-16837967 ]
Sean Owen commented on SPARK-27683:
-----------------------------------

There are many usages of TraversableOnce, though many are in internal packages and classes, which aren't so urgent to address. The concern is public APIs, and the main one is flatMap / flatMapValues / flatMapGroups. These accept a function that returns a TraversableOnce. That's nice, as TraversableOnce is a supertype of both Iterable and Iterator, so one can return either in a flatMap.

In Scala 2.13, IterableOnce will play that role, but it isn't available in 2.12. This makes it hard to create an API method that works in both, and I think it cuts off the possibility of deprecating the current method while adding the new one.

IterableOnce will have basically two subclasses, Iterable and Iterator. These exist now. We could change flatMap to support both of those now and deprecate the existing method. However, this won't compile, as there would be two methods with the same name and the same signature after erasure. Even if we drop the TraversableOnce version, it won't work for the same reason.

There's a scala-collection-compat library that attempts to bridge some of the differences between 2.12 and 2.13. It does provide some help with IterableOnce, but the compat class is in a different package (scala.collection.compat) than the final one, and is in any event just a type alias for TraversableOnce. It doesn't seem to help.

I considered adding a dummy implementation of IterableOnce to our source, extending TraversableOnce. However, this too won't help without defining implicit conversions from Iterable and Iterator to IterableOnce that users would have to import.

We could instead change the one flatMap method to accept an Iterator, or an Iterable. Either one makes some usages of flatMap stop working. Of the two, Iterator is probably the better choice. It's less restrictive on the caller, it's how the Java equivalent works now, and it's more consistent with what TraversableOnce means now.
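To make the Iterator-only option concrete, here is a minimal sketch that should compile on both 2.12 and 2.13. MiniRDD and FlatMapIteratorSketch are hypothetical names used only for illustration; this is not Spark's RDD or its actual flatMap signature, just an assumption about what the Iterator-only shape would look like from the caller's side.

```scala
// Hypothetical stand-in for an RDD-like class; NOT Spark's actual API.
final class MiniRDD[T](private val elems: Seq[T]) {

  // The single flatMap variant: the function must return an Iterator.
  def flatMap[U](f: T => Iterator[U]): MiniRDD[U] =
    new MiniRDD(elems.iterator.flatMap(f).toList)

  // Adding a second overload such as
  //   def flatMap[U](f: T => Iterable[U]): MiniRDD[U]
  // would not compile: both parameter types erase to Function1, so the
  // two methods would have the same name and signature after erasure.

  def collect(): List[T] = elems.toList
}

object FlatMapIteratorSketch {
  def main(args: Array[String]): Unit = {
    val lines = new MiniRDD(Seq("a b", "c"))

    // A function that already returns an Iterator works as-is.
    val viaIterator = lines.flatMap(line => line.split(" ").iterator)

    // A function that builds a collection needs an explicit .iterator.
    val viaList = lines.flatMap(line => line.split(" ").toList.iterator)

    println(viaIterator.collect())
    println(viaList.collect())
  }
}
```

The only change a caller sees in this sketch is the trailing ".iterator" on functions that return a collection.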
Requiring an Iterator would mean you can't flatMap to a collection directly, which is unfortunate; you'd have to add ".iterator".

Another option, of course, is to maintain separate source trees for 2.12 and 2.13 in the future. That's somewhat painful if it means maintaining two versions of PairRDDFunctions, RDD, DStream, etc. We may be able to break out just the part that varies into a separate class, though.

I'm interested in thoughts on whether it's better to go for separate source trees to minimize the change needed from callers, or whether requiring an Iterator is acceptable enough as a breaking change in 3.0. If we do require an Iterator, it has to happen in 3.0, and unfortunately I don't see a way to keep the existing method as deprecated while adding the new one.

> Remove usage of TraversableOnce
> -------------------------------
>
>                 Key: SPARK-27683
>                 URL: https://issues.apache.org/jira/browse/SPARK-27683
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, Spark Core, SQL, Structured Streaming
>    Affects Versions: 3.0.0
>            Reporter: Sean Owen
>            Assignee: Sean Owen
>            Priority: Major
>
> As with {{Traversable}}, {{TraversableOnce}} is going away in Scala 2.13. We
> should use {{IterableOnce}} instead. This one is a bigger change as there are
> more API methods with the existing signature.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)