Github user mateiz commented on the pull request:
https://github.com/apache/incubator-spark/pull/539#issuecomment-35439917
Hey Prashant, I've looked at this more, and one final thing I'd like to do
is to see whether we can reduce the number of methods and classes people have
to deal with. In particular, the current code has a version of each method
for both Function and IFunction. Ideally, we'd like something like the
following:
* Users on Java 6/7 still write code the way they used to, e.g.
`rdd.map(new Function() { ... })`.
* The Function, PairFunction, etc classes are now interfaces with just a
call() method. We wrap them into Scala function objects only later, using
classes that are not visible in the public API (e.g. we have a `private[spark]
class FunctionWrapper` that takes a `Function` and also extends
`scala.Function1`).
* This means that Java 8 users use the same methods as 6/7 ones, but can
pass in a lambda.
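As a rough sketch of the interface-plus-hidden-wrapper idea (names like `SparkFunction` and `FunctionWrapper` are illustrative, and `java.util.function.Function` stands in for `scala.Function1` so the example runs without the Scala library on the classpath):

```java
public class WrapperDemo {
    // Public API: a bare single-abstract-method interface, so both
    // anonymous inner classes (Java 6/7) and lambdas (Java 8) work.
    interface SparkFunction<T, R> {
        R call(T t);
    }

    // Internal adapter (would be a private[spark] class in Scala):
    // turns the user's SparkFunction into the function type the core
    // engine consumes, keeping the wrapper out of the public API.
    static class FunctionWrapper<T, R> implements java.util.function.Function<T, R> {
        private final SparkFunction<T, R> f;

        FunctionWrapper(SparkFunction<T, R> f) { this.f = f; }

        public R apply(T t) { return f.call(t); }
    }

    public static void main(String[] args) {
        // Java 6/7 style: anonymous inner class, unchanged from before.
        SparkFunction<Integer, Integer> doubler = new SparkFunction<Integer, Integer>() {
            public Integer call(Integer x) { return x * 2; }
        };
        // Java 8 style: same interface, same method, now a lambda.
        SparkFunction<Integer, Integer> tripler = x -> x * 3;

        System.out.println(new FunctionWrapper<>(doubler).apply(21)); // 42
        System.out.println(new FunctionWrapper<>(tripler).apply(7));  // 21
    }
}
```

The key point is that the wrapping happens on our side, inside the method implementations, so users never see the wrapper class.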
The only problem I see with this is that the old API had methods overloaded
on the *type* of the function passed, e.g. `map` could take both a
`Function` and a `PairFunction`. That won't work with lambdas, because when
both overloads accept a single-method interface with the same call signature,
the compiler can't tell which one a lambda is meant to target. So my
suggestion here is to slightly *break* the old API, so that users who want to
pass a `PairFunction` have to use `mapToPair`. It's a
bit unfortunate that we have to do this, but the good thing is that it
immediately creates a compile error and we can tell people how to recompile.
If you find a way that still meets the 3 goals above and doesn't break the
API, that's even better, but I'd go for this one right now.
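To illustrate why the separate names resolve the ambiguity, here is a minimal sketch (not Spark's real classes; `MiniRDD` and the use of `Map.Entry` in place of `Tuple2` are stand-ins for illustration). With distinct method names, each lambda has exactly one target interface:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MapToPairDemo {
    interface Function<T, R> { R call(T t); }
    interface PairFunction<T, K, V> { Map.Entry<K, V> call(T t); }

    // Toy stand-in for an RDD; distinct method names mean the compiler
    // never has to choose between overloads when given a bare lambda.
    static class MiniRDD<T> {
        final List<T> data;
        MiniRDD(List<T> data) { this.data = data; }

        <R> MiniRDD<R> map(Function<T, R> f) {
            return new MiniRDD<>(data.stream().map(f::call).collect(Collectors.toList()));
        }

        <K, V> MiniRDD<Map.Entry<K, V>> mapToPair(PairFunction<T, K, V> f) {
            return new MiniRDD<>(data.stream().map(f::call).collect(Collectors.toList()));
        }
    }

    public static void main(String[] args) {
        MiniRDD<String> rdd = new MiniRDD<>(List.of("a", "bb"));

        // Each lambda unambiguously targets the one interface the method takes.
        MiniRDD<Integer> lens = rdd.map(s -> s.length());
        MiniRDD<Map.Entry<String, Integer>> pairs =
            rdd.mapToPair(s -> new SimpleEntry<>(s, s.length()));

        System.out.println(lens.data);  // [1, 2]
        System.out.println(pairs.data); // [a=1, bb=2]
    }
}
```

Had `map` been overloaded to also take a `PairFunction`, the call `rdd.map(s -> ...)` would not compile: both overloads accept a single-method interface applicable to the lambda, so the compiler reports an ambiguity.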