Github user mateiz commented on the pull request:
https://github.com/apache/incubator-spark/pull/539#issuecomment-35218879
BTW, I've looked into this myself, and created a short project at
https://github.com/mateiz/java8-test to show how an RDD-like API might work in
Java. To make a long story short:
* It looks like `map()` overloads that differ only in the return type of the
lambda don't work (e.g. if you have `map(Function<T, U>)` as well as
`map(PairFunction<T, K, V>)`).
* The solution for this in Java 8's own
[Stream](http://download.java.net/jdk8/docs/api/java/util/stream/Stream.html)
API is to have separate functions for mapping to "specialized" types, like
`mapToInt` and `mapToLong`. (Apparently at some point an overloaded `map` also
worked, see
[this](https://www.surveymonkey.com/sr.aspx?sm=9UyN8RRvMX8BnpTdd4rYgDlXU9uUVALNDjNn_2fY2e9_2fo_3d),
but I'm not sure we can make it happen again.)
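
To make the two points above concrete, here is a minimal sketch of the separately named methods. Everything in it (`MiniRDD`, `Fn`, `PairFn`, `DoubleFn`, `Pair`) is a hypothetical stand-in written for illustration, not the actual Spark Java API or the code in the java8-test project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical functional interfaces standing in for Function<T, U>,
// PairFunction<T, K, V> and a double-valued variant. Not Spark's real classes.
interface Fn<T, R> { R call(T t); }
interface PairFn<T, K, V> { Pair<K, V> call(T t); }
interface DoubleFn<T> { double call(T t); }

// Minimal immutable pair, for illustration only.
final class Pair<K, V> {
    final K key;
    final V value;
    Pair(K key, V value) { this.key = key; this.value = value; }
}

// Toy in-memory collection with an RDD-like surface (hypothetical).
final class MiniRDD<T> {
    private final List<T> items;

    MiniRDD(List<T> items) { this.items = new ArrayList<>(items); }

    <R> MiniRDD<R> map(Fn<T, R> f) {
        List<R> out = new ArrayList<>();
        for (T item : items) out.add(f.call(item));
        return new MiniRDD<>(out);
    }

    // A distinct name instead of a second map(PairFn) overload, so a bare
    // lambda argument always has exactly one candidate method.
    <K, V> MiniRDD<Pair<K, V>> mapToPair(PairFn<T, K, V> f) {
        List<Pair<K, V>> out = new ArrayList<>();
        for (T item : items) out.add(f.call(item));
        return new MiniRDD<>(out);
    }

    // Same idea for a primitive-specialized result.
    double[] mapToDouble(DoubleFn<T> f) {
        double[] out = new double[items.size()];
        for (int i = 0; i < out.length; i++) out[i] = f.call(items.get(i));
        return out;
    }

    List<T> collect() { return new ArrayList<>(items); }
}

class MapToPairSketch {
    public static void main(String[] args) {
        MiniRDD<String> words = new MiniRDD<>(Arrays.asList("a", "bb", "ccc"));

        List<Integer> lengths = words.map(w -> w.length()).collect();
        List<Pair<String, Integer>> pairs =
            words.mapToPair(w -> new Pair<String, Integer>(w, w.length())).collect();
        double[] halves = words.mapToDouble(w -> w.length() * 0.5);

        System.out.println(lengths);
        System.out.println(pairs.get(1).key + " -> " + pairs.get(1).value);
        System.out.println(Arrays.toString(halves));
    }
}
```

With this shape, callers pass plain lambdas everywhere and never need a wrapper such as `Function.of`; the cost is one extra method name per specialized result type.
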
Given this, I'd like us to simply add `mapToPair` and `mapToDouble`, and the
same for `flatMap` and such. If we do that, we can keep our API functions
accepting lambda expressions throughout. I'm very much against requiring the
user to write `Function.of` -- whatever that does, we should be able to do it
afterward on the raw lambda expression in order to convert it to a Scala
Function object.
Personally I'm okay if this breaks the current Java API slightly, though it
may also be possible to do it in a backward-compatible way (e.g. keep our
current PairFunction, which is an abstract class so it can't be passed through
as a lambda expression anyway, and add a new one that is an interface). But
let's figure that out after we have a basic version working with the "new" API
we want.
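
One way the backward-compatible option could look is sketched below. The names `OldPairFunction`, `PairFunctionLike` and `PairMapper` are hypothetical, and having the old abstract class implement the new interface is an assumption of this sketch, not something decided in this thread. `Map.Entry` is used as the pair type only to keep the example self-contained.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical new interface: one abstract method, so a lambda can target it.
interface PairFunctionLike<T, K, V> {
    Map.Entry<K, V> call(T t);
}

// Stand-in for the existing abstract class. A lambda cannot be converted to an
// abstract class, only to a functional interface. Implementing the new
// interface here (an assumption) lets existing subclasses keep working.
abstract class OldPairFunction<T, K, V> implements PairFunctionLike<T, K, V> {
}

final class PairMapper {
    // Accepts the interface, so both old subclasses and lambdas are valid.
    static <T, K, V> List<Map.Entry<K, V>> mapToPair(
            List<T> input, PairFunctionLike<T, K, V> f) {
        List<Map.Entry<K, V>> out = new ArrayList<>();
        for (T item : input) out.add(f.call(item));
        return out;
    }
}

class BackwardCompatSketch {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb");

        // Old style: anonymous subclass of the abstract class.
        OldPairFunction<String, String, Integer> old =
            new OldPairFunction<String, String, Integer>() {
                @Override
                public Map.Entry<String, Integer> call(String s) {
                    return new SimpleEntry<String, Integer>(s, s.length());
                }
            };
        System.out.println(PairMapper.mapToPair(words, old));

        // New style: a plain lambda against the interface.
        System.out.println(PairMapper.mapToPair(
            words, s -> new SimpleEntry<String, Integer>(s, s.length())));
    }
}
```

Both calls go through the same `mapToPair` method, so existing user code would keep compiling while new code can use lambdas.
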