[ https://issues.apache.org/jira/browse/CRUNCH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Whiting updated CRUNCH-585:
---------------------------------
Attachment: RC1-0001-Java-8-lambda-support-for-Apache-Crunch.patch
Ok, here comes an update, and I think this one is pretty much complete. I've
been using it in a real project for a little while and it seems to be working
well. Improvements on the previous versions:
1) Javadocs for everything
2) MemPipeline unit tests for all non-trivial operations
3) Improved naming
4) Build conditional on JDK 8
5) One or two small bug fixes (as a result of testing :-))
I now consider this an "RC" for merging into master, but it's a big patch so
it'd be great to get another pair of eyes over it first.
> Move Java 8 lambda support into separate module
> ------------------------------------------------
>
> Key: CRUNCH-585
> URL: https://issues.apache.org/jira/browse/CRUNCH-585
> Project: Crunch
> Issue Type: Improvement
> Reporter: David Whiting
> Fix For: 0.14.0
>
> Attachments: 0001-Java-8-lambda-support-for-Apache-Crunch.patch,
> 0001-Java-8-lambda-support-for-Apache-Crunch.patch,
> RC1-0001-Java-8-lambda-support-for-Apache-Crunch.patch
>
>
> As discussed on a previous dev list thread, this patch implements a set of
> operations to conveniently use Java 8 lambda expressions and method
> references to construct Crunch pipelines by wrapping the PCollection
> instances into analogous "LCollection" instances which delegate the necessary
> operations, in much the same way that Scrunch wraps the Crunch Core API.
> I'm still not 100% convinced that this is better for the user than the
> existing lambda support via IMapFn and IDoFn PCollection operations, so I'm
> still interested in people's views on this.
> Advantages:
> - Concise self-contained implementation
> - Methods implemented in terms of a very basic subset of PCollection
> operations (useful if we want to scale down the PCollection API at some point)
> - API can be written in terms of the Java 8 library, operating on streams and
> functional interfaces, making it more familiar to a new developer.
> - Retain "type '.' and see what I can do" experience.
> - Really easy to add new operations (just default method on interface)
> Disadvantages:
> - PCollections must be wrapped into LCollections before use.
> - LCollections must be unwrapped into PCollections to access some existing
> operations.
> - Using counters and other contextual data is far more complex.
> Some limitations of this particular patch:
> - Some omissions in API (not sure how much to implement)
> - No Javadocs yet.
> - Very poor tests.
> - Naming is a bit off (e.g. reduce() or reduceValues(), get() or underlying())
> I can fix all that, but I wanted to bring the community in at this point to
> get some feedback on both the idea and the implementation as it's quite a big
> patch.
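
To make the wrapping idea in the description above concrete, here is a
minimal, self-contained Java 8 sketch. It does not use the Crunch API or any
code from the patch: BasicCollection, Wrapped, expand() and wrap() are
hypothetical stand-ins (for a PCollection, an LCollection and their plumbing),
and only underlying() borrows a name floated in the description. It is meant
to show how lambda-friendly operations can be written as default methods on an
interface, all built on one basic operation of the wrapped collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Stream;

public class LambdaWrapperSketch {

    /** Hypothetical stand-in for a PCollection, backed by an in-memory list. */
    static final class BasicCollection<T> {
        final List<T> items;

        BasicCollection(List<T> items) {
            this.items = items;
        }

        /** The single basic operation: each element yields zero or more outputs. */
        <R> BasicCollection<R> expand(Function<T, Stream<R>> fn) {
            List<R> out = new ArrayList<>();
            for (T item : items) {
                fn.apply(item).forEach(out::add);
            }
            return new BasicCollection<>(out);
        }
    }

    /** Hypothetical stand-in for an LCollection: a thin, lambda-friendly wrapper. */
    interface Wrapped<T> {
        /** Unwrap, to reach operations the wrapper does not expose. */
        BasicCollection<T> underlying();

        /** Wrap an existing collection; the lambda implements underlying(). */
        static <T> Wrapped<T> wrap(BasicCollection<T> collection) {
            return () -> collection;
        }

        /** Delegates directly to the basic operation of the wrapped collection. */
        default <R> Wrapped<R> flatMap(Function<T, Stream<R>> fn) {
            return wrap(underlying().expand(fn));
        }

        /** Adding an operation is just another default method on the interface. */
        default <R> Wrapped<R> map(Function<T, R> fn) {
            return flatMap(t -> Stream.of(fn.apply(t)));
        }

        default Wrapped<T> filter(Predicate<T> keep) {
            return flatMap(t -> keep.test(t) ? Stream.of(t) : Stream.empty());
        }
    }

    /** Wrap, chain lambdas and method references, then unwrap again. */
    static List<Integer> demo() {
        BasicCollection<String> words =
                new BasicCollection<>(Arrays.asList("a", "bb", "ccc"));
        return Wrapped.wrap(words)
                .map(String::length)
                .filter(n -> n > 1)
                .underlying()
                .items;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [2, 3]
    }
}
```

The sketch also shows the trade-off listed under "Disadvantages": a collection
has to be wrapped before the lambda operations are available, and unwrapped
again to reach anything the wrapper does not expose.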