[
https://issues.apache.org/jira/browse/PIG-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451647#comment-13451647
]
Dmitriy V. Ryaboy commented on PIG-2831:
----------------------------------------
Excellent, Prasanth! Any idea how much RAM that saves us?
Question: why did you create a TupleFieldFilter function instead of using the
more generic Guava Function interface? Was it because of exception handling? You
could just wrap the checked exceptions and rethrow them as RuntimeExceptions.
Also, I'm not sure why TupleFieldFilter is so specific (one field only). It
could be just about anything that takes a Tuple and returns a Tuple, right?
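
To illustrate the suggestion, here is a minimal sketch of a generic Tuple-to-Tuple
function that wraps a checked exception in a RuntimeException. It is not taken from
the patch. To stay dependency-free it uses java.util.function.Function as a stand-in
for Guava's Function (both expose a single apply method that cannot declare checked
exceptions), and SimpleTuple, FieldAccessException and project are hypothetical
names standing in for Pig's Tuple, its ExecException, and whatever the patch
actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class TupleFunctionSketch {

    /** Hypothetical checked exception, standing in for Pig's ExecException. */
    static class FieldAccessException extends Exception {
        FieldAccessException(String message) {
            super(message);
        }
    }

    /** Hypothetical minimal tuple, standing in for org.apache.pig.data.Tuple. */
    static class SimpleTuple {
        private final List<Object> fields;

        SimpleTuple(Object... values) {
            this.fields = new ArrayList<>(Arrays.asList(values));
        }

        int size() {
            return fields.size();
        }

        Object get(int index) throws FieldAccessException {
            if (index < 0 || index >= fields.size()) {
                throw new FieldAccessException("no field at index " + index);
            }
            return fields.get(index);
        }
    }

    /** Generic Tuple -> Tuple function that keeps any number of field positions. */
    static Function<SimpleTuple, SimpleTuple> project(final int... indexes) {
        return input -> {
            Object[] kept = new Object[indexes.length];
            try {
                for (int i = 0; i < indexes.length; i++) {
                    kept[i] = input.get(indexes[i]);
                }
            } catch (FieldAccessException e) {
                // apply() cannot declare checked exceptions, so wrap and rethrow.
                throw new RuntimeException(e);
            }
            return new SimpleTuple(kept);
        };
    }

    public static void main(String[] args) throws FieldAccessException {
        SimpleTuple row = new SimpleTuple("us", "ca", 42);
        SimpleTuple projected = project(0, 2).apply(row);
        System.out.println(projected.size() + " fields, first = " + projected.get(0));
    }
}
```

Because the function is typed only as Tuple in, Tuple out, the one-field case is
just project(i), and callers that care about the original failure can still read
it from getCause().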
> MR-Cube implementation (Distributed cubing for holistic measures)
> -----------------------------------------------------------------
>
> Key: PIG-2831
> URL: https://issues.apache.org/jira/browse/PIG-2831
> Project: Pig
> Issue Type: Sub-task
> Reporter: Prasanth J
> Assignee: Prasanth J
> Attachments: PIG-2831.1.git.patch, PIG-2831.2.git.patch,
> PIG-2831.3.git.patch, PIG-2831.4.git.patch, PIG-2831.5.git.patch,
> PIG-2831.6.git.patch, PIG-2831.7.git.patch
>
>
> Implement distributed cube materialization for holistic measures, based on the
> MR-Cube approach described in http://arnab.org/files/mrcube.pdf.
> Primary steps involved:
> 1) Identify whether the measure is holistic
> 2) Determine the algebraic attribute (it can be detected automatically in a
> few cases; if automatic detection fails, the user should hint the algebraic
> attribute)
> 3) Modify the MRPlan to insert a sampling job that executes the naive cube
> algorithm and generates the annotated cube lattice (which contains large-group
> partitioning information)
> 4) Modify the plan to distribute the annotated cube lattice to all mappers
> using the distributed cache
> 5) Execute the actual cube materialization on the full dataset
> 6) Modify the MRPlan to insert a post-processing job that combines the results
> of the actual cube materialization job
> 7) OOM exception handling
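
For readers unfamiliar with the cube lattice mentioned in step 3 of the quoted
description: a cube over n dimensions has one region (group-by combination) for
each subset of the dimensions, 2^n in total. The sketch below only enumerates
those regions. It is an illustration, not the patch's code; the class name, the
method name and the dimension names are hypothetical, and it does not model the
sampling or the large-group annotations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CubeLatticeSketch {

    /** Returns every subset of the given dimensions, one per lattice region. */
    static List<List<String>> regions(List<String> dimensions) {
        int n = dimensions.size();
        List<List<String>> result = new ArrayList<>();
        // Each bit of mask decides whether the matching dimension is grouped on.
        for (int mask = 0; mask < (1 << n); mask++) {
            List<String> region = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    region.add(dimensions.get(i));
                }
            }
            result.add(region);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical dimension names; the empty region is the grand total.
        for (List<String> region : regions(Arrays.asList("country", "state", "city"))) {
            System.out.println(region);
        }
    }
}
```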