[
https://issues.apache.org/jira/browse/DATAFU-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133366#comment-15133366
]
Matthew Hayes commented on DATAFU-114:
--------------------------------------
Sorry for the late response. The change looks reasonable to me. It should have
a test, though (understandably there isn't one, since you couldn't build it),
so I went ahead and wrote the one below. If this test looks reasonable to you,
I'll commit both pieces of code. I'm also taking a look at DATAFU-95.
{code}
@Test
public void firstTupleFromBagAccumulateTest() throws Exception
{
  TupleFactory tf = TupleFactory.getInstance();
  BagFactory bf = BagFactory.getInstance();
  FirstTupleFromBag op = new FirstTupleFromBag();

  // Note: with an int literal, tf.newTuple(n) resolves to newTuple(int size),
  // which allocates a tuple of n null fields, so the tuples below are told
  // apart by their size.
  Tuple defaultValue = tf.newTuple(1000);

  // Only the first tuple seen across all accumulate() calls is returned.
  op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(4))),
      defaultValue)));
  op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(9))),
      defaultValue)));
  op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(16))),
      defaultValue)));
  assertEquals(op.getValue(), tf.newTuple(4));
  op.cleanup();

  // After cleanup() the state is reset, so a new first tuple is picked up.
  op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(11))),
      defaultValue)));
  op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(17))),
      defaultValue)));
  op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(Arrays.asList(tf.newTuple(5))),
      defaultValue)));
  assertEquals(op.getValue(), tf.newTuple(11));
  op.cleanup();

  // An empty bag falls back to the default value.
  op.accumulate(tf.newTuple(Arrays.asList(bf.newDefaultBag(), defaultValue)));
  assertEquals(op.getValue(), defaultValue);
  op.cleanup();
}
{code}
> Make FirstTupleFromBag implement Accumulator
> --------------------------------------------
>
> Key: DATAFU-114
> URL: https://issues.apache.org/jira/browse/DATAFU-114
> Project: DataFu
> Issue Type: Improvement
> Affects Versions: 1.3.0
> Environment: All
> Reporter: Eyal Allweil
> Priority: Minor
> Labels: easyfix, newbie, performance
> Attachments: FirstTupleFromBag.java
>
>
> FirstTupleFromBag only needs the first tuple from the bag, but because it
> doesn't implement Accumulator the entire bag needs to be passed to it
> in-memory. The fix is very minor and will make the UDF support large bags.
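
To illustrate the idea in the description above, here is a minimal,
dependency-free sketch of why an accumulator-style contract avoids holding the
whole bag in memory. It is not the attached FirstTupleFromBag.java and does not
use Pig's classes: `BatchAccumulator` and `FirstItemAccumulator` are
hypothetical names, bags are modeled as plain `List` batches, and the default
value is passed to the constructor rather than with each input, purely to keep
the example short.

```java
import java.util.List;

// Hypothetical stand-in for an accumulator-style UDF contract: input arrives
// in batches, so the whole bag never has to be held in memory at once.
interface BatchAccumulator<T> {
    void accumulate(List<T> batch);
    T getValue();
    void cleanup();
}

// Keeps only the first element seen; every later batch is ignored.
final class FirstItemAccumulator<T> implements BatchAccumulator<T> {
    private final T defaultValue; // returned when no element was ever seen
    private T first;
    private boolean found;        // separate flag so a null element still counts

    FirstItemAccumulator(T defaultValue) {
        this.defaultValue = defaultValue;
    }

    @Override
    public void accumulate(List<T> batch) {
        if (!found && !batch.isEmpty()) {
            first = batch.get(0);
            found = true;
        }
    }

    @Override
    public T getValue() {
        return found ? first : defaultValue;
    }

    @Override
    public void cleanup() {
        // Reset state so the same instance can be reused for the next group.
        first = null;
        found = false;
    }
}
```

The state kept between calls is a single element and a flag, so memory use
does not grow with the size of the bag, which is the property the description
is after.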