[
https://issues.apache.org/jira/browse/PIG-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679791#comment-14679791
]
kexianda commented on PIG-4645:
-------------------------------
[~mohitsabharwal], thanks for your comments.
The built-in LongAccumulatorParam is defined as an implicit singleton object.
{code:title=Accumulators.scala|borderStyle=solid}
object AccumulatorParam {
  implicit object LongAccumulatorParam extends AccumulatorParam[Long] {
    def addInPlace(t1: Long, t2: Long) = t1 + t2
    def zero(initialValue: Long) = 0L
  }
  //...
}
// A user can write code like this in Scala:
val accLong = sc.accumulator(0L)(LongAccumulatorParam)
{code}
But Java has no exact equivalent of a Scala singleton object, so the same call does not carry over directly:
{code}
// sparkContext.sc().accumulable(0L, "long",
//     AccumulatorParam.LongAccumulatorParam$); // oops!
{code}
In JavaSparkContext.scala, there are helper functions intAccumulator() and
doubleAccumulator() for int and double, but there is no such helper for long.
{code}
Accumulator<Integer> intAcc = sparkContext.intAccumulator(0, "integer");
Accumulator<Double> doubleAcc = sparkContext.doubleAccumulator(0.0, "double");
// Accumulator<Long> longAcc = sparkContext.longAccumulator(0L, "long");
// oops! No such method.
{code}
That's why we have to implement AccumulatorParam<Long>.
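As a rough illustration of what such an implementation could look like, here is a minimal sketch in Java. To keep it compilable without Spark on the classpath, it declares a hypothetical stand-in interface with the three methods a Java class has to provide when implementing the Spark 1.x AccumulatorParam trait (addAccumulator, addInPlace, zero). The class names LongAccumulatorParam and LongAccumulatorDemo are illustrative assumptions and are not taken from the patch.

```java
import java.io.Serializable;

// Hypothetical stand-in for org.apache.spark.AccumulatorParam<T> (Spark 1.x),
// declared here only so the sketch compiles without Spark. In real code,
// import the Spark interface instead of declaring this one.
interface AccumulatorParam<T> extends Serializable {
    T addAccumulator(T current, T update);
    T addInPlace(T left, T right);
    T zero(T initialValue);
}

// Long counterpart of the built-in int/double params (illustrative name).
class LongAccumulatorParam implements AccumulatorParam<Long> {
    @Override
    public Long addAccumulator(Long current, Long update) {
        // Add a single value to the running total.
        return current + update;
    }

    @Override
    public Long addInPlace(Long left, Long right) {
        // Merge two partial totals, e.g. from different tasks.
        return left + right;
    }

    @Override
    public Long zero(Long initialValue) {
        // Identity element for addition, regardless of the initial value.
        return 0L;
    }
}

class LongAccumulatorDemo {
    public static void main(String[] args) {
        LongAccumulatorParam param = new LongAccumulatorParam();
        Long total = param.zero(42L);
        for (long n : new long[] {10L, 20L, 12L}) {
            total = param.addAccumulator(total, n);
        }
        System.out.println(total);
    }
}
```

With the real Spark interface in place of the stand-in, an instance of such a class could presumably be passed to the generic JavaSparkContext.accumulator(initialValue, name, param) overload, assuming that overload is available in the Spark version in use.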
> Support hadoop-like Counter using spark accumulator
> ---------------------------------------------------
>
> Key: PIG-4645
> URL: https://issues.apache.org/jira/browse/PIG-4645
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: kexianda
> Assignee: kexianda
> Fix For: spark-branch
>
> Attachments: PIG-4645.patch
>
>
> Pig collects input/output statistics via Counters in MR/Tez mode; we need
> to support this using Spark accumulators.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)