If it's about the verbosity, you can just use iter.size instead of your
self-written count, right?
val numVertices =
  (srcVertices union targetVertices).distinct.reduceGroup { iter =>
    iter.size
  }
Performance-wise, this is the same, though: iter.size still walks over every
element of the iterator.
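
To illustrate on plain Scala iterators, without Flink: a minimal sketch,
where CountSketch and manualCount are names made up for this example and the
input values are arbitrary. Both variants consume the iterator once, so they
should give the same count for the same amount of work.

```scala
object CountSketch {
  // Hypothetical helper: counts elements by walking the iterator by hand.
  def manualCount[T](iter: Iterator[T]): Long = {
    var count = 0L
    while (iter.hasNext) {
      count += 1
      iter.next()
    }
    count
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for the distinct vertex ids; not taken from the original job.
    val vertices = Seq(1L, 2L, 3L, 2L, 1L).distinct

    val viaLoop = manualCount(vertices.iterator)
    // Iterator.size also traverses (and exhausts) the iterator; it returns an Int.
    val viaSize = vertices.iterator.size

    println(s"manual: $viaLoop, size: $viaSize")
  }
}
```

Note that size returns an Int, so for very large groups a Long counter like
the manual one may still be the safer choice.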
Cheers
Stefan
On Sat, Nov 22, 2014 at 8:17 PM, Márton Balassi <[email protected]>
wrote:
> Hey,
>
> There was a thread recently on the dev list that might be interesting to
> you [1].
> I do not know the exact state of the code though.
>
> [1]
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Hi-Aggregation-support-td2311.html#a2547
>
> Cheers,
>
> Marton
>
> On Sat, Nov 22, 2014 at 8:09 PM, Sebastian Schelter <[email protected]>
> wrote:
>
>> Hi,
>>
>> Is there a simple way to count the number of elements of a dataset? At
>> the moment, I have to use the following code, which is pretty verbose and
>> inefficient.
>>
>> val numVertices =
>>   (srcVertices union targetVertices).distinct.reduceGroup { iter =>
>>     var count = 0L
>>     while (iter.hasNext) {
>>       count += 1
>>       iter.next()
>>     }
>>     count
>>   }
>>
>> Best,
>> Sebastian
>>
>
>