Re: How to use groupByKey and CqlPagingInputFormat

Martin Gammelsæter Sat, 05 Jul 2014 11:00:06 -0700

Ah, I see. Thank you!

As we are still building the system, we have not tried it with any
large amount of data yet, but when the time comes I'll try both
implementations and do a small benchmark.
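
For the record, the map({}) versus map{} point quoted below can be checked on an ordinary Scala collection. This is only a minimal sketch: the object name and the sample data are made up for illustration and are not from my real code. Scala lets a single argument be passed in braces instead of parentheses, and a { case ... } block is a pattern-matching anonymous function, so both spellings hand the same function to map.

```scala
object MapBraces {
  // Hypothetical sample data standing in for (name, revision) pairs.
  val rows: Seq[(String, Int)] = Seq(("a", 1), ("b", 2), ("a", 3))

  // Braces only: the block is a pattern-matching anonymous function.
  val withBraces: Seq[(String, Int)] =
    rows.map { case (name, rev) => (name, rev + 1) }

  // Parentheses around the braces: the same function literal,
  // passed as an ordinary parenthesised argument.
  val withParensAndBraces: Seq[(String, Int)] =
    rows.map({ case (name, rev) => (name, rev + 1) })

  def main(args: Array[String]): Unit = {
    // Both spellings produce the same result.
    assert(withBraces == withParensAndBraces)
    println(withBraces)
  }
}
```

The braces-only form is the one usually seen in Spark code, since it reads better when the function body spans several lines.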


On Fri, Jul 4, 2014 at 9:20 PM, Mohammed Guller <moham...@glassbeam.com> wrote:
> As far as I know, there is not much difference, except that the outer
> parentheses are redundant. The problem with your original code was that the
> opening and closing parentheses were mismatched. Sometimes the error
> messages are misleading :-)
>
> Do you see any performance difference with the DataStax Spark driver?
>
> Mohammed
>
> -----Original Message-----
> From: Martin Gammelsæter [mailto:martingammelsae...@gmail.com]
> Sent: Friday, July 4, 2014 12:43 AM
> To: user@spark.apache.org
> Subject: Re: How to use groupByKey and CqlPagingInputFormat
>
> On Thu, Jul 3, 2014 at 10:29 PM, Mohammed Guller <moham...@glassbeam.com> 
> wrote:
>> Martin,
>>
>> 1) The first map contains the primary-key columns (possibly a compound
>> primary key with multiple columns), and the second map contains all the
>> non-key columns.
>
> Ah, thank you, that makes sense.
>
>> 2) try this fixed code:
>>     val navnrevmap = casRdd.map {
>>       case (key, value) =>
>>         (ByteBufferUtil.string(value.get("navn")),
>>          ByteBufferUtil.toInt(value.get("revisjon")))
>>     }.groupByKey()
>
> I changed from CqlPagingInputFormat to the new Datastax cassandra-spark 
> driver, which is a bit easier to work with, but thanks! I'm curious though, 
> what is the semantic difference between
> map({}) and map{}?
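
For anyone who wants to try the shape of the fixed code quoted above without a Spark or Cassandra setup, here is a rough local sketch on plain Scala collections. Everything except the column names "navn" and "revisjon" is an assumption made for illustration: the "id" key column, the sample rows, and the decodeString/decodeInt helpers, which stand in for ByteBufferUtil.string and ByteBufferUtil.toInt and assume UTF-8 text and 4-byte ints. It also uses Scala Maps where the input format hands back Java maps, and groupBy plus a projection as a local analogue of groupByKey.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object GroupByKeySketch {
  // Hypothetical stand-ins for ByteBufferUtil.string / ByteBufferUtil.toInt.
  // duplicate() and the absolute getInt leave the source buffer's position untouched.
  def decodeString(buf: ByteBuffer): String =
    StandardCharsets.UTF_8.decode(buf.duplicate()).toString

  def decodeInt(buf: ByteBuffer): Int = buf.getInt(buf.position())

  // Helpers to build sample column values.
  def text(s: String): ByteBuffer = ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8))
  def int(i: Int): ByteBuffer = ByteBuffer.allocate(4).putInt(0, i)

  // Each row is (key columns, non-key columns), mirroring the two maps
  // described in the thread. The "id" key column is made up.
  val rows: Seq[(Map[String, ByteBuffer], Map[String, ByteBuffer])] = Seq(
    (Map("id" -> int(1)), Map("navn" -> text("a"), "revisjon" -> int(1))),
    (Map("id" -> int(2)), Map("navn" -> text("b"), "revisjon" -> int(2))),
    (Map("id" -> int(3)), Map("navn" -> text("a"), "revisjon" -> int(3)))
  )

  // Same shape as the fixed code: decode two non-key columns into a pair.
  val pairs: Seq[(String, Int)] = rows.map {
    case (_, value) => (decodeString(value("navn")), decodeInt(value("revisjon")))
  }

  // Local analogue of groupByKey: collect all revisions per name.
  val grouped: Map[String, Seq[Int]] =
    pairs.groupBy(_._1).map { case (name, kvs) => (name, kvs.map(_._2)) }

  def main(args: Array[String]): Unit =
    grouped.foreach { case (name, revs) => println(s"$name -> $revs") }
}
```

On a real RDD the grouping runs across the cluster and involves a shuffle, so this sketch says nothing about performance; it only shows the data shapes involved.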



-- 
Mvh.
Martin Gammelsæter
92209139
