I already checked, and GC is taking 1 second for each task. Is this too much?
If so, how can I avoid it?

On 16 April 2015 at 21:58, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Open the driver UI and see which stage is taking time. You can also check
> whether it's adding any GC time, etc.
>
> Thanks
> Best Regards
>
> On Thu, Apr 16, 2015 at 9:56 PM, Jeetendra Gangele <gangele...@gmail.com>
> wrote:
>
>> Hi All, I have the code below, where distinct is running for a long time.
>>
>> blockingRdd is a pair RDD of <Long,String>, and it will have 400K
>> records.
>> JavaPairRDD<Long,Integer> completeDataToprocess =
>>     blockingRdd.flatMapValues(new Function<String, Iterable<Integer>>() {
>>       @Override
>>       public Iterable<Integer> call(String v1) throws Exception {
>>         return ckdao.getSingelkeyresult(v1);
>>       }
>>     }).distinct(32);
>>
>> I am running distinct on 800K records, and it's taking 2 hours on 16 cores
>> and 20 GB RAM.
>>
>
>
