Only the elements at the beginning and end of each partition need to be
shuffled. The remaining elements within a partition can be compared
without a shuffle.
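
As a rough illustration of the idea, here is a minimal sketch in plain
Python that simulates partitions as lists. It is not Spark code. The
function names (partition_heads, compare_adjacent) are made up for this
example. In Spark, the per-partition steps would roughly correspond to
mapPartitionsWithIndex, and the small list of partition heads is the
only data that would have to move between partitions (for example by
collecting it and broadcasting it, which is an assumption about how one
might wire it up).

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def partition_heads(partitions: List[List[T]]) -> List[Tuple[int, T]]:
    """Return (partition index, first element) for every non-empty partition.

    This small list is the only data that needs to cross partition
    boundaries; everything else stays where it is.
    """
    return [(i, part[0]) for i, part in enumerate(partitions) if part]


def compare_adjacent(
    partitions: List[List[T]], compare: Callable[[T, T], R]
) -> List[List[R]]:
    """Apply compare(a[i], a[i+1]) to every adjacent pair, partition by partition."""
    heads = partition_heads(partitions)
    results: List[List[R]] = []
    for i, part in enumerate(partitions):
        if not part:
            # An empty partition contributes no pairs.
            results.append([])
            continue
        # Head of the next non-empty partition, if there is one.
        following = [head for j, head in heads if j > i]
        # Append that single boundary element so the last local element
        # can be compared with its global successor.
        extended = part + following[:1]
        results.append([compare(a, b) for a, b in zip(extended, extended[1:])])
    return results


# Hypothetical data: four partitions, one of them empty.
parts = [[1, 2, 3], [4], [], [5, 6]]
per_partition = compare_adjacent(parts, lambda a, b: (a, b))
flat = [pair for chunk in per_partition for pair in chunk]
print(flat)
```

Each partition does its own comparisons, so parallelism is kept. Only one
element per partition is exchanged.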

Sincerely,

DB Tsai
----------------------------------------------------------
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D


On Sun, Dec 6, 2015 at 6:27 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
>
>
>
>
> On Saturday, December 5, 2015 3:00 PM, DB Tsai <dbt...@dbtsai.com> wrote:
>
>
> This is tricky. You need to shuffle the ending and beginning elements
> of each partition using mapPartitionsWithIndex.
>
>
> Does this mean that I need to shuffle all the elements from the different
> partitions into one partition, and then compare every two adjacent
> elements there?
> That seems fine, if so.
>
> One more issue: will this lose parallelism, since there would be only one
> partition ...
>
> Thanks very much in advance!
>
>
>
>
>
>
>
>
> On Fri, Dec 4, 2015 at 10:30 PM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:
>> Hi All,
>>
>> I would like to compare every pair of adjacent elements in a given RDD,
>> just as this single-machine code does:
>>
>> int a[N] = {...};
>> for (int i=0; i < N - 1; ++i) {
>>    compareFun(a[i], a[i+1]);
>> }
>> ...
>>
>> mapPartitions may work in some situations; however, it cannot compare
>> elements in different partitions.
>> foreach also does not seem to work.
>>
>> Thanks,
>> Zhiliang
>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
>
>

