On Saturday, December 5, 2015 3:00 PM, DB Tsai wrote:
This is tricky. You need to shuffle the ending and beginning elements
using mapPartitionsWithIndex.
Does this mean that I need to shuffle all the elements in different partitions
into one partition, then ...
On Monday, December 7, 2015 10:37 AM, DB Tsai wrote:
Only beginning and ending part of data. The rest in the partition can
be compared without shuffle.
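The idea can be sketched in plain Java, with partitions simulated as nested lists (an illustrative sketch, not code from this thread; in Spark, the per-partition head elements below would be gathered with mapPartitionsWithIndex plus collect and then broadcast, while everything else stays in place):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BoundaryCompare {
    // First element of each partition: the only data that has to move
    // between partitions.
    static List<Integer> heads(List<List<Integer>> partitions) {
        List<Integer> h = new ArrayList<>();
        for (List<Integer> part : partitions) {
            h.add(part.isEmpty() ? null : part.get(0));
        }
        return h;
    }

    // All adjacent pairs: pairs inside each partition, plus one boundary
    // pair per partition formed with the next partition's head.
    static List<int[]> adjacentPairs(List<List<Integer>> partitions) {
        List<Integer> heads = heads(partitions);
        List<int[]> pairs = new ArrayList<>();
        for (int p = 0; p < partitions.size(); p++) {
            List<Integer> part = partitions.get(p);
            for (int i = 0; i + 1 < part.size(); i++) {
                pairs.add(new int[]{part.get(i), part.get(i + 1)});
            }
            if (p + 1 < partitions.size() && !part.isEmpty()
                    && heads.get(p + 1) != null) {
                pairs.add(new int[]{part.get(part.size() - 1), heads.get(p + 1)});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<List<Integer>> parts =
                Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(4, 5));
        for (int[] pair : adjacentPairs(parts)) {
            System.out.println(pair[0] + " vs " + pair[1]);
        }
    }
}
```

With two partitions {1, 2, 3} and {4, 5}, this produces exactly the four adjacent pairs (1,2), (2,3), (3,4), (4,5), and only the partition heads ever need to leave their partition.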
Would you help write a few lines of pseudo-code for it? It seems that there is
no shuffle-related API, or ...
Sincerely,
DB Tsai
--
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
On Sun, Dec 6, 2015 at 6:27 PM, Zhiliang Zhu
For this, mapPartitionsWithIndex would also work properly for filtering.
Here is the code copied from Stack Overflow, which is used to remove the first
line of a CSV file:
JavaRDD<String> rawInputRdd = sparkContext.textFile(dataFile);
Function2<Integer, Iterator<String>, Iterator<String>> removeHeader =
    new Function2<Integer, Iterator<String>, Iterator<String>>() {
      public Iterator<String> call(Integer index, Iterator<String> it) {
        if (index == 0 && it.hasNext()) it.next(); // drop the csv header line
        return it;
      }
    };
JavaRDD<String> inputRdd = rawInputRdd.mapPartitionsWithIndex(removeHeader, false);
Hi All,
I would like to compare any two adjacent elements in one given RDD, just as in
this single-machine code:

int a[N] = {...};
for (int i = 0; i < N - 1; ++i) {
    compareFun(a[i], a[i + 1]);
}

mapPartitions may work for some situations; however, it cannot compare elements
in different partitions.
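The gap can be made concrete with a small plain-Java count (an illustration only, with partition boundaries simulated by nested lists): a partition-local pass over N elements in p partitions sees only N - p of the N - 1 adjacent pairs, missing the p - 1 pairs that straddle partition boundaries.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionGap {
    // Number of adjacent pairs visible when each partition is processed
    // independently, as a mapPartitions-style pass would do.
    static int pairsWithinPartitions(List<List<Integer>> partitions) {
        int pairs = 0;
        for (List<Integer> part : partitions) {
            pairs += Math.max(0, part.size() - 1);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // 6 elements in 3 partitions: 5 adjacent pairs exist overall, but a
        // partition-local pass sees only 3; (2,3) and (4,5) are missed.
        List<List<Integer>> parts = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3, 4), Arrays.asList(5, 6));
        System.out.println(pairsWithinPartitions(parts));
    }
}
```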
Hi DB Tsai,
Thanks very much for your kind reply!
Sorry, one more issue: as tested, it seems that filter can only return a
JavaRDD of the same element type, not an arbitrary JavaRDD, is that right?
Then it is not very convenient to do a general filter on an RDD; mapPartitions
could work to some extent, but if some partition will be left and ...
This is tricky. You need to shuffle the ending and beginning elements
using mapPartitionsWithIndex.
Sincerely,
DB Tsai
--
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
On Fri, Dec 4, 2015 at 10:30 PM, Zhiliang Zhu