You say you use reduceByKey, but are you really collecting all the tuples
for a vehicle into a collection, as groupByKey already does? If so, then
yes, that could fail when one vehicle has a huge amount of data.
Otherwise, perhaps you are simply not increasing memory from the default.
Maybe you can consider
It is wonderful to see some ideas.
Now the questions:
1) What is a track segment?
Ans) It is the line joining two adjacent points when all points are
ordered by time. Say a vehicle moves (t1, p1) - (t2, p2) - (t3, p3).
Then the segments are (p1, p2), (p2, p3) when the time ordering is (t1
Hi,
I have an RDD containing vehicle number, timestamp, and position.
I want the equivalent of a lag function on my RDD so that I can create
the track segments of each vehicle.
Any help?
PS: I have tried reduceByKey and then splitting the list of positions into
tuples. For me it runs out of memory.
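
For readers unfamiliar with the term, the lag-style pairing asked about here
can be sketched in plain Python. This is only an illustration of the logic,
not Spark code and not the poster's implementation: the function name
track_segments and the sample records are hypothetical, and it assumes each
record is a (vehicle, timestamp, position) tuple. A track segment is taken
to be a pair of positions that are adjacent in time order for one vehicle.

```python
from itertools import groupby
from operator import itemgetter


def track_segments(records):
    """Map each vehicle to its list of (previous_position, next_position) pairs.

    `records` is assumed to be an iterable of (vehicle, timestamp, position).
    """
    # Sort by vehicle, then timestamp, so each vehicle's points are in time order.
    ordered = sorted(records, key=itemgetter(0, 1))
    segments = {}
    for vehicle, rows in groupby(ordered, key=itemgetter(0)):
        positions = [position for _, _, position in rows]
        # Pair every position with its successor: a lag of one step.
        segments[vehicle] = list(zip(positions, positions[1:]))
    return segments


# Hypothetical sample data, deliberately out of time order.
records = [
    ("v1", 2, "p2"),
    ("v1", 1, "p1"),
    ("v2", 1, "q1"),
    ("v1", 3, "p3"),
    ("v2", 5, "q2"),
]
print(track_segments(records))
```

Note that this sketch holds all positions of a vehicle in one list, which is
the same pattern that can exhaust memory on a cluster when a single vehicle
has a very large number of points.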
Perhaps it's just me, but the lag function isn't familiar to me.
But have you tried configuring Spark appropriately?
http://spark.apache.org/docs/latest/configuration.html
On Tue, Oct 14, 2014 at 5:37 PM, Manas Kar manasdebashis...@gmail.com
wrote:
Hi,
I have an RDD containing Vehicle Number