Hi All,
I am iterating over a DataFrame's partitions using df.foreachPartition.
For each row in a partition, I initialize a DAO to insert the row into
Cassandra.
Each of these iterations takes almost a minute and a half to finish.
In my workflow, this is part of an action, and the df has 100 partitions:
I can see 100 tasks being created, each performing the DAO insert
operation.
Since each of these 100 tasks takes around a minute and a half to
complete, this small insert operation takes around 2 hours.
Has anyone faced the same scenario, and is there a more time-efficient way
to handle this?
This latency is not acceptable for our use case.
Any pointers to improve/minimise the latency would be really appreciated.
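
To make the pattern concrete, here is a minimal sketch of it, not the actual job code. It runs without Spark or Cassandra: partitions are simulated with plain lists, RowDao is a hypothetical stand-in that only counts calls, and the 50 rows per partition is an arbitrary number chosen for illustration. The second method is only an assumption about what an alternative might look like (one DAO per partition instead of one per row); whether DAO setup is really what dominates the 1.5 minutes would need to be measured.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class PartitionInsertSketch {

    // Hypothetical stand-in for the real Cassandra DAO; it only counts calls.
    static class RowDao {
        static int initCount = 0;   // how many DAOs were constructed
        static int insertCount = 0; // how many rows were "inserted"

        RowDao() {
            // In a real job this is where a session/connection might be set up.
            initCount++;
        }

        void insert(String row) {
            insertCount++;
        }
    }

    // Pattern described in this mail: a new DAO is initialized for every row.
    static void insertPerRow(Iterator<String> partition) {
        while (partition.hasNext()) {
            RowDao dao = new RowDao();
            dao.insert(partition.next());
        }
    }

    // Assumed alternative: one DAO per partition, reused for all of its rows.
    static void insertPerPartition(Iterator<String> partition) {
        RowDao dao = new RowDao();
        while (partition.hasNext()) {
            dao.insert(partition.next());
        }
    }

    // Builds `partitions` lists holding `rowsPerPartition` dummy rows each.
    static List<List<String>> fakePartitions(int partitions, int rowsPerPartition) {
        List<List<String>> out = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            List<String> rows = new ArrayList<>();
            for (int r = 0; r < rowsPerPartition; r++) {
                rows.add("row-" + p + "-" + r);
            }
            out.add(rows);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> data = fakePartitions(100, 50);

        for (List<String> partition : data) {
            insertPerRow(partition.iterator());
        }
        System.out.println("per-row DAO inits: " + RowDao.initCount);

        RowDao.initCount = 0; // reset the counter before the second run
        for (List<String> partition : data) {
            insertPerPartition(partition.iterator());
        }
        System.out.println("per-partition DAO inits: " + RowDao.initCount);
    }
}
```

With these made-up sizes, the per-row variant constructs 5000 DAOs and the per-partition variant constructs 100, one per task.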


-- 
Thanks
Deepak
