I tried using RDD#mapPartitions but my job completes prematurely and
without error, as if nothing gets done. What I have is fairly simple:

    sc
      .textFile(inputFile)
      .map(parser.parse)
      .mapPartitions(bulkLoad)
But the Iterator[T] of mapPartitions
Does bulkLoad hold the connection to MongoDB?
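
One thing that may explain the "completes without doing anything" symptom: mapPartitions is a lazy transformation, so if no action follows it, Spark never runs the function. For a pure side effect like a bulk write, foreachPartition is an action and forces execution. Below is a minimal sketch under assumptions: the bulkLoad here is a hypothetical stand-in (the real one is not shown in the thread), the MongoDB write is replaced by a println, and the input is a small in-memory collection rather than inputFile.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BulkLoadSketch {
  // Hypothetical stand-in for the poster's bulkLoad. A real version would open
  // its MongoDB connection here, once per partition, and write the batch.
  def bulkLoad(records: Iterator[String]): Unit = {
    // The iterator must actually be consumed; an untouched iterator does no work.
    val batch = records.toVector
    println(s"would write ${batch.size} records")
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("bulk-load-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Assumed sample data standing in for sc.textFile(inputFile).map(parser.parse).
    val parsed = sc.parallelize(Seq("a", "b", "c", "d"), numSlices = 2)

    // foreachPartition is an action, so bulkLoad runs once for each partition.
    // A chain ending in mapPartitions alone would only build the lineage.
    parsed.foreachPartition(bulkLoad)

    sc.stop()
  }
}
```

If the results of the load are needed downstream, keeping mapPartitions and ending the chain with an action such as count() would also trigger the work, provided bulkLoad consumes its iterator.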
On Fri, Nov 21, 2014 at 4:34 PM, Benny Thompson ben.d.tho...@gmail.com
wrote:
I tried using RDD#mapPartitions but my job completes prematurely and
without error as if nothing gets done. What I have is fairly simple
sc
On Thu, Nov 20, 2014 at 10:18 PM, Benny Thompson ben.d.tho...@gmail.com
wrote:
I'm trying to use MongoDB as a destination for an ETL I'm writing in
Spark. It appears I'm gaining a lot of overhead in my system databases
(and possibly in the primary documents themselves); I can only assume