issue with spark and bson input

Dmitriy Selivanov Tue, 05 Aug 2014 13:44:31 -0700

Hello, I have an issue when trying to use a BSON file as Spark input. I use
mongo-hadoop-connector 1.3.0 and Spark 1.0.0:
    val sparkConf = new SparkConf()
    val sc = new SparkContext(sparkConf)
    val config = new Configuration()
    config.set("mongo.job.input.format", "com.mongodb.hadoop.BSONFileInputFormat")
    config.set("mapred.input.dir", "file:///root/jobs/dump/input.bson")
    config.set("mongo.output.uri", "mongodb://" + args(0) + "/" + args(2))
    val mongoRDD = sc.newAPIHadoopFile(
      "file:///root/jobs/dump/input.bson",
      classOf[BSONFileInputFormat], classOf[Object], classOf[BSONObject], config)


But on the last line I receive this error: "inferred type arguments
[Object,org.bson.BSONObject,com.mongodb.hadoop.BSONFileInputFormat] do not
conform to method newAPIHadoopFile's type parameter bounds [K,V,F <:
org.apache.hadoop.mapreduce.InputFormat[K,V]]"
This is very strange, because BSONFileInputFormat
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat:
https://github.com/mongodb/mongo-hadoop/blob/master/core/src/main/java/com/mongodb/hadoop/BSONFileInputFormat.java
How can I solve this issue?
I have no problems with com.mongodb.hadoop.MongoInputFormat when using a
MongoDB collection as input.
Moreover, there seems to be no problem with the Java API:
https://github.com/crcsmnky/mongodb-spark-demo/blob/master/src/main/java/com/mongodb/spark/demo/Recommender.java
I'm not a professional Java/Scala developer, so any help is appreciated.

-- 
Regards
Dmitriy Selivanov
