You might consider a storage layer other than MongoDB
(HDFS + ORC + compression or HDFS + Parquet + compression) to improve performance.
On Thu, Sep 3, 2015 at 9:15, Akhil Das wrote:
> On SSD you will get around 30-40MB/s on a single machine (on 4 cores).
>
>
Because of the existing architecture, I am bound to use MongoDB.
Any suggestions for improving performance in this setup would be appreciated.
On Thu, Sep 3, 2015 at 9:10 PM, Jörn Franke wrote:
> You might think about another storage layer not being mongodb
> (hdfs+orc+compression or hdfs+parquet+compression) to improve
On an SSD you will get around 30-40 MB/s on a single machine (with 4 cores).
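
To put that figure in perspective, here is a rough back-of-the-envelope sketch of how long a full scan would take at 30-40 MB/s. The 100 GB collection size and the class and method names are hypothetical, chosen only for illustration; the thread does not state the actual data size.

```java
public class ReadTimeEstimate {
    /** Seconds needed to read sizeMb megabytes at a sustained rate of mbPerSecond. */
    public static double secondsToRead(double sizeMb, double mbPerSecond) {
        if (mbPerSecond <= 0) {
            throw new IllegalArgumentException("throughput must be positive");
        }
        return sizeMb / mbPerSecond;
    }

    public static void main(String[] args) {
        // Hypothetical collection size: 100 GB expressed in MB (assumption, not from the thread).
        double sizeMb = 100 * 1024;
        // Lower and upper bound of the throughput range quoted above.
        System.out.println("at 30 MB/s: " + secondsToRead(sizeMb, 30) + " s");
        System.out.println("at 40 MB/s: " + secondsToRead(sizeMb, 40) + " s");
    }
}
```

Under that assumed size, a single machine would need roughly 43 to 57 minutes for one pass, which is why spreading the read across more executors or splits matters.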
Thanks
Best Regards
On Mon, Aug 31, 2015 at 3:13 PM, Deepesh Maheshwari <
deepesh.maheshwar...@gmail.com> wrote:
> tried it, and it gives the same exception as above
>
> Exception in thread "main" java.io.IOException: No FileSystem
Hi, I am trying to read from MongoDB in Spark using newAPIHadoopRDD.
/* Code */
config.set("mongo.job.input.format", "com.mongodb.hadoop.MongoInputFormat");
config.set("mongo.input.uri",SparkProperties.MONGO_OUTPUT_URI);
config.set("mongo.input.query","{host: 'abc.com'}");
JavaSparkContext sc=new
Can you try with these key value classes and see the performance?
inputFormatClassName = "com.mongodb.hadoop.MongoInputFormat"
keyClassName = "org.apache.hadoop.io.Text"
valueClassName = "org.apache.hadoop.io.MapWritable"
Taken from databricks blog
Here's a piece of code which works well for us (Spark 1.4.1):
Configuration bsonDataConfig = new Configuration();
bsonDataConfig.set("mongo.job.input.format",
"com.mongodb.hadoop.BSONFileInputFormat");
Configuration predictionsConfig = new Configuration();
FYI, newAPIHadoopFile and newAPIHadoopRDD both use the NewHadoopRDD class
underneath, and that does not mean they can only read from HDFS. Give it a
shot if you haven't tried it already (only the input format and the
reader differ from your approach).
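
To illustrate, the Mongo-specific part is essentially the set of properties handed to the input format. Below is a minimal, dependency-free sketch that gathers the keys quoted earlier in this thread; the class name, method name, and the URI in `main` are hypothetical placeholders, not part of any library. In a real job, each entry would be copied into a Hadoop `Configuration` with `config.set(key, value)` before calling newAPIHadoopRDD.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MongoInputSettings {
    /** Collects the mongo-hadoop properties mentioned in this thread into one map. */
    public static Map<String, String> build(String inputUri, String queryJson) {
        Map<String, String> props = new LinkedHashMap<>();
        // Input format class name, as used in the original question.
        props.put("mongo.job.input.format", "com.mongodb.hadoop.MongoInputFormat");
        props.put("mongo.input.uri", inputUri);
        // The query is optional; only set it when one is supplied.
        if (queryJson != null && !queryJson.isEmpty()) {
            props.put("mongo.input.query", queryJson);
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical URI for illustration only; substitute your own host, database and collection.
        Map<String, String> props =
                build("mongodb://localhost:27017/mydb.mycollection", "{host: 'abc.com'}");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Keeping the settings in one place makes it easy to swap only the input format (for example to the BSON file format shown above) without touching the rest of the job.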
Thanks
Best Regards
On Mon, Aug
Hi Akhil,
This code snippet is from the link below:
https://github.com/crcsmnky/mongodb-spark-demo/blob/master/src/main/java/com/mongodb/spark/demo/Recommender.java
There it reads data from the HDFS file system, but in our case I need to read
from MongoDB.
I tried it earlier and have now tried it again