BTW, if you decide to try MongoDB, please use version 3.0 or later with the WiredTiger storage engine.
On Sat, Nov 28, 2015 at 11:30 PM, Yu Zhang <yuz1...@iastate.edu> wrote:

> If you need to build multiple indexes, HBase will perform better: writes
> are slow in MongoDB with many indexes, and the memory cost is huge!
>
> But my concern is this: with MongoDB you can easily work with JavaScript
> and with visualization tools like D3.js, so that part of the work becomes
> smooth as a breeze.
>
> Could you provide additional details on the data size and the number of
> operations your program needs? I believe this is quite a general
> question, and I hope to hear any comments and thoughts.
>
> On Tue, Nov 24, 2015 at 9:50 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> You should consider using HBase as the NoSQL database.
>> W.r.t. 'The data in the DB should be indexed', you need to design the
>> schema in HBase carefully so that retrieval is fast.
>>
>> Disclaimer: I work on HBase.
>>
>> On Tue, Nov 24, 2015 at 4:46 AM, sparkuser2345 <hm.spark.u...@gmail.com>
>> wrote:
>>
>>> I'm interested in knowing which NoSQL databases you use with Spark and
>>> what your experiences are.
>>>
>>> On a general level, I would like to use Spark Streaming to process
>>> incoming data, fetch relevant aggregated data from the database, and
>>> update the aggregates in the DB based on the incoming records. The data
>>> in the DB should be indexed so that the relevant data can be fetched
>>> quickly and the data can be visualized interactively.
>>>
>>> I've been reading about MongoDB+Spark and I've got the impression that
>>> there are some challenges in fetching data by indices and in updating
>>> documents, but things are moving so fast that I don't know if these are
>>> still relevant. Do you find any benefit in using HBase with Spark, given
>>> that HBase is built on top of HDFS?
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Experiences-about-NoSQL-databases-with-Spark-tp25462.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
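
For reference, the workflow described in the original question (take a batch of incoming records, fetch only the relevant aggregates by key, merge, and write back) can be sketched without committing to a particular store. This is only an illustration under assumptions: `InMemoryStore`, `update_aggregates`, and the `(count, total)` aggregate shape are hypothetical names made up here, and the dict-backed store merely stands in for keyed reads and writes against HBase or MongoDB.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical aggregate shape: (record count, running total) per key.
Aggregate = Tuple[int, float]


class InMemoryStore:
    """Stand-in for the NoSQL store; supports keyed fetch and upsert only."""

    def __init__(self) -> None:
        self._rows: Dict[str, Aggregate] = {}

    def fetch(self, keys: Iterable[str]) -> Dict[str, Aggregate]:
        # Return only the aggregates that already exist for the given keys.
        return {k: self._rows[k] for k in keys if k in self._rows}

    def upsert(self, rows: Dict[str, Aggregate]) -> None:
        # Insert new aggregates or overwrite existing ones.
        self._rows.update(rows)


def update_aggregates(store: InMemoryStore,
                      batch: Iterable[Tuple[str, float]]) -> None:
    """Fold one batch of (key, value) records into the stored aggregates."""
    # 1. Pre-aggregate the batch so each key touches the store once.
    partial: Dict[str, Aggregate] = {}
    for key, value in batch:
        count, total = partial.get(key, (0, 0.0))
        partial[key] = (count + 1, total + value)

    # 2. Fetch existing aggregates for only the keys seen in this batch.
    existing = store.fetch(partial.keys())

    # 3. Merge the batch totals into the stored ones and write them back.
    merged: Dict[str, Aggregate] = {}
    for key, (count, total) in partial.items():
        old_count, old_total = existing.get(key, (0, 0.0))
        merged[key] = (old_count + count, old_total + total)
    store.upsert(merged)


store = InMemoryStore()
update_aggregates(store, [("sensor-a", 1.0), ("sensor-b", 4.0), ("sensor-a", 3.0)])
update_aggregates(store, [("sensor-a", 2.0)])
print(store.fetch(["sensor-a", "sensor-b"]))
```

In a Spark Streaming job, logic like this would typically run inside `foreachRDD` (often per partition), with `fetch` and `upsert` replaced by the chosen database's client calls. The point of the sketch is that each batch needs fast lookups by key, which is why the index or row key design discussed above matters for either store.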