Hello Uladzimir / Shiva,

From the ElasticSearch documentation (I still have to look at the logical plan of a query to confirm), the richness of its filters (regex, etc.) is pretty good compared to Cassandra. As for aggregates, I think Spark Dataframes are rich enough to handle them. Let me know your thoughts.

Thanks,
Muthu

On Wed, Mar 15, 2017 at 10:55 AM, vvshvv <vvs...@gmail.com> wrote:

> Hi muthu,
>
> I agree with Shiva. Cassandra also supports SASI indexes, which can
> partially replace Elasticsearch functionality.
>
> Regards,
> Uladzimir
>
> Sent from my Mi phone
> On Mar 15, 2017 5:57 PM, Shiva Ramagopal <tr.s...@gmail.com> wrote:
>
> Probably Cassandra is a good choice if you are mainly looking for a
> datastore that supports fast writes. You can ingest the data into a table
> and define one or more materialized views on top of it to support your
> queries. Since you mention that your queries are going to be simple, you
> can define your indexes in the materialized views according to how you
> want to query the data.
>
> Thanks,
> Shiva
>
> On Wed, Mar 15, 2017 at 7:58 PM, Muthu Jayakumar <bablo...@gmail.com>
> wrote:
>
>> Hello Vincent,
>>
>> Cassandra may not fit my bill if I need to define my partition and other
>> indexes upfront. Is this right?
>>
>> Hello Richard,
>>
>> Let me evaluate Apache Ignite. I did evaluate it 3 months back, and back
>> then the connector to Apache Spark did not support Spark 2.0.
>>
>> Another drastic thought may be to repartition the result to 1 partition
>> (but I have to be cautious to make sure I don't run into heap issues if
>> the result is too large to fit into an executor) and write to a
>> relational database like mysql / postgres. But I believe I can do the
>> same using ElasticSearch too.
>>
>> A slightly overkill solution may be Spark to Kafka to ElasticSearch?
>>
>> More thoughts welcome please.
>>
>> Thanks,
>> Muthu
>>
>> On Wed, Mar 15, 2017 at 4:53 AM, Richard Siebeling <rsiebel...@gmail.com>
>> wrote:
>>
>>> Maybe Apache Ignite fits your requirements.
>>>
>>> On 15 March 2017 at 08:44, vincent gromakowski <
>>> vincent.gromakow...@gmail.com> wrote:
>>>
>>>> Hi
>>>> If queries are static and filters are on the same columns, Cassandra
>>>> is a good option.
>>>>
>>>> On 15 March 2017 at 7:04 AM, "muthu" <bablo...@gmail.com> wrote:
>>>>
>>>> Hello there,
>>>>
>>>> I have one or more parquet files to read and perform some aggregate
>>>> queries on using Spark Dataframe. I would like to find a reasonably
>>>> fast datastore that allows me to write the results for subsequent
>>>> (simpler) queries.
>>>> I did attempt to use ElasticSearch to write the query results using the
>>>> ElasticSearch Hadoop connector. But I am running into connector write
>>>> issues if there are too many Spark executors for ElasticSearch to
>>>> handle.
>>>> But in the schema sense, this seems a great fit, as ElasticSearch has
>>>> smarts in place to discover the schema. Also, in the query sense, I can
>>>> perform simple filters and sorts using ElasticSearch, and for more
>>>> complex aggregates, Spark Dataframe can come back to the rescue :).
>>>> Please advise on other possible datastores I could use.
>>>>
>>>> Thanks,
>>>> Muthu
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Fast-write-datastore-tp28497.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
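
The write issue described in the original message (too many Spark executors writing to ElasticSearch at once) and the "repartition to 1" idea raised later in the thread are two ends of the same knob: the number of partitions at write time bounds the number of concurrent writers. A minimal PySpark-style sketch of capping that number follows. It is not taken from the thread: the helper names `cap_writers` and `write_to_es`, the default of 4 writers, and the batch values are illustrative assumptions, while `coalesce`, `rdd.getNumPartitions`, the `org.elasticsearch.spark.sql` format and the `es.*` options are assumed to behave as documented for Spark and the elasticsearch-hadoop connector.

```python
def cap_writers(df, max_writers):
    """Return df with at most max_writers partitions.

    Each partition is written by one task, so this bounds how many
    tasks can talk to the datastore at the same time.
    """
    if max_writers < 1:
        raise ValueError("max_writers must be >= 1")
    current = df.rdd.getNumPartitions()
    # coalesce() merges existing partitions without a full shuffle and
    # never increases the partition count, so skip it when already small.
    return df.coalesce(max_writers) if current > max_writers else df


def write_to_es(df, resource, nodes, max_writers=4):
    """Append df to an ElasticSearch resource with a bounded writer count.

    Hypothetical helper: resource is an index name such as "results/doc",
    nodes is a host list such as "es-host:9200". Values are illustrative.
    """
    (cap_writers(df, max_writers)
        .write
        .format("org.elasticsearch.spark.sql")      # elasticsearch-hadoop data source
        .option("es.nodes", nodes)
        .option("es.batch.size.entries", "500")     # smaller bulk requests per task
        .option("es.batch.write.retry.count", "6")  # retry rejected bulk writes
        .mode("append")
        .save(resource))
```

Setting `max_writers` to 1 reproduces the single-partition case mentioned in the thread, with the same caveat: all rows then pass through one task, so the result has to fit comfortably in one executor.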