Hi Muthu,. I did not catch from your message, what performance do you expect from subsequent queries?
Regards, Uladzimir On Mar 15, 2017 9:03 PM, "Muthu Jayakumar" <bablo...@gmail.com> wrote: > Hello Uladzimir / Shiva, > > From ElasticSearch documentation (i have to see the logical plan of a > query to confirm), the richness of filters (like regex,..) is pretty good > while comparing to Cassandra. As for aggregates, i think Spark Dataframes > is quite rich enough to tackle. > Let me know your thoughts. > > Thanks, > Muthu > > > On Wed, Mar 15, 2017 at 10:55 AM, vvshvv <vvs...@gmail.com> wrote: > >> Hi muthu, >> >> I agree with Shiva, Cassandra also supports SASI indexes, which can >> partially replace Elasticsearch functionality. >> >> Regards, >> Uladzimir >> >> >> >> Sent from my Mi phone >> On Shiva Ramagopal <tr.s...@gmail.com>, Mar 15, 2017 5:57 PM wrote: >> >> Probably Cassandra is a good choice if you are mainly looking for a >> datastore that supports fast writes. You can ingest the data into a table >> and define one or more materialized views on top of it to support your >> queries. Since you mention that your queries are going to be simple you can >> define your indexes in the materialized views according to how you want to >> query the data. >> >> Thanks, >> Shiva >> >> >> >> On Wed, Mar 15, 2017 at 7:58 PM, Muthu Jayakumar <bablo...@gmail.com> >> wrote: >> >>> Hello Vincent, >>> >>> Cassandra may not fit my bill if I need to define my partition and other >>> indexes upfront. Is this right? >>> >>> Hello Richard, >>> >>> Let me evaluate Apache Ignite. I did evaluate it 3 months back and back >>> then the connector to Apache Spark did not support Spark 2.0. >>> >>> Another drastic thought may be repartition the result count to 1 (but >>> have to be cautions on making sure I don't run into Heap issues if the >>> result is too large to fit into an executor) and write to a relational >>> database like mysql / postgres. But, I believe I can do the same using >>> ElasticSearch too. >>> >>> A slightly over-kill solution may be Spark to Kafka to ElasticSearch? >>> >>> More thoughts welcome please. >>> >>> Thanks, >>> Muthu >>> >>> On Wed, Mar 15, 2017 at 4:53 AM, Richard Siebeling <rsiebel...@gmail.com >>> > wrote: >>> >>>> maybe Apache Ignite does fit your requirements >>>> >>>> On 15 March 2017 at 08:44, vincent gromakowski < >>>> vincent.gromakow...@gmail.com> wrote: >>>> >>>>> Hi >>>>> If queries are statics and filters are on the same columns, Cassandra >>>>> is a good option. >>>>> >>>>> Le 15 mars 2017 7:04 AM, "muthu" <bablo...@gmail.com> a écrit : >>>>> >>>>> Hello there, >>>>> >>>>> I have one or more parquet files to read and perform some aggregate >>>>> queries >>>>> using Spark Dataframe. I would like to find a reasonable fast >>>>> datastore that >>>>> allows me to write the results for subsequent (simpler queries). >>>>> I did attempt to use ElasticSearch to write the query results using >>>>> ElasticSearch Hadoop connector. But I am running into connector write >>>>> issues >>>>> if the number of Spark executors are too many for ElasticSearch to >>>>> handle. >>>>> But in the schema sense, this seems a great fit as ElasticSearch has >>>>> smartz >>>>> in place to discover the schema. Also in the query sense, I can perform >>>>> simple filters and sort using ElasticSearch and for more complex >>>>> aggregate, >>>>> Spark Dataframe can come back to the rescue :). >>>>> Please advice on other possible data-stores I could use? >>>>> >>>>> Thanks, >>>>> Muthu >>>>> >>>>> >>>>> >>>>> -- >>>>> View this message in context: http://apache-spark-user-list. >>>>> 1001560.n3.nabble.com/Fast-write-datastore-tp28497.html >>>>> Sent from the Apache Spark User List mailing list archive at >>>>> Nabble.com. >>>>> >>>>> --------------------------------------------------------------------- >>>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org >>>>> >>>>> >>>>> >>>> >>> >> >