Hello there, I have one or more parquet files to read and some aggregate queries to run over them using Spark DataFrames. I would like to find a reasonably fast datastore where I can write the results for subsequent (simpler) queries. I attempted to write the query results to ElasticSearch using the ElasticSearch-Hadoop connector, but I run into connector write failures when the number of Spark executors is too high for ElasticSearch to handle. Schema-wise it seems like a great fit, since ElasticSearch has smarts in place to discover the schema. Query-wise it also works well: I can perform simple filters and sorts in ElasticSearch, and for more complex aggregates, Spark DataFrames can come to the rescue :). Please advise on other possible datastores I could use.
Thanks,
Muthu