Re: cache spark sql parquet file in memory?

2014-06-07 Thread Michael Armbrust
Not a stupid question! I would like to be able to do this. For now, you might try writing the data to Tachyon (http://tachyon-project.org/) instead of HDFS. This is untested, though, so please report any issues you run into.

Michael
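A minimal sketch of what that suggestion might look like with the Spark 1.0-era Scala API, in case it helps anyone trying it. Like the suggestion itself, this is untested. The Tachyon master address, the output path, the `Record` case class, and the table name are all hypothetical placeholders, and it assumes the Tachyon client is on Spark's classpath so that `tachyon://` URIs resolve.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object ParquetOnTachyon {
  // Hypothetical schema, used only for illustration.
  case class Record(key: Int, value: String)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ParquetOnTachyon"))
    val sqlContext = new SQLContext(sc)
    // Implicit conversion from an RDD of case classes to a SchemaRDD (Spark 1.0.x).
    import sqlContext.createSchemaRDD

    // Assumption: a Tachyon master reachable at this (made-up) host and port.
    val path = "tachyon://tachyon-master:19998/tmp/records.parquet"

    // Write the intermediate result as Parquet to Tachyon instead of HDFS.
    val records = sc.parallelize(1 to 100).map(i => Record(i, s"val_$i"))
    records.saveAsParquetFile(path)

    // Read it back and query it through Spark SQL.
    val fromTachyon = sqlContext.parquetFile(path)
    fromTachyon.registerAsTable("records")
    sqlContext.sql("SELECT key, value FROM records WHERE key < 10")
      .collect()
      .foreach(println)

    sc.stop()
  }
}
```

The only change from the usual HDFS workflow is the URI scheme passed to saveAsParquetFile() and parquetFile(). Whether this actually speeds up queries depends on the Tachyon deployment, which this sketch does not address.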

Re: cache spark sql parquet file in memory?

2014-06-07 Thread Marek Wiewiorka
I was also thinking of using Tachyon to store Parquet files. I may give it a try tomorrow as well.

Re: cache spark sql parquet file in memory?

2014-06-07 Thread Xu (Simon) Chen
Is there a way to start Tachyon on top of a YARN cluster?

cache spark sql parquet file in memory?

2014-06-06 Thread Xu (Simon) Chen
This might be a stupid question, but it seems that saveAsParquetFile() writes everything back to HDFS. Is it possible to cache Parquet-format intermediate results in memory instead, and thereby make Spark SQL queries faster? Thanks. -Simon