Re: cache spark sql parquet file in memory?

2014-06-07 Thread Xu (Simon) Chen
Is there a way to start tachyon on top of a yarn cluster?
On Jun 7, 2014 2:11 PM, "Marek Wiewiorka" wrote:

> I was also thinking of using tachyon to store parquet files - maybe
> tomorrow I will give it a try as well.
>
>
> 2014-06-07 20:01 GMT+02:00 Michael Armbrust :
>
>> Not a stupid question!  I would like to be able to do this.  For now, you
>> might try writing the data to tachyon instead of HDFS.  This is untested,
>> though, so please report any issues you run into.
>>
>> Michael
>>
>>
>> On Fri, Jun 6, 2014 at 8:13 PM, Xu (Simon) Chen wrote:
>>
>>> This might be a stupid question... but it seems that saveAsParquetFile()
>>> writes everything back to HDFS. I am wondering if it is possible to cache
>>> parquet-format intermediate results in memory, thereby making spark
>>> sql queries faster.
>>>
>>> Thanks.
>>> -Simon
>>>
>>
>>
>


Re: cache spark sql parquet file in memory?

2014-06-07 Thread Marek Wiewiorka
I was also thinking of using tachyon to store parquet files - maybe
tomorrow I will give it a try as well.


2014-06-07 20:01 GMT+02:00 Michael Armbrust :

> Not a stupid question!  I would like to be able to do this.  For now, you
> might try writing the data to tachyon instead of HDFS.  This is untested,
> though, so please report any issues you run into.
>
> Michael
>
>
> On Fri, Jun 6, 2014 at 8:13 PM, Xu (Simon) Chen wrote:
>
>> This might be a stupid question... but it seems that saveAsParquetFile()
>> writes everything back to HDFS. I am wondering if it is possible to cache
>> parquet-format intermediate results in memory, thereby making spark
>> sql queries faster.
>>
>> Thanks.
>> -Simon
>>
>
>


Re: cache spark sql parquet file in memory?

2014-06-07 Thread Michael Armbrust
Not a stupid question!  I would like to be able to do this.  For now, you
might try writing the data to tachyon instead
of HDFS.  This is untested, though, so please report any issues you run into.

Michael
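
For anyone who wants to try the suggestion above, here is a minimal,
untested sketch. It assumes the Spark 1.0-era SQL API (SchemaRDD,
saveAsParquetFile, parquetFile, registerAsTable), a Tachyon master reachable
at the hypothetical address tachyon-master:19998, and the Tachyon client jar
on the Spark classpath. The Event case class, the path, and the table name
are made up for illustration and do not come from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record type, for illustration only.
case class Event(id: Int, label: String)

object ParquetOnTachyon {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("parquet-on-tachyon"))
    val sqlContext = new SQLContext(sc)
    // Implicit conversion from an RDD of case classes to a SchemaRDD.
    import sqlContext.createSchemaRDD

    // Assumed Tachyon master address and output path; replace with your own.
    val path = "tachyon://tachyon-master:19998/tmp/events.parquet"

    // Write intermediate results as Parquet to Tachyon instead of HDFS.
    val events = sc.parallelize(1 to 1000).map(i => Event(i, "label_" + i))
    events.saveAsParquetFile(path)

    // Read the Parquet data back from Tachyon and query it with Spark SQL.
    val loaded = sqlContext.parquetFile(path)
    loaded.registerAsTable("events")
    val rows = sqlContext.sql("SELECT COUNT(*) FROM events").collect()
    println("rows read back: " + rows.head.getLong(0))

    sc.stop()
  }
}
```

The only change from the HDFS case is the URI scheme passed to
saveAsParquetFile() and parquetFile(); whether this actually speeds up
queries is exactly what remains to be tested.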


On Fri, Jun 6, 2014 at 8:13 PM, Xu (Simon) Chen wrote:

> This might be a stupid question... but it seems that saveAsParquetFile()
> writes everything back to HDFS. I am wondering if it is possible to cache
> parquet-format intermediate results in memory, thereby making spark
> sql queries faster.
>
> Thanks.
> -Simon
>


cache spark sql parquet file in memory?

2014-06-06 Thread Xu (Simon) Chen
This might be a stupid question... but it seems that saveAsParquetFile()
writes everything back to HDFS. I am wondering if it is possible to cache
parquet-format intermediate results in memory, thereby making spark
sql queries faster.

Thanks.
-Simon