how to cache table with OFF_HEAP storage level in SparkSQL thriftserver

2015-03-23 Thread LiuZeshan
Hi all,

I have a Spark-on-YARN cluster (spark-1.3.0, hadoop-2.2.0) with hive-0.12.0 and 
tachyon-0.6.1. I start the SparkSQL thriftserver with start-thriftserver.sh and 
connect to it with beeline, as described in the Spark documentation.


My question is: how can I cache a table with a specified storage level, such 
as OFF_HEAP?


I have dug through the Spark documentation and the spark-user mailing list, but 
did not find an answer.


If I run `cache table TABLENAME` at the beeline prompt, I see this on the 
monitoring UI (see attachment). 
I think the RDD is cached at the default storage level (MEMORY_ONLY), which is 
not what I want.

Thanks



OFF_HEAP storage level

2014-07-04 Thread Ajay Srivastava
Hi,
I was looking at the different storage levels of an RDD and found OFF_HEAP.
Has anybody used this level?
If I use this level, where will the data be stored? If it is not on the heap, 
does that mean we can avoid GC?
How can I use this level? I did not find anything in the archive about it.
Can someone also explain the behavior of the storage level NONE?


Regards,
Ajay


RE: OFF_HEAP storage level

2014-07-04 Thread Shao, Saisai
Hi Ajay,

StorageLevel OFF_HEAP means you can cache your RDD in Tachyon; the 
prerequisite is that Tachyon is deployed alongside Spark.

Yes, it can alleviate GC, since you offload the data from JVM memory into 
system-managed memory.

You can select this level with rdd.persist(...); for details, see 
BlockManager.scala, TachyonBlockManager and TachyonStore.
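
To make the rdd.persist(...) call concrete, here is a minimal sketch for Spark 
1.x. It assumes a Tachyon master is already running; the URL below is a 
placeholder for illustration, and the object name and sample data are 
hypothetical. The job is meant to be launched with spark-submit, which supplies 
the master.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical example object; not taken from the Spark sources.
object OffHeapPersistExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("OffHeapPersistExample")
      // Assumption: Tachyon master at this address; adjust to your deployment.
      .set("spark.tachyonStore.url", "tachyon://localhost:19998")
    val sc = new SparkContext(conf)

    // Sample data, only for illustration.
    val rdd = sc.parallelize(1 to 1000000).map(_ * 2)

    // Request off-heap storage; nothing is stored until an action runs.
    rdd.persist(StorageLevel.OFF_HEAP)

    // The first action computes the RDD and writes its blocks off-heap.
    println(rdd.count())

    // Inspect the level that was assigned to the RDD.
    println(rdd.getStorageLevel.useOffHeap)

    rdd.unpersist()
    sc.stop()
  }
}
```

Note that persist() only marks the RDD; the blocks are written when the first 
action (here count()) runs.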

StorageLevel NONE means the RDD will not be cached; if you want to use this 
RDD again, it has to be recomputed from the source.
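
A small sketch of the NONE behavior, assuming a local Spark 1.x installation 
(the object name and the local[2] master are only for illustration): an RDD 
that was never persisted reports StorageLevel.NONE, and every action on it 
re-runs its lineage.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical example object; not taken from the Spark sources.
object StorageLevelNoneExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("StorageLevelNoneExample")
      .setMaster("local[2]") // assumption: run locally for the demonstration
    val sc = new SparkContext(conf)

    val rdd = sc.parallelize(1 to 10).map(_ + 1)

    // An RDD that has not been persisted carries StorageLevel.NONE.
    println(rdd.getStorageLevel == StorageLevel.NONE)

    // Nothing is cached, so each action below recomputes the map step.
    println(rdd.count())
    println(rdd.count())

    sc.stop()
  }
}
```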

Thanks
Jerry
