I get the following error every time I run a query on a large data set. I think using MEMORY_AND_DISK could avoid this problem under limited resources.

"15/10/23 17:37:13 Reporter WARN org.apache.spark.deploy.yarn.YarnAllocator>> Container killed by YARN for exceeding memory limits. 7.6 GB of 7.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead."
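If I understand Xuefu's earlier reply correctly, any Spark property can be set from the Hive session, so for the error above I could try something like the following (the values here are only my guess, not tuned for our cluster):

    -- raise the off-heap overhead YARN accounts for (the warning suggests this knob)
    set spark.yarn.executor.memoryOverhead=1024;
    -- keep executor heap plus overhead within the container limit
    set spark.executor.memory=6g;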
2015-10-23 19:40 GMT+08:00 Xuefu Zhang <xzh...@cloudera.com>:

> Yeah, for that, you cannot really cache anything through Hive on Spark.
> Could you detail more what you want to achieve?
>
> When needed, Hive on Spark uses memory+disk for the storage level.
>
> On Fri, Oct 23, 2015 at 4:29 AM, Jone Zhang <joyoungzh...@gmail.com> wrote:
>
>> 1. But there is no way to set the Storage Level through a properties file in Spark;
>> Spark provides the "def persist(newLevel: StorageLevel)" API only...
>>
>> 2015-10-23 19:03 GMT+08:00 Xuefu Zhang <xzh...@cloudera.com>:
>>
>>> Quick answers:
>>> 1. You can pretty much set any Spark configuration in Hive using the set command.
>>> 2. No, you have to make the call.
>>>
>>> On Thu, Oct 22, 2015 at 10:32 PM, Jone Zhang <joyoungzh...@gmail.com> wrote:
>>>
>>>> 1. How can I set the Storage Level when I use Hive on Spark?
>>>> 2. Does Spark have any intention of dynamically choosing between Hive on
>>>> MapReduce and Hive on Spark, based on SQL features?
>>>>
>>>> Thanks in advance
>>>> Best regards
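P.S. On the Storage Level point from the earlier messages: as far as I can tell, the level is chosen per RDD in application code rather than through a properties file, roughly like this (a sketch only, not how Hive on Spark is wired internally; "rdd" is a placeholder for whatever RDD is built):

    import org.apache.spark.storage.StorageLevel

    // The storage level is a per-RDD call in the Spark API, not a config key.
    val cached = rdd.persist(StorageLevel.MEMORY_AND_DISK)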