Is cacheTable similar to registerTempTable from before?
> On 19 Jan, 2016, at 4:18 am, George Sigletos wrote:
>
> Thanks Kevin for your reply.
>
> I was suspecting the same thing as well, although it still does not make much
> sense to me why you would need
According to the documentation they are exactly the same, but in my queries
dataFrame.cache()
results in much faster execution times vs doing
sqlContext.cacheTable("tableName")
Is there any explanation for this? I am not caching the RDD prior to
creating the DataFrame. I am using PySpark on Spark
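For reference, the two caching calls being compared above can be sketched as follows. This is a minimal sketch against the Spark 1.x SQLContext API; the table name "myData" and the input path are hypothetical, not from the original thread.

```scala
import org.apache.spark.sql.SQLContext

// assumes an existing SparkContext `sc`
val sqlContext = new SQLContext(sc)

// hypothetical input; any DataFrame source works here
val df = sqlContext.read.json("hdfs://somepath/people.json")

// Option 1: cache the DataFrame object directly.
// cache() is lazy; the data is materialized on the first action.
df.cache()
df.count()

// Option 2: register the DataFrame as a temp table and cache it by name,
// so that SQL queries against "myData" read from the in-memory columnar cache.
df.registerTempTable("myData")
sqlContext.cacheTable("myData")
```

According to the Spark documentation both paths use the same in-memory columnar storage, which is why the observed difference in execution times is surprising.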
Hi George,
I believe that sqlContext.cacheTable("tableName") is meant for caching data
that is queried through Spark SQL, i.e. a registered table. For example,
take a look at the code below.
> val myData = sqlContext.load("com.databricks.spark.csv", Map("path" ->
> "hdfs://somepath/file",