Re: Spark dataset cache vs tempview

2016-11-06 Thread Mich Talebzadeh
With regard to temp tables: createOrReplaceTempView is backed by an in-memory hash table that maps a table name (a string) to a logical query plan. Fragments of that logical query plan may or may not be cached. However, registering the view alone will not result in any materialization of results.
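
To make the distinction concrete, here is a minimal toy model in plain Python. It is not Spark code and does not use the Spark API: the names LazyPlan, Catalog and create_or_replace_temp_view are hypothetical stand-ins invented for illustration. It only sketches the behaviour described above, assuming a view registry that is a plain dict from name to plan, and a cache flag that takes effect the first time the plan is actually executed.

```python
class LazyPlan:
    """Toy stand-in for a logical query plan: a recipe for rows, not the rows."""

    def __init__(self, compute):
        self._compute = compute        # function that produces the rows
        self._cached = None            # materialized rows, if any
        self.cache_requested = False   # set by cache(); nothing runs yet
        self.runs = 0                  # how many times the plan was executed

    def cache(self):
        # Only marks the plan as cacheable; no computation happens here.
        self.cache_requested = True
        return self

    def collect(self):
        # Serve from the cache when rows were materialized earlier.
        if self._cached is not None:
            return self._cached
        self.runs += 1
        rows = self._compute()
        if self.cache_requested:
            self._cached = rows
        return rows


class Catalog:
    """Toy view registry: a hash table from view name to plan."""

    def __init__(self):
        self._views = {}

    def create_or_replace_temp_view(self, name, plan):
        # Registration stores a reference to the plan and nothing else.
        self._views[name] = plan

    def table(self, name):
        return self._views[name]


catalog = Catalog()
plan = LazyPlan(lambda: [1, 2, 3])  # hypothetical data source

catalog.create_or_replace_temp_view("events", plan)
runs_after_register = plan.runs     # 0: registering materialized nothing

catalog.table("events").collect()
catalog.table("events").collect()
runs_without_cache = plan.runs      # 2: each read re-executed the plan

plan.cache()
catalog.table("events").collect()   # executes once and stores the rows
catalog.table("events").collect()   # served from the stored rows
runs_with_cache = plan.runs         # 3: only one more execution
```

In this sketch the view name is just another handle on the same plan, so whether repeated reads are cheap depends on whether the plan behind the name was cached, not on the registration itself.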

Spark dataset cache vs tempview

2016-11-05 Thread Rohit Verma
I have a parquet file which I am reading at least 4-5 times within my application. I was wondering what the most efficient thing to do is. Option 1: While writing the parquet file, immediately read it back into a dataset and call cache. I am assuming that by doing an immediate read I might use some existing