Once I get "hive.execution.engine=spark" working, how would I go about loading portions of my data into memory? Let's say I have a 100TB database and want to load all of last week's data into Spark memory — is this possible, or even beneficial? Or am I thinking about Hive on Spark in the wrong way?
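To make that concrete, the kind of query I'd want to serve quickly looks roughly like this (a sketch in HiveQL; the table `events` and its daily date partition column `dt` are hypothetical):

```sql
-- Sketch only: table/column names are made up for illustration.
-- Switch Hive's execution engine to Spark for this session.
SET hive.execution.engine=spark;

-- "Last week's data": partition pruning keeps the scan to ~7 daily partitions
-- instead of the full 100TB table.
SELECT *
FROM events
WHERE dt >= date_sub(current_date(), 7);
```

The part I'm unsure about is whether the data read by a query like this can be kept resident in the Spark executors' memory between queries, or whether each Hive query starts from disk again.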
I also assume Hive on Spark could get me near-real-time performance on large queries. Is this true?