Hi All,
I'm using Spark/Shark as the foundation for some reporting work, and I have
a customers table of approximately 3 million rows that I've cached in
memory.
I've also created a table partitioned on a per-day basis, which I've
likewise cached in memory, loaded with a query along the lines of:
FROM customers_cached
INSERT
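(The original query above is truncated. For context, the caching and per-day partitioning described might look roughly like the following in Shark's HiveQL dialect; this is a sketch only, and every name other than customers_cached, including the partition column ds, is an assumption.)

```sql
-- Shark caches tables whose names end in the _cached suffix
CREATE TABLE customers_cached AS SELECT * FROM customers;

-- a destination table partitioned per day (ds is an assumed column name)
CREATE TABLE customers_daily (id INT, name STRING)
PARTITIONED BY (ds STRING);

-- Hive-style multi-insert: scan customers_cached once,
-- write one partition per day of interest
FROM customers_cached
INSERT OVERWRITE TABLE customers_daily PARTITION (ds = '2014-06-01')
SELECT id, name WHERE signup_date = '2014-06-01';
```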
Shark is no longer officially supported, so you are better off moving to
Spark SQL.
Shark doesn't support Hive's partitioning logic anyway; it has its own form
of partitioning on in-memory blocks, which is independent of whether or not
you partition your data in Hive.
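For comparison, caching in Spark SQL is explicit rather than driven by a table-name convention. A minimal sketch (the table name is assumed):

```sql
-- Spark SQL: cache the table in memory explicitly,
-- independent of any Hive partitioning scheme
CACHE TABLE customers;

-- subsequent queries read from the in-memory copy
SELECT COUNT(*) FROM customers;

-- release the memory when finished
UNCACHE TABLE customers;
```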
Mayur