Hi Mike,
FYI: If you are using Spark 2.x, you might have issues with encoders if you
use a case class with an Enumeration-type field; see
https://issues.apache.org/jira/browse/SPARK-17248
For (1) and (2), I would guess Int would be better (space-wise), but I am not
familiar with Parquet's internals.
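
One way to sidestep the encoder issue is to keep the Enumeration out of the
stored case class and convert at the boundary. This is only a minimal sketch
in plain Scala (no Spark calls): VehicleType, VehicleRecord, fromEnum and
toEnum are hypothetical names I made up for illustration, and I am assuming
the String name is what gets stored (an Int id via `.id` would work the same
way).

```scala
// Hypothetical enum mirroring the example values Car, SUV, Wagon.
object VehicleType extends Enumeration {
  val Car, SUV, Wagon = Value
}

// Storage-facing record: the enum is held as a plain String, so the case
// class contains no Enumeration-typed field.
final case class VehicleRecord(id: Long, vehicleType: String)

object VehicleRecord {
  // Convert the enum value to its name when building the record.
  def fromEnum(id: Long, t: VehicleType.Value): VehicleRecord =
    VehicleRecord(id, t.toString)

  // Convert back when reading; None if the stored name is not a known value.
  def toEnum(r: VehicleRecord): Option[VehicleType.Value] =
    VehicleType.values.find(_.toString == r.vehicleType)
}
```

In spark-shell, the case class and its companion object need to be entered
together with :paste.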
Hi Spark Users,
I want to store an Enum type (such as Vehicle Type: Car, SUV, Wagon) in my
data. My storage format will be Parquet and I need to access the data from
spark-shell, the Spark SQL CLI, and Hive. My questions:
1) Should I store my Enum type as String or store it as numeric encoding
(aka