We're testing storing HyperLogLog synopses in Kudu. It works well in Spark, but the hope is to also query them through Impala with a UDF. Spark would remain the writer, with Impala read-only. To work with Impala, I'm wondering whether it's best to define the HLL data as a Kudu string type with plain encoding, or whether it's possible to keep it as binary but declare it as string in an external table definition. I presume the latter is not possible, since Kudu's generated external table script does not do this. Please forgive me for not running my own experiments first; I figured someone here has run up against this before, and if so, please let me know!
-Cliff
