Impala doesn't support Kudu binary columns; see https://issues.apache.org/jira/browse/IMPALA-5323. With Impala/Kudu, using a string column should work fine. Impala itself uses strings internally to store HLL intermediates for stats computation.
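
A minimal sketch of one way to move HLL bytes through a string column, under assumptions that are not from this thread: Kudu documents its string type as UTF-8 encoded, so arbitrary sketch bytes may not be safe to store as-is, and Base64 is used here only as one text-safe option. The class name `HllStringCodec` and its methods are hypothetical, and the sketch bytes are assumed to come from whatever HLL library the writer already uses.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper: converts serialized HLL sketch bytes to and from
// text that is safe to store in a string column.
public class HllStringCodec {

    // Encode raw sketch bytes as Base64 text (plain ASCII, hence valid UTF-8).
    public static String encode(byte[] sketchBytes) {
        return Base64.getEncoder().encodeToString(sketchBytes);
    }

    // Decode the stored column value back into the original sketch bytes.
    public static byte[] decode(String columnValue) {
        return Base64.getDecoder().decode(columnValue);
    }

    public static void main(String[] args) {
        // Stand-in bytes; a real sketch would come from the HLL library.
        byte[] raw = {0x00, (byte) 0xFF, 0x10, (byte) 0x80};
        String stored = encode(raw);
        // Check that the bytes survive the round trip through text.
        System.out.println(Arrays.equals(raw, decode(stored)));
    }
}
```

The writer would call `encode` before inserting, and a reader-side UDF would call `decode` before merging or estimating. The trade-off is roughly a one-third size increase over the raw bytes.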

On Sat, Dec 15, 2018 at 7:17 PM Cliff Resnick <[email protected]> wrote:
> We're doing some testing storing Hyperloglog synopsis in Kudu. It works
> well in spark, but the hope is to also query through Impala with a UDF.
> Spark would remain as the writer, with Impala read-only. To work with
> Impala I'm wondering if it's best to define the HLL data as Kudu string
> type with plain encoding, or perhaps it's possible to keep it as binary but
> declare it as string in an external table definition? I presume the latter
> is not possible since Kudu's generated external table script does not do
> this. Please forgive me for not conducting my own experimentation but I
> figured someone here has run up against this before, and if so please let
> me know!
>
> -Cliff
