No, I didn't remove any newline characters; each newline became 0A. Using
Perl or Python in a transform, if I had "Hi how are you\n" it would
become 486920686f772061726520796f750A.
From there it would pass that to the unhex() function in Hive in the insert
statement. That allowed me to move the da
Thanks John. When you say "hexed" data, do you mean binary encoded to ASCII
hex? This would remove the raw newline characters.
We considered Base64 encoding our data, a similar idea, which would also remove
raw newlines. But my preference is to put real binary data into Hive, and find
a way to
Hi Chuck -
I've used binary columns with newlines in the data. I used RCFile format
for my storage method. Works great so far. Whether or not this is "the" way
to get data in, I use hexed data (my transform script outputs hex encoded)
and the final insert into the table gets an unhex(sourcedata).
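
For illustration only, here is a minimal Python sketch of the kind of
hex-encoding transform described above. It is not John's actual script. The
function names (hex_encode_record, hex_decode_record, transform) are
hypothetical, and it assumes the whole input is one binary object that can be
read into memory. The decode helper only mirrors, on the Python side, the
role that unhex() plays in the Hive insert.

```python
import binascii


def hex_encode_record(raw: bytes) -> str:
    """Return the ASCII hex form of raw bytes, so no raw newline survives."""
    return binascii.hexlify(raw).decode("ascii")


def hex_decode_record(text: str) -> bytes:
    """Turn ASCII hex back into the original bytes (round-trip check only)."""
    return binascii.unhexlify(text.strip())


def transform(in_stream, out_stream) -> None:
    """Read one binary object from in_stream and write one hex line.

    Assumption: in_stream yields bytes (e.g. sys.stdin.buffer) and
    out_stream accepts str (e.g. sys.stdout).
    """
    raw = in_stream.read()
    out_stream.write(hex_encode_record(raw) + "\n")


# Small demonstration with the sample string from the thread.
sample = b"Hi how are you\n"
encoded = hex_encode_record(sample)
print(encoded)
```

Used as a streaming script, the sketch would be called as
transform(sys.stdin.buffer, sys.stdout) after importing sys. The only newline
in the output is then the record terminator, and the embedded newline byte
travels as the two characters "0a".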
I am trying to use BINARY columns and believe I have the perfect use-case for
it, but I am missing something. Has anyone used this for true binary data
(which may contain newlines)?
Here is the background... I have some files that each contain just one logical
field, which is a binary object.