Re: BINARY column type

2012-12-01 Thread John Omernik
No, I didn't remove any newline characters; each newline became 0A. Using Perl or Python in a transform, if I had "Hi how are you\n" it would become 486920686f772061726520796f750A. From there it would pass that to the unhex() function in Hive in the insert statement. That allowed me to move the da
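
The hex step described above can be sketched in Python as follows. This is only an illustration, not the script used in the thread: the helper names `to_hex` and `from_hex` are hypothetical, and `from_hex` merely stands in locally for the decoding that unhex() performs on the Hive side.

```python
def to_hex(data: bytes) -> str:
    """Hex-encode raw bytes; the result holds only the characters 0-9 and a-f."""
    return data.hex()


def from_hex(text: str) -> bytes:
    """Decode a hex string back to bytes (local stand-in for Hive's unhex())."""
    return bytes.fromhex(text)


if __name__ == "__main__":
    sample = b"Hi how are you\n"
    encoded = to_hex(sample)
    print(encoded)

    # The newline survives as the two characters "0a", not as a raw byte.
    assert "\n" not in encoded
    assert encoded.endswith("0a")

    # Decoding restores the original bytes, newline included.
    assert from_hex(encoded) == sample
```

Because the encoded text contains no raw newline, it can travel through line-oriented steps without a newline inside the data being mistaken for a row boundary.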

RE: BINARY column type

2012-12-01 Thread Connell, Chuck
Thanks John. When you say "hexed" data, do you mean binary encoded to ASCII hex? This would remove the raw newline characters. We considered Base64 encoding our data, a similar idea, which would also remove raw newlines. But my preference is to put real binary data into Hive, and find a way to
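
For comparison, a small Python sketch of the two encodings mentioned here, hex and Base64, applied to a payload that contains newline bytes. The function name `encode_both` and the sample payload are made up for illustration; nothing here reflects the actual data discussed in the thread.

```python
import base64
from typing import Tuple


def encode_both(data: bytes) -> Tuple[str, str]:
    """Return (hex, Base64) text encodings of the same bytes."""
    hex_text = data.hex()
    # b64encode emits a single line; it does not insert line breaks.
    b64_text = base64.b64encode(data).decode("ascii")
    return hex_text, b64_text


if __name__ == "__main__":
    # Hypothetical binary payload with two embedded newline bytes (0x0A).
    payload = bytes([0x00, 0x0A, 0xFF, 0x0A, 0x10])
    hex_text, b64_text = encode_both(payload)

    # Neither encoding contains a raw newline character.
    assert "\n" not in hex_text
    assert "\n" not in b64_text

    # Hex doubles the size; Base64 uses 4 characters per 3 input bytes.
    assert len(hex_text) == 2 * len(payload)
    assert len(b64_text) == 4 * ((len(payload) + 2) // 3)

    # Both round-trip back to the original bytes.
    assert bytes.fromhex(hex_text) == payload
    assert base64.b64decode(b64_text) == payload
```

Either encoding avoids raw newlines; the trade-off is size, with Base64 being the more compact of the two.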

Re: BINARY column type

2012-12-01 Thread John Omernik
Hi Chuck - I've used binary columns with newlines in the data. I used RCFile format for my storage method. Works great so far. Whether or not this is "the" way to get data in, I use hexed data (my transform script outputs hex encoded) and the final insert into the table gets an unhex(sourcedata).

BINARY column type

2012-12-01 Thread Connell, Chuck
I am trying to use BINARY columns and believe I have the perfect use-case for it, but I am missing something. Has anyone used this for true binary data (which may contain newlines)? Here is the background... I have some files that each contain just one logical field, which is a binary object.