This …works!
I'm quite surprised, though: per the steps I outlined, the issue manifested even
without CTAS (with a regular SELECT).
I still don't see how that could be related… or are those two separate issues?
Also, maybe you know: is there any way to make it work for TextFile?
Thank you,
Maciek
On Tue, Oct 7, 2014 at 7:13 AM, Navis류승우 navis@nexr.com wrote:
Try with set hive.default.fileformat=SequenceFile;
Thanks,
Navis
2014-10-06 20:51 GMT+09:00 Maciek mac...@sonra.io:
Hello,
I've encountered a situation where printing newlines corrupts
(multiplies the rows of) the returned dataset.
This seems to be similar to HIVE-3012
https://issues.apache.org/jira/browse/HIVE-3012 (fixed in 0.11), but
I'm on Hive 0.13 and it still happens.
Here are the steps to illustrate/reproduce:
1. First, let's create a table with one row and one column by selecting from
any existing table (substitute ANYTABLE accordingly):
CREATE TABLE singlerow AS SELECT 'worldofhostels' wordsmerged FROM
ANYTABLE LIMIT 1;
and verify:
SELECT * FROM singlerow;
OK
worldofhostels
Time taken: 0.028 seconds, Fetched: 1 row(s)
All good so far.
2. Now let's introduce newlines with:
SELECT regexp_replace(wordsmerged,'of','\nof\n') wordsseparate FROM
singlerow;
OK
world
of
hostels
Time taken: 6.404 seconds, Fetched: 3 row(s)
and I'm suddenly getting 3 rows now.
3. This is not limited to CLI output: when submitting a CTAS, it
materializes the same corrupted result set:
CREATE TABLE corrupted AS
SELECT regexp_replace(wordsmerged,'of','\nof\n') wordsseparate,
wordsmerged FROM singlerow;
hive> select * from corrupted;
OK
world NULL
of NULL
hostels worldofhostels
Time taken: 0.029 seconds, Fetched: 3 row(s)
Apparently the same thing happens: the new table is split into multiple rows,
and the columns following the one in question (like wordsmerged) become NULL.
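
The symptom above is consistent with how a newline-delimited text format behaves when a value itself contains a newline. The following is only a minimal Python sketch of that mechanism, not Hive's actual code: it assumes rows are terminated by "\n", fields are separated by Ctrl-A ("\x01", Hive's default TextFile field delimiter), and nothing is escaped on write. The helper names write_rows and read_rows are hypothetical.

```python
import re

FIELD_DELIM = "\x01"  # assumed field delimiter (Hive's TextFile default, Ctrl-A)
ROW_DELIM = "\n"      # assumed row delimiter for a line-oriented text format


def write_rows(rows):
    """Serialize rows as delimited text without escaping embedded newlines."""
    return "".join(FIELD_DELIM.join(row) + ROW_DELIM for row in rows)


def read_rows(data, num_cols):
    """Split on the row delimiter first, then on the field delimiter.

    Rows with too few fields are padded with None, standing in for NULL.
    """
    rows = []
    # The last piece after the final row delimiter is empty, so drop it.
    for line in data.split(ROW_DELIM)[:-1]:
        fields = line.split(FIELD_DELIM)
        fields += [None] * (num_cols - len(fields))
        rows.append(tuple(fields[:num_cols]))
    return rows


wordsmerged = "worldofhostels"
# Same transformation as regexp_replace(wordsmerged, 'of', '\nof\n')
wordsseparate = re.sub("of", "\nof\n", wordsmerged)

# One logical row goes in...
stored = write_rows([(wordsseparate, wordsmerged)])
# ...but the reader sees three physical lines.
result = read_rows(stored, num_cols=2)

for row in result:
    print(row)
```

Under these assumptions, the single logical row is read back as three rows, and only the last one still carries the second column. A container format that does not rely on line breaks to separate records would not split the value this way.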
Am I doing something wrong here?
Regards,
Maciek
--
Kind Regards
Maciek Kocon