[
https://issues.apache.org/jira/browse/HIVE-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072719#comment-13072719
]
Amareshwari Sriramadasu commented on HIVE-2303:
-----------------------------------------------
This problem occurs because FileSinkOperator generates a TableDesc with default
properties for storing the output. The solution is to escape the delimiters for
the output table.
Shouldn't escaping of delimiters always happen in LazySimpleSerDe?
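
To illustrate the failure mode, here is a minimal Python sketch. It assumes the
intermediate output is written with Control-A as the field delimiter (Hive's
default) and no escaping. The helper names `serialize` and `deserialize` and the
backslash escape character are hypothetical, chosen only for illustration; this
is a toy model, not the LazySimpleSerDe implementation.

```python
# Illustrative sketch only: a toy model of delimiter escaping, not Hive code.
FIELD_DELIM = "\x01"   # Control-A, Hive's default field delimiter
ESCAPE = "\\"          # escape character assumed for this sketch


def serialize(fields, escape=False):
    """Join fields with FIELD_DELIM, optionally escaping embedded delimiters."""
    if escape:
        fields = [
            f.replace(ESCAPE, ESCAPE + ESCAPE).replace(FIELD_DELIM, ESCAPE + FIELD_DELIM)
            for f in fields
        ]
    return FIELD_DELIM.join(fields)


def deserialize(line, escape=False):
    """Split a line on FIELD_DELIM, honoring the escape character if enabled."""
    if not escape:
        return line.split(FIELD_DELIM)
    fields, current, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == ESCAPE and i + 1 < len(line):
            current.append(line[i + 1])  # take the escaped character literally
            i += 2
        elif ch == FIELD_DELIM:
            fields.append("".join(current))  # unescaped delimiter ends the field
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


row = ["imp-1", "a\x01b\x02c"]  # msg contains Control-A and Control-B

# Without escaping, the embedded Control-A is read back as a field boundary.
broken = deserialize(serialize(row))

# With escaping on both the write and read side, the row round-trips intact.
fixed = deserialize(serialize(row, escape=True), escape=True)
```

In this sketch `broken` has three fields instead of two, while `fixed` equals
the original `row`. A plain `select *` without a map-reduce job reads the
tab-delimited source table directly, so it never passes through the
default-delimited intermediate output.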
> files with control-A,B are not delimited correctly.
> ---------------------------------------------------
>
> Key: HIVE-2303
> URL: https://issues.apache.org/jira/browse/HIVE-2303
> Project: Hive
> Issue Type: Bug
> Reporter: Amareshwari Sriramadasu
> Assignee: Amareshwari Sriramadasu
>
> The following is from one of our users:
>
> create external table impressions (imp string, msg string)
> row format delimited
> fields terminated by '\t'
> lines terminated by '\n'
> stored as textfile
> location '/xxx';
>
> Some strings in my data contain Control-A, Control-B, etc. as internal
> delimiters. If I do a
>
> Select * from impressions limit 10;
>
> All fields print correctly. However, if I do a
>
> Select * from impressions where msg regexp '.*' limit 10;
>
> The fields are broken apart at the control characters. The difference between
> the two commands is that the latter requires a map-reduce job.
>