[ 
https://issues.apache.org/jira/browse/DRILL-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277803#comment-15277803
 ] 

Arina Ielchiieva commented on DRILL-3149:
-----------------------------------------

After the fix:
1. Multibyte line delimiters will be supported.
For example, to treat "\r\n" as the line delimiter, either update the storage 
plugin by adding "lineDelimiter": "\r\n", or use a select with options query:
{code} select * from table(dfs.`my_table`(type=>'text', 
lineDelimiter=>'\r\n')){code}
Note that *\n* is still treated as a standard delimiter, so if the file 
contains newlines, records will also be split on them even when lineDelimiter 
is overridden.
Example:
Data set:
{noformat}a|||b\nc|||d{noformat}
Select:
{code}select * from table(dfs.`my_table`(type=>'text', 
lineDelimiter=>'|||')){code}
Result:
{noformat}
a
b
c
d
{noformat}
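
The splitting rule in the example above can be emulated outside Drill. The following is only a minimal Python sketch of the described behaviour, not the TextReader implementation; the function name split_records is hypothetical, and how Drill handles empty records is not covered here.

```python
import re

def split_records(data, line_delimiter):
    """Split data on the custom delimiter and also on a plain newline.

    Illustration only: mimics the behaviour described in the comment,
    where "\n" keeps splitting records even if lineDelimiter is overridden.
    """
    # The custom delimiter goes first so that a delimiter containing "\n"
    # (for example "\r\n") is matched as a whole before the bare newline.
    pattern = "|".join(re.escape(d) for d in (line_delimiter, "\n"))
    return re.split(pattern, data)

if __name__ == "__main__":
    for record in split_records("a|||b\nc|||d", "|||"):
        print(record)
```

With the data set shown above, the sketch yields the four records a, b, c and d, matching the result listed in the example.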
2. Select with options will honor Java character literals (e.g. \r, \n, \t). 
Queries that use them will work correctly:
{code} select * from table(dfs.`my_table`(type=>'text', 
lineDelimiter=>'\r\n', fieldDelimiter=>'\t')){code}
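
To illustrate what honoring character literals means for an option value, here is a minimal Python sketch that turns the two-character sequences \r, \n and \t (as typed in the query) into the corresponding control characters. The names ESCAPES and unescape are hypothetical, only a few escapes are handled, and this is an assumption about the general idea rather than a description of how Drill does it.

```python
# Hypothetical subset of escape sequences, for illustration only.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\"}

def unescape(text):
    """Replace backslash escapes such as \\r, \\n, \\t with control characters."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in ESCAPES:
            # Known escape: emit the control character and skip both input chars.
            out.append(ESCAPES[text[i + 1]])
            i += 2
        else:
            # Anything else is copied through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)

if __name__ == "__main__":
    # The raw string holds a backslash followed by a letter, as typed in the query.
    print(repr(unescape(r"\r\n")))
```

Values without escapes, such as '|||', pass through unchanged.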

> TextReader should support multibyte line delimiters
> ---------------------------------------------------
>
>                 Key: DRILL-3149
>                 URL: https://issues.apache.org/jira/browse/DRILL-3149
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Text & CSV
>    Affects Versions: 1.0.0, 1.1.0
>            Reporter: Jim Scott
>            Assignee: Arina Ielchiieva
>            Priority: Minor
>             Fix For: Future
>
>
> lineDelimiter in the TextFormatConfig doesn't support \r\n for record 
> delimiters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
