[ 
https://issues.apache.org/jira/browse/DRILL-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

achyut09 updated DRILL-8496:
----------------------------
    Description: 
I have the following CSV file:

 
{code:java}
"id"^"first_name"^"last_name"^"email"^"gender"
"1"^"John"^"143 \\"^"[email protected]"^"Male"
"2"^"Willaim"^"Khan"^"[email protected]"^"Male"{code}
When I run a Drill query (SELECT *
FROM dfs.`C:\Users\achyu\Documents\dir2`),
I get the following error:
{code:java}
UserRemoteException :  DATA_READ ERROR: Unexpected character '101' following 
quoted value of CSV field. Expecting '94'. Cannot parse CSV input." {code}
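As a side note, the numbers in the error message appear to be decimal character codes rather than literal characters. A minimal sketch for decoding them (the variable names are only illustrative):

```python
# Decode the decimal character codes quoted in the error message.
unexpected = chr(101)  # the character the reader found after the closing quote
expected = chr(94)     # the character the reader wanted to see instead

print(f"unexpected={unexpected!r} expected={expected!r}")
```

Code 94 is the caret, which matches the "fieldDelimiter" configured for this file, so the reader seems to have closed a quoted field and then found an ordinary letter where it expected the delimiter.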
This is my dfs configuration for CSV in Apache Drill. I am using version 
1.21.1:

"csv": { "type": "text", "extensions": [ "csv" ], "lineDelimiter": "\n", 
"fieldDelimiter": "^", "quote": "\"", "escape": "\\", "comment": "#", 
"extractHeader": true }
 
It turns out the failure is caused by this particular portion: "143 \\"
In this CSV, 143 \\ is part of the data, and the backslash is not meant as an 
escape character. But because it comes immediately before the closing quote, 
parsing fails. If I put a space between \\ and the quote, it works fine.
I think this is a bug.
Is there any way to escape the escape character before the quote, or any 
other workaround?
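To illustrate the suspected cause, here is a simplified model of a quoted-field scanner. It is not Drill's actual reader: the function name split_quoted_line, its rules (an escape character immediately followed by a quote yields a literal quote), and the placeholder email address are all assumptions made for this sketch.

```python
def split_quoted_line(line, delimiter="^", quote='"', escape="\\"):
    """Split one line of fully quoted fields (simplified model, NOT Drill's reader)."""
    fields, i, n = [], 0, len(line)
    while i < n:
        if line[i] != quote:
            raise ValueError(f"Expected opening quote at offset {i}")
        i += 1
        value = []
        while True:
            if i >= n:
                raise ValueError("Unterminated quoted field")
            c = line[i]
            if c == escape and i + 1 < n and line[i + 1] == quote:
                value.append(quote)  # escape + quote -> literal quote in the data
                i += 2
            elif c == quote:
                i += 1               # closing quote of the field
                break
            else:
                value.append(c)
                i += 1
        fields.append("".join(value))
        if i < n:
            if line[i] != delimiter:
                raise ValueError(
                    f"Unexpected character '{ord(line[i])}' following quoted value. "
                    f"Expecting '{ord(delimiter)}'."
                )
            i += 1
    return fields


# Hypothetical row: the email address is a placeholder, not the reporter's data.
ROW = r'"1"^"John"^"143 \\"^"example@example.com"^"Male"'

try:
    split_quoted_line(ROW)  # escape is a backslash, as in the reported config
except ValueError as err:
    print("fails:", err)

# Same row, with the quote character itself acting as the escape character.
print(split_quoted_line(ROW, escape='"'))
```

In this model, the backslash directly before the closing quote turns that quote into data, so the field ends one quote later and the next character is not the delimiter. With a space between the backslash and the quote, or with an escape character that never precedes a quote in the data, the same row splits into five fields. Whether changing "escape" in the Drill format configuration behaves the same way on 1.21.1 is untested here.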

 



> Drill query fails when the escape character (which is part of the data) is 
> just before the quote
> -----------------------------------------------------------------------------------------------
>
>                 Key: DRILL-8496
>                 URL: https://issues.apache.org/jira/browse/DRILL-8496
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.21.1
>            Reporter: achyut09
>            Priority: Critical
>              Labels: Drill
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
