elgabbas commented on issue #34291:
URL: https://github.com/apache/arrow/issues/34291#issuecomment-1440005031

   Thanks @eitsupi 
   
   As I mentioned in my previous message, the problem seems to be caused by an extra, unnecessary quotation mark.
   If I remove it manually (second example below: `Arrow_parse_Example5.txt`), I can load the data.
   
   ```
   # This fails
   Occ <- read_delim_arrow(
     file = "https://github.com/apache/arrow/files/10804095/Arrow_parse_Example4.txt",
     delim = "\t"
   )
   # Error in `read_delim_arrow()`: ! Invalid: CSV parse error: Row #3: Expected 3 columns, got 2: 2417934775   "TEXT1 ""Quoted"" TEXT2 49.6275
   
   # This works
   Occ <- read_delim_arrow(
     file = "https://github.com/apache/arrow/files/10804096/Arrow_parse_Example5.txt",
     delim = "\t"
   )
   ```
   The only difference between the two files is that the extra double quotation mark has been removed.
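
   To illustrate why the parser stops at row 3, here is a minimal base R sketch (no arrow required). The three lines are a hypothetical reconstruction of the file contents, pieced together from the error message and the printed output in this comment; they are not the actual attachment, and `lines`, `n_quotes`, and `unbalanced` are illustrative names. The sketch counts the double-quote characters on each line; an odd count suggests a quote that is opened but never closed.

   ```r
   # Hypothetical tab-delimited sample (assumption: reconstructed from the
   # error message and printed output, not copied from the attached file).
   lines <- c(
     "V1\tV2\tV3",
     "2417934775\tTEXT1\"\"NoQuoted\"\" TEXT2\t49.6275",
     "2417934775\t\"TEXT1 \"\"Quoted\"\" TEXT2\t49.6275"
   )

   # Count the double-quote characters on each line.
   n_quotes <- lengths(regmatches(lines, gregexpr("\"", lines, fixed = TRUE)))

   # Lines with an odd number of quotes contain an unmatched quote.
   unbalanced <- which(n_quotes %% 2 == 1)
   print(unbalanced)
   ```

   With this sample, only the third line has an odd count, which is consistent with the `Row #3` reported in the error message.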
   
   Using `quote = ""`, I was able to work around this specific issue, but this is what the data look like now (which is not neat!):
   ```
   Occ <- read_delim_arrow(
     file = "https://github.com/apache/arrow/files/10804095/Arrow_parse_Example4.txt",
     delim = "\t",
     quote = ""
   )
   # A tibble: 2 × 3
   # V1 V2                                V3
   # 2417934775 "TEXT1\"\"NoQuoted\"\" TEXT2"   49.6
   # 2417934775 "\"TEXT1 \"\"Quoted\"\" TEXT2"  49.6
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
