RE: Bug in Drill 1.3 CSV - please confirm

Geercken, Uwe Wed, 25 Nov 2015 00:10:50 -0800

Abdel,

I sent the file to your email address, but I have already found the problem:


The drillbit runs on a Linux box. When I check the file

        file test.csv

the output is:

        test.csv: ASCII text, with CRLF line terminators

In this case the query:

        select col1,col2,col3  from dfs.datatransfer.`test.csv`

does not work.

If I do a:

        dos2unix test.csv       

The query does work properly!

So Drill does not properly recognize CRLF line breaks, which are standard on 
Windows systems.

Just for the sake of it, if I do the opposite:

        unix2dos test.csv

again it does not work.
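
A minimal sketch of one way this could happen, assuming the header line is split on LF only. This is a guess at the mechanism, not Drill's actual reader code, and the function names are hypothetical. It would explain why col1 and col2 resolve, why col3 does not, and why a trailing comma after "col3" hides the problem:

```python
# Hypothetical illustration only: this is NOT Drill's reader implementation.

def parse_header_lf_only(raw: bytes) -> list[str]:
    """Split the first line on LF only, as a CRLF-unaware reader might."""
    first_line = raw.split(b"\n", 1)[0].decode("ascii")
    return first_line.split(",")


def parse_header_crlf_aware(raw: bytes) -> list[str]:
    """Same as above, but drop a trailing CR before splitting the columns."""
    first_line = raw.split(b"\n", 1)[0].decode("ascii").rstrip("\r")
    return first_line.split(",")


# The test file from this thread, with CRLF line terminators.
crlf_file = b"col1,col2,col3\r\ngeercken,uwe,22\r\nkarlson,peter,33\r\n"
# The same header with the extra trailing comma that made the query work.
crlf_file_trailing_comma = b"col1,col2,col3,\r\ngeercken,uwe,22,\r\n"

naive = parse_header_lf_only(crlf_file)
naive_comma = parse_header_lf_only(crlf_file_trailing_comma)
fixed = parse_header_crlf_aware(crlf_file)

print(naive)                   # the last name keeps the carriage return
print("col3" in naive)         # lookup of col3 fails
print("col3" in naive_comma)   # the CR ends up in an extra, unnamed column
print("col3" in fixed)         # stripping the CR restores the name
```

If this is what happens, the last header name is stored as "col3" plus a carriage return, so a lookup of col3 finds nothing.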

Should I file a bug?

Regards,

Uwe


-----Original Message-----
From: Abdel Hakim Deneche [mailto:adene...@maprtech.com] 
Sent: Dienstag, 24. November 2015 18:12
To: user
Subject: Re: Bug in Drill 1.3 CSV - please confirm

Hi Uwe,

I couldn't reproduce the issue using the 1.3 release. Can you send the dummy 
test file you created to my email address? (You can't send attachments to an 
Apache mailing list.)

Thanks

On Tue, Nov 24, 2015 at 3:03 AM, Geercken, Uwe <uwe.geerc...@swissport.com>
wrote:

> I have downloaded 1.3 and made a quick test of the new extractHeader 
> feature for text files.
>
> So I updated the storage details and created a dummy test file:
>
> col1,col2,col3
> geercken,uwe,22
> karlson,peter,33
>
>
> When I query the data with this: select * from 
> dfs.datatransfer.`test.csv` - it works.
>
> When I query the data with this: select col1,col2 from 
> dfs.datatransfer.`test.csv` - it works.
>
> When I query the data with this: select col1,col2,col3 from 
> dfs.datatransfer.`test.csv` - it gives me an exception:
>
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> ArrayIndexOutOfBoundsException: -1 Fragment 0:0
>
>
> I figured out that if I add a comma (,) after "col3" in the header, it 
> works. So apparently the process does not recognize the last column of the 
> header.
>
> If I set extractHeader to false and add skipFirstLine instead and do this:
> select columns[0], columns[1], columns[2]  from 
> dfs.datatransfer.`test.csv`
> - then it works. So the problem seems to be only the header row.
>
>
> I verified the same problem with other files, but can somebody please 
> cross-check before I file a JIRA?
>
> Thanks,
>
> Uwe
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


