[ 
https://issues.apache.org/jira/browse/DRILL-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756537#comment-16756537
 ] 

Paul Rogers commented on DRILL-7020:
------------------------------------

The size limitation is hard-coded into the "compliant" text reader, as you 
noted. I'm not sure the limit is necessary. Drill uses a 4-byte offset vector 
to track VARCHAR values within a VARCHAR vector, so the offsets themselves 
can address far more than 65536 bytes. The fix might be as easy as removing 
the size check.
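
To illustrate the offset-vector layout, here is a minimal Python sketch. It
is not Drill's code: the names pack_varchars and get_value are made up for
illustration, and the little-endian signed 32-bit encoding is an assumption.
Values are concatenated into one data buffer, and an offsets array with one
more entry than there are values marks where each value starts and ends.

```python
import struct


def pack_varchars(values):
    """Concatenate byte strings and build a buffer of 4-byte offsets."""
    data = bytearray()
    offsets = [0]  # offsets[i] is the start of value i; offsets[i + 1] its end
    for v in values:
        data += v
        offsets.append(len(data))
    # Assumption: offsets stored as little-endian signed 32-bit integers.
    packed = struct.pack("<%di" % len(offsets), *offsets)
    return bytes(data), packed


def get_value(data, packed_offsets, i):
    """Read value i by looking up its start and end offsets."""
    start, end = struct.unpack_from("<2i", packed_offsets, 4 * i)
    return data[start:end]


# A 66000-byte value sits between two short ones without any trouble.
values = [b"w", b"y" * 66000, b"z"]
data, offs = pack_varchars(values)
assert get_value(data, offs, 1) == values[1]
```

In this layout nothing ties the length of a single value to 65536; only the
total size of the data buffer is bounded by what a 4-byte offset can hold.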

> big varchar doesn't work with extractHeader=true
> ------------------------------------------------
>
>                 Key: DRILL-7020
>                 URL: https://issues.apache.org/jira/browse/DRILL-7020
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 1.15.0
>            Reporter: benj
>            Priority: Major
>
> with a TEST file of csv type like
> {code:java}
> col1,col2
> w,x
> ...y...,z
> {code}
> where ...y... is a string of more than 65536 characters (say 66000),
> a SELECT with +*extractHeader=false*+ is OK
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => false));
>     col1  | col2
> +---------+------
> | w       | x
> | ...y... | z
> {code}
> But a SELECT with +*extractHeader=true*+ gives an error
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => true));
> Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
> columnIndex 1
> Limit 65536
> Fragment 0:0
> {code}
> Note that it is possible to use extractHeader=false with skipFirstLine=true, 
> but in this case it's not possible to automatically get the column names.
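
For anyone who wants to reproduce the report, a file like the one described
could be generated with a short Python script. This is only a sketch: the
helper name write_test_csv is hypothetical, and it assumes the tmp workspace
points at the system temporary directory (often /tmp).

```python
import csv
import os
import tempfile


def write_test_csv(path, big_len=66000):
    """Write a header, a short row, and a row whose first field is big_len chars."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["col1", "col2"])
        writer.writerow(["w", "x"])
        writer.writerow(["y" * big_len, "z"])


# Assumption: Drill's tmp workspace maps to the system temp directory.
path = os.path.join(tempfile.gettempdir(), "TEST")
write_test_csv(path)
```

Running the two queries from the description against this file should then
show whether the 65536 limit is hit only when extractHeader is true.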



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
