paul-rogers opened a new pull request #1802: DRILL-7258: Remove field width limit for text reader
URL: https://github.com/apache/drill/pull/1802

The V2 text reader enforced a limit of 64K characters per field when using column headers, but not when using the columns[] array. The V3 reader enforced the 64K limit in both cases. This patch removes the limit in both cases.

The limit now is the 16MB vector size limit. With headers, no one column can exceed 16MB. With the columns[] array, no one row can exceed 16MB. (The 16MB limit is set by the Netty memory allocator.)

Added an "appendBytes()" method to the scalar column writer which appends additional bytes to those already written for a specific column or array element value. The method is implemented for VarChar, Var16Char and VarBinary vectors. It throws an exception for all other types.

When used with a type conversion shim, the appendBytes() method throws an exception. This should be OK because the preceding setBytes() call should already have failed: a huge value is not acceptable for numeric or date type conversions.

Added unit tests of the append feature, and of the append feature in the batch overflow case (when appending bytes causes the vector or batch to overflow). Also added tests to verify the lack of a column width limit with the text reader, both with and without headers.
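For readers unfamiliar with the set-then-append pattern described above, here is a minimal, self-contained sketch. It is not Drill's column writer: the class name VarWidthValueSketch, its fields, and the exception types are hypothetical and chosen only for illustration. It assumes a value is started with setBytes() and extended with appendBytes(), and that the only cap is the 16MB figure cited in the description. A real writer would handle overflow by rolling over to a new vector or batch; this sketch simply throws.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for a variable-width column value; not Drill's actual class. */
final class VarWidthValueSketch {
    // Assumption: mirrors the 16MB vector size limit mentioned in the PR description.
    static final int MAX_VALUE_BYTES = 16 * 1024 * 1024;

    private byte[] buf = new byte[64];
    private int length;

    /** Starts a new value, discarding anything written before. */
    void setBytes(byte[] src, int len) {
        length = 0;
        appendBytes(src, len);
    }

    /** Adds len bytes from src to the bytes already written for this value. */
    void appendBytes(byte[] src, int len) {
        if (len < 0 || len > src.length) {
            throw new IllegalArgumentException("len out of range: " + len);
        }
        long newLength = (long) length + len;
        if (newLength > MAX_VALUE_BYTES) {
            // A real writer would trigger overflow handling here instead of failing.
            throw new IllegalStateException("value exceeds " + MAX_VALUE_BYTES + " bytes");
        }
        if (newLength > buf.length) {
            // Grow geometrically, but never past the cap.
            long grown = Math.max(newLength, buf.length * 2L);
            buf = Arrays.copyOf(buf, (int) Math.min(MAX_VALUE_BYTES, grown));
        }
        System.arraycopy(src, 0, buf, length, len);
        length = (int) newLength;
    }

    int length() {
        return length;
    }

    String asString() {
        return new String(buf, 0, length, StandardCharsets.UTF_8);
    }
}
```

A reader that parses a field in fixed-size chunks could call setBytes() for the first chunk and appendBytes() for each later chunk, so no single intermediate buffer has to hold the whole field.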
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services