The reader is already supposed to support a two-byte delimiter. Apparently there is a bug somewhere in the state management.
The code takes a byte-based approach (as opposed to a character-based one). I'm guessing the issue is in one or both of these blocks: [1][2]. I think the code just needs to be debugged to figure out why the separator is not being recognized. It is also possible that there is an additional problem in [3], where we prematurely detect a new line when there isn't one. See the "newLine" uses.

Hopefully that gives you some pointers on where to look.

thanks,
Jacques

[1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextInput.java#L274
[2] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextInput.java#L334
[3] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextReader.java

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Thu, Oct 29, 2015 at 7:15 AM, Edmon Begoli <[email protected]> wrote:

> I can do it, but with a little bit of guidance on where in the Drill code base
> to apply the fix.
>
> Ideally, someone would tell me where to look in the reader that had it
> fixed in a different context, and then suggest where to apply it.
>
> Thank you,
> Edmon
>
> On Wed, Oct 28, 2015 at 10:53 PM, Jacques Nadeau <[email protected]> wrote:
>
> > Jim's fix wasn't lost. It was in the context of a very different reader. That
> > reader was deprecated because there were a number of other issues and
> > performance problems with it. Those items were addressed in this reader.
> >
> > In terms of someone looking at this soon, I agree that this would be great.
> > Can someone raise their hand?
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Wed, Oct 28, 2015 at 6:10 PM, Edmon Begoli <[email protected]> wrote:
> >
> > > May I please escalate this issue for 1.3 or 1.4:
> > >
> > > https://issues.apache.org/jira/browse/DRILL-3149
> > >
> > > I understand that Jim's fix was lost.
> > >
> > > Can the fix be recovered and slipped into 1.3?
> > >
> > > It is causing us to re-format a very large volume of files to check for and
> > > remove these line terminators.
> > >
> > > Thank you,
> > > Edmon
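
For anyone picking this up, here is a minimal sketch of the kind of state management a byte-based reader needs for a two-byte line delimiter, as discussed at the top of this thread. It is not Drill's TextInput code: the class TwoByteDelimiterScanner and its methods are hypothetical, and it assumes the input arrives in separate buffers, so the first delimiter byte can be the last byte of one buffer and the second byte the first byte of the next. The only point it illustrates is that "I have seen the first delimiter byte" has to be remembered across buffer refills, and that a false alarm must put the held-back byte back into the record.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not taken from Drill's TextInput. */
final class TwoByteDelimiterScanner {
    private final byte first;   // first byte of the line delimiter, e.g. '\r'
    private final byte second;  // second byte of the line delimiter, e.g. '\n'

    // State that must survive across feed() calls (buffer refills).
    private boolean pendingFirst = false;
    private final ByteArrayOutputStream current = new ByteArrayOutputStream();
    private final List<String> records = new ArrayList<>();

    TwoByteDelimiterScanner(byte first, byte second) {
        this.first = first;
        this.second = second;
    }

    /** Consume the first len bytes of buf; may be called once per buffer refill. */
    void feed(byte[] buf, int len) {
        for (int i = 0; i < len; i++) {
            byte b = buf[i];
            if (pendingFirst) {
                pendingFirst = false;
                if (b == second) {
                    endRecord();          // full two-byte delimiter seen
                    continue;
                }
                current.write(first);     // false alarm: held-back byte was data
                // fall through: b itself may start a new delimiter
            }
            if (b == first) {
                pendingFirst = true;      // hold back until the next byte is known
            } else {
                current.write(b);
            }
        }
    }

    /** Flush any trailing partial record and return everything seen so far. */
    List<String> finish() {
        if (pendingFirst) {
            current.write(first);
            pendingFirst = false;
        }
        if (current.size() > 0) {
            endRecord();
        }
        return records;
    }

    private void endRecord() {
        records.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
        current.reset();
    }
}
```

Feeding the bytes of "a,b\r" and then "\nc\rd\r\n" as two separate buffers should yield two records, "a,b" and "c\rd": the delimiter split across the buffer boundary is still recognized, and the lone "\r" inside the second record is kept as data. If the real reader drops that pending state on a refill, the symptom would look like the separator not being recognized.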
