On 13 October 2012 12:55, sebb <[email protected]> wrote:
> Before r1397883, the Lexer operated only on char fields; it's now been
> converted to use Character, which means that unboxing is needed.
>
> Also, the Character fields need to be checked for null before use.
>
> It has just occurred to me that there is a genuine illegal char value
> for everything except the delimiter - that is, the delimiter itself.
> It does not make sense for there to be no delimiter, nor does it make
> sense for any other meta-character to be the same as the delimiter.
>
> So rather than having
>
> Character escape;
> ...
> boolean isEscape(final int c) {
>     return escape != null && c == escape.charValue();
> }
>
> one could use the simpler (and more efficient)
>
> char escape;
> ...
> boolean isEscape(final int c) { // similarly for isEncapsulator etc.
>     return escape != delimiter;
> }

Sorry, that's rubbish; it needs to be:

boolean isEscape(final int c) { // similarly for isEncapsulator etc.
    return escape != delimiter && c == escape;
}

which is hardly better than before.

However, if the Lexer ctor ensures that escape (etc) can never be the
same as the delimiter, then the check can be simplified to:

boolean isEscape(final int c) { // similarly for isEncapsulator etc.
    return c == escape;
}

> This would have the added bonus of automatically disallowing delimiter
> as the escape (or encapsulator etc.) because they would not be
> recognised.
> [At present the code does not check this]
>
> The Lexer ctor would need to be changed to convert a null escape
> Character (comment etc) to the delimiter.

It would also need to throw IllegalArgumentException (or similar) if any
of the meta-characters supplied by the caller matches the delimiter.
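
As a rough illustration of the ctor change described above, here is a
minimal sketch. The class name LexerSketch, the ctor signature and the
normalise helper are made up for the example and are not the actual
Lexer code. It also assumes the caller tests isDelimiter() before
isEscape() and isEncapsulator(), because a disabled meta-character is
stored as the delimiter and would otherwise compare equal to it.

```java
// Hypothetical sketch only: not the real Commons CSV Lexer.
final class LexerSketch {
    private final char delimiter;
    private final char escape;       // holds the delimiter value when disabled
    private final char encapsulator; // holds the delimiter value when disabled

    LexerSketch(final char delimiter, final Character escape, final Character encapsulator) {
        this.delimiter = delimiter;
        this.escape = normalise("escape", escape, delimiter);
        this.encapsulator = normalise("encapsulator", encapsulator, delimiter);
    }

    // Maps null (disabled) to the delimiter and rejects an explicit
    // meta-character that is the same as the delimiter.
    private static char normalise(final String name, final Character value, final char delimiter) {
        if (value == null) {
            return delimiter; // sentinel meaning "not in use"
        }
        if (value.charValue() == delimiter) {
            throw new IllegalArgumentException(
                "The " + name + " character must differ from the delimiter: '" + delimiter + "'");
        }
        return value.charValue();
    }

    boolean isDelimiter(final int c) {
        return c == delimiter;
    }

    // Assumes isDelimiter(c) has already been checked by the caller.
    boolean isEscape(final int c) {
        return c == escape;
    }

    // Assumes isDelimiter(c) has already been checked by the caller.
    boolean isEncapsulator(final int c) {
        return c == encapsulator;
    }
}
```

If the delimiter-first ordering cannot be guaranteed, the guarded form
(escape != delimiter && c == escape) would still be needed.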