maropu commented on a change in pull request #31362:
URL: https://github.com/apache/spark/pull/31362#discussion_r567524336
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserUtils.scala
##########
@@ -187,33 +178,19 @@ object ParserUtils {
               sb.append(highSurrogate.toChar)
               sb.append(lowSurrogate.toChar)
             }
-            i += 9
-          } else if (i + 4 < strLength) {
+            charBuffer.position(charBuffer.position() + 10)
+          case OCTAL_CHAR_PATTERN(cp) =>
             // \000 style character literals.
-
-            val i1 = b.charAt(i + 1)
-            val i2 = b.charAt(i + 2)
-            val i3 = b.charAt(i + 3)
-
-            if ((i1 >= '0' && i1 <= '1') && (i2 >= '0' && i2 <= '7') && (i3 >= '0' && i3 <= '7')) {
-              val tmp = ((i3 - '0') + ((i2 - '0') << 3) + ((i1 - '0') << 6)).asInstanceOf[Char]
-              sb.append(tmp)
-              i += 3
-            } else {
-              appendEscapedChar(i1)
-              i += 1
-            }
-          } else if (i + 2 < strLength) {
+            sb.append(Integer.parseInt(cp, 8).toChar)
+            charBuffer.position(charBuffer.position() + 4)
+          case ESCAPED_CHAR_PATTERN(c) =>
             // escaped character literals.
-            val n = b.charAt(i + 1)
-            appendEscapedChar(n)
-            i += 1
-          }
-        } else {
-          // non-escaped character literals.
-          sb.append(currentChar)
+            appendEscapedChar(c.charAt(0))
+            charBuffer.position(charBuffer.position() + 2)
+          case _ =>
+            // non-escaped character literals.
+            sb.append(charBuffer.get())

Review comment:
   Just out of curiosity: is the performance in the non-escaped long-string case almost the same before and after this PR? The improvement itself looks fine, though.