I wrote: > Thanks for the sample which shows the difference. Ah, once I got on the right track, there is no problem seeing dramatic improvements with the patch. It changes some nasty O(N^2) cases to O(N). In particular, the fixes affect parsing of large strings encoded with multi-byte character encodings and containing email addresses or URLs with a non-IP-address host component. It strikes me as odd that URLs without a slash following the host portion, or with an IP address, are treated so differently in the parser, but if we want to address that, it's a matter for another patch. I'm inclined to think that the minimal differences found in some of my tests probably have more to do with happenstance of code alignment than the particulars of the patch. I did find one significant (although easily solved) problem. In the patch, the recursive setup of usewide, pgwstr, and wstr are not conditioned by #ifdef USE_WIDE_UPPER_LOWER in the non-patched version. Unless there's a good reason for that, the #ifdef should be added. Less critical, but worth fixing one way or the other, TParserClose does not drop breadcrumbs conditioned on #ifdef WPARSER_TRACE, but TParserCopyClose does. I think this should be consistent. Finally, there's that spelling error in the comment for TParserCopyInit. Please fix. If a patch is produced with fixes for these three things, I'd say it'll be ready for committer. I'm marking it as Waiting on Author for fixes to these three items. Sorry for the delay in review. I hope there's still time to get this committed in this CF. -Kevin
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers