-#ifdef MBS_SUPPORT

Please drop the !__STDC__ part.
It stopped being useful about 10 years ago.

With pleasure.

I also expanded the macros, since FUNC was a bit redundant.

+#ifdef __STDC__
+#define FUNC(F, P) static int F(int c) { return P(c); }
+#else
+#define FUNC(F, P) static int F(c) int c; { return P(c); }
+#endif
...
+static predicate *
+find_pred (const char *str)
+{
+  int i;

s/int/unsigned int/

Please use "unsigned int" not just "unsigned".

Fine. I fixed it throughout dfa.[ch], see posted patch. I also prefer "unsigned int", but I had used plain "unsigned" since it prevailed in those two files.
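For illustration only, here is a minimal sketch of how the lookup might read with the wrappers written out by hand and an "unsigned int" index. The table name, its contents and the body of find_pred are assumptions made for the example; only the function signature and the type change come from the hunk above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the predicate type used by the patch.  */
typedef int predicate (int);

/* Wrappers written out by hand rather than through FUNC(F, P).
   The <ctype.h> classifiers may be macros, so each one gets a real
   function whose address can be stored in a table.  */
static int is_alpha (int c) { return isalpha (c); }
static int is_digit (int c) { return isdigit (c); }
static int is_space (int c) { return isspace (c); }

/* Illustrative table; the real one is not shown in this thread.  */
static const struct
{
  const char *name;
  predicate *func;
} prednames[] = {
  { "alpha", is_alpha },
  { "digit", is_digit },
  { "space", is_space },
  { NULL, NULL }
};

/* Return the predicate for class name STR, or NULL if it is unknown.  */
static predicate *
find_pred (const char *str)
{
  unsigned int i;   /* the index is never negative */

  for (i = 0; prednames[i].name; ++i)
    if (strcmp (str, prednames[i].name) == 0)
      return prednames[i].func;
  return NULL;
}
```

With this table, find_pred ("digit") returns a function that accepts '7' and rejects 'x', and an unknown name yields NULL.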

+  if (MB_CUR_MAX > 1)
+    {
+      REALLOC_IF_NECESSARY(dfa->mbcsets, struct mb_char_classes,
+                           dfa->mbcsets_alloc, dfa->nmbcsets + 1);
+
+      /* dfa->multibyte_prop[] hold the index of dfa->mbcsets.
+         We will update dfa->multibyte_prop[] in addtok(), because we can't
+         decide the index in dfa->tokens[].  */
+
+      /* Initialize work area.  */
+      work_mbc = &(dfa->mbcsets[dfa->nmbcsets++]);
+      work_mbc->nchars = work_mbc->nranges = work_mbc->nch_classes = 0;
+      work_mbc->nequivs = work_mbc->ncoll_elems = 0;

I was going to write "Please don't put multiple initializations on
the same line." but then saw this was merely an indentation change,
so never mind.

Indeed; the logical step is to change it to a single memset, actually. I did it as a follow-up.
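As a rough sketch of that idea (not the follow-up patch itself), the chained assignments collapse into one call. The struct below is a trimmed-down, hypothetical stand-in that keeps only the counters named in the quoted hunk, and init_work_mbc is a helper name invented for the example.

```c
#include <string.h>

/* Hypothetical, reduced version of struct mb_char_classes: only the
   counters that appear in the quoted hunk are kept.  */
struct mb_char_classes
{
  int nchars;
  int nranges;
  int nch_classes;
  int nequivs;
  int ncoll_elems;
};

/* Zero the whole work area at once instead of assigning each counter.
   Any field added later is cleared too, without touching this code.  */
static void
init_work_mbc (struct mb_char_classes *work_mbc)
{
  memset (work_mbc, 0, sizeof *work_mbc);
}
```

One caveat: memset yields all-bits-zero, which is a null pointer on the usual platforms but is not guaranteed to be one by the C standard, so this relies on the same assumption most such code already makes.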

ACK!
Finally.  Thanks!

You didn't ack one patch, but it's quite trivial so I pushed it too.

The only missing optimization WRT glibc would be to expand UTF-8 "." in terms of character sets. I don't plan to do it anytime soon, but it should be easy, just special-case ANYCHAR in atom().
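To make the idea concrete, here is a sketch of what "any UTF-8 character" looks like once it is spelled out as alternatives of byte ranges, which is the form a byte-oriented matcher could consume. The byte ranges are the standard well-formed UTF-8 sequences; the struct, the table and match_any_utf8 are names invented for this example and are not dfa.c code. The sketch also ignores whether "." should exclude newline or NUL, which depends on the syntax options in effect.

```c
#include <stddef.h>

/* One alternative: LEN consecutive bytes, byte k in [lo[k], hi[k]].  */
struct utf8_seq
{
  size_t len;
  unsigned char lo[4];
  unsigned char hi[4];
};

/* Well-formed UTF-8: no overlong forms, no surrogates, nothing above
   U+10FFFF.  Each row is a sequence of byte sets.  */
static const struct utf8_seq utf8_any[] = {
  { 1, { 0x00 },                   { 0x7F } },
  { 2, { 0xC2, 0x80 },             { 0xDF, 0xBF } },
  { 3, { 0xE0, 0xA0, 0x80 },       { 0xE0, 0xBF, 0xBF } },
  { 3, { 0xE1, 0x80, 0x80 },       { 0xEC, 0xBF, 0xBF } },
  { 3, { 0xED, 0x80, 0x80 },       { 0xED, 0x9F, 0xBF } },
  { 3, { 0xEE, 0x80, 0x80 },       { 0xEF, 0xBF, 0xBF } },
  { 4, { 0xF0, 0x90, 0x80, 0x80 }, { 0xF0, 0xBF, 0xBF, 0xBF } },
  { 4, { 0xF1, 0x80, 0x80, 0x80 }, { 0xF3, 0xBF, 0xBF, 0xBF } },
  { 4, { 0xF4, 0x80, 0x80, 0x80 }, { 0xF4, 0x8F, 0xBF, 0xBF } },
};

/* Return the byte length of the UTF-8 character at the start of S
   (N bytes available), or 0 if no alternative matches.  */
static size_t
match_any_utf8 (const unsigned char *s, size_t n)
{
  size_t a, k;

  for (a = 0; a < sizeof utf8_any / sizeof utf8_any[0]; a++)
    {
      const struct utf8_seq *q = &utf8_any[a];

      if (q->len > n)
        continue;
      for (k = 0; k < q->len; k++)
        if (s[k] < q->lo[k] || s[k] > q->hi[k])
          break;
      if (k == q->len)
        return q->len;
    }
  return 0;
}
```

Each table row corresponds to a concatenation of plain byte sets, so the whole of "." becomes an alternation of nine such concatenations that needs no multibyte decoding at match time.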

Anyway, I think we are now at comparable performance to, and considerably less buggy than, what most GNU/Linux distros around ship.

Thanks for the reviews!

Paolo

