> However, I still can't entirely shake the notion that we're overdoing it
> here.  Maybe we could simply make the preprocessor and compiler grok
> UTF8 directly and get rid of the special casing.  All compiler
> input processing would return back to 8-bit only.

Converting everything to utf8 before preprocessing would work, yes,
provided it is then converted back to unicode before tokenization.

The alternative (handling utf-8 in the tokenizer) is needlessly
messy.

Define name/argument handling would be the only thing that needs to be
altered in cpp to handle utf-8.

Then again, just switching data[i] to IND(i) or similar, and having
that be defined as index_shared_string(data,i) (or, to break with
conventions in the code, not using a macro at all and instead just
calling the function directly), is actually significantly easier than
adding utf-8 support to the preprocessor.
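To make that concrete, here is a minimal, self-contained C sketch of the
idea; my_string, index_wide_string and the IND macro are simplified
stand-ins for illustration, not Pike's actual struct pike_string /
index_shared_string API:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for a shared string: characters are stored
 * 1, 2 or 4 bytes wide depending on size_shift (0, 1 or 2). */
struct my_string {
  int size_shift;      /* log2 of bytes per character */
  size_t len;          /* length in characters */
  const void *str;     /* character data */
};

/* Return the code point at position i, regardless of character width.
 * This plays the role index_shared_string() would play in the real code. */
static uint32_t index_wide_string(const struct my_string *s, size_t i)
{
  switch (s->size_shift) {
  case 0:  return ((const uint8_t  *)s->str)[i];
  case 1:  return ((const uint16_t *)s->str)[i];
  default: return ((const uint32_t *)s->str)[i];
  }
}

/* The proposed change: replace data[i] with IND(i), which expands to a
 * call on the current string.  'data' is assumed to be in scope. */
#define IND(i) index_wide_string(data, (i))

int main(void)
{
  static const uint16_t wide[] = { 'P', 'i', 'k', 0x00e9 };
  struct my_string tmp = { 1, 4, wide };
  const struct my_string *data = &tmp;

  for (size_t i = 0; i < data->len; i++)
    printf("U+%04X\n", (unsigned)IND(i));   /* works for any width */
  return 0;
}

The point of the macro (or direct call) is that existing byte-indexing
code keeps its shape; only the indexing expression changes, which is why
it is less work than teaching the preprocessor about utf-8.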

It is however bound to be somewhat slower in most cases. But I do not
really think the difference matters at all, considering everything
else we are doing in there.

--
Per Hedbor