On Monday, 21 November 2011 15:45:49, Giuseppe D'Angelo wrote:
> But first: do we all (esp. Thiago, Lars) agree to use the UTF-8
> version for now (and pay for the pattern/subject string/offsets
> conversions) and then write and enable a UTF-16 codepath when PCRE
> ships with proper support for it (by detecting its version at
> runtime)?

Yes. Also note that it might be easier to convert to UTF-8, execute the RE
and then scan forward through the UTF-16 string, counting the number of UTF-8
bytes per character, and map only the offsets that we *do* need (match offset
and captures).

> Also: what's the minimum PCRE version Qt should require? I see that
> Debian 6 (stable) uses 8.02 [1], Ubuntu 10.04 LTS uses 7.8 [2]. For
> other distributions of course YMMV. Is it OK to depend on even more
> recent versions? For instance, PCRE 8.10 adds UCP support (basically
> make \w \d etc. match the corresponding Unicode properties), and PCRE
> 8.20 adds a JIT feature (which promises large performance benefits) [3]
> [4].
> Again: should we resort to depending on an "old" version, detect the
> proper one at runtime, and optionally enable those features?

I don't know. We should choose the features we want and then require the
version that provides them. Unicode matching sounds interesting.

> About the API itself: would you like three classes (raw pattern
> -> compiled pattern -> result of a match), or only two (pattern ->
> result of a match)?

Two sounds better. I don't see the point in having a distinction between a raw
and a compiled pattern. We might just need a pattern class and simply have a
method to compile it.

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358
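
For illustration, the offset mapping suggested in the reply (match on the UTF-8 copy, then walk the UTF-16 string counting UTF-8 bytes per character) could be sketched roughly as below. This is only a sketch under assumptions: it uses std::u16string instead of QString, the names utf8Length and mapUtf8OffsetsToUtf16 are made up for this example, the byte offsets are assumed to be sorted in ascending order (the caller would have to sort the match and capture offsets first), and a lone surrogate is assumed to have been encoded as three UTF-8 bytes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the number of UTF-8 bytes needed for the character starting at
// s[i], and stores in `units` how many UTF-16 code units it occupies.
static std::size_t utf8Length(const std::u16string &s, std::size_t i,
                              std::size_t &units)
{
    const char16_t c = s[i];
    units = 1;
    if (c < 0x80)
        return 1;
    if (c < 0x800)
        return 2;
    // A valid surrogate pair encodes one code point above U+FFFF: 4 bytes.
    if (c >= 0xD800 && c <= 0xDBFF && i + 1 < s.size()
        && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
        units = 2;
        return 4;
    }
    // Remaining BMP characters; lone surrogates are assumed to take 3 bytes.
    return 3;
}

// Maps ascending UTF-8 byte offsets to UTF-16 code unit offsets in a single
// forward scan over the original UTF-16 subject string.
std::vector<std::size_t> mapUtf8OffsetsToUtf16(
        const std::u16string &subject,
        const std::vector<std::size_t> &utf8Offsets)
{
    std::vector<std::size_t> result;
    result.reserve(utf8Offsets.size());
    std::size_t bytePos = 0; // position in the (virtual) UTF-8 string
    std::size_t unitPos = 0; // position in the UTF-16 string
    for (std::size_t target : utf8Offsets) {
        while (bytePos < target && unitPos < subject.size()) {
            std::size_t units = 0;
            bytePos += utf8Length(subject, unitPos, units);
            unitPos += units;
        }
        result.push_back(unitPos);
    }
    return result;
}
```

The point of doing it this way is that only the handful of offsets the match reports get translated, in one pass, without building a full byte-to-code-unit table for the whole subject.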
_______________________________________________
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development