On Saturday, 6 April 2013 at 19:32:58, Tommaso Cucinotta <[email protected]> wrote:
> On 03/04/13 22:40, Kornel Benko wrote:
> > I want to find (as regular expression) the string "použiť". In tex, it looks
> > "použi\v{t}".
> > But the searched string (as it is displayed while filling the search form)
> > it is
> > "\regexp{pou\check{z} it\mkern-5mu\mathchar19\endregexp{}}".
>
> Ok, I could reproduce thanks to the file you sent me. From a first impression,
> it seems to me that the problem is NOT the regexp matching engine. Indeed,
> when searching with regexp but with no non-ASCII char in the regexp, it works
> fine and it finds for example
>
> použ\regexp{i\endregexp{}}ť
>
> However, when the regexp contains non-ASCII chars, then it's misinterpreted
> in the text conversion. I suspect it's due to the fact that the regexp inset
> has been essentially derived from a math inset, so it's not expecting any
> non-ASCII stuff therein, and it's not applying the
> regular non-ASCII chars mangling that is instead done correctly for text.
>
> I'll try to look into it.
>
> T.

Yes, all non-ASCII characters are treated as being part of math. Therefore
they are replaced according to our unicode file.
In most cases we respect the type of InsetMathHull (for regexp it is hullRegexp).
Somewhere we split the search string at non-ASCII characters, but I failed to find where.
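
To illustrate the mismatch, here is a rough Python sketch (not LyX code). The table SYMBOLS and the function to_latex are hypothetical names made up for this mail; the math commands are the ones shown in the search form above, and the text commands are the usual LaTeX caron accents. The sketch also ignores the document encoding, so it converts ž in text mode too, which the real export apparently does not do here.

```python
# Hypothetical stand-in for the unicode file: one text-mode and one
# math-mode LaTeX command per non-ASCII character.
SYMBOLS = {
    "ž": {"text": r"\v{z}", "math": r"\check{z}"},
    "ť": {"text": r"\v{t}", "math": r"t\mkern-5mu\mathchar19"},
}


def to_latex(s, mode):
    """Replace each known non-ASCII character by its command for `mode`.

    `mode` is "text" or "math". ASCII characters and characters missing
    from the table are copied unchanged.
    """
    out = []
    for ch in s:
        if ord(ch) < 128 or ch not in SYMBOLS:
            out.append(ch)
        else:
            out.append(SYMBOLS[ch][mode])
    return "".join(out)


if __name__ == "__main__":
    word = "použiť"
    print("text:", to_latex(word, "text"))
    print("math:", to_latex(word, "math"))
```

If the document is exported with the text column and the search string with the math column, the two results cannot match literally, which would fit what we see.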
Kornel
