Unicode.from_num is giving an error that 2 or more arguments are required.

e.g.,
local patternu=unicode.from_num(0x2153)

> 
> If you've got unicode subject why wouldn't one want to search it
> with a unicode pattern?

Regex metacharacters and the like don't rely on high-value characters, so they 
need no special encoding. If you needed to include some literal utf8 text in 
the pattern, I expect it would be easier to do it with pcre's \x{....} than 
with unicode handles or utf8 strings. With a literal utf8 string, anything 
conflicting with PCRE metacharacters would need escaping, and the literal would 
also need concatenating with the rest of the pattern text. Even then, all of 
that would be done more easily with utf8 strings than with unicode handles.
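
To illustrate the escaping point outside PowerPro, here is a rough sketch in 
Python. This is only an analogy: Python's re module is not PCRE, and it writes 
the escape as \u2153 where PCRE uses \x{2153}. The sample subject and the 
variable names are made up for the illustration.

```python
import re

# Made-up subject text containing U+2153 and some regex metacharacters.
subject = "\u00bc \u2153\u00a0\u2154 costs 1.5 (approx.)"

# Escape form: the pattern source stays pure ASCII, so nothing needs escaping.
by_escape = re.compile(r"\u2153")

# Literal form: the literal text must be escaped before it is joined with the
# rest of the pattern, otherwise "." and "(" are read as metacharacters.
literal = "1.5 (approx.)"
by_literal = re.compile(re.escape(literal))
unescaped = re.compile(literal)

found_escape = by_escape.search(subject) is not None
found_literal = by_literal.search(subject) is not None
found_unescaped = unescaped.search(subject) is not None
```

In this sketch the unescaped literal fails to match its own text, because the 
parentheses are treated as a group rather than as characters.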

[snip] 
 
> unicode plugin doesn't know it's being used by another plugin,
> and can't prevent being so used.

> 
> Could be a regex service:
> 
> regex.allow_unicode_handles(0/1)

My thought was that perhaps the user could use unicode to set and unset a 
global variable that regex could read. If regex sets the variable itself, that 
doesn't mean unicode is loaded.

> 
> to override config ini setting -- but again, don't understand why
> it should be necessary. If for some reason you can't use unicode
> because regex is interfering, there's an error of some kind in my
> code.

I'm not using unicode handles in any regex services (except recent testing). I 
prefer therefore that regex not be doing extra work looking for unicode handles 
that aren't there. :D

For test purposes I just tried putting a unicode handle into pattern and 
subject (both worked). I also tried putting a unicode handle in the replacement 
string, but I got the error:

regex.pcreReplace: PCRE exec failed Matching error -3
Programminng Error: PCRE_ERROR_BADOPTION

The option was "utf8". I also tried "u". It worked fine as long as the 
replacement string was not a unicode handle (including when the replacement 
string was decoded from a unicode handle to a utf8 string).

Is it safe to chain the utf8 operations as done below?

local subjectstring=unicode.from_nums(0x00BC,0x0020,0x2153,0x00A0,0x2154).to_utf8
;local replaceu=unicode.from_num(0x2154);; fails
local replaceu=unicode.new(" ")
unicode.default_get_set_type("numeric")
replaceu[0]=0x2154
local replacestring=unicode.to_utf8(replaceu)
;local test=regex.pcrereplace(?"\x{2153}", subjectstring, replaceu, "utf8") ;;fails
local test=regex.pcrereplace(?"\x{2153}", subjectstring, replacestring, "utf8")
unicode.messagebox("OK", unicode.from_utf8(test))
local test=regex.pcrereplace(?"\x{2154}", test, ?"2/3", "utf8")
win.debug(unicode.from_utf8(test).to_ascii)
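
For comparison only, here is the same sequence sketched in Python. It is not 
the plugin API: from_nums is a hypothetical stand-in for 
unicode.from_nums(...).to_utf8, and it says nothing about whether the chaining 
above is safe in the plugin. It just shows the result I would expect from the 
two replacements.

```python
import re


def from_nums(*code_points):
    """Hypothetical stand-in for unicode.from_nums(...).to_utf8: UTF-8 bytes."""
    return "".join(chr(cp) for cp in code_points).encode("utf-8")


subject = from_nums(0x00BC, 0x0020, 0x2153, 0x00A0, 0x2154)
replacement = from_nums(0x2154)

# First pass: replace U+2153 (one third) with U+2154 (two thirds).
step1 = re.sub("\u2153", replacement.decode("utf-8"), subject.decode("utf-8"))

# Second pass: replace every U+2154 with the plain ASCII text "2/3".
step2 = re.sub("\u2154", "2/3", step1)

# Encode back to UTF-8 bytes once, at the end.
result = step2.encode("utf-8")
```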

Regards,
Sheri
