In particular, I've noticed that "substr" doesn't seem to work correctly when dealing with wide characters. For example:
use utf8; ... $blah =~ m/<wide_regex>/g; $position = pos $blah;
seems to give the correct character position, but
$matched = substr($blah, $position - length($blah), length($blah));
doesn't put the matched text into $matched when there are wide characters in $blah -- i.e., it seems to work off bytes rather than characters.
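For comparison, here is a minimal sketch of the behaviour I would expect. It assumes a Perl with character-semantics strings (5.8 or later) and a string that already holds decoded characters rather than raw UTF-8 bytes. The sample string and the names $start and $end are made up for illustration. Note that pos returns the offset just past the match, so the sketch takes the start of the match from $-[0] instead of computing it from length($blah).

```perl
use strict;
use warnings;
use utf8;    # only affects literals in this source file, not data read at run time

# Hypothetical sample data: two WHITE SMILING FACE characters (U+263A)
# surrounded by ASCII text.
my $blah = "abc \x{263A}\x{263A} def";

my ($matched, $start, $end);
if ($blah =~ m/\x{263A}+/g) {
    $end   = pos $blah;    # character offset just past the end of the match
    $start = $-[0];        # character offset where the match begins
    # substr counts characters here because $blah is a character string
    $matched = substr($blah, $start, $end - $start);
}
```

If $blah instead came from a file or socket without an :encoding(UTF-8) layer (or an explicit Encode::decode), it would hold bytes, and the offsets would then count bytes. That is one possible source of the mismatch, though I have not confirmed it is what is happening in my case.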
Are these issues documented somewhere, and are there standard techniques for dealing with them?
John Blumel