On Tue, Apr 11, 2006 at 11:30:41AM -0400, John E. Malmberg wrote:
> What I would like to know is if I have figured out this patch fragment
> correctly for getting the UTF8 attribute passed back and forth.
> 
> Specifically, when I am returning a UTF8 encoded string back to Perl, do 
> I need to run it through sv_utf8_upgrade(), or is there a better method?

Sorry, missed this question, which I knew the answer to.

> +  if (rslt != NULL) {
> +    sv_usepvn(ST(0),rslt,strlen(rslt));
> +    if (fs_utf8) {
> +       sv_utf8_upgrade(ST(0));
> +    }
> +  }

No, sv_utf8_upgrade is for converting an SV holding a sequence of bytes that
are ISO-8859-1 characters into an SV holding a (longer) sequence of bytes
that are those same characters encoded in UTF-8.
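To make the distinction concrete, here is a rough stand-alone sketch of the kind of re-encoding an upgrade implies. It is not Perl's implementation, and latin1_to_utf8() is a hypothetical helper invented for illustration; it only assumes a NUL-terminated ISO-8859-1 input.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the Perl API: re-encode a NUL-terminated
 * ISO-8859-1 byte string as UTF-8. Returns a malloc()ed buffer that the
 * caller must free(), or NULL if allocation fails. */
char *latin1_to_utf8(const char *in)
{
    size_t len = strlen(in);
    /* Worst case: every byte is >= 0x80 and becomes two bytes. */
    unsigned char *out = malloc(2 * len + 1);
    unsigned char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c < 0x80) {
            *p++ = c;                                   /* ASCII is unchanged */
        } else {
            *p++ = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            *p++ = (unsigned char)(0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    *p = '\0';
    return (char *)out;
}
```

Feeding it bytes that are already UTF-8 encodes them a second time: the two bytes C3 A9 come out as the four bytes C3 83 C2 A9. That is the damage to expect from upgrading a buffer whose contents are already UTF-8.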

What I think you need here is

   ST(0) = sv_newmortal();
-  if (rslt != NULL) sv_usepvn(ST(0),rslt,strlen(rslt));
+  if (rslt != NULL) {
+    sv_usepvn(ST(0),rslt,strlen(rslt));
+    if (fs_utf8) {
+       SvUTF8_on(ST(0));
+    }
+  }

because you need to signal to the internals that the sequence of bytes in
the SV is in UTF-8.

(I'm assuming that the sequence of bytes in rslt was in ISO-8859-1 if fs_utf8
was false, and UTF-8 if fs_utf8 was true. If not, I misunderstood something)

If you're re-using an existing SV (rather than the new one created here by
sv_newmortal()), I'd add an else block with SvUTF8_off(...), as there have
been bugs in the core caused by scalars getting SvUTF8(...) turned on, but
then never turned off, so it "leaks" through on scalar re-use.
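
A toy model of that re-use problem, with no Perl headers involved. struct toy_sv and toy_set() are made-up names that stand in for a scalar and for the code that fills it; the only point is that the flag is written on both branches.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a reusable scalar: a byte buffer plus a UTF-8 flag.
 * This is an illustration only, not how Perl lays out an SV. */
struct toy_sv {
    char buf[64];
    int  is_utf8;
};

/* Store new bytes in an existing toy scalar and set the flag to match. */
void toy_set(struct toy_sv *sv, const char *bytes, int bytes_are_utf8)
{
    strncpy(sv->buf, bytes, sizeof sv->buf - 1);
    sv->buf[sizeof sv->buf - 1] = '\0';

    if (bytes_are_utf8) {
        sv->is_utf8 = 1;
    } else {
        /* Without this branch, a flag left on by an earlier value would
         * still be set and the new bytes would be misread as UTF-8. */
        sv->is_utf8 = 0;
    }
}
```

Storing a UTF-8 value and then a Latin-1 value in the same toy_sv leaves is_utf8 at 0, which is what the else block buys you.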

Nicholas Clark
