On Wed, Jul 20, 2011 at 8:58 AM, Ekki Plicht (DF4OR) <e...@plicht.de> wrote:

> Hello Bill,
> thanks for the answer.
>
> On Wed, Jul 20, 2011 at 07:51, Bill Moseley <mose...@hank.org> wrote:
> >
> >
> > On Wed, Jul 20, 2011 at 1:31 AM, Ekki Plicht (DF4OR) <e...@plicht.de> wrote:
> >>
> >> Hi.
> >>
> >> The goal is to use TT for an order confirmation, to be sent by email.
> >>
> >> The template file itself is written in UTF-8, the variable strings
> >> (coming from the web) are converted from ISO-8859-1 to utf-8 with
> >> Locale::Recode. The conversion works fine, Umlauts (diacritical chars)
> >> like äöü and such are printed correctly.
> >
> > Hum, that does not sound exactly right.  If your templates are UTF-8 then
> > tell TT about that with ENCODING => 'UTF-8' .
>
> Tried that, interestingly enough that breaks the output. When I do
> this the umlauts in the normal body are represented garbled in
> whatever encoding. The template file has no BOM, but the Linux 'file'
> utility recognizes it as UTF-8 encoded.
>

I think that indicates a problem in how you are handling the encodings.  The
ENCODING option tells TT to decode your template as UTF-8 when reading it,
which assumes the file is encoded as UTF-8 on disk.

Then when you render it (in a web page, to a file, or into an email) you
must encode it into the final encoding.  Use UTF-8 there, too and encode
with Encode::encode_utf8().



>
>
> > And for your variables use
> > Encode::decode to decode them as ISO-8859-1.  Process the template, then
> > when sending the mail set the charset to UTF-8 and call
> > Encode::encode_utf8() when adding to the body.
> > You want to work with decoded characters inside of Perl and TT, in
> > general.  I suppose you could do everything in encoded UTF-8 octets, but
> > then your character lengths might be off.
>
> I am not sure I get you here.
> My understanding is that Perl (and therefore TT and others) can handle
> UTF-8 fine since many years. And UTF-8 is just another encoding. Are
> you saying that I should encode the template and everything (vars) in
> ISO-8859-1 and only recode it to UTF-8 before sending? That would give
> me a hard time developing the template, since I don't see the proper
> formatting until I send it. Currently I store the output for debugging
> purposes to a file and look at that.
>

Not exactly.  I'm saying you must decode your external data when bringing it
into Perl.  Decoding converts the octets into "characters" used inside of
Perl.  (Inside of Perl it's UTF8, but pretend you don't know that and that
inside of Perl you work with abstract characters.)
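
A minimal sketch of that decode/encode boundary, using only the core Encode
module.  The variable names are made up for illustration; the byte string
stands in for whatever arrives from the web or a file.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "äöü" as it arrives from outside: six UTF-8 octets.
my $octets = "\xC3\xA4\xC3\xB6\xC3\xBC";

# Decode at the input boundary: octets become characters.
my $chars = decode('UTF-8', $octets);

printf "octets: %d, characters: %d\n", length($octets), length($chars);
# prints "octets: 6, characters: 3"

# Encode at the output boundary: characters become octets again.
my $out = encode('UTF-8', $chars);
```

Everything between the decode and the encode (length, substr, formatting)
then counts characters instead of bytes.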

So, if your templates are UTF-8, using ENCODING => 'UTF-8' will have TT
decode your template when it's loaded.  If your variables are, say, in a
database encoded as ISO-8859-1 then use something like Encode::decode(
'iso-8859-1', $text, $CHECK ) when loading your data.  (Many DBD:: drivers
can do that decoding for you.)  Then your templates and variables are all
"characters", and length() will report character length, not octet length.
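
As a rough illustration of decoding variables from different sources, here
is a sketch with hard-coded byte strings standing in for the ISO-8859-1 and
CP850 data (the variable names are hypothetical).  It assumes Encode's
'cp850' encoding name, and passes LEAVE_SRC because a CHECK value can
otherwise modify the source string in place.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Die on malformed input, but leave the source string untouched.
my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();

my $latin1_octets = "\xE4\xF6\xFC";   # "äöü" encoded as ISO-8859-1
my $cp850_octets  = "\x84\x94\x81";   # "äöü" encoded as DOS CP850

my $from_latin1 = decode('iso-8859-1', $latin1_octets, $check);
my $from_cp850  = decode('cp850',      $cp850_octets,  $check);

# Different octets on the outside, identical characters on the inside.
print "same characters\n" if $from_latin1 eq $from_cp850;
printf "length: %d\n", length $from_latin1;
```

Once both sources are decoded, the rest of the code no longer needs to know
which encoding a value came from.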

Finally process your template and when ready to output make sure you
encode_utf8 and set any appropriate headers to indicate the encoding (say in
mail or web page).
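
Put together, the whole round trip might look roughly like the sketch
below.  The template name, its text, and the variable are invented for the
example, and the template is written to a temporary directory only so the
script is self-contained; it assumes Template Toolkit is installed.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use File::Temp qw(tempdir);
use Template;

# Stand-in for the real template: a UTF-8 encoded file on disk.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>:raw', "$dir/order.tt" or die "open: $!";
print {$fh} "Gr\xC3\xBC\xC3\x9F Gott, [% name %] ([% name.length %])\n";
close $fh or die "close: $!";

# The variable arrives as ISO-8859-1 octets; decode it into characters.
my $name = decode('iso-8859-1', "J\xFCrgen");

# ENCODING tells TT to decode the template file when loading it.
my $tt = Template->new({ INCLUDE_PATH => $dir, ENCODING => 'UTF-8' })
    or die Template->error;

# Render into a scalar; $body holds characters, not octets.
my $body = '';
$tt->process('order.tt', { name => $name }, \$body) or die $tt->error;

# Encode once, at the very end, and declare it in the mail header,
# e.g. Content-Type: text/plain; charset=UTF-8
my $octets = encode('UTF-8', $body);
```

Note that name.length inside the template reports 6 here, the number of
characters in the decoded name.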



>
> Well, I have to work with some conversion anyway, because part of the
> vars I have here are encoded DOS CP850. Don't ask why, the truth is
> much too sad to tell it here.
>
> To make a long story short, I convert everything to UTF-8, using a
> UTF-8 encoded template, and the format() routine breaks when I do
> this. I will try to make a simple test case which shows this.
>
>
Again, encodings (like UTF-8) are for external representation -- storage on
disk or sending over the wire.  Inside of Perl always decode first into
characters.


> >> The problem:
> >> I use these strings in formats to get a tabular layout, for example of
> >> the ordered products. At another place I convert the vars to String
> >> objects and use var.left(40).
> >
> >
> > And frankly I kind of think the text vs. html argument is a bit
> > outdated, so I'd just use HTML in the email and let the client format it
> > correctly into tables.  I suspect that would solve a lot of potential
> > headaches.
>
> IBTD. Call me oldfashioned, but I think that there is a reason why the
> largest resellers like amazon.com use text mails as confirmations etc.
> When it comes to business and customer orders I want to make sure that
> our mails reach the customer and don't end up in spam filters. HTML
> formatting is _one_ hint for a filter that it _might_ be spam. That is
> not to say that all spam is HTML formatted, but much of it is.
>
> So, I want to go with text formatted mails. HTML is not an option.
>

I think that was true some years back.  I think mail clients are too varied
now to be able to expect them to always display your text/plain in a
fixed-width font.  I'm not a fan of HTML email, either, but it's kind of
moot at this point in time.  Better to say "this is a column" and let the
client display it correctly.

I don't have much mail from Amazon archived, but I just looked and it's all
HTML.  I use Gmail so it renders the HTML by default.

But, if you can assume all mail clients will use a fixed-width font then
your first problem was probably using length() with encoded text and trying
to base column widths on that.  Work with decoded content and then length
will return the number of characters and you might be ok trying to manually
format it.  I just think that's going to be more work to get right.
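
To make the column problem concrete, here is a small sketch using plain
sprintf rather than TT's format filter.  The name and price are made-up
sample data, and it assumes every character takes exactly one column in
the reader's font.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $octets = "M\xC3\xBCller";            # "Müller" as UTF-8: 7 octets
my $chars  = decode('UTF-8', $octets);   # the same name: 6 characters

# Padding encoded octets counts bytes, so the column comes out too narrow.
my $bad_row = sprintf('%-10s|%8s', $octets, '12.50');

# Padding decoded characters gives the intended 10-character column.
my $good_row = sprintf('%-10s|%8s', $chars, '12.50');

binmode STDOUT, ':raw';
print $bad_row, "\n";                     # already octets
print encode('UTF-8', $good_row), "\n";   # encode only on the way out
```

On a UTF-8 terminal the separator in the first row lands one column to the
left of the one in the second row.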


-- 
Bill Moseley
mose...@hank.org
_______________________________________________
templates mailing list
templates@template-toolkit.org
http://mail.template-toolkit.org/mailman/listinfo/templates
