Re: [rt-users] RT 3.8 mangles html attachment

2009-02-23 Thread Tom Lahti
On Thu 19.Feb'09 at 11:33:25 -0500, Todd Chapman wrote: Correction, the weird question mark characters are between every character in the original document. Like so: �h�e�a�d�� � ��m�e�t�a� Fascinating. Does it do this with all html attachments? That looks suspiciously like full

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-23 Thread Jesse Vincent
On Mon, Feb 23, 2009 at 11:14:00AM -0800, Tom Lahti wrote: On Thu 19.Feb'09 at 11:33:25 -0500, Todd Chapman wrote: Correction, the weird question mark characters are between every character in the original document. Like so: �h�e�a�d�� � ��m�e�t�a� Fascinating. Does it do

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-23 Thread Todd Chapman
On Mon, Feb 23, 2009 at 2:14 PM, Jesse Vincent je...@bestpractical.com wrote: On Mon, Feb 23, 2009 at 11:14:00AM -0800, Tom Lahti wrote: On Thu 19.Feb'09 at 11:33:25 -0500, Todd Chapman wrote: Correction, the weird question mark characters are between every character in the

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-23 Thread Tom Lahti
Todd and I got further into it. We're using Encode::Guess, which should handle this. Todd had some promising places to dig for a bug. Curious: does Encode::Guess handle UTF-16(LE|BE) without a byte order mark? That would be ... fascinating. -- -- Tom Lahti
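On the question of UTF-16 without a byte order mark, here is a minimal sketch in Python of one common heuristic. It is not a description of what Encode::Guess does; the function name `guess_utf16_byte_order` and its thresholds are assumptions made for illustration. It relies only on the fact that mostly-ASCII text encoded as UTF-16 has a NUL byte in every other position.

```python
def guess_utf16_byte_order(data: bytes):
    """Guess 'utf-16-le' or 'utf-16-be' for BOM-less, mostly-ASCII data.

    Hypothetical helper for illustration; returns None when unsure.
    """
    if len(data) < 2 or len(data) % 2:
        return None
    even_nuls = data[0::2].count(0)  # NULs at offsets 0, 2, 4, ...
    odd_nuls = data[1::2].count(0)   # NULs at offsets 1, 3, 5, ...
    pairs = len(data) // 2
    # ASCII in UTF-16LE is <byte> 0x00, so the NULs sit at odd offsets.
    if odd_nuls > pairs // 2 and even_nuls == 0:
        return "utf-16-le"
    # ASCII in UTF-16BE is 0x00 <byte>, so the NULs sit at even offsets.
    if even_nuls > pairs // 2 and odd_nuls == 0:
        return "utf-16-be"
    return None
```

A heuristic like this only works for text that is mostly ASCII, which is why a BOM or an explicit charset is the more reliable signal.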

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-23 Thread Todd Chapman
On Mon, Feb 23, 2009 at 6:04 PM, Tom Lahti t...@bitstatement.net wrote: Todd and I got further into it. We're using Encode::Guess, which should handle this. Todd had some promising places to dig for a bug. Curious: does Encode::Guess handle UTF-16(LE|BE) without a byte order mark? That would

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-22 Thread Tim Cutts
On 19 Feb 2009, at 4:35 pm, Jesse Vincent wrote: On Thu 19.Feb'09 at 11:33:25 -0500, Todd Chapman wrote: Correction, the weird question mark characters are between every character in the original document. Like so: �h�e�a�d�� � ��m�e�t�a� Fascinating. Does it do this with all

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-20 Thread Jesse Vincent
On Thu 19.Feb'09 at 13:48:48 -0500, Todd Chapman wrote: Well, what does the database say for content-type? Is the content in the database 'right'? Sorry. And thanks again for all the help! mysql select Subject, Filename, ContentType, ContentEncoding, Headers from

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-20 Thread Todd Chapman
On Fri, Feb 20, 2009 at 1:04 PM, Jesse Vincent je...@bestpractical.com wrote: On Thu 19.Feb'09 at 13:48:48 -0500, Todd Chapman wrote: Well, what does the database say for content-type? Is the content in the database 'right'? Sorry. And thanks again for all the help! mysql

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-20 Thread Jesse Vincent
Hmmm. Just noticed this error: [Fri Feb 20 18:32:55 2009] [debug]: Converting 'UTF-16' to 'utf-8' for text/html - Re Eprize RPC interface failing on DC registration.htm (/opt/rt3-devel/bin/../lib/RT/I18N.pm:234) [Fri Feb 20 18:32:55 2009] [error]: Encoding error: UTF-16:Unrecognised BOM 78

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-20 Thread Todd Chapman
The attached script and input file trigger the error. I think the problem is the loop on @lines. The BOM is only in the first line so the rest is cornfused. On Fri, Feb 20, 2009 at 1:43 PM, Jesse Vincent je...@bestpractical.com wrote: Hmmm. Just noticed this error: [Fri Feb 20 18:32:55 2009]
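The suspicion above, that only the first line carries the BOM, can be sketched in Python. This is not RT's Perl code, and whether RT's loop splits the data this way is not confirmed in the thread; the two-line sample document is an assumption. The sketch shows that decoding the whole buffer works, while chunks produced by splitting the raw bytes on the newline byte lose the BOM after the first chunk and are also misaligned by one byte.

```python
import codecs

# Hypothetical two-line document standing in for the Word-generated file.
text = "<head>\n<meta>\n"
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# Decoding the whole buffer once: the BOM is consumed and selects the byte order.
whole = data.decode("utf-16")

# Splitting the raw bytes on the newline byte first, as a per-line loop might.
chunks = data.split(b"\n")
has_bom = [chunk.startswith(codecs.BOM_UTF16_LE) for chunk in chunks]


def try_decode(chunk):
    """Return the chunk decoded as UTF-16LE, or None if it is not valid alone."""
    try:
        return chunk.decode("utf-16-le")
    except UnicodeDecodeError:
        return None


# Later chunks start with the 0x00 left over from the newline's second byte,
# so they have an odd length and no longer decode on their own.
per_chunk = [try_decode(chunk) for chunk in chunks]
```

In this sketch only the first chunk starts with the BOM, so any decoder that needs a BOM to pick a byte order has nothing to go on for the remaining chunks.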

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-20 Thread Jesse Vincent
On Fri 20.Feb'09 at 14:46:49 -0500, Todd Chapman wrote: The attached script and input file trigger the error. I think the problem is the loop on @lines. The BOM is only in the first line so the rest is cornfused. If you're up for actually rewriting that as a test file that loads the data

[rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Todd Chapman
We have an RT instance in a trusted environment. I have the following in RT_SiteConfig.pm: Set($TrustHTMLAttachments, 1); Set($PreferRichText, 1); Set($MaxAttachmentSize, 1000); I even turned off the HTML scrubber, yet when I attach an html file to a ticket and then save it back to my

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Jesse Vincent
On Thu, Feb 19, 2009 at 10:44:50AM -0500, Todd Chapman wrote: We have an RT instance in a trusted environment. I have the following in RT_SiteConfig.pm: Set($TrustHTMLAttachments, 1); Set($PreferRichText, 1); Set($MaxAttachmentSize, 1000); I even turned off the HTML scrubber, yet

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Todd Chapman
Thanks for the reply, Jesse. The html no longer displays correctly in the browser after canonicalization. Suggestions? On Thu, Feb 19, 2009 at 10:47 AM, Jesse Vincent je...@bestpractical.com wrote: On Thu, Feb 19, 2009 at 10:44:50AM -0500, Todd Chapman wrote: We have an RT instance in a

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Todd Chapman
The original file, when opened in a browser, looks like a formatted web page. After processing by RT, the file is rendered as what looks like plain text in Safari. In Firefox there are a bunch of weird question mark characters representing the spaces between characters. FF's page info

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Jesse Vincent
On Thu 19.Feb'09 at 11:33:25 -0500, Todd Chapman wrote: Correction, the weird question mark characters are between every character in the original document. Like so: �h�e�a�d�� � ��m�e�t�a� Fascinating. Does it do this with all html attachments?

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Todd Chapman
Correction, the weird question mark characters are between every character in the original document. Like so: �h�e�a�d�� � ��m�e�t�a� �h�t�t�p�-�e�q�u�i�v�=�C�o�n�t�e�n�t�-�T�y�p�e� �c�o�n�t�e�n�t�=��t�e�x�t�/�h�t�m�l�;� �c�h�a�r�s�e�t�=�u�n�i�c�o�d�e��� � ��m�e�t�a� �n�a�m�e�=�P�r�o�g�I�d�

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Todd Chapman
According to FF the original file has an encoding of UTF-16LE. It was generated by Word. (I know, I know) On Thu, Feb 19, 2009 at 11:35 AM, Jesse Vincent je...@bestpractical.com wrote: On Thu 19.Feb'09 at 11:33:25 -0500, Todd Chapman wrote: Correction, the weird question mark characters
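A minimal sketch in Python (not code from RT) of why UTF-16LE content can show up with a stray character between every letter: each ASCII character is stored as its byte followed by a NUL byte, and a viewer that treats the data as a single-byte encoding keeps every NUL as its own character. The sample string is an assumption chosen to resemble the output quoted in this thread.

```python
# Illustration only: a Python stand-in, not code from RT.
sample = "<head>"  # hypothetical sample text

# UTF-16LE stores each ASCII character as its byte followed by 0x00.
raw = sample.encode("utf-16-le")

# Reading those bytes as a single-byte encoding keeps every NUL as a
# separate character sitting between the real ones.
misread = raw.decode("latin-1")

# Make the NULs visible as replacement characters, roughly as a browser might.
visible = misread.replace("\x00", "\ufffd")

# Decoding with the right encoding recovers the original text.
recovered = raw.decode("utf-16-le")
```

The pattern in `visible`, a real character followed by a placeholder, has the same shape as the mangled output reported above, which is consistent with UTF-16LE bytes being served under the wrong charset.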

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Jesse Vincent
On Thu 19.Feb'09 at 11:42:42 -0500, Todd Chapman wrote: According to FF the original file has an encoding of UTF-16LE. It was generated by Word. (I know, I know) Now we're getting somewhere. Was it attached to a mail as an attachment? If so, what do the headers for the original

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Todd Chapman
It was attached using the web interface. On Thu, Feb 19, 2009 at 11:45 AM, Jesse Vincent je...@bestpractical.com wrote: On Thu 19.Feb'09 at 11:42:42 -0500, Todd Chapman wrote: According to FF the original file has an encoding of UTF-16LE. It was generated by Word. (I know, I know)

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Todd Chapman
It was attached in the web interface on the Create.html page. (Not a custom field) On Thu, Feb 19, 2009 at 11:45 AM, Jesse Vincent je...@bestpractical.com wrote: On Thu 19.Feb'09 at 11:42:42 -0500, Todd Chapman wrote: According to FF the original file has an encoding of UTF-16LE. It was

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Jesse Vincent
On Thu 19.Feb'09 at 11:55:19 -0500, Todd Chapman wrote: It was attached in the web interface on the Create.html page. (Not a custom field) And what headers is RT serving it out with? Is RT announcing it as utf8? Is that stored in the database as content-type? If you save the raw data

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Todd Chapman
On Thu, Feb 19, 2009 at 12:29 PM, Jesse Vincent je...@bestpractical.com wrote: On Thu 19.Feb'09 at 11:55:19 -0500, Todd Chapman wrote: It was attached in the web interface on the Create.html page. (Not a custom field) And what headers is RT serving it out with? Is RT announcing it

Re: [rt-users] RT 3.8 mangles html attachment

2009-02-19 Thread Jesse Vincent
meta http-equiv=Content-Type content=text/html; charset=unicode I don't know how FF figures out that it is UTF-16LE. I'd recommend starting in lib/RT/I18N.pm, sub SetMIMEEntityToUTF8. Instrument there. On Thu, Feb 19, 2009 at 1:48 PM, Todd Chapman t...@chaka.net wrote: On Thu,