Feature Requests item #2828299, was opened at 2009-07-28 13:56
Message generated for change (Comment added) made by sergzhum
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1126468&aid=2828299&group_id=250683

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: webui
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Xueron Nee (srni)
Assigned to: Nobody/Anonymous (nobody)
Summary: UTF-8 encoding for WebUI

Initial Comment:
A message with subject: 
=?ISO-2022-JP?B?WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=?= 
=?ISO-2022-JP?B?SlNIT1BQRVJTGyRCRXFKdUU5ISRLXDx+GyhC?= 
=?ISO-2022-JP?B?GyRCRy5HYz4mSUpFOUQ5P2RBJiEqGyhC?=

in system.log, the subject was decoded to: [ $B2D5? (B\x{90ae} $B7o (B]   
JSHOPPERS $BEqJuE9!$K\<~ (B   $BG.Gc>&IJE9D9?dA&!* (B
It seems ISO-2022-JP was not recognized and was treated as ISO-8859-1.

:) thanks 
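
For anyone who wants to reproduce the expected result outside DSPAM, here
is a minimal sketch in Python (not the Perl or PHP used elsewhere in this
thread). It relies on the standard library's email.header.decode_header and
the iso2022_jp codec. The name decode_subject and its fallback parameter are
hypothetical and not part of DSPAM.

```python
from email.header import decode_header


def decode_subject(raw: str, fallback: str = "latin-1") -> str:
    """Decode an RFC 2047 header value into a Unicode string.

    Hypothetical helper for illustration; not DSPAM code.
    """
    parts = []
    for text, charset in decode_header(raw):
        if isinstance(text, bytes):
            try:
                # Apply the charset named in the encoded word, e.g. ISO-2022-JP.
                text = text.decode(charset or fallback, errors="replace")
            except LookupError:
                # Charset name unknown to Python: fall back instead of failing.
                text = text.decode(fallback, errors="replace")
        parts.append(text)
    return "".join(parts)


# First encoded word of the subject reported above.
SUBJECT = "=?ISO-2022-JP?B?WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=?="
print(decode_subject(SUBJECT))
```

The literal \x{90ae} stays in the output because it is plain ASCII text
inside the original subject, not an escape that the decoder interprets.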

----------------------------------------------------------------------

Comment By: Sergey Zhumatiy (sergzhum)
Date: 2010-03-15 18:00

Message:
I use this patch for dspam 3.6.8. It works for most cases but not all; for
example, I cannot see the decoded mail body:
--- dspam.cgi.orig      2006-05-13 16:17:31.000000000 +0400
+++ dspam.cgi   2006-10-19 14:00:41.000000000 +0400
@@ -20,17 +20,46 @@
 
 use strict;
 use Time::Local;
+use Text::Iconv;
+use MIME::Base64;
+use MIME::QuotedPrint;
+
 use vars qw { %CONFIG %DATA %FORM $MAILBOX $CURRENT_USER $USER $TMPFILE};
 use vars qw { $CURRENT_STORE };
 require "ctime.pl";
 
 # Read configuration parameters common to all CGI scripts
 require "configure.pl";
+my $encoding=$CONFIG{'ENCODING'};
+
+$encoding = "ascii" if $encoding eq '';
 
 if($CONFIG{"DATE_FORMAT"}) {
   use POSIX qw(strftime);
 }
 
+sub recode_subj( $ ){
+  my $subject=$_[0];
+  my ($pre,$post,$k,$enc);
+  
+  if($subject =~ /(.*)=\?(.+)\?(\S)\?(.+)\?\=(.*)/is){
+    $pre=$1;
+    $post=$5;
+    $k=lc($3);
+    $enc=uc($2);
+    $subject=$4;
+    $subject=~tr/\n\r'/.."/; #'/;
+    $subject= ($k eq 'b')? MIME::Base64::decode($subject):
+              (($k eq 'q')? MIME::QuotedPrint::decode($subject): $subject);
+    $subject =~ tr/\n\r\t\0-\x19//d;
+    my $converter = Text::Iconv->new($enc, $encoding);
+    $subject = $converter->convert($subject);
+                          
+    return "${pre}${subject}${post}";
+  }
+  return $subject;
+}
+
 #
 # The current CGI script
 #
@@ -190,7 +219,7 @@
 sub DisplayFragment {
   $FORM{'signatureID'} =~ s/\///g;
   $DATA{'FROM'} = $FORM{'from'};
-  $DATA{'SUBJECT'} = $FORM{'subject'};
+  $DATA{'SUBJECT'} = recode_subj($FORM{'subject'});
   $DATA{'INFO'} = $FORM{'info'};
   $DATA{'TIME'} = $FORM{'time'};
   open(FILE, "<$USER.frag/$FORM{'signatureID'}.frag") || &error($!);
@@ -294,9 +323,9 @@
 
     $time = $rec->{'time'};
     $class = $rec->{'class'};
-    $from = $rec->{'from'};
+    $from =  recode_subj($rec->{'from'});
     $signature = $rec->{'signature'};
-    $subject = $rec->{'subject'};
+    $subject = recode_subj($rec->{'subject'});
     $info = $rec->{'info'};
     $messageid = $rec->{'messageid'};
 
@@ -309,7 +338,7 @@
     if ($messageid ne "") {
       $from = $rec{$messageid}->{'from'} 
         if ($from eq "<None Specified>");
-      $subject = $rec{$messageid}->{'subject'} 
+      $subject = recode_subj($rec{$messageid}->{'subject'}) 
         if ($subject eq "<None Specified>");
 
       $rec{$messageid}->{'from'} = $from 
@@ -413,10 +442,10 @@
         <td class="$rowclass" nowrap="true"><small>
         <input name="msgid$retrain_checked_msg_no" type="checkbox" value="$rclass:$signature">
         $retrain</td>
-       <td class="$rowclass" nowrap="true"><small>$ctime</td>
-       <td class="$rowclass" nowrap="true"><small>$from</td>
-       <td class="$rowclass" nowrap="true"><small>$subject</td>
-       <td class="$rowclass" nowrap="true"><small>$info</td>
+       <td class="${cl}_row $rowclass" nowrap="true"><small>$ctime</td>
+       <td class="${cl}_row $rowclass" nowrap="true"><small>$from</td>
+       <td class="${cl}_row $rowclass" nowrap="true"><small>$subject</td>
+       <td class="${cl}_row $rowclass" nowrap="true"><small>$info</td>
 </tr>
 _END
     $retrain_checked_msg_no++;
@@ -1025,6 +1054,8 @@
     }
 
     $new->{'alert'} = $alert;
+    $new->{Subject}=recode_subj($new->{Subject});
+    $new->{From}=recode_subj($new->{From});
 
     if ($alert) { $rowclass="rowAlert"; }
 

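The recode_subj function in the patch above uses a single regex match
without /g, so it appears to decode only one encoded word per call. As a
rough sketch of one way to handle every encoded word in a header, here is a
Python version (again Python, not the patch's Perl). The names recode_all
and ENCODED_WORD are hypothetical, and the pattern is a simplification of
RFC 2047 that ignores language suffixes in the charset field.

```python
import base64
import quopri
import re

# Simplified RFC 2047 encoded word: =?charset?B-or-Q?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def recode_all(header: str) -> str:
    """Decode every encoded word in a header value (illustrative sketch)."""

    def _decode(match: re.Match) -> str:
        charset = match.group(1)
        kind = match.group(2).lower()
        payload = match.group(3)
        if kind == "b":
            raw = base64.b64decode(payload)
        else:
            # header=True makes "_" decode to a space, as the Q encoding requires.
            raw = quopri.decodestring(payload.encode("ascii"), header=True)
        try:
            return raw.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset: leave the encoded word untouched.
            return match.group(0)

    # Whitespace between two adjacent encoded words is not part of the text.
    header = re.sub(r"(\?=)\s+(?==\?)", r"\1", header)
    return ENCODED_WORD.sub(_decode, header)
```

Text outside the encoded words is passed through unchanged, so a prefix
such as "Re: " survives. Malformed Base64 payloads raise binascii.Error in
this sketch; a real Web UI would probably want to catch that as well.
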
----------------------------------------------------------------------

Comment By: Ion-Mihai "IOnut" Tetcu (itetcu)
Date: 2009-08-02 11:12

Message:
No, it won't make it to 3.9.0, maybe 3.9.1

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2009-07-29 15:13

Message:
And here is PHP code showing how that subject would need to be
converted to display properly in the Web UI:
-----
<?php
echo "<html><head><title>test</title></head><body>\n";
$text = array("WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=",
"SlNIT1BQRVJTGyRCRXFKdUU5ISRLXDx+GyhC",
"GyRCRy5HYz4mSUpFOUQ5P2RBJiEqGyhC");
for ($x=0;$x<count($text);$x++) {
  echo mb_convert_encoding(base64_decode($text[$x]), "HTML-ENTITIES",
"ISO-2022-JP") . "<br>\n";
}
echo "</body>\n";
?>
-----

Result:
-----
<html><head><title>test</title></head><body>
[&#21487;&#30097;\x{90ae}&#20214;] <br>
JSHOPPERS&#28120;&#23453;&#24215;&#65292;&#26412;&#21608;<br>
&#29105;&#36023;&#21830;&#21697;&#24215;&#38263;&#25512;&#34214;&#65281;<br>
</body>
-----
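
The same conversion step can be sketched in Python (not the PHP shown
above), assuming the subject has already been decoded to a Unicode string.
The function name to_html_entities is hypothetical. Unlike the PHP snippet,
this sketch also escapes HTML metacharacters such as "<" before emitting
numeric character references, which is an addition of mine and not
something the thread discusses.

```python
import html


def to_html_entities(text: str) -> str:
    """Return ASCII-only HTML with non-ASCII characters as &#NNNN; references."""
    # Escape &, <, > and quotes first so the subject cannot inject markup.
    escaped = html.escape(text, quote=True)
    # xmlcharrefreplace turns every non-ASCII character into &#<codepoint>;.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

Because the output is pure ASCII, it displays the same way regardless of
the character set the page itself is served in.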

Kind Regards from Switzerland

Stevan Bajic

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2009-07-29 14:58

Message:
Hello Xueron Nee,

The decoding is correct. The whole line is encoded in Base64 and was
decoded correctly, but the ISO-2022-JP code page is not used in the
Web UI.

Try yourself to decode the subjects by using this link here:
http://www.opinionatedgeek.com/dotnet/tools/Base64Decode/Default.aspx

When decoding just remove the starting "=?ISO-2022-JP?B?" part and the
"?=" at the end of the subject.

To illustrate how the decoding part works you could quickly put the
following lines into a file and call it with the PHP interpreter:
-----
<?php
$text =
array("=?ISO-2022-JP?B?WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=?=",
"=?ISO-2022-JP?B?SlNIT1BQRVJTGyRCRXFKdUU5ISRLXDx+GyhC?=",
"=?ISO-2022-JP?B?GyRCRy5HYz4mSUpFOUQ5P2RBJiEqGyhC?=");
for ($x=0;$x<count($text);$x++) {
  $elements = imap_mime_header_decode($text[$x]);
  for ($i=0; $i<count($elements); $i++) {
    echo "Charset: {$elements[$i]->charset}\n";
    echo "Text: {$elements[$i]->text}\n\n";
  }
}
?>
-----

The output should be:
-----
Charset: ISO-2022-JP
Text: [2D5?\x{90ae}7o]

Charset: ISO-2022-JP
Text: JSHOPPERSEqJuE9!$K\<~

Charset: ISO-2022-JP
Text: G.Gc>&IJE9D9?dA&!*
-----
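
To make the two steps explicit, here is a small Python sketch (not PHP)
that separates the Base64 step from the character set step for the first
encoded word. It assumes Python's built-in iso2022_jp codec; the variable
names are only for illustration.

```python
import base64

# Payload of the first encoded word, without "=?ISO-2022-JP?B?" and "?=".
payload = "WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA="

# Step 1: Base64 decoding only. The result still contains the ESC (0x1B)
# sequences that switch between ASCII and JIS X 0208.
raw = base64.b64decode(payload)

# Step 2: interpret those bytes as ISO-2022-JP to get real characters.
text = raw.decode("iso2022_jp")

print(raw)
print(text)
```

The bytes from step 1 match the output above once the invisible ESC
characters are taken into account; only step 2 produces readable Japanese
text.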

PHP does exactly the same decoding as DSPAM. Is that decoding wrong? I
can't believe PHP would decode it wrongly. To display that subject line
correctly in the Web UI, we would need to decode it and then encode it as
HTML-encoded UTF-8 codes. I don't think that this will happen for the
v3.9.0 release. Maybe the new Web UI will handle that, but I don't know.

I am going to change that bug to be a feature request for the new Web UI.


Kind Regards from Switzerland

Stevan Bajic

----------------------------------------------------------------------

_______________________________________________
Dspam-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-devel
