Feature Requests item #2828299, was opened at 2009-07-28 13:56
Message generated for change (Comment added) made by sergzhum
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1126468&aid=2828299&group_id=250683
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: webui
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Xueron Nee (srni)
Assigned to: Nobody/Anonymous (nobody)
Summary: UTF-8 encoding for WebUI
Initial Comment:
A message with subject:
=?ISO-2022-JP?B?WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=?=
=?ISO-2022-JP?B?SlNIT1BQRVJTGyRCRXFKdUU5ISRLXDx+GyhC?=
=?ISO-2022-JP?B?GyRCRy5HYz4mSUpFOUQ5P2RBJiEqGyhC?=
in system.log, the subject was decoded to: [ $B2D5? (B\x{90ae} $B7o (B]
JSHOPPERS $BEqJuE9!$K\<~ (B $BG.Gc>&IJE9D9?dA&!* (B
It seems ISO-2022-JP was not recognized and was treated as ISO-8859-1.
:) thanks
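
A small Python sketch (not DSPAM code) that may help show where the "$B ... (B" fragments in the log come from. It assumes Python's standard base64 module and its iso2022_jp codec: Base64-decoding the first encoded-word only undoes the transport encoding and leaves raw ISO-2022-JP bytes with ESC sequences in them; a second, charset-aware step is needed to get characters.

```python
import base64

# First encoded-word of the reported subject line.
encoded_word = "=?ISO-2022-JP?B?WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=?="

# An encoded-word has the shape =?charset?encoding?payload?=
_, charset, transfer_encoding, payload, _ = encoded_word.split("?")

# Step 1: undo the transport encoding (Base64). The result is still
# ISO-2022-JP bytes, including ESC $ B / ESC ( B mode switches.
raw = base64.b64decode(payload)

# Step 2: convert from the declared charset to Unicode text.
text = raw.decode(charset)

print(charset, transfer_encoding)
print(raw)   # bytes with \x1b escape sequences, as seen in the log
print(text)  # text with the escape sequences interpreted
```

Printing the bytes as if they were ISO-8859-1 shows the ESC bytes as unprintable characters, which is consistent with the log line above.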
----------------------------------------------------------------------
Comment By: Sergey Zhumatiy (sergzhum)
Date: 2010-03-15 18:00
Message:
I use this patch for dspam 3.6.8. It works for most cases, but not all;
for example, I still cannot see the decoded mail body:
--- dspam.cgi.orig 2006-05-13 16:17:31.000000000 +0400
+++ dspam.cgi 2006-10-19 14:00:41.000000000 +0400
@@ -20,17 +20,46 @@
use strict;
use Time::Local;
+use Text::Iconv;
+use MIME::Base64;
+use MIME::QuotedPrint;
+
use vars qw { %CONFIG %DATA %FORM $MAILBOX $CURRENT_USER $USER $TMPFILE};
use vars qw { $CURRENT_STORE };
require "ctime.pl";
# Read configuration parameters common to all CGI scripts
require "configure.pl";
+my $encoding=$CONFIG{'ENCODING'};
+
+$encoding = "ascii" if $encoding eq '';
if($CONFIG{"DATE_FORMAT"}) {
use POSIX qw(strftime);
}
+sub recode_subj( $ ){
+ my $subject=$_[0];
+ my ($pre,$post,$k,$enc);
+
+ if($subject =~ /(.*)=\?(.+)\?(\S)\?(.+)\?\=(.*)/is){
+ $pre=$1;
+ $post=$5;
+ $k=lc($3);
+ $enc=uc($2);
+ $subject=$4;
+ $subject=~tr/\n\r'/.."/; #'/;
+ $subject= ($k eq 'b')? MIME::Base64::decode($subject):
+ (($k eq 'q')? MIME::QuotedPrint::decode($subject): $subject);
+ $subject =~ tr/\n\r\t\x00-\x19//d; # strip control characters
+ my $converter = Text::Iconv->new($enc, $encoding);
+ $subject = $converter->convert($subject);
+
+ return "${pre}${subject}${post}";
+ }
+ return $subject;
+}
+
#
# The current CGI script
#
@@ -190,7 +219,7 @@
sub DisplayFragment {
$FORM{'signatureID'} =~ s/\///g;
$DATA{'FROM'} = $FORM{'from'};
- $DATA{'SUBJECT'} = $FORM{'subject'};
+ $DATA{'SUBJECT'} = recode_subj($FORM{'subject'});
$DATA{'INFO'} = $FORM{'info'};
$DATA{'TIME'} = $FORM{'time'};
open(FILE, "<$USER.frag/$FORM{'signatureID'}.frag") || &error($!);
@@ -294,9 +323,9 @@
$time = $rec->{'time'};
$class = $rec->{'class'};
- $from = $rec->{'from'};
+ $from = recode_subj($rec->{'from'});
$signature = $rec->{'signature'};
- $subject = $rec->{'subject'};
+ $subject = recode_subj($rec->{'subject'});
$info = $rec->{'info'};
$messageid = $rec->{'messageid'};
@@ -309,7 +338,7 @@
if ($messageid ne "") {
$from = $rec{$messageid}->{'from'}
if ($from eq "<None Specified>");
- $subject = $rec{$messageid}->{'subject'}
+ $subject = recode_subj($rec{$messageid}->{'subject'})
if ($subject eq "<None Specified>");
$rec{$messageid}->{'from'} = $from
@@ -413,10 +442,10 @@
<td class="$rowclass" nowrap="true"><small>
<input name="msgid$retrain_checked_msg_no" type="checkbox"
value="$rclass:$signature">
$retrain</td>
- <td class="$rowclass" nowrap="true"><small>$ctime</td>
- <td class="$rowclass" nowrap="true"><small>$from</td>
- <td class="$rowclass" nowrap="true"><small>$subject</td>
- <td class="$rowclass" nowrap="true"><small>$info</td>
+ <td class="${cl}_row $rowclass" nowrap="true"><small>$ctime</td>
+ <td class="${cl}_row $rowclass" nowrap="true"><small>$from</td>
+ <td class="${cl}_row $rowclass" nowrap="true"><small>$subject</td>
+ <td class="${cl}_row $rowclass" nowrap="true"><small>$info</td>
</tr>
_END
$retrain_checked_msg_no++;
@@ -1025,6 +1054,8 @@
}
$new->{'alert'} = $alert;
+ $new->{Subject}=recode_subj($new->{Subject});
+ $new->{From}=recode_subj($new->{From});
if ($alert) { $rowclass="rowAlert"; }
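
Note that recode_subj above runs its regex once, so it decodes a single encoded-word per call. For comparison, here is a rough Python sketch (an illustration only, not part of the patch) that walks over every encoded-word in a header, drops the whitespace between adjacent encoded-words, and leaves anything it cannot decode untouched. The names ENCODED_WORD and recode_header are made up for this example.

```python
import base64
import quopri
import re

# =?charset?B-or-Q?payload?=  (hypothetical helper pattern for this sketch)
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def recode_header(value: str) -> str:
    """Decode every encoded-word in a header value to Unicode text."""

    def decode_word(match: "re.Match[str]") -> str:
        charset, kind, payload = match.groups()
        try:
            if kind.lower() == "b":
                raw = base64.b64decode(payload)
            else:
                # header=True also turns "_" into a space, as Q encoding requires
                raw = quopri.decodestring(payload.encode("ascii"), header=True)
            return raw.decode(charset)
        except (ValueError, LookupError):
            # Bad payload or unknown charset: keep the original text.
            return match.group(0)

    # Whitespace between two adjacent encoded-words is not part of the text.
    value = re.sub(r"(\?=)\s+(?==\?)", r"\1", value)
    return ENCODED_WORD.sub(decode_word, value)


print(recode_header("Re: =?ISO-8859-1?Q?caf=E9_au_lait?= today"))
```

The same function can be applied to both the From and Subject fields, as the patch does with recode_subj.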
----------------------------------------------------------------------
Comment By: Ion-Mihai "IOnut" Tetcu (itetcu)
Date: 2009-08-02 11:12
Message:
No, it won't make it to 3.9.0, maybe 3.9.1
----------------------------------------------------------------------
Comment By: Stevan Bajic (sbajic)
Date: 2009-07-29 15:13
Message:
And here is PHP code showing how that subject would need to be
converted to display properly in the Web UI:
-----
<?php
echo "<html><head><title>test</title></head><body>\n";
$text = array("WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=",
"SlNIT1BQRVJTGyRCRXFKdUU5ISRLXDx+GyhC",
"GyRCRy5HYz4mSUpFOUQ5P2RBJiEqGyhC");
for ($x=0;$x<count($text);$x++) {
echo mb_convert_encoding(base64_decode($text[$x]), "HTML-ENTITIES",
"ISO-2022-JP") . "<br>\n";
}
echo "</body>\n";
?>
-----
Result:
-----
<html><head><title>test</title></head><body>
[可疑\x{90ae}件] <br>
JSHOPPERS淘宝店,本周<br>
熱買商品店長推薦!<br>
</body>
-----
Kind Regards from Switzerland
Stevan Bajic
----------------------------------------------------------------------
Comment By: Stevan Bajic (sbajic)
Date: 2009-07-29 14:58
Message:
Hello Xueron Nee,
The decoding is right. The whole line is encoded in Base64 and it was
decoded correctly, but the Web UI does not use the ISO-2022-JP code
page.
Try decoding the subjects yourself using this link:
http://www.opinionatedgeek.com/dotnet/tools/Base64Decode/Default.aspx
When decoding just remove the starting "=?ISO-2022-JP?B?" part and the
"?=" at the end of the subject.
To illustrate how the decoding part works you could quickly put the
following lines into a file and call it with the PHP interpreter:
-----
<?php
$text =
array("=?ISO-2022-JP?B?WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=?=",
"=?ISO-2022-JP?B?SlNIT1BQRVJTGyRCRXFKdUU5ISRLXDx+GyhC?=",
"=?ISO-2022-JP?B?GyRCRy5HYz4mSUpFOUQ5P2RBJiEqGyhC?=");
for ($x=0;$x<count($text);$x++) {
$elements = imap_mime_header_decode($text[$x]);
for ($i=0; $i<count($elements); $i++) {
echo "Charset: {$elements[$i]->charset}\n";
echo "Text: {$elements[$i]->text}\n\n";
}
}
?>
-----
The output should be:
-----
Charset: ISO-2022-JP
Text: [2D5?\x{90ae}7o]
Charset: ISO-2022-JP
Text: JSHOPPERSEqJuE9!$K\<~
Charset: ISO-2022-JP
Text: G.Gc>&IJE9D9?dA&!*
-----
PHP does exactly the same decoding as DSPAM. Is that decoding wrong? I
can't believe PHP decodes it wrongly. To display that subject line
correctly in the Web UI, we would need to decode it and then encode it
as HTML-encoded UTF-8 characters. I don't think that will happen for
the v3.9.0 release. Maybe the new Web UI will handle it, but I don't know.
I am going to change this bug into a feature request for the new Web UI.
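
The "decode, then encode for HTML" step described above could look roughly like the following Python sketch. It is only an illustration under the assumption that the standard email.header and html modules are acceptable stand-ins; header_to_html is a made-up name, and the Web UI itself is Perl.

```python
from email.header import decode_header
import html


def header_to_html(raw_header: str) -> str:
    """Decode RFC 2047 encoded-words and return ASCII-only HTML text."""
    parts = []
    for chunk, charset in decode_header(raw_header):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Charset unknown to Python: fall back to a lossless byte mapping.
                chunk = chunk.decode("latin-1")
        parts.append(chunk)
    text = "".join(parts)
    # Escape markup characters first, then turn every non-ASCII character
    # into a numeric character reference such as &#12354;.
    return html.escape(text).encode("ascii", "xmlcharrefreplace").decode("ascii")


subject = (
    "=?ISO-2022-JP?B?WxskQjJENT8bKEJceHs5MGFlfRskQjdvGyhCXSA=?=\n"
    " =?ISO-2022-JP?B?SlNIT1BQRVJTGyRCRXFKdUU5ISRLXDx+GyhC?=\n"
    " =?ISO-2022-JP?B?GyRCRy5HYz4mSUpFOUQ5P2RBJiEqGyhC?="
)
print(header_to_html(subject))
```

Because the output is plain ASCII with numeric references, it should display the same way regardless of which charset the page itself declares.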
Kind Regards from Switzerland
Stevan Bajic
----------------------------------------------------------------------
_______________________________________________
Dspam-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-devel