Hi Paul,

Comments are inlined below.


I know the reasoning I had when I invented NEW_CHARSETS. The phone claims it supports multiple encodings (for example latin1, koi8-r, and utf8), but does this in-line and with q=0.x weights. The web server then decides to present Russian content in, for example, koi8. But there's <?xml encoding=utf-8?> or whatever in the WML source anyway, and the WML compiler bombs at the stage where it calls libxml. So I am explicitly requesting UTF-8 in that case, then recoding any received page to UTF-8 before feeding it to libxml.


But text/plain ...



------------------------------------------------------------------------

Index: gateway.A9/configure.in
===================================================================
--- gateway.A9.orig/configure.in 2004-11-06 17:36:55.728060912 +0300
+++ gateway.A9/configure.in 2004-11-06 17:46:21.818002160 +0300
@@ -216,6 +216,18 @@
]
)
+AC_MSG_CHECKING([whether to do all wapbox xml processing in utf-8])
+AC_ARG_ENABLE(scharset,
+[ --enable-scharset do all wapbox xml processing in utf-8],
+[
+ if test "$enableval" != yes; then
+ AC_MSG_RESULT(no)
+ else
+ AC_MSG_RESULT(yes)
+ AC_DEFINE(NEW_CHARSETS, 1, [Simplify wapbox charset processing])
+ fi
+])
+
dnl Extra feature checks
dnl GW_HAVE_TYPE_FROM(HDRNAME, TYPE, HAVENAME, DESCRIPTION)
Index: gateway.A9/gw/wap-appl.c
===================================================================
--- gateway.A9.orig/gw/wap-appl.c 2004-11-06 17:41:37.645202992 +0300
+++ gateway.A9/gw/wap-appl.c 2004-11-06 17:46:21.819002008 +0300
@@ -718,6 +718,10 @@
* to handle those charsets for all content types, just WML/XHTML. */
static void add_charset_headers(List *headers) {
+#ifdef NEW_CHARSETS
+ if (!http_charset_accepted(headers, "utf-8"))
+ http_header_add(headers, "Accept-Charset", "utf-8");
+#else
long i, len;
gw_assert(charsets != NULL);
@@ -727,6 +731,7 @@
if (!http_charset_accepted(headers, charset))
http_header_add(headers, "Accept-Charset", charset);
}
+#endif
}

OK, reading this, the logic is as follows:
If there has not been an Accept-Charset header with utf-8, you add it directly. But when I look at how List *charsets is filled, I see in gw/xml_shared.c:77 that UTF-8 is added anyway by the rest of the #else branch. Right? So where is the benefit here?


@@ -1055,11 +1060,29 @@
/* get charset used in content body, default to utf-8 if not present */
if ((charset = find_charset_encoding(content.body)) == NULL)
+#ifdef NEW_CHARSETS
+ if (octstr_len(content.charset) > 0) {
+ charset = octstr_duplicate(content.charset);
+ } else {
+ charset = octstr_imm("UTF-8");
+ }
+#else
charset = octstr_imm("UTF-8");
+#endif

This block sounds reasonable to me even without the #ifdef guards. It means: pick the charset from the XML preamble via find_charset_encoding(); otherwise, check whether the HTTP response header gave a charset back; otherwise pick UTF-8 as the default. Right?
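To make that order of precedence concrete, here is a minimal sketch in plain C. It is not Kannel code: pick_charset() and its two string parameters are hypothetical stand-ins for the result of find_charset_encoding() and for content.charset, and a NULL or empty string stands for "not present".

```c
#include <stddef.h>

/* Hypothetical helper, not part of Kannel: returns the charset to assume
 * for a fetched document. Either argument may be NULL or "" when the
 * corresponding source gave no charset. */
const char *pick_charset(const char *xml_decl_charset,
                         const char *http_charset)
{
    /* 1. encoding named in the XML declaration of the body */
    if (xml_decl_charset != NULL && xml_decl_charset[0] != '\0')
        return xml_decl_charset;

    /* 2. charset given in the HTTP response header */
    if (http_charset != NULL && http_charset[0] != '\0')
        return http_charset;

    /* 3. nothing stated anywhere: fall back to UTF-8 */
    return "UTF-8";
}
```

With this, pick_charset(NULL, "koi8-r") yields "koi8-r", and pick_charset(NULL, NULL) yields "UTF-8".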


/* convert to utf-8 if original charset is not utf-8
 * and device supports it */
+#ifdef NEW_CHARSETS
+ if (octstr_case_compare(charset, octstr_imm("UTF-8")) != 0) {
+ debug("wsp",0,"Converting wml/xhtml from charset <%s> to UTF-8",
+ octstr_get_cstr(charset));
+ if (charset_convert(content.body, octstr_get_cstr(charset), "UTF-8") >= 0) {
+ octstr_destroy(content.charset);
+ content.charset = octstr_create("UTF-8");
+ }
+ }
+#else
if (octstr_case_compare(charset, octstr_imm("UTF-8")) < 0 &&
!http_charset_accepted(device_headers, octstr_get_cstr(charset))) {
if (!http_charset_accepted(device_headers, "UTF-8")) {
@@ -1097,6 +1120,7 @@
}
}
}
+#endif
octstr_destroy(charset);
}

This #ifdef assumes that your device definitely supports UTF-8? How do you guarantee this?

There may be devices that do not.

I think the present logic (inside the #else section of the patch) does it the right way: convert to UTF-8 _only_ when the device has stated via its headers that it supports it. Right?
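As a rough illustration of the rule I mean, here is a small standalone sketch. The names charset_equal(), decide_recode() and the two flags are made up for this mail; in wap-appl.c the flags would come from http_charset_accepted() on the device headers. What to do when the device accepts neither charset is my assumption here (leave the body untouched), since that part of the hunk is not shown above.

```c
#include <ctype.h>

/* Case-insensitive equality for ASCII charset names. */
int charset_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

enum recode_action { RECODE_NONE, RECODE_TO_UTF8 };

/* Hypothetical decision helper, not Kannel API.
 * charset:                charset of the fetched body
 * device_accepts_charset: device listed that charset in Accept-Charset
 * device_accepts_utf8:    device listed UTF-8 in Accept-Charset */
enum recode_action decide_recode(const char *charset,
                                 int device_accepts_charset,
                                 int device_accepts_utf8)
{
    if (charset_equal(charset, "UTF-8"))
        return RECODE_NONE;        /* already UTF-8, nothing to do */
    if (device_accepts_charset)
        return RECODE_NONE;        /* device can take the body as it is */
    if (device_accepts_utf8)
        return RECODE_TO_UTF8;     /* recode only because the device said so */
    return RECODE_NONE;            /* assumption: do not recode blindly */
}
```

The NEW_CHARSETS branch, as I read it, would correspond to returning RECODE_TO_UTF8 for every non-UTF-8 body regardless of the two flags.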


Stipe

mailto:stolj_{at}_wapme.de
-------------------------------------------------------------------
Wapme Systems AG

Vogelsanger Weg 80
40470 Düsseldorf, NRW, Germany

phone: +49.211.74845.0
fax: +49.211.74845.299

mailto:info_{at}_wapme-systems.de
http://www.wapme-systems.de/
-------------------------------------------------------------------


