Re: [naviserver-devel] ns_urldecode -charset
On 30 October 2012 20:39, Stephen Deasey sdea...@gmail.com wrote:

> On Tue, Oct 30, 2012 at 7:59 PM, Stephen Deasey sdea...@gmail.com wrote:
> >
> > But the code points of iso88591 are a subset of utf8...
>
> Actually, this doesn't make sense. The byte encoding of code points
> above 128 uses two bytes for utf8, but only one byte for iso88591.

Yes, this is exactly it, Stephen. We're communicating with an external system which uses iso8859-1, so for the extended character set the results are different.

The external system gets confused if it tries to decode something we encoded using utf-8, e.g.:

nscp 52> ns_urlencode -charset iso8859-1 ú
%fa
nscp 53> ns_urlencode -charset utf-8 ú
%c3%ba
nscp 54> ns_urldecode -charset iso8859-1 %c3%ba
Ãº

Thanks very much, Gustaf, for making this change!

--
Everyone hates slow websites. So do we.
Make your web apps faster with AppDynamics
Download AppDynamics Lite for free today:
http://p.sf.net/sfu/appdyn_sfd2d_oct
___
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel
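The mismatch described in this message can be reproduced outside NaviServer. The following is a minimal sketch in Python, chosen only for illustration since the thread itself works at the Tcl nscp prompt. The standard `urllib.parse.quote` and `urllib.parse.unquote` functions are used as stand-ins for `ns_urlencode -charset` and `ns_urldecode -charset`; they are not NaviServer code, and Python emits uppercase hex digits where the transcript shows lowercase.

```python
from urllib.parse import quote, unquote

# Stand-ins for ns_urlencode/ns_urldecode -charset; not NaviServer code.
char = "\u00fa"  # LATIN SMALL LETTER U WITH ACUTE

latin1_encoded = quote(char, encoding="iso-8859-1")
utf8_encoded = quote(char, encoding="utf-8")

# Decoding with the charset that produced the escape round-trips cleanly.
roundtrip = unquote(latin1_encoded, encoding="iso-8859-1")

# Decoding UTF-8 escapes as ISO-8859-1 yields two characters instead of one.
mismatch = unquote(utf8_encoded, encoding="iso-8859-1")

print(latin1_encoded, utf8_encoded, len(roundtrip), len(mismatch))
```

The mismatched decode does not fail; it silently produces a different string, which is why the receiving system appears "confused" rather than reporting an error.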
Re: [naviserver-devel] ns_urldecode -charset
Dear David,

would the following change help you?
Before I finalize this change (do this on encode as well, add to documentation, etc.), was this omitted on purpose in naviserver?

-gustaf neumann

--- a/nsd/urlencode.c Mon Oct 29 13:46:08 2012 +0100
+++ b/nsd/urlencode.c Tue Oct 30 15:41:06 2012 +0100
@@ -504,8 +504,9 @@
 NsTclUrlDecodeObjCmd(ClientData arg, Tcl_Interp *interp, int objc, Tcl_Obj *CONST objv[])
 {
     Ns_DString ds;
-    char *string;
+    char *string, *charset = NULL;
     int part = 'q';
+    Tcl_Encoding encoding = NULL;
     Ns_ObjvTable parts[] = {
         {"query", 'q'},
@@ -514,7 +515,8 @@
     };
     Ns_ObjvSpec opts[] = {
         {"-part", Ns_ObjvIndex, &part, parts},
-        {"--", Ns_ObjvBreak, NULL, NULL},
+        {"-charset", Ns_ObjvString, &charset, NULL},
+        {"--", Ns_ObjvBreak, NULL, NULL},
         {NULL, NULL, NULL, NULL}
     };
     Ns_ObjvSpec args[] = {
@@ -526,7 +528,11 @@
     }
 
     Ns_DStringInit(&ds);
-    UrlDecode(&ds, string, NULL, part);
+    if (charset) {
+        encoding = Ns_GetUrlEncoding(charset);
+    }
+
+    UrlDecode(&ds, string, encoding, part);
     Tcl_DStringResult(interp, &ds);
 
     return TCL_OK;

On 30.10.12 11:38, David Osborne wrote:
> Hi,
>
> We're currently in the process of porting a fairly large code base from Aolserver to Naviserver for testing (using Naviserver v4.99.4 on Debian Squeeze).
>
> One thing that has come up so far is that ns_urldecode seems to have dropped the -charset switch. I'm assuming it used to be present since some of your documentation mentions it
> (e.g. http://naviserver.sourceforge.net/n/naviserver/files/ns_urldecode.html).
>
> I can't find any mention of why it was dropped.
>
> So in Naviserver, is there an alternative to achieve the following?
>
> nscp 1> ns_urldecode -charset iso8859-1 %FA
> ú
>
> PS. This is not really a devel question - is there a more appropriate place to ask config/user questions?
>
> Thanks in advance.
> - David
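The control flow of the patch (look up an encoding only when a charset name was supplied, then decode with it) can be sketched as follows. This is a hypothetical Python analogue, not NaviServer code: `url_decode` is an invented name, the UTF-8 fallback is an assumption based on the default described later in the thread, and the `-part` handling of the real command is not modelled.

```python
from typing import Optional
from urllib.parse import unquote_to_bytes


def url_decode(text: str, charset: Optional[str] = None) -> str:
    """Hypothetical analogue of the patched command; not NaviServer code."""
    # No charset given: assume UTF-8 (an assumption taken from the thread).
    encoding = charset if charset is not None else "utf-8"
    # Step 1: turn %XX escapes back into raw bytes.
    raw = unquote_to_bytes(text)
    # Step 2: interpret those bytes in the requested character set.
    return raw.decode(encoding)


print(url_decode("%FA", "iso-8859-1") == "\u00fa")
```

In this sketch a charset that does not match the bytes either raises `UnicodeDecodeError` (for example `%FA` read as UTF-8) or quietly yields different characters (for example `%c3%ba` read as ISO-8859-1), so the caller has to know which charset the sender used.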
Re: [naviserver-devel] ns_urldecode -charset
Hi Gustaf,

Yes, that looks great. Along with the ns_urlencode equivalent, I think it would solve our problem.

Thanks very much for the reply.
- David

On 30 October 2012 14:44, Gustaf Neumann wrote:
> Dear David,
>
> would the following change help you?
> Before I finalize this change (do this on encode as well, add to documentation, etc.), was this omitted on purpose in naviserver?
>
> -gustaf neumann
Re: [naviserver-devel] ns_urldecode -charset
On Tue, Oct 30, 2012 at 10:38 AM, David Osborne da...@qcode.co.uk wrote:
> Hi,
>
> We're currently in the process of porting a fairly large code base from Aolserver to Naviserver for testing (using Naviserver v4.99.4 on Debian Squeeze).
>
> One thing that has come up so far is that ns_urldecode seems to have dropped the -charset switch. I'm assuming it used to be present since some of your documentation mentions it
> (e.g. http://naviserver.sourceforge.net/n/naviserver/files/ns_urldecode.html).
>
> I can't find any mention of why it was dropped.

It was more than 7 years ago so I can't remember the details, but I think there were some other bugs to do with character sets that basically meant forcing everything to be utf-8. Strictly speaking, the -charset switch to ns_urldecode might still be needed, and I think it got removed by mistake, but it's usually not needed:

> So in Naviserver, is there an alternative to achieve the following?
>
> nscp 1> ns_urldecode -charset iso8859-1 %FA
> ú

Here, for example, if you don't pass -charset then naviserver assumes utf8. But the code points of iso88591 are a subset of utf8 (and ascii is a subset of both), so the result is identical. So you should never have to specify iso88591, because I think you can no longer set the notion of a global default character set -- it's always utf8, and then in some places you can specifically choose if really needed.

Where are you getting %FA from? Is it something you're encoding yourself, or are you interacting with another system?

> PS. This is not really a devel question - is there a more appropriate place to ask config/user questions?

This is it.
Re: [naviserver-devel] ns_urldecode -charset
On Tue, Oct 30, 2012 at 7:59 PM, Stephen Deasey sdea...@gmail.com wrote:
>
> But the code points of iso88591 are a subset of utf8...

Actually, this doesn't make sense. The byte encoding of code points above 128 uses two bytes for utf8, but only one byte for iso88591.

Looks like you need -charset.
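The distinction drawn in this correction, same code point but different byte encoding, can be checked directly. The following is a small Python sketch for illustration only; it uses nothing beyond the built-in `str.encode` and `ord`.

```python
# The character has the same code point in both character sets...
char = "\u00fa"
code_point = ord(char)

# ...but the bytes on the wire differ: one byte versus two.
latin1_bytes = char.encode("iso-8859-1")
utf8_bytes = char.encode("utf-8")

# Plain ASCII is byte-identical in both, which hides the problem in most tests.
ascii_same = "a".encode("iso-8859-1") == "a".encode("utf-8")

print(hex(code_point), latin1_bytes.hex(), utf8_bytes.hex(), ascii_same)
# prints: 0xfa fa c3ba True
```

Percent-encoding operates on these bytes, not on code points, which is why `%fa` and `%c3%ba` both stand for the same character and cannot be decoded correctly without knowing the charset.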
Re: [naviserver-devel] ns_urldecode -charset
The changes are committed to bitbucket. See
https://bitbucket.org/naviserver/naviserver/changeset/7b89b89802beebeb3db4a37c77f3d2d63c944494

all the best
-gustaf

On 30.10.12 16:03, David Osborne wrote:
> Hi Gustaf,
>
> Yes, that looks great. Along with the ns_urlencode equivalent I think it would solve our problem.
>
> Thanks very much for the reply.
> - David