You should not be using request-headers or request-bindings if you
don't want them to be interpreted as UTF-8. The documentation for
web-server/http/bindings explicitly says, "We recommend against their
use, but they are provided for compatibility with old code."
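
As a rough illustration of the alternative, here is a minimal sketch that reads the raw byte values and decodes them as Latin-1 explicitly. The helper names header-value/latin-1 and form-value/latin-1 are made up for this example and are not part of the web server; they assume the header and binding structs from web-server/http/request-structs.

```racket
#lang racket/base
(require web-server/http/request-structs)

;; Hypothetical helper: look up a header by name in a list of raw headers
;; and decode its value as Latin-1. Every byte 0-255 maps to a character,
;; so this cannot raise the "not a well-formed UTF-8 encoding" error.
(define (header-value/latin-1 name heads)
  (define h (headers-assq* name heads))
  (and h (bytes->string/latin-1 (header-value h))))

;; Hypothetical helper: the same idea for form bindings.
;; Returns #f when the binding is missing or is a file upload.
(define (form-value/latin-1 id binds)
  (define b (bindings-assq id binds))
  (and (binding:form? b)
       (bytes->string/latin-1 (binding:form-value b))))
```

In a servlet these would be called with (request-headers/raw req) and (request-bindings/raw req), so the servlet, not the server, chooses the encoding.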

Jay

On Fri, May 6, 2016 at 11:49 AM, Tim Brown <tim.br...@cityc.co.uk> wrote:
> Sorry, Jay; I’ve just tested this and I hit:
>
> Servlet (@ /...) exception:
> bytes->string/utf-8: string is not a well-formed UTF-8 encoding
>   string: #"timmeh \351"
>
>   context...:
>    /usr/local/racket/extra-pkgs/web-server/web-server-lib/web-server/http/bindings.rkt:9:7
>    loop
>    /usr/local/racket-6.5/share/racket/collects/racket/contract/private/arrow-val-first.rkt:357:18
>
>
> This is in web-server/http/bindings.rkt (where I count no fewer than five
> `bytes->string/utf-8`s); and I really do think that it should be
> bytes->string/latin-1 both because it covers all 256 code points AND
> it is what HTTP asks for.
>
> That would fix my issue (I hope).
>
>
>
> Also, looking at byte-upcase / bytes-ci=? in
> web-server-lib/web-server/private/util ; can I make a couple of
> suggestions:
>
> 1. I think Eli points out in the issue that \277 and \276 are not ci=?
>    to each other. I’m not sure of his specific example, because in
>    Latin-1 they are an upside-down "?" and "3/4", which I wouldn’t
>    personally consider ci=?. But further up the character set, I would
>    say that \311 E' and \351 e' ARE ci=?, but only in Latin-1.
>
>    So should there not be a byte-upcase/latin-1 and byte-upcase/ascii-7,
>    and a bytes-ci=?/latin-1 and bytes-ci=?/ascii-7?
>
> 2. Since this is implemented in a web-server / HTTP context (and for the
>    reasons I set out above w.r.t. the bindings); should util.rkt not use
>    bytes-ci=?/latin-1 ?
>
>
> Since I have an ISO-8859-1 table in front of me:
>
> (define (byte-upcase/latin-1 b)
>   (if (or (<= 97 b 122)   ; ascii-7: a-z range
>           (<= 224 b 246)  ; latin-1: a` to o"
>                           ; latin-1: -:- is not the lower case of x
>           (<= 248 b 254)) ; latin-1: o/ to |p
>       (- b 32)            ; 97 - 65 = 32
>       b))
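
For reference, a minimal sketch of how the suggested bytes-ci=?/latin-1 could sit on top of that table. Both names are Tim's proposals, not existing web-server functions, and the block repeats the upcase helper (with the ranges written out in full) only so that it runs on its own.

```racket
#lang racket/base

;; Upper-case a single byte under ISO-8859-1.
;; 247 (division sign) and 255 (y with diaeresis) have no upper-case
;; counterpart 32 places lower, so they are left alone.
(define (byte-upcase/latin-1 b)
  (if (or (<= 97 b 122)    ; a-z
          (<= 224 b 246)   ; a-grave to o-diaeresis
          (<= 248 b 254))  ; o-slash to thorn
      (- b 32)
      b))

;; Case-insensitive comparison of two byte strings, byte by byte,
;; without decoding either of them to a string first.
(define (bytes-ci=?/latin-1 a b)
  (and (= (bytes-length a) (bytes-length b))
       (for/and ([x (in-bytes a)]
                 [y (in-bytes b)])
         (= (byte-upcase/latin-1 x) (byte-upcase/latin-1 y)))))
```

Under this sketch \311 and \351 compare equal, while \276 and \277 do not, which matches the intuition in point 1.
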
>
>
> On 05/05/16 18:46, Jay McCarthy wrote:
>> Hi Tim,
>>
>> I consider this an error. The Web server tries to avoid interpreting
>> anything as UTF-8 unless asked by the servlet. Header comparison
>> incorrectly converted to UTF-8 and I just pushed a fix. Can you verify
>> that it works now with your workload?
>>
>> Jay
>
>
> --
> Tim Brown CEng MBCS <tim.br...@cityc.co.uk>
> ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
>                 City Computing Limited · www.cityc.co.uk
>       City House · Sutton Park Rd · Sutton · Surrey · SM1 2AE · GB
>                 T:+44 20 8770 2110 · F:+44 20 8770 2130
> ────────────────────────────────────────────────────────────────────────
> City Computing Limited registered in London No:1767817.
> Registered Office: City House, Sutton Park Road, Sutton, Surrey, SM1 2AE
> VAT No: GB 918 4680 96



-- 
Jay McCarthy
Associate Professor
PLT @ CS @ UMass Lowell
http://jeapostrophe.github.io

           "Wherefore, be not weary in well-doing,
      for ye are laying the foundation of a great work.
And out of small things proceedeth that which is great."
                          - D&C 64:33
