Everything is being served as UTF-8: the HTTP header says UTF-8, the 
XML declaration says UTF-8, and there's a content-type meta tag 
setting it to UTF-8. I can confirm in Firefox, via both the View > 
Character Encoding menu and Firebug, that the page is being served as 
UTF-8, so even without adding a character-encoding attribute to the 
forms, the data should arrive at the server as UTF-8.
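
As a rough sketch of what "arriving as UTF-8" means on the wire: a form 
posted from a UTF-8 page normally percent-encodes the UTF-8 bytes, and 
the server then has to choose a charset to decode them with. The object 
and method names below are made up for illustration, and the assumption 
that the container falls back to ISO-8859-1 when the request declares no 
charset is taken from the servlet spec as I understand it, not verified 
against Jetty here.

```scala
import java.net.{URLDecoder, URLEncoder}

// Hypothetical helper names; only URLEncoder/URLDecoder are real JDK APIs.
object FormDecodingDemo {
  // Roughly what a browser sends for a form field on a UTF-8 page.
  def encodeAsBrowser(value: String): String =
    URLEncoder.encode(value, "UTF-8")

  // What the server ends up with depends on the charset it assumes.
  def decodeAs(raw: String, charset: String): String =
    URLDecoder.decode(raw, charset)

  def main(args: Array[String]): Unit = {
    val wire = encodeAsBrowser("\u00E7") // c-cedilla
    println(wire)
    // Decoding with the right charset restores the single character.
    println(decodeAs(wire, "UTF-8") == "\u00E7")
    // Decoding with Latin-1 turns the two UTF-8 bytes into two characters.
    println(decodeAs(wire, "ISO-8859-1") == "\u00C3\u00A7")
  }
}
```

If the second comparison matches what shows up in the log, the bytes 
left the browser correctly and the damage happens during decoding.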

Furthermore, I used TextMate to create a file with a c-cedilla in it, 
saved it, reopened it as UTF-8 *to be certain*, then copied and pasted 
the character into my form in Firefox.

Somewhere between clicking on Save and that character showing up in the 
database, it gets interpreted as Latin1 (despite being UTF-8) and then 
converted into the wrong UTF-8 characters.
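
To make that two-step corruption concrete, here is a minimal sketch 
using only JDK string conversions. It assumes the mis-decoding charset 
really is Latin1 (ISO-8859-1), which is my reading of the symptoms 
rather than something I have confirmed; the object and method names are 
illustrative only.

```scala
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}

object DoubleEncodingDemo {
  // Render bytes as space-separated hex so the encodings can be compared.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => "%02X".format(b & 0xFF)).mkString(" ")

  // Step 1: UTF-8 bytes are wrongly read as Latin-1, one char per byte.
  def misdecode(s: String): String =
    new String(s.getBytes(UTF_8), ISO_8859_1)

  def main(args: Array[String]): Unit = {
    val original = "\u00E7" // c-cedilla
    val garbled = misdecode(original)
    println(hex(original.getBytes(UTF_8))) // C3 A7
    // Step 2: the two wrong characters are stored as (valid) UTF-8.
    println(hex(garbled.getBytes(UTF_8)))  // C3 83 C2 A7
  }
}
```

The stored value is valid UTF-8, which is why nothing downstream 
complains: it is simply the UTF-8 encoding of the wrong two characters.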

Is anyone running Tomcat or some other servlet container and can test this?

Chas.

marius d. wrote:
> just out of curiosity are you setting manually in the HTTP header
> 
> Content-Type: text/html; charset=UTF-8
> 
> and it's still broken?
> 
> P.S.
> Sometimes HTTP equiv from HTML header just doesn't do the trick.
> 
> Br's,
> Marius
> 
> On Mar 15, 11:20 pm, Derek Chen-Becker <dchenbec...@gmail.com> wrote:
>> Crapola:
>>
>> http://jira.codehaus.org/browse/JETTY-958
>>
>> I think I've confirmed that this is not lift. I added a non-lift input text
>> element to an existing lift form:
>>
>> <input name="testthis" type="text" />
>>
>> Then I use the following code, which I believe should be getting direct
>> access to Jetty's HttpServletRequest instance:
>>
>> Log.info("testthis = " + (S.request.map({r =>
>> r.request.getParameter("testthis")}) openOr "not found!"))
>>
>> And when I put a cedilla in, I get:
>>
>> INFO - testthis = ç
>>
>> Can you confirm that you're using Jetty? I also tried the flags listed in
>> the JIRA ticket:
>>
>> -Dorg.mortbay.util.URI.charset=utf-8 -Dfile.encoding=UTF-8
>>
>> But they didn't seem to do anything (it didn't crash, though). I'm not sure
>> if I specified those correctly for use with the Maven jetty:run command
>> line:
>>
>> mvn -Djetty.port=9090 -Dorg.mortbay.util.URI.charset=utf-8
>> -Dfile.encoding=UTF-8 jetty:run
>>
>> Anyways, this doesn't look to be Lift's fault. I know that's not a great
>> answer. I'm trying to think of whether there's a clean, simple way to "undo"
>> the bogus transform but I don't know enough about charset handling. One more
>> interesting thing is that if I change my log code to:
>>
>> Log.info("testthis = " + (S.request.map({r => r.request.getCharacterEncoding
>> + r.request.getParameter("testthis")}) openOr "not found!"))
>>
>> I get:
>>
>> INFO - testthis = nullç
>>
>> Which seems to indicate that the character encoding for the POST isn't being
>> set. I tried overriding it:
>>
>> S.request.foreach{ r => r.request.setCharacterEncoding("UTF-8")}
>>
>> and that seems to have absolutely no effect (in fact, I get the same "null"
>> log message).
>>
>> Derek
>>
>> On Sun, Mar 15, 2009 at 3:08 PM, Charles F. Munat <c...@munat.com> wrote:
>>
>>
>>
>>> Marc Boschma wrote:
>>>>> When I use &ccedil; instead, the problem is that it is *not* converted
>>>>> to ç as it goes into the database, and then on the way out the XML
>>>>> interpreter does not recognize it as a character entity reference
>>>>> and so converts the & to &amp;.
>>>> I think this is due to using the standard Scala XML load functions
>>>> rather than the lift XML parser. From memory I don't think the
>>>> standard parser recognises that many named entities. ie. does &#x00E7;
>>>> work instead of &ccedil; ? If so then that is probably what is
>>>> happening on this issue.
>>> &#x00E7; goes into the database unchanged, but comes back out as
>>> &amp;#x00E7;. For that matter, &amp; in the DB comes out as &amp;amp; on
>>> the page.
>>> This is actually fine with me. It means that my users can just type &,
>>> <, > etc. and they will appear on the page that way (rather than being
>>> interpreted as HTML tags). It's safer, too. There is no way for them to
>>> insert HTML, especially script tags.
>>> So really, the only problem I have is that I need to be able to type a ç
>>> and have it still a ç when it gets to the database.
>>> Chas.
> > 
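
On Derek's question about a clean way to "undo" the bogus transform: if 
the parameter really was UTF-8 decoded as Latin1, the conversion is 
reversible, because Latin1 maps every byte to exactly one character. The 
sketch below is a workaround under that assumption, not a fix for the 
container; `UndoLatin1` and `repair` are hypothetical names, and the 
function leaves the input alone when it does not look like mis-decoded 
UTF-8.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{CharacterCodingException, CodingErrorAction}
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}

object UndoLatin1 {
  /** Re-read a string as UTF-8, assuming it was wrongly decoded as Latin-1.
    * Returns the input unchanged if that assumption does not hold. */
  def repair(s: String): String =
    // Characters above U+00FF cannot come from a Latin-1 decode.
    if (s.exists(c => c > 0xFF)) s
    else {
      val decoder = UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT)
      // Recover the original bytes, then decode them strictly as UTF-8.
      try decoder.decode(ByteBuffer.wrap(s.getBytes(ISO_8859_1))).toString
      catch { case _: CharacterCodingException => s } // not valid UTF-8
    }
}
```

Something like `UndoLatin1.repair(r.request.getParameter("testthis"))` 
would be the call site. This is a heuristic: a user who genuinely typed 
the two garbled characters would have them collapsed into one, so it is 
only a stopgap until the request encoding is set correctly.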
