Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Greg,

On 3/31/22 12:17, Christopher Schultz wrote:
> Greg,
>
> On 3/29/22 13:41, gelo1234 wrote:
>> Have you also tried HTMLT or XHTMLT Serializers?
>> Default HTMLSerializer cannot handle some unicode characters:
>> https://issues.apache.org/jira/browse/SLING-5973?attachmentOrder=asc
>
> Hmm. Are the HTMLT / XHTMLT serializers built-in? I have disabled all
> blocks during the build, so I'm just using Cocoon core.


I tried using a view. It's not perfect, but what I ended up with is
Cocoon dumping out the XML exactly as the generator produced it, and
the US flag is already broken at that point.

So the flag is definitely not being broken by the convoluted pipeline.

I'll try to put together an SSCCE[1]

-chris

[1] http://sscce.org/


Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Cédric Damioli

Hi,

To help isolate the issue, could you test with a simpler pipeline made of
only a generator, a single simple XSLT, and the XML serializer?


Cédric


Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Greg,

On 3/31/22 12:13, Christopher Schultz wrote:
> On 3/29/22 13:37, gelo1234 wrote:
>> Hello Chris,
>>
>> I think you will not get any icon-type character on output without
>> using proper font rendering - like Emoji support? Emoji might not be
>> supported by default in Cocoon.
>
> This isn't a font-rendering issue; it's just ... wrong. Either the raw
> character should be output, or the proper set of HTML entities should be
> output. Neither is happening. It's just mojibake somewhere in the pipeline.
>
>> So this might be the reason why you get HTML entities instead of
>> Emoji-icons.
>> Also notice:
>> https://www.mail-archive.com/dev@cocoon.apache.org/msg61629.html
>
> I read that, and was hopeful that 2.1.13 would resolve this issue, but
> it hasn't.
>
> Hmm... strangely, the X-Cocoon-Version header still says 2.1.11. Perhaps
> I didn't upgrade properly...


Yeah, I had Cocoon 2.1.11 as a compile-time dependency which was 
dropping cocoon-2.1.11.jar into the web application along with all the 
other artifacts from the 2.1.13 build. Whoops.


I got that all fixed-up, but the behavior is still the same. I was 
pretty hopeful that was the only thing missing.
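
A stale jar like the one described above can be spotted from inside the
running web application by asking the class loader for every location
that provides a given class. The following is only a sketch: the class
name DuplicateClassFinder and its helper are hypothetical, and the class
to look up is whichever one you know ships in the suspect jar (the
thread does not name one).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Cocoon.
class DuplicateClassFinder {

    /** Returns every classpath location that provides the given class. */
    static List<URL> locationsOf(String className, ClassLoader loader) {
        String resource = className.replace('.', '/') + ".class";
        try {
            return new ArrayList<>(Collections.list(loader.getResources(resource)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: pass the fully qualified name of a class from the suspect jar.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        ClassLoader loader = DuplicateClassFinder.class.getClassLoader();
        for (URL url : locationsOf(className, loader)) {
            System.out.println(url);
        }
    }
}
```

If more than one URL comes back for the same class, two versions are
deployed side by side and the class loader order decides which one wins.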


-chris



-
To unsubscribe, e-mail: users-unsubscr...@cocoon.apache.org
For additional commands, e-mail: users-h...@cocoon.apache.org



Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Greg,

On 3/29/22 13:41, gelo1234 wrote:
> Have you also tried HTMLT or XHTMLT Serializers?
> Default HTMLSerializer cannot handle some unicode characters:
> https://issues.apache.org/jira/browse/SLING-5973?attachmentOrder=asc


Hmm. Are the HTMLT / XHTMLT serializers built-in? I have disabled all 
blocks during the build, so I'm just using Cocoon core.


Thanks,
-chris


Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Greg,

On 3/29/22 13:37, gelo1234 wrote:
> Hello Chris,
>
> I think you will not get any icon-type character on output without using
> proper font rendering - like Emoji support? Emoji might not be supported
> by default in Cocoon.

This isn't a font-rendering issue; it's just ... wrong. Either the raw
character should be output, or the proper set of HTML entities should be
output. Neither is happening. It's just mojibake somewhere in the pipeline.

> So this might be the reason why you get HTML entities instead of
> Emoji-icons.
> Also notice:
> https://www.mail-archive.com/dev@cocoon.apache.org/msg61629.html

I read that, and was hopeful that 2.1.13 would resolve this issue, but
it hasn't.

Hmm... strangely, the X-Cocoon-Version header still says 2.1.11. Perhaps
I didn't upgrade properly...


Thanks,
-chris


Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-31 Thread Christopher Schultz

Cédric,

On 3/29/22 12:52, Cédric Damioli wrote:
> Do you use Xalan as XSLT Processor ?
> If so, I remember https://issues.apache.org/jira/browse/XALANJ-2617
> which could be a cause of your issue.
> I resolved it on my side years ago by compiling my own patched version
> of Xalan.

I'm using whatever Cocoon uses natively. For example, I don't throw in
Jackson or StAX or whatever other options there are.

> For "markers", you may use labels on your sitemap steps associated with
> a cocoon view.

Yeah, that sounds familiar.

Thanks,
-chris



Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-29 Thread gelo1234
Chris,

Have you also tried HTMLT or XHTMLT Serializers?
Default HTMLSerializer cannot handle some unicode characters:
https://issues.apache.org/jira/browse/SLING-5973?attachmentOrder=asc

Greetings,
Greg



Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-29 Thread gelo1234
Hello Chris,

I think you will not get any icon-type character on output without using
proper font rendering - like Emoji support? Emoji might not be supported by
default in Cocoon.
So this might be the reason why you get HTML entities instead of
Emoji-icons.
Also notice:
https://www.mail-archive.com/dev@cocoon.apache.org/msg61629.html

Greetings,
Greg




Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-29 Thread Cédric Damioli

Do you use Xalan as XSLT Processor ?
If so, I remember https://issues.apache.org/jira/browse/XALANJ-2617 
which could be a cause of your issue.
I resolved it on my side years ago by compiling my own patched version 
of Xalan.


For "markers", you may use labels on your sitemap steps associated with 
a cocoon view.


HTH,
Cédric


Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-29 Thread Christopher Schultz

Cédric,

On 3/29/22 12:06, Cédric Damioli wrote:
> Could you provide more details ?
> How is your XML processed before outputting the wrong UTF-8 sequence ?


It's somewhat straightforward:


  [sitemap pipeline markup was stripped by the list archive; only the
  source URL https://source/ survives]

Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-29 Thread Cédric Damioli

Hi Christopher,

Could you provide more details ?
How is your XML processed before outputting the wrong UTF-8 sequence ?

Regards,
Cédric

On 29/03/2022 at 17:48, Christopher Schultz wrote:

All,

I'm still struggling with this. I have upgraded to 2.1.13 which 
includes the fix for https://issues.apache.org/jira/browse/COCOON-2352 
but I'm still getting that American flag converted into those 4 HTML 
entities:

&#55356;&#56826;&#55356;&#56824;

I would expect there to be a single (multibyte) character in the 
output with no HTML entities.


I've double-checked, and the source XML contains the flag as a single 
multi-byte character, served as UTF-8.


Any ideas for how to get this working? I'm sure I could put together a 
trivial test-case.


Thanks,
-chris

On 10/30/18 12:18, Christopher Schultz wrote:

All,

Some additional information at the end.

I created a text file (UTF-8) containing only the flag and read it in
using Java and printed all of the code points. There should be 2
"characters" in the file. It's 4 bytes per UTF-8 character so I
assumed I'd end up with 2 'char' primitives in the file, but I ended
up with more.

Here's the loop and the output:

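 try(java.io.FileReader in = new java.io.FileReader("file.txt"))
 {
   char[] chars = new char[10];

   int count = in.read(chars);

   for(int i=0; i<count; ++i)
   {
     // Reconstructed lines: the archive dropped the text that followed "i<"
     int cp = Character.codePointAt(chars, i, count);
     System.out.println("Code point at " + i + ": U+" + Integer.toHexString(cp));

     // Skip any trailing "characters" that are actually a part of this one
     if(1 < Character.charCount(cp))
       i += Character.charCount(cp) - 1;
   }
 }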

Using the above code is completely encoding-agnostic, because it's
describing the Unicode code point and not some set of bytes in a
particular flavor of UTF-x.

-chris



--
Cédric Damioli
CMS - Java - Open Source
www.ametys.org



Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2022-03-29 Thread Christopher Schultz

All,

I'm still struggling with this. I have upgraded to 2.1.13 which includes 
the fix for https://issues.apache.org/jira/browse/COCOON-2352 but I'm 
still getting that American flag converted into those 4 HTML entities:

&#55356;&#56826;&#55356;&#56824;

I would expect there to be a single (multibyte) character in the output 
with no HTML entities.


I've double-checked, and the source XML contains the flag as a single 
multi-byte character, served as UTF-8.


Any ideas for how to get this working? I'm sure I could put together a 
trivial test-case.


Thanks,
-chris


Re: Getting UTF-16 encoding on dynamic content regardless of output content type

2018-10-30 Thread Christopher Schultz

All,

Some additional information at the end.

I created a text file (UTF-8) containing only the flag and read it in
using Java and printed all of the code points. There should be 2
"characters" in the file. It's 4 bytes per UTF-8 character so I
assumed I'd end up with 2 'char' primitives in the file, but I ended
up with more.

Here's the loop and the output:

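try(java.io.FileReader in = new java.io.FileReader("file.txt"))
{
  char[] chars = new char[10];

  int count = in.read(chars);

  for(int i=0; i<count; ++i)
  {
    // Reconstructed lines: the archive dropped the text that followed "i<"
    int cp = Character.codePointAt(chars, i, count);
    System.out.println("Code point at " + i + ": U+" + Integer.toHexString(cp));

    // Skip any trailing "characters" that are actually a part of this one
    if(1 < Character.charCount(cp))
      i += Character.charCount(cp) - 1;
  }
}

Using the above code is completely encoding-agnostic, because it's
describing the Unicode code point and not some set of bytes in a
particular flavor of UTF-x.

-chris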
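
The inspection loop in this message can also be written without any
manual surrogate skipping, because String.codePoints() already walks the
text one code point at a time. This is only a sketch of the same idea;
the class name CodePointDump is made up for illustration, and the flag
is written with escapes so the source file encoding cannot interfere.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class CodePointDump {

    /** Lists each Unicode code point of the text as U+XXXX, one entry per code point. */
    static List<String> describe(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp -> out.add(String.format("U+%04X", cp)));
        return out;
    }

    public static void main(String[] args) {
        // Regional indicators U and S, written as UTF-16 surrogate escapes.
        String flag = "\uD83C\uDDFA\uD83C\uDDF8";
        System.out.println(flag.length());   // 4 'char' primitives
        System.out.println(describe(flag));  // [U+1F1FA, U+1F1F8]
    }
}
```

That is why the file yields more than 2 'char' primitives: each of the
two code points occupies two UTF-16 code units in a Java String.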



Getting UTF-16 encoding on dynamic content regardless of output content type

2018-10-30 Thread Christopher Schultz

All,

I'm attempting to do everything with UTF-8 in Cocoon 2.1.11. I have a
servlet generating XML in UTF-8 encoding and I have a pipeline with a
few transforms in it, ultimately serializing to XHTML.

If I have a Unicode character in the XML which is outside of the BMP,
such as this one: 🇺🇸  (that's an American flag, in case your mail
reader doesn't render it correctly), then I end up getting a series of
bytes coming from Cocoon after the transform that look like UTF-16.

Here's what's in the XML:

Test🇺🇸

Just like that. The bytes in the message for the flag character are:

f0  9f  87  ba  f0  9f  87  b8
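
As a cross-check of those bytes, here is a small sketch that decodes
them as UTF-8 and counts what Java sees. The class name FlagBytes is
hypothetical; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class FlagBytes {

    // The eight bytes quoted in the message.
    static final byte[] FLAG_UTF8 = {
        (byte) 0xF0, (byte) 0x9F, (byte) 0x87, (byte) 0xBA,
        (byte) 0xF0, (byte) 0x9F, (byte) 0x87, (byte) 0xB8
    };

    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String flag = decode(FLAG_UTF8);
        System.out.println(flag.length());                          // 4 UTF-16 code units
        System.out.println(flag.codePointCount(0, flag.length()));  // 2 code points
        System.out.println(Integer.toHexString(flag.codePointAt(0))); // 1f1fa
        System.out.println(Integer.toHexString(flag.codePointAt(2))); // 1f1f8
    }
}
```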

When rendering that into XHTML, I'm getting this in the output:

Test&#55356;&#56826;&#55356;&#56824;

The American flag in Unicode reference can be found here:
https://apps.timwhitlock.info/unicode/inspect?s=%F0%9F%87%BA%F0%9F%87%B8

You can see it broken down a bit better here for "Regional U":
http://www.fileformat.info/info/unicode/char/1f1fa/index.htm

and "Regional S":
http://www.fileformat.info/info/unicode/char/1f1f8/index.htm

What's happening is that some component in Cocoon has decided to
generate HTML entities instead of just emitting the character. That's
okay IMO. But what it does doesn't make sense for a UTF-8 output
encoding.

The first two entities "&#55356;&#56826;" are the decimal numbers that
represent the UTF-16 surrogate pair for that "Regional Indicator Symbol
Letter U" and they are correct... for UTF-16. If I change the output
encoding from UTF-8 to UTF-16, then the browser will render these
correctly. Using UTF-8, they show as four of those ugly [?] characters
on the screen.
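
The arithmetic behind those two numbers can be sketched as follows. The
class name SurrogateMath is made up for illustration; the formula is the
standard UTF-16 split of a code point above U+FFFF into a high and a low
surrogate.

```java
// Hypothetical helper for illustration only.
class SurrogateMath {

    /** Splits a supplementary code point into its UTF-16 high and low surrogate values. */
    static int[] surrogates(int codePoint) {
        int offset = codePoint - 0x10000;      // 20 significant bits remain
        int high = 0xD800 + (offset >> 10);    // top 10 bits
        int low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
        return new int[] { high, low };
    }

    public static void main(String[] args) {
        int[] u = surrogates(0x1F1FA); // REGIONAL INDICATOR SYMBOL LETTER U
        int[] s = surrogates(0x1F1F8); // REGIONAL INDICATOR SYMBOL LETTER S
        System.out.println(u[0] + " " + u[1]); // 55356 56826
        System.out.println(s[0] + " " + s[1]); // 55356 56824
    }
}
```

The JDK exposes the same split as Character.highSurrogate(int) and
Character.lowSurrogate(int).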

I had originally just decided to throw up my hands and use UTF-16
encoding even though it's dumb. But it seems that MSIE cannot be
convinced to use UTF-16 no matter what, and I must continue to support
MSIE. :(

So it's back to UTF-8 for me.

How can I get Cocoon to output that character (or "those characters")
correctly?

It needs to be one of the following:

&#127482;&#127480; (HTML decimal entities)
&#x1f1fa;&#x1f1f8; (HTML hex entities)
f0  9f  87  ba  f0  9f  87  b8 (raw UTF-8 bytes)
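
For reference, the three acceptable forms listed above can be derived
from the same string in plain Java. This is a sketch with a hypothetical
class name (FlagForms); it says nothing about how Cocoon or its
serializers produce output.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class FlagForms {

    /** One decimal numeric character reference per code point. */
    static String decimalEntities(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append("&#").append(cp).append(';'));
        return sb.toString();
    }

    /** One hexadecimal numeric character reference per code point. */
    static String hexEntities(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append("&#x").append(Integer.toHexString(cp)).append(';'));
        return sb.toString();
    }

    /** The raw UTF-8 bytes as space-separated lowercase hex. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String flag = "\uD83C\uDDFA\uD83C\uDDF8";
        System.out.println(decimalEntities(flag)); // &#127482;&#127480;
        System.out.println(hexEntities(flag));     // &#x1f1fa;&#x1f1f8;
        System.out.println(utf8Hex(flag));         // f0 9f 87 ba f0 9f 87 b8
    }
}
```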

Does anyone know how/where this conversion is being performed in
Cocoon? Probably in an XHTML serializer (I'm using
org.apache.cocoon.serialization.XMLSerializer). I'm using mime-type
"text/html" and UTF-8 in my sitemap for that
serializer (the one named "xhtml"). I believe I've made very few
changes from the default, if any.

I haven't yet figured out how to get from what Java sees (\uE50C for
the "S" for example) to &#127480;, but knowing where the code is that
is making that decision would be very helpful.
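
The difference between the output seen here and the output wanted comes
down to whether an escaper walks the string by 'char' or by code point.
The sketch below shows both side by side. It is an illustration under
that assumption only: the class EntityEscaper is hypothetical and is not
the code that Cocoon or its XSLT processor actually runs.

```java
// Hypothetical helper for illustration only.
class EntityEscaper {

    /** Escapes per UTF-16 code unit: reproduces the broken surrogate-pair entities. */
    static String escapePerChar(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append("&#").append((int) c).append(';');
            }
        }
        return sb.toString();
    }

    /** Escapes per Unicode code point: a surrogate pair becomes a single entity. */
    static String escapePerCodePoint(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp < 0x80) {
                sb.appendCodePoint(cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "Test\uD83C\uDDFA\uD83C\uDDF8";
        System.out.println(escapePerChar(text));      // Test&#55356;&#56826;&#55356;&#56824;
        System.out.println(escapePerCodePoint(text)); // Test&#127482;&#127480;
    }
}
```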

Any ideas?

-chris



Re: CForms Character Encoding

2012-12-04 Thread Francesco Chicchiriccò

On 04/12/2012 13:08, Peter Sparkes wrote:
> On 04/12/2012 09:15, Peter Sparkes wrote:
>> On 04/12/2012 08:46, Francesco Chicchiriccò wrote:
>>> On 04/12/2012 09:40, Peter Sparkes wrote:
>>>> I am using C.2.11
>>>>
>>>> I have a CForm implementation and there are, in the xml text,
>>>> special characters such as:
>>>>
>>>> £, Â, ⅗ and â
>>>>
>>>> I am using UTF-8 encoding and such characters in the xml file are
>>>> correctly displayed in the CForm and when the form is saved they
>>>> are correctly saved in the xml file.
>>>>
>>>> However, I am building another CForm, within the same Cocoon
>>>> application, this time using the JXTemplate Generator and I am
>>>> having encoding problems; the above characters are not correctly
>>>> saved in the xml file.
>>>>
>>>> In the sitemap I have:
>>>>
>>>> I suspect the locale setting but do not know how to set it to UTF-8
>>>
>>> Hi Peter,
>>> did you take a look at [1] and [2] (for C2.2 but most concepts apply
>>> to C2.1 as well)?
>>>
>>> Regards.
>>>
>>> [1] http://wiki.apache.org/cocoon/RequestParameterEncoding
>>> [2] http://cocoon.apache.org/2.2/1366_1_1.html
>>
>> Hi Francesco,
>>
>> No, but I will now
>>
>> Thank you
>>
>> Peter
>
> Hi Francesco,
>
> All working now
>
> Thank you again

You're welcome :-)

Regards.

--
Francesco Chicchiriccò

ASF Member, Apache Syncope PMC chair, Apache Cocoon PMC Member
http://people.apache.org/~ilgrosso/
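
The thread does not say what the actual fix was. Assuming it was the
common case where the browser submits the form as UTF-8 but the request
parameters get decoded as ISO-8859-1, the damage looks like the sketch
below. The class name RequestEncodingDemo is hypothetical and no Cocoon
or servlet API is involved.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class RequestEncodingDemo {

    /** Decodes bytes that are really UTF-8 as if they were ISO-8859-1. */
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Reverses the damage; works because ISO-8859-1 maps every byte to one char. */
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String pound = "\u00A3";                  // the pound sign
        String garbled = misdecode(pound);
        System.out.println(garbled.length());     // 2: the pound sign became two characters
        System.out.println(repair(garbled).equals(pound)); // true
    }
}
```

Repairing after the fact is a workaround at best; making the container
and the form agree on UTF-8 up front avoids the mismatch altogether.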



Re: CForms Character Encoding

2012-12-04 Thread Peter Sparkes


Hi Francesco,

All working now

Thank you again

Peter




Re: CForms Character Encoding

2012-12-04 Thread Peter Sparkes

On 04/12/2012 08:46, Francesco Chicchiriccò wrote:

On 04/12/2012 09:40, Peter Sparkes wrote:

I am using C.2.11

I have a CForm implementation and there are, in the xml text, special 
characters such as:

£, Â, ⅗ and â

I am using UTF-8 encoding and such characters in the xml file are correctly displayed in the 
CForm and when the form is saved they are correctly saved in the xml file.


However, I am building another CForm, within the same Cocoon application, this time using the 
JXTemplate Generator and I am having encoding problems; the above characters are not correctly 
saved in the xml file.


In the sitemap I have:





I suspect the locale setting but do not know how to set it to UTF-8


Hi Peter,
did you take a look at [1] and [2] (for C2.2 but most concepts apply to C2.1 as 
well)?

Regards.

[1] http://wiki.apache.org/cocoon/RequestParameterEncoding
[2] http://cocoon.apache.org/2.2/1366_1_1.html


Hi Francesco,

No, but I will now

Thank you

Peter




Re: CForms Character Encoding

2012-12-04 Thread Francesco Chicchiriccò

On 04/12/2012 09:40, Peter Sparkes wrote:

I am using C.2.11

I have a CForm implementation and there are, in the xml text, special 
characters such as:


£, Â, ⅗ and â

I am using UTF-8 encoding and such characters in the xml file are 
correctly displayed in the CForm and when the form is saved they are 
correctly saved in the xml file.


However, I am building another CForm, within the same Cocoon 
application, this time using the JXTemplate Generator and I am having 
encoding problems; the above characters are not correctly saved in the 
xml file.


In the sitemap I have:





I suspect the locale setting but do not know how to set it to UTF-8


Hi Peter,
did you take a look at [1] and [2] (for C2.2 but most concepts apply to 
C2.1 as well)?


Regards.

[1] http://wiki.apache.org/cocoon/RequestParameterEncoding
[2] http://cocoon.apache.org/2.2/1366_1_1.html

--
Francesco Chicchiriccò

ASF Member, Apache Syncope PMC chair, Apache Cocoon PMC Member
http://people.apache.org/~ilgrosso/





CForms Character Encoding

2012-12-04 Thread Peter Sparkes

I am using C.2.11

I have a CForm implementation and there are, in the xml text, special 
characters such as:

£, Â, ⅗ and â

I am using UTF-8 encoding and such characters in the xml file are correctly displayed in the CForm 
and when the form is saved they are correctly saved in the xml file.


However, I am building another CForm, within the same Cocoon application, this time using the 
JXTemplate Generator and I am having encoding problems; the above characters are not correctly saved 
in the xml file.


In the sitemap I have:





I suspect the locale setting but do not know how to set it to UTF-8

Help please

Peter








Re: more encoding problems

2012-09-20 Thread Jos Snellings
Just wanted to have a look.
I compared it with what I have, and it is the same. No luck.

On Thu, Sep 20, 2012 at 3:18 PM,  wrote:

>
> if you are referring to this part, it's like this:
> 
>   container-encoding
>   ISO-8859-1
> 
>
> 
> 
>   form-encoding
>   UTF-8
> 
>
> Or some else setting you are interested in?
> This works home and this works in another remote server. I also copied the
> java from working server to this, but nada..
> Tomcat is different, server.xml clone.
>
>
>
> On Thu, 20 Sep 2012 15:05:33 +0200, Jos Snellings <
> jos.snelli...@upperware.biz> wrote:
>
>> What does your web.xml look like?
>>
>> On Thu, Sep 20, 2012 at 2:51 PM,  wrote:
>>
>>
>>  Also XML-element tags in xsp-files have question marks instead of
>> scands.
>>  I also verified that the database is delivering the right stuff. The
>> same database used in this
>> (http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [2]), works
>> with local copy of cocoon.
>>
>>  Updated data, just checked this out. Even this one will fail.
>>  sitemap:
>>
>>  koe.xsp:
>>
>>
>>  Works locally.
>>
>>  - mika -
>>
>>  On Thu, 20 Sep 2012 15:14:35 +0300,  wrote:
>>
>>   Hi Jos,
>>  yep, that's right, the text seems not to be utf-8.
>>  Same problem occurs in another page where the text not working comes
>>  from flowscript.
>>  1) No it's not, it's a copy. But you can verify everything is ok in
>>  http://88.148.163.59/cocoon/palaute_app/linkki/1059 [5] (database
>> reader).
>>  Seems to me, that the text falls apart in the text generator.
>>  2) the url is in private network
>>
>>  - mika -
>>
>>  On Thu, 20 Sep 2012 11:42:32 +0200, Jos Snellings
>>   wrote:
>>   Hi Mika,
>>
>>  Your page declares utf-8 as a character set, but the text you send
>> is
>>  not in utf-8. Hence the question mark
>>  characters.
>>  Question 1: is it the same database that is accessed? (or rather, a
>>  copy)
>>  Question 2: what is the database url?
>>
>>  Cheers,
>>  Jos
>>
>>  On Thu, Sep 20, 2012 at 9:22 AM,  wrote:
>>
>>   Hi,
>>   C2.11, I moved my app from one server to another, which resulted
>>  some pages to broke that is scands won't encode properly.
>>   I have this in my sitemap:
>>
>>   You can try this out:
>>   Old server
>>   http://77.240.23.91/cocoon/palaute_app/linkki/841 [7] [2] works
>>   http://77.240.23.91/cocoon/palaute_app/linkki/html/841 [8] [3] works
>>   New server
>>   http://88.148.163.59/cocoon/palaute_app/linkki/1059 [9] [4] works
>>   http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [10] [5] doesn't work !!
>>
>>   Any thoughts?
>>   I have tried a lot of things without success.
>>
>>   - mika -
>>
>>
>>
>>  --
>>  The doctrine of human equality reposes on this: that there is no man
>>  really clever who has not found that he is stupid.
>>  -- Gilbert K. Chesterson
>>
>>  Links:
>>  --
>>  [1] mailto:m...@digikartta.net [13]
>>  [2] http://77.240.23.91/cocoon/palaute_app/linkki/841 [14]
>>  [3] http://77.240.23.91/cocoon/palaute_app/linkki/html/841 [15]
>>  [4] http://88.148.163.59/cocoon/palaute_app/linkki/1059 [16]
>>  [5] http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [17]
>>  [6] mailto:users-unsubscribe@cocoon.apache.org [18]
>>  [7] mailto:u

Re: more encoding problems

2012-09-20 Thread mika


if you are referring to this part, it's like this:

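<init-param>
  <param-name>container-encoding</param-name>
  <param-value>ISO-8859-1</param-value>
</init-param>

<init-param>
  <param-name>form-encoding</param-name>
  <param-value>UTF-8</param-value>
</init-param>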


Or is there some other setting you are interested in?
This works at home and it works on another remote server. I also copied 
Java from the working server to this one, but nada..

Tomcat is a different install, but server.xml is a clone.
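
[Editorial sketch, not part of the original message.] The two symptoms discussed in this thread (a literal question mark versus "some other wrong" characters) usually come from two different byte-level mistakes. The snippet below only illustrates those two mechanisms, in Python for brevity; nothing in it is Cocoon-specific, and the variable names are made up for the illustration. Whether the new server really runs with a different default charset (for example a JVM started without `-Dfile.encoding=UTF-8` under a POSIX locale) is an assumption that the thread does not confirm.

```python
# Illustration only: two common ways non-ASCII text gets damaged.
sample = "Äiti"  # sample word containing a Scandinavian character

# Case 1: the text is forced through a charset that cannot represent 'Ä'
# (US-ASCII here). The character is lost and replaced by a literal '?'.
lossy = sample.encode("ascii", errors="replace").decode("ascii")

# Case 2: UTF-8 bytes are read back as ISO-8859-1. Nothing is lost, but
# each non-ASCII character turns into two wrong characters.
mojibake = sample.encode("utf-8").decode("iso-8859-1")

print(repr(lossy))
print(repr(mojibake))
```

Case 1 cannot be repaired afterwards because the original byte is gone; case 2 can be undone by reversing the wrong decoding step, which is why it helps to know which of the two a page is showing.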


On Thu, 20 Sep 2012 15:05:33 +0200, Jos Snellings 
 wrote:

What does your web.xml look like?

On Thu, Sep 20, 2012 at 2:51 PM,  wrote:

 Also XML-element tags in xsp-files have question marks instead of
scands.
 I also verified that the database is delivering the right stuff. The
same database used in this
(http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [2]), works
with local copy of cocoon.

 Updated data, just checked this out. Even this one will fail.
 sitemap:

 koe.xsp:

 Works locally.

 - mika -

 On Thu, 20 Sep 2012 15:14:35 +0300,  wrote:
  Hi Jos,
 yep, that's right, the text seems not to be utf-8.
 Same problem occurs in another page where the text not working comes
 from flowscript.
 1) No it's not, it's a copy. But you can verify everything is ok in
 http://88.148.163.59/cocoon/palaute_app/linkki/1059 [5] (database
reader).
 Seems to me, that the text falls apart in the text generator.
 2) the url is in private network

 - mika -

 On Thu, 20 Sep 2012 11:42:32 +0200, Jos Snellings
  wrote:
  Hi Mika,

 Your page declares utf-8 as a character set, but the text you send
is
 not in utf-8. Hence the question mark
 characters.
 Question 1: is it the same database that is accessed? (or rather, a
 copy)
 Question 2: what is the database url?

 Cheers,
 Jos

 On Thu, Sep 20, 2012 at 9:22 AM,  wrote:

  Hi,
  C2.11, I moved my app from one server to another, which resulted
 some pages to broke that is scands won't encode properly.
  I have this in my sitemap:

  You can try this out:
  Old server
  http://77.240.23.91/cocoon/palaute_app/linkki/841 [7] [2] works
  http://77.240.23.91/cocoon/palaute_app/linkki/html/841 [8] [3]
works
  New server
  http://88.148.163.59/cocoon/palaute_app/linkki/1059 [9] [4] works
  http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [10] [5]
doesn't
 work !!

  Any thoughts?
  I have tried a lot of things without success.

  - mika -


-
  To unsubscribe, e-mail: users-unsubscr...@cocoon.apache.org [11]
[6]
  For additional commands, e-mail: users-h...@cocoon.apache.org [12]
[7]

 --
 The doctrine of human equality reposes on this: that there is no man
 really clever who has not found that he is stupid.
         -- Gilbert K. Chesterson

 Links:
 --
 [1] mailto:m...@digikartta.net [13]
 [2] http://77.240.23.91/cocoon/palaute_app/linkki/841 [14]
 [3] http://77.240.23.91/cocoon/palaute_app/linkki/html/841 [15]
 [4] http://88.148.163.59/cocoon/palaute_app/linkki/1059 [16]
 [5] http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [17]
 [6] mailto:users-unsubscr...@cocoon.apache.org [18]
 [7] mailto:users-h...@cocoon.apache.org [19]



--
The doctrine of human equality reposes on this: that there is no man
really clever who has not found that he is stupid.
        -- Gilbert K. Chesterson



Links:
--
[1] mailto:m...@digikartta.net
[2] http://88.148.163.59/cocoon/palaute_app/linkki/html/1059
[3] http://apache.org/xsp
[4] mailto:m...@digikartta.net
[5] http://88.148.163.59/cocoon/palaute_app/linkki/1059
[6] mailto:jos.snelli...@upperware.biz
[7] http://77.240.23.91/cocoon/palaute_app/linkki/841
[8] http://77.240.23.91/cocoon/palaute_app/linkki/html/841
[9] http://88.148.163.59/cocoon/palaute_app/linkki/1059
[10] http://88.148.163.59/cocoon/palaute_app/linkki/html/1059
[11] mailto:users-unsubscr...@cocoon.apache.org
[12] mailto:users-h...@cocoon.apache.org
[13] mailto:m...@digikartta.net
[14] http://77.240.23.91/cocoon/palaute_app/linkki/841
[15] http://77.240.23.91/cocoon/palaute_app/linkki/html/841
[16] http://88.148.163.59/cocoon/palaute_app/linkki/1059
[17] http://88.148.163.59/cocoon/palaute_app/linkki/html/1059
[18] mailto:users-unsubscr...@cocoon.apache.org
[19] mailto:users-h...@cocoon.apache.org
[20] mailto:users-unsubscr...@cocoon.apache.org
[21] mailto:users-h...@cocoon.apache.org
[22] mailto:users-unsubscr...@cocoon.apache.org
[23] mailto:users-h...@cocoon.apache.org






Re: more encoding problems

2012-09-20 Thread Jos Snellings
What does your web.xml look like?

On Thu, Sep 20, 2012 at 2:51 PM,  wrote:

>
> Also XML-element tags in xsp-files have question marks instead of scands.
> I also verified that the database is delivering the right stuff. The same
> database used in this (http://88.148.163.59/cocoon/palaute_app/linkki/html/1059),
> works with local copy of cocoon.
>
> Updated data, just checked this out. Even this one will fail.
> sitemap:
> 
> 
> 
> 
> 
>
> koe.xsp:
> 
>xmlns:xsp="http://apache.org/**xsp ">
> 
> <Äiti>
> 
> 
>
> Works locally.
>
> - mika -
>
>
> On Thu, 20 Sep 2012 15:14:35 +0300,  wrote:
>
>> Hi Jos,
>> yep, that's right, the text seems not to be utf-8.
>> Same problem occurs in another page where the text not working comes
>> from flowscript.
>> 1) No it's not, it's a copy. But you can verify everything is ok in
>> http://88.148.163.59/cocoon/palaute_app/linkki/1059 (database reader).
>> Seems to me, that the text falls apart in the text generator.
>> 2) the url is in private network
>>
>> - mika -
>>
>> On Thu, 20 Sep 2012 11:42:32 +0200, Jos Snellings
>>  wrote:
>>
>>> Hi Mika,
>>>
>>> Your page declares utf-8 as a character set, but the text you send is
>>> not in utf-8. Hence the question mark
>>> characters.
>>> Question 1: is it the same database that is accessed? (or rather, a
>>> copy)
>>> Question 2: what is the database url?
>>>
>>> Cheers,
>>> Jos
>>>
>>> On Thu, Sep 20, 2012 at 9:22 AM,  wrote:
>>>
>>>  Hi,
>>>  C2.11, I moved my app from one server to another, which resulted
>>> some pages to broke that is scands won't encode properly.
>>>  I have this in my sitemap:
>>>
>>>  You can try this out:
>>>  Old server
>>>  http://77.240.23.91/cocoon/palaute_app/linkki/841 [2] works
>>>  http://77.240.23.91/cocoon/palaute_app/linkki/html/841 [3] works
>>>  New server
>>>  http://88.148.163.59/cocoon/palaute_app/linkki/1059 [4] works
>>>  http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [5] doesn't work !!
>>>
>>>  Any thoughts?
>>>  I have tried a lot of things without success.
>>>
>>>  - mika -
>>>
>>>
>>>
>>>
>>> --
>>> The doctrine of human equality reposes on this: that there is no man
>>> really clever who has not found that he is stupid.
>>> -- Gilbert K. Chesterson
>>>
>>>
>>>
>>> Links:
>>> --
>>> [1] mailto:m...@digikartta.net
>>> [2] http://77.240.23.91/cocoon/palaute_app/linkki/841
>>> [3] http://77.240.23.91/cocoon/palaute_app/linkki/html/841
>>> [4] http://88.148.163.59/cocoon/palaute_app/linkki/1059
>>> [5] http://88.148.163.59/cocoon/palaute_app/linkki/html/1059
>>> [6] mailto:users-unsubscribe@cocoon.apache.org
>>> [7] mailto:users-help@cocoon.apache.org
>>>


-- 
The doctrine of human equality reposes on this: that there is no man
really clever who has not found that he is stupid.
-- Gilbert K. Chesterton


Re: more encoding problems

2012-09-20 Thread mika


Also XML-element tags in xsp-files have question marks instead of 
scands.
I also verified that the database is delivering the right stuff. The 
same database used in this 
(http://88.148.163.59/cocoon/palaute_app/linkki/html/1059), works with 
local copy of cocoon.


Updated data, just checked this out. Even this one will fail.
sitemap:






koe.xsp:

http://apache.org/xsp";>

<Äiti>



Works locally.

- mika -

On Thu, 20 Sep 2012 15:14:35 +0300,  wrote:

Hi Jos,
yep, that's right, the text seems not to be utf-8.
Same problem occurs in another page where the text not working comes
from flowscript.
1) No it's not, it's a copy. But you can verify everything is ok in
http://88.148.163.59/cocoon/palaute_app/linkki/1059 (database 
reader).

Seems to me, that the text falls apart in the text generator.
2) the url is in private network

- mika -

On Thu, 20 Sep 2012 11:42:32 +0200, Jos Snellings
 wrote:

Hi Mika,

Your page declares utf-8 as a character set, but the text you send 
is

not in utf-8. Hence the question mark
characters.
Question 1: is it the same database that is accessed? (or rather, a
copy)
Question 2: what is the database url?

Cheers,
Jos

On Thu, Sep 20, 2012 at 9:22 AM,  wrote:

 Hi,
 C2.11, I moved my app from one server to another, which resulted
some pages to broke that is scands won't encode properly.
 I have this in my sitemap:

 You can try this out:
 Old server
 http://77.240.23.91/cocoon/palaute_app/linkki/841 [2] works
 http://77.240.23.91/cocoon/palaute_app/linkki/html/841 [3] works
 New server
 http://88.148.163.59/cocoon/palaute_app/linkki/1059 [4] works
 http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [5] 
doesn't

work !!

 Any thoughts?
 I have tried a lot of things without success.

 - mika -



-
 To unsubscribe, e-mail: users-unsubscr...@cocoon.apache.org [6]
 For additional commands, e-mail: users-h...@cocoon.apache.org [7]

--
The doctrine of human equality reposes on this: that there is no man
really clever who has not found that he is stupid.
        -- Gilbert K. Chesterson



Links:
--
[1] mailto:m...@digikartta.net
[2] http://77.240.23.91/cocoon/palaute_app/linkki/841
[3] http://77.240.23.91/cocoon/palaute_app/linkki/html/841
[4] http://88.148.163.59/cocoon/palaute_app/linkki/1059
[5] http://88.148.163.59/cocoon/palaute_app/linkki/html/1059
[6] mailto:users-unsubscr...@cocoon.apache.org
[7] mailto:users-h...@cocoon.apache.org






Re: more encoding problems

2012-09-20 Thread mika


Hi Jos,
yep, that's right, the text seems not to be utf-8.
Same problem occurs in another page where the text not working comes 
from flowscript.
1) No it's not, it's a copy. But you can verify everything is ok in 
http://88.148.163.59/cocoon/palaute_app/linkki/1059 (database reader). 
Seems to me, that the text falls apart in the text generator.

2) the url is in private network

- mika -

On Thu, 20 Sep 2012 11:42:32 +0200, Jos Snellings 
 wrote:

Hi Mika,

Your page declares utf-8 as a character set, but the text you send is
not in utf-8. Hence the question mark
characters.
Question 1: is it the same database that is accessed? (or rather, a
copy)
Question 2: what is the database url?

Cheers,
Jos

On Thu, Sep 20, 2012 at 9:22 AM,  wrote:

 Hi,
 C2.11, I moved my app from one server to another, which resulted
some pages to broke that is scands won't encode properly.
 I have this in my sitemap:

 You can try this out:
 Old server
 http://77.240.23.91/cocoon/palaute_app/linkki/841 [2] works
 http://77.240.23.91/cocoon/palaute_app/linkki/html/841 [3] works
 New server
 http://88.148.163.59/cocoon/palaute_app/linkki/1059 [4] works
 http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 [5] doesn't
work !!

 Any thoughts?
 I have tried a lot of things without success.

 - mika -


-
 To unsubscribe, e-mail: users-unsubscr...@cocoon.apache.org [6]
 For additional commands, e-mail: users-h...@cocoon.apache.org [7]

--
The doctrine of human equality reposes on this: that there is no man
really clever who has not found that he is stupid.
        -- Gilbert K. Chesterson



Links:
--
[1] mailto:m...@digikartta.net
[2] http://77.240.23.91/cocoon/palaute_app/linkki/841
[3] http://77.240.23.91/cocoon/palaute_app/linkki/html/841
[4] http://88.148.163.59/cocoon/palaute_app/linkki/1059
[5] http://88.148.163.59/cocoon/palaute_app/linkki/html/1059
[6] mailto:users-unsubscr...@cocoon.apache.org
[7] mailto:users-h...@cocoon.apache.org






Re: more encoding problems

2012-09-20 Thread Jos Snellings
Hi Mika,

Your page declares utf-8 as a character set, but the text you send is not
in utf-8. Hence the question mark
characters.
Question 1: is it the same database that is accessed? (or rather, a copy)
Question 2: what is the database url?
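
[Editorial sketch, not part of the original message.] The point above, that text which is not UTF-8 is being served under a UTF-8 declaration, can be reproduced in a few lines. This is a minimal illustration in Python, assuming the actual bytes are ISO-8859-1 (the thread does not establish which encoding they really are); the sample word and variable names are invented for the example.

```python
# Illustration only: ISO-8859-1 bytes served under a UTF-8 declaration.
sample = "Äiti"  # sample word containing a Scandinavian character

# What a component writing ISO-8859-1 would put on the wire.
latin1_bytes = sample.encode("iso-8859-1")

# What a browser sees when it trusts a "charset=utf-8" declaration:
# the byte 0xC4 does not start a valid UTF-8 sequence here, so it is
# replaced by U+FFFD, which most browsers draw as a question-mark glyph.
as_seen_by_browser = latin1_bytes.decode("utf-8", errors="replace")

print(repr(as_seen_by_browser))
```

Comparing the raw response bytes from the working and the failing server (for instance with a hex dump) would show directly whether the character is sent as one byte or as a two-byte UTF-8 sequence.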

Cheers,
Jos

On Thu, Sep 20, 2012 at 9:22 AM,  wrote:

>
> Hi,
> C2.11, I moved my app from one server to another, which resulted some
> pages to broke that is scands won't encode properly.
> I have this in my sitemap:
>
> 
> 
> 
> 
> 
> 
> 
>
> 
> 
> 
> 
> 
> 
> 
> 
> 
>
> You can try this out:
> Old server
> http://77.240.23.91/cocoon/palaute_app/linkki/841 works
> http://77.240.23.91/cocoon/palaute_app/linkki/html/841 works
> New server
> http://88.148.163.59/cocoon/palaute_app/linkki/1059 works
> http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 doesn't work !!
>
> Any thoughts?
> I have tried a lot of things without success.
>
> - mika -
>


-- 
The doctrine of human equality reposes on this: that there is no man
really clever who has not found that he is stupid.
-- Gilbert K. Chesterton


RE: more encoding problems

2012-09-20 Thread mika


Wrong call,
that ISO-8859-1 was just a test to ensure this setting would have some 
meaning. It has, but the scands disappear before serialization, I think.


- mika -


On Thu, 20 Sep 2012 10:15:16 +0200, Robby Pelssers 
 wrote:

Can you check which serializer you're using and did you explicitly
set the encoding?

http://cocoon.apache.org/2.1/userdocs/default/html-serializer.html
http://cocoon.apache.org/2.1/userdocs/xhtml-serializer.html





-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net]
Sent: Thursday, September 20, 2012 10:11 AM
To: users@cocoon.apache.org
Subject: RE: more encoding problems


 Yep,
 I noticed that too. But where it is coming from?
 Remember, the whole cocoon directory is a copy from old to new.
 I also copied e.g. Tomcat server.xml same way.

 - mika -


 On Thu, 20 Sep 2012 10:03:46 +0200, Robby Pelssers
  wrote:

The one that does not work has following encoding in 



Robby

-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net]
Sent: Thursday, September 20, 2012 9:23 AM
To: users@cocoon.apache.org
Subject: more encoding problems


 Hi,
 C2.11, I moved my app from one server to another, which resulted
some
 pages to broke that is scands won't encode properly.
 I have this in my sitemap:

 
 
 
 
 
 
 

 
 
 
 
 
 
 
 
 

 You can try this out:
 Old server
 http://77.240.23.91/cocoon/palaute_app/linkki/841 works
 http://77.240.23.91/cocoon/palaute_app/linkki/html/841 works
 New server
 http://88.148.163.59/cocoon/palaute_app/linkki/1059 works
 http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 doesn't
work
 !!

 Any thoughts?
 I have tried a lot of things without success.

 - mika -





RE: more encoding problems

2012-09-20 Thread Robby Pelssers
Can you check which serializer you're using and did you explicitly set the 
encoding?

http://cocoon.apache.org/2.1/userdocs/default/html-serializer.html
http://cocoon.apache.org/2.1/userdocs/xhtml-serializer.html





-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net] 
Sent: Thursday, September 20, 2012 10:11 AM
To: users@cocoon.apache.org
Subject: RE: more encoding problems


 Yep,
 I noticed that too. But where it is coming from?
 Remember, the whole cocoon directory is a copy from old to new.
 I also copied e.g. Tomcat server.xml same way.

 - mika -


 On Thu, 20 Sep 2012 10:03:46 +0200, Robby Pelssers 
  wrote:
> The one that does not work has following encoding in 
>
>  http-equiv="Content-Type">
>
> Robby
>
> -Original Message-
> From: m...@digikartta.net [mailto:m...@digikartta.net]
> Sent: Thursday, September 20, 2012 9:23 AM
> To: users@cocoon.apache.org
> Subject: more encoding problems
>
>
>  Hi,
>  C2.11, I moved my app from one server to another, which resulted 
> some
>  pages to broke that is scands won't encode properly.
>  I have this in my sitemap:
>
>  
>  
>  
>  
>  
>  
>  
>
>  
>  
>  
>  
>  
>  
>  
>  
>  
>
>  You can try this out:
>  Old server
>  http://77.240.23.91/cocoon/palaute_app/linkki/841 works
>  http://77.240.23.91/cocoon/palaute_app/linkki/html/841 works
>  New server
>  http://88.148.163.59/cocoon/palaute_app/linkki/1059 works
>  http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 doesn't 
> work
>  !!
>
>  Any thoughts?
>  I have tried a lot of things without success.
>
>  - mika -
>



RE: more encoding problems

2012-09-20 Thread mika


Yep,
I noticed that too. But where is it coming from?
Remember, the whole cocoon directory is a copy from the old server to the new one.
I also copied e.g. the Tomcat server.xml the same way.

- mika -


On Thu, 20 Sep 2012 10:03:46 +0200, Robby Pelssers 
 wrote:

The one that does not work has following encoding in 

http-equiv="Content-Type">


Robby

-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net]
Sent: Thursday, September 20, 2012 9:23 AM
To: users@cocoon.apache.org
Subject: more encoding problems


 Hi,
 C2.11, I moved my app from one server to another, which resulted 
some

 pages to broke that is scands won't encode properly.
 I have this in my sitemap:

 
 
 
 
 
 
 

 
 
 
 
 
 
 
 
 

 You can try this out:
 Old server
 http://77.240.23.91/cocoon/palaute_app/linkki/841 works
 http://77.240.23.91/cocoon/palaute_app/linkki/html/841 works
 New server
 http://88.148.163.59/cocoon/palaute_app/linkki/1059 works
 http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 doesn't 
work

 !!

 Any thoughts?
 I have tried a lot of things without success.

 - mika -




RE: more encoding problems

2012-09-20 Thread Robby Pelssers
The one that does not work has following encoding in 



Robby

-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net] 
Sent: Thursday, September 20, 2012 9:23 AM
To: users@cocoon.apache.org
Subject: more encoding problems


 Hi,
 C2.11, I moved my app from one server to another, which resulted some 
 pages to broke that is scands won't encode properly.
 I have this in my sitemap:

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

 You can try this out:
 Old server
 http://77.240.23.91/cocoon/palaute_app/linkki/841 works
 http://77.240.23.91/cocoon/palaute_app/linkki/html/841 works
 New server
 http://88.148.163.59/cocoon/palaute_app/linkki/1059 works
 http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 doesn't work 
 !!

 Any thoughts?
 I have tried a lot of things without success.

 - mika -




more encoding problems

2012-09-20 Thread mika


Hi,
C2.11, I moved my app from one server to another, which caused some 
pages to break: scands (Scandinavian characters) no longer encode properly.

I have this in my sitemap:



















You can try this out:
Old server
http://77.240.23.91/cocoon/palaute_app/linkki/841 works
http://77.240.23.91/cocoon/palaute_app/linkki/html/841 works
New server
http://88.148.163.59/cocoon/palaute_app/linkki/1059 works
http://88.148.163.59/cocoon/palaute_app/linkki/html/1059 doesn't work 
!!


Any thoughts?
I have tried a lot of things without success.

- mika -




RE: encoding issue

2012-08-31 Thread mika


Ok,
could it have something to do with those locale settings?
Just a wild guess, because I really don't know how to use them.

Thanks anyway,
- mika-

P.S. Those settings (org.apache.cocoon.containerencoding=utf-8 and 
org.apache.cocoon.formencoding=utf-8) are, BTW, in web.xml in C2.1.

Their only effect is that the question mark turns into a "monkey head".
All my pages are UTF-8 encoded (the files themselves) and carry UTF-8 tags. 
I have to check those XSLTs as well.



On Fri, 31 Aug 2012 10:24:57 +0200, Robby Pelssers 
 wrote:

Problem is I'm not using C2.1.x anymore so it's really hard to
properly help you out here.

I know that for C2.2 we have to set 2 properties:
org.apache.cocoon.containerencoding=utf-8
org.apache.cocoon.formencoding=utf-8


As a side note: Check encoding in your xslt's

  indent="yes"/>


Robby

-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net]
Sent: Friday, August 31, 2012 9:33 AM
To: users@cocoon.apache.org
Subject: RE: encoding issue


 Hi,
 yep, if I got you right, problems appear somewhere between the 
submit

 and the outcome of the bind, at server side. Container is Tomcat 6.
 Something like this:
 flowscript:
 form.createBinding("cocoon:/form_" + id + "_bind");
 form.save(doc);
 cocoon.sendPage("vastaus-display-pipeline.jx",
     {title: "blaah, blaah..", document: doc, id: id});

 sitemap:
 (this is for the dynamic binding..)
 
 
 
 
 
 

 and:

 

  value="{flow-attribute:locale}"/>




  value="{flow-attribute:locale}"/>



  


  
  
  value="{flow-attr:locale}"/>



  value="{flow-attribute:locale}"/>



  
 
   
   

 
 
   
   
 
   
 
  
  

  

 

 and the #{$document} in the jx-template has lost scands

 - mika -


 On Fri, 31 Aug 2012 09:19:04 +0200, Robby Pelssers
  wrote:

Hi Mika,

Some questions:
- are you having problems submitting forms where the data is not
received server side as UTF-8?
- what application container are you using?  Tomcat, Jetty, ...

Robby

-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net]
Sent: Wednesday, August 29, 2012 12:37 PM
To: users@cocoon.apache.org
Subject: encoding issue


 Hi.

 C2.11 and CForms. I lose scands (Scandinavian characters) after binding;
 they are all replaced by a question mark. I have read every site I could
 find to resolve this, but without luck. Everything is UTF-8, except
 container-encoding, which is iso-8859-1. Changing the container to utf-8
 turns the question marks into other wrong characters and causes other
 problems.
 What next? Checking the source code?

 - mika -


-
To unsubscribe, e-mail: users-unsubscr...@cocoon.apache.org
For additional commands, e-mail: users-h...@cocoon.apache.org






RE: encoding issue

2012-08-31 Thread Robby Pelssers
Problem is I'm not using C2.1.x anymore so it's really hard to properly help 
you out here.

I know that for C2.2 we have to set 2 properties:
org.apache.cocoon.containerencoding=utf-8
org.apache.cocoon.formencoding=utf-8


As a side note: Check encoding in your xslt's

  

Robby

-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net] 
Sent: Friday, August 31, 2012 9:33 AM
To: users@cocoon.apache.org
Subject: RE: encoding issue


 Hi,
 yep, if I got you right, problems appear somewhere between the submit 
 and the outcome of the bind, at server side. Container is Tomcat 6.
 Something like this:
 flowscript:
 form.createBinding("cocoon:/form_" + id + "_bind");
 form.save(doc);
 cocoon.sendPage("vastaus-display-pipeline.jx",
     {title: "blaah, blaah..", document: doc, id: id});

 sitemap:
 (this is for the dynamic binding..)
 
 
 
 
 
 

 and:

 

  



  


  


  
  
  


  


  
 
   
   
 
 
   
   
 
   
 
  
  

  

 

 and the #{$document} in the jx-template has lost scands

 - mika -


 On Fri, 31 Aug 2012 09:19:04 +0200, Robby Pelssers 
  wrote:
> Hi Mika,
>
> Some questions:
> - are you having problems submitting forms where the data is not
> received server side as UTF-8?
> - what application container are you using?  Tomcat, Jetty, ...
>
> Robby
>
> -Original Message-
> From: m...@digikartta.net [mailto:m...@digikartta.net]
> Sent: Wednesday, August 29, 2012 12:37 PM
> To: users@cocoon.apache.org
> Subject: encoding issue
>
>
>  Hi.
>
>  C2.11 and CForms. I lose scands (Scandinavian characters) after binding;
>  they are all replaced by a question mark. I have read every site I could
>  find to resolve this, but without luck. Everything is UTF-8, except
>  container-encoding, which is iso-8859-1. Changing the container to utf-8
>  turns the question marks into other wrong characters and causes other
>  problems.
>  What next? Checking the source code?
>
>  - mika -
>





RE: encoding issue

2012-08-31 Thread mika


Hi,
yep, if I got you right, problems appear somewhere between the submit 
and the outcome of the bind, at server side. Container is Tomcat 6.

Something like this:
flowscript:
form.createBinding("cocoon:/form_" + id + "_bind");
form.save(doc);
cocoon.sendPage("vastaus-display-pipeline.jx",
    {title: "blaah, blaah..", document: doc, id: id});


sitemap:
(this is for the dynamic binding..)







and:


   label="content1">

 
   
   

   
 
   
   
 
   
   
 value="{request:contextPath}/_cocoon/resources"/>

 
 
   
   
 
   
   
 


  value="dojo.transport"/>

  
src="resource://org/apache/cocoon/forms/resources/IframeTransport-bu-styling.xsl"/>


  
  

  

 
 
   
 
   


and the #{$document} in the jx-template has lost scands

- mika -


On Fri, 31 Aug 2012 09:19:04 +0200, Robby Pelssers 
 wrote:

Hi Mika,

Some questions:
- are you having problems submitting forms where the data is not
received server side as UTF-8?
- what application container are you using?  Tomcat, Jetty, ...

Robby

-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net]
Sent: Wednesday, August 29, 2012 12:37 PM
To: users@cocoon.apache.org
Subject: encoding issue


 Hi.

 C2.11 and CForms. I lose scands (Scandinavian characters) after binding;
 they are all replaced by a question mark. I have read every site I could
 find to resolve this, but without luck. Everything is UTF-8, except
 container-encoding, which is iso-8859-1. Changing the container to utf-8
 turns the question marks into other wrong characters and causes other
 problems.
 What next? Checking the source code?

 - mika -




RE: encoding issue

2012-08-31 Thread Robby Pelssers
Hi Mika,

Some questions:
- are you having problems submitting forms where the data is not received 
server side as UTF-8?
- what application container are you using?  Tomcat, Jetty, ...

Robby

-Original Message-
From: m...@digikartta.net [mailto:m...@digikartta.net] 
Sent: Wednesday, August 29, 2012 12:37 PM
To: users@cocoon.apache.org
Subject: encoding issue


 Hi.

 C2.11 and CForms. I lose scands (Scandinavian characters) after binding;
 they are all replaced by a question mark. I have read every site I could
 find to resolve this, but without luck. Everything is UTF-8, except
 container-encoding, which is iso-8859-1. Changing the container to utf-8
 turns the question marks into other wrong characters and causes other
 problems.
 What next? Checking the source code?

 - mika -




encoding issue

2012-08-29 Thread mika


Hi.

C2.11 and CForms. I lose scands (Scandinavian characters) after binding;
they are all replaced by a question mark. I have read every site I could
find to resolve this, but without luck. Everything is UTF-8, except
container-encoding, which is iso-8859-1. Changing the container to utf-8
turns the question marks into other wrong characters and causes other
problems.

What next? Checking the source code?
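
One way to picture how a container/form encoding mismatch can yield both symptoms is the sketch below. It assumes, and this is not confirmed anywhere in the thread, that request parameters get re-decoded by turning the container-decoded string back into bytes with the container encoding and decoding those bytes with the form encoding. The class name RecodeSketch and the method recode are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RecodeSketch {
    // Hypothetical helper: undo the container's decoding, then decode
    // the recovered bytes with the form encoding.
    static String recode(String value, Charset containerEncoding, Charset formEncoding) {
        return new String(value.getBytes(containerEncoding), formEncoding);
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        Charset utf8 = StandardCharsets.UTF_8;

        // Browser sends a-umlaut as UTF-8 (bytes C3 A4); a Latin-1 container
        // turns those two bytes into two separate characters.
        String seenByContainer = new String("\u00e4".getBytes(utf8), latin1);

        // Settings match what really happened: the character is recovered.
        System.out.println(recode(seenByContainer, latin1, utf8).equals("\u00e4"));

        // The container had already decoded correctly, but the value is
        // re-decoded anyway: the single byte E4 is not valid UTF-8 and
        // becomes the replacement character.
        System.out.println(recode("\u00e4", latin1, utf8).equals("\uFFFD"));
    }
}
```

If this assumption holds, the two settings only work when they describe what the container really did with the bytes; changing one of them in isolation simply moves the damage somewhere else.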

- mika -




RE: issue with form encoding C2.2

2012-06-06 Thread Robby Pelssers
I did find a workaround, by the way.

If you add an extra request parameter called cocoon-form-encoding and set it to
utf-8, it will work.

Snippet from RequestProcessor.java:

protected Environment getEnvironment(String uri,
                                     HttpServletRequest req,
                                     HttpServletResponse res)
throws Exception {

    // A cocoon-form-encoding request parameter, if present, takes
    // precedence over the configured form encoding.
    String formEncoding = req.getParameter("cocoon-form-encoding");
    if (formEncoding == null) {
        formEncoding = this.settings.getFormEncoding();
    }

    HttpEnvironment env;
    env = new HttpEnvironment(uri,
                              req,
                              res,
                              this.servletContext,
                              this.environmentContext,
                              this.containerEncoding,
                              formEncoding);
    return env;
}

From: Robby Pelssers [mailto:robby.pelss...@nxp.com]
Sent: Wednesday, June 06, 2012 1:10 PM
To: d...@cocoon.apache.org; users@cocoon.apache.org
Subject: issue with form encoding C2.2

Hi all,

Just wanted to have a short discussion on an issue that I wasted quite some 
hours on.  Let me first explain that I configured my cocoon block with 
following two properties as per http://cocoon.apache.org/2.2/1366_1_1.html :

org.apache.cocoon.containerencoding=UTF-8
org.apache.cocoon.formencoding=UTF-8


Recently I created a form showing pre-populated data from an xquery.  One form 
field contained the Ohm Ω character and the browser rendered it fine.  But I 
had to post the data back to the server and the Ohm sign got corrupted.

From firebug I could see following:
descriptiveTitle N-channel 25 V 2.85 mΩ logic level MOSFET in LFPAK using 
NextPower technology
magCode R73
specificationStatus  Product

From flowscript:
descriptiveTitle=N-channel 25 V 2.85 mΩ logic level MOSFET in LFPAK using 
NextPower technology
specificationStatus=Product
magCode=R73



So next I started looking at cocoon sources using URLDecoder and I had a 
suspicion that NetUtils might be responsible for the issue. But that didn’t 
seem to be the issue.  I managed to find out that the value I defined for 
formencoding is not actually used.  The reason is that it also is set in 
cocoon-core and either it doesn’t get overwritten or the property from 
cocoon-core is overwriting my own property value.

nxp10009@NXL01262 /c/development/workspaces/cocoon22/trunk/core
$ find . -name *.properties | xargs grep "formencoding"
./cocoon-core/src/main/resources/META-INF/cocoon/properties/core.properties:org.apache.cocoon.formencoding=ISO-8859-1

So what’s the best way to fix this?




issue with form encoding C2.2

2012-06-06 Thread Robby Pelssers
Hi all,

Just wanted to have a short discussion on an issue that I wasted quite some 
hours on.  Let me first explain that I configured my cocoon block with 
following two properties as per http://cocoon.apache.org/2.2/1366_1_1.html :

org.apache.cocoon.containerencoding=UTF-8
org.apache.cocoon.formencoding=UTF-8


Recently I created a form showing pre-populated data from an xquery.  One form 
field contained the Ohm Ω character and the browser rendered it fine.  But I 
had to post the data back to the server and the Ohm sign got corrupted.

From firebug I could see following:
descriptiveTitle N-channel 25 V 2.85 mΩ logic level MOSFET in LFPAK using 
NextPower technology
magCode R73
specificationStatus  Product

From flowscript:
descriptiveTitle=N-channel 25 V 2.85 mΩ logic level MOSFET in LFPAK using 
NextPower technology
specificationStatus=Product
magCode=R73



So next I started looking at cocoon sources using URLDecoder and I had a 
suspicion that NetUtils might be responsible for the issue. But that didn’t 
seem to be the issue.  I managed to find out that the value I defined for 
formencoding is not actually used.  The reason is that it also is set in 
cocoon-core and either it doesn’t get overwritten or the property from 
cocoon-core is overwriting my own property value.

nxp10009@NXL01262 /c/development/workspaces/cocoon22/trunk/core
$ find . -name *.properties | xargs grep "formencoding"
./cocoon-core/src/main/resources/META-INF/cocoon/properties/core.properties:org.apache.cocoon.formencoding=ISO-8859-1
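
A minimal sketch of what an ISO-8859-1 form encoding does to a value the browser posted as UTF-8. It only illustrates the byte-level effect with the standard library; the class FormEncodingMismatch and its method are hypothetical and are not Cocoon code.

```java
import java.nio.charset.StandardCharsets;

class FormEncodingMismatch {
    // Take the UTF-8 bytes the browser would send for a form value and
    // decode them as ISO-8859-1, as a misconfigured server would.
    static String decodeUtf8BytesAsLatin1(String value) {
        byte[] wire = value.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "2.85 m" followed by the Ohm sign (U+03A9, UTF-8 bytes CE A9)
        String posted = "2.85 m\u03a9";
        String received = decodeUtf8BytesAsLatin1(posted);

        // The single Ohm sign arrives as two Latin-1 characters.
        System.out.println(posted.length() + " -> " + received.length());
        System.out.println(received.endsWith("\u00ce\u00a9"));
    }
}
```

Plain ASCII survives unchanged, which is why a wrong form encoding tends to go unnoticed until a character such as the Ohm sign shows up.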

So what’s the best way to fix this?




Re: XML-> PDF bad encoding

2011-12-05 Thread FunkyDisco

Hi,
Maybe you haven't noticed, or it was somehow missed, but in my first post
I wrote:
"My conf is Win 2008 R2 x64 (Windows XP SP3 x86) Win 1250 CP, Tomcat
6.0.33, Cocoon 2.1.11."
So this is my configuration.

To reproduce the problem, just use the file from the original posting:
C:\Program Files\Tomcat\webapps\cocoon\samples\blocks\fop\misc\minimal.fo.xml
in which you place the letters:
đšžćčŠĐŽĆČ
and
C:\Program Files\Tomcat\webapps\cocoon\samples\blocks\fop\sitemap.xmap
which together make up the complete setup.
Rg,
Damir


Andy Stevens-2 wrote:
> 
> It might help if you could include the relevant pipeline code and
> component
> configuration that you're using. And which cocoon version it's for.
> 
> Andy.
>  On 4 Dec 2011 12:02, "FunkyDisco"  wrote:
> 
>>
>> Apologies in advance if this is explained somewhere, but as I'm no Cocoon
>> guru, after 5 days of Google searching and testing all possible combinations,
>> I haven't found a solution. So here is a brief description of the problem.
>>
>> When I test file generation from a web URL (source->output):
>> XML->XML correct (simple copy test to see if it reads OK and writes OK)
>> XML->RTF correct
>> XML->PDF problem (national characters appear as "#")
>>
>> I have read "http://xmlgraphics.apache.org/fop/faq.html#pdf-characters",
>> chapter 6.2, which shows exactly what I have, but when I implement what it
>> describes some national characters (ĐđČčĆć in Unicode notation) remain as "#" signs.
>>
>> Is this really a bug, or is there a solution? Once more, apologies
>> for any dumb entries in this issue, but I'm pretty lost in this case.
>> Regards,
>> Funky
>> --
>> View this message in context:
>> http://old.nabble.com/XML-%3E-PDF-bad-encoding-tp32911752p32911752.html
>> Sent from the Cocoon - Users mailing list archive at Nabble.com.
>>
>>
>>
>>
> 
> 

-- 
View this message in context: 
http://old.nabble.com/XML-%3E-PDF-bad-encoding-tp32911752p32918279.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





RE: XML-> PDF bad encoding

2011-12-05 Thread FunkyDisco

Correct. This applies to many non-ASCII characters, but in Croatian the most
important ones are those mentioned.
The font (mentioned in the inline description) is "Arial Unicode MS" (frankly,
I tried many others), and they all give correct output from MS Word, MS WordPad,
or any other Windows app that I know of. I'm not sure if this answers
your question.
What I'd like to see is one XML file with the mentioned characters (ĐŠŽĆČđšžćč)
that can be correctly reproduced in PDF output.
Rg,
Damir

Nathaniel, Alfred wrote:
> 
> Does that apply to all non-ASCII characters, or only to those you mention
> (which are not part of Latin-1)?
> Did you check that the font description to give to FOP have glyphs for
> these specific codes?
> 
> HTH, Alfred.
> 
> -Original Message-
> From: FunkyDisco [mailto:funky_disco_fr...@hotmail.com] 
> Sent: Sonntag, 4. Dezember 2011 13:02
> To: users@cocoon.apache.org
> Subject: XML-> PDF bad encoding
> 
> 
> Apologies in advance if this is explained somewhere, but as I'm no Cocoon
> guru, after 5 days of Google searching and testing all possible combinations,
> I haven't found a solution. So here is a brief description of the problem.
> 
> When I test file generation from a web URL (source->output):
> XML->XML correct (simple copy test to see if it reads OK and writes OK)
> XML->RTF correct
> XML->PDF problem (national characters appear as "#")
> 
> I have read "http://xmlgraphics.apache.org/fop/faq.html#pdf-characters",
> chapter 6.2, which shows exactly what I have, but when I implement what it
> describes some national characters (ĐđČčĆć in Unicode notation) remain as "#" signs.
> 
> Is this really a bug, or is there a solution? Once more, apologies
> for any dumb entries in this issue, but I'm pretty lost in this case.
> Regards, 
> Funky 
> -- 
> View this message in context:
> http://old.nabble.com/XML-%3E-PDF-bad-encoding-tp32911752p32911752.html
> Sent from the Cocoon - Users mailing list archive at Nabble.com.
> 
> 
> 
> 
> The content of this e-mail is intended only for the confidential use of
> the person addressed. 
> If you are not the intended recipient, please notify the sender and delete
> this e-mail immediately.
> Thank you.
> 
> 

-- 
View this message in context: 
http://old.nabble.com/XML-%3E-PDF-bad-encoding-tp32911752p32918069.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: XML-> PDF bad encoding

2011-12-04 Thread Andy Stevens
It might help if you could include the relevant pipeline code and component
configuration that you're using. And which cocoon version it's for.

Andy.
 On 4 Dec 2011 12:02, "FunkyDisco"  wrote:

>
> Apologies in advance if this is explained somewhere, but as I'm no Cocoon
> guru, after 5 days of Google searching and testing all possible combinations,
> I haven't found a solution. So here is a brief description of the problem.
>
> When I test file generation from a web URL (source->output):
> XML->XML correct (simple copy test to see if it reads OK and writes OK)
> XML->RTF correct
> XML->PDF problem (national characters appear as "#")
>
> I have read "http://xmlgraphics.apache.org/fop/faq.html#pdf-characters",
> chapter 6.2, which shows exactly what I have, but when I implement what it
> describes some national characters (ĐđČčĆć in Unicode notation) remain as "#" signs.
>
> Is this really a bug, or is there a solution? Once more, apologies
> for any dumb entries in this issue, but I'm pretty lost in this case.
> Regards,
> Funky
> --
> View this message in context:
> http://old.nabble.com/XML-%3E-PDF-bad-encoding-tp32911752p32911752.html
> Sent from the Cocoon - Users mailing list archive at Nabble.com.
>
>
>
>


RE: XML-> PDF bad encoding

2011-12-04 Thread Nathaniel, Alfred
Does that apply to all non-ASCII characters, or only to those you mention 
(which are not part of Latin-1)?
Did you check that the font you configure for FOP has glyphs for these 
specific code points?
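
The Latin-1 part of that question can be checked mechanically. The sketch below uses only the standard CharsetEncoder API to list the characters of a string that a given charset cannot represent; the class name Latin1Check and the method notEncodable are made up for illustration. It says nothing about whether a particular font has glyphs for them, which is a separate check.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class Latin1Check {
    // Return the characters of text that the given charset cannot encode.
    static String notEncodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder missing = new StringBuilder();
        text.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (!encoder.canEncode(ch)) {
                missing.append(ch);
            }
        });
        return missing.toString();
    }

    public static void main(String[] args) {
        // The Croatian letters from the thread, written as escapes so the
        // example does not depend on the source file encoding.
        String croatian = "\u0110\u0111\u010c\u010d\u0106\u0107";
        String western = "\u00e9\u00e4\u00f6";

        // All six Croatian letters fall outside ISO-8859-1 ...
        System.out.println(notEncodable(croatian, StandardCharsets.ISO_8859_1).length());
        // ... while the western accented letters are all representable.
        System.out.println(notEncodable(western, StandardCharsets.ISO_8859_1).isEmpty());
    }
}
```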

HTH, Alfred.

-Original Message-
From: FunkyDisco [mailto:funky_disco_fr...@hotmail.com] 
Sent: Sonntag, 4. Dezember 2011 13:02
To: users@cocoon.apache.org
Subject: XML-> PDF bad encoding


Apologies in advance if this is explained somewhere, but as I'm no Cocoon
guru, after 5 days of Google searching and testing all possible combinations,
I haven't found a solution. So here is a brief description of the problem.

When I test file generation from a web URL (source->output):
XML->XML correct (simple copy test to see if it reads OK and writes OK)
XML->RTF correct
XML->PDF problem (national characters appear as "#")

I have read "http://xmlgraphics.apache.org/fop/faq.html#pdf-characters",
chapter 6.2, which shows exactly what I have, but when I implement what it
describes some national characters (ĐđČčĆć in Unicode notation) remain as "#" signs.

Is this really a bug, or is there a solution? Once more, apologies
for any dumb entries in this issue, but I'm pretty lost in this case.
Regards, 
Funky 
-- 
View this message in context: 
http://old.nabble.com/XML-%3E-PDF-bad-encoding-tp32911752p32911752.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.




The content of this e-mail is intended only for the confidential use of the 
person addressed. 
If you are not the intended recipient, please notify the sender and delete this 
e-mail immediately.
Thank you.


XML-> PDF bad encoding

2011-12-04 Thread FunkyDisco

Apologies in advance if this is explained somewhere, but as I'm no Cocoon
guru, after 5 days of Google searching and testing all possible combinations,
I haven't found a solution. So here is a brief description of the problem.

When I test file generation from a web URL (source->output):
XML->XML correct (simple copy test to see if it reads OK and writes OK)
XML->RTF correct
XML->PDF problem (national characters appear as "#")

I have read "http://xmlgraphics.apache.org/fop/faq.html#pdf-characters",
chapter 6.2, which shows exactly what I have, but when I implement what it
describes some national characters (ĐđČčĆć in Unicode notation) remain as "#" signs.

Is this really a bug, or is there a solution? Once more, apologies
for any dumb entries in this issue, but I'm pretty lost in this case.
Regards, 
Funky 
-- 
View this message in context: 
http://old.nabble.com/XML-%3E-PDF-bad-encoding-tp32911752p32911752.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: Encoding

2010-12-20 Thread Christopher Schultz
Laurent,

On 12/17/2010 11:25 AM, Laurent Medioni wrote:
> Have a look at http://wiki.apache.org/tomcat/FAQ/CharacterEncoding I
> think the comment you refer to tries to say that if no
> charset/encoding is set when producing a response then assume the
> ISO-8859-1 default value (do not ask why ;) ).

That wiki page explains why. I know because I wrote it :) It's all in
the servlet and HTTP specifications.

There's actually an open issue in Tomcat that proposes to switch the
default request body encoding /and/ URI encoding to UTF-8. Comments welcome:
https://issues.apache.org/bugzilla/show_bug.cgi?id=48550

-chris




RE: Encoding

2010-12-17 Thread Laurent Medioni
Setting container-encoding to UTF-8 enables you to share your servlet container 
(keeping its Latin1 default) with other applications not supporting UTF-8 (and 
not fiddling with encodings...), if relevant (we had the case...).

Have a look at http://wiki.apache.org/tomcat/FAQ/CharacterEncoding
I think the comment you refer to tries to say that if no charset/encoding is 
set when producing a response then assume the ISO8859-1 default value (do not 
ask why ;) ).

Alternatively try to do the equivalent of response.setContentType("text/html; 
charset=UTF-8") in your XSL ( ? sorry from 
memory, not an XSL specialist...), then you won't get the default encoding back.

Laurent
 

-Original Message-
From: Peter Flynn [mailto:pfl...@ucc.ie] 
Sent: vendredi, 17. décembre 2010 17:06
To: users@cocoon.apache.org
Subject: Re: Encoding

On 17/12/10 15:37, Laurent Medioni wrote:
> What is your
>
>   <init-param>
>     <param-name>container-encoding</param-name>
>     <param-value>UTF-8</param-value>
>   </init-param>
>
> in web.xml?

Interesting. ISO-8859-1, because



I wouldn't call Tomcat buggy, exactly, but the servlet spec made a poor
choice in making ISO-8859-1 the default, given that the rest of the
planet is going down the UTF-{8|16|32|64} road :-)

Certainly fixes the problem though...very many thanks.

///Peter





• This email and any files transmitted with it are CONFIDENTIAL and intended
  solely for the use of the individual or entity to which they are addressed.
• Any unauthorized copying, disclosure, or distribution of the material within
  this email is strictly forbidden.
• Any views or opinions presented within this e-mail are solely those of the
  author and do not necessarily represent those of Odyssey Financial
Technologies SA unless otherwise specifically stated.
• An electronic message is not binding on its sender. Any message referring to
  a binding engagement must be confirmed in writing and duly signed.
• If you have received this email in error, please notify the sender immediately
  and delete the original.


Re: Encoding

2010-12-17 Thread Peter Flynn
On 17/12/10 15:37, Laurent Medioni wrote:
> What is your
>
>   <init-param>
>     <param-name>container-encoding</param-name>
>     <param-value>UTF-8</param-value>
>   </init-param>
>
> in web.xml?

Interesting. ISO-8859-1, because



I wouldn't call Tomcat buggy, exactly, but the servlet spec made a poor
choice in making ISO-8859-1 the default, given that the rest of the
planet is going down the UTF-{8|16|32|64} road :-)

Certainly fixes the problem though...very many thanks.

///Peter




RE: Encoding

2010-12-17 Thread Laurent Medioni
What is your

  <init-param>
    <param-name>container-encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>

in web.xml?





Re: Encoding

2010-12-17 Thread Peter Flynn
On 17/12/10 15:06, Peter Flynn wrote:
[...]
> The result is that the output at
> http://publish.ucc.ie/researchprofiles/A005
> has Unicode replacement characters instead of accents.

Curiouser and curiouser, that page serves as UTF-8 but lower down it says:

http://www.w3.org/TR/html4/loose.dtd">
http://www.w3.org/1999/xhtml" lang="en-ie">
   http://www.w3.org/1999/xhtml">
  
  
  

That is generated by

  


  School: 
  
  ; Researcher-in-School: 
  
  ; Real School: 
  



so WTF is that

coming from? Is Cocoon sticking it in by itself? The page template which
I take for the framework is
http://www.ucc.ie/en/old-design-base/
and that says quite clearly


Something, somewhere is sticking a bogus encoding in the works.

///Peter




Encoding

2010-12-17 Thread Peter Flynn
I restored the Xalan settings after (failing to) add Saxon by copying
Emacs' ~ backup copies of cocoon.xconf and sitemap.xmap, but now
suddenly there are Unicode replacement characters (U+FFFD) appearing for
accents in pages which were working before.

The data is taken from a feed from an Oracle Application Server giving a
HTML  fragment, eg
http://rss.ucc.ie/live/w_rms_profile_list.show?p_school_id=A005
which dog and wget identify in the headers as
Content-Type: text/html; charset=WINDOWS-1252
(yes, I know, yuck...not my server)

[That URI may not be accessible off-campus]

This is processed by a pipeline to ensure it is XML:


  http://rss.ucc.ie/dev/w_rms_profile_list.show?p_school_id={1}"/>
  


so that
http://publish.ucc.ie/researchprofiles/people-in-schools/A005
produces XML I can consume in my XSLT. However, this is appearing as:


UTF-8


The result is that the output at
http://publish.ucc.ie/researchprofiles/A005
has Unicode replacement characters instead of accents.

I thought it should enforce translation to UTF-8 but obviously I have
missed something...but what?
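
To illustrate where U+FFFD can come from in a setup like this, here is a small sketch using only the standard library. It assumes the feed bytes really are WINDOWS-1252, as the headers say, and that some step decodes them as UTF-8 instead; the class FeedTranscode and its method are hypothetical and do not reflect how Cocoon reads the source.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FeedTranscode {
    // Decode a response body using the charset that is believed to apply.
    static String decode(byte[] body, Charset assumed) {
        return new String(body, assumed);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");

        // "caf" followed by e-acute, as windows-1252 bytes (E9 for e-acute)
        byte[] body = {0x63, 0x61, 0x66, (byte) 0xE9};

        // Wrong assumption: E9 alone is not valid UTF-8, so the decoder
        // substitutes the replacement character U+FFFD.
        System.out.println(decode(body, StandardCharsets.UTF_8).equals("caf\uFFFD"));

        // Right assumption: decode with the declared charset first,
        // then the text can be re-encoded as UTF-8 for output.
        String text = decode(body, cp1252);
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

Once the replacement character is in the data, serializing as UTF-8 later cannot bring the accent back, so the decoding step is the one to look at.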




RE: Set Encoding for XMLSerializer dynamically

2010-11-08 Thread Laurent Medioni
So this changed from 2.1, pity…

 



From: Ali Mahdoui [mailto:mahd...@hotmail.de] 
Sent: lundi, 8. novembre 2010 13:08
To: Cocoon users
Subject: RE: Set Encoding for XMLSerializer dynamically

 

Hi,

no that does not help (and the syntax is not allowed in the schema of the 
sitemap)

thanks

Ali



> Subject: RE: Set Encoding for XMLSerializer dynamically
> Date: Mon, 8 Nov 2010 12:56:11 +0100
> From: lmedi...@odyssey-group.com
> To: users@cocoon.apache.org
> 
> Hi,
> have you tried:
> 
> ...
> "{charsetEncoding}"
> ...
> 
> 
> ?
> Laurent
> 






RE: Set Encoding for XMLSerializer dynamically

2010-11-08 Thread Ali Mahdoui

Hi,
no that does not help (and the syntax is not allowed in the schema of the
sitemap)
thanks
Ali

> Subject: RE: Set Encoding for XMLSerializer dynamically
> Date: Mon, 8 Nov 2010 12:56:11 +0100
> From: lmedi...@odyssey-group.com
> To: users@cocoon.apache.org
> 
> Hi,
> have you tried:
> 
>   ...
>   "{charsetEncoding}"
>   ...
> 
> 
> ?
> Laurent
> 

  

RE: Set Encoding for XMLSerializer dynamically

2010-11-08 Thread Laurent Medioni
Hi,
have you tried:

...
"{charsetEncoding}"
...


?
Laurent


From: Ali Mahdoui [mailto:mahd...@hotmail.de] 
Sent: dimanche, 7. novembre 2010 12:10
To: Cocoon users; d...@cocoon.apache.org
Subject: Set Encoding for XMLSerializer dynamically

Hi,
i am using cocoon 2.2 and i want to set the encoding for the xml serializer 
dynamically depending on the return value of a previous action.
for example like this  ...
For the moment i can only set the encoding in the bean definition...  
Is that possible? Thanks






Set Encoding for XMLSerializer dynamically

2010-11-07 Thread Ali Mahdoui

Hi,
i am using cocoon 2.2 and i want to set the encoding for the xml serializer
dynamically depending on the return value of a previous action.
for example like this  ...
For the moment i can only set the encoding in the bean definition...
Is that possible? Thanks
  

Re: problem encoding using SendMailTransformer

2010-10-13 Thread mvalencia

hi everybody

  I have found a solution to my problem. I changed the "mime-type" attribute of
email:body in the XSL page of the Cocoon transform to the value "text/xml", and
now the email arrives correctly. Exactly:
old = 
new = 

In the code of SendMailTransformer.java I see that when the mime-type is
text/plain the body is set on a MimeMessage variable, and when the mime-type is
any other value the body is set on a MimeBodyPart variable.
It seems that the MimeMessage variable does not handle the encoding correctly.

Anyway, for me the matter is fixed.

Thanks all.

-- 
View this message in context: 
http://old.nabble.com/problem-encoding-using-SendMailTransformer-tp29886850p29950964.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: problem encoding using SendMailTransformer

2010-10-12 Thread Charles Yates

 Ah, yes, I see that in your original message.

In case it helps, here is how I discovered the cause of my problem.  I 
used Eclipse with WTP (Web Tools Platform) with the appropriate versions 
of Cocoon, Tomcat and my application as projects.  I put a breakpoint at 
a place where I knew I would see the garbled characters and then went 
back through the method calls to find the point where they were created.


In my case it was the Tomcat Connector object using its default 
character encoding.


On 10/8/10 8:02 AM, Charles Yates wrote:

Hello, I think you need to set URIencoding in your tomcat connector:

http://tomcat.apache.org/tomcat-6.0-doc/config/http.html#Common_Attributes

|URIEncoding|

This specifies the character encoding used to decode the URI bytes, 
after %xx decoding the URL. If not specified, ISO-8859-1 will be used.



had this problem myself recently . . .

-Charles

Hi all

   I have a problem using the code of:
org.apache.cocoon.mail.transformation.SendMailTransformer, when I send an
email, target user always receive the field body with strange characters, so
seemd bad encoding, and is curious only the field body, the subject is ok.

I work with encoding UTF-8, and I check Tomcat configuration is good:
  * parameter: URIEncoding="UTF-8"
  * SetCharacterEncoding = UTF-8
  * container-encoding = utf-8
  *  form-encoding = utf-8

I check encoding os serializers on Cocoon, too:

   UTF-8


   -//W3C//DTD XHTML 1.0 Strict//EN

http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
   UTF-8


I only see that There is any problem on AbstractSAXTransformer to recovery
data from textarea field on HTML, because is the diference between field
subject and body.

Thank you






Re: problem encoding using SendMailTransformer

2010-10-10 Thread mvalencia

Hi Charles 

I have set the URIEncoding parameter to UTF-8, and I even tried the
useBodyEncodingForURI parameter, but I still have the same problem.


-- 
View this message in context: 
http://old.nabble.com/problem-encoding-using-SendMailTransformer-tp29886850p29931532.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: problem encoding using SendMailTransformer

2010-10-10 Thread mvalencia

The form uses the GET method, because Cocoon's "request-param" does not
work with the POST method.

-- 
View this message in context: 
http://old.nabble.com/problem-encoding-using-SendMailTransformer-tp29886850p29931504.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: problem encoding using SendMailTransformer

2010-10-08 Thread Thorsten Scherler
On Fri, 2010-10-08 at 04:36 -0700, mvalencia wrote:
> 
> can be the problematic part, since you take the value from the request.
> 
> I can remember we applied the tips
> http://cocoon.apache.org/2.2/1366_1_1.html once in one of the apps.
> AFAIR there should be the  SetCharacterEncodingFilter in your code base.
> 
> 
> Yes, SetCharacterEncodingFilter is a filter defined on web.xml of Cocoon
> Application:
> 
> SetCharacterEncoding
> 
>   es.sadesi.filter.SetCharacterEncodingFilter
> 
>   encoding
>   UTF-8
> 
>   
> ..
> 
> SetCharacterEncoding
> DispatcherServlet
>   
> .
> 
> I have test put encoding on HTML FORM, so with the parameter:
> accept-charset="UTF-8" on FORM tag, but it isn't work. It seems encoding
> lose when data go to block conector since Cocoon application, but I not
> sure.


are you using post or get for the form?

salu2
-- 
Thorsten Scherler 
codeBusters S.L. - web based systems

http://www.codebusters.es/





Re: problem encoding using SendMailTransformer

2010-10-08 Thread Charles Yates

 Hello, I think you need to set URIEncoding in your Tomcat connector:

http://tomcat.apache.org/tomcat-6.0-doc/config/http.html#Common_Attributes

|URIEncoding|

This specifies the character encoding used to decode the URI bytes, 
after %xx decoding the URL. If not specified, ISO-8859-1 will be used.
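
The effect of that default can be reproduced outside the container. The sketch below is only an illustration using the JDK's `URLDecoder`; the class name `UriDecodingDemo` is hypothetical and this is not Tomcat's own decoding code. It decodes the same percent-encoded query value once as ISO-8859-1 and once as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of Tomcat or Cocoon.
final class UriDecodingDemo {

    // Decode a percent-encoded query value using the named charset.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // %C3%A8 is the UTF-8 byte sequence for U+00E8 (e with grave accent).
        String raw = "tr%C3%A8s+annoying";

        // Each byte becomes its own character: U+00C3 followed by U+00A8.
        System.out.println(decode(raw, "ISO-8859-1"));

        // The two bytes are combined into the single intended character.
        System.out.println(decode(raw, "UTF-8"));
    }
}
```

If the connector decodes with ISO-8859-1 while the browser sent UTF-8, every non-ASCII character arrives as two or more unrelated characters, which matches the kind of garbling described in this thread.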



had this problem myself recently . . .

-Charles

Hi all

   I have a problem using the code of:
org.apache.cocoon.mail.transformation.SendMailTransformer, when I send an
email, target user always receive the field body with strange characters, so
seemd bad encoding, and is curious only the field body, the subject is ok.

I work with encoding UTF-8, and I check Tomcat configuration is good:
  * parameter: URIEncoding="UTF-8"
  * SetCharacterEncoding = UTF-8
  * container-encoding = utf-8
  *  form-encoding = utf-8

I check encoding os serializers on Cocoon, too:

   UTF-8


   -//W3C//DTD XHTML 1.0 Strict//EN

http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
   UTF-8


I only see that There is any problem on AbstractSAXTransformer to recovery
data from textarea field on HTML, because is the diference between field
subject and body.

Thank you


Re: problem encoding using SendMailTransformer

2010-10-08 Thread mvalencia

Hi Andre

I have uploaded a file with the HTTP headers of the web page that sends the
email:
http://old.nabble.com/file/p29914700/HTTP_header_Email-send.txt
HTTP_header_Email-send.txt 

I see that the Accept-Charset parameter contains UTF-8 and that the data in
the query string are encoded correctly.
I have also checked that the browser encoding is UTF-8.
-- 
View this message in context: 
http://old.nabble.com/problem-encoding-using-SendMailTransformer-tp29886850p29914700.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: problem encoding using SendMailTransformer

2010-10-08 Thread Andre Juffer

On 10/08/2010 02:36 PM, mvalencia wrote:


can be the problematic part, since you take the value from the request.

I can remember we applied the tips
http://cocoon.apache.org/2.2/1366_1_1.html once in one of the apps.
AFAIR there should be the  SetCharacterEncodingFilter in your code base.


Yes, SetCharacterEncodingFilter is a filter defined on web.xml of Cocoon
Application:

 SetCharacterEncoding
 
   es.sadesi.filter.SetCharacterEncodingFilter
 
   encoding
   UTF-8
 
   
..

 SetCharacterEncoding
 DispatcherServlet
   
.

I have test put encoding on HTML FORM, so with the parameter:
accept-charset="UTF-8" on FORM tag, but it isn't work. It seems encoding
lose when data go to block conector since Cocoon application, but I not
sure.
   


You could check whether the browser when it display the form to the user 
actually uses the correct encoding (UTF-8).


--
Andre H. Juffer  | Phone: +358-8-553 1161
Biocenter Oulu and   | Fax: +358-8-553-1141
Department of Biochemistry   | Email: andre.juf...@oulu.fi
University of Oulu, Finland  | WWW: www.biochem.oulu.fi/Biocomputing/
StruBioCat   | WWW: www.strubiocat.oulu.fi
NordProt | WWW: www.nordprot.org
Triacle Biocomputing | WWW: www.triacle-bc.com





Re: problem encoding using SendMailTransformer

2010-10-08 Thread mvalencia


can be the problematic part, since you take the value from the request.

I can remember we applied the tips
http://cocoon.apache.org/2.2/1366_1_1.html once in one of the apps.
AFAIR there should be the  SetCharacterEncodingFilter in your code base.


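Yes, SetCharacterEncodingFilter is a filter defined in the web.xml of the
Cocoon application:

<filter>
  <filter-name>SetCharacterEncoding</filter-name>
  <filter-class>es.sadesi.filter.SetCharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
</filter>
..
<filter-mapping>
  <filter-name>SetCharacterEncoding</filter-name>
  <servlet-name>DispatcherServlet</servlet-name>
</filter-mapping>
.

I have tested setting the encoding on the HTML form, with the parameter
accept-charset="UTF-8" on the FORM tag, but it doesn't work. It seems the
encoding is lost when the data go from the Cocoon application to the
connector block, but I am not sure.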
-- 
View this message in context: 
http://old.nabble.com/problem-encoding-using-SendMailTransformer-tp29886850p29914526.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: problem encoding using SendMailTransformer

2010-10-07 Thread Thorsten Scherler
On Wed, 2010-10-06 at 08:26 -0700, mvalencia wrote:
> This is the matches:
> 
> 
>   
>  value="{global:conectorXmlPath}correos/parametros-action-validator.xml"/> 
> 
> 
> 
> 
>   
>   
>   
>   
>   
> 
> 
>  
>  
>src="{global:conectorXmlPath}correos/error-form-presidente.xml"/>
>   
>  
> 
> 
> I comment transform "constcorreopresidente" y I put the value of field
> direct on file propmail.xml and the sendmail works OK. I change serializer
> to different charset but the problem isn't there.
> I think the problem is when the data are received on transform
> "constcorreopresidente" with request-param, because the program jump to
> sendmail (It is a class that extend from AbstractSAXTransformer, really is
> @version $Id: SendMailTransformer.java 607381 2007-12-29 05:42:58Z
> vgritsenko) then the field body has lost the encoding.
> 
> However I come back to do the test that you tell me.


can be the problematic part, since you take the value from the request.

I can remember we applied the tips
http://cocoon.apache.org/2.2/1366_1_1.html once in one of the apps.
AFAIR there should be the  SetCharacterEncodingFilter in your code base.

HTH

salu2
-- 
Thorsten Scherler 
codeBusters S.L. - web based systems

http://www.codebusters.es/





Re: problem encoding using SendMailTransformer

2010-10-06 Thread mvalencia

These are the matches:


  
   



  
  
  
  
  


 
 
  
  
 


I commented out the "constcorreopresidente" transform and put the field value
directly in the file propmail.xml, and sendmail works OK. I changed the
serializer to a different charset, but the problem isn't there.
I think the problem arises when the data are received in the
"constcorreopresidente" transform with request-param, because by the time the
program jumps to sendmail (a class that extends AbstractSAXTransformer,
actually @version $Id: SendMailTransformer.java 607381 2007-12-29 05:42:58Z
vgritsenko) the body field has lost its encoding.

However, I will go back and do the test you suggested.
-- 
View this message in context: 
http://old.nabble.com/problem-encoding-using-SendMailTransformer-tp29886850p29897843.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: problem encoding using SendMailTransformer

2010-10-06 Thread Thorsten Scherler
On Tue, 2010-10-05 at 10:17 -0700, mvalencia wrote:
> Hi
> 
>   I did what you tell me, and the result is:

The first line (the xml declaration) is?

> 
> 
> mail-int.andaluciajunta.es
> 25
> miguel.valen...@juntadeandalucia.es
> miguel.valen...@juntadeandalucia.es
> españa camión
> 
>   Nombre del remitente: prueba, Miguel
>   Mensaje:Prueba de mensaje en españa, camión.
> 
> miguel.valen...@juntadeandalucia.es
> 
> 
> 

Did you use a ? Can you attach the xml
to exclude problems that the mail client may produce? The underlying
server has the locale UTF8_ES?

> It's seems that all text lose encoding, but I have checked that emails have
> subject correct and bad encoding on body field.
> I load a test email:
> http://old.nabble.com/file/p29889393/test-email.txt test-email.txt 

Hmm, let us do some tests: 
1) create a new match where you use the above input but 
a) use #&266; for all latin characters
b) write the "españa" yourself 
where you save the document and store them on the file system.
2) use  and  and create a match
for the body.

What is happening?

salu2
-- 
Thorsten Scherler 
codeBusters S.L. - web based systems

http://www.codebusters.es/





Re: problem encoding using SendMailTransformer

2010-10-05 Thread mvalencia

Hi

  I did what you told me, and the result is:


mail-int.andaluciajunta.es
25
miguel.valen...@juntadeandalucia.es
miguel.valen...@juntadeandalucia.es
españa camión

  Nombre del remitente: prueba, Miguel
  Mensaje:Prueba de mensaje en españa, camión.

miguel.valen...@juntadeandalucia.es



It seems that all the text loses its encoding, but I have checked that the
emails arrive with a correct subject and a badly encoded body field.
I have uploaded a test email:
http://old.nabble.com/file/p29889393/test-email.txt test-email.txt 
-- 
View this message in context: 
http://old.nabble.com/problem-encoding-using-SendMailTransformer-tp29886850p29889393.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: problem encoding using SendMailTransformer

2010-10-05 Thread Thorsten Scherler
On Tue, 2010-10-05 at 05:26 -0700, mvalencia wrote:
> Hi all
> 
>   I have a problem using the code of:
> org.apache.cocoon.mail.transformation.SendMailTransformer, when I send an
> email, target user always receive the field body with strange characters, so
> seemd bad encoding, and is curious only the field body, the subject is ok.
> 
> I work with encoding UTF-8, and I check Tomcat configuration is good:
>  * parameter: URIEncoding="UTF-8"
>  * SetCharacterEncoding = UTF-8
>  * container-encoding = utf-8
>  *  form-encoding = utf-8
> 
> I check encoding os serializers on Cocoon, too:
>  src="org.apache.cocoon.serialization.XMLSerializer">
>   UTF-8
> 
>  src="org.apache.cocoon.serialization.XMLSerializer">
>   -//W3C//DTD XHTML 1.0 Strict//EN
>  
> http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
>   UTF-8
> 
> 
> I only see that There is any problem on AbstractSAXTransformer to recovery
> data from textarea field on HTML, because is the diference between field
> subject and body.

hmm, can you send us the xml that comes in just BEFORE the
SendMailTransformer? Just add  BEFORE and see whether
there is the correct encoding in the body.

salu2

> 
> Thank you

-- 
Thorsten Scherler 
codeBusters S.L. - web based systems

http://www.codebusters.es/





problem encoding using SendMailTransformer

2010-10-05 Thread mvalencia

Hi all

  I have a problem using
org.apache.cocoon.mail.transformation.SendMailTransformer: when I send an
email, the recipient always receives the body field with strange characters,
which looks like a bad encoding. Curiously, only the body field is affected;
the subject is OK.

I work with UTF-8 encoding, and I have checked that the Tomcat configuration
is correct:
 * parameter: URIEncoding="UTF-8"
 * SetCharacterEncoding = UTF-8
 * container-encoding = utf-8
 * form-encoding = utf-8

I have checked the encoding of the serializers in Cocoon, too:

  UTF-8


  -//W3C//DTD XHTML 1.0 Strict//EN
 
http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
  UTF-8


The only thing I can see is that there may be a problem in
AbstractSAXTransformer when it retrieves the data from the textarea field of
the HTML form, because that is the difference between the subject and body
fields.

Thank you
-- 
View this message in context: 
http://old.nabble.com/problem-encoding-using-SendMailTransformer-tp29886850p29886850.html
Sent from the Cocoon - Users mailing list archive at Nabble.com.





Re: form encoding issues

2010-09-29 Thread Barbara Slupik

Hello

I followed the instructions at http://cocoon.apache.org/2.2/1366_1_1.html.
For cocoon-2.1.11 I set

<init-param>
  <param-name>container-encoding</param-name>
  <param-value>UTF-8</param-value>
</init-param>

<init-param>
  <param-name>form-encoding</param-name>
  <param-value>UTF-8</param-value>
</init-param>

in my web.xml instead of org.apache.cocoon.containerencoding=utf-8 and
org.apache.cocoon.formencoding=utf-8. I had to create a
SetCharacterEncodingFilter as well. Everything works fine in UTF-8.


Barbara


Hi,

I'm stumbling on a character encoding issue (cocoon-2.1.10) and  
really can't see why. Apparently, text input in a form is passed on  
in a wrong encoding. I've set Cocoon's default encoding in all  
thinkable places as UTF-8:


web.xml:


Cocoon


container-encoding
UTF-8


form-encoding
UTF-8




sitemap.xmap

   pool-max="${xhtml-serializer.pool-max}"  
src="org.apache.cocoon.serialization.XMLSerializer">
-//W3C//DTD XHTML 1.0 Transitional//ENpublic>
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd 


UTF-8


Yet, when I execute following pipeline:









...with following minimal source files:

test.xml
===



test.xsl (which will mainly echo the previous input)
==

http://www.w3.org/1999/XSL/Transform";  
version="2.0">












current input: 





Yet, entering a string with accented characters, like e.g. 'très  
annoying', this comes out as: 'très annoying'...
On the other hand, when entering the according URL (<http://localhost:/test?input=tr%C3%A8s+annoying 
>) directly, the characters are passed on correctly. Does anyone  
know how this can be fixed?


Any hints much appreciated!

Ron Van den Branden






Re: form encoding issues

2010-09-29 Thread Christopher Schultz

Thomas,

On 9/29/2010 7:05 AM, Thomas Markus wrote:
>  hi,
> 
> that arabic character should fail with latin1.
> 
> we see a difference between jetty and tomcat (6.0). tomcat follows specs
> (see Andre's mail) and uses iso per default. you can switch completely
> to UTF-8 with:
> - send html content in utf-8
> - set container-encoding to utf-8
> - set form-encoding to utf-8
> - set URIEncoding to utf-8
> - and include a class like SetCharacterEncodingFilter to set request
> character encoding

Note that this item sets the character encoding for reading request
/bodies/ and not GET parameters from the URL. It also only sets the
request character encoding if the client has not set it.
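
The "only if the client has not set it" rule can be sketched without the servlet API. The helper below is hypothetical (the name `RequestEncodingChooser` and its method are not from Tomcat); it assumes the client's declaration arrives as a `charset=` parameter of the Content-Type header and falls back to a configured default otherwise. In a real filter, the same decision is typically made by checking `request.getCharacterEncoding()` for `null` before calling `request.setCharacterEncoding(...)`.

```java
import java.util.Locale;

// Hypothetical helper; it mimics the decision a character-encoding filter makes.
final class RequestEncodingChooser {

    // Returns the charset declared in the Content-Type header, or the
    // fallback when the client declared none.
    static String chooseBodyEncoding(String contentType, String fallback) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String value = p.substring("charset=".length()).trim();
                    // Strip optional surrounding double quotes.
                    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                        value = value.substring(1, value.length() - 1);
                    }
                    if (!value.isEmpty()) {
                        return value;
                    }
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Browsers usually send form posts without a charset parameter,
        // so the configured default applies.
        System.out.println(chooseBodyEncoding("application/x-www-form-urlencoded", "UTF-8"));
        // An explicit client declaration wins over the default.
        System.out.println(chooseBodyEncoding("text/plain; charset=ISO-8859-1", "UTF-8"));
    }
}
```

Note that none of this touches the query string of a GET request; that part is decoded by the connector before any filter runs.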

All these issues are covered in this Tomcat document, though the content
is generally applicable to all containers:

http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q8

-chris




Re: form encoding issues

2010-09-29 Thread Christopher Schultz

Ron,

On 9/29/2010 5:43 AM, Ron Van den Branden wrote:
> There is stated that
> apparently (and counter-intuitively, IMO), 'request parameters are
> always decoded using ISO-8859-1 ',  and that consequently
> 'container_encoding should always be ISO-8859-1 (unless you have a
> broken servlet container), and form_encoding should be the same one as
> on your serializer.'.

Note that it's not /all/ parameters that are decoded using ISO-8859-1:
it's only GET parameters. If you use POST, you will likely have better
results.

Note that this means you can't send anything with non-ISO-8859-1
characters in GET parameters safely. There are three solutions:

1. Always use POST (not really a bad idea, but not always practical)
2. Force your container to use UTF-8 to decode GET parameters
   (in Tomcat, this can be accomplished using the URIEncoding
   attribute of the <Connector> element: see your own container's
   documentation for similar capabilities)
3. Never send strings as GET parameters (similar to #1, but somewhat
   different: perhaps use HttpSession or other strategies to avoid
   passing strings through the URL)
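
For option 2 to work, the sending side has to percent-encode the parameter as UTF-8 as well. A minimal sketch with the JDK's `URLEncoder` follows; the class name `QueryParamEncoder` is hypothetical and the `/test?input=` path is simply borrowed from the test case discussed in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class showing UTF-8 percent-encoding of a query value.
final class QueryParamEncoder {

    // Percent-encode a value as UTF-8 for use in a query string.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        // \u00e8 is e with grave accent; it becomes the two bytes C3 A8.
        System.out.println("/test?input=" + encodeUtf8("tr\u00e8s annoying"));
    }
}
```

The container then has to decode those bytes with the same charset, which is what the URIEncoding attribute controls in Tomcat.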

Good luck,
-chris




Re: form encoding issues

2010-09-29 Thread Thomas Markus

 hi,

That Arabic character should fail with Latin-1.

We see a difference between Jetty and Tomcat (6.0). Tomcat follows the specs 
(see Andre's mail) and uses ISO-8859-1 by default. You can switch completely 
to UTF-8 with:

- send html content in utf-8
- set container-encoding to utf-8
- set form-encoding to utf-8
- set URIEncoding to utf-8
- and include a class like SetCharacterEncodingFilter to set the request 
character encoding


regards
Thomas

On 29.09.2010 12:36, Ron Van den Branden wrote:

Hi Thomas,

I'm not much of an expert in encoding matters, and could indeed be 
happy with ISO-8859-1 instead of UTF-8.


However, testing with ISO-8859-1 set as container-encoding, even 
Arabic input is passed through correctly: ص (Arabic letter 'sad' - 
http://www.fileformat.info/info/unicode/char/0635/index.htm) comes out 
as it has been entered.


Does this mean that this (default) ISO-8859-1 container encoding does 
cater for UTF-8 correctly? Otherwise, would you mind expanding on your 
webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java 
suggestion (I'm not much of a Java expert, either ;-))?


OTOH, I don't see any difference between cocoon running in either 
Tomcat or the shipped Jetty.


Kind regards,

Ron

On 29/09/2010 12:11, Thomas Markus wrote:

thats right but you are bound to ISO-8859-1

we use UTF-8 in all stages with my comments.

regards
Thomas











Re: form encoding issues

2010-09-29 Thread Ron Van den Branden

 Hi Andre,

On 29/09/2010 12:01, Andre Juffer wrote:
Actually, Tomcat does, but Jetty does not (by default, UTF8). 
According to specification, servlet engine are suppose to decode using 
ISO-8859-1 by default.


I don't see any difference between both.




And lo: changing the  (over-eager?) container-encoding parameter in 
web.xml back to the default:


container-encoding
ISO-8859-1



Do I understand this correctly: you have encoded everything in UTF8, 
but to able to read your input fields (UTF8) you need to decode their 
value with ISO-8859-1 on the server?


Apparently: even Arabic text comes out fine with ISO-8859-1, not with 
UTF-8 (as I've mentioned in another reply on the ML).


I have had cases where the browser was encoding in ISO-8859-1 despite 
the presence of Content-type set to "text/html; charset=UTF-8" (it 
simply ignored the HTTP header value).


All my browsers interpret my test case as UTF-8 (with container-encoding 
set to ISO-8859-1)...


Kind regards,

Ron




Re: form encoding issues

2010-09-29 Thread Ron Van den Branden

 Hi Thomas,

I'm not much of an expert in encoding matters, and could indeed be happy 
with ISO-8859-1 instead of UTF-8.


However, testing with ISO-8859-1 set as container-encoding, even Arabic 
input is passed through correctly: ص (Arabic letter 'sad' - 
http://www.fileformat.info/info/unicode/char/0635/index.htm) comes out 
as it has been entered.


Does this mean that this (default) ISO-8859-1 container encoding does 
cater for UTF-8 correctly? Otherwise, would you mind expanding on your 
webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java 
suggestion (I'm not much of a Java expert, either ;-))?


OTOH, I don't see any difference between cocoon running in either Tomcat 
or the shipped Jetty.


Kind regards,

Ron

On 29/09/2010 12:11, Thomas Markus wrote:

thats right but you are bound to ISO-8859-1

we use UTF-8 in all stages with my comments.

regards
Thomas







Re: form encoding issues

2010-09-29 Thread Thomas Markus

 That's right, but then you are bound to ISO-8859-1.

We use UTF-8 in all stages, following my comments.

regards
Thomas

On 29.09.2010 11:43, Ron Van den Branden wrote:

 Hi again,

Thank you very much for the quick help; meanwhile I think I found an 
answer in a post on cocoon-dev: 
<http://markmail.org/message/nm6bnvqztbee4s5o>. There is stated that 
apparently (and counter-intuitively, IMO), 'request parameters are 
always decoded using ISO-8859-1 ',  and that consequently 
'container_encoding should always be ISO-8859-1 (unless you have a 
broken servlet container), and form_encoding should be the same one as 
on your serializer.'.


And lo: changing the  (over-eager?) container-encoding parameter in 
web.xml back to the default:


container-encoding
ISO-8859-1


...seems to do the trick!
(phew!)

(note: I found this info also at 
<http://wiki.apache.org/cocoon/RequestParameterEncoding#A3._Decoding_incoming_requests:_Servlet_Container>) 



Thanks anyway,

Ron








Re: form encoding issues

2010-09-29 Thread Andre Juffer

On 09/29/2010 12:43 PM, Ron Van den Branden wrote:

 Hi again,

Thank you very much for the quick help; meanwhile I think I found an 
answer in a post on cocoon-dev: 
<http://markmail.org/message/nm6bnvqztbee4s5o>. There is stated that 
apparently (and counter-intuitively, IMO), 'request parameters are 
always decoded using ISO-8859-1 ',  and that consequently 
'container_encoding should always be ISO-8859-1 (unless you have a 
broken servlet container), and form_encoding should be the same one as 
on your serializer.'.


Actually, Tomcat does, but Jetty does not (it uses UTF-8 by default). 
According to the specification, servlet engines are supposed to decode 
using ISO-8859-1 by default.




And lo: changing the  (over-eager?) container-encoding parameter in 
web.xml back to the default:


container-encoding
ISO-8859-1



Do I understand this correctly: you have encoded everything in UTF-8, but 
to be able to read your input fields (UTF-8) you need to decode their value 
with ISO-8859-1 on the server?
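
One way to see why that combination can work: decoding bytes as ISO-8859-1 is lossless, so the original bytes can be recovered and decoded again as UTF-8. The sketch below assumes that Cocoon re-decodes each parameter roughly as `new String(value.getBytes(containerEncoding), formEncoding)`; that is an assumption about Cocoon's behaviour, not something stated in this thread, and the class `RecodeDemo` is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of the container-encoding / form-encoding round trip.
final class RecodeDemo {

    // What the container hands over when the browser sent UTF-8 bytes
    // and the container decoded them as ISO-8859-1.
    static String containerDecode(String typedByUser) {
        byte[] wire = typedByUser.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Assumed recoding step: turn the string back into bytes using the
    // configured container encoding, then decode with the form encoding.
    static String recode(String received, Charset containerEncoding, Charset formEncoding) {
        return new String(received.getBytes(containerEncoding), formEncoding);
    }

    public static void main(String[] args) {
        String typed = "\u0635"; // Arabic letter sad, U+0635
        String received = containerDecode(typed);

        // container-encoding=ISO-8859-1, form-encoding=UTF-8: bytes recovered.
        String ok = recode(received, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(ok.equals(typed));

        // container-encoding=UTF-8 although the container really used
        // ISO-8859-1: the recoding is a no-op and the text stays garbled.
        String bad = recode(received, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(bad.equals(typed));
    }
}
```

Under these assumptions, a container-encoding of ISO-8859-1 does not limit the form to Latin-1 characters; it only has to match what the container actually used when it first decoded the bytes.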


I have had cases where the browser was encoding in ISO-8859-1 despite 
the presence of Content-type set to "text/html; charset=UTF-8" (it 
simply ignored the HTTP header value).




...seems to do the trick!
(phew!)

(note: I found this info also at 
<http://wiki.apache.org/cocoon/RequestParameterEncoding#A3._Decoding_incoming_requests:_Servlet_Container>) 



Thanks anyway,

Ron





--
Andre H. Juffer  | Phone: +358-8-553 1161
Biocenter Oulu and   | Fax: +358-8-553-1141
Department of Biochemistry   | Email: andre.juf...@oulu.fi
University of Oulu, Finland  | WWW: www.biochem.oulu.fi/Biocomputing/
StruBioCat   | WWW: www.strubiocat.oulu.fi
NordProt | WWW: www.nordprot.org
Triacle Biocomputing | WWW: www.triacle-bc.com





Re: form encoding issues

2010-09-29 Thread Ron Van den Branden

 Hi again,

Thank you very much for the quick help; meanwhile I think I found an 
answer in a post on cocoon-dev: 
<http://markmail.org/message/nm6bnvqztbee4s5o>. There is stated that 
apparently (and counter-intuitively, IMO), 'request parameters are 
always decoded using ISO-8859-1 ',  and that consequently 
'container_encoding should always be ISO-8859-1 (unless you have a 
broken servlet container), and form_encoding should be the same one as 
on your serializer.'.


And lo: changing the  (over-eager?) container-encoding parameter in 
web.xml back to the default:


<init-param>
  <param-name>container-encoding</param-name>
  <param-value>ISO-8859-1</param-value>
</init-param>


...seems to do the trick!
(phew!)

(note: I found this info also at 
<http://wiki.apache.org/cocoon/RequestParameterEncoding#A3._Decoding_incoming_requests:_Servlet_Container>) 



Thanks anyway,

Ron




Re: form encoding issues

2010-09-29 Thread Thomas Markus

 Hi,

check out request character encoding. For tomcat look at 
http://confluence.atlassian.com/display/DOC/Configuring+Tomcat%27s+URI+encoding 
and in your tomcat installation at 
webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java


that worked for me

regards
Thomas


On 29.09.2010 11:11, Ron Van den Branden wrote:

Hi,

I'm stumbling on a character encoding issue (cocoon-2.1.10) and really 
can't see why. Apparently, text input in a form is passed on in a 
wrong encoding. I've set Cocoon's default encoding in all thinkable 
places as UTF-8:


web.xml:


Cocoon


container-encoding
UTF-8


form-encoding
UTF-8




sitemap.xmap

mime-type="text/html" name="xhtml"
pool-max="${xhtml-serializer.pool-max}" 
src="org.apache.cocoon.serialization.XMLSerializer">

-//W3C//DTD XHTML 1.0 Transitional//EN
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd 


UTF-8


Yet, when I execute following pipeline:









...with following minimal source files:

test.xml
===



test.xsl (which will mainly echo the previous input)
==

http://www.w3.org/1999/XSL/Transform"; 
version="2.0">












current input: 





Yet, entering a string with accented characters, like e.g. 'très 
annoying', this comes out as: 'très annoying'...
On the other hand, when entering the according URL 
(<http://localhost:/test?input=tr%C3%A8s+annoying>) directly, the 
characters are passed on correctly. Does anyone know how this can be 
fixed?


Any hints much appreciated!

Ron Van den Branden








RE: form encoding issues

2010-09-29 Thread Robby Pelssers

I'm not sure how to do this with Cocoon 2.1.x, but with Cocoon 2.2 you need 
to set the following properties in

META-INF/cocoon.properties
-
org.apache.cocoon.containerencoding=utf-8
org.apache.cocoon.formencoding=utf-8


Hope this gets you looking in the right direction.

Cheers,
Robby Pelssers


-Original Message-
From: Ron Van den Branden [mailto:ron.vandenbran...@kantl.be]
Sent: Wed 29-9-2010 11:11
To: users@cocoon.apache.org
Subject: form encoding issues
 

  Hi,

I'm stumbling on a character encoding issue (cocoon-2.1.10) and really 
can't see why. Apparently, text input in a form is passed on in a wrong 
encoding. I've set Cocoon's default encoding in all thinkable places as 
UTF-8:

web.xml:


Cocoon


container-encoding
UTF-8


form-encoding
UTF-8




sitemap.xmap


-//W3C//DTD XHTML 1.0 Transitional//EN
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
UTF-8


Yet, when I execute following pipeline:









...with following minimal source files:

test.xml
===



test.xsl (which will mainly echo the previous input)
==

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">











current input: 





Yet when I enter a string with accented characters, e.g. 'très 
annoying', it comes out as: 'très annoying'...
On the other hand, when I enter the corresponding URL 
(<http://localhost:/test?input=tr%C3%A8s+annoying>) directly, the 
characters are passed on correctly. Does anyone know how this can be fixed?

Any hints much appreciated!

Ron Van den Branden


form encoding issues

2010-09-29 Thread Ron Van den Branden

 Hi,

I've run into a character encoding issue (cocoon-2.1.10) and really 
can't see why. Apparently, text input in a form is passed on in the wrong 
encoding. I've set Cocoon's default encoding to UTF-8 in every place I can 
think of:


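web.xml:

<servlet>
  <servlet-name>Cocoon</servlet-name>

  <init-param>
    <param-name>container-encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
  <init-param>
    <param-name>form-encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
</servlet>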


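sitemap.xmap

<map:serializer
name="xhtml"
pool-max="${xhtml-serializer.pool-max}"
src="org.apache.cocoon.serialization.XMLSerializer">

<doctype-public>-//W3C//DTD XHTML 1.0 Transitional//EN</doctype-public>
<doctype-system>http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd</doctype-system>
<encoding>UTF-8</encoding>
</map:serializer>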

Yet, when I execute the following pipeline:









...with the following minimal source files:

test.xml
===



test.xsl (which will mainly echo the previous input)
==

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">












current input: 





Yet when I enter a string with accented characters, e.g. 'très 
annoying', it comes out as: 'très annoying'...
On the other hand, when I enter the corresponding URL 
(<http://localhost:/test?input=tr%C3%A8s+annoying>) directly, the 
characters are passed on correctly. Does anyone know how this can be fixed?
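
A likely mechanism behind the difference, sketched under the assumption that the browser percent-encodes the form value as UTF-8: the same bytes %C3%A8 become 'è' when read as UTF-8, but two separate characters when read as ISO-8859-1. The class and method names below are made up for illustration, and URLDecoder.decode(String, Charset) needs Java 10 or later. This is not the decoding code of Cocoon or of any servlet container, only a stand-alone demonstration.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decodes one percent-encoded value two ways.
final class FormDecodingDemo {

    // Decodes an application/x-www-form-urlencoded value with the given charset.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String raw = "tr%C3%A8s+annoying";

        // Read as UTF-8, the two bytes C3 A8 form the single character U+00E8.
        String good = decode(raw, StandardCharsets.UTF_8);

        // Read as ISO-8859-1, each byte becomes its own character:
        // U+00C3 followed by U+00A8.
        String bad = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println(good.length()); // 13 characters
        System.out.println(bad.length());  // 14 characters
    }
}
```

If the form POST is decoded with one charset and the query string with another, the two requests can give different results for the same input, which would match the behaviour described above.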


Any hints much appreciated!

Ron Van den Branden




RE: wrong encoding used when opening xml file with encoding utf-8

2010-09-13 Thread Robby Pelssers

Hi Chris,

Newer versions of Excel support reading data from the web in XML format. 
Excel can even read in complete web pages containing HTML tables, and that was 
the use case I was talking about. 

Sorry for any confusion caused by not specifying exactly what I was trying to 
do.

Robby

-Original Message-
From: Christopher Schultz [mailto:ch...@christopherschultz.net]
Sent: Mon 13-9-2010 15:46
To: users@cocoon.apache.org
Subject: Re: wrong encoding used when opening xml file with encoding utf-8
 

Robby,

On 9/13/2010 8:31 AM, Robby Pelssers wrote:
> I'm generating an html table using Chinese characters and i set the encoding 
> and mimetype as follows:
> 
>   var response = cocoon.response; 
>   response.setContentType("application/vnd.ms-excel; charset=utf-8"); 

Uh, isn't application/vnd.ms-excel a binary file format? It shouldn't
have a charset in its content type.

>   response.setHeader(
>   "Content-Disposition",
>   "attachment; filename=" + id + ".xls"
>   );  
>   
>   cocoon.sendPage(
>   "chemicalcontent/excel/" + rohs + "/" + id + ".xls"
>   );  
> 
> 
> When previewing the html table in the browser it displays the chinese 
> characters ok. But when i click the download link and i open the file with 
> excel, it always takes Western European as charset.. I can manually change 
> that and reload the file but am I missing something or is excel unable to 
> open an xml file using the correct encoding?

Maybe a sample of the file you're trying to send would be helpful.

> Ok... i found the solution:
> 
> I had to add the following META tag as well to the generated HTML:
> 
> <head>
>   <meta http-equiv="Content-Type"
>         content="application/vnd.ms-excel; charset=utf-8" />
> </head>

I'm completely confused: you have a Microsoft Excel (.xls) file that is
XML and also contains HTML <head> and <meta> tags? No wonder Microsoft
Excel can't read it.

-chris


Re: wrong encoding used when opening xml file with encoding utf-8

2010-09-13 Thread Christopher Schultz

Robby,

On 9/13/2010 8:31 AM, Robby Pelssers wrote:
> I'm generating an html table using Chinese characters and i set the encoding 
> and mimetype as follows:
> 
>   var response = cocoon.response; 
>   response.setContentType("application/vnd.ms-excel; charset=utf-8"); 

Uh, isn't application/vnd.ms-excel a binary file format? It shouldn't
have a charset in its content type.

>   response.setHeader(
>   "Content-Disposition",
>   "attachment; filename=" + id + ".xls"
>   );  
>   
>   cocoon.sendPage(
>   "chemicalcontent/excel/" + rohs + "/" + id + ".xls"
>   );  
> 
> 
> When previewing the html table in the browser it displays the chinese 
> characters ok. But when i click the download link and i open the file with 
> excel, it always takes Western European as charset.. I can manually change 
> that and reload the file but am I missing something or is excel unable to 
> open an xml file using the correct encoding?

Maybe a sample of the file you're trying to send would be helpful.

> Ok... i found the solution:
> 
> I had to add the following META tag as well to the generated HTML:
> 
> <head>
>   <meta http-equiv="Content-Type"
>         content="application/vnd.ms-excel; charset=utf-8" />
> </head>

I'm completely confused: you have a Microsoft Excel (.xls) file that is
XML and also contains HTML <head> and <meta> tags? No wonder Microsoft
Excel can't read it.

-chris




RE: wrong encoding used when opening xml file with encoding utf-8

2010-09-13 Thread Robby Pelssers
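OK... I found the solution:

I had to add the following META tag as well to the generated HTML:

<head>
  <meta http-equiv="Content-Type"
        content="application/vnd.ms-excel; charset=utf-8" />
</head>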


Robby


-Original Message-
From: Robby Pelssers [mailto:robby.pelss...@ciber.com]
Sent: Mon 13-9-2010 14:31
To: users@cocoon.apache.org
Subject: wrong encoding used when opening xml file with encoding utf-8
 

Hi all,

I'm generating an HTML table containing Chinese characters, and I set the 
encoding and MIME type as follows:

var response = cocoon.response;
response.setContentType("application/vnd.ms-excel; charset=utf-8");
response.setHeader(
    "Content-Disposition",
    "attachment; filename=" + id + ".xls"
);

cocoon.sendPage(
    "chemicalcontent/excel/" + rohs + "/" + id + ".xls"
);


When I preview the HTML table in the browser, it displays the Chinese 
characters correctly. But when I click the download link and open the file 
with Excel, it always assumes Western European as the charset. I can manually 
change that and reload the file, but am I missing something, or is Excel 
unable to open an XML file using the correct encoding?

I know this question is a bit off topic, but maybe somebody else has faced 
and solved this issue.

Kind regards,
Robby Pelssers


wrong encoding used when opening xml file with encoding utf-8

2010-09-13 Thread Robby Pelssers

Hi all,

I'm generating an HTML table containing Chinese characters, and I set the 
encoding and MIME type as follows:

var response = cocoon.response;
response.setContentType("application/vnd.ms-excel; charset=utf-8");
response.setHeader(
    "Content-Disposition",
    "attachment; filename=" + id + ".xls"
);

cocoon.sendPage(
    "chemicalcontent/excel/" + rohs + "/" + id + ".xls"
);


When I preview the HTML table in the browser, it displays the Chinese 
characters correctly. But when I click the download link and open the file 
with Excel, it always assumes Western European as the charset. I can manually 
change that and reload the file, but am I missing something, or is Excel 
unable to open an XML file using the correct encoding?

I know this question is a bit off topic, but maybe somebody else has faced 
and solved this issue.

Kind regards,
Robby Pelssers




Re: Strange encoding problem using forms

2010-02-11 Thread Søren Krum
And that fixed the problem! Winner of the cake is Dominic Mitchell,
thanks a lot :-)

-- 
Kind regards

Søren D. Krum




Re: Strange encoding problem using forms

2010-02-11 Thread Søren Krum
Hi again!

:-) You can see me smiling: the file encoding on the one machine, where it
works, is utf-8, and on the other it is iso-8859-1. I am not sure whether
changing it will fix the problem, but I will find out soon (once I have
figured out why it differs at all, and how to change it). :-) Thanks for
the hint.


Kind regards

Søren D. Krum





Re: Strange encoding problem using forms

2010-02-11 Thread Dominic Mitchell
On Thu, Feb 11, 2010 at 6:40 AM, Jos Snellings wrote:

> The two tomcat versions on the machines are the same, but, can you
> please make a diff between the two server.xml under $CATALINA_HOME/conf?
> Just to be sure ...
>
>
The other things to check are the system properties file.encoding and
sun.jnu.encoding.  I have seen the JVM take different values for
file.encoding depending on how it was started (by hand vs startup scripts).
You should be able to verify them using the StatusGenerator.
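
If the StatusGenerator is not at hand, a small stand-alone class can print the same values. This is only a sketch with a made-up class name; sun.jnu.encoding is an internal property and may be unset on some JVMs, which is why the helper falls back to a placeholder. It only tells you something about Tomcat if it is run in the same environment Tomcat is started from (same user, same startup script), since that is exactly where the values can differ.

```java
import java.nio.charset.Charset;

// Hypothetical helper: reports the JVM settings that decide how bytes
// are turned into characters when no charset is given explicitly.
final class EncodingReport {

    // Returns the value of a system property, or "(not set)" when absent.
    static String property(String name) {
        String value = System.getProperty(name);
        return value == null ? "(not set)" : value;
    }

    public static void main(String[] args) {
        System.out.println("file.encoding    = " + property("file.encoding"));
        System.out.println("sun.jnu.encoding = " + property("sun.jnu.encoding"));
        System.out.println("default charset  = " + Charset.defaultCharset().name());
    }
}
```

Comparing the output on both machines should show quickly whether the JVMs disagree about the default encoding.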

-Dom


Re: Strange encoding problem using forms

2010-02-11 Thread Jos Snellings
The two tomcat versions on the machines are the same, but, can you
please make a diff between the two server.xml under $CATALINA_HOME/conf?
Just to be sure ...

Jos

On Thu, 2010-02-11 at 11:47 +0100, Søren Krum wrote:
> Hello!
> 
> I have a small problem with a cocoon application and forms.
> 
> The application runs fine on one machine, but for some reason we want to
> have a mirror of that machine. High availability and failover...
> 
> And here are some more details: the failing part is a simple form built
> with Cocoon Forms (we are using Cocoon 2.2), where on the first screen
> the user can enter data, and on the second screen the data is presented
> for confirmation before it is committed. Cocoon flow is used to transfer
> the data from the first screen to the second and on to the third.
> 
> Entering special characters like ö or øæå works fine on one machine
> but not on the other. We tried to set up the two machines as alike as
> possible: the locale is the same, the Tomcat and Java versions are the
> same, the physical machines are the same, and the application is
> packaged as a WAR and deployed in the same way.
> 
> Does anyone have an idea what could be causing the trouble? Even a guess
> would be nice; I'm running out of ideas here... It looks like it is caused
> by a wrong interpretation of the parameters sent with the request, but why
> are they interpreted differently?
> 
> The failure we get reads like the following:
> 
> org.apache.cocoon.ProcessingException: Failed to process pipeline
>   at  -
> file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:117:28
>   at  -
> file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:111:111
>   at  -
> file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:108:69
>   at  -
> file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:107:68
>   at  -
> file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:106:50
>   at  -
> file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:42:42
> 
>   at
> org.apache.cocoon.ProcessingException.throwLocated(ProcessingException.java:143)
>   at
> org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.handleException(AbstractProcessingPipeline.java:923)
>   at
> org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:548)
>   at
> org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
>   at
> org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:439)
>   at sun.reflect.GeneratedMethodAccessor178.invoke(Unknown Source)
>   at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:585)
>   at
> org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
>   at $Proxy31.process(Unknown Source)
>   at
> org.apache.cocoon.components.treeprocessor.sitemap.SerializeNode.invoke(SerializeNode.java:147)
>   at
> org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
>   at
> org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
>   at
> org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
>   at
> org.apache.cocoon.components.treeprocessor.sitemap.ActTypeNode.invoke(ActTypeNode.java:123)
>   at
> org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
>   at
> org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
>   at
> org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
>   at
> org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
>   at
> org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
>   at
> org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.handleCocoonRedirect(ConcreteTreeProcessor.java:316)
>   at
> org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor$TreeProcessorRedirector.cocoonRedirect(ConcreteTreeProcessor.java:366)
>   at
> org.apache.cocoon.environment.ForwardRedirector.redirect(ForwardRedirector.java:62)
>   at
> org.apache.cocoon.components.flow.AbstractInterpreter.forwardTo(AbstractInterpreter.java:201)
>   at
> org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter.forwardTo(FOM_JavaScriptInte

Strange encoding problem using forms

2010-02-11 Thread Søren Krum
Hello!

I have a small problem with a cocoon application and forms.

The application runs fine on one machine, but for some reason we want to
have a mirror of that machine. High availability and failover...

And here are some more details: the failing part is a simple form built
with Cocoon Forms (we are using Cocoon 2.2), where on the first screen
the user can enter data, and on the second screen the data is presented
for confirmation before it is committed. Cocoon flow is used to transfer
the data from the first screen to the second and on to the third.

Entering special characters like ö or øæå works fine on one machine
but not on the other. We tried to set up the two machines as alike as
possible: the locale is the same, the Tomcat and Java versions are the
same, the physical machines are the same, and the application is
packaged as a WAR and deployed in the same way.

Does anyone have an idea what could be causing the trouble? Even a guess
would be nice; I'm running out of ideas here... It looks like it is caused
by a wrong interpretation of the parameters sent with the request, but why
are they interpreted differently?

The failure we get reads like the following:

org.apache.cocoon.ProcessingException: Failed to process pipeline
at  -
file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:117:28
at  -
file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:111:111
at  -
file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:108:69
at  -
file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:107:68
at  -
file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:106:50
at  -
file:///site/apps/apache-tomcat-6.0.20/work/portalen-forms/aksessliste/sitemap.xmap:42:42

at
org.apache.cocoon.ProcessingException.throwLocated(ProcessingException.java:143)
at
org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.handleException(AbstractProcessingPipeline.java:923)
at
org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:548)
at
org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:273)
at
org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:439)
at sun.reflect.GeneratedMethodAccessor178.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at
org.apache.cocoon.core.container.spring.avalon.PoolableProxyHandler.invoke(PoolableProxyHandler.java:71)
at $Proxy31.process(Unknown Source)
at
org.apache.cocoon.components.treeprocessor.sitemap.SerializeNode.invoke(SerializeNode.java:147)
at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
at
org.apache.cocoon.components.treeprocessor.sitemap.MatchNode.invoke(MatchNode.java:87)
at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:55)
at
org.apache.cocoon.components.treeprocessor.sitemap.ActTypeNode.invoke(ActTypeNode.java:123)
at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
at
org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:78)
at
org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:81)
at
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:239)
at
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.handleCocoonRedirect(ConcreteTreeProcessor.java:316)
at
org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor$TreeProcessorRedirector.cocoonRedirect(ConcreteTreeProcessor.java:366)
at
org.apache.cocoon.environment.ForwardRedirector.redirect(ForwardRedirector.java:62)
at
org.apache.cocoon.components.flow.AbstractInterpreter.forwardTo(AbstractInterpreter.java:201)
at
org.apache.cocoon.components.flow.javascript.fom.FOM_JavaScriptInterpreter.forwardTo(FOM_JavaScriptInterpreter.java:724)
at
org.apache.cocoon.components.flow.javascript.fom.FOM_Cocoon.forwardTo(FOM_Cocoon.java:717)
at
org.apache.cocoon.components.flow.javascript.fom.FOM_Cocoon.jsFunction_sendPage(FOM_Cocoon.java:265)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeM

Re: character encoding of a HttpServletRequest

2010-01-11 Thread Dominic Mitchell
On Mon, Jan 11, 2010 at 10:34 AM, Jos Snellings wrote:

> That is right!
> It is just a confusing situation :-(
> The filter works fine. The init() method of a generator does not give a
> chance to call setCharacterEncoding, as the parsing already happened.
> The good thing is that the code is already in spring, so, no new
> external dependencies. Maybe later on I add a
> "tryToGuessEncodingFilter".
>
>
Trying to guess encodings isn't a good idea, in general.  About the only one
that can be reliably detected is UTF-8.  In past projects, I've done
something like this:

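  // needs java.nio.ByteBuffer, java.nio.charset.Charset and
  // java.nio.charset.CharacterCodingException
  String result;
  try {
    // A fresh decoder reports malformed input instead of replacing it.
    result = Charset.forName("UTF-8").newDecoder()
        .decode(ByteBuffer.wrap(someBytes)).toString();
  } catch (CharacterCodingException e) {
    result = new String(someBytes, Charset.forName("windows-1252"));
  }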

In my experience, Windows-1252 was a better guess than ISO-8859-1, as users
tend to paste in stuff from word documents with curly quotes.
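
To illustrate the curly-quote point, here is a minimal sketch (class and method names are made up) that decodes the bytes 0x93 and 0x94 both ways. It assumes the JVM ships the windows-1252 charset, which is not one of the charsets every Java platform is required to support, although common JDKs include it.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: the same bytes read as two single-byte charsets.
final class CurlyQuoteDemo {

    // Decodes the given bytes with the named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // 0x93 and 0x94 are the curly double quotes in Windows-1252.
        byte[] pasted = { (byte) 0x93, 'h', 'i', (byte) 0x94 };

        String asCp1252 = decode(pasted, "windows-1252");
        String asLatin1 = decode(pasted, "ISO-8859-1");

        // Windows-1252 yields U+201C and U+201D; ISO-8859-1 yields the
        // invisible C1 control characters U+0093 and U+0094.
        System.out.printf("windows-1252: U+%04X ... U+%04X%n",
                (int) asCp1252.charAt(0), (int) asCp1252.charAt(3));
        System.out.printf("ISO-8859-1:   U+%04X ... U+%04X%n",
                (int) asLatin1.charAt(0), (int) asLatin1.charAt(3));
    }
}
```

Both charsets decode every byte without error, so neither choice will ever throw; the difference only shows in what the user sees.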

-Dom


Re: character encoding of a HttpServletRequest

2010-01-11 Thread Jos Snellings
That is right!
It is just a confusing situation :-(
The filter works fine. The init() method of a generator does not give you a
chance to call setCharacterEncoding, as the parsing has already happened.
The good thing is that the code is already in Spring, so there are no new
external dependencies. Maybe later on I will add a
"tryToGuessEncodingFilter".

Jos

On Mon, 2010-01-11 at 10:49 +0100, Reinhard Pötz wrote:
> Jos Snellings wrote:
> > Hi,
> > 
> > HttpServletRequest looks 'imperfect':
> > Cocoon 3, alpha 2.
> > A generator accesses the HttpServletRequest in the setup method:
> > 
> > request = HttpContextHelper.getRequest(parameters);
> > text = request.getParameter("tekst");
> > 
> > The pages, including forms, are encoded in utf-8.
> > The String 'text' is strange: the original content (utf-8) is encoded
> > once again:
> > if the string on the form was one character, say 'é', the string has a
> > length of 4 bytes. It is the result of utf-8 encoding the two byte
> > character coming from the client. So, a second conversion is happening.
> > 
> > Now:
> > new String(request.getParameter("text").getBytes("ISO-8859-1")) works
> > fine.
> > 
> > Where should this be corrected?
> 
> Jos,
> 
> in Cocoon 3 there isn't any code that changes the encoding of request
> parameters. The plain HttpServletRequest as provided by the servlet
> container is used.
> 
> IIRC Tomcat uses ISO-8859-1 by default which follows the recommendation
> of the Servlet API spec:
> 
> ~~~
> SRV.4.9 Request data encoding
> Currently, many browsers do not send a char encoding qualifier with the
> Content-Type header, leaving open the determination of the character
> encoding for reading HTTP requests. The default encoding of a request
> the container uses to create the request reader and parse POST data must
> be “ISO-8859-1” if none has been specified by the client request.
> However, in order to indicate to the developer in this case the failure
> of the client to send a character encoding, the container returns null
> from the getCharacterEncoding method.
> If the client hasn’t set character encoding and the request data is
> encoded with a different encoding than the default as described above,
> breakage can occur. To remedy this situation, a new method
> setCharacterEncoding(String enc) has been added to the ServletRequest
> interface. Developers can override the character encoding supplied by
> the container by calling this method. It must be called prior to parsing
> any post data or reading any input from the request. Calling
> this method once data has been read will not affect the encoding.
> ~~~
> 
> So as some others suggested, the best option is using one of the
> CharacterEncoding servlet filters and not to remedy this situation
> somewhere in C3.
> 






Re: character encoding of a HttpServletRequest

2010-01-11 Thread Jos Snellings
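This is to let you know that the solution you suggested works fine.
So, for all Cocoon users: if you are experiencing problems with the
character encoding of POST form data (which is very likely to occur),
the problem is generally cured by inserting the following into web.xml:

<filter>
  <filter-name>encodingFilter</filter-name>
  <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
  <init-param>
    <param-name>forceEncoding</param-name>
    <param-value>true</param-value>
  </init-param>
</filter>

<filter-mapping>
  <filter-name>encodingFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

(Insert it as the first child under the <web-app> root element.)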

Jos


On Mon, 2010-01-11 at 08:54 +, Dominic Mitchell wrote:
> 2010/1/10 Jos Snellings 
> This is not a specific cocoon issue, I believe. It probably
> has to do
> with Tomcat 5.5.27.
> request.setCharacterEncoding simply does not work; it does not
> change a
> thing.
> request.getCharacterEncoding returns nothing.
> 
> You have to call request.setCharacterEncoding() really early for it to
> have any impact.  Your best bet is to look at spring's
> CharacterEncodingFilter.  You can add that to your web.xml to get the
> character set defined very early on.
> 
> -Dom 
> 
> 






Re: character encoding of a HttpServletRequest

2010-01-11 Thread Reinhard Pötz

Jos Snellings wrote:
> Hi,
> 
> HttpServletRequest looks 'imperfect':
> Cocoon 3, alpha 2.
> A generator accesses the HttpServletRequest in the setup method:
> 
> request = HttpContextHelper.getRequest(parameters);
> text = request.getParameter("tekst");
> 
> The pages, including forms, are encoded in utf-8.
> The String 'text' is strange: the original content (utf-8) is encoded
> once again:
> if the string on the form was one character, say 'é', the string has a
> length of 4 bytes. It is the result of utf-8 encoding the two byte
> character coming from the client. So, a second conversion is happening.
> 
> Now:
> new String(request.getParameter("text").getBytes("ISO-8859-1")) works
> fine.
> 
> Where should this be corrected?

Jos,

in Cocoon 3 there isn't any code that changes the encoding of request
parameters. The plain HttpServletRequest as provided by the servlet
container is used.

IIRC Tomcat uses ISO-8859-1 by default which follows the recommendation
of the Servlet API spec:

~~~
SRV.4.9 Request data encoding
Currently, many browsers do not send a char encoding qualifier with the
Content-Type header, leaving open the determination of the character
encoding for reading HTTP requests. The default encoding of a request
the container uses to create the request reader and parse POST data must
be “ISO-8859-1” if none has been specified by the client request.
However, in order to indicate to the developer in this case the failure
of the client to send a character encoding, the container returns null
from the getCharacterEncoding method.
If the client hasn’t set character encoding and the request data is
encoded with a different encoding than the default as described above,
breakage can occur. To remedy this situation, a new method
setCharacterEncoding(String enc) has been added to the ServletRequest
interface. Developers can override the character encoding supplied by
the container by calling this method. It must be called prior to parsing
any post data or reading any input from the request. Calling
this method once data has been read will not affect the encoding.
~~~

So, as some others suggested, the best option is to use one of the
CharacterEncoding servlet filters rather than to remedy this situation
somewhere in C3.
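
The effect Jos describes can be reproduced outside any container. The following is a sketch with made-up class and method names, assuming the form bytes arrive as UTF-8 and are decoded as ISO-8859-1 as the spec text above prescribes; it uses StandardCharsets, which needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: reproduces and reverses the wrong decoding.
final class DoubleDecodingDemo {

    // Simulates a container that reads UTF-8 form bytes as ISO-8859-1.
    static String misdecode(String typed) {
        byte[] wire = typed.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recover the original bytes, then read them as UTF-8.
    static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u00e9"; // the single character 'é'
        String received = misdecode(typed);

        System.out.println(received.length());                                // 2
        System.out.println(received.getBytes(StandardCharsets.UTF_8).length); // 4
        System.out.println(repair(received).equals(typed));                   // true
    }
}
```

The repair step names UTF-8 explicitly. The one-argument new String(byte[]) used in the workaround quoted above depends on the platform default charset, so it only works where that default happens to be UTF-8.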

-- 
Reinhard Pötz   Managing Director, {Indoqa} GmbH
 http://www.indoqa.com/en/people/reinhard.poetz/

Member of the Apache Software Foundation
Apache Cocoon Committer, PMC member  reinh...@apache.org





Re: character encoding of a HttpServletRequest

2010-01-11 Thread Dominic Mitchell
On Mon, Jan 11, 2010 at 9:12 AM, Jos Snellings wrote:

> Thanks, I will try CharacterEncodingFilter!
> I will look up in the code where filtering takes place, because the
> problem is rather that it looks like the form data are filtered twice.
>
> In addition, do I remember right that there used to be a cocoon servlet
> setting,
>
>form-encoding
>UTF-8
>
>
> Cheers, thanks for the hint. I will post the result... I will certainly
> not be the only person who is confronted with this problem.
>

There are so many places to set the encoding.  And just for fun, you can
have different encodings for query string parameters and form-data in the
body in the same request.  Sigh.

I've had good luck with CharacterEncodingFilter
<http://static.springsource.org/spring/docs/2.5.6/api/org/springframework/web/filter/CharacterEncodingFilter.html>
though.

-Dom


Re: character encoding of a HttpServletRequest

2010-01-11 Thread Jos Snellings
Thanks, I will try CharacterEncodingFilter!
I will look up in the code where filtering takes place, because the
problem is rather that it looks like the form data are filtered twice. 

In addition, do I remember correctly that there used to be a Cocoon servlet
setting, 

<init-param>
  <param-name>form-encoding</param-name>
  <param-value>UTF-8</param-value>
</init-param>

Cheers, thanks for the hint. I will post the result... I will certainly
not be the only person who is confronted with this problem.

Jos


On Mon, 2010-01-11 at 08:54 +, Dominic Mitchell wrote:
> 2010/1/10 Jos Snellings 
> This is not a specific cocoon issue, I believe. It probably
> has to do
> with Tomcat 5.5.27.
> request.setCharacterEncoding simply does not work; it does not
> change a
> thing.
> request.getCharacterEncoding returns nothing.
> 
> You have to call request.setCharacterEncoding() really early for it to
> have any impact.  Your best bet is to look at spring's
> CharacterEncodingFilter.  You can add that to your web.xml to get the
> character set defined very early on.
> 
> -Dom 
> 
> 






Re: character encoding of a HttpServletRequest

2010-01-11 Thread Dominic Mitchell
2010/1/10 Jos Snellings 

> This is not a specific cocoon issue, I believe. It probably has to do
> with Tomcat 5.5.27.
> request.setCharacterEncoding simply does not work; it does not change a
> thing.
> request.getCharacterEncoding returns nothing.
>

You have to call request.setCharacterEncoding() really early for it to have
any impact.  Your best bet is to look at spring's CharacterEncodingFilter.
You can add that to your web.xml to get the character set defined very early
on.

-Dom


Re: character encoding of a HttpServletRequest

2010-01-10 Thread Jos Snellings
This is not a specific cocoon issue, I believe. It probably has to do
with Tomcat 5.5.27.
request.setCharacterEncoding simply does not work; it does not change a
thing.
request.getCharacterEncoding returns nothing.

Best,
Jos


On Sat, 2010-01-09 at 08:01 +0100, Jos Snellings wrote:
> Hi,
> 
> HttpServletRequest looks 'imperfect':
> Cocoon 3, alpha 2.
> A generator accesses the HttpServletRequest in the setup method:
> 
> request = HttpContextHelper.getRequest(parameters);
> text = request.getParameter("tekst");
> 
> The pages, including forms, are encoded in utf-8.
> The String 'text' is strange: the original content (utf-8) is encoded
> once again:
> if the string on the form was one character, say 'é', the string has a
> length of 4 bytes. It is the result of utf-8 encoding the two byte
> character coming from the client. So, a second conversion is happening.
> 
> Now:
> new String(request.getParameter("text").getBytes("ISO-8859-1")) works
> fine.
> 
> Where should this be corrected?
> 
> Cheers,
> Jos
> 
> 
> -
> To unsubscribe, e-mail: users-unsubscr...@cocoon.apache.org
> For additional commands, e-mail: users-h...@cocoon.apache.org
> 
> 






character encoding of a HttpServletRequest

2010-01-08 Thread Jos Snellings
Hi,

HttpServletRequest looks 'imperfect':
Cocoon 3, alpha 2.
A generator accesses the HttpServletRequest in the setup method:

request = HttpContextHelper.getRequest(parameters);
text = request.getParameter("tekst");

The pages, including forms, are encoded in UTF-8.
The String 'text' is strange: the original content (utf-8) is encoded
once again:
if the string on the form was one character, say 'é', the string has a
length of 4 bytes. It is the result of utf-8 encoding the two byte
character coming from the client. So, a second conversion is happening.

Now:
new String(request.getParameter("text").getBytes("ISO-8859-1")) works
fine.
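
The effect described above, and the workaround, can be reproduced without a
servlet container. The following is only a sketch under the assumption that
the UTF-8 bytes from the client were decoded as ISO-8859-1 somewhere along the
way; the class MojibakeRepairDemo and its two helpers are invented for the
illustration. Note that the one-argument new String(byte[]) uses the platform
default charset, so the workaround as written only gives the right result on a
JVM whose default is UTF-8. Naming the charset explicitly removes that
dependency.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Cocoon or the servlet API.
final class MojibakeRepairDemo {

    // Simulates the assumed fault: UTF-8 bytes from the client are
    // decoded as if they were ISO-8859-1.
    static String misdecode(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Reverses the fault: recover the raw bytes, then decode them as UTF-8.
    // ISO-8859-1 maps every byte to one char, so this round trip is lossless.
    static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // e-acute, one character
        String garbled = misdecode(original);

        System.out.println(garbled.length());                                // 2
        System.out.println(garbled.getBytes(StandardCharsets.UTF_8).length); // 4
        System.out.println(repair(garbled).equals(original));                // true
    }
}
```

The 4 in the second line of output matches the "length of 4 bytes" seen for a
single character on the form.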

Where should this be corrected?

Cheers,
Jos





RE: xml encoding issue cocoon2.2

2009-05-27 Thread Robby Pelssers
Ok.

I already found out how to do it ;-)

Section 4, "Setting Cocoon's encoding (especially CForms)", of
http://cocoon.apache.org/2.2/1366_1_1.html solved my issue.

Cheers,

Robby

 

 

From: Robby Pelssers [mailto:robby.pelss...@ciber.nl] 
Sent: Wednesday, May 27, 2009 12:34 PM
To: users@cocoon.apache.org
Subject: xml encoding issue cocoon2.2

 

Hi all,

I can't seem to set the encoding properly when writing XML to a file using
the write-source transformer.  Googling around has not yet resulted in a
solution ;-(

In my sitemap I have declared the serializer like below:

  UTF-8

All my stylesheets contain the following:

But nothing seems to affect the end result:

Anybody who can explain how to override the encoding to UTF-8?

Thx in advance,

Robby

 



xml encoding issue cocoon2.2

2009-05-27 Thread Robby Pelssers
Hi all,

I can't seem to set the encoding properly when writing XML to a file using
the write-source transformer.  Googling around has not yet resulted in a
solution ;-(

In my sitemap I have declared the serializer like below:

  UTF-8

All my stylesheets contain the following:

But nothing seems to affect the end result:

Anybody who can explain how to override the encoding to UTF-8?

Thx in advance,

Robby

 


