Re: Non us-ascii character in filenames break. Was: French accent, getting crazy....
I cannot help asking: are you sure that your client is using UTF-8? I am Danish, and I can store filenames containing special Danish letters. I have also tested with Russian letters in filenames, and it works fine! What you describe sounds to me like your client is sending non-UTF-8 encoded data to Slide, and that will mess up filenames in the way you describe!

Your problems are the same as the ones I saw when I tested with Windows 2000 and Web Folders. Windows 2000 only worked once I installed Office XP with the latest service pack, or Office 2003. Windows XP seems to work fine. DAVExplorer will corrupt filenames if your Slide is set to UTF-8.

If you put a sniffer on your system and monitor the data transmitted between client and server, try the following. Upload a file called é.bat; the request sent by the client should look something like this:

    PUT /files/%c3%a9.bat HTTP/1.1
    Host: localhost:82

That is, the UTF-8 escaped version of é is %c3%a9. The unescaped UTF-8 version of é is the byte pair 0xC3 0xA9, which shows up as Ã© when it is viewed as ISO-8859-1. This is correct behavior.

/jacob

----- Original Message -----
From: delbd [EMAIL PROTECTED]
To: Slide Users Mailing List slide-user@jakarta.apache.org
Sent: Friday, April 29, 2005 3:19 PM
Subject: Non us-ascii character in filenames break. Was: French accent, getting crazy

I have submitted a detailed bug report of the problem. I hope the Slide developers will fix this fast!
http://issues.apache.org/bugzilla/show_bug.cgi?id=34679

On Friday 29 April 2005 12:55, Alexandre Clavaud wrote:

Thanks, that will be great. I have two projects:

1. For a customer, using Slide as a document management repository, accessed from Web Folders and from Java applications.
2. For Compiere, an open source ERP, using Slide as a document management repository fully integrated in the application, with document and folder types, metadata, workflow, ... If it works, it will be part of the core product.

Regards
Alexandre

Hooow shit! Tried it here. Indeed, Slide messes up the accents when sending its result to the client. I created a file with accents. The platform encoding is UTF-8, the Slide encoding is UTF-8, and the client is the KDE WebDAV protocol handler, which works nicely with accents on other WebDAV implementations. However, the result of a PROPFIND (sniffed with Ethereal) sent by the Slide server looks as if the string had been converted to a UTF-8 byte array and then converted back to a string as if it were an ISO-8859-1 byte array. (This is the typical copyright sign next to another character that we all see when a browser tries to open a UTF-8 page as an ISO-8859 one.) This looks like it is done before the server puts the value in the PROPFIND result DOM, so the server does the mess-up before URL encoding. For information, not only the href is wrong but also the displayname. Clients bear no responsibility for the problem.

I also looked in the database (we store documents in an Oracle database): the URI and the displayname are both fine. So it seems to be the servlet that messes something up on output. I'll do some step-by-step analysis and keep you informed if I can find a way around this. Note to slide-dev: this is a really big problem, as the document becomes unmanageable!

On Friday 29 April 2005 11:35, Alexandre Clavaud wrote:

Then, rather than using UTF-8, should I use ISO8859-1?

I have Slide 2.1 working with UTF-8. But you should note that Windows 2000 with Office 97 and DAVExplorer do not support UTF-8. Have a look at:
http://greenbytes.de/tech/webdav/webfolder-client-list.html

/jacob

----- Original Message -----
From: Alexandre Clavaud [EMAIL PROTECTED]
To: slide-user@jakarta.apache.org
Sent: Friday, April 29, 2005 8:48 AM
Subject: French accent, getting crazy

Hello,

Has anyone managed to make Slide (2.1 or later) work with French accents?

    using Oracle store (Oracle 10g)?
    using File store (Linux)?
    using BEA WebLogic (v8.1 on Linux)?
    from DAV Explorer?
    from Web Folders on Windows 2000 with Office 97?

I tried different combinations of utf8 and iso8859-1 in slide.properties, but I still get an error when getting the file or when browsing the content of a folder (the file is displayed with '_' instead of the accented characters).

I really need help. I'm getting crazy, and I've got a big project on which I want to use Slide.

Alexandre Clavaud
Technical Consultant
ILEM S.A
Tel: +41 79 773 6888
Email : [EMAIL PROTECTED]
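Both encoding facts in this message can be checked without a sniffer: the escaped form Jacob expects in the PUT request line, and the "UTF-8 bytes read back as ISO-8859-1" symptom from the quoted PROPFIND report. The following is a minimal sketch, not Slide code; the class and method names are made up for illustration, and the escaping covers only the RFC 3986 unreserved characters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Slide.
final class PercentEncodingSketch {

    /** Percent-encodes every byte that is not an RFC 3986 "unreserved" character. */
    static String percentEncode(String segment, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : segment.getBytes(charset)) {
            int v = b & 0xFF;
            boolean unreserved = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || v == '-' || v == '.' || v == '_' || v == '~';
            if (unreserved) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }

    /** Encodes the text as UTF-8, then misreads those bytes as ISO-8859-1. */
    static String misreadUtf8AsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u00e9 is é; the escape keeps the example independent of the source file encoding.
        String name = "\u00e9.bat";
        System.out.println(percentEncode(name, StandardCharsets.UTF_8));
        System.out.println(percentEncode(name, StandardCharsets.ISO_8859_1));
        System.out.println(misreadUtf8AsLatin1("\u00e9"));
    }
}
```

A client that escapes the name as %E9.bat instead of %C3%A9.bat is sending ISO-8859-1, which is the case Jacob suspects. Percent escapes are case-insensitive, so %c3%a9 and %C3%A9 are the same request.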
Re: Non us-ascii character in filenames break. Was: French accent, getting crazy....
The PUT went through without a problem. The data is encoded correctly in the database. It's an Oracle database, and the accents are correct in it. On PROPFIND, Slide sends a list of documents with an encoded href. This is the href the client should send back to Slide when it attempts any operation on the document. As detailed in the bug report, Slide is unable to decode the href it has sent itself; this has nothing to do with the client, IMO.

For example, a document /files/d0_public/téèst.txt gets an href like this in the result of a PROPFIND on d0_public:

    <D:response xmlns:D="DAV:">
    <D:href>/intranet/DAV/files/d0_public/t%C3%A9%C3%A8st.txt</D:href>
    <D:propstat>
    <D:prop>
    ...blablabla

However, a GET on this URL returns an object not found. The problem arises whether Slide is configured with UTF-8 or another charset. I also set java.io.encoding to UTF-8 to make UTF-8 the default String encoding (just to be sure). See the transcript:

    [EMAIL PROTECTED]:~$ telnet localhost 8080
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    GET /intranet/DAV/files/d0_public/t%C3%A9%C3%A8st.txt HTTP/1.1
    Host: localhost:8080

    HTTP/1.1 404 Not Found: No object found at /files/d0_public/t%C3%A9%C3%A8st.txt
    Server: Apache-Coyote/1.1
    Set-Cookie: JSESSIONID=5C06606B1A4C0A5DC6629178C9009704; Path=/intranet
    Content-Type: text/html;charset=utf-8
    Content-Length: 1148
    Date: Mon, 02 May 2005 08:50:53 GMT

    <html><head><title>Apache Tomcat/5.5.7 - Error report</title><style><!--
    H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;}
    H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;}
    H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;}
    BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;}
    B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;}
    P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}
    A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head>
    <body><h1>HTTP Status 404 - Not Found: No object found at /files/d0_public/t%C3%A9%C3%A8st.txt</h1>
    <HR size="1" noshade="noshade"><p><b>type</b> Status report</p>
    <p><b>message</b> <u>Not Found: No object found at /files/d0_public/t%C3%A9%C3%A8st.txt</u></p>
    <p><b>description</b> <u>The requested resource (Not Found: No object found at /files/d0_public/t%C3%A9%C3%A8st.txt) is not available.</u></p>
    <HR size="1" noshade="noshade"><h3>Apache Tomcat/5.5.7</h3></body></html>
    Connection closed by foreign host.

--
David Delbecq
Royal Meteorological Institute of Belgium
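The 404 message in David's transcript prints the path with its percent escapes intact, which is at least consistent with the lookup key never having been decoded, or having been decoded with the wrong charset. The sketch below illustrates that hypothesis only; it is not a confirmed diagnosis of Slide. The one-entry STORE set stands in for the Oracle store, and all names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Set;

// Hypothetical model of a URI lookup, not part of Slide.
final class HrefLookupSketch {

    // Stand-in for the store: it holds the decoded URI, as David reports seeing in Oracle.
    static final Set<String> STORE =
            Collections.singleton("/files/d0_public/t\u00e9\u00e8st.txt");

    /** Turns %XX escapes back into bytes and reads those bytes in the given charset. */
    static String percentDecode(String path, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '%' && i + 2 < path.length()) {
                bytes.write(Integer.parseInt(path.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                bytes.write(c); // sketch assumes everything outside the escapes is ASCII
            }
        }
        return new String(bytes.toByteArray(), charset);
    }

    static boolean exists(String uri) {
        return STORE.contains(uri);
    }

    public static void main(String[] args) {
        String requested = "/files/d0_public/t%C3%A9%C3%A8st.txt";
        System.out.println("undecoded:  " + exists(requested));
        System.out.println("ISO-8859-1: " + exists(percentDecode(requested, StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8:      " + exists(percentDecode(requested, StandardCharsets.UTF_8)));
    }
}
```

Only the UTF-8 decoding yields a key that matches the stored name; the undecoded path and the ISO-8859-1 decoding both miss, which would surface as "object not found".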
Re: Non us-ascii character in filenames break. Was: French accent, getting crazy....
I just uploaded a file with that exact name, téèst.txt, to my Slide version 2.1. I tried with both the Windows Explorer Web Folders client and my own client, and it worked fine. Are you testing with the txfilestore? BTW, UTF-8 is broken in 2.2, so you should stay with 2.1 for now.

Also, you are only showing the GET here. Could you verify that the PUT request from your client is also encoded as UTF-8?

My problem is that it works fine in my case, for both the file store and the SQL Server store. This makes me conclude that it is a setup issue or an Oracle store problem. Or am I missing something?

/jacob
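Jacob's request to verify that the client's PUT is UTF-8 encoded can be checked mechanically once the request path has been captured with a sniffer. The sketch below is one possible check, assuming the captured path is ASCII apart from its percent escapes; the class and method names are hypothetical. It undoes the escapes and asks a strict UTF-8 decoder whether the resulting bytes are well formed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical checker for captured request paths, not part of Slide.
final class Utf8RequestCheck {

    /** Undoes %XX escapes and returns the raw bytes the client actually sent. */
    static byte[] rawBytes(String requestPath) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < requestPath.length(); i++) {
            char c = requestPath.charAt(i);
            if (c == '%' && i + 2 < requestPath.length()) {
                bytes.write(Integer.parseInt(requestPath.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                bytes.write(c); // assumed to be plain ASCII
            }
        }
        return bytes.toByteArray();
    }

    /** Returns true only if the bytes form well-formed UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidUtf8(rawBytes("/files/%c3%a9.bat"))); // UTF-8 client
        System.out.println(isValidUtf8(rawBytes("/files/%E9.bat")));    // ISO-8859-1 client
    }
}
```

A well-formed result does not prove the client intended UTF-8, but a malformed one shows that it did not send UTF-8.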
Re: Non us-ascii character in filenames break. Was: French accent, getting crazy....
A very quick browse through Google makes me believe that Oracle only stores Unicode if you use NVARCHAR2, NCLOB, etc., even if the database itself is set to UTF-8. Or am I way off? Could you try a search and replace in the Oracle schema, from VARCHAR2 to NVARCHAR2 and from CLOB to NCLOB, and then create the database again?

/jacob

----- Original Message -----
From: delbd [EMAIL PROTECTED]
To: Slide Users Mailing List slide-user@jakarta.apache.org
Sent: Monday, May 02, 2005 2:27 PM
Subject: Re: Non us-ascii character in filenames break. Was: French accent, getting crazy

Yes, the Oracle database here is configured to use a Unicode character set as the default charset for all text fields.

On Monday 2 May 2005 13:26, Jacob Lund wrote:

I just noticed something: is the SQL schema for Oracle using Unicode? In order to make SQL Server support UTF-8 I had to change varchar to nvarchar; otherwise it would react in a way similar to what you describe. If you create a file on your desktop, cut and paste some Russian characters into the filename, and then upload the file to Slide, it will fail unless the database supports Unicode.

/jacob
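Jacob's suggested search and replace on the Oracle schema can be scripted so that it is safe to run more than once. The sketch below shows only the text substitution; whether the change is needed at all is Jacob's guess, not something the thread confirms. The table and column names in the example are invented, not taken from the real Slide schema.

```java
import java.util.regex.Pattern;

// Hypothetical DDL rewriter, not part of Slide.
final class OracleSchemaRewrite {

    // \b keeps an existing NVARCHAR2 or NCLOB from being rewritten a second time.
    private static final Pattern VARCHAR2 =
            Pattern.compile("\\bVARCHAR2\\b", Pattern.CASE_INSENSITIVE);
    private static final Pattern CLOB =
            Pattern.compile("\\bCLOB\\b", Pattern.CASE_INSENSITIVE);

    /** Replaces VARCHAR2 with NVARCHAR2 and CLOB with NCLOB in the given DDL text. */
    static String toNationalTypes(String ddl) {
        String step = VARCHAR2.matcher(ddl).replaceAll("NVARCHAR2");
        return CLOB.matcher(step).replaceAll("NCLOB");
    }

    public static void main(String[] args) {
        // Invented example table; the real schema script would be read from a file.
        String ddl = "CREATE TABLE DEMO_URI (URI_STRING VARCHAR2(255), BODY CLOB)";
        System.out.println(toNationalTypes(ddl));
    }
}
```

Column sizes would still need a manual review afterwards, since the length limits of the national character types may differ from those of the original columns.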
Re: Non us-ascii character in filenames break. Was: French accent, getting crazy....
I will try something similar with Slide 2.1; thanks for the suggestion.

On Friday 29 April 2005 16:55, Alexandre Clavaud wrote:

I managed to patch it (working with Slide-cvs-head-2.2pre1):

1. Use org.apache.slide.urlEncoding=ISO8859-1 in slide.properties.
2. In the getRelativePath method of the org.apache.slide.webdav.util.WebdavUtils class, change:

    if (result == null) {
        if (config.isDefaultServlet()) {
            result = req.getServletPath();
        } else {
            result = req.getRequestURI();
            result = result.substring(req.getContextPath().length() + req.getServletPath().length());
        }
    }

to:

    if (result == null) {
        if (config.isDefaultServlet()) {
            result = req.getRequestURI().substring(req.getContextPath().length());
        } else {
            result = req.getRequestURI();
            result = result.substring(req.getContextPath().length() + req.getServletPath().length());
        }
    }
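One plausible reason Alexandre's patch helps, offered as an assumption rather than something the thread confirms: getServletPath() returns a path the container has already decoded with its own charset, whereas getRequestURI() returns the raw, still-escaped string, so the application can decode it with a charset of its choosing. The sketch below models the patched branch with plain strings so that it runs without the servlet API; the class, method, and parameter names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical model of the patched getRelativePath logic, not Slide code.
final class RelativePathSketch {

    /** Mirrors the patched logic: always cut the prefix off the raw request URI. */
    static String relativePath(String requestUri, String contextPath,
                               String servletPath, boolean defaultServlet) {
        if (defaultServlet) {
            return requestUri.substring(contextPath.length());
        }
        return requestUri.substring(contextPath.length() + servletPath.length());
    }

    /** Decodes the raw path with an explicit charset instead of the container default. */
    static String decode(String rawPath, String charsetName) {
        try {
            // Caveat: URLDecoder also turns '+' into a space, which is wrong for real
            // URL paths; it is acceptable here only because the sample has no '+'.
            return URLDecoder.decode(rawPath, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String raw = relativePath(
                "/intranet/files/d0_public/t%C3%A9%C3%A8st.txt", "/intranet", "", true);
        System.out.println(raw);
        System.out.println(decode(raw, "UTF-8"));
    }
}
```

The same helper covers the non-default case: with a context path of /intranet and a servlet path of /DAV, the raw URI /intranet/DAV/files/x.txt is reduced to /files/x.txt before decoding.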