Re: [Dspace-tech] FW: 1.5.2 browse and UTF-8 and diacritics
We had the same problem. We had specified URIEncoding=UTF-8 in the wrong place in the server.xml config: under the Connector port=8080 maxHttpHeaderSize=8192 section. But as we use proxy_ajp to port 8009, we needed to specify it under the Connector port=8009 section.

So it looks like this in server.xml:

    <Connector port="8009" URIEncoding="UTF-8" enableLookups="false" redirectPort="8443" protocol="AJP/1.3" />

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
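To illustrate why the connector that actually receives the request needs the setting, here is a minimal Java sketch. It is not Tomcat code: it assumes that a connector without URIEncoding decodes the query string as ISO-8859-1 (the default in Tomcat releases of that period), and the class name UriDecodingDemo is made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of Tomcat or DSpace.
final class UriDecodingDemo {

    // Decode a percent-encoded query value with the given charset name.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "Müller, D." percent-encoded as UTF-8, as it appears in the browse URL.
        String value = "M%C3%BCller%2C+D.";

        // What a connector with URIEncoding="UTF-8" would pass to the webapp.
        System.out.println(decode(value, "UTF-8"));

        // Assumed behaviour without URIEncoding: the bytes C3 BC are read
        // as two separate Latin-1 characters, so the author no longer matches.
        System.out.println(decode(value, "ISO-8859-1"));
    }
}
```

With the second decoding the browse value no longer equals the stored author name, which would explain a browse page that shows no items.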
Re: [Dspace-tech] FW: 1.5.2 browse and UTF-8 and diacritics
Hi Jennifer,

the sort order class is defined in dspace.cfg, so you can create your own class that ignores diacritics and use it. Try something like this:

    package my.edu.sort;

    import org.dspace.sort.AbstractTextFilterOFD;
    import org.dspace.text.filter.LowerCaseAndTrim;
    import org.dspace.text.filter.StripDiacritics;
    import org.dspace.text.filter.TextFilter;

    public class OrderFormatAuthorIgnoreDiacritics extends AbstractTextFilterOFD
    {
        {
            // Instance initializer: strip diacritics first, then lower-case and trim.
            filters = new TextFilter[] { new StripDiacritics(), new LowerCaseAndTrim() };
        }
    }

and add in the dspace.cfg (using the fully qualified name of the class above):

    author = my.edu.sort.OrderFormatAuthorIgnoreDiacritics

Hope this helps,
Andrea
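The effect of a strip-then-lower-case filter chain on the browse order can be sketched outside DSpace. The example below is only a stand-in: it uses the JDK's java.text.Normalizer rather than the DSpace StripDiacritics and LowerCaseAndTrim filters, whose internals are not shown in this thread, and the class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a "strip diacritics, lower-case, trim" sort key.
final class StripDiacriticsSortDemo {

    // Build a sort key: decompose, drop combining marks, lower-case, trim.
    static String sortKey(String name) {
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        return stripped.toLowerCase(Locale.ROOT).trim();
    }

    // Return a copy of the names ordered by their diacritic-free keys.
    static List<String> sortIgnoringDiacritics(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparing(StripDiacriticsSortDemo::sortKey));
        return sorted;
    }

    public static void main(String[] args) {
        // \u00fc is "ü"; escaped so the source file encoding does not matter.
        List<String> authors = Arrays.asList(
                "Muston, C.", "M\u00fcller, D.", "Muller, W.J.", "Mull, A. E. E.");
        for (String author : sortIgnoringDiacritics(authors)) {
            System.out.println(author);
        }
    }
}
```

Under these assumptions "Müller, D." lands between "Mull, A. E. E." and "Muller, W.J.", which is the order the cataloguers asked for.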
Re: [Dspace-tech] FW: 1.5.2 browse and UTF-8
Hi Jennifer,

the URIEncoding needs to be set to UTF-8, so your current setting is OK.

I'm not able to reproduce the issue on the http://dspace-testhaton.cilea.it/xmlui instance, see

http://dspace-testhaton.cilea.it/xmlui/browse?value=name3%2C+M%C3%BCller&type=author
http://dspace-testhaton.cilea.it/xmlui/browse?value=An%C3%B6ther%2C+Te%C5%A1t&type=author

Have you made any customizations? Are you sure you have deployed the new war? Also try cleaning the Tomcat work directory. I'm not sure this will help in this case, but just in case...

Hope this helps,
A.

--
Dott. Andrea Bollini
Project Manager, IT Architect Systems Integrator
Sezione Servizi per le Biblioteche e l'Editoria Elettronica
CILEA, http://www.cilea.it
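For comparison with the test URLs in the reply above, this is a small Java sketch of how an author value ends up percent-encoded as UTF-8. The value/type parameter layout is taken from the URLs in this thread, assuming the `&` separators that the archive appears to have dropped; the class and method names are hypothetical and this is not how DSpace itself builds its links.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; it only mirrors the URL shape quoted in the thread.
final class BrowseUrlDemo {

    // Build the query part of an author browse link, encoding the value as UTF-8.
    static String browseQuery(String author) {
        try {
            return "browse?value=" + URLEncoder.encode(author, "UTF-8") + "&type=author";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00fc is "ü", \u00f6 is "ö", \u0161 is "š".
        System.out.println(browseQuery("M\u00fcller, D."));
        System.out.println(browseQuery("An\u00f6ther, Te\u0161t"));
    }
}
```

If the server decodes these values with the same charset, the author string round-trips unchanged; a mismatch on either side is one plausible cause of the empty browse page.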
[Dspace-tech] FW: 1.5.2 browse and UTF-8
From: Jennifer Whalan
Sent: Wednesday, 6 May 2009 10:32 AM
To: dspace-tech@lists.sourceforge.net

Just resending, as I did not get any replies.

A question about browsing in the 1.5.2 XMLUI.

On our test instance, we have upgraded from 1.5.1 to 1.5.2 (using Manakin), and we have an author with the name Müller, D. However, when you go to view the browse list of the items of this author, the URL is browse?value=Müller%2C+D.&type=author, but the page says "Browsing by Author Müller, D." and shows no items.

Reading through the changelist for 1.5.2, my understanding was that http://jira.dspace.org/jira/browse/DS-132 fixed this problem(?). Another issue (http://jira.dspace.org/jira/browse/DS-130) says that you need to remove URIEncoding=UTF-8 from the Tomcat settings. This setting is currently on for us. In 1.5.2, do you still need to remove this from the Tomcat settings (although the manual states that, when installing, you need to make sure it is added to the Tomcat settings), or am I missing something? Also, the web.xml file does have the Cocoon filter in it.

If it makes any difference, the request header for the browse page of this author is

    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

and the response header is

    Content-Type: text/html;charset=utf-8

Thanks
Jennifer Whalan

Jennifer Whalan
Territory Stories Administrator
Innovation Access, Northern Territory Library
Department of Natural Resources, Environment, The Arts and Sport
Northern Territory Government
Web: http://www.ntl.nt.gov.au
Re: [Dspace-tech] FW: 1.5.2 browse and UTF-8 and diacritics
While I was at it: has anyone solved the issue of the sort order for authors (or subjects, I suppose) that contain diacritics?

At this link:
http://www.territorystories.nt.gov.au/browse?order=ASC&rpp=20&sort_by=-1&etal=-1&offset=5823&type=author

we have the authors:

Muston, C.
Mutch, Verdun Joseph.
Müller, D.
Myeni, Annie D.
Myerscough, Mark.

But our cataloguers expect the author "Müller, D." to be between the authors:

Mull, A. E. E.
and
Muller, W.J.

I've had a look at the source, and if I'm reading it correctly, OrderFormatAuthor is the file that controls this, and when it calls DecomposeDiacitics, it changes this author to "mu(diacritic)ller, w.j." I'm assuming that, because it places the diacritic after the u, this author sorts after all the authors that begin with Mu.

To make this long story short, is there a way to make the sort ignore diacritics completely and just order by the base character?

Thanks
Jennifer

Jennifer Whalan
Territory Stories Administrator
Innovation Access, Northern Territory Library
Department of Natural Resources, Environment, The Arts and Sport
Northern Territory Government
Web: http://www.ntl.nt.gov.au
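The ordering described in the question can be reproduced with a short Java sketch. It assumes that the sort key is the lower-cased, canonically decomposed name compared code unit by code unit; that is a guess at what the DSpace filter does, not a reading of its source, and the class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical demo of sorting on decomposed (but not stripped) keys.
final class DecomposedSortDemo {

    // Decompose only: "ü" becomes "u" followed by the combining mark U+0308.
    static String decomposedKey(String name) {
        return Normalizer.normalize(name, Normalizer.Form.NFD)
                .toLowerCase(Locale.ROOT)
                .trim();
    }

    // Return a copy of the names ordered by their decomposed keys.
    static List<String> sortByDecomposedKey(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparing(DecomposedSortDemo::decomposedKey));
        return sorted;
    }

    public static void main(String[] args) {
        // \u00fc is "ü"; escaped so the source file encoding does not matter.
        List<String> authors = Arrays.asList(
                "Myeni, Annie D.", "M\u00fcller, D.", "Mutch, Verdun Joseph.", "Muston, C.");
        for (String author : sortByDecomposedKey(authors)) {
            System.out.println(author);
        }
    }
}
```

Because U+0308 has a higher code point than any ASCII letter, the key "mu" + U+0308 + "ller" compares greater than "mus..." and "mut..." but still less than "my...", which matches the Muston, Mutch, Müller, Myeni order seen on the browse page.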