Thanks for the responses.

I created the class OrderFormatLocale:

public class OrderFormatLocale extends AbstractTextFilterOFD {
        {
                filters = new TextFilter[] { new LowerCaseAndTrim(),
                                             new LocaleOrderingFilter() };
        }
}

but then the sorting was very strange. For example, the alphabet started
with B, A came after D, and there were other oddities like this. So I
modified the class by removing LocaleOrderingFilter, giving this form:

public class OrderFormatLocale extends AbstractTextFilterOFD {
        {
                filters = new TextFilter[] { new LowerCaseAndTrim()};
        }
}

Then sorting was correct in browsing (by title, author, and subject
too), but it became incorrect in search results. When search results
are sorted by title or author, strings with diacritics are sorted to
the end, after all letters without diacritics.
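
For illustration, the search-result symptom matches what plain code-unit
comparison of strings produces: "Č" (U+010C) has a higher code than every
unaccented Latin letter, so it lands after all of them. The sketch below is
not DSpace code. The class CzechSortDemo, its method names, and the sample
name "Dvořák, Jan" are made up for this example. It contrasts the natural
ordering of String with java.text.Collator, whose result for the "cs" locale
depends on the locale data shipped with the JDK, so no output is claimed
for it.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of DSpace.
public class CzechSortDemo {

    // Sorts by String.compareTo, i.e. by UTF-16 code unit values.
    // Any letter above U+007F sorts after all ASCII letters.
    static List<String> sortByCodeUnit(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(null); // null comparator means natural ordering
        return copy;
    }

    // Sorts with the collation rules the JDK provides for the locale.
    // The exact order depends on the JDK's locale data.
    static List<String> sortByLocale(List<String> names, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(names);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source ASCII-only:
        // \u010C = C with caron, \u00E1 = a with acute, \u0159 = r with caron.
        List<String> names = Arrays.asList(
                "\u010Cabla, Michael",
                "Dvo\u0159\u00E1k, Jan",      // hypothetical extra entry
                "Cablov\u00E1, Barbora",
                "Cabanov\u00E1, Zuzana");

        System.out.println("code unit order: " + sortByCodeUnit(names));
        System.out.println("cs_CZ collator:  "
                + sortByLocale(names, new Locale("cs", "CZ")));
    }
}
```

With code-unit ordering, the entry starting with "Č" ends up last, after the
"D" entry, which is the same pattern as in the search results.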

2011/5/19 Graham Triggs <grahamtri...@gmail.com>:
> Please take a look at a previous post of mine on this subject:
> http://dspace.2283337.n4.nabble.com/Browse-UTF-8-and-sorting-in-1-5-tp3281449p3281450.html
> Regards,
> G
>
> On 19 May 2011 15:18, Peter Dietz <pdiet...@gmail.com> wrote:
>>
>> Hi Ladislav,
>> I've noticed that our librarians here are happier with sorting when we use
>> the collate of C as opposed to utf8/en_US.
>>
>> postgres=# create database "dspace" with owner = dspace encoding='utf8'
>> tablespace=pg_default lc_collate = 'C' lc_ctype='en_US.UTF-8' template
>> template0;
>>
>> I've added these three authors to a test collection that had some sample
>> data in it, and it has the results you were expecting:
>> == Author Name ==
>> Cabanová, Zuzana
>> Cablová, Barbora
>> creatorlast, creatorfirst
>> Čabla, Michael
>>
>>
>>
>> Peter Dietz
>>
>>
>>
>> On Thu, May 19, 2011 at 4:41 AM, Ladislav Kulhanek
>> <ladislav.kulha...@vsb.cz> wrote:
>>>
>>> Hello everybody.
>>>
>>> We have data in our DSpace in the Czech language (code "cs" according
>>> to ISO 639-1) and we have a problem with ordering when browsing by
>>> author, title and subject (ordering in search results is correct).
>>> The Czech alphabet has letters with diacritics, for example "Č"
>>> (U+010C in Unicode). This letter should be ordered between "C"
>>> and "D", but in DSpace it is ordered in the same place as "C". For
>>> example, we get the ordered list
>>>
>>> Cabanová, Zuzana
>>> Čabla, Michael
>>> Cablová, Barbora
>>>
>>> and this list should be
>>>
>>> Cabanová, Zuzana
>>> Cablová, Barbora
>>> Čabla, Michael
>>>
>>> The Czech alphabet also contains the letter "Ch" (it consists of two
>>> characters). This letter should be ordered between "h" and "i", and
>>> DSpace orders it correctly. So it looks like DSpace orders
>>> according to the Czech alphabet but ignores diacritics.
>>> We have DSpace 1.7.1, Manakin, and PostgreSQL 8.4 (the database has
>>> Collation and Ctype set to cs_CZ.UTF-8), and the Tomcat connector has
>>> URIEncoding="UTF-8". Any idea how to solve it? Thanks.
>>>
>>> Ladislav Kulhanek
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> What Every C/C++ and Fortran developer Should Know!
>>> Read this article and learn how Intel has extended the reach of its
>>> next-generation tools to help Windows* and Linux* C/C++ and Fortran
>>> developers boost performance applications - including clusters.
>>> http://p.sf.net/sfu/intel-dev2devmay
>>> _______________________________________________
>>> DSpace-tech mailing list
>>> DSpace-tech@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>>
>>
>>
>>
>
>

