Re: [Dspace-tech] Browse by Author/Subject in Chinese

2010-03-22 Thread Allen Lam
Hi Hayden,

In our case, items with Chinese author names usually also carry pinyin 
or some other Latinized form of the name in the record. All names are 
in natural order, i.e. the A-Z list comes first, followed by the 
Chinese names. So you'll see the surnames Lee, then Li, and then, 
inside the Chinese block, 李.
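
A minimal sketch of that effect, assuming a plain code point sort. The three headings are only the surnames mentioned above, not real records from the Hub, and this is not DSpace's own code:

```python
# Hypothetical author headings: two Latinized forms and one Chinese form.
authors = ["李", "Li", "Lee"]

# Python compares strings code point by code point. Latin letters have
# much smaller code points than CJK ideographs, so they sort first.
ordered = sorted(authors)
print(ordered)  # ['Lee', 'Li', '李']
```

Lee precedes Li because "e" (101) is smaller than "i" (105), and 李 (26446) lands after every Latin letter.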

Our IR is open to the public. You are welcome to click into each section 
and page to see how the rows are listed.

Best,
Allen Lam.
HKU Hub Administrator, http://hub.hku.hk




--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] Browse by Author/Subject in Chinese

2010-03-19 Thread Mr Havercamp
Thanks Allen. If HKU is satisfied with natural ordering then it should 
be more than sufficient for our project as well.

Are you also handling pinyin, or Latinized, names within your author 
lists? If so, are they listed along with the Chinese names, and how are 
they sorted on the screen?

Cheers


Hayden



Re: [Dspace-tech] Browse by Author/Subject in Chinese

2010-03-18 Thread Allen Lam
Hi Hayden,

Some of our DSpace items have Chinese titles and Chinese author names.
We do not apply special ordering for any specific language, so 
Chinese text is sorted in its "natural ordering".
We store data in UTF-8. "Natural" here means that each character of a 
string (English, Chinese, or any other language) is interpreted as an 
integer value, its Unicode code point, and the resulting sequences of 
integers are sorted from smallest to largest.

For example, a list of Chinese names is naturally ordered like this:

何子雅
何存德
何存邦
何學儉
何學強
何宏德
何宗憲
何宗義
何定邦
何宛珊

because their numerical values (decimal Unicode code points) are:

20309 23376 38597
20309 23384 24503
20309 23384 37030
20309 23416 20745
20309 23416 24375
20309 23439 24503
20309 23447 25010
20309 23447 32681
20309 23450 37030
20309 23451 29642
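
The two lists above can be reproduced with a short sketch. It assumes a plain code point comparison, which is what Python's built-in string ordering does; whether a given DSpace installation sorts exactly this way depends on its database and collation settings, so treat it as an illustration of the rule rather than DSpace's implementation. The helper name `code_points` is made up for the example.

```python
# The ten names from the example above, deliberately out of order.
names = ["何宛珊", "何子雅", "何定邦", "何存德", "何宗義",
         "何存邦", "何宗憲", "何學儉", "何宏德", "何學強"]


def code_points(text):
    """Return the decimal Unicode code point of each character."""
    return [ord(ch) for ch in text]


# Python compares strings code point by code point, so a plain sort
# gives the "natural ordering" described above.
natural = sorted(names)

# Print each name followed by its code points.
for name in natural:
    print(name, *code_points(name))
```

Sorting the UTF-8 encoded bytes instead (`key=lambda s: s.encode("utf-8")`) gives the same order, because UTF-8 preserves code point order.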

What counts as the "correct order" is subjective. I feel the 
natural ordering is correct enough for our use.

You can redefine the ordering by providing stroke, binary or phonetic 
information for each string of text you want to sort in DSpace. 
That means defining extra dc tags and supplying extra metadata values 
when you submit each item.
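
A rough sketch of that idea, under stated assumptions: `sort_key` is a hypothetical stand-in for the extra metadata value (it is not a real DSpace field name), and the keys are hand-typed toneless Mandarin pinyin used purely for illustration.

```python
# Hypothetical records: "sort_key" stands in for an extra metadata value
# entered at submission time; it is not a real DSpace field name.
records = [
    {"author": "何子雅", "sort_key": "he zi ya"},
    {"author": "何存德", "sort_key": "he cun de"},
    {"author": "何定邦", "sort_key": "he ding bang"},
    {"author": "何宛珊", "sort_key": "he wan shan"},
]


def browse_order(items):
    """Sort by the supplied key; use the name itself to break ties."""
    return sorted(items, key=lambda r: (r["sort_key"], r["author"]))


by_key = [r["author"] for r in browse_order(records)]
print(by_key)  # ['何存德', '何定邦', '何宛珊', '何子雅']
```

With a phonetic key, 何子雅 moves from first place in the natural ordering to last. A stroke-count key would slot into the same field.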

Best,
Allen Lam.
HKU Hub Administrator, http://hub.hku.hk





[Dspace-tech] Browse by Author/Subject in Chinese

2010-03-17 Thread Mr Havercamp
We have a DSpace instance that stores metadata in Chinese. We are 
wondering whether anybody has tackled this and succeeded in ordering 
the Author and Subject lists correctly. If so, did you use the stroke, 
binary or phonetic method for sorting the relevant fields?

Cheers


Hayden
