Re: [Neo4j] how Neo4j work for sorting chinese character?

2011-09-08 Thread Peter Neubauer
Thanks Yuanlong,
we will look at it as soon as we have some time!

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Thu, Sep 8, 2011 at 2:38 AM, iamyuanlong  wrote:
> I added here : https://github.com/neo4j/community/issues/14
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] how Neo4j work for sorting chinese character?

2011-09-07 Thread iamyuanlong
I added here : https://github.com/neo4j/community/issues/14


--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-how-Neo4j-work-for-sorting-chinese-character-tp3309754p3318317.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.


Re: [Neo4j] how Neo4j work for sorting chinese character?

2011-09-07 Thread Peter Neubauer
Yuan,
could you make a test for this, and issue a pull request on GitHub? This
should absolutely be part of the main code so you don't have to have your
own fork.

Cheers,

/peter neubauer



On Wed, Sep 7, 2011 at 3:34 AM, iamyuanlong  wrote:

> Aha! I changed the Cypher Scala source code myself. Thank you for your
> help.


Re: [Neo4j] how Neo4j work for sorting chinese character?

2011-09-06 Thread iamyuanlong
Aha! I changed the Cypher Scala source code myself. Thank you for your
help.

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-how-Neo4j-work-for-sorting-chinese-character-tp3309754p3315369.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.


Re: [Neo4j] how Neo4j work for sorting chinese character?

2011-09-05 Thread wang yuanlong
Hi,
In Java we can sort by Pinyin like this (Sun provides a Comparator for it):

public int compare(String o1, String o2) {
    return Collator.getInstance(Locale.CHINESE).compare(o1, o2);
}
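
To make the Collator approach concrete, here is a minimal sketch of sorting a whole list with it. The class name CollatorSortSketch and the method name sortWithChineseCollator are made up for this illustration, and the resulting order depends on the collation rules shipped with the JDK in use, so no particular output is claimed.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CollatorSortSketch {

    /** Returns a sorted copy of the input, ordered by the JDK's Chinese collation rules. */
    public static List<String> sortWithChineseCollator(List<String> names) {
        // Create the Collator once and reuse it for every comparison in the sort.
        Collator collator = Collator.getInstance(Locale.CHINESE);
        List<String> sorted = new ArrayList<>(names);
        // Collator implements Comparator<Object>, so it can be passed to sort directly.
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("宋", "孟", "张", "阿");
        System.out.println(sortWithChineseCollator(names));
    }
}
```

Creating the Collator once, rather than on every compare call as in the snippet above, avoids repeated lookups during a sort.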

But it has some flaws.
Chinese has many homophones,
but Sun's Comparator does not treat them as equal to each other:

Assert.assertTrue(comparator.compare("怕", "帕") != 0); //怕 pà 帕 pà

Less common Chinese characters, such as '怡', are also not sorted usefully
by this Comparator:

Assert.assertTrue(comparator.compare("怡", "张") > 0); //怡 yí 张 zhāng

Fortunately, there is an open source project on SourceForge:
http://pinyin4j.sourceforge.net/

With it we can convert Chinese characters to Pinyin, and then sorting is easy.

Here is some Java code; not all of it was written by me.
-

/**
  * @author Jeff
  *
  * Copyright (c)
  */
package chinese.utility;

import java.util.Comparator;
import net.sourceforge.pinyin4j.PinyinHelper;

public class PinyinComparator implements Comparator<String> {

    public int compare(String o1, String o2) {

        for (int i = 0; i < o1.length() && i < o2.length(); i++) {

            // Read whole code points so supplementary characters are detected.
            int codePoint1 = o1.codePointAt(i);
            int codePoint2 = o2.codePointAt(i);

            // A supplementary code point occupies two chars; skip the second one.
            if (Character.isSupplementaryCodePoint(codePoint1)
                    || Character.isSupplementaryCodePoint(codePoint2)) {
                i++;
            }

            if (codePoint1 != codePoint2) {
                // pinyin4j looks up single chars only, so compare by code point here.
                if (Character.isSupplementaryCodePoint(codePoint1)
                        || Character.isSupplementaryCodePoint(codePoint2)) {
                    return codePoint1 - codePoint2;
                }

                String pinyin1 = pinyin((char) codePoint1);
                String pinyin2 = pinyin((char) codePoint2);

                if (pinyin1 != null && pinyin2 != null) { // both are Chinese characters
                    if (!pinyin1.equals(pinyin2)) {
                        return pinyin1.compareTo(pinyin2);
                    }
                } else {
                    return codePoint1 - codePoint2;
                }
            }
        }
        return o1.length() - o2.length();
    }

    /**
     * For a polyphonic character, use the first reading. Returns null if the
     * character is not Chinese.
     */
    private String pinyin(char c) {
        String[] pinyins = PinyinHelper.toHanyuPinyinStringArray(c);
        if (pinyins == null) {
            return null;
        }
        return pinyins[0];
    }
}
---

The junit4 Test.
---

/**
  * @author Jeff
  *
  * Copyright (c)
  */
package chinese.utility.test;

import java.util.Comparator;

import org.junit.Assert;
import org.junit.Test;

import chinese.utility.PinyinComparator;

public class PinyinComparatorTest {

    private Comparator<String> comparator = new PinyinComparator();

    /**
     * Common characters
     */
    @Test
    public void testCommon() {
        Assert.assertTrue(comparator.compare("孟", "宋") < 0);
    }

    /**
     * Different lengths
     */
    @Test
    public void testDifferentLength() {
        Assert.assertTrue(comparator.compare("天气真好", "天气真好啊") < 0);
    }

    /**
     * Comparison with non-Chinese characters
     */
    @Test
    public void testNoneChinese() {
        Assert.assertTrue(comparator.compare("a", "阿") < 0);
        Assert.assertTrue(comparator.compare("1", "阿") < 0);
    }

    /**
     * Less common characters (怡)
     */
    @Test
    public void testNoneCommon() {
        Assert.assertTrue(comparator.compare("怡", "张") < 0);
    }

    /**
     * Homophones
     */
    @Test
    public void testSameSound() {
        Assert.assertTrue(comparator.compare("怕", "帕") == 0);
    }

    /**
     * Polyphonic characters (曾 [zēng, céng])
     */
    @Test
    public void testMultiSound() {
        Assert.assertTrue(comparator.compare("曾经", "曾迪") > 0);
    }

}
--
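
As a usage sketch, the same character-by-character Pinyin comparison can be applied to the usernames from the original question, sorted in descending order. To keep the example runnable without pinyin4j, it uses a small hand-written lookup table as a stand-in for PinyinHelper; the table, the class name PinyinOrderSketch and the method name sortDescending are assumptions made only for this illustration, and the table covers just the characters this data set needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PinyinOrderSketch {

    // Hypothetical hand-written stand-in for pinyin4j (pinyin plus tone number).
    private static final Map<Character, String> PINYIN = new HashMap<>();
    static {
        PINYIN.put('镇', "zhen4");
        PINYIN.put('脚', "jiao3");
        PINYIN.put('股', "gu3");
        PINYIN.put('童', "tong2");
        PINYIN.put('票', "piao4");
        PINYIN.put('风', "feng1");
        PINYIN.put('蝶', "die2");
        PINYIN.put('达', "da2");
        PINYIN.put('财', "cai2");
    }

    // Compares char by char: by Pinyin when both chars are in the table,
    // otherwise by char value; shorter strings sort first on a shared prefix.
    static final Comparator<String> BY_PINYIN = (a, b) -> {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            char c1 = a.charAt(i);
            char c2 = b.charAt(i);
            if (c1 == c2) {
                continue;
            }
            String p1 = PINYIN.get(c1);
            String p2 = PINYIN.get(c2);
            if (p1 == null || p2 == null) {
                return Character.compare(c1, c2);
            }
            int cmp = p1.compareTo(p2);
            if (cmp != 0) {
                return cmp;
            }
            // Homophones count as equal here, so move on to the next char.
        }
        return Integer.compare(a.length(), b.length());
    };

    /** Returns a copy of the names sorted by Pinyin in descending order. */
    public static List<String> sortDescending(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(BY_PINYIN.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "风过这头", "镇定的猎豹", "达小鱼儿", "财富分享",
                "蝶儿菲菲", "脚一滑", "股童天尊", "股票赢家888");
        for (String name : sortDescending(names)) {
            System.out.println(name);
        }
    }
}
```

With the real library, the table lookup would be replaced by the pinyin(char) helper from the PinyinComparator above.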



Re: [Neo4j] how Neo4j work for sorting chinese character?

2011-09-05 Thread Peter Neubauer
Yuanlong,
can you provide Java code on how to sort Pinyin characters? In that case, I
am sure there is a way to incorporate it into the Cypher sorting routines.
It would be very helpful since we don't even know how to test Pinyin sorting
for correctness :/

Cheers,

/peter neubauer



On Mon, Sep 5, 2011 at 4:59 AM, iamyuanlong  wrote:

> Hi,
>
> Sorry to disturb you, and please excuse my bad English.
>
> I'd like to use the CypherParser of Neo4j.
> When I query the users' info ordered by user.username desc,
> I get a result that differs slightly from the result in SQL Server.
> I would like the result to be sorted by Chinese Pinyin.
>
> E.g., I got:
> 风过这头
> 镇定的猎豹
> 达小鱼儿
> 财富分享
> 蝶儿菲菲
> 脚一滑
> 股童天尊
> 股票赢家888
>
> I hoped for:
> 镇定的猎豹
> 脚一滑
> 股童天尊
> 股票赢家888
> 风过这头
> 蝶儿菲菲
> 达小鱼儿
> 财富分享
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-how-Neo4j-work-for-sorting-chinese-character-tp3309754p3309754.html
> Sent from the Neo4j Community Discussions mailing list archive at
> Nabble.com.