Hi,

In Java we can sort by Pinyin like this (Sun provides a Collator that does the comparison):

    public int compare(String o1, String o2) {
        return Collator.getInstance(Locale.CHINESE).compare(o1, o2);
    }

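To make that concrete, here is a minimal sketch that uses the Collator as a Comparator to sort a list. The class and method names (CollatorSortSketch, chineseComparator, sortByCollator) are made up for illustration only. The order of the Chinese characters depends on the collation rules shipped with your JDK, so no expected output is shown.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of the JDK or of Neo4j.
final class CollatorSortSketch {

    // Look the collator up once and reuse it, instead of calling
    // Collator.getInstance(...) again on every comparison.
    static Comparator<String> chineseComparator() {
        final Collator collator = Collator.getInstance(Locale.CHINESE);
        return (o1, o2) -> collator.compare(o1, o2);
    }

    // Returns a sorted copy; the input list is left untouched.
    static List<String> sortByCollator(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy, chineseComparator());
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("宋", "孟", "张", "怡");
        // The printed order depends on the JDK's collation data for zh.
        System.out.println(sortByCollator(names));
    }
}
```

Building the comparator once and passing it to Collections.sort gives the same result as the one-line compare method above; it only avoids repeating the Collator lookup for each pair.
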
But it has some flaws. Chinese has many homophones, yet Sun's Collator does not treat them as equal:

    Assert.assertTrue(comparator.compare("怕", "帕") != 0); // 怕 pà, 帕 pà

Less common Chinese characters, such as '怡', are also not sorted usefully by this comparator:

    Assert.assertTrue(comparator.compare("怡", "张") > 0); // 怡 yí, 张 zhāng

Fortunately, there is an open source project on SourceForge, pinyin4j:

    http://pinyin4j.sourceforge.net/

With it we can convert Chinese characters to Pinyin, and then the comparison becomes easy. Here is some Java code; not all of it was written by me.

-------------------------------------------------------------------------
/**
 * @author Jeff
 *
 * Copyright (c)
 */
package chinese.utility;

import java.util.Comparator;

import net.sourceforge.pinyin4j.PinyinHelper;

public class PinyinComparator implements Comparator<String> {

    public int compare(String o1, String o2) {
        for (int i = 0; i < o1.length() && i < o2.length(); i++) {
            // codePointAt (not charAt) is needed here: charAt returns a single
            // char, which can never be a supplementary code point.
            int codePoint1 = o1.codePointAt(i);
            int codePoint2 = o2.codePointAt(i);

            // A supplementary code point occupies two chars; skip the second one.
            if (Character.isSupplementaryCodePoint(codePoint1)
                    || Character.isSupplementaryCodePoint(codePoint2)) {
                i++;
            }

            if (codePoint1 != codePoint2) {
                if (Character.isSupplementaryCodePoint(codePoint1)
                        || Character.isSupplementaryCodePoint(codePoint2)) {
                    return codePoint1 - codePoint2;
                }

                String pinyin1 = pinyin((char) codePoint1);
                String pinyin2 = pinyin((char) codePoint2);

                if (pinyin1 != null && pinyin2 != null) {
                    // Both are Chinese characters; homophones fall through
                    // and the loop continues with the next character.
                    if (!pinyin1.equals(pinyin2)) {
                        return pinyin1.compareTo(pinyin2);
                    }
                } else {
                    return codePoint1 - codePoint2;
                }
            }
        }
        return o1.length() - o2.length();
    }

    /**
     * For a polyphonic character, use its first reading.
     * Returns null if the character is not a Chinese character.
     */
    private String pinyin(char c) {
        String[] pinyins = PinyinHelper.toHanyuPinyinStringArray(c);
        if (pinyins == null) {
            return null;
        }
        return pinyins[0];
    }
}
-------------------------------------------------------------------------

The JUnit 4 test:
-------------------------------------------------------------------------
/**
 * @author Jeff
 *
 * Copyright (c)
 */
package chinese.utility.test;

import java.util.Comparator;

import org.junit.Assert;
import org.junit.Test;

import chinese.utility.PinyinComparator;

public class PinyinComparatorTest {

    private Comparator<String> comparator = new PinyinComparator();

    /**
     * Common characters
     */
    @Test
    public void testCommon() {
        Assert.assertTrue(comparator.compare("孟", "宋") < 0);
    }

    /**
     * Strings of different length
     */
    @Test
    public void testDifferentLength() {
        Assert.assertTrue(comparator.compare("天气真好", "天气真好啊") < 0);
    }

    /**
     * Comparison with non-Chinese characters
     */
    @Test
    public void testNoneChinese() {
        Assert.assertTrue(comparator.compare("a", "阿") < 0);
        Assert.assertTrue(comparator.compare("1", "阿") < 0);
    }

    /**
     * Less common characters (怡)
     */
    @Test
    public void testNoneCommon() {
        Assert.assertTrue(comparator.compare("怡", "张") < 0);
    }

    /**
     * Homophones
     */
    @Test
    public void testSameSound() {
        Assert.assertTrue(comparator.compare("怕", "帕") == 0);
    }

    /**
     * Polyphonic characters (曾 [zēng, céng])
     */
    @Test
    public void testMultiSound() {
        Assert.assertTrue(comparator.compare("曾经", "曾迪") > 0);
    }
}
-------------------------------------------------------------------------

2011/9/5 Peter Neubauer <peter.neuba...@neotechnology.com>

> Yuanlong,
> can you provide Java code on how to sort Pinyin characters? In that case, I
> am sure there is a way to incorporate it into the Cypher sorting routines.
> It would be very helpful since we don't even know how to test Pinyin sorting
> for correctness :/
>
> Cheers,
>
> /peter neubauer
>
> GTalk: neubauer.peter
> Skype peter.neubauer
> Phone +46 704 106975
> LinkedIn http://www.linkedin.com/in/neubauer
> Twitter http://twitter.com/peterneubauer
>
> http://www.neo4j.org - Your high performance graph database.
> http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
>
> On Mon, Sep 5, 2011 at 4:59 AM, iamyuanlong <yuanlong1...@gmail.com> wrote:
>
> > Hi,
> >
> > Sorry to disturb you, and please excuse my bad English.
> >
> > I'd like to use the CypherParser of Neo4j.
> > When I query the users' info ordered by user.username desc,
> > the result I get differs a little from the result in SQL Server.
> > I hope the result can be sorted by Chinese Pinyin.
> >
> > E.g., I got:
> > 风过这头
> > 镇定的猎豹
> > 达小鱼儿
> > 财富分享
> > 蝶儿菲菲
> > 脚一滑
> > 股童天尊
> > 股票赢家888
> >
> > I hoped for:
> > 镇定的猎豹
> > 脚一滑
> > 股童天尊
> > 股票赢家888
> > 风过这头
> > 蝶儿菲菲
> > 达小鱼儿
> > 财富分享
> >
> > --
> > View this message in context:
> > http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-how-Neo4j-work-for-sorting-chinese-character-tp3309754p3309754.html
> > Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user