Re: [Neo4j] how Neo4j work for sorting chinese character?
Thanks Yuanlong, we will look at it as soon as we get some time!

Cheers,

/peter neubauer

On Thu, Sep 8, 2011 at 2:38 AM, iamyuanlong wrote:
> I added it here: https://github.com/neo4j/community/issues/14
Re: [Neo4j] how Neo4j work for sorting chinese character?
I added it here: https://github.com/neo4j/community/issues/14
Re: [Neo4j] how Neo4j work for sorting chinese character?
Yuan,
could you make a test for this, and issue a pull request on GitHub? This should absolutely be part of the main code so you don't have to maintain your own fork.

Cheers,

/peter neubauer

On Wed, Sep 7, 2011 at 3:34 AM, iamyuanlong wrote:
> Aha! I changed the Cypher Scala source code myself. Thank you for your help.
Re: [Neo4j] how Neo4j work for sorting chinese character?
Aha! I changed the Cypher Scala source code myself. Thank you for your help.
Re: [Neo4j] how Neo4j work for sorting chinese character?
Hi,

In Java we can sort by Pinyin like this (Sun provides a comparator):

    public int compare(String o1, String o2) {
        return Collator.getInstance(Locale.CHINESE).compare(o1, o2);
    }

But it has some flaws. There are many homophones in Chinese, but Sun's comparator does not treat them as equal:

    Assert.assertTrue(comparator.compare("怕", "帕") != 0); // 怕 pà, 帕 pà

And uncommon Chinese characters, such as '怡', are never sorted usefully by this comparator:

    Assert.assertTrue(comparator.compare("怡", "张") > 0); // 怡 yí, 张 zhāng

Luckily, there is an open source project on SourceForge: http://pinyin4j.sourceforge.net/
With it we can convert Chinese to Pinyin, and then sorting is easy.

I can provide some Java code, part of which I did not write myself.

---
/**
 * @author Jeff
 *
 * Copyright (c)
 */
package chinese.utility;

import java.util.Comparator;

import net.sourceforge.pinyin4j.PinyinHelper;

public class PinyinComparator implements Comparator<String> {

    public int compare(String o1, String o2) {
        for (int i = 0; i < o1.length() && i < o2.length(); i++) {
            // codePointAt (not charAt) so that supplementary characters are detected
            int codePoint1 = o1.codePointAt(i);
            int codePoint2 = o2.codePointAt(i);

            if (Character.isSupplementaryCodePoint(codePoint1)
                    || Character.isSupplementaryCodePoint(codePoint2)) {
                // Skip the second half of the surrogate pair
                i++;
            }

            if (codePoint1 != codePoint2) {
                if (Character.isSupplementaryCodePoint(codePoint1)
                        || Character.isSupplementaryCodePoint(codePoint2)) {
                    return codePoint1 - codePoint2;
                }

                String pinyin1 = pinyin((char) codePoint1);
                String pinyin2 = pinyin((char) codePoint2);

                if (pinyin1 != null && pinyin2 != null) { // both are Chinese characters
                    if (!pinyin1.equals(pinyin2)) {
                        return pinyin1.compareTo(pinyin2);
                    }
                    // Homophones: treat as equal and continue with the next character
                } else {
                    return codePoint1 - codePoint2;
                }
            }
        }
        return o1.length() - o2.length();
    }

    /**
     * For a polyphonic character, take the first reading.
     * Returns null if the character is not a Chinese character.
     */
    private String pinyin(char c) {
        String[] pinyins = PinyinHelper.toHanyuPinyinStringArray(c);
        if (pinyins == null) {
            return null;
        }
        return pinyins[0];
    }
}
---

The JUnit 4 test:
---
/**
 * @author Jeff
 *
 * Copyright (c)
 */
package chinese.utility.test;

import java.util.Comparator;

import org.junit.Assert;
import org.junit.Test;

import chinese.utility.PinyinComparator;

public class PinyinComparatorTest {

    private Comparator<String> comparator = new PinyinComparator();

    /**
     * Common characters
     */
    @Test
    public void testCommon() {
        Assert.assertTrue(comparator.compare("孟", "宋") < 0);
    }

    /**
     * Strings of different length
     */
    @Test
    public void testDifferentLength() {
        Assert.assertTrue(comparator.compare("天气真好", "天气真好啊") < 0);
    }

    /**
     * Comparison with non-Chinese characters
     */
    @Test
    public void testNoneChinese() {
        Assert.assertTrue(comparator.compare("a", "阿") < 0);
        Assert.assertTrue(comparator.compare("1", "阿") < 0);
    }

    /**
     * Uncommon characters (怡)
     */
    @Test
    public void testNoneCommon() {
        Assert.assertTrue(comparator.compare("怡", "张") < 0);
    }

    /**
     * Homophones
     */
    @Test
    public void testSameSound() {
        Assert.assertTrue(comparator.compare("怕", "帕") == 0);
    }

    /**
     * Polyphonic characters (曾 [zēng, céng])
     */
    @Test
    public void testMultiSound() {
        Assert.assertTrue(comparator.compare("曾经", "曾迪") > 0);
    }
}
---

2011/9/5 Peter Neubauer
> Yuanlong,
> can you provide Java code on how to sort Pinyin characters? In that case, I
> am sure there is a way to incorporate it into the Cypher sorting routines.
> It would be very helpful since we don't even know how to test Pinyin
> sorting for correctness :/
>
> Cheers,
>
> /peter neubauer
>
> On Mon, Sep 5, 2011 at 4:59 AM, iamyuanlong wrote:
>
> > Hi,
> >
> > Sorry to disturb you, and please excuse my bad English.
> >
> > I'd like to use the CypherParser of Neo4j.
> > When I query the user's info ordered by
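For readers who want to try the character-by-character scheme from this message without installing pinyin4j, here is a minimal sketch. Everything in it is an assumption made for illustration: the class `TablePinyinSketch`, the hand-written `PINYIN` table (which covers only the characters that decide the order of the sample names from the original question), and the helper `sortDescending`. It handles BMP characters only and is not the code that went into Cypher.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class TablePinyinSketch {

    // Hypothetical, hand-written lookup (reading plus tone digit). A real
    // solution would use a full dictionary such as pinyin4j instead.
    static final Map<Character, String> PINYIN = Map.of(
            '\u9547', "zhen4",  // 镇
            '\u811a', "jiao3",  // 脚
            '\u80a1', "gu3",    // 股
            '\u7ae5', "tong2",  // 童
            '\u7968', "piao4",  // 票
            '\u98ce', "feng1",  // 风
            '\u8776', "die2",   // 蝶
            '\u8fbe', "da2",    // 达
            '\u8d22', "cai2");  // 财

    // Compares two strings character by character using the table above.
    static final Comparator<String> BY_PINYIN = (a, b) -> {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            char c1 = a.charAt(i);
            char c2 = b.charAt(i);
            if (c1 == c2) {
                continue;
            }
            String p1 = PINYIN.get(c1);
            String p2 = PINYIN.get(c2);
            if (p1 == null || p2 == null) {
                // At least one character is not in the table:
                // fall back to plain code unit order.
                return Character.compare(c1, c2);
            }
            if (!p1.equals(p2)) {
                return p1.compareTo(p2);
            }
            // Homophones: treat as equal and look at the next character.
        }
        return Integer.compare(a.length(), b.length());
    };

    // Returns a copy of the input sorted in descending Pinyin order.
    static List<String> sortDescending(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(BY_PINYIN.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "风过这头", "镇定的猎豹", "达小鱼儿", "财富分享",
                "蝶儿菲菲", "脚一滑", "股童天尊", "股票赢家888");
        sortDescending(names).forEach(System.out::println);
    }
}
```

Because the fallback mixes Pinyin order with code unit order, a comparator like this is not guaranteed to be transitive for arbitrary input, so it is best treated as a demonstration rather than something to sort large mixed-script lists with.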
Re: [Neo4j] how Neo4j work for sorting chinese character?
Yuanlong,
can you provide Java code on how to sort Pinyin characters? In that case, I am sure there is a way to incorporate it into the Cypher sorting routines. It would be very helpful since we don't even know how to test Pinyin sorting for correctness :/

Cheers,

/peter neubauer

On Mon, Sep 5, 2011 at 4:59 AM, iamyuanlong wrote:
> Hi,
>
> Sorry to disturb you, and please excuse my bad English.
>
> I'd like to use the CypherParser of Neo4j.
> When I query the user's info ordered by user.username desc,
> I get a result that differs a little from the result in SQL Server.
> I hope that the result can be sorted by Chinese Pinyin.
>
> For example, I got:
> 风过这头
> 镇定的猎豹
> 达小鱼儿
> 财富分享
> 蝶儿菲菲
> 脚一滑
> 股童天尊
> 股票赢家888
>
> I hoped for:
> 镇定的猎豹
> 脚一滑
> 股童天尊
> 股票赢家888
> 风过这头
> 蝶儿菲菲
> 达小鱼儿
> 财富分享
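A possible explanation for the first list in the question is that the ordering falls back to plain string comparison, which orders Java strings by UTF-16 code unit value rather than by pronunciation. That is an assumption, not something confirmed in this thread. The sketch below (the class `CodePointOrderDemo` and its helper `sortDescending` are hypothetical names) sorts the same names that way and prints the first code point of each so the ordering can be inspected.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CodePointOrderDemo {

    // Sorts a copy of the input in descending natural String order,
    // i.e. by UTF-16 code unit value, with no language-aware collation.
    static List<String> sortDescending(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.reverseOrder());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "镇定的猎豹", "脚一滑", "股童天尊", "股票赢家888",
                "风过这头", "蝶儿菲菲", "达小鱼儿", "财富分享");
        for (String name : sortDescending(names)) {
            // Show the first character's code point next to each name.
            System.out.printf("U+%04X %s%n", name.codePointAt(0), name);
        }
    }
}
```

Sorted this way, 风 (U+98CE) comes before 镇 (U+9547), which is consistent with the order reported in the question and unrelated to Pinyin.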