We are trying Solr 1.3 with the Paoding Chinese analyzer, and after reindexing, the index size went from 1.5 GB to 2.7 GB.
Is that expected behavior? Is there any switch or trick to keep the index from roughly doubling in size?

Koji Sekiguchi-2 wrote:
>
> CharFilter can normalize (convert) traditional chinese to simplified
> chinese or vice versa,
> if you define mapping.txt. Here is the sample of Chinese character
> normalization:
>
> https://issues.apache.org/jira/secure/attachment/12392639/character-normalization.JPG
>
> See SOLR-822 for the detail:
>
> https://issues.apache.org/jira/browse/SOLR-822
>
> Koji
>
>
> revathy arun wrote:
>> Hi,
>>
>> When I index chinese content using chinese tokenizer and analyzer in solr
>> 1.3, some of the chinese text files are getting indexed but others are
>> not.
>>
>> Since chinese has got many different language subtypes as in standard
>> chinese, simplified chinese etc, which of these does the chinese tokenizer
>> support and is there any method to find the type of chinese language
>> from
>> the file?
>>
>> Rgds
>>
>>
>
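
For anyone reading the thread later, here is a rough sketch of the kind of character mapping Koji describes. It is plain Python, not Solr code, and it only illustrates the idea of rewriting characters before tokenization. The rule syntax ("source" => "target"), the three sample entries, and the names parse_mapping and normalize are my own assumptions for illustration; see SOLR-822 for the real CharFilter and its mapping file.

```python
import re

# Hypothetical mapping rules, written in a `"source" => "target"` style.
# The three entries are illustrative only, not a complete table.
MAPPING_TEXT = '''
# traditional => simplified
"語" => "语"
"說" => "说"
"漢" => "汉"
'''

RULE = re.compile(r'^"(.+)"\s*=>\s*"(.*)"$')


def parse_mapping(text):
    """Parse mapping rules into a dict, skipping blank lines and # comments."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = RULE.match(line)
        if match is None:
            raise ValueError("bad rule: " + line)
        rules[match.group(1)] = match.group(2)
    return rules


def normalize(text, rules):
    """Rewrite text left to right, preferring the longest matching source."""
    if not rules:
        return text
    max_len = max(len(source) for source in rules)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            # No rule matched at this position: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


RULES = parse_mapping(MAPPING_TEXT)
```

Calling normalize("漢語", RULES) rewrites both characters to their simplified forms, while text with no matching rule passes through unchanged. Note that this kind of normalization addresses the traditional/simplified question in the quoted thread; it is not a claim about what causes the index size increase.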