Hello Nik. Thanks for your advice. I tried what you suggested, but I got the following error:

"error": "IndexCreationException[[search] failed to create index]; nested: CreationException[Guice creation errors:\n\n1) Could not find a suitable constructor in org.apache.lucene.analysis.th.ThaiWordFilterFactory. Classes must have either one (and only one) constructor annotated with @Inject or a zero-argument constructor that is not private.\n at org.apache.lucene.analysis.th.ThaiWordFilterFactory.class(Unknown Source)\n at org.elasticsearch.index.analysis.TokenFilterFactoryFactory.create(Unknown Source)\n at org.elasticsearch.common.inject.assistedinject.FactoryProvider2.initialize(Unknown Source)\n at _unknown_\n\n1 error]; ",

In my opinion, this error is raised because ThaiWordFilterFactory does not have a zero-argument constructor. In fact, ThaiWordFilterFactory has only the following constructor:

    /** Creates a new ThaiWordFilterFactory */
    public ThaiWordFilterFactory(Map<String,String> args) {
      super(args);
      assureMatchVersion();
      if (!args.isEmpty()) {
        throw new IllegalArgumentException("Unknown parameters: " + args);
      }
    }

If you don't mind, I have one more question. Can I define a constructor argument in the settings JSON above?

On Friday, February 7, 2014 at 11:17:59 PM UTC+9, Nikolas Everett wrote:
>
> If you don't like the language analyzer you have to rebuild it as a custom
> analyzer then add what you need to it.
>
> {
>   "analyzer": {
>     "thai_with_ngram": {
>       "type": "custom",
>       "tokenizer": "standard",
>       "filters": ["standard", "lowercase", "thai", "thai_stop", "ngram"]
>     }
>   },
>   "filter": {
>     "thai": {
>       "type": "org.apache.lucene.analysis.th.ThaiWordFilterFactory"
>     },
>     "thai_stop": {
>       "type": "stop",
>       "stopwords_path": "org/apache/lucene/analysis/th/stopwords.txt"
>     },
>     "ngram": { your ngram configuration here }
>   }
> }
>
> Builds it with your ngram configuration. I think. I'm taking quite a few
> educated guesses here so I expect you to have to fiddle with it to get it
> right.
>
> How I did this:
> 1. Open the class called ThaiAnalyzer in the Lucene version Elasticsearch
> is using and find the method called createComponents. For me this is
> simple because I have Elasticsearch open in Eclipse.
> 2. That method defines the tokenizer (standard) and some filters
> (standard, lowercase, ThaiWordFilter, and stop). You have to be able to
> translate the class names to Elasticsearch's easier names to get this to
> work properly.
> 3. Now build it as a custom analyzer with your extra filter in there. That
> is "thai_with_ngram" above.
> 4. Next you'll need to define all the filters that don't exist by default
> in Elasticsearch. In this case that is thai, thai_stop, and your ngram
> filter. In order:
> 5. The thai filter doesn't have an easy Elasticsearch mapping so you have
> to tell Elasticsearch the class name to load. That class doesn't take any
> configuration so we're done.
> 6. The thai_stop filter is just a regular stop word filter with Thai stop
> words. But Elasticsearch doesn't have an easy name to reference the Thai
> stop words file. That isn't too bad, as you can load the stopwords file
> from the classpath. It lives in Lucene at the path I added above.
> 7. The ngram filter is yours to build but it is well documented.
>
> That took longer than I expected but it was worth the exercise so I'll
> remember how to do it again when I need it. For reference, I do it for
> English which has more filters but they all have easy names.
>
> Nik
>
>
> On Fri, Feb 7, 2014 at 12:59 AM, Min Cha <mins...@gmail.com> wrote:
>
>> Hi folks.
>>
>> I would like to develop a search system for the Thai language.
>> First of all, I found the Thai analyzer and it seemed good.
>>
>> However, it doesn't meet all of my requirements, so I decided to
>> extend it.
>> For example, I would like to add an nGram token filter to the Thai
>> analyzer without changing the analyzer itself.
>>
>> How can I do this?
>> Please give me some advice.
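For anyone following the thread, here is a minimal sketch of the condition the Guice message complains about. It does not use Lucene, Elasticsearch, or Guice at all: MapOnlyFactory is a hypothetical stand-in that, like the constructor quoted above, can only be built from a Map<String,String>, and the two helper methods are made up for illustration. It only checks with plain reflection whether a non-private zero-argument constructor exists, which is one of the two options the error text names. How Guice itself performs the lookup is not shown here.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

public class ConstructorCheck {

    /** Hypothetical stand-in: its only constructor takes a Map, as in the quoted factory. */
    public static class MapOnlyFactory {
        private final Map<String, String> args;

        public MapOnlyFactory(Map<String, String> args) {
            this.args = args;
        }

        public Map<String, String> getArgs() {
            return args;
        }
    }

    /** True if the class declares a zero-argument constructor that is not private. */
    public static boolean hasUsableZeroArgConstructor(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            return !Modifier.isPrivate(ctor.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** True if the class declares a constructor taking a single Map argument. */
    public static boolean hasMapConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor(Map.class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // No zero-argument constructor, so a caller that needs one cannot build it.
        System.out.println(hasUsableZeroArgConstructor(MapOnlyFactory.class));
        // The Map constructor is present and works when the caller supplies the Map.
        System.out.println(hasMapConstructor(MapOnlyFactory.class));
        MapOnlyFactory factory = MapOnlyFactory.class
                .getDeclaredConstructor(Map.class)
                .newInstance(new HashMap<String, String>());
        System.out.println(factory.getArgs().isEmpty());
    }
}
```

This only shows why a class shaped like that fails the zero-argument requirement. Whether Elasticsearch can be told to pass a Map to such a class from the settings JSON is a separate question that this sketch does not answer.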
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/fc05b477-2673-4d41-b611-96874005e379%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.