Re: Need additional info for Field (hoping friends who can read Chinese can give me some ideas)

2008-05-04 Thread kai.hu
You only need to index and tokenize "下午去开会" ("going to a meeting in the afternoon") and store the corresponding time alongside it. For example: document.add(new Field("sub", "下午去开会", Field.Store.YES, Field.Index.TOKENIZED)); document.add(new Field("time", "01:02:02", Field.Store.YES, Field.Index.UN_TOKENIZED)); Each document returned by a search will then contain both Fields.
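As a rough illustration of what the two fields in the message above do, here is a minimal plain-Java sketch. It is not the Lucene API: the class `SubtitleIndexSketch`, its methods, and the one-token-per-character tokenization are all assumptions made for illustration. The subtitle text is tokenized into an inverted index, while the time is only stored and comes back with each hit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a search index; this is NOT the Lucene API.
final class SubtitleIndexSketch {
    // Stored fields of each document: field name -> stored value.
    private final List<Map<String, String>> docs = new ArrayList<>();
    // Inverted index over the tokenized "sub" field: token -> document ids.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // "sub" is stored and tokenized; "time" is stored verbatim and never tokenized.
    void add(String sub, String time) {
        int docId = docs.size();
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("sub", sub);
        stored.put("time", time);
        docs.add(stored);
        // Assumed tokenization: one token per character.
        for (int i = 0; i < sub.length(); i++) {
            String token = String.valueOf(sub.charAt(i));
            List<Integer> ids = postings.computeIfAbsent(token, k -> new ArrayList<>());
            if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) {
                ids.add(docId);
            }
        }
    }

    // Returns the stored fields of every document whose "sub" contains all query characters.
    List<Map<String, String>> search(String query) {
        List<Map<String, String>> hits = new ArrayList<>();
        if (query.isEmpty()) {
            return hits;
        }
        for (int docId = 0; docId < docs.size(); docId++) {
            boolean matchesAll = true;
            for (int i = 0; i < query.length(); i++) {
                List<Integer> ids = postings.get(String.valueOf(query.charAt(i)));
                if (ids == null || !ids.contains(docId)) {
                    matchesAll = false;
                    break;
                }
            }
            if (matchesAll) {
                hits.add(docs.get(docId));
            }
        }
        return hits;
    }
}
```

For instance, after `add("下午去开会", "01:02:02")`, calling `search("开会")` returns one document whose `time` entry is `01:02:02`, which mirrors the point that a hit carries both fields.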

Re: Need additional info for Field (hoping friends who can read Chinese can give me some ideas)

2008-05-04 Thread kai.hu
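Search Google for Chinese word segmentation (中文分词). Besides Che Dong's (车东) package there should be plenty of others. If you find one that segments better or runs more efficiently, please recommend it too. -- From: kai.hu [EMAIL PROTECTED] Sent: Sunday, May 04, 2008 4:20 PM To: java-user@lucene.apache.org Subject: Re: Need additional info for Field (hoping friends who can read Chinese can give me some ideas)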

Re: Need additional info for Field (hoping friends who can read Chinese can give me some ideas)

2008-05-04 Thread 王建新
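OK, thanks! - Original Message - From: kai.hu [EMAIL PROTECTED] To: java-user@lucene.apache.org Sent: Sunday, May 04, 2008 4:27 PM Subject: Re: Need additional info for Field (hoping friends who can read Chinese can give me some ideas) Search Google for Chinese word segmentation (中文分词). Besides Che Dong's (车东) package there should be plenty of others. If you find one that segments better or runs more efficiently, please recommend it too.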

Re: Lucene Indexing structure

2008-05-04 Thread Vaijanath N. Rao
Hi Chris, Sorry for the cross-posting and also for not making the problem clear. Let me try to explain the problem at hand. I am trying to write a CBIR (Content Based Image Retrieval) framework using Lucene. Each document has entities such as title, description, author and so on. I

Re: Lucene Indexing structure

2008-05-04 Thread Grant Ingersoll
Would a Function Query (ValueSourceQuery, see the org.apache.lucene.search.function package) work in this case? -Grant On May 4, 2008, at 9:35 AM, Vaijanath N. Rao wrote: Hi Chris, Sorry for the cross-posting and also for not making clear the problem. Let me try to explain the problem at

Re: Lucene's Mean Average Precision

2008-05-04 Thread Grant Ingersoll
How did you arrive at that MAP? What analyzers, etc.? So much of search depends on your choices during indexing and querying, etc. There is some work by the IBM Haifa people up on the Wiki, so that would be one place to check. Another question is what your end goal is in doing a

Re: Lucene's Mean Average Precision

2008-05-04 Thread DanaWhite
I arrived at this MAP by modifying IndexFiles to use a StopAnalyzer and work in a way that was acceptable for TREC files. SearchFiles was modified to use a StopAnalyzer and output data in a trec_eval suitable format. trec_eval reports about 11% at this setting. I am not competing in TREC I
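For readers unfamiliar with the metric, the following is a minimal sketch of how mean average precision is commonly computed: average precision per query (precision at each rank where a relevant document appears, summed and divided by the number of relevant documents), then the mean over queries. The class and method names are hypothetical, and this is not trec_eval's code; it assumes each ranked list contains no duplicate document ids.

```java
import java.util.List;
import java.util.Set;

// Illustrative MAP computation; not taken from trec_eval or Lucene.
final class MapSketch {
    // Average precision for one query, given ranked result ids and the relevant ids.
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int relevantSeen = 0;
        double precisionSum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                relevantSeen++;
                // Precision at this rank (ranks are 1-based).
                precisionSum += (double) relevantSeen / (i + 1);
            }
        }
        // Relevant documents that were never retrieved contribute zero.
        return precisionSum / relevant.size();
    }

    // Mean of the per-query average precisions; runs.get(q) pairs with qrels.get(q).
    static double meanAveragePrecision(List<List<String>> runs, List<Set<String>> qrels) {
        if (runs.size() != qrels.size()) {
            throw new IllegalArgumentException("runs and qrels must have the same length");
        }
        if (runs.isEmpty()) {
            return 0.0;
        }
        double total = 0.0;
        for (int q = 0; q < runs.size(); q++) {
            total += averagePrecision(runs.get(q), qrels.get(q));
        }
        return total / runs.size();
    }
}
```

With a ranking d1, d2, d3, d4 and relevant set {d1, d3}, the average precision is (1/1 + 2/3) / 2, roughly 0.83; a figure of about 11% therefore means relevant documents tend to appear late in the rankings or not at all.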

Re: Lucene's Mean Average Precision

2008-05-04 Thread Grant Ingersoll
On May 4, 2008, at 7:28 PM, DanaWhite wrote: I arrived at this MAP by modifying IndexFiles to use a StopAnalyzer and work in a way that was acceptable for TREC files. SearchFiles was modified to use a StopAnalyzer and output data in a trec_eval suitable format. trec_eval reports

Re: Lucene's Mean Average Precision

2008-05-04 Thread Otis Gospodnetic
Hi Dana, There is no "out of the box" for Lucene, really, especially not when it comes to Analyzers. For example, look at this post and then the comments: http://blogs.sun.com/searchguy/entry/minion_and_lucene_case_sensitivity Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

RE: lucene farsi problem

2008-05-04 Thread Steven A Rowe
Hi Esra, I have attached a patch to LUCENE-1279 containing a new class: CollatingRangeQuery. The patch also contains a test class: TestCollatingRangeQuery. One of the test methods checks for the Farsi range you were having trouble with. It should be mentioned that according to
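The sketch below is not the code from the LUCENE-1279 patch. It only illustrates, in plain Java with `java.text.Collator`, the general idea the class name suggests: testing range membership by a locale's collation order rather than by raw UTF-16 code unit order. The class `CollatedRangeSketch` and its methods are hypothetical, and the example uses English strings because their ordering is easy to verify by eye.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative only; not the CollatingRangeQuery implementation.
final class CollatedRangeSketch {
    // Inclusive range check using the collator's ordering.
    static boolean inCollatedRange(Collator collator, String lower, String upper, String term) {
        return collator.compare(lower, term) <= 0 && collator.compare(term, upper) <= 0;
    }

    // Inclusive range check using String.compareTo, i.e. UTF-16 code unit order.
    static boolean inCodeUnitRange(String lower, String upper, String term) {
        return lower.compareTo(term) <= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        Collator english = Collator.getInstance(Locale.ENGLISH);
        // 'B' sorts before 'a' in code unit order, but between "apple" and "cherry" when collated.
        System.out.println(inCollatedRange(english, "apple", "cherry", "Banana"));
        System.out.println(inCodeUnitRange("apple", "cherry", "Banana"));
    }
}
```

The two checks disagree whenever a language's alphabetical order differs from the numeric order of its characters, which is the kind of mismatch a collation-aware range query is meant to address.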