Hi Stephen,
We precompute a variant of P(z,d) during indexing, and do the first 3
steps. The resulting documents are ordered by payload score, which is
basically z in our case. We don't currently care about P(t,z) but it
seems like a good thing to have for disambiguation purposes.
So anyway, I ha
Hi,
Be sure to use the same Solr version as your Lucene version (if >= 3.1); here
is example code from a test case:
WordDelimiterFilterFactory fact = new WordDelimiterFilterFactory();
// we don't need this if we don't load external exclusion files:
// ResourceLoader loader = new Solr
How do you use the WordDelimiterFilterFactory()? I tried the following code:
TokenStream out = new LowerCaseTokenizer(reader);
WordDelimiterFilterFactory wdf = new WordDelimiterFilterFactory();
out = wdf.create(out);
...
But I am getting a runtime error:
Exception in thread "main" java.lang.Ab
Hi,
There is a WordDelimiterFilter in Solr that was also ported to the Lucene
Analysis module in Lucene trunk (4.0). In 3.x you can still add solr.jar to
your classpath and use WordDelimiterFilterFactory to produce one
(WordDelimiterFilter itself is package-private).
-
Uwe Schindler
H.-H.-Meier-Allee 63
List,
I have written my own CustomAnalyzer, as follows:
public TokenStream tokenStream(String fieldName, Reader reader) {
// TODO: add calls to RemovePunctuation and SplitIdentifiers here
// First, convert to lower case
TokenStream
Again, there is nothing wrong with the quotes: it's instead how you are
configuring the analysis for this field.
If you put stuff in quotes and your analyzer breaks it into multiple
tokens, then queryparser forms a phrase query. You must index
positions to support phrase queries.
Normally DOCS_ONLY
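To see why positions matter here, this is a minimal plain-Java sketch (no Lucene on the classpath; the tokenization and phrase check are simplified stand-ins for what the analyzer and phrase query actually do): a quoted string is broken into tokens, each token gets a position, and a phrase match means the positions are consecutive. With DOCS_ONLY no positions are indexed, so this check cannot be made.

```java
import java.util.*;

public class PhraseDemo {
    public static void main(String[] args) {
        // Analyzer-style tokenization: each token carries a position
        String doc = "the quick john doe story";
        String[] tokens = doc.toLowerCase().split("\\s+");
        Map<String, List<Integer>> positions = new HashMap<>();
        for (int pos = 0; pos < tokens.length; pos++) {
            positions.computeIfAbsent(tokens[pos], k -> new ArrayList<>()).add(pos);
        }
        // A phrase query "john doe" matches only if the two terms appear
        // at consecutive positions; without indexed positions this is impossible.
        boolean phraseMatch = false;
        for (int p : positions.getOrDefault("john", Collections.<Integer>emptyList())) {
            if (positions.getOrDefault("doe", Collections.<Integer>emptyList()).contains(p + 1)) {
                phraseMatch = true;
            }
        }
        System.out.println(phraseMatch);
    }
}
```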
Still no difference; it may be because of some other hidden bug. Anyway,
adding freqs and positions will be a no-no because of space :) so
bye bye quotes.
Thank you
Sujit,
Thanks for your reply, and the link to your blog post, which was
helpful and got me thinking about Payloads.
I still have one more question. I need to be able to compute the
Sim(query q, doc d) similarity function, which is defined below:
Sim (query q, doc d) = sum_{t in q} sum_{z} P(t, z
If you use StandardAnalyzer it will break "john doe" into 2 tokens and
form a phrase query.
If you want to do phrase queries, don't set the index options to
DOCS_ONLY; otherwise they won't work.
if what you want is for "john doe" to only be 1 term without
positions, then use KeywordAnalyzer, and DO
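The difference between the two analyzers can be sketched in plain Java (no Lucene here; this only mimics the token output, not the real analyzer chain): a KeywordAnalyzer-style analysis keeps the whole field value as one token, while a StandardAnalyzer-style analysis splits it into separate tokens, which is what triggers the phrase query.

```java
import java.util.*;

public class TokenizeDemo {
    public static void main(String[] args) {
        String author = "John Doe";
        // KeywordAnalyzer-style: the whole value becomes a single token
        List<String> keywordStyle = Collections.singletonList(author.toLowerCase());
        // StandardAnalyzer-style (roughly): split into separate tokens
        List<String> standardStyle = Arrays.asList(author.toLowerCase().split("\\s+"));
        System.out.println(keywordStyle.size() + " " + standardStyle.size());
    }
}
```

With one token there is nothing for the query parser to turn into a phrase, so DOCS_ONLY is safe; with two tokens a quoted query needs positions.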
field = new Field("author",(author).toLowerCase(),Field.Store.NO,
Field.Index.NOT_ANALYZED);
field.setIndexOptions(FieldInfo.IndexOptions.DOCS_ONLY);
field.setOmitNorms(true);
When, in the above configuration, I switched from NOT_ANALYZED to ANALYZED,
luke's results for autho
Close the first index writer?
http://lmgtfy.com/?q=lucene+Cannot+overwrite+%22_0.fdt%22+file
If you can't find the answer and need to post again, include as a
minimum details of the OS and lucene version that you are using.
--
Ian.
On Tue, Nov 29, 2011 at 12:15 PM, Rohan A Ambasta
wrote:
>
>
Hi,
I get the error - "Cannot Overwrite 0.fdt" when I start indexing.
Detail TestCase -
1) Performing indexing for the first time works fine.
2) Then I do a search and get the search results.
3) After the search, if I start indexing again I get the error - "Cannot
overwrite 0.fdt"
Has anybody faced
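Ian's suggestion above ("Close the first index writer?") boils down to a close-before-reopen discipline. A minimal plain-Java sketch of that pattern (no Lucene involved; the temp file merely stands in for an index file such as _0.fdt):

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;

public class ReopenDemo {
    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("seg", ".fdt");
        // First indexing pass: write and, crucially, close the writer.
        BufferedWriter w1 = Files.newBufferedWriter(f, StandardCharsets.UTF_8);
        w1.write("first pass");
        w1.close(); // a writer left open is the usual cause of "cannot overwrite" errors
        // The second pass can now reopen the same file cleanly.
        BufferedWriter w2 = Files.newBufferedWriter(f, StandardCharsets.UTF_8);
        w2.write("second pass");
        w2.close();
        System.out.println(new String(Files.readAllBytes(f), StandardCharsets.UTF_8));
        Files.delete(f);
    }
}
```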
A google search of "lucene stemming wildcards" finds some hits
implying these don't work well together.
http://lucene.472066.n3.nabble.com/Conflicts-with-Stemming-and-Wildcard-Prefix-Queries-td540479.html
may be a solution.
--
Ian.
On Tue, Nov 29, 2011 at 10:39 AM, SBS wrote:
> This is very hard to follow. I for one don't recall what you
> described or what you are looking for.
Sorry about that, I am using the web interface where the context of my post
is visible to all.
To sum up, my original post was:
> It seems that when I use a PorterStemFilter in my custom an
This is very hard to follow. I for one don't recall what you
described or what you are looking for.
Have you worked through
http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F?
--
Ian.
On Tue, Nov 29, 2011 at 7:25 AM, SBS wrote:
> I am applying the P