date:20070718

Increase the performance of Indexing in Lucene

2007-07-18 Thread miztaken

Hi, please help me. I have been trying Lucene for a month. My requirements are huge: I have to index and search terabytes of data. I have questions on three topics: 1. Problem in indexing. As I need to index terabytes of data, after googling and visiting different forums I deployed following fash

Re: StandardTokenizer is slowing down highlighting a lot

2007-07-18 Thread Michael Stoppelman

Might be nice to add a line of documentation to the highlighter on the possible performance hit if one uses StandardAnalyzer which probably is a common case. Thanks for the speedy response. -M On 7/18/07, Mark Miller <[EMAIL PROTECTED]> wrote: Unfortunately, StandardAnalyzer is slow. StandardA

Re: StandardTokenizer is slowing down highlighting a lot

2007-07-18 Thread Mark Miller

Unfortunately, StandardAnalyzer is slow. StandardAnalyzer is really limited by JavaCC speed. You cannot shave much more performance out of the grammar as it is already about as simple as it gets. You should first see if you can get away without it and use a different Analyzer, or if you can re-

StandardTokenizer is slowing down highlighting a lot

2007-07-18 Thread Michael Stoppelman

Hi all, I was tracking down slowness in the contrib highlighter code and it seems the seemingly simple tokenStream.next() is the culprit. I've seen multiple posts about this being a possible cause. Has anyone looked into how to speed up StandardTokenizer? For my documents it's taking about 70ms p

Re: MoreLikeThis

2007-07-18 Thread Akanksha Baid

Right, I was making a silly mistake there. I have it working now. Thanks for the reply. yu wrote: You can put lucene-queries-2.2.0.jar on your class path or your Eclipse project build path. That's all you need. Jay Akanksha Baid wrote: I am using Lucene 2.1.0 and want to use MoreLikeThis f

Re: MoreLikeThis

2007-07-18 Thread yu

You can put lucene-queries-2.2.0.jar on your class path or your Eclipse project build path. That's all you need. Jay Akanksha Baid wrote: I am using Lucene 2.1.0 and want to use MoreLikeThis for querying documents. I understand that the jar file for the same is in contrib. I have the contrib

MoreLikeThis

2007-07-18 Thread Akanksha Baid

I am using Lucene 2.1.0 and want to use MoreLikeThis for querying documents. I understand that the jar file for it is in contrib. I have the contrib folder extracted, but am not sure what to do from this point on. Which jar file am I looking for, and where should I put it? I am using Eclipse

TermEnum - previous() method ?

2007-07-18 Thread muraalee

Hi all, I searched this forum for anyone else needing a previous() method in TermEnum. I found only this link: http://www.nabble.com/How-to-navigate-through-indexed-terms-tf28148.html#a189225 Would it be possible to implement a previous() method? I know I am asking for a quick solution here

Dictionary Type Lookup

2007-07-18 Thread muraalee

Hi, I am trying to model a Dictionary Type Search in Lucene. My approach was this - Load the dictionary file ( words & their meanings ) and index each dictionary term and associated meaning as a Lucene Document. - Use IndexReader's term method to peek at the index and get the TermEnum. TermEnum'

Re: Lucene shows parts of search query as a HIT

2007-07-18 Thread Askar Zaidi

Hey guys, I just checked my Lucene results. It shows a document containing the word "change" as a hit when I am searching for "Chan". Is there a way to stop this and show only exact word matches? I started using Lucene yesterday, so I am fairly new! Thanks, AZ On 7/18/0

Re: lucene version?

2007-07-18 Thread Michael McCandless

I don't think this is stored in the index. I think the closest you can get is the "format" of the segments_N file which changes every time the index file format changes. That at least lets you narrow it down possibly to a single release if the file format is changing frequently (eg it has in the

Re: Lucene shows parts of search query as a HIT

2007-07-18 Thread Erick Erickson

Are you sure that the hit wasn't on "w" or "kim"? The default for searching is OR... I recommend that you get a copy of Luke (google lucene luke) which allows you to examine your index as well as see how queries parse using various analyzers. It's an invaluable tool... Best Erick On 7/18/07, As
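A minimal sketch of the point about OR being the default, in plain Java rather than the Lucene API. The class OrMatchSketch, its method, and the sample document are hypothetical and only illustrate the idea: with OR semantics a document is a hit if any one query term matches, so a document containing "channel" can match the query "W. Chan Kim" through "kim" even though "chan" never matches "channel" as a whole term.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not Lucene code.
final class OrMatchSketch {

    // Returns the query terms that occur as whole tokens in the document.
    // Under OR semantics the document is a hit when this set is non-empty.
    static Set<String> matchedTerms(List<String> queryTerms, List<String> docTokens) {
        Set<String> matched = new LinkedHashSet<>(queryTerms);
        matched.retainAll(new HashSet<>(docTokens));
        return matched;
    }

    public static void main(String[] args) {
        // Assumed tokens for the query "W. Chan Kim" after lowercasing.
        List<String> query = Arrays.asList("w", "chan", "kim");
        // Assumed tokens of a made-up document that contains "channel".
        List<String> doc = Arrays.asList("kim", "sold", "the", "channel");
        // Only "kim" matches, yet that alone makes the document a hit under OR.
        System.out.println(matchedTerms(query, doc)); // prints [kim]
    }
}
```

Checking which term actually produced the hit, as this sketch does by hand, is the kind of inspection Luke makes possible on a real index.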

lucene version?

2007-07-18 Thread Akanksha Baid

Is there a way to tell which version of Lucene was used to build an index? -Akanksha

Lucene shows parts of search query as a HIT

2007-07-18 Thread Askar Zaidi

Hey folks, I am a new Lucene user. I used the following after indexing: search(searcher, "W. Chan Kim"); Lucene showed me hits for documents where the word "channel" appeared. Notice that "Chan" is a part of "Channel". How do I stop this? I am keen to find the exact word. I used the following, b

Re: WildcardQuery and SpanQuery

2007-07-18 Thread Paul Elschot

On Wednesday 18 July 2007 12:30, Cedric Ho wrote: > Thanks for the quick response Paul =) > > However I am lost while looking at the surround package. That is not really surprising, the code is factored to the bone, and it is hardly documented. You could have a look at the test code to start. Al

Re: Does Index have a Tokenizer Built into it

2007-07-18 Thread John Paul Sondag

Is there a way to know how big to make the array beforehand (how many terms are in the topic total?). I'm worried about the efficiency of this, since I'd have to rebuild every document that is a "hit" on the fly to make a snippet for each "hit" on the page (say 10 a page). Now I have to wonder

Re: Query in lucene

2007-07-18 Thread Erick Erickson

When in doubt, WhitespaceAnalyzer is the most predictable. Note that it doesn't lower-case the tokens though. Depending upon your requirements, you can always pre-process your query and indexing streams and do your own lowercasing and/or character stripping. You can always create your own analyze
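As a rough sketch of the pre-processing idea, the plain-Java helper below splits on whitespace, lowercases each token, and strips non-alphanumeric characters. It does not use the Lucene API; the class name WhitespacePreprocessor and the choice of which characters to strip are assumptions made for illustration. The key point is that the same function would have to be applied to both the text being indexed and the query text so that the resulting tokens line up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: normalizes text before it is
// handed to a whitespace-only tokenizer.
final class WhitespacePreprocessor {

    // Splits on runs of whitespace, lowercases each piece, and removes every
    // character that is not a letter or digit. Empty results are dropped.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String cleaned = raw.toLowerCase(Locale.ROOT)
                                .replaceAll("[^\\p{L}\\p{N}]", "");
            if (!cleaned.isEmpty()) {
                tokens.add(cleaned);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("W. Chan Kim")); // prints [w, chan, kim]
    }
}
```

Stripping all punctuation is only one possible policy; text that relies on symbols would need a narrower character filter.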

Query in lucene

2007-07-18 Thread WATHELET Thomas

Which analyzer do I have to use to find text like this ''?

Re: WildcardQuery and SpanQuery

2007-07-18 Thread Mark Miller

You could give this a shot (From my Qsol query parser): package com.mhs.qsol.spans; /** * Copyright 2006 Mark Miller ([EMAIL PROTECTED]) * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. * You may obtain a copy

Re: WildcardQuery and SpanQuery

2007-07-18 Thread Cedric Ho

Thanks for the quick response Paul =) However I am lost while looking at the surround package. Are you suggesting I can solve my problem at hand using the surround package? On 7/18/07, Paul Elschot <[EMAIL PROTECTED]> wrote: On Wednesday 18 July 2007 05:58, Cedric Ho wrote: > Hi everybody, > >

20 matches
