Re: arguments in favour of lucene over commercial competition

2010-06-25 Thread jm
I am pretty sure at least some numbers already exist, because I have seen claims like '3.0 is 3 times faster than 2.4 in benchmark x' mentioned several times. The only thing is that the numbers are probably not consolidated in one place. On Fri, Jun 25, 2010 at 12:27 AM, Itamar Syn-

searching for wildcard as valid character

2010-06-25 Thread frueskens
Dear all, I have been trying to solve the following problem, without success so far. We need to search a field 'name' for content that contains the wildcard symbol somewhere in the string, e.g. the indexed string "1234*abc". The query should ignore all values that do not contain this symbol. A

RE: searching for wildcard as valid character

2010-06-25 Thread Uwe Schindler
Maybe you simply don't use QueryParser for such types of queries and instead instantiate TermQuery, BooleanQuery, WildcardQuery, PrefixQuery by hand. Then you don't need to take care of syntax; you create unambiguous objects. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...

Re: searching for wildcard as valid character

2010-06-25 Thread tarun sapra
TermQuery should solve your problem as it would consider "1234*abc" as one single term. Regards Tarun Sapra On Fri, Jun 25, 2010 at 4:13 PM, frueskens wrote: > > Dear all, > > I have to solve the following problem but without success yet. > > We need to search for a content in a field 'name' tha
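
To illustrate the distinction the two replies above rely on, here is a minimal plain-Java model (not Lucene code; the class and method names are hypothetical). An exact-term comparison treats '*' as an ordinary character, while a wildcard-style match treats it as an operator. In Lucene itself the by-hand construction would be new TermQuery(new Term("name", "1234*abc")), which can only match if the field was indexed so that the whole string is a single term, for example not split by an analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; it models the idea only and uses no Lucene API.
class ExactTermSketch {

    // Exact lookup: every character of the wanted term, including '*', is literal.
    static List<String> exactMatches(List<String> terms, String wanted) {
        List<String> hits = new ArrayList<String>();
        for (String term : terms) {
            if (term.equals(wanted)) {
                hits.add(term);
            }
        }
        return hits;
    }

    // Wildcard lookup: '*' matches any run of characters, '?' exactly one.
    static boolean globMatches(String pattern, String text) {
        return globMatches(pattern, 0, text, 0);
    }

    private static boolean globMatches(String p, int pi, String s, int si) {
        if (pi == p.length()) {
            return si == s.length();
        }
        char c = p.charAt(pi);
        if (c == '*') {
            // Let '*' consume zero or more characters of the text.
            for (int k = si; k <= s.length(); k++) {
                if (globMatches(p, pi + 1, s, k)) {
                    return true;
                }
            }
            return false;
        }
        if (si < s.length() && (c == '?' || c == s.charAt(si))) {
            return globMatches(p, pi + 1, s, si + 1);
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("1234*abc", "1234xyzabc", "1234abc", "9999");
        // Exact comparison finds only the term that really contains '*'.
        System.out.println(exactMatches(terms, "1234*abc"));
        // The same string read as a wildcard pattern matches several terms.
        for (String term : terms) {
            System.out.println(term + " -> " + globMatches("1234*abc", term));
        }
    }
}
```

Note that the exact lookup only helps when the full value is known in advance; it does not by itself answer "find every value that contains a '*' somewhere".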

Re: arguments in favour of lucene over commercial competition

2010-06-25 Thread Grant Ingersoll
Hi JM, On Jun 23, 2010, at 4:01 AM, jm wrote: > Hi, > > I am trying to compile some arguments in favour of lucene as > management is deciding whether to standardize on lucene or a competing > commercial product (we have a couple of products, one using lucene, > another using commercial product, im

Re: arguments in favour of lucene over commercial competition

2010-06-25 Thread jm
thanks Grant, great pointers! With these and other previous replies I think I'm good to write up a good case for lucene. I'll look for perf numbers over releases (but I don't have the time to create those numbers from scratch unfortunately). regards, javier On Fri, Jun 25, 2010 at 4:27 PM, Grant

Index with multiple level structure

2010-06-25 Thread Alexandre Leopoldo Gonçalves
Hi All, I wonder if it is possible to create Lucene indexes with a multi-level structure. For instance, a field named "institutions" listing all the institutions I've worked at, with sub-fields detailing my contribution at each specific institution. The structure would be like this: field: name conte

Re: searching for wildcard as valid character

2010-06-25 Thread Robert Muir
I just wanted to mention that WildcardQuery (forget QueryParser) has no way to escape a character such as * or ? that is also an operator: https://issues.apache.org/jira/browse/LUCENE-588 On Fri, Jun 25, 2010 at 7:01 AM, Uwe Schindler wrote: > Maybe you simply don't use QueryParser f
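
As a rough picture of what escape-aware wildcard matching would mean, the following plain-Java sketch translates a wildcard pattern into a java.util.regex pattern. It is not Lucene code and does not describe how Lucene behaves or how the linked issue was resolved. The backslash as escape character and the class and method names are assumptions made for this example only.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class; backslash-as-escape is an assumption of this sketch.
class EscapedWildcardSketch {

    // Translate a wildcard pattern into a regex:
    //   '*'  -> any run of characters
    //   '?'  -> exactly one character
    //   '\x' -> the literal character x (so "\*" means a real asterisk)
    static Pattern compile(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '\\' && i + 1 < wildcard.length()) {
                // Escaped character: skip the backslash and quote the next char.
                i++;
                regex.append(Pattern.quote(String.valueOf(wildcard.charAt(i))));
            } else if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Ordinary character (or a trailing lone backslash): match literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    static boolean matches(String wildcard, String text) {
        return compile(wildcard).matcher(text).matches();
    }

    public static void main(String[] args) {
        // "*\**" reads as: anything, then a literal '*', then anything.
        String containsStar = "*\\**";
        System.out.println(matches(containsStar, "1234*abc"));
        System.out.println(matches(containsStar, "1234abc"));
    }
}
```

With an escape available, the original requirement (match only values that contain a literal '*' somewhere) can be written as a single pattern. Without one, every '*' in the pattern is read as an operator and the literal asterisk cannot be expressed.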

Re: Index with multiple level structure

2010-06-25 Thread Rebecca Watson
hi alex, sounds like you are going to tackle a similar problem to what we're trying to do in our XML too -- as it looks like you've got a one-to-many type relationship you want to search over but return based on the top-level document -- similar to an XML, i.e. structured doc, search problem --

Re: URL Tokenization

2010-06-25 Thread Sudha Verma
Thanks, that worked from the Lucene API. Because the code is not fully released, some of it had build errors. Nothing big. I ran into a few compile errors because the path for some of the analysis classes got changed to standard/ or core/... A lot of the import statements in solr source from that trun

Re: Index with multiple level structure

2010-06-25 Thread Simon Willnauer
On Sat, Jun 26, 2010 at 5:02 AM, Rebecca Watson wrote: > hi alex, > > sounds like you are going to tackle a similar problem to what we're > trying to do > in our XML too -- as it looks like you've got a one-to-many type relationship > you want to search over but return based on the top-level docum