AW: NewBie To Lucene || Perfect configuration on a 64 bit server

2014-05-27 Thread Ralf Heyde
Hey, I have several notes about your process. 1st: How do you select the documents you are passing to the index for further searching? Maybe it is more straightforward to "find" them in your programming language? 2nd: Storage is cheap; buy a hard disk and store the overall index. The most expensive

AW: Questions for facets search

2014-08-12 Thread Ralf Heyde
For the 1st: at the Solr level, I guess you could select (only) the document by its unique ID. Then you have the facets for that particular document. But this results in one additional query per doc. Sent from my BlackBerry 10 smartphone. Original message From: Sheng Sent: Tuesday, 12 August 2

AW: Why does this search fail?

2014-08-26 Thread Ralf Heyde
Can you post the result of the query parser for the other queries too? Sent from my BlackBerry 10 smartphone. Original message From: Milind Sent: Tuesday, 26 August 2014 18:24 To: java-user@lucene.apache.org Reply to: java-user@lucene.apache.org Subject: Why does this search fail

AW: Retrieve found terms

2014-11-26 Thread Ralf Heyde
Hi John, as far as I remember, the Highlighter has some functionality to provide that. Maybe you should have a look at the Lucene highlighting project too. Ralf -Original Message- From: John Cecere [mailto:john.cec...@oracle.com] Sent: Tuesday, 25 November 2014 15:12 To:

RE: Too many open files and ulimit limits reached

2011-07-02 Thread Ralf Heyde
Hi Dean, can you describe your problem in more detail? Some time ago I used Lucene to index 50 million medium-sized text documents, but I didn't run into a problem with thousands of open files. Maybe you can explain what you are trying to do? Regards, Ralf -Original Message- From:

RE: Index Writer()

2011-08-29 Thread Ralf Heyde
Hi Josh, can you paste your source code AND explain what you are trying to do? Ralf From: Josh Am [mailto:josh22...@gmail.com] Sent: Monday, 29 August 2011 12:39 To: java-user@lucene.apache.org Subject: Index Writer() Dear friends Hi!! I used Lucene to index some documents but unfo

Is it possible to combine Wildcard and Phrasequery for the Queryparser

2011-10-13 Thread Ralf Heyde
Hello, I'm trying to search for the following phrase. I'm looking for all occurrences of: . "The Right Way" . "The Right Ways" Possible solutions could be something like this, combining a phrase and a wildcard search: . title:"The Right Way*" . title:"The Ri
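One way to picture the phrase-plus-wildcard match asked about here is to treat every term except the last as an exact match and the last term as a prefix. The Python sketch below illustrates only that matching idea over whitespace-separated tokens. `phrase_prefix_match` is a hypothetical helper, not a Lucene API, and it says nothing about how the query parser itself treats such input.

```python
def phrase_prefix_match(text, phrase):
    """Return True if `text` contains `phrase` as consecutive tokens.

    A trailing '*' on the last phrase term means "any token that starts
    with this prefix"; all other terms must match exactly.
    """
    tokens = text.lower().split()
    terms = phrase.lower().split()
    if not terms:
        return False

    last = terms[-1]
    is_prefix = last.endswith("*")
    if is_prefix:
        last = last[:-1]  # drop the wildcard marker
    head = terms[:-1]
    n = len(terms)

    # Slide a window of n tokens over the text.
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if window[:-1] != head:
            continue
        if is_prefix and window[-1].startswith(last):
            return True
        if not is_prefix and window[-1] == last:
            return True
    return False


print(phrase_prefix_match("Finding The Right Ways today", "The Right Way*"))  # True
print(phrase_prefix_match("Finding The Right Ways today", "The Right Way"))   # False
```

With the wildcard, both "The Right Way" and "The Right Ways" match; without it, only the exact phrase does.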

RE: 10 million entities and 100 million related information

2012-01-13 Thread Ralf Heyde
Hi, maybe have a look at Solr ... if you need additional capacity, Solr lets you distribute the index over more than one machine / hard disk. Ralf -Original Message- From: Cheng [mailto:zhoucheng2...@gmail.com] Sent: Friday, 13 January 2012 01:48 To: java-user@lucene.apach

RE: Indexing 100Gb of readonly numeric data

2012-02-20 Thread Ralf Heyde
Hi Pedro, maybe have a look at Hadoop / JAQL / HBase? For this "simple" setup it could be a scalable and simple solution (with additional aggregation functions). Best Ralf -Original Message- From: Pedro Ferreira [mailto:psilvaferre...@gmail.com] Sent: Wednesday, 15 February 2012 23:18

Re: lucene algorithm ?

2012-04-26 Thread Ralf Heyde
Hi, I don't know the correct implementation. All I can say is that you should take a look at query optimization in state-of-the-art database systems. The general solution is to evaluate first the part of the query that reduces the result set the most. For example you try to search for a term l
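The "most selective part first" idea can be sketched for a conjunctive query as intersecting posting lists in order of increasing size, so the candidate set shrinks as early as possible. This is a minimal Python illustration under that assumption, not a description of how Lucene evaluates queries. The function name and the toy postings are hypothetical.

```python
def intersect_most_selective_first(postings):
    """Intersect posting lists (sets of doc IDs), smallest list first."""
    ordered = sorted(postings.values(), key=len)
    if not ordered:
        return set()

    result = set(ordered[0])  # start from the rarest term
    for docs in ordered[1:]:
        result.intersection_update(docs)
        if not result:
            break  # nothing can match any more, stop early
    return result


# Hypothetical toy index: term -> IDs of documents containing it.
postings = {
    "lucene": {1, 2, 3, 4, 5, 6, 7, 8},
    "facet": {2, 4, 8},
    "blockjoin": {4},
}
hits = intersect_most_selective_first(postings)
print(hits)  # {4}
```

Starting from the one-document list for "blockjoin" means the two larger lists are only ever checked against a single candidate.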

AW: Lucene reorganizing indexes

2012-07-16 Thread Ralf Heyde
Do you use Lucene or Solr? We faced this problem in Solr due to caches that were too big, which were (re)warmed after a commit, and never-ending full GCs. Greetings, Ralf -Original Message- From: Scott Smith [mailto:ssm...@mainstreamdata.com] Sent: Monday, 16 July 2012 22:29 To: java-us

Solr adding Documents / Commit in different Threads

2012-08-14 Thread Ralf Heyde
Hello, we are currently facing a problem where updates for some documents may be lost during adding / committing. The infrastructure: we have a main Solr, which receives documents and distributes them to a lot of slaves. The situation: we have a job which runs on a schedule every minute (no run, if a prev

AW: Lucene Boolean Query Minimization

2015-02-02 Thread Ralf Heyde
Just an idea: could it be optimized using Boolean algebra? Regards, Ralf Sent from my mobile phone. Original message From: Uwe Schindler Sent: Monday, 2 February 2015 16:25 To: java-user@lucene.apache.org Reply to: java-user@lucene.apache.org Subject: RE: Lucene Boolean Query Minimization
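As a rough illustration of the Boolean-algebra idea, a candidate simplification can be checked against the original expression by brute force over all truth assignments. The Python sketch below uses hypothetical example expressions and considers only which inputs match. It does not model Lucene's query classes or how a rewritten query would be scored.

```python
from itertools import product


def equivalent(f, g, n_vars):
    """True if f and g agree on every assignment of n_vars booleans."""
    return all(
        f(*values) == g(*values)
        for values in product([False, True], repeat=n_vars)
    )


# (a AND b) OR (a AND NOT b) reduces to a.
original = lambda a, b: (a and b) or (a and not b)
minimized = lambda a, b: a

print(equivalent(original, minimized, 2))  # True
```

The exhaustive check grows as 2^n in the number of clauses, so it is only practical for small expressions.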

AW: Lucene Boolean Query Minimization

2015-02-02 Thread Ralf Heyde
The question here is: does a 'smaller' boolean query consume fewer resources? Regards, Ralf Sent from my mobile phone. Original message From: Ralf Heyde Sent: Monday, 2 February 2015 16:28 To: java-user@lucene.apache.org; java-user@lucene.apache.org Reply to:

Re: [ANNOUNCE] Apache Lucene 8.0.0 released

2019-03-14 Thread Ralf Heyde
Congratulations! Sent from my iPhone > On 14.03.2019 at 13:15, jim ferenczi wrote: > > 14 March 2019, Apache Lucene™ 8.0.0 available > > The Lucene PMC is pleased to announce the release of Apache Lucene 8.0.0. > > Apache Lucene is a high-performance, full-featured text search engine

Lucene 8 / Facets and Nested Documents / ToParentBlockJoinQuery

2019-04-10 Thread Ralf Heyde
Hey Lucene geeks, I have a more or less tricky question. I'm currently trying to get my head around facets and nested / parent-child document relations. Some research on the internet turned up quite a few examples with Solr and Elasticsearch (and yes, I have used this heavily in the past) - b

Re: Which version of Lucene Java to use?

2019-04-15 Thread Ralf Heyde
Hey, I just started with Lucene 8. I personally have not stumbled on any issues yet. If you need this for, let's say, simple solutions, just take the latest; in case you are in doubt, take the latest 7.x. Sent from my iPhone > On 14.04.2019 at 23:01, gerasimos_rag wrote: > > So I am com

Re: umlauts / diacritic expansion

2019-04-16 Thread Ralf Heyde
Hey, take a look at ASCIIFoldingFilter - this one is quite generic. Does this answer your question? Cheers Ralf Sent from my iPhone > On 16.04.2019 at 20:08, Michael Sokolov wrote: > > I'm learning how to index/search German today and understanding that > vowels with umlauts are conv

Re: umlauts / diacritic expansion

2019-04-16 Thread Ralf Heyde
Ah sorry, ASCII folding for umlauts will result in ue/ae - ß/ss etc. You could allow a distance of 1 or 2, given you use Levenshtein distance - this might be close to what you need. Sent from my iPhone > On 16.04.2019 at 20:08, Michael Sokolov wrote: > > I'm learning how to index/search
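To see why a distance of 1 or 2 might cover the common spellings, here is a minimal Python sketch of plain Levenshtein distance applied to a few hypothetical example words. It is a textbook dynamic-programming implementation, not Lucene's fuzzy-query code.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or keep)
            ))
        prev = curr
    return prev[-1]


# "\u00fc" is u-umlaut and "\u00df" is sharp s, written as escapes so the
# example does not depend on file encoding.
print(levenshtein("m\u00fcller", "muller"))    # 1
print(levenshtein("m\u00fcller", "mueller"))   # 2
print(levenshtein("stra\u00dfe", "strasse"))   # 2
```

A single-letter fold costs one edit, while the two-letter expansions (ue, ss) cost two, which lines up with the suggested range of 1 or 2.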

Re: recommended index size

2024-01-04 Thread Ralf Heyde
Hi Vincent, my 2 cents: we had a production environment with ~250 GB and ~1M docs with static + dynamic fields in Solr (AFAIR Lucene 7), on a machine with 4 GB for the JVM and (AFAIR) a little bit more, maybe 6 GB, of OS 'cache'. In peak times (re-index) we had 10-15k updates / minute and (partiall