Improving search performance for forum search

2012-11-12 Thread Arjen van der Meijden
Hi List, I'm working on a search engine for our forum using Lucene 4. Since it's a brand-new search engine, I can change it as I see fit. We have about 1.5M topics in the various subforums and on average 20 replies to each topic (i.e. about 33M in total). For now, I've opted to index all repli

Re: Combining The results from DB and Index Regd.,

2012-11-12 Thread selvakumar netaji
Hi Arjen, Thanks for the reply. I have one question: if the index is updated often (every minute), will search performance degrade? Is it a good approach to index the documents that often? On Tue, Nov 13, 2012 at 12:43 PM, Arjen van der Meijden < acmmail...@tweakers.net> wrote: > On 13-11-2012 4:1

Re: Combining The results from DB and Index Regd.,

2012-11-12 Thread Arjen van der Meijden
On 13-11-2012 4:15 selvakumar netaji wrote: Hi All, We are using lucene for searching data from the database in our enterprise application. The searches would be in a single index, whose documents are indexed from two different databases A and B. The frequency of updating the database A is li

Re: content disappears in the index

2012-11-12 Thread Bernd Fehling
Hi Erick, I like the fortune cookie :-) I came to the same solution as you did, but with a short Java program that tried different patterns, so trial and error ;-) This brings me to the question: is there now (with 4.0) any filter doing the job for me? I took a look at LengthFilter but it has a differ

Re: content disappears in the index

2012-11-12 Thread Erick Erickson
Because your regex is wrong? (sorry, couldn't resist). Regexes always give me indigestion. But if you look at your results, your regex isn't working in any case at all. The second group is being removed from the end of the string. I _think_ what's happening is that the longest possible string is b

Re: content disappears in the index

2012-11-12 Thread Bernd Fehling
Yes, it is the second PatternReplaceFilterFactory. The string "Arslanagic, Aida ; Siqveland, Elisabeth" is reduced to "a", whereas the other strings are: "Alexander, Kvam ; Bjørn, Nyland ; Bjørn, Reiten ; Øystein, Huse" --> "alexanderkvambj" "Brennmoen, Ingar ; Hauklien, Øystein ; Hedalen, Trond

Re: content disappears in the index

2012-11-12 Thread Bernd Fehling
The field type is derived from the distributed alphaOnlySort as follows: It reduces long lists of author names (100 and more authors) to the first 30 chars for sorting and removes some illegal chars to keep sorting with utf8 solid. Don't see any problems there.
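
As a rough illustration of the normalization described above (lowercase, drop everything that is not a-z, keep the first 30 characters), here is a minimal plain-Java sketch. It does not use Solr or Lucene and is not the field type definition from the post: the class name AuthorSortKey, the method normalize, and both regular expressions are assumptions made for this example only.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or Solr.
final class AuthorSortKey {
    // Assumed pattern: any character that is not a lowercase ASCII letter.
    private static final Pattern NON_ALPHA = Pattern.compile("[^a-z]");
    // Assumed pattern: capture the first 30 characters and match the rest so it is dropped.
    private static final Pattern FIRST_30 = Pattern.compile("^(.{30}).*$");

    static String normalize(String authors) {
        String lower = authors.toLowerCase(Locale.ROOT);
        // Removes spaces, punctuation, digits and non-ASCII letters such as "ø".
        String lettersOnly = NON_ALPHA.matcher(lower).replaceAll("");
        // Strings shorter than 30 characters do not match and pass through unchanged.
        return FIRST_30.matcher(lettersOnly).replaceAll("$1");
    }

    public static void main(String[] args) {
        // prints "arslanagicaidasiqvelandelisabe"
        System.out.println(normalize("Arslanagic, Aida ; Siqveland, Elisabeth"));
    }
}
```

Running a candidate pattern over a few real author strings in isolation like this makes it easier to see whether an unexpected key comes from the regular expression or from somewhere else in the analysis chain. Note that this sketch simply drops letters outside a-z rather than transliterating them.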

Re: content disappears in the index

2012-11-12 Thread Jack Krupansky
Maybe... the author names have middle or first initials? Like, maybe the "Arslanagic" dude has an "A" initial in his name, like "A. Arslanagic" or "Arslanagic, A.". In any case, "string" is the proper type for a sorted field, although it would be nice if Lucene/Solr was more developer-friendly

Re: content disappears in the index

2012-11-12 Thread Erick Erickson
First, sorting on tokenized fields is undefined/unsupported. You _might_ get away with it if the author field always reduces to one token, i.e. if you're always indexing only the last name. I should say unsupported/undefined when more than one token is the result of analysis. You can do things lik

RE: content disappears in the index

2012-11-12 Thread Uwe Schindler
Hi, could it be that the issue is tokenization? In your explanation, you write the field is tokenized, but fields used for sorting should not be tokenized and should be indexed as-is (e.g. as Lucene 4.0 StringField). If you have more than one token/document in the field, the sorting is not defi

content disappears in the index

2012-11-12 Thread Bernd Fehling
Hi list, a user reported wrong sorting of our search service running on Solr. While chasing this issue I traced it back through Lucene into the index. I have a text field for sorting (stored,indexed,tokenized,omitNorms,sortMissingLast) and three docs with author names. If I trace at org.apache.lu

Re: NPE while decrement ref count

2012-11-12 Thread Martin Sachs
Hi, thanks for your fast response! I'm using: / java version "1.6.0_25" Java(TM) SE Runtime Environment (build 1.6.0_25-b06) Java HotSpot(TM) Server VM (build 20.0-b11, mixed mode) / This is a 32-bit Oracle JVM (JDK), not the 64-bit on RedHat, but maybe it's a bug on RHEL. While I write this, I

RE: NPE while decrement ref count

2012-11-12 Thread Uwe Schindler
Hi, I opened the code, the NPE occurs here: if (bytes != null) { assert bytesRef != null; bytesRef.decrementAndGet(); // <-- LINE 102, NPE occurs here bytes = null; bytesRef = null; } else { assert bytesRef == null; } This is completely i

Re: Why QueryParser isn't in API?

2012-11-12 Thread Chris Male
All the QueryParsers were consolidated into their own module in Lucene 4.0. http://lucene.apache.org/core/4_0_0/queryparser/index.html On Mon, Nov 12, 2012 at 6:16 PM, 余靖毅 <502437...@qq.com> wrote: > I'm a new Lucene programmer. I want to know why the class of QueryParser > didn't find in api

Why QueryParser isn't in API?

2012-11-12 Thread 余靖毅
I'm a new Lucene programmer. I want to know why I can't find the QueryParser class in the API, even though a simple example in the docs still uses it. Are there other methods that replace it? Thanks! Harry Yu

Re: NPE while decrement ref count

2012-11-12 Thread Martin Sachs
Oh yes, I missed the version: I'm using Lucene 3.6.1. Martin On 12.11.2012 09:40, Uwe Schindler wrote: > Which Lucene version? > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > >> -Original Message- >> From: Martin Sac

RE: NPE while decrement ref count

2012-11-12 Thread Uwe Schindler
Which Lucene version? - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Martin Sachs [mailto:martin.sa...@artnology.com] > Sent: Monday, November 12, 2012 9:18 AM > To: java-user@lucene.apache.org > Subjec

Java HotSpot problem with search and 64-bit JVM

2012-11-12 Thread Martin Sachs
Hi, I know this may be a little off topic, but maybe someone knows something. Background: I'm using RHEL 5.8 with a 64-bit JVM. With a 32-bit JVM the Searcher works fine. # # A fatal error has been detected by the Java Runtime Environment: # # SIGBUS (0x7) at pc=0x2b060334, pid=12235, ti

NPE while decrement ref count

2012-11-12 Thread Martin Sachs
Hi, I'm stuck on an NPE problem. It occurs only in the production environment, from day to day. Does anyone know something about this? java.lang.NullPointerException at org.apache.lucene.index.SegmentNorms.decRef(SegmentNorms.java:102) at org.apache.lucene.index.SegmentReader.do