Re: index local file system by lucene

2006-07-14 Thread Otis Gospodnetic
Hi Feng, That's a question for java-user, not java-dev. Borrow the code from Lucene in Action - www.lucenebook.com - it has what you need. Otis - Original Message From: Feng Ji <[EMAIL PROTECTED]> To: java-dev@lucene.apache.org Sent: Friday, July 14, 2006 7:26:30 PM Subject: index loca

Re: is it wrong with my code?

2006-07-14 Thread James liu
Yes, it can do that... but it does not work for Chinese search. English search is OK. How can I fix it? 2006/7/15, Doron Cohen <[EMAIL PROTECTED]>: > Hits hits = searcher.search(qp.Query(queryStr)); I think it should be qp.parse(String query) (rather than qp.Query(String field))

Re: What are norms?

2006-07-14 Thread Yonik Seeley
On 7/14/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote: Yonik, I disagree on one point. I recommend against omitting norms for title fields. Well, yes, I should have said "sometimes", when you don't need or want length normalization. The scenarios where you don't want/need length normalization

Re: What are norms?

2006-07-14 Thread Chris Hostetter
: > Length normalization of the field. Full-text matches on shorter : > fields score higher because the match is seen as more specific. You : > lose that if you omit norms. That's typically OK for short fields : > like "title" anyway, and fields that aren't full-text (like dates, : > numbers,

Re: Date ranges - getting the approach right

2006-07-14 Thread Chris Hostetter
: I gather I should prefer RangeQuery to ConstantScoreQuery+RangeFilter, : because it is faster not to use a Filter. However, I sometimes have to It's not always faster ... it really depends on how many matching terms there are in your range. : In a year of 365 days with e-mail messages arrivin

Re: What are norms?

2006-07-14 Thread Marvin Humphrey
On Jul 14, 2006, at 7:42 AM, Yonik Seeley wrote: On 7/14/06, Rob Staveley (Tom) <[EMAIL PROTECTED]> wrote: What would I lose by omitting norms? The ability to boost individual fields as they are added to the index? Anything else? Length normalization of the field. Full-text matches on sho

RE: MissingStringLastComparatorSource and MultiSearcher

2006-07-14 Thread Chris Hostetter
: I was wanting to apply this to a field, which sorts on INT. Specifically I'm : trying to achieve reverse chronological sorting on a timestamp field, which : stores YYMMDDHHI (i.e. resolves to 10 minutes and doesn't handle centuries). : Missing timestamps are assumed to be "old" (i.e. should appe

Re: PrefixQuery rewrite() bug, ignores max clause count

2006-07-14 Thread Chris Hostetter
: I dug into the code I found where the PrefixQuery rewrites itself as a : BooleanQuery. Unfortunately, it doesn't respect the maxClauseCount of : the BooleanQuery in the process. Thus, when we hit a sufficiently large : number of results, this causes the TooManyClauses exception that a : number

Re: MissingStringLastComparatorSource and MultiSearcher

2006-07-14 Thread Yonik Seeley
On 7/14/06, Rob Staveley (Tom) <[EMAIL PROTECTED]> wrote: I was wanting to apply this to a field, which sorts on INT. The problem with int is that the FieldCache stores the values as an int[], and you can't tell when a value is missing. Specifically I'm trying to achieve reverse chronological
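The point about `int[]` can be illustrated outside Lucene: a primitive `int` slot always holds some number, so "missing" has no representation of its own, whereas a nullable `Integer` can carry it. The following is a minimal sketch in plain Java, not the FieldCache or the MissingStringLastComparatorSource code; the class name, method name, and sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

class MissingLastSort {
    // Sorts newest-first (descending) and pushes missing values, represented
    // here as null, to the end. With a primitive int[] the missing entries
    // would just be some number (for example 0) and could not be told apart
    // from real values.
    static Integer[] newestFirstMissingLast(Integer[] timestamps) {
        Integer[] sorted = timestamps.clone();
        Arrays.sort(sorted, Comparator.nullsLast(Comparator.<Integer>reverseOrder()));
        return sorted;
    }

    public static void main(String[] args) {
        Integer[] input = {60714120, null, 60714153, 60713090};
        System.out.println(Arrays.toString(newestFirstMissingLast(input)));
    }
}
```

With this ordering, documents without a timestamp are treated as "old" and appear after every dated document in a reverse chronological listing.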

Re: is it wrong with my code?

2006-07-14 Thread Doron Cohen
> Hits hits = searcher.search(qp.Query(queryStr)); I think it should be qp.parse(String query) (rather than qp.Query(String field))

is it wrong with my code?

2006-07-14 Thread James liu
I can compile it correctly, but it cannot find anything when searching. I can't find the error, so could you help me? package kk; import java.io.*; import java.util.Date; //import org.apache.lucene.analysis.standard.StandardAnalyzer; import org.apache.lucene.analysis.cjk.CJKAnalyzer; import org.apache.lucene.d

PrefixQuery rewrite() bug, ignores max clause count

2006-07-14 Thread Russell M. Allen
We recently ran into an issue while executing a simple prefix search "name:b*", which results in a BooleanQuery$TooManyClauses exception. At first I found it odd that a single clause query was causing this, but as I dug into the code I found where the PrefixQuery rewrites itself as a BooleanQuery.
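To see why a single-clause query such as "name:b*" can hit a clause limit, it may help to model the rewrite as one clause per indexed term that starts with the prefix. The sketch below is plain Java and only imitates that expansion; it is not Lucene's PrefixQuery code, and the class name, method name, and exception type are hypothetical. The limit of 1024 used in `main` matches Lucene's default maximum clause count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class PrefixExpansionSketch {
    // Collects one "clause" per term that starts with the prefix, failing once
    // the number of clauses would exceed maxClauses.
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // In sorted order, all terms sharing a prefix are contiguous and
        // begin at the first term >= prefix.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= maxClauses) {
                throw new IllegalStateException("too many clauses: limit is " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 1500; i++) {
            terms.add("b" + i);
        }
        // 1500 matching terms against a limit of 1024 fails, even though the
        // user typed a single prefix clause.
        try {
            expand(terms, "b", 1024);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The failure depends on how many distinct terms share the prefix, not on how many documents match, which is why the problem only shows up once the index grows.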

RE: HTMLParser

2006-07-14 Thread Ross Rankin
OK, I got it fixed and thought I would respond back so it is in the archive for the next poor soul... Here's the code I used: StringBean sb = new StringBean (); String htmlSource = record.get("column14").toString().trim(); Parser parser = new Parser(new Lexer(htmlSource));

RE: MissingStringLastComparatorSource and MultiSearcher

2006-07-14 Thread Rob Staveley (Tom)
Neat :-) I was wanting to apply this to a field, which sorts on INT. Specifically I'm trying to achieve reverse chronological sorting on a timestamp field, which stores YYMMDDHHI (i.e. resolves to 10 minutes and doesn't handle centuries). Missing timestamps are assumed to be "old" (i.e. should app
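For readers unfamiliar with the YYMMDDHHI layout, here is a minimal sketch of how such a key could be produced: two digits each for year, month, day, and hour, then a single digit for the ten-minute slot. The class and method names are hypothetical and this is not the poster's code.

```java
import java.time.LocalDateTime;
import java.util.Locale;

class TimestampKey {
    // Encodes a timestamp as YYMMDDHHI, where I is the minute divided by 10.
    // The century is dropped, as described in the post.
    static String encode(LocalDateTime t) {
        return String.format(Locale.ROOT, "%02d%02d%02d%02d%d",
                t.getYear() % 100, t.getMonthValue(), t.getDayOfMonth(),
                t.getHour(), t.getMinute() / 10);
    }

    // Parses the key as an int for numeric sorting. The leading zero is lost,
    // but ordering within one century is preserved.
    static int toInt(String key) {
        return Integer.parseInt(key);
    }

    public static void main(String[] args) {
        String key = encode(LocalDateTime.of(2006, 7, 14, 15, 42));
        System.out.println(key + " -> " + toInt(key));
    }
}
```

The largest possible key, 991231235, fits comfortably in a 32-bit int, so sorting on INT is feasible for this layout.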

Re: MissingStringLastComparatorSource and MultiSearcher

2006-07-14 Thread Yonik Seeley
On 7/14/06, Rob Staveley (Tom) <[EMAIL PROTECTED]> wrote: Chris Hostetter and Yonik's MissingStringLastComparator looks like a neat way to specify where to put null values when you want them to appear at the end of reverse sorts rather than at the beginning, but I spotted the note... // Note

MissingStringLastComparatorSource and MultiSearcher

2006-07-14 Thread Rob Staveley (Tom)
Chris Hostetter and Yonik's MissingStringLastComparator looks like a neat way to specify where to put null values when you want them to appear at the end of reverse sorts rather than at the beginning, but I spotted the note... // Note: basing lastStringValue on the StringIndex won't work /

Re: accented characters, wildcards and other problems

2006-07-14 Thread Tomi NA
On 7/13/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: Bok Tomi, What do you mean by "terms are misrepresented"? What should they be, and what are you seeing? I mean 3/5 accented characters appear in the index with accents correctly displayed, but the remaining 2 accented characters appear

RE: What are norms?

2006-07-14 Thread Rob Staveley (Tom)
Got it. Thanks Yonik. -Original Message- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: 14 July 2006 15:42 To: java-user@lucene.apache.org Subject: Re: What are norms? On 7/14/06, Rob Staveley (Tom) <[EMAIL PROTECTED]> wrote: > I'm trying to reduce the memory requirement of my applic

Re: What are norms?

2006-07-14 Thread Yonik Seeley
On 7/14/06, Rob Staveley (Tom) <[EMAIL PROTECTED]> wrote: I'm trying to reduce the memory requirement of my application that has ~40 indexed fields. Would I be wasting my time omitting norms in this application? What would I lose by omitting norms? The ability to boost individual fields as they

RE: What are norms?

2006-07-14 Thread Rob Staveley (Tom)
I'm trying to reduce the memory requirement of my application that has ~40 indexed fields. Would I be wasting my time omitting norms in this application? What would I lose by omitting norms? The ability to boost individual fields as they are added to the index? Anything else? [I want to check tha

Mixing compressed and uncompressed values

2006-07-14 Thread Rob Staveley (Tom)
Is this a bad idea? String synopsis = /* may be any length between 0 and 400 characters */ // Store, but don't index the synopsis // If the synopsis is > 150 characters, we should compress it Field field = new Field( "synopsis",synopsis
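The size-based decision in the question can be sketched without Lucene. The example below uses `java.util.zip` to compress only values longer than the 150-character threshold mentioned in the post; it says nothing about how Lucene stores compressed fields, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class SynopsisCodec {
    static final int COMPRESS_THRESHOLD = 150; // threshold taken from the post

    // Short values are left alone because compression overhead can outweigh the gain.
    static boolean shouldCompress(String text) {
        return text.length() > COMPRESS_THRESHOLD;
    }

    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    static String decompress(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In a scheme like this, whatever reads the value back needs some way to know whether a given value was compressed, which is the practical concern behind mixing the two forms in one field.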

Re: Are Search Joins Possible between two Physically separate Indexes?

2006-07-14 Thread Paul Borgermans
Sorry, a filter won't help if I understood you correctly (a real join based on the external DocumentID) Paul On 7/13/06, Paul Borgermans <[EMAIL PROTECTED]> wrote: Though I'm a newbie (which means I may be completely wrong), I don't think this is possible "out of the box". The quickest would b

Re: FileNotFoundException: occurs during the optimization of index

2006-07-14 Thread Kapil Chhabra
You may try to update a copy of the index and then either replace the live index with the updated one or instruct other instances to update the index path. You may try this scenario if your index size is manageable. Hope this helps. Regards, kapilChhabra Supriya Kumar Shyamal wrote: I have comm

FileNotFoundException: occurs during the optimization of index

2006-07-14 Thread Supriya Kumar Shyamal
I have a common index which is shared by 6 application servers through NFS. I use only one instance to maintain the index. The other instances share the index through a read-only NFS share, so that I can be sure the search instances do not corrupt the index. I do the index optimization every day at 6:00 at t

Re: modify existing non-indexed field

2006-07-14 Thread dan2000
Thanks Doron. The site you provided doesn't support Firefox, which is why I had a problem with downloading. Your code works fine, and I've just noticed I didn't change the create parameter to false when I cleaned the index directory. Sorry for my mistake. Thanks a lot for your help Doron. You advic

Date ranges - getting the approach right

2006-07-14 Thread Rob Staveley (Tom)
nClause.Occur.SHOULD ); dateRangeBooleanQuery.add( new TermQuery("20060714") // i.e. today ,BooleanClause.Occur.SHOULD ); BooleanQuery mainQuery = new BooleanQuery(); mainQuery.add(