Help migrating from 1.9.1 to 2.3.0 (Newbie)

2008-04-07 Thread JavaJava
Someone has left here and I was asked to upgrade to Lucene 2.3.0. Please comment on these compile errors: ERROR: rd.delete(t) > does this become rd.deleteDocuments(t)? (rd is an IndexReader) ERROR: Query q = MultiFieldQueryParser.parse(srch, names, new StandardAnalyzer()) Do I change this

Sort difference between 2.1 and 2.3

2008-04-07 Thread Antony Bowesman
Hi, I had a test case that added two documents, each with one untokenized field, and sorted them. The data in each document was char(1) + "First" char(0x) + "Last" With Lucene 2.1 the documents are sorted correctly, but with Lucene 2.3.1, they are not. Looking at the index with Luke sh

StandardTokenizerConstants in 2.3

2008-04-07 Thread Antony Bowesman
I'm migrating from 2.1 to 2.3 and found that the public interface StandardTokenizerConstants has gone. It looks like the definitions have disappeared inside the package-private class StandardTokenizerImpl. Was this intentional? I was using these to determine the return values from Token.typ

[jira] Commented: (LUCENE-1260) Norm codec strategy in Similarity

2008-04-07 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586588#action_12586588 ] Karl Wettin commented on LUCENE-1260: - I suppose it would be possible to implement a N

[jira] Updated: (LUCENE-1260) Norm codec strategy in Similarity

2008-04-07 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wettin updated LUCENE-1260: Attachment: LUCENE-1260.txt * Similarity#getNormCodec() * Similarity#setNormCodec(NormCodec) * S

[jira] Created: (LUCENE-1260) Norm codec strategy in Similarity

2008-04-07 Thread Karl Wettin (JIRA)
Norm codec strategy in Similarity - Key: LUCENE-1260 URL: https://issues.apache.org/jira/browse/LUCENE-1260 Project: Lucene - Java Issue Type: Improvement Components: Search Affects Versions: 2.3

Dataset to test lucene

2008-04-07 Thread sumittyagi
Hi, I need a dataset of HTML pages to test my Lucene programs. Where can I download one?

[jira] Created: (LUCENE-1259) Token.clone() copies termBuffer - unneccessary in most cases

2008-04-07 Thread Thomas Peuss (JIRA)
Token.clone() copies termBuffer - unneccessary in most cases Key: LUCENE-1259 URL: https://issues.apache.org/jira/browse/LUCENE-1259 Project: Lucene - Java Issue Type: Improvement

[jira] Updated: (LUCENE-1259) Token.clone() copies termBuffer - unneccessary in most cases

2008-04-07 Thread Thomas Peuss (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Peuss updated LUCENE-1259: - Attachment: LUCENE-1259.patch > Token.clone() copies termBuffer - unneccessary in most cases > -

RE: shingles and punctuations

2008-04-07 Thread Steven A Rowe
Hi Mathieu, From the class comment for ShingleFilter: This filter handles position increments > 1 by inserting filler tokens (tokens with termtext "_"). It does not handle a position increment of 0. You could use this feature by setting (in an upstream filter) the positionIncrement of ea
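
The filler-token behaviour described in the class comment quoted above can be illustrated with a minimal sketch. It does not use Lucene and is not ShingleFilter's actual implementation: the class `ShingleSketch` and its methods `fillGaps` and `bigrams` are hypothetical names invented for this illustration, tokens are modelled as parallel arrays of terms and position increments, and only two-word shingles are produced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Lucene-free illustration of filler tokens in shingles.
public class ShingleSketch {

    // Filler term named in the ShingleFilter class comment.
    static final String FILLER = "_";

    // A position increment of n > 1 means n - 1 positions were skipped
    // (for example by an upstream filter); emit one filler per skipped position.
    // Increments of 0 get no special handling here: the token is simply appended.
    static List<String> fillGaps(String[] terms, int[] posIncs) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            for (int gap = 1; gap < posIncs[i]; gap++) {
                out.add(FILLER);
            }
            out.add(terms[i]);
        }
        return out;
    }

    // Join each pair of adjacent tokens into a two-word shingle.
    static List<String> bigrams(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            out.add(tokens.get(i) + " " + tokens.get(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // The third token has increment 2, i.e. one position was skipped before it.
        String[] terms = {"please", "divide", "sentence"};
        int[] posIncs = {1, 1, 2};
        // prints [please divide, divide _, _ sentence]
        System.out.println(bigrams(fillGaps(terms, posIncs)));
    }
}
```

Because of the filler, no shingle joins "divide" directly to "sentence"; every shingle that crosses the gap contains "_", so such shingles can be recognised and handled separately downstream.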