Re: lucene 2.9 sorting algorithm
Given that this new API is pretty unwieldy, and seems to not actually perform any better than the old one... are we going to consider revisiting that?

-jake

On Mon, Oct 19, 2009 at 11:27 PM, Uwe Schindler wrote:
> The old search API is already removed in trunk…
>
> Uwe
RE: lucene 2.9 sorting algorithm
The old search API is already removed in trunk.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

_
From: John Wang [mailto:john.w...@gmail.com]
Sent: Tuesday, October 20, 2009 3:28 AM
To: java-dev@lucene.apache.org
Subject: Re: lucene 2.9 sorting algorithm

Hi Michael:

Was wondering if you got a chance to take a look at this.

Since deprecated APIs are being removed in 3.0, I was wondering if/when we would decide on keeping the ScoreDocComparator API, so that it would be kept for Lucene 3.0.

Thanks

-John
Re: lucene 2.9 sorting algorithm
Hi Michael:

Was wondering if you got a chance to take a look at this.

Since deprecated APIs are being removed in 3.0, I was wondering if/when we would decide on keeping the ScoreDocComparator API, so that it would be kept for Lucene 3.0.

Thanks

-John

On Fri, Oct 16, 2009 at 9:53 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
> Oh, no problem...
>
> Mike
>
> On Fri, Oct 16, 2009 at 12:33 PM, John Wang wrote:
> > Mike, just a clarification on my first perf report email.
> > In the first section, numHits is incorrectly labeled; it should be 20 instead of 50.
> > Sorry about the possible confusion.
> > Thanks
> > -John
> >
> > On Fri, Oct 16, 2009 at 3:21 AM, Michael McCandless wrote:
> >> Thanks John; I'll have a look.
> >>
> >> Mike
> >>
> >> On Fri, Oct 16, 2009 at 12:57 AM, John Wang wrote:
> >> > Hi Michael:
> >> > I added classes: ScoreDocComparatorQueue and OneSortNoScoreCollector as a more general case.
> >> > I think keeping the old api for ScoreDocComparator and SortComparatorSource would work.
> >> > Please take a look.
> >> > Thanks
> >> > -John
> >> >
> >> > On Thu, Oct 15, 2009 at 6:52 PM, John Wang wrote:
> >> >> Hi Michael:
> >> >> It is open, http://code.google.com/p/lucene-book/source/checkout
> >> >> I think I sent the https url instead, sorry.
> >> >> The multi PQ sorting is fairly self-contained. I have 2 versions, 1 for string and 1 for int; each is a Collector impl.
> >> >> I shouldn't say the multi PQ is faster on int sort; it is within the error boundary. The diff is very, very small; I would say they are roughly equal.
> >> >> If you think it is a good thing to go this way (if not for the perf, just for the simpler api), I'd be happy to work on a patch.
> >> >> Thanks
> >> >> -John
> >> >>
> >> >> On Thu, Oct 15, 2009 at 5:18 PM, Michael McCandless wrote:
> >> >>> John, looks like this requires login -- any plans to open that up, or, post the code on an issue?
> >> >>>
> >> >>> How self-contained is your Multi PQ sorting? EG is it a standalone Collector impl that I can test?
> >> >>>
> >> >>> Mike
> >> >>>
> >> >>> On Thu, Oct 15, 2009 at 6:33 PM, John Wang wrote:
> >> >>> > BTW, we have a little sandbox for these experiments, and all my test code is at the URL below. It is not very polished.
> >> >>> >
> >> >>> > https://lucene-book.googlecode.com/svn/trunk
> >> >>> >
> >> >>> > -John
> >> >>> >
> >> >>> > On Thu, Oct 15, 2009 at 3:29 PM, John Wang wrote:
> >> >>> >> Numbers Mike requested for Int types:
> >> >>> >>
> >> >>> >> Only the time/cputime are posted; the others are all the same since the algorithm is the same.
> >> >>> >>
> >> >>> >> Lucene 2.9:
> >> >>> >> numhits: 10
> >> >>> >> time: 14619495
> >> >>> >> cpu: 146126
> >> >>> >>
> >> >>> >> numhits: 20
> >> >>> >> time: 14550568
> >> >>> >> cpu: 163242
> >> >>> >>
> >> >>> >> numhits: 100
> >> >>> >> time: 16467647
> >> >>> >> cpu: 178379
> >> >>> >>
> >> >>> >> my test:
> >> >>> >> numHits: 10
> >> >>> >> time: 14101094
> >> >>> >> cpu: 144715
> >> >>> >>
> >> >>> >> numHits: 20
> >> >>> >> time: 14804821
> >> >>> >> cpu: 151305
> >> >>> >>
> >> >>> >> numHits: 100
> >> >>> >> time: 15372157
> >> >>> >> cpu time: 158842
> >> >>> >>
> >> >>> >> Conclusions:
> >> >>> >> They are very similar; the differences are all within error bounds, especially with lower PQ sizes, where the second sort alg is again slightly faster.
> >> >>> >>
> >> >>> >> Hope this helps.
> >> >>> >>
> >> >>> >> -John
> >> >>> >>
> >> >>> >> On Thu, Oct 15, 2009 at 3:04 PM, Yonik Seeley wrote:
> >> >>> >>> On Thu, Oct 15, 2009 at 5:33 PM, Michael McCandless wrote:
> >> >>> >>> > Though it'd be odd if the switch to searching by segment
> >> >>> >>> > really was most of the gains here.
> >> >>> >>>
> >> >>> >>> I had assumed that much of the improvement was due to ditching MultiTermEnum/MultiTermDocs.
> >> >>> >>> Note that LUCENE-1483 was before LUCENE-1596... but that only helps with queries that use a TermEnum (range, prefix, etc).
> >> >>> >>>
> >> >>> >>> -Yonik
> >> >>> >>> http://www.lucidimagination.com
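For readers following the PQ discussion above, here is a minimal sketch of what a "multi PQ" int sort could look like. It assumes that "multi PQ" means one bounded priority queue per index segment, with the per-segment winners merged at the end; the thread itself does not spell this out. The class and method names are hypothetical, it uses java.util.PriorityQueue rather than Lucene's own queue, and it tracks only the int sort values (a real Collector would also track doc ids).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch; none of these names come from Lucene. */
final class MultiPqSketch {

    /** Keeps the numHits smallest values of one segment (ascending sort). */
    static PriorityQueue<Integer> collectSegment(int[] sortValues, int numHits) {
        // Max-heap: the head is the worst (largest) of the current top hits,
        // so a new value only has to beat the head to enter the queue.
        PriorityQueue<Integer> pq =
                new PriorityQueue<Integer>(numHits, Collections.reverseOrder());
        for (int value : sortValues) {
            if (pq.size() < numHits) {
                pq.add(value);
            } else if (value < pq.peek()) {
                pq.poll();
                pq.add(value);
            }
        }
        return pq;
    }

    /** Runs one queue per segment, then merges the per-segment winners. */
    static List<Integer> topN(int[][] segments, int numHits) {
        List<Integer> candidates = new ArrayList<Integer>();
        for (int[] segment : segments) {
            candidates.addAll(collectSegment(segment, numHits));
        }
        Collections.sort(candidates);
        int end = Math.min(numHits, candidates.size());
        return new ArrayList<Integer>(candidates.subList(0, end));
    }

    public static void main(String[] args) {
        int[][] segments = { {42, 7, 19}, {3, 88, 15, 1}, {23} };
        System.out.println(topN(segments, 3)); // prints [1, 3, 7]
    }
}
```

Here numHits plays the role of the PQ size in the numbers above: each queue holds at most numHits entries, so the per-document work is a single comparison against the head of the queue in the common case.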
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated LUCENE-1257:

    Attachment: LUCENE-1257_javacc_upgrade.patch

common-build.xml, build comments match those in build.txt

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis, Examples, Index, Other, Query/Scoring, QueryParser, Search, Store, Term Vectors
> Affects Versions: 3.0
> Reporter: Cédric Champeau
> Assignee: Uwe Schindler
> Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch,
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch,
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch,
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch,
> LUCENE-1257-CompoundFileReaderWriter.patch,
> LUCENE-1257-ConcurrentMergeScheduler.patch,
> LUCENE-1257-DirectoryReader.patch,
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch,
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch,
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldCacheRangeFilter.patch,
> LUCENE-1257-IndexDeleter.patch,
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch,
> LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch,
> LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch,
> LUCENE-1257-org_apache_lucene_document.patch,
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch,
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch,
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch,
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch,
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_javacc_upgrade.patch,
> LUCENE-1257_messages.patch, LUCENE-1257_MultiFieldQueryParser.patch,
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch,
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch,
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch,
> LUCENE-1257_org_apache_lucene_index.patch,
> LUCENE-1257_org_apache_lucene_index.patch, LUCENE-1257_queryParser_jj.patch,
> lucene1257surround1.patch, lucene1257surround1.patch,
> shinglematrixfilter_generified.patch
>
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know
> Java 5 migration had been planned for 2.1 at some point in the past, but I
> don't know when it is planned now. This patch against the trunk includes:
> - most obvious generics usage (there are tons of usages of sets, ... Those
> which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is, in my opinion, much more readable with those features (you
> actually *know* what is stored in collections when reading the code, without
> the need to look up field definitions every time) and it simplifies many
> algorithms.
> Note that this patch also includes an interface for the Query class. This has
> been done for my company's needs for building custom Query classes which add
> some behaviour to the base Lucene queries. It prevents multiple unnecessary
> casts. I know this introduction is not wanted by the team, but it really
> makes our developments easier to maintain. If you don't want to use this,
> replace all /Queriable/ calls with standard /Query/.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
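As an illustration of the Java 5 constructs listed in the issue description above (generics, for-each loops, autoboxing), the sketch below shows the same small method written in both styles. It is a hypothetical example and is not taken from any of the LUCENE-1257 patches; the class and method names are made up for the comparison.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; not code from the LUCENE-1257 patches. */
final class Java5PortSketch {

    /** Old style: raw collections, indexed loop, casts, explicit unboxing. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Map countOld(List terms) {
        Map counts = new HashMap();
        for (int i = 0; i < terms.size(); i++) {
            String term = (String) terms.get(i);
            Integer old = (Integer) counts.get(term);
            int next = (old == null) ? 1 : old.intValue() + 1;
            counts.put(term, Integer.valueOf(next));
        }
        return counts;
    }

    /** Java 5 style: generics, for-each loop, autoboxing. */
    static Map<String, Integer> countNew(List<String> terms) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String term : terms) {
            Integer old = counts.get(term);
            counts.put(term, old == null ? 1 : old + 1);
        }
        return counts;
    }
}
```

Both methods compute the same term counts. The second version states in its signature what the collections hold, which is the readability gain the description refers to, and it needs no casts.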
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated LUCENE-1257:

    Attachment: LUCENE-1257_MultiFieldQueryParser.patch

MultiFieldQueryParser
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated LUCENE-1257:

    Attachment: LUCENE-1257_queryParser_jj.patch

QueryParser.jj patch separately for generics
[jira] Commented: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767617#action_12767617 ]

Uwe Schindler commented on LUCENE-1257:

bq. What's the version of javacc being used/suggested currently ( the latest release seems to be 5.0 ) .

*From BUILD.txt* (I suggest using this version, 4.1; 4.2, for example, has a bug that somehow corrupts the parser):

Step 3) Install JavaCC

Building the Lucene distribution from the source does not require the JavaCC parser generator, but if you wish to regenerate any of the pre-generated parser pieces, you will need to install JavaCC. Version 4.1 is tested to work correctly.

  http://javacc.dev.java.net

Follow the download links and download the zip file to a temporary location on your file system.

After JavaCC is installed, create a build.properties file (as in step 2), and add the line

  javacc.home=/javacc

where this points to the root directory of your javacc installation (the directory that contains bin/lib/javacc.jar).
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-1257:

Attachment: LUCENE-1257-FieldCacheRangeFilter.patch

FieldCacheRangeFilter generified + type safe accessor methods.

Committed revision: 826883

> Port to Java5
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis, Examples, Index, Other, Query/Scoring, QueryParser, Search, Store, Term Vectors
> Affects Versions: 3.0
> Reporter: Cédric Champeau
> Assignee: Uwe Schindler
> Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch,
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch,
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch,
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch,
> LUCENE-1257-CompoundFileReaderWriter.patch,
> LUCENE-1257-ConcurrentMergeScheduler.patch,
> LUCENE-1257-DirectoryReader.patch,
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch,
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch,
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldCacheRangeFilter.patch,
> LUCENE-1257-IndexDeleter.patch,
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch,
> LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch,
> LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch,
> LUCENE-1257-org_apache_lucene_document.patch,
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch,
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch,
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch,
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch,
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_messages.patch,
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch,
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch,
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch,
> LUCENE-1257_org_apache_lucene_index.patch,
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch,
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know
> Java 5 migration had been planned for 2.1 someday in the past, but don't know
> when it is planned now. This patch against the trunk includes:
> - most obvious generics usage (there are tons of usages of sets, ... Those
> which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is, in my opinion, much more readable with those features (you
> actually *know* what is stored in collections when reading the code, without
> needing to look up field definitions every time) and it simplifies many
> algorithms.
> Note that this patch also includes an interface for the Query class. This has
> been done for my company's needs for building custom Query classes which add
> some behaviour to the base Lucene queries. It prevents multiple unnecessary
> casts. I know this introduction is not wanted by the team, but it really
> makes our developments easier to maintain. If you don't want to use this,
> replace all /Queriable/ calls with standard /Query/.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
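The Java 5 constructs named in the issue description above (generics, a generified priority queue, for-each loops, less manual unboxing) can be illustrated with a small self-contained sketch. It is not taken from any of the attached patches: it uses java.util.PriorityQueue rather than Lucene's own PriorityQueue class, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class Java5ConstructsSketch {

    // Pre-Java 5 style: raw collection, indexed loop, explicit cast and unboxing call.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int sumRaw(List values) {
        int sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += ((Integer) values.get(i)).intValue();
        }
        return sum;
    }

    // Java 5 style: generic collection, for-each loop, auto-unboxing.
    static int sumGeneric(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // A generified queue hands back its element type directly, so no cast is needed.
    static String smallest(PriorityQueue<String> queue) {
        return queue.peek();
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<Integer>();
        values.add(3);
        values.add(4);
        values.add(5);
        System.out.println(sumRaw(values));
        System.out.println(sumGeneric(values));

        PriorityQueue<String> queue = new PriorityQueue<String>();
        queue.add("score");
        queue.add("doc");
        queue.add("term");
        System.out.println(smallest(queue));
    }
}
```

Both sum methods compute the same value; the generic version simply lets the compiler check the element type instead of relying on a runtime cast.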
[jira] Commented: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767611#action_12767611 ]

Kay Kay commented on LUCENE-1257:

> I updated the parser generator task to use Java 1.5. If you want to generify the other parts of QueryParser, update the .jj file and regenerate the java files. I will do this tomorrow. Will go to bed now.

What version of javacc is currently being used/suggested? (The latest release seems to be 5.0.)
[jira] Commented: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767603#action_12767603 ]

Uwe Schindler commented on LUCENE-1257:

Committed:

LUCENE-1257-MTQWF.patch             2009-10-19 10:55 PM  Uwe Schindler  5 kB
LUCENE-1257-TopDocsCollector.patch  2009-10-19 08:47 PM  Kay Kay        8 kB
LUCENE-1257-FieldCacheImpl.patch    2009-10-19 08:23 PM  Kay Kay        8 kB
(with some modifications in FieldCacheImpl, where Class was not generified to Class).

At revision: 826857
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-1257:

Attachment: LUCENE-1257-MTQWF.patch

Better generification of MultiTermQueryWrapperFilter (no more casts in sub-classes).
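What "no more casts in sub-classes" can mean in practice is sketched below with toy classes. This is only an illustration of the general technique, not the committed patch: `MultiTermQueryLike`, `PrefixQueryLike`, `WrapperFilter` and `PrefixFilterLike` are hypothetical stand-ins. Parameterizing the wrapper on the wrapped query type lets a subclass read the query with its concrete type.

```java
public class GenericWrapperSketch {

    // Hypothetical stand-in for a multi-term query base class.
    static abstract class MultiTermQueryLike {
        abstract String describe();
    }

    // Hypothetical concrete query with a type-specific accessor.
    static class PrefixQueryLike extends MultiTermQueryLike {
        private final String prefix;

        PrefixQueryLike(String prefix) {
            this.prefix = prefix;
        }

        String getPrefix() {
            return prefix;
        }

        @Override
        String describe() {
            return prefix + "*";
        }
    }

    // Wrapper parameterized on the wrapped query type Q.
    static class WrapperFilter<Q extends MultiTermQueryLike> {
        protected final Q query;

        WrapperFilter(Q query) {
            this.query = query;
        }

        String describe() {
            return "filter(" + query.describe() + ")";
        }
    }

    // The subclass fixes Q, so query.getPrefix() compiles without a cast.
    static class PrefixFilterLike extends WrapperFilter<PrefixQueryLike> {
        PrefixFilterLike(String prefix) {
            super(new PrefixQueryLike(prefix));
        }

        String getPrefix() {
            return query.getPrefix();
        }
    }

    public static void main(String[] args) {
        PrefixFilterLike filter = new PrefixFilterLike("luc");
        System.out.println(filter.getPrefix());
        System.out.println(filter.describe());
    }
}
```

With a non-generic wrapper the field would have the base type, and `getPrefix()` would need `((PrefixQueryLike) query).getPrefix()`.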
[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767564#action_12767564 ]

Michael McCandless commented on LUCENE-1995:

That's a nice large RAM buffer :)

> Mike - I think keeping the signed shift is the right thing to do... a zero-cost check against silent corruption.

Ahh good point, OK we'll keep it as is.

> But I'm not sure if 2048MiB is safe either

2048 probably won't be safe, because a large doc just as the buffer is filling up could still overflow. (Though, RAM is also used eg for norms, so you might squeak by).

I'll update the javadocs to note the limitation!

> ArrayIndexOutOfBoundsException during indexing
>
> Key: LUCENE-1995
> URL: https://issues.apache.org/jira/browse/LUCENE-1995
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Affects Versions: 2.9
> Reporter: Yonik Seeley
> Fix For: 2.9.1
>
> http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing
[jira] Assigned: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1995:

Assignee: Michael McCandless
[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767556#action_12767556 ]

Yonik Seeley commented on LUCENE-1995:

lol - well, there we go. Looks like perhaps a JavaDoc fix (and a comment in solrconfig.xml)? The buffered size was never meant to be quite so large :-)

Mike - I think keeping the signed shift is the right thing to do... a zero-cost check against silent corruption.

But I'm not sure if 2048MiB is safe either... I'm not sure if one could overflow the number of buffers somehow as well (is every buffer except the last fully utilized?)
Parameter class and Java 5 Enums
Should the Parameter class be replaced with Java 5 enums?

My only concern is backward compatibility. I noticed that Parameter is serializable. Is this used by Lucene? I wasn't able to see any place that depended on it. The only public method, Parameter.toString(), results in the same value as a Java 5 enum.

It seems that an advanced form of enums would be helpful, too. I'm seeing a lot of "switch"-style if/else chains on their value, e.g. in AbstractField:

    if (store == Field.Store.YES) {
      this.isStored = true;
    } else if (store == Field.Store.NO) {
      this.isStored = false;
    } else
      throw new IllegalArgumentException("unknown store parameter " + store);

    if (index == Field.Index.NO) {
      this.isIndexed = false;
      this.isTokenized = false;
    } else if (index == Field.Index.ANALYZED) {
      this.isIndexed = true;
      this.isTokenized = true;
    } else if (index == Field.Index.NOT_ANALYZED) {
      this.isIndexed = true;
      this.isTokenized = false;
    } else if (index == Field.Index.NOT_ANALYZED_NO_NORMS) {
      this.isIndexed = true;
      this.isTokenized = false;
      this.omitNorms = true;
    } else if (index == Field.Index.ANALYZED_NO_NORMS) {
      this.isIndexed = true;
      this.isTokenized = true;
      this.omitNorms = true;
    } else {
      throw new IllegalArgumentException("unknown index parameter " + index);
    }

This could be reduced to:

    this.isStored = store.isStored();
    this.isIndexed = index.isIndexed();
    this.isTokenized = index.isTokenized();
    this.omitNorms = index.omitNorms();

With the following:

    public enum Store {
      YES {
        public boolean isStored() { return true; }
      },
      NO {
        public boolean isStored() { return false; }
      };
      // Determine whether this is stored or not
      public abstract boolean isStored();
    }

    public enum Index {
      ANALYZED {
        public boolean isIndexed() { return true; }
        public boolean isTokenized() { return true; }
        public boolean omitNorms() { return false; }
        ...
      },
      ...
      public abstract boolean isIndexed();
      public abstract boolean isTokenized();
      public abstract boolean omitNorms();
      ...
    }

What I like about this pattern is that it clearly documents what each member does. As it is, that knowledge is spread around the files.

One can add a "picker" method to these to serve as a factory: e.g. given indexed = true, tokenized = false, ..., return the appropriate value from the Index enum.

-- DM
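The "picker" idea from the message above could look roughly like the following. This is a sketch, not Lucene code: it keeps the flags in constructor fields rather than in per-constant method bodies (an alternative to the abstract-method form proposed in the message), the flag values are copied from the AbstractField if/else chain quoted there, and the `pick` method name is hypothetical.

```java
public class FieldOptionsSketch {

    public enum Store {
        YES(true), NO(false);

        private final boolean stored;

        Store(boolean stored) {
            this.stored = stored;
        }

        public boolean isStored() {
            return stored;
        }
    }

    public enum Index {
        // (indexed, tokenized, omitNorms), following the quoted if/else chain.
        NO(false, false, false),
        ANALYZED(true, true, false),
        NOT_ANALYZED(true, false, false),
        NOT_ANALYZED_NO_NORMS(true, false, true),
        ANALYZED_NO_NORMS(true, true, true);

        private final boolean indexed;
        private final boolean tokenized;
        private final boolean omitNorms;

        Index(boolean indexed, boolean tokenized, boolean omitNorms) {
            this.indexed = indexed;
            this.tokenized = tokenized;
            this.omitNorms = omitNorms;
        }

        public boolean isIndexed() { return indexed; }
        public boolean isTokenized() { return tokenized; }
        public boolean omitNorms() { return omitNorms; }

        // Hypothetical factory: returns the constant whose flags match exactly.
        public static Index pick(boolean indexed, boolean tokenized, boolean omitNorms) {
            for (Index candidate : values()) {
                if (candidate.indexed == indexed
                        && candidate.tokenized == tokenized
                        && candidate.omitNorms == omitNorms) {
                    return candidate;
                }
            }
            throw new IllegalArgumentException("no Index value for indexed=" + indexed
                    + ", tokenized=" + tokenized + ", omitNorms=" + omitNorms);
        }
    }

    public static void main(String[] args) {
        System.out.println(Index.pick(true, false, false));
        System.out.println(Index.pick(true, true, true));
        System.out.println(Store.YES.isStored());
    }
}
```

Combinations that no constant covers (for example tokenized but not indexed) are rejected here with an exception; whether a real API should reject them or map them to the nearest value is a design decision this sketch does not settle.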
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767548#action_12767548 ]

Uwe Schindler commented on LUCENE-1987:

To move back to my other problem: how should we handle the LUCENE_29 setting and the posIncr of stopwords together with a QueryParser whose default setting ignores posIncr?

The consequence is that a phrase query does not hit anything if you index with StandardAnalyzer=LUCENE_29 and QueryParser uses the same analyzer but with setEnablePositionIncrements(false) [the current default for QueryParser].

> Remove rest of analysis deprecations (Token, CharacterCache)
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
> Issue Type: Task
> Components: Analysis
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-1987-StopFilter-backport29.patch,
> LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch,
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch,
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch,
> LUCENE-1987.patch
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They
> are deprecated, but we still have the VERSION constants. Do not know how to
> proceed. Keep the settings alive for index compatibility? Or remove them
> together with the version constants (which were undeprecated).
[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767541#action_12767541 ]

Aaron McKee commented on LUCENE-1995:

I make no claims to the reasonableness of these settings, I only recently began efforts to tune our prototype. =)

useCompoundFile: false
mergeFactor: 10
maxBufferedDocs: 500
ramBufferSizeMB: 8192
maxFieldLength: 1
reopenReaders: true

My system has 24gb and my index is typically ~16gb, so I set some of these values a bit high. If the ram buffer is being indexed with an int, that could certainly be my issue; I feel a bit silly for not having thought of that, already. I'll try setting it down to 2048 and see if the problem disappears.
[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767532#action_12767532 ]

Michael McCandless commented on LUCENE-1995:

Spooky! It does look likely we overflowed int, because (1 + Integer.MAX_VALUE) >> 15 is -65536.
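The arithmetic in the comment above can be checked with a few lines of plain Java. The sketch assumes, for illustration only, that a byte address is turned into a buffer index by shifting right by 15; the method names are hypothetical and nothing here is copied from Lucene's indexing code. It also shows why a signed shift fails loudly (a negative array index) where an unsigned shift would quietly produce a plausible-looking index, and why a RAM buffer of 2048 MB or more cannot be addressed with an int.

```java
public class ShiftOverflowSketch {

    // Assumed shift, taken from the ">> 15" in the comment above.
    static final int BLOCK_SHIFT = 15;

    // Signed shift: an overflowed (negative) address yields a negative index.
    static int blockIndexSigned(int address) {
        return address >> BLOCK_SHIFT;
    }

    // Unsigned shift: the same overflowed address yields a positive index.
    static int blockIndexUnsigned(int address) {
        return address >>> BLOCK_SHIFT;
    }

    // True if a buffer of the given size in MB can be addressed with an int byte offset.
    static boolean fitsInInt(long megabytes) {
        return megabytes * 1024L * 1024L <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        int overflowed = Integer.MAX_VALUE + 1; // wraps around to Integer.MIN_VALUE

        System.out.println(blockIndexSigned(overflowed));   // -65536
        System.out.println(blockIndexUnsigned(overflowed)); // 65536

        System.out.println(fitsInInt(8192)); // false
        System.out.println(fitsInInt(2048)); // false: 2^31 is one more than Integer.MAX_VALUE
        System.out.println(fitsInInt(2047)); // true
    }
}
```

Using the negative result as an array index throws ArrayIndexOutOfBoundsException immediately, which matches the symptom in the issue title.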
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated LUCENE-1257:

Attachment: (was: LUCENE-1257-FieldValueHitQueue.patch)
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated LUCENE-1257:

Attachment: LUCENE-1257-TopDocsCollector.patch

* FieldValueHitQueue
* TopDocsCollector
* TopScoreDocsCollector
* TopFieldHitsCollector
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Kay updated LUCENE-1257: Attachment: LUCENE-1257-FieldValueHitQueue.patch > Port to Java5 > - > > Key: LUCENE-1257 > URL: https://issues.apache.org/jira/browse/LUCENE-1257 > Project: Lucene - Java > Issue Type: Improvement > Components: Analysis, Examples, Index, Other, Query/Scoring, > QueryParser, Search, Store, Term Vectors >Affects Versions: 3.0 >Reporter: Cédric Champeau >Assignee: Uwe Schindler >Priority: Minor > Fix For: 3.0 > > Attachments: instantiated_fieldable.patch, > LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, > LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, > LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, > LUCENE-1257-CompoundFileReaderWriter.patch, > LUCENE-1257-ConcurrentMergeScheduler.patch, > LUCENE-1257-DirectoryReader.patch, > LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, > LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, > LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldValueHitQueue.patch, > LUCENE-1257-IndexDeleter.patch, > LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, > LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, > LUCENE-1257-org_apache_lucene_document.patch, > LUCENE-1257-org_apache_lucene_document.patch, > LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, > LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, > LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, > LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, > LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, > LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, > LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, > LUCENE-1257_o_a_l_search_spans.patch, > LUCENE-1257_org_apache_lucene_index.patch, > 
LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch,
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know
> Java 5 migration had been planned for 2.1 at some point in the past, but I
> don't know when it is planned now. This patch against the trunk includes:
> - most obvious generics usage (there are tons of usages of sets, ... Those
>   which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is, in my opinion, much more readable with those features (you
> actually *know* what is stored in collections when reading the code, without
> the need to look up field definitions every time) and it simplifies many
> algorithms.
> Note that this patch also includes an interface for the Query class. This has
> been done for my company's needs for building custom Query classes which add
> some behaviour to the base Lucene queries. It prevents multiple unnecessary
> casts. I know this introduction is not wanted by the team, but it really
> makes our developments easier to maintain. If you don't want to use this,
> replace all /Queriable/ calls with standard /Query/.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
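To make the items in the issue description concrete, here is a minimal sketch of the kind of rewrite a Java 5 port involves: a raw collection with an indexed loop, a cast and an explicit unboxing call, next to the generified for-each version. The class and method names (Java5PortSketch, sumLegacy, sumGeneric) are invented for illustration and are not taken from any of the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

public class Java5PortSketch {

    // Pre-Java 5 style: raw List, indexed loop, cast, explicit intValue() call.
    @SuppressWarnings("rawtypes")
    static int sumLegacy(List values) {
        int total = 0;
        for (int i = 0; i < values.size(); i++) {
            Integer boxed = (Integer) values.get(i);
            total += boxed.intValue();
        }
        return total;
    }

    // Java 5 style: the element type is visible in the signature, the loop is
    // a for-each, and the compiler unboxes each Integer automatically.
    static int sumGeneric(List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> counts = new ArrayList<Integer>();
        counts.add(3);
        counts.add(5);
        counts.add(8);
        System.out.println(sumLegacy(counts));
        System.out.println(sumGeneric(counts));
    }
}
```

Both methods return the same value. The difference is only in readability: in the second one the reader can see from the signature what the list holds.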
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Kay updated LUCENE-1257: Attachment: LUCENE-1257-FieldCacheImpl.patch > Port to Java5 > - > > Key: LUCENE-1257 > URL: https://issues.apache.org/jira/browse/LUCENE-1257 > Project: Lucene - Java > Issue Type: Improvement > Components: Analysis, Examples, Index, Other, Query/Scoring, > QueryParser, Search, Store, Term Vectors >Affects Versions: 3.0 >Reporter: Cédric Champeau >Assignee: Uwe Schindler >Priority: Minor > Fix For: 3.0 > > Attachments: instantiated_fieldable.patch, > LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, > LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, > LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, > LUCENE-1257-CompoundFileReaderWriter.patch, > LUCENE-1257-ConcurrentMergeScheduler.patch, > LUCENE-1257-DirectoryReader.patch, > LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, > LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, > LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-IndexDeleter.patch, > LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, > LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, > LUCENE-1257-org_apache_lucene_document.patch, > LUCENE-1257-org_apache_lucene_document.patch, > LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, > LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, > LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, > LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, > LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, > LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, > LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, > LUCENE-1257_o_a_l_search_spans.patch, > LUCENE-1257_org_apache_lucene_index.patch, > LUCENE-1257_org_apache_lucene_index.patch, 
lucene1257surround1.patch,
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know
> Java 5 migration had been planned for 2.1 at some point in the past, but I
> don't know when it is planned now. This patch against the trunk includes:
> - most obvious generics usage (there are tons of usages of sets, ... Those
>   which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is, in my opinion, much more readable with those features (you
> actually *know* what is stored in collections when reading the code, without
> the need to look up field definitions every time) and it simplifies many
> algorithms.
> Note that this patch also includes an interface for the Query class. This has
> been done for my company's needs for building custom Query classes which add
> some behaviour to the base Lucene queries. It prevents multiple unnecessary
> casts. I know this introduction is not wanted by the team, but it really
> makes our developments easier to maintain. If you don't want to use this,
> replace all /Queriable/ calls with standard /Query/.
Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
On Mon, Oct 19, 2009 at 4:00 PM, Yonik Seeley wrote:
> On Mon, Oct 19, 2009 at 3:45 PM, Mark Miller wrote:
>> but there is some old source code here and
>> there that really bugs me
>
> Is it Doug's
>
>   if (foo)
>     bar()
>   else {
>     baz();
>   }
>
> or is it my single line
>
>   if (a==null) return 0;
>
> ;-)

Or my always doing this up until a while ago:

  if (foo)
    something;

but then suddenly [trying to] switch to the correct:

  if (foo) {
    something;
  }

?

> One of my personal pet peeves is more indentation than necessary for
> large blocks of code, rather than just immediately handling the
> exception cases and escaping. Example:

Hmm I think I tend to do this :) But I agree, your way IS more readable
so I'll try to switch!

Mike
Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
On Mon, Oct 19, 2009 at 3:45 PM, Mark Miller wrote:
> but there is some old source code here and
> there that really bugs me

Is it Doug's

  if (foo)
    bar()
  else {
    baz();
  }

or is it my single line

  if (a==null) return 0;

;-)

One of my personal pet peeves is more indentation than necessary for
large blocks of code, rather than just immediately handling the
exception cases and escaping. Example:

  void doSomething(MyObj obj) {
    if (obj != null) {
      // at this point, I'm wondering... hmmm, is there code that executes
      // *after* this huge "if" in the event that obj is null?
      [...]
      // same with this one... ya gotta go and try to match up braces to see
      // if there is code that executes in the opposite case...
      // and if it also falls through to execute the obj==null case or
      // simply returns.
      if (some other condition) {
        [ tons of code ]
        [ tons of code ]
      }
    }
  }

A much more readable version (regardless of whether one likes the
single-line syntax or not):

  void doSomething(MyObj obj) {
    if (obj==null) return;  // immediately obvious handling of the exception case
    [...]
    if (!some other condition) return;  // again, immediately obvious how the exception case was handled
    [ tons of code ]
    [ tons of code ]
  }

-Yonik
http://www.lucidimagination.com
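The two shapes contrasted in the message above can be put side by side in compilable form. The following is only a sketch: the class name, method names and the "skipped"/"indexed:" strings are made up for the example and do not come from Lucene.

```java
public class GuardClauseSketch {

    // Nested style: the real work sits two levels deep, and the reader has to
    // match braces to find out what happens when a condition fails.
    static String describeNested(String term, int maxLength) {
        String result = "skipped";
        if (term != null) {
            if (term.length() <= maxLength) {
                result = "indexed:" + term.toLowerCase();
            }
        }
        return result;
    }

    // Guard-clause style: each exceptional case is handled and left
    // immediately, so the main path stays at the top indentation level.
    static String describeFlat(String term, int maxLength) {
        if (term == null) return "skipped";
        if (term.length() > maxLength) return "skipped";
        return "indexed:" + term.toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(describeNested("Lucene", 10));
        System.out.println(describeFlat("Lucene", 10));
        System.out.println(describeFlat(null, 10));
    }
}
```

Both methods behave identically for every input. Only the control flow is arranged differently, which is the point being made in the thread.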
[jira] Commented: (LUCENE-1996) EnwikiContentSource isn't thread safe
[ https://issues.apache.org/jira/browse/LUCENE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767492#action_12767492 ] Michael McCandless commented on LUCENE-1996: That IS really crazy. > EnwikiContentSource isn't thread safe > - > > Key: LUCENE-1996 > URL: https://issues.apache.org/jira/browse/LUCENE-1996 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/benchmark >Reporter: Michael McCandless >Priority: Minor > Fix For: 3.1 > > > When I run this alg: > {code} > analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer > content.source=org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource > docs.file=/x/lucene/enwiki-20090724-pages-articles.xml.bz2 > doc.tokenized = false > ram.flush.mb=32.0 > doc.stored = false > doc.term.vector = false > log.step.AddDoc=1 > directory=FSDirectory > autocommit=false > compound=false > work.dir=/lucene/work.wiki.nd0.02M > { "BuildIndex" > - CreateIndex > [ { "AddDocs" AddDoc > : 1 } : 2 > - CloseIndex > } > RepSumByPrefRound BuildIndex > {code} > I hit exceptions in each thread like this: > {code} > Exception in thread "Thread-2" java.lang.RuntimeException: > org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" > associated with an element type "mdiiki". > at > org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:189) > at java.lang.Thread.run(Thread.java:613) > Caused by: org.xml.sax.SAXParseException: Open quote is expected for > attribute "msxi" associated with an element type "mdiiki". 
> at > com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236) > at > com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215) > at > com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386) > at > com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316) > at > com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1441) > at > com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:802) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:578) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:222) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(XMLNSDocumentScannerImpl.java:779) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1794) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764) > at > com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148) > at > com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242) > at > org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:166) > ... 1 more > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
On Mon, Oct 19, 2009 at 3:11 PM, Mark Miller wrote:
> Uwe Schindler (JIRA) wrote:
>>
>> And please: next time when we deprecate APIs: remove all deprecated calls
>> from tests and contrib and mark all deprecated-test as such!
>>
> Its the nature of open source. Each of us takes the work that other
> contributors are willing/able/havetime to provide - and fill in the rest
> ourselves or decide its too much work and don't. I agree that its a nice
> idea, but I don't think the issue is going away so easily myself ;) In
> which case it falls to the poor soul who decides to help later and
> remove the deprecated methods. Or perhaps it keeps someone from stepping
> up and doing that - nature of the beast.

I do agree this is the nature of the beast.

Also, thinking more about it... I think a good approach, for an issue
with a large number of deprecations, might be to open a separate issue
to fix the deprecations in contrib/test, and fix it after some delay.
This way we confirm that deprecated usage of the APIs is working, for
at least some time, before removing them all from the tests.

EG in LUCENE-1458 I waited until quite late to cutover usage to the
flex API.

> But as long as we are making such requests, please no one commit any
> more funky source formatting either :) It hurts my eyes.

+1!

Mike
[jira] Commented: (LUCENE-1996) EnwikiContentSource isn't thread safe
[ https://issues.apache.org/jira/browse/LUCENE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767487#action_12767487 ] Mark Miller commented on LUCENE-1996: - The scary part is that its been around for some time and we both independently hit it today ... quantum mechanics in action I guess ... > EnwikiContentSource isn't thread safe > - > > Key: LUCENE-1996 > URL: https://issues.apache.org/jira/browse/LUCENE-1996 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/benchmark >Reporter: Michael McCandless >Priority: Minor > Fix For: 3.1 > > > When I run this alg: > {code} > analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer > content.source=org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource > docs.file=/x/lucene/enwiki-20090724-pages-articles.xml.bz2 > doc.tokenized = false > ram.flush.mb=32.0 > doc.stored = false > doc.term.vector = false > log.step.AddDoc=1 > directory=FSDirectory > autocommit=false > compound=false > work.dir=/lucene/work.wiki.nd0.02M > { "BuildIndex" > - CreateIndex > [ { "AddDocs" AddDoc > : 1 } : 2 > - CloseIndex > } > RepSumByPrefRound BuildIndex > {code} > I hit exceptions in each thread like this: > {code} > Exception in thread "Thread-2" java.lang.RuntimeException: > org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" > associated with an element type "mdiiki". > at > org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:189) > at java.lang.Thread.run(Thread.java:613) > Caused by: org.xml.sax.SAXParseException: Open quote is expected for > attribute "msxi" associated with an element type "mdiiki". 
> at > com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236) > at > com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215) > at > com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386) > at > com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316) > at > com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1441) > at > com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:802) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:578) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:222) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(XMLNSDocumentScannerImpl.java:779) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1794) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764) > at > com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148) > at > com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242) > at > org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:166) > ... 1 more > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
[jira] Resolved: (LUCENE-1996) EnwikiContentSource isn't thread safe
[ https://issues.apache.org/jira/browse/LUCENE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1996. Resolution: Duplicate Duh, yes, dup. Must read email before opening issues ;) > EnwikiContentSource isn't thread safe > - > > Key: LUCENE-1996 > URL: https://issues.apache.org/jira/browse/LUCENE-1996 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/benchmark >Reporter: Michael McCandless >Priority: Minor > Fix For: 3.1 > > > When I run this alg: > {code} > analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer > content.source=org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource > docs.file=/x/lucene/enwiki-20090724-pages-articles.xml.bz2 > doc.tokenized = false > ram.flush.mb=32.0 > doc.stored = false > doc.term.vector = false > log.step.AddDoc=1 > directory=FSDirectory > autocommit=false > compound=false > work.dir=/lucene/work.wiki.nd0.02M > { "BuildIndex" > - CreateIndex > [ { "AddDocs" AddDoc > : 1 } : 2 > - CloseIndex > } > RepSumByPrefRound BuildIndex > {code} > I hit exceptions in each thread like this: > {code} > Exception in thread "Thread-2" java.lang.RuntimeException: > org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" > associated with an element type "mdiiki". > at > org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:189) > at java.lang.Thread.run(Thread.java:613) > Caused by: org.xml.sax.SAXParseException: Open quote is expected for > attribute "msxi" associated with an element type "mdiiki". 
> at > com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236) > at > com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215) > at > com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386) > at > com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316) > at > com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1441) > at > com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:802) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:578) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:222) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(XMLNSDocumentScannerImpl.java:779) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1794) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764) > at > com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148) > at > com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242) > at > org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:166) > ... 1 more > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
Uwe Schindler wrote:
>> Uwe Schindler (JIRA) wrote:
>>
>>> And please: next time when we deprecate APIs: remove all deprecated
>>> calls from tests and contrib and mark all deprecated-test as such!
>>
>> Its the nature of open source. Each of us takes the work that other
>> contributors are willing/able/havetime to provide - and fill in the rest
>> ourselves or decide its too much work and don't. I agree that its a nice
>> idea, but I don't think the issue is going away so easily myself ;) In
>> which case it falls to the poor soul who decides to help later and
>> remove the deprecated methods. Or perhaps it keeps someone from stepping
>> up and doing that - nature of the beast.
>
> Sorry, I was disappointed and somehow angry because nothing worked as
> expected when I removed the deprecated parts. I fixed one thing and 5 other
> problems appeared.

Ha - no reason to be sorry - I agree it would be nice - just saying good
luck getting everyone to fall in line in the future :)

>> But as long as we are making such requests, please no one commit any
>> more funky source formatting either :) It hurts my eyes.
>
> What was funky?
>
> I think I should stop working today and do something other...

Ha again :) I actually reworded that because the first time I wrote it I
thought it sounded like I was saying you did it - guess I failed :) I was
commenting in general, not about you - I don't think anything too bad has
gotten in for some time - but there is some old source code here and
there that really bugs me - totally unrelated to your comment - just
adding a wish of my own - no more ugly source code :) !

> Uwe

--
- Mark

http://www.lucidimagination.com
[jira] Assigned: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller reassigned LUCENE-1486:
---
    Assignee: (was: Mark Miller)

> Wildcards, ORs etc inside Phrase queries
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Java
> Issue Type: Improvement
> Components: QueryParser
> Affects Versions: 2.4
> Reporter: Mark Harwood
> Priority: Minor
> Fix For: 3.0, 3.1
>
> Attachments: ComplexPhraseQueryParser.java,
> junit_complex_phrase_qp_07_21_2009.patch,
> junit_complex_phrase_qp_07_22_2009.patch,
> Lucene-1486 non default field.patch, LUCENE-1486.patch, LUCENE-1486.patch,
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch,
> TestComplexPhraseQuery.java
>
> An extension to the default QueryParser that overrides the parsing of
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in
> QueryParser itself. This works as a proof of concept for much of the query
> parser syntax. Examples from the Junit test include:
>     checkMatches("\"j* smyth~\"", "1,2"); //wildcards and fuzzies are OK in phrases
>     checkMatches("\"(jo* -john) smith\"", "2"); // boolean logic works
>     checkMatches("\"jo* smith\"~2", "1,2,3"); // position logic works.
>
>     checkBadQuery("\"jo* id:1 smith\""); //mixing fields in a phrase is bad
>     checkBadQuery("\"jo* \"smith\" \""); //phrases inside phrases is bad
>     checkBadQuery("\"jo* [sma TO smZ]\" \""); //range queries inside phrases not supported
> Code plus Junit test to follow...
RE: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
> Uwe Schindler (JIRA) wrote:
> >
> > And please: next time when we deprecate APIs: remove all deprecated
> > calls from tests and contrib and mark all deprecated-test as such!
> >
> Its the nature of open source. Each of us takes the work that other
> contributors are willing/able/havetime to provide - and fill in the rest
> ourselves or decide its too much work and don't. I agree that its a nice
> idea, but I don't think the issue is going away so easily myself ;) In
> which case it falls to the poor soul who decides to help later and
> remove the deprecated methods. Or perhaps it keeps someone from stepping
> up and doing that - nature of the beast.

Sorry, I was disappointed and somewhat angry because nothing worked as
expected when I removed the deprecated parts. I fixed one thing and 5 other
problems appeared.

> But as long as we are making such requests, please no one commit any
> more funky source formatting either :) It hurts my eyes.

What was funky?

I think I should stop working today and do something else...

Uwe
[jira] Commented: (LUCENE-1996) EnwikiContentSource isn't thread safe
[ https://issues.apache.org/jira/browse/LUCENE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767474#action_12767474 ] Mark Miller commented on LUCENE-1996: - dupe? LUCENE-1994 > EnwikiContentSource isn't thread safe > - > > Key: LUCENE-1996 > URL: https://issues.apache.org/jira/browse/LUCENE-1996 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/benchmark >Reporter: Michael McCandless >Priority: Minor > Fix For: 3.1 > > > When I run this alg: > {code} > analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer > content.source=org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource > docs.file=/x/lucene/enwiki-20090724-pages-articles.xml.bz2 > doc.tokenized = false > ram.flush.mb=32.0 > doc.stored = false > doc.term.vector = false > log.step.AddDoc=1 > directory=FSDirectory > autocommit=false > compound=false > work.dir=/lucene/work.wiki.nd0.02M > { "BuildIndex" > - CreateIndex > [ { "AddDocs" AddDoc > : 1 } : 2 > - CloseIndex > } > RepSumByPrefRound BuildIndex > {code} > I hit exceptions in each thread like this: > {code} > Exception in thread "Thread-2" java.lang.RuntimeException: > org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" > associated with an element type "mdiiki". > at > org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:189) > at java.lang.Thread.run(Thread.java:613) > Caused by: org.xml.sax.SAXParseException: Open quote is expected for > attribute "msxi" associated with an element type "mdiiki". 
> at > com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236) > at > com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215) > at > com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386) > at > com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316) > at > com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1441) > at > com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:802) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:578) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:222) > at > com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(XMLNSDocumentScannerImpl.java:779) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1794) > at > com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834) > at > com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764) > at > com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148) > at > com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242) > at > org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:166) > ... 1 more > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
[jira] Resolved: (LUCENE-1955) Fix Hits deprecation notice
[ https://issues.apache.org/jira/browse/LUCENE-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1955. - Resolution: Fixed > Fix Hits deprecation notice > --- > > Key: LUCENE-1955 > URL: https://issues.apache.org/jira/browse/LUCENE-1955 > Project: Lucene - Java > Issue Type: Bug > Components: Javadocs >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 2.9.1 > > > Just needs to be committed to 2.9 branch since hits is now removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767467#action_12767467 ] Yonik Seeley commented on LUCENE-1995: -- The point at the exception uses a signed shift instead of unsigned, but that shouldn't matter unless the buffer pool is huge? Aaron, what are your index settings (like ramBufferSizeMB?) > ArrayIndexOutOfBoundsException during indexing > -- > > Key: LUCENE-1995 > URL: https://issues.apache.org/jira/browse/LUCENE-1995 > Project: Lucene - Java > Issue Type: Bug > Components: Index >Affects Versions: 2.9 >Reporter: Yonik Seeley > Fix For: 2.9.1 > > > http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
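The signed versus unsigned shift distinction raised in the comment above can be shown in isolation. This is a sketch under stated assumptions, not the Lucene code in question: the class name, the helper methods and the BLOCK_SHIFT value of 15 are all hypothetical, chosen only to show when the two operators diverge.

```java
public class ShiftSketch {

    // Hypothetical block size of 2^15 bytes; not taken from the Lucene source.
    static final int BLOCK_SHIFT = 15;

    // Signed shift: the sign bit is copied in from the left, so a negative
    // address yields a negative block index.
    static int signedIndex(int address) {
        return address >> BLOCK_SHIFT;
    }

    // Unsigned shift: zeros are shifted in from the left, so the address is
    // treated as a 32-bit unsigned quantity.
    static int unsignedIndex(int address) {
        return address >>> BLOCK_SHIFT;
    }

    public static void main(String[] args) {
        int small = 3 * (1 << BLOCK_SHIFT) + 7;   // well inside the int range
        int wrapped = Integer.MIN_VALUE;          // what 2^31 becomes after int overflow

        System.out.println(signedIndex(small) + " " + unsignedIndex(small));
        System.out.println(signedIndex(wrapped) + " " + unsignedIndex(wrapped));
    }
}
```

For addresses below 2^31 the two operators agree, which matches the remark that the choice should not matter unless the pool is huge. Once an address has wrapped past Integer.MAX_VALUE, the signed shift produces a negative index, and using that as an array index would throw ArrayIndexOutOfBoundsException.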
[jira] Created: (LUCENE-1996) EnwikiContentSource isn't thread safe
EnwikiContentSource isn't thread safe
-------------------------------------

Key: LUCENE-1996
URL: https://issues.apache.org/jira/browse/LUCENE-1996
Project: Lucene - Java
Issue Type: Bug
Components: contrib/benchmark
Reporter: Michael McCandless
Priority: Minor
Fix For: 3.1

When I run this alg:

{code}
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
content.source=org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource
docs.file=/x/lucene/enwiki-20090724-pages-articles.xml.bz2
doc.tokenized = false
ram.flush.mb=32.0
doc.stored = false
doc.term.vector = false
log.step.AddDoc=1
directory=FSDirectory
autocommit=false
compound=false
work.dir=/lucene/work.wiki.nd0.02M

{ "BuildIndex"
  - CreateIndex
  [ { "AddDocs" AddDoc > : 1 } : 2
  - CloseIndex
}

RepSumByPrefRound BuildIndex
{code}

I hit exceptions in each thread like this:

{code}
Exception in thread "Thread-2" java.lang.RuntimeException: org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" associated with an element type "mdiiki".
	at org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:189)
	at java.lang.Thread.run(Thread.java:613)
Caused by: org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" associated with an element type "mdiiki".
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316)
	at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1441)
	at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:802)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:578)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:222)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(XMLNSDocumentScannerImpl.java:779)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1794)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
	at org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:166)
	... 1 more
{code}

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Resolved: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
[ https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved LUCENE-1929. - Resolution: Fixed Lucene Fields: [New, Patch Available] (was: [New]) > Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery > -- > > Key: LUCENE-1929 > URL: https://issues.apache.org/jira/browse/LUCENE-1929 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Affects Versions: 2.9 >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 2.9.1 > > Attachments: LUCENE-1929.patch > > > Sucks. Will throw a NullPointer exception. > Only NumericRangeQuery will throw the exception. > RangeQuery just won't highlight. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Created: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing
ArrayIndexOutOfBoundsException during indexing -- Key: LUCENE-1995 URL: https://issues.apache.org/jira/browse/LUCENE-1995 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.9 Reporter: Yonik Seeley Fix For: 2.9.1 http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767463#action_12767463 ] Robert Muir commented on LUCENE-1987: - bq. The problem is that this is not very different from saying "the onus is on the user to call the setXYZ method to get back to the old buggy behavior", which at least last time we discussed back-compat was controversial (ie, it's a change to our drop-in back-compat policy). Michael, yes I agree with you. What I am wondering is: is it really working in practice/in spirit? Forcing the user to supply the version, well it does make them look at the warning in the Version class, which is good. But nothing stops them from just using CURRENT. {noformat} Use this to get the latest & greatest settings, bug fixes, etc, for Lucene. {noformat} followed by the big bold warning about backwards compatibility. just curious what most users are doing, sacrificing drop-in for "latest and greatest?" I do think we should do things to improve contrib analyzers that are still stuck with this buggy behavior at some point: i.e LUCENE-1373. 
But maybe we don't need the Version with contrib analyzers, since you should be able to use an older lucene-analyzers jar file with new lucene if you want the back compat (sorry to stray somewhat off-topic) > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 2.9.1, 3.0 > > Attachments: LUCENE-1987-StopFilter-backport29.patch, > LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
Uwe Schindler (JIRA) wrote:
> And please: next time when we deprecate APIs: remove all deprecated calls
> from tests and contrib and mark all deprecated-test as such!

It's the nature of open source. Each of us takes the work that other contributors are willing/able/have time to provide - and we fill in the rest ourselves or decide it's too much work and don't. I agree that it's a nice idea, but I don't think the issue is going away so easily myself ;) In which case it falls to the poor soul who decides to help later and remove the deprecated methods. Or perhaps it keeps someone from stepping up and doing that - nature of the beast.

But as long as we are making such requests, please no one commit any more funky source formatting either :) It hurts my eyes.

--
- Mark
http://www.lucidimagination.com

To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767453#action_12767453 ] Robert Muir commented on LUCENE-1987: - bq. Ugh, this is because they embed StopFilter, right? One option might be to simply keep StopFilter's deprecated static methods for setting the default? Though I think adding Version to them over time is the right thing to do (though more work, today). not just this. Many use StandardTokenizer, so they have same invalid acronym, etc issues StandardAnalyzer has. But, this versioning/etc is all managed at StandardAnalyzer level (system properties, version numbers, etc)... when it also affects these other analyzers too. > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 2.9.1, 3.0 > > Attachments: LUCENE-1987-StopFilter-backport29.patch, > LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated). -- This message is automatically generated by JIRA. 
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767450#action_12767450 ] Michael McCandless commented on LUCENE-1987: bq. maybe the default should really be LUCENE_CURRENT, and if you want the back compat-buggy behavior, the onus is on you as the user to set the flag right if you don't want to reindex? The problem is that this is not very different from saying "the onus is on the user to call the setXYZ method to get back to the old buggy behavior", which at least last time we discussed back-compat was controversial (ie, it's a change to our drop-in back-compat policy). > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 2.9.1, 3.0 > > Attachments: LUCENE-1987-StopFilter-backport29.patch, > LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. 
Re: 2.9.1
OK, so now we're up to 3 2.9.1 issues to be resolved. Mike On Mon, Oct 19, 2009 at 1:56 PM, Uwe Schindler wrote: > Please wait and look at https://issues.apache.org/jira/browse/LUCENE-1987 > > We have some inconsistencies between QueryParser and the new > StandardAnalyzer with stop word posIncr. > > There is also a patch for 2.9 there! > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > >> -Original Message- >> From: Michael McCandless [mailto:luc...@mikemccandless.com] >> Sent: Monday, October 19, 2009 6:03 PM >> To: java-dev@lucene.apache.org; yo...@lucidimagination.com >> Subject: Re: 2.9.1 >> >> On Mon, Oct 19, 2009 at 11:54 AM, Yonik Seeley >> wrote: >> > On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless >> > wrote: >> >> I can cut the 2.9.1 release, but... should we wait a bit to see >> >> whether other issues come up? Or do it, now? >> > >> > Other issues came up, and were quickly fixed - nice job guys!. >> > I don't see anything else serious lurking about... seems like the >> > 2.9.1 release process could be started soon? >> >> +1, I'll try to get an RC out tomorrow. >> >> Mike >> >> - >> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-dev-h...@lucene.apache.org > > > > - > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767449#action_12767449 ] Michael McCandless commented on LUCENE-1987: bq. All contrib analyzers have stopWordPosIncr turned off (backwards compatibility). Maybe we need a Version Parameter in all analyzers there too! Ugh, this is because they embed StopFilter, right? One option might be to simply keep StopFilter's deprecated static methods for setting the default? Though I think adding Version to them over time is the right thing to do (though more work, today). bq. benchmark does not work any longer, because StandardAnalyzer has no default ctor anymore and cannot be instantiated by reflection, same with StopAnalyzer When the no-arg ctor is unavailable, can we fallback to looking for a ctor that takes Version? For now we should just pass LUCENE_CURRENT; a future enhancement to benchmark can allow specifying version compat. bq. The default of QueryParser is to ignore position increments, but the current version of StandardAnalyzer uses posIncr for stop words Hmm. How about adding Version to QP ctor? bq. And please: next time when we deprecate APIs: remove all deprecated calls from tests and contrib and mark all deprecated-test as such! OK, I agree. I'll try to do this in the future! 
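The "fall back to a ctor that takes Version" idea could be sketched roughly as below. This is only an illustration under stated assumptions: `Version`, `PlainAnalyzer`, and `VersionedAnalyzer` are stand-in types defined here, not Lucene's classes, and the real benchmark code may instantiate analyzers differently.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch: all nested types are stand-ins, not Lucene classes.
final class AnalyzerFactoryDemo {
    /** Stand-in for a version-compatibility enum. */
    enum Version { V_24, V_29, CURRENT }

    /** Stand-in analyzer that still has a no-arg constructor. */
    static final class PlainAnalyzer {
        public PlainAnalyzer() {}
    }

    /** Stand-in analyzer that can only be built with a Version. */
    static final class VersionedAnalyzer {
        final Version matchVersion;
        public VersionedAnalyzer(Version matchVersion) {
            this.matchVersion = matchVersion;
        }
    }

    /** Tries the no-arg constructor first, then one taking Version (passing CURRENT). */
    static <T> T newInstance(Class<T> clazz) {
        try {
            try {
                return clazz.getConstructor().newInstance();
            } catch (NoSuchMethodException noDefaultCtor) {
                Constructor<T> ctor = clazz.getConstructor(Version.class);
                return ctor.newInstance(Version.CURRENT);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + clazz.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(newInstance(PlainAnalyzer.class).getClass().getSimpleName());
        System.out.println(newInstance(VersionedAnalyzer.class).matchVersion);
    }
}
```

Passing a fixed `CURRENT` value mirrors the "for now" suggestion in the comment; making the version configurable would be the later enhancement it mentions.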
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767435#action_12767435 ]

Robert Muir commented on LUCENE-1987:

bq. All contrib analyzers have stopWordPosIncr turned off (backwards compatibility). Maybe we need a Version Parameter in all analyzers there too!

Personally I would not be against this, not sure yet... The downside would be more complexity and maintenance. The upside would be that we could improve these analyzers in various ways, without annoying users.

bq. benchmark does not work any longer, because StandardAnalyzer has no default ctor anymore and cannot be instantiated by reflection, same with StopAnalyzer

I also personally like having a default ctor... it's convenient and nice to be able to look at what these analyzers do in Luke, etc. But I think this goes against the version flag concept? (Because if users just set it to LUCENE_CURRENT then it's doing nothing?) But I wonder if users do this anyway... maybe the default should really be LUCENE_CURRENT, and if you want the back compat-buggy behavior, the onus is on you as the user to set the flag right if you don't want to reindex?
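To make the version flag concept concrete, here is a small sketch of how a behaviour default can be derived from a compatibility version. The enum, its constants, and `defaultStopPosIncr` are hypothetical names chosen for this illustration; they only mimic the idea discussed in this thread.

```java
// Hypothetical sketch of a version-gated default; names are illustrative only.
final class VersionFlagDemo {
    enum Version {
        V_24, V_29, CURRENT;

        /** True if this version is the same as or newer than the other one. */
        boolean onOrAfter(Version other) {
            return compareTo(other) >= 0;
        }
    }

    /** Old versions keep the old behaviour; newer ones get the fixed behaviour. */
    static boolean defaultStopPosIncr(Version matchVersion) {
        return matchVersion.onOrAfter(Version.V_29);
    }

    public static void main(String[] args) {
        for (Version v : Version.values()) {
            System.out.println(v + " -> posIncr default " + defaultStopPosIncr(v));
        }
    }
}
```

The trade-off raised above is visible here: a caller who always passes `CURRENT` gets whatever the newest behaviour is, so the flag protects an existing index only when a fixed version is pinned.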
[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1987: -- Fix Version/s: 2.9.1 > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 2.9.1, 3.0 > > Attachments: LUCENE-1987-StopFilter-backport29.patch, > LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
RE: 2.9.1
Please wait and look at https://issues.apache.org/jira/browse/LUCENE-1987 We have some inconsistencies between QueryParser and the new StandardAnalyzer with stop word posIncr. There is also a patch for 2.9 there! - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Monday, October 19, 2009 6:03 PM > To: java-dev@lucene.apache.org; yo...@lucidimagination.com > Subject: Re: 2.9.1 > > On Mon, Oct 19, 2009 at 11:54 AM, Yonik Seeley > wrote: > > On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless > > wrote: > >> I can cut the 2.9.1 release, but... should we wait a bit to see > >> whether other issues come up? Or do it, now? > > > > Other issues came up, and were quickly fixed - nice job guys!. > > I don't see anything else serious lurking about... seems like the > > 2.9.1 release process could be started soon? > > +1, I'll try to get an RC out tomorrow. > > Mike > > - > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-dev-h...@lucene.apache.org - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-1987:

Attachment: LUCENE-1987-StopFilter-backport29.patch LUCENE-1987-StopFilter-BW.patch LUCENE-1987-StopFilter.patch

Here are 2 mega patches and one backport to 2.9 (want to get this in before 2.9.1):

All core tests pass, all bw tests pass. Most contrib tests also pass, but we have the following problems and inconsistencies:

- benchmark does not work any longer, because StandardAnalyzer has no default ctor anymore and cannot be instantiated by reflection, same with StopAnalyzer
- Highlighter only works if StandardAnalyzer is in 2.4 mode; in 2.9 mode (current) it fails because the position increments of stop words are not correctly respected. This fails in addition/combination with the following:
- Very bad inconsistency: The default of QueryParser is to ignore position increments, but the current version of StandardAnalyzer uses posIncr for stop words -> bang. We should change the default for QueryParser (+ contrib QP), too. Much rework is needed, and much documentation. The tests in core now pass, as most parts use StandardAnalyzer in 2.9 mode but have no stop words. And the special tests explicitly set the posIncr flag. This is totally messed up, it needs fixing! (It also affects 2.9.0, if somebody uses the new StandardAnalyzer with LUCENE_CURRENT.)
- XMLQueryParser also fails with the latest StandardAnalyzer version, because it cannot set the flag in QueryParser. In my opinion, the query parser should take the flag from the analyzer, but this is not easy to fix.
- All contrib analyzers have stopWordPosIncr turned off (backwards compatibility). Maybe we need a Version Parameter in all analyzers there too!

What to do? After this StopFilter/StandardAnalyzer-hell-day Aspirin and Paracetamol and beer is not enough to think clearly again...
And please: next time when we deprecate APIs: remove all deprecated calls from tests and contrib and mark all deprecated-test as such! > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter-backport29.patch, > LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
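The QueryParser/StandardAnalyzer inconsistency described in this message comes down to two sides assigning different token positions to the same text. The sketch below simulates that with a toy whitespace tokenizer and stop list; everything in it (class name, stop words, method) is an assumption for illustration and does not use Lucene's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a toy stop filter, not Lucene's StopFilter.
final class PosIncrDemo {
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "of", "a"));

    /** Returns the position assigned to each token that survives stop-word removal. */
    static List<Integer> positions(String text, boolean enablePosIncr) {
        List<Integer> result = new ArrayList<>();
        int position = -1;
        int skipped = 0;
        for (String token : text.trim().toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (STOP_WORDS.contains(token)) {
                skipped++; // remember the gap left by the removed stop word
                continue;
            }
            // With increments enabled, the gap is preserved in the position.
            position += 1 + (enablePosIncr ? skipped : 0);
            skipped = 0;
            result.add(position);
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "lord of the rings";
        System.out.println("index side (posIncr on):  " + positions(text, true));
        System.out.println("query side (posIncr off): " + positions(text, false));
    }
}
```

If the index is built with gaps but a phrase query is built without them, the two surviving terms are expected at adjacent positions that do not exist in the index, which is the kind of mismatch the message reports.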
[jira] Commented: (LUCENE-1955) Fix Hits deprecation notice
[ https://issues.apache.org/jira/browse/LUCENE-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767397#action_12767397 ] Mark Miller commented on LUCENE-1955: - Sorry again ;) I'm slowing everything up - feel free - if you don't, I'll do it when I commit the Highlighter fix in a bit. Just have to throw my noisy laptop out the window and into a brick wall first ... > Fix Hits deprecation notice > --- > > Key: LUCENE-1955 > URL: https://issues.apache.org/jira/browse/LUCENE-1955 > Project: Lucene - Java > Issue Type: Bug > Components: Javadocs >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 2.9.1 > > > Just needs to be committed to 2.9 branch since hits is now removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
[ https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767394#action_12767394 ] Mark Miller commented on LUCENE-1929: - Yeah - sorry - has been for some time. I can commit it shortly. > Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery > -- > > Key: LUCENE-1929 > URL: https://issues.apache.org/jira/browse/LUCENE-1929 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Affects Versions: 2.9 >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 2.9.1 > > Attachments: LUCENE-1929.patch > > > Sucks. Will throw a NullPointer exception. > Only NumericRangeQuery will throw the exception. > RangeQuery just won't highlight. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
[ https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767392#action_12767392 ]

Michael McCandless commented on LUCENE-1929:

Mark, is this one ready to go into 2.9.1?

> Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
>
> Key: LUCENE-1929
> URL: https://issues.apache.org/jira/browse/LUCENE-1929
> Project: Lucene - Java
> Issue Type: Bug
> Components: contrib/highlighter
> Affects Versions: 2.9
> Reporter: Mark Miller
> Assignee: Mark Miller
> Fix For: 2.9.1
>
> Attachments: LUCENE-1929.patch
>
> Sucks. Will throw a NullPointer exception.
> Only NumericRangeQuery will throw the exception.
> RangeQuery just won't highlight.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1955) Fix Hits deprecation notice
[ https://issues.apache.org/jira/browse/LUCENE-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767391#action_12767391 ] Michael McCandless commented on LUCENE-1955: Mark do you want to commit this? Or I can. Wanting to cut an RC tomorrow... > Fix Hits deprecation notice > --- > > Key: LUCENE-1955 > URL: https://issues.apache.org/jira/browse/LUCENE-1955 > Project: Lucene - Java > Issue Type: Bug > Components: Javadocs >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 2.9.1 > > > Just needs to be committed to 2.9 branch since hits is now removed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1986) NPE in NearSpansUnordered from PayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767373#action_12767373 ] Peter Keegan commented on LUCENE-1986: -- + if (!more) { +return false; + } I was about to submit this same patch today, but I see you beat me to it :) Thanks Mark. > NPE in NearSpansUnordered from PayloadNearQuery > --- > > Key: LUCENE-1986 > URL: https://issues.apache.org/jira/browse/LUCENE-1986 > Project: Lucene - Java > Issue Type: Bug > Components: Search >Affects Versions: 2.9 >Reporter: Peter Keegan >Assignee: Michael McCandless > Fix For: 2.9.1, 3.0 > > Attachments: LUCENE-1986.patch, LUCENE-1986.patch, > TestPayloadNearQuery1.java > > > The following query causes a NPE in NearSpansUnordered, and is reproducible > with the the attached unit test. The failure occurs on the last document > scored. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Commented: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks
I don't think some of the stat tracking works right with parallel either - to get the total time, it's adding up when each thread finished - e.g. if thread one finishes at second 30 and thread two at second 32, it's saying it took 62 seconds total.

     [java] > algorithm:
     [java] Seq {
     [java]     Rounds_2 {
     [java]         ResetSystemErase
     [java]         Populate {
     [java]             CreateIndex
     [java]             Par_8 [
     [java]                 MAddDocs_2500 {
     [java]                     AddDoc
     [java]                 } * 2500
     [java]             ] * 8
     [java]             Optimize
     [java]             CommitIndex
     [java]             CloseIndex
     [java]         }
     [java]         RepSumByPref MAddDocs
     [java]         NewRound
     [java]     } * 2
     [java]     RepSumByNameRound
     [java]     RepSumByName
     [java]     RepSumByPrefRound MAddDocs
     [java] }
     [java] > starting task: Seq
     [java] > starting task: Rounds_2
     [java] > starting task: ResetSystemErase
     [java] > starting task: Populate
     [java] 55.84 sec --> Thread-2 added 2000 docs
     [java] 60.94 sec --> Thread-6 added 2000 docs
     [java] 74.82 sec --> Thread-0 added 2000 docs
     [java] 77.48 sec --> Thread-3 added 2000 docs
     [java] 81.21 sec --> Thread-1 added 2000 docs
     [java] 90.72 sec --> Thread-5 added 2000 docs
     [java] 96.46 sec --> Thread-7 added 2000 docs
     [java] 97.17 sec --> Thread-4 added 2000 docs
     [java] > Report Sum By Prefix (MAddDocs) (1 about 8 out of 20016)
     [java] Operation      round  mrg  flush  cmpnd  runCnt  recsPerRun  rec/s  elapsedSec  avgUsedMem   avgTotalMem
     [java] MAddDocs_2500  0      20   48.00  false  8       2500        28.01  713.99      135,359,120  273,850,368

Shai Erera (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/LUCENE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767343#action_12767343 ]
>
> Shai Erera commented on LUCENE-1994:
>
> Yes I agree (to both comments). Basically for a ContentSource to be supported
> by parallel tasks, its getNextDocData should be made synchronized, or it
> finds another way to sync on the important stuff (for example
> TrecContentSource).
> > >> EnwikiConentSource does not work with parallel tasks >> >> >> Key: LUCENE-1994 >> URL: https://issues.apache.org/jira/browse/LUCENE-1994 >> Project: Lucene - Java >> Issue Type: Bug >> Components: contrib/benchmark >>Affects Versions: 2.9 >>Reporter: Mark Miller >>Priority: Minor >> >> > > > -- - Mark http://www.lucidimagination.com - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
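To illustrate the reporting problem described in this message, here is a minimal sketch of the difference between summing per-thread finish times and taking the wall-clock time of a parallel block. `ParallelElapsed` and its methods are hypothetical names for illustration only; they are not part of contrib/benchmark, and the sketch assumes all threads start at second 0.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of contrib/benchmark. */
final class ParallelElapsed {

    /** Adds up each thread's finish time, which is what the report appears to do. */
    static double summed(List<Double> finishSeconds) {
        double total = 0;
        for (double s : finishSeconds) {
            total += s;
        }
        return total;
    }

    /** Wall-clock time of the parallel block: the finish time of the slowest thread. */
    static double wallClock(List<Double> finishSeconds) {
        double max = 0;
        for (double s : finishSeconds) {
            max = Math.max(max, s);
        }
        return max;
    }

    public static void main(String[] args) {
        // The two-thread example from the message: finish times of 30 s and 32 s.
        List<Double> finish = Arrays.asList(30.0, 32.0);
        System.out.println("summed    = " + summed(finish));
        System.out.println("wallClock = " + wallClock(finish));
    }
}
```

With the two-thread example from the message, the summed figure is 62 seconds while the wall-clock figure is 32 seconds, which is presumably closer to what a reader of the report expects as elapsed time.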
Re: 2.9.1
On Mon, Oct 19, 2009 at 11:54 AM, Yonik Seeley wrote:
> On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless wrote:
>> I can cut the 2.9.1 release, but... should we wait a bit to see
>> whether other issues come up? Or do it, now?
>
> Other issues came up, and were quickly fixed - nice job guys!
> I don't see anything else serious lurking about... seems like the
> 2.9.1 release process could be started soon?

+1, I'll try to get an RC out tomorrow.

Mike
Re: 2.9.1
On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless wrote:
> I can cut the 2.9.1 release, but... should we wait a bit to see
> whether other issues come up? Or do it, now?

Other issues came up, and were quickly fixed - nice job guys!
I don't see anything else serious lurking about... seems like the
2.9.1 release process could be started soon?

-Yonik
http://www.lucidimagination.com
[jira] Commented: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks
[ https://issues.apache.org/jira/browse/LUCENE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767343#action_12767343 ] Shai Erera commented on LUCENE-1994: Yes I agree (to both comments). Basically for a ContentSource to be supported by parallel tasks, its getNextDocData should be made synchronized, or it finds another way to sync on the important stuff (for example TrecContentSource). > EnwikiConentSource does not work with parallel tasks > > > Key: LUCENE-1994 > URL: https://issues.apache.org/jira/browse/LUCENE-1994 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/benchmark >Affects Versions: 2.9 >Reporter: Mark Miller >Priority: Minor
[jira] Commented: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks
[ https://issues.apache.org/jira/browse/LUCENE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767341#action_12767341 ] Mark Miller commented on LUCENE-1994: - bq. I believe this was the original behavior of EnwikiDocMaker Probably - but we should make it work right? bq. But anyway, I think that if getNextDocData will be synchronized, this should do it? That's actually what I did locally as a quick fix - seems to work out alright. > EnwikiConentSource does not work with parallel tasks > > > Key: LUCENE-1994 > URL: https://issues.apache.org/jira/browse/LUCENE-1994 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/benchmark >Affects Versions: 2.9 >Reporter: Mark Miller >Priority: Minor
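For readers following the quick fix discussed here, the idea can be sketched as below. This is only an illustration under stated assumptions: `SharedLineSource` is a hypothetical stand-in with a simplified signature, not the real benchmark ContentSource API, and it is not the patch Mark applied locally.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical stand-in for a content source shared by parallel tasks. */
final class SharedLineSource {
    private final Iterator<String> lines;

    SharedLineSource(Iterator<String> lines) {
        this.lines = lines;
    }

    /**
     * synchronized so that only one indexing thread at a time advances the
     * underlying (non-thread-safe) iterator. Returns null when exhausted.
     */
    synchronized String getNextDocData() {
        return lines.hasNext() ? lines.next() : null;
    }

    public static void main(String[] args) {
        SharedLineSource src = new SharedLineSource(Arrays.asList("doc1", "doc2").iterator());
        String doc;
        while ((doc = src.getNextDocData()) != null) {
            System.out.println(doc);
        }
    }
}
```

Synchronizing the whole method is the simplest option; a source that does expensive parsing per document might prefer to guard only the shared read position, which seems to be what the earlier comment about syncing "on the important stuff" is getting at.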
[jira] Commented: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks
[ https://issues.apache.org/jira/browse/LUCENE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767335#action_12767335 ] Shai Erera commented on LUCENE-1994: I believe this was the original behavior of EnwikiDocMaker. But anyway, I think that if getNextDocData will be synchronized, this should do it? > EnwikiConentSource does not work with parallel tasks > > > Key: LUCENE-1994 > URL: https://issues.apache.org/jira/browse/LUCENE-1994 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/benchmark >Affects Versions: 2.9 >Reporter: Mark Miller >Priority: Minor
[jira] Created: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks
EnwikiConentSource does not work with parallel tasks Key: LUCENE-1994 URL: https://issues.apache.org/jira/browse/LUCENE-1994 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Affects Versions: 2.9 Reporter: Mark Miller Priority: Minor
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767304#action_12767304 ] Uwe Schindler commented on LUCENE-1987: --- OK, I fix the tests using find/grep/sed :-) > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated).
[jira] Created: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)
MoreLikeThis - allow to exclude terms that appear in too many documents (patch included) Key: LUCENE-1993 URL: https://issues.apache.org/jira/browse/LUCENE-1993 Project: Lucene - Java Issue Type: Improvement Components: contrib/* Affects Versions: 2.9 Reporter: Christian Steinert Attachments: MoreLikeThis.java.patch The MoreLikeThis class generates a likeness query based on a given document. So far, it is impossible to suppress words that appear in almost all documents from the likeness query, which makes extensive stop word lists necessary. Therefore I suggest allowing the exclusion of words that exceed a certain absolute document count or a certain percentage of documents. Depending on the corpus of text, words that appear in more than 50 or even 70% of documents can usually be considered insignificant for classifying a document.
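The exclusion rule proposed in this issue can be sketched roughly as follows. The attached patch is not shown in this thread, so `CommonTermFilter`, its fields, and its method are hypothetical names chosen for illustration, and the exact comparison (strictly greater than the limit) is an assumption.

```java
/** Hypothetical filter illustrating the proposal; names are not taken from the attached patch. */
final class CommonTermFilter {
    private final int maxDocFreq;           // absolute cap on the number of documents containing a term
    private final double maxDocFreqFraction; // cap as a fraction of all documents, e.g. 0.5 for 50%

    CommonTermFilter(int maxDocFreq, double maxDocFreqFraction) {
        this.maxDocFreq = maxDocFreq;
        this.maxDocFreqFraction = maxDocFreqFraction;
    }

    /** True if the term occurs in too many documents to be useful in a likeness query. */
    boolean isTooCommon(int docFreq, int numDocs) {
        if (docFreq > maxDocFreq) {
            return true;
        }
        return numDocs > 0 && (double) docFreq / numDocs > maxDocFreqFraction;
    }

    public static void main(String[] args) {
        // No absolute cap, exclude terms found in more than 50% of documents.
        CommonTermFilter filter = new CommonTermFilter(Integer.MAX_VALUE, 0.5);
        System.out.println(filter.isTooCommon(70, 100)); // term in 70 of 100 documents
        System.out.println(filter.isTooCommon(10, 100)); // term in 10 of 100 documents
    }
}
```

In a real integration the `docFreq` and `numDocs` values would come from the index reader; they are passed in as plain ints here to keep the sketch independent of any Lucene version.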
[jira] Updated: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)
[ https://issues.apache.org/jira/browse/LUCENE-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Steinert updated LUCENE-1993: --- Attachment: MoreLikeThis.java.patch suggested patch against current SVN head > MoreLikeThis - allow to exclude terms that appear in too many documents > (patch included) > > > Key: LUCENE-1993 > URL: https://issues.apache.org/jira/browse/LUCENE-1993 > Project: Lucene - Java > Issue Type: Improvement > Components: contrib/* >Affects Versions: 2.9 >Reporter: Christian Steinert > Attachments: MoreLikeThis.java.patch > > Original Estimate: 0.17h > Remaining Estimate: 0.17h > > The MoreLikeThis class allows to generate a likeness query based on a given > document. So far, it is impossible to suppress words from the likeness query, > that appear in almost all documents, making it necessary to use extensive > lists of stop words. > Therefore I suggest to allow excluding words for which a certain absolute > document count or a certain percentage of documents is exceeded. Depending on > the corpus of text, words that appear in more than 50 or even 70% of > documents can usually be considered insignificant for classifying a document.
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767299#action_12767299 ]

Michael McCandless commented on LUCENE-1987:
--------------------------------------------

bq. I did not remove the other version constants, because then we have them and can use them anywhere else. And a user coming from e.g. 2.2 to 3.0 can just use LUCENE_22 to match his old behaviour. The user should be free to give his version he used before for this backwards compatibility.

OK I think that's reasonable.

bq. Mike: Should I backport the setting for 2.4 to 2.9 to enable plugin-replacements from 2.9.1 to 3.0?

+1

{quote}
bq. Going forward, when we fix a bug but need to conditionally preserve the bug for back compat, we should use the version switching so that by default for new users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is fixed.

Do you mean I should add the default ctor of StandardAnalyzer() and rewire it to LUCENE_CURRENT?
{quote}

Sorry, I wasn't clear...

No -- I don't think we should ever have a ctor that defaults to LUCENE_CURRENT. That's a back compat trap (and it just gets us back to where we started when we had no explicit version). Users must be explicit about which version they want.

What I meant was: when fixing some sneaky bug in the future, we should never set the default so that the bug is still present (as we did on the first go of "invalid acronyms"), expecting new users to realize they have to go out of their way to tell Lucene not to emulate the bug. Instead, the default going forward (if version >= next-release-version) should be "the bug is fixed".

> Remove rest of analysis deprecations (Token, CharacterCache)
> ------------------------------------------------------------
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
> Issue Type: Task
> Components: Analysis
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch
>
>
> These removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They are deprecated, but we still have the VERSION constants. Do not know, how to proceed. Keep the settings alive for index compatibility? Or remove it together with the version constants (which were undeprecated).
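The policy described in this comment (an explicit version argument, with the fix enabled from the release that ships it) might look like the sketch below. `CompatVersion` and `AcronymSettings` are hypothetical stand-ins, not Lucene's own Version class or StandardAnalyzer; the 2.4 cutoff for the invalid-acronym fix is taken from the related comments on this issue.

```java
/** Hypothetical stand-in for Lucene's version constants, for illustration only. */
enum CompatVersion {
    V_22, V_23, V_24, V_29, V_30;

    /** True if this version is the same as or newer than the given one. */
    boolean onOrAfter(CompatVersion other) {
        return compareTo(other) >= 0;
    }
}

/** Hypothetical settings holder; there is deliberately no "current version" default. */
final class AcronymSettings {

    /**
     * The caller must state which version's behavior it wants. The bug is
     * fixed for 2.4 and later; older versions keep the buggy behavior so
     * that existing indexes stay compatible.
     */
    static boolean replaceInvalidAcronym(CompatVersion matchVersion) {
        return matchVersion.onOrAfter(CompatVersion.V_24);
    }

    public static void main(String[] args) {
        System.out.println(replaceInvalidAcronym(CompatVersion.V_23));
        System.out.println(replaceInvalidAcronym(CompatVersion.V_30));
    }
}
```

The design point is that a new user who passes the newest constant gets the fixed behavior without doing anything special, while a user who needs the old behavior has to ask for it by naming an old version.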
Highlighting - catering for all query types
I've been putting together some code to support highlighting of opaque query clauses (cached filters, trie range, spatial etc.) which shows some promise. This is not intended as a replacement for the existing highlighter(s), which deal with free text, but instead concentrates on the hard-to-highlight clauses and has the benefit of working in-line with the query process. Summarisation is not a requirement here - I simply need to know if a given query clause matched on a result.

The approach I have come up with is to wrap query clauses with lightweight (processing and RAM-wise) instrumenting objects in order to record which clauses matched. The recorded matches are encoded as a byte in the document score, which unfortunately requires some loss of precision in the scores - more on this later.

The general approach for use looks like this:

    //Wrap *any* type of query object for highlight flagging and allocate
    //a flag number between 1 and 8 for the clauses of interest
    FlagRecordingQuery frqA = new FlagRecordingQuery(
        new TermQuery(new Term("statusField", "published")), 1);
    FlagRecordingQuery frqB = new FlagRecordingQuery(
        new XyzLtd3rdPartyQuery("imageDataField", "unknown magic to find 'sunset'"), 2);
    BooleanQuery bq = new BooleanQuery();
    bq.add(new BooleanClause(frqA, Occur.SHOULD));
    bq.add(new BooleanClause(frqB, Occur.SHOULD));

    //Parent query must be a FlagCombiningQuery to encode child match info
    //in the doc scores
    FlagCombiningQuery fcq = new FlagCombiningQuery(bq);

    //Run search
    TopDocs td = s.search(fcq, 10);
    ScoreDoc[] sd = td.scoreDocs;
    for (ScoreDoc scoreDoc : sd) {
        float score = scoreDoc.score;
        //Check to see which flags are encoded in the score.
        if (FlagCombiningQuery.hasFlag(1, score)) {
            System.out.println("woot! " + scoreDoc.doc + " matched clause 1 ");
        }
        if (FlagCombiningQuery.hasFlag(2, score)) {
            System.out.println("woot! " + scoreDoc.doc + " matched clause 2 ");
        }
    }

The FlagRecordingQuery child clauses introduce themselves to the FlagCombiningQuery through a thread local at "rewrite" time. The FlagCombiningQuery at the root adjusts the scores as follows:

    static final float DEFAULT_MULTIPLIER = 1000f;
    float multiplier = DEFAULT_MULTIPLIER;

    public float score() throws IOException {
        float score = delegateScorer.score();
        byte flags = 0;
        int d = doc();
        //encode all matched child clauses into a "flags" byte.
        for (FlagRecordingQuery frq : thisThreadsFlags) {
            if (frq.matched(d)) {
                byte mask = flagMasks[frq.flag - 1];
                flags = setFlag(flags, mask);
            }
        }
        //Multiply score to turn float into int with sufficient fractions in score.
        int shiftedI = (int) (score * multiplier);
        //Shift int to make space for byte holding flags
        int iPlusSpaceForByte = shiftedI << 8;
        //Add match flags (masked so a set flag 8 is not sign-extended)
        int iCombinedScoreAndFlags = iPlusSpaceForByte | (flags & 0xFF);
        System.out.println("combined score=" + iCombinedScoreAndFlags + " for doc#" + doc());
        return iCombinedScoreAndFlags;
    }

The mechanism works but relies on original score values that:
a) Are not too big - i.e. do not lose significant digits when multiplied by "multiplier" and then shifted left 8 bits.
b) Are not too similar - i.e. only differ in very small fractions, e.g. all scores occur in the range 0.1234 to 0.1235.

To give an indication of the restrictions this imposes, here are the usable score ranges for various settings of "multiplier":

    multiplier   max score   fraction precision
    ==========   =========   ==================
    10           838860      0.x
    100          83886       0.xx
    1000         8388        0.xxx
    10000        838         0.xxxx

I would imagine the majority of Lucene query results would still rank sensibly with a 1,000 or 10,000 multiplier. However, all this potentially dangerous bit twiddling could of course be avoided if the Lucene search API was expanded to include docid, score AND a completely separate field for recording match flags.

Thoughts?
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767273#action_12767273 ] Uwe Schindler commented on LUCENE-1987: --- bq. Going forward, when we fix a bug but need to conditionally preserve the bug for back compat, we should use the version switching so that by default for new users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is fixed. Do you mean I should add the default ctor of StandardAnalyzer() and rewire it to LUCENE_CURRENT? We have to put this in the docs, that from 3.0 on, the standard analyzer's default ctor now no longer behaves like 2.4, but always uses the newest features. > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated).
[jira] Issue Comment Edited: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767273#action_12767273 ]

Uwe Schindler edited comment on LUCENE-1987 at 10/19/09 3:14 AM:
-----------------------------------------------------------------

bq. Going forward, when we fix a bug but need to conditionally preserve the bug for back compat, we should use the version switching so that by default for new users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is fixed.

Do you mean I should add the default ctor of StandardAnalyzer() and rewire it to LUCENE_CURRENT? We have to put this in the docs, that from 3.0 on, the standard analyzer's default ctor now no longer behaves like 2.4, but always uses the newest features. That would help me a lot with the tests.

was (Author: thetaphi):
bq. Going forward, when we fix a bug but need to conditionally preserve the bug for back compat, we should use the version switching so that by default for new users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is fixed.

Do you mean I should add the default ctor of StandardAnalyzer() and rewire it to LUCENE_CURRENT? We have to put this in the docs, that from 3.0 on, the standard analyzer's default ctor now no longer behaves like 2.4, but always uses the newest features.

> Remove rest of analysis deprecations (Token, CharacterCache)
> ------------------------------------------------------------
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
> Issue Type: Task
> Components: Analysis
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch
>
>
> These removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They are deprecated, but we still have the VERSION constants. Do not know, how to proceed. Keep the settings alive for index compatibility? Or remove it together with the version constants (which were undeprecated).
[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1987: -- Attachment: LUCENE-1987-StopFilter.patch Updated patch with LUCENE_24. I did not remove the other version constants, because then we have them and can use them anywhere else. And a user coming from e.g. 2.2 to 3.0 can just use LUCENE_22 to match his old behaviour. The user should be free to give his version he used before for this backwards compatibility. Mike: Should I backport the setting for 2.4 to 2.9 to enable plugin-replacements from 2.9.1 to 3.0? > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, > LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated).
Re: Call the authorities
Indeed! It's doing nothing now. Just creating Sort objects but not in fact doing any searching with them.

Hmm. Unfortunately, the test very much relied on the deprecated "setUseLegacySearch" API, to compare old vs new sorting. I suppose its time has passed, given that it has had a good amount of time, now, to assert that old and new were producing identical results. Should we just remove it?

Mike

On Sun, Oct 18, 2009 at 11:20 PM, Mark Miller wrote:
> Mark Miller wrote:
>> TestStressSort has been butchered.
>>
> I suppose we could just pull it since it wouldn't check for much any
> more - looks awful funny as is.
>
> --
> - Mark
>
> http://www.lucidimagination.com
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767270#action_12767270 ] Michael McCandless commented on LUCENE-1987: bq. LUCENE-1068 says: Fix version 2.3 Right, that bug was fixed in 2.3, however with that fix the buggy behavior was kept by default. In 2.4 we then fixed the default to be true, ie, the bug would be fixed by default. So if I were to specify VERSION_23, I should get the buggy behavior, but if I specify VERSION_24, I should get the correct behavior. Going forward, when we fix a bug but need to conditionally preserve the bug for back compat, we should use the version switching so that by default for new users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is fixed. > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated).
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767269#action_12767269 ] Uwe Schindler commented on LUCENE-1987: --- I just added also 20 and 21. I can remove them again (20 and 21). 22 is needed because the invalidAcronym thing is there in 2.2 and fixed in 2.3 (according to LUCENE-1068). > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated).
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767267#action_12767267 ] Michael McCandless commented on LUCENE-1987: Why add 2.0, 2.1, 2.2 versions? We don't anywhere emulate bugs based on those, right? Otherwise, patch looks great! Thanks Uwe. Nice to see StandardAnalyzer clean again. > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated).
[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1987: -- Attachment: LUCENE-1987-StopFilter.patch Javadocs fixes. > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated).
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767265#action_12767265 ] Uwe Schindler commented on LUCENE-1987: --- LUCENE-1068 says: Fix version 2.3 > Remove rest of analysis deprecations (Token, CharacterCache) > > > Key: LUCENE-1987 > URL: https://issues.apache.org/jira/browse/LUCENE-1987 > Project: Lucene - Java > Issue Type: Task > Components: Analysis >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1987-StopFilter.patch, > LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, > LUCENE-1987.patch > > > These removes the rest of the deprecations in the analysis package: > - -Token's termText field-- (DONE) > - -eventually un-deprecate ctors of Token taking Strings (they are still > useful) -> if yes remove deprec in 2.9.1- (DONE) > - -remove CharacterCache and use Character.valueOf() from Java5- (DONE) > - Stopwords lists > - Remove the backwards settings from analyzers (acronym, posIncr,...). They > are deprecated, but we still have the VERSION constants. Do not know, how to > proceed. Keep the settings alive for index compatibility? Or remove it > together with the version constants (which were undeprecated).
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767264#action_12767264 ] Michael McCandless commented on LUCENE-1987: I'll have a look, but one thing: invalid acronym replacement should be enabled if version >= 2.4, not >= 2.3. I.e., if version is 2.3, the bug is still present.
[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1987: -- Attachment: (was: LUCENE-1987-StopFilter.patch)
[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1987: -- Attachment: LUCENE-1987-StopFilter.patch Correct patch.
[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767262#action_12767262 ] Uwe Schindler commented on LUCENE-1987: --- If we are fine with that, I would backport the version constants and the default setting to 2.9.x
[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)
[ https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-1987:
--
Attachment: LUCENE-1987-StopFilter.patch

Hello Mike,

attached is a patch with all deprecated methods removed (only setOverridesTokenStream is still there; making Analyzers final is another thing to do). StopFilter and its stop word sets were also generified to Set<?>, which is fine for every type of set, because CharArraySet uses toString() to convert every element to a string when testing.

The only problem I ran into was in StandardAnalyzer, and this is my solution:

{code}
enableStopPositionIncrements = matchVersion.onOrAfter(Version.LUCENE_29);
replaceInvalidAcronym = matchVersion.onOrAfter(Version.LUCENE_23);
{code}

The defaultPosIncr setting was removed (it was a static method, so there is no changeable default anymore). The pre-2.9 default was false, so I set posIncr to false for all older versions. This was the default before, but it is now fixed because there is no static setter or system property anymore.

For the invalid acronyms I added a LUCENE_23 version constant, so the replacement is enabled for all versions >= 2.3. If you want the old behaviour, use LUCENE_22 or below.

Mike: Can you review this? If you're OK with it, I have to change 175 "new StandardAnalyzer()" occurrences in tests :(
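For readers following the version-gating idea in this message, here is a minimal, self-contained sketch of how the two defaults could be derived from a match version. It is not the code from the patch: `MatchVersion` and `AnalyzerDefaults` are hypothetical stand-ins for Lucene's `Version` and `StandardAnalyzer`, and the acronym threshold is set to 2.4 as Michael McCandless asks in his review comment, whereas the attached patch used 2.3.

```java
/** Hypothetical stand-in for Lucene's Version constants; declaration order defines "newer". */
enum MatchVersion {
    V_22, V_23, V_24, V_29, V_30;

    /** True if this version is the same as, or newer than, {@code other}. */
    boolean onOrAfter(MatchVersion other) {
        return compareTo(other) >= 0;
    }
}

/** Hypothetical holder for the two version-dependent analyzer defaults discussed above. */
final class AnalyzerDefaults {
    final boolean enableStopPositionIncrements;
    final boolean replaceInvalidAcronym;

    AnalyzerDefaults(MatchVersion matchVersion) {
        // Position increments for removed stop words became the default in 2.9.
        enableStopPositionIncrements = matchVersion.onOrAfter(MatchVersion.V_29);
        // Assumption: threshold 2.4 per the review comment; the posted patch used 2.3.
        replaceInvalidAcronym = matchVersion.onOrAfter(MatchVersion.V_24);
    }
}
```

With this shape, an index built with 2.3 keeps both old behaviours, while `new AnalyzerDefaults(MatchVersion.V_30)` enables both fixes; the caller chooses index compatibility by choosing the constant.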
[jira] Commented: (LUCENE-1992) intermittent failure in TestIndexWriter. testExceptionDuringSync
[ https://issues.apache.org/jira/browse/LUCENE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767258#action_12767258 ]

Michael McCandless commented on LUCENE-1992:

bq. Should this also applied to bw branch (2.4 for) 2.9 and (2.9 for) 3.0?

No, it can only be applied on trunk. That call tells ConcurrentMergeScheduler to expect exceptions during this test, which when autoCommit is true (which this test is doing everywhere except trunk) will happen because when a merge completes, it'll commit and call Directory.sync which throws the intentional exception.

> intermittent failure in TestIndexWriter. testExceptionDuringSync
>
> Key: LUCENE-1992
> URL: https://issues.apache.org/jira/browse/LUCENE-1992
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Minor
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-1992.patch
>
> {code}
> common.test:
>     [mkdir] Created dir: C:\Projects\lucene\trunk-full1\build\test
>     [junit] Testsuite: org.apache.lucene.index.TestIndexWriter
>     [junit] Tests run: 102, Failures: 0, Errors: 1, Time elapsed: 100,297sec
>     [junit]
>     [junit] Testcase: testExceptionDuringSync(org.apache.lucene.index.TestIndexWriter): Caused an ERROR
>     [junit] _a.fnm
>     [junit] java.io.FileNotFoundException: _a.fnm
>     [junit]     at org.apache.lucene.store.MockRAMDirectory.openInput(MockRAMDirectory.java:226)
>     [junit]     at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:68)
>     [junit]     at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:116)
>     [junit]     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:620)
>     [junit]     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
>     [junit]     at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:104)
>     [junit]     at org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:27)
>     [junit]     at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:74)
>     [junit]     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:704)
>     [junit]     at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69)
>     [junit]     at org.apache.lucene.index.IndexReader.open(IndexReader.java:307)
>     [junit]     at org.apache.lucene.index.IndexReader.open(IndexReader.java:193)
>     [junit]     at org.apache.lucene.index.TestIndexWriter.testExceptionDuringSync(TestIndexWriter.java:2723)
>     [junit]     at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:206)
>     [junit]
>     [junit]
>     [junit] Test org.apache.lucene.index.TestIndexWriter FAILED
> {code}
[jira] Commented: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767242#action_12767242 ]

Uwe Schindler commented on LUCENE-1257:
---
One note: I do not want to apply any test-related generics patches for now, because they make it harder to port patches to the backwards branch. As soon as all deprecations are removed, we can start fixing the tests. Until then, changes often also have to be applied to the backwards branch, which is Java 1.4 for backwards testing with 2.9.

> Port to Java5
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis, Examples, Index, Other, Query/Scoring, QueryParser, Search, Store, Term Vectors
> Affects Versions: 3.0
> Reporter: Cédric Champeau
> Assignee: Uwe Schindler
> Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, LUCENE-1257-CompoundFileReaderWriter.patch, LUCENE-1257-ConcurrentMergeScheduler.patch, LUCENE-1257-DirectoryReader.patch, LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, LUCENE-1257-IndexDeleter.patch, LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch, LUCENE-1257_org_apache_lucene_index.patch, LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know Java 5 migration had been planned for 2.1 someday in the past, but don't know when it is planned now. This patch against the trunk includes:
> - most obvious generics usage (there are tons of usages of sets, ... Those which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is in my opinion much more readable with those features (you actually *know* what is stored in collections when reading the code, without the need to look up field definitions every time) and it simplifies many algorithms.
> Note that this patch also includes an interface for the Query class. This has been done for my company's needs for building custom Query classes which add some behaviour to the base Lucene queries. It prevents multiple unnecessary casts. I know this introduction is not wanted by the team, but it really makes our developments easier to maintain. If you don't want to use this, replace all /Queriable/ calls with standard /Query/.
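To make the list of Java 5 constructs in the issue description concrete, here is a small before/after sketch. It is an illustration only and is not taken from any of the attached patches; the class `Java5PortSketch` and both method names are hypothetical.

```java
import java.util.List;

/** Hypothetical illustration of the kind of change a Java 5 port involves. */
final class Java5PortSketch {

    /** Java 1.4 style: raw collection, indexed loop, explicit cast and explicit unboxing. */
    @SuppressWarnings("rawtypes")
    static int sumRaw(List values) {
        int sum = 0;
        for (int i = 0; i < values.size(); i++) {
            Integer boxed = (Integer) values.get(i); // cast needed: the element type is unknown
            sum += boxed.intValue();                 // explicit unboxing
        }
        return sum;
    }

    /** Java 5 style: generified parameter, for-each loop, no cast and no explicit unboxing call. */
    static int sumGeneric(List<Integer> values) {
        int sum = 0;
        for (int value : values) { // element type is visible in the signature
            sum += value;
        }
        return sum;
    }
}
```

Both methods compute the same result; the generified version just moves the element type into the signature, which is the readability gain the reporter describes.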
[jira] Commented: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767241#action_12767241 ]

Uwe Schindler commented on LUCENE-1257:
---
Committed:
- LUCENE-1257-CloseableThreadLocal.patch, 2009-10-18 06:31 PM, Kay Kay, 4 kB
- LUCENE-1257_analysis.patch, 2009-10-18 05:41 PM, Robert Muir, 8 kB

At revision: 826601
[jira] Commented: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767240#action_12767240 ] Uwe Schindler commented on LUCENE-1257: --- I removed some unneeded patches.
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1257: -- Attachment: (was: o.a.l.analysis.patch)
[jira] Updated: (LUCENE-1257) Port to Java5
[ https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1257: -- Attachment: (was: java5.patch)