date:20091019

Re: lucene 2.9 sorting algorithm

2009-10-19 Thread Jake Mannix

Given that this new API is pretty unwieldy, and seems to not actually
perform any better than the old one... are we going to consider revisiting
that?

  -jake

On Mon, Oct 19, 2009 at 11:27 PM, Uwe Schindler  wrote:

>  The old search API is already removed in trunk…
>
>
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>   --
>
> *From:* John Wang [mailto:john.w...@gmail.com]
> *Sent:* Tuesday, October 20, 2009 3:28 AM
> *To:* java-dev@lucene.apache.org
> *Subject:* Re: lucene 2.9 sorting algorithm
>
>
>
> Hi Michael:
>
>
>
>  Was wondering if you got a chance to take a look at this.
>
>
>
>  Since deprecated APIs are being removed in 3.0, I was wondering
> if/when we would decide on keeping the ScoreDocComparator API, so that it
> would be kept for Lucene 3.0.
>
>
>
> Thanks
>
>
>
> -John
>
> On Fri, Oct 16, 2009 at 9:53 AM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
> Oh, no problem...
>
> Mike
>
>
> On Fri, Oct 16, 2009 at 12:33 PM, John Wang  wrote:
> > Mike, just a clarification on my first perf report email.
> > In the first section, numHits is incorrectly labeled; it should be 20
> > instead of 50. Sorry about the possible confusion.
> > Thanks
> > -John
> >
> > On Fri, Oct 16, 2009 at 3:21 AM, Michael McCandless
> >  wrote:
> >>
> >> Thanks John; I'll have a look.
> >>
> >> Mike
> >>
> >> On Fri, Oct 16, 2009 at 12:57 AM, John Wang wrote:
> >> > Hi Michael:
> >> > I added classes: ScoreDocComparatorQueue and OneSortNoScoreCollector
> >> > as a more general case. I think keeping the old api for
> >> > ScoreDocComparator and SortComparatorSource would work.
> >> >   Please take a look.
> >> > Thanks
> >> > -John
> >> >
> >> > On Thu, Oct 15, 2009 at 6:52 PM, John Wang wrote:
> >> >>
> >> >> Hi Michael:
> >> >>  It is open: http://code.google.com/p/lucene-book/source/checkout
> >> >>  I think I sent the https url instead, sorry.
> >> >> The multi PQ sorting is fairly self-contained. I have 2 versions, 1
> >> >> for string and 1 for int; each is a Collector impl.
> >> >>  I shouldn't say the multi PQ is faster on int sort; it is within the
> >> >> error boundary. The diff is very, very small; I would say they are
> >> >> roughly equal.
> >> >>  If you think it is a good thing to go this way (if not for the perf,
> >> >> just for the simpler api), I'd be happy to work on a patch.
> >> >> Thanks
> >> >> -John
> >> >> On Thu, Oct 15, 2009 at 5:18 PM, Michael McCandless
> >> >>  wrote:
> >> >>>
> >> >>> John, looks like this requires login -- any plans to open that up, or
> >> >>> post the code on an issue?
> >> >>>
> >> >>> How self-contained is your Multi PQ sorting?  EG is it a standalone
> >> >>> Collector impl that I can test?
> >> >>>
> >> >>> Mike
> >> >>>
> >> >>> On Thu, Oct 15, 2009 at 6:33 PM, John Wang wrote:
> >> >>> > BTW, we have a little sandbox for these experiments, and all my
> >> >>> > test code is at the URL below. It is not very polished.
> >> >>> >
> >> >>> > https://lucene-book.googlecode.com/svn/trunk
> >> >>> >
> >> >>> > -John
> >> >>> >
> >> >>> > On Thu, Oct 15, 2009 at 3:29 PM, John Wang 
> >> >>> > wrote:
> >> >>> >>
> >> >>> >> Numbers Mike requested for Int types:
> >> >>> >>
> >> >>> >> Only the time/cputime are posted; the others are all the same since
> >> >>> >> the algorithm is the same.
> >> >>> >>
> >> >>> >> Lucene 2.9:
> >> >>> >> numhits: 10
> >> >>> >> time: 14619495
> >> >>> >> cpu: 146126
> >> >>> >>
> >> >>> >> numhits: 20
> >> >>> >> time: 14550568
> >> >>> >> cpu: 163242
> >> >>> >>
> >> >>> >> numhits: 100
> >> >>> >> time: 16467647
> >> >>> >> cpu: 178379
> >> >>> >>
> >> >>> >>
> >> >>> >> my test:
> >> >>> >> numHits: 10
> >> >>> >> time: 14101094
> >> >>> >> cpu: 144715
> >> >>> >>
> >> >>> >> numHits: 20
> >> >>> >> time: 14804821
> >> >>> >> cpu: 151305
> >> >>> >>
> >> >>> >> numHits: 100
> >> >>> >> time: 15372157
> >> >>> >> cpu: 158842
> >> >>> >>
> >> >>> >> Conclusions:
> >> >>> >> They are very similar; the differences are all within error bounds,
> >> >>> >> especially with lower PQ sizes, where the second sort alg is again
> >> >>> >> slightly faster.
> >> >>> >>
> >> >>> >> Hope this helps.
> >> >>> >>
> >> >>> >> -John
> >> >>> >>
> >> >>> >>
> >> >>> >> On Thu, Oct 15, 2009 at 3:04 PM, Yonik Seeley
> >> >>> >> 
> >> >>> >> wrote:
> >> >>> >>>
> >> >>> >>> On Thu, Oct 15, 2009 at 5:33 PM, Michael McCandless
> >> >>> >>>  wrote:
> >> >>> >>> > Though it'd be odd if the switch to searching by segment
> >> >>> >>> > really was most of the gains here.
> >> >>> >>>
> >> >>> >>> I had assumed that much of the improvement was due to ditching
> >> >>> >>> MultiTermEnum/MultiTermDocs.
> >> >>> >>> Note that LUCENE-1483 was before LUCENE-1596... but that only
> >> >>> >>> helps
> >> >>> >>> with queries that use a TermEnum (range, prefix, etc).
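
To put the timings quoted above in perspective, the relative differences can be worked out with a few lines of Java. This is only an illustrative sketch: the class name TimingDiff is made up for this note, and since the thread does not state the units of the "time" values, only ratios are computed.

```java
// Hypothetical helper for comparing the "time" values quoted in this thread.
final class TimingDiff {
    // Percent change of candidate relative to baseline.
    // A negative result means the candidate took less time.
    static double percentChange(long baseline, long candidate) {
        return 100.0 * (candidate - baseline) / baseline;
    }

    public static void main(String[] args) {
        int[] numHits = {10, 20, 100};
        long[] lucene29 = {14619495L, 14550568L, 16467647L}; // "Lucene 2.9" time values
        long[] myTest = {14101094L, 14804821L, 15372157L};   // "my test" time values
        for (int i = 0; i < numHits.length; i++) {
            System.out.printf("numHits=%d: %+.2f%%%n",
                    numHits[i], percentChange(lucene29[i], myTest[i]));
        }
    }
}
```

On these numbers the second set comes out a few percent lower at numHits 10 and 100 and slightly higher at 20, which fits the description of the differences as small.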

RE: lucene 2.9 sorting algorithm

2009-10-19 Thread Uwe Schindler

The old search API is already removed in trunk.

 

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


Re: lucene 2.9 sorting algorithm

2009-10-19 Thread John Wang

Hi Michael:
 Was wondering if you got a chance to take a look at this.

 Since deprecated APIs are being removed in 3.0, I was wondering if/when
we would decide on keeping the ScoreDocComparator API, so that it would be
kept for Lucene 3.0.

Thanks

-John
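
For readers new to the API question, here is a minimal sketch of the general idea behind comparator-driven top-N collection: a bounded priority queue keeps the best numHits entries under a caller-supplied comparator. It is not the code from the sandbox and uses no Lucene classes; Hit, TopNCollector and BY_VALUE_THEN_DOC are hypothetical names, and java.util.PriorityQueue stands in for Lucene's own queue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical (doc id, sort value) pair; not a Lucene class.
final class Hit {
    final int doc;
    final int value;
    Hit(int doc, int value) { this.doc = doc; this.value = value; }
}

// Hypothetical bounded top-N collector driven by a plain Comparator.
final class TopNCollector {
    // Example ordering: ascending sort value, ties broken by doc id.
    static final Comparator<Hit> BY_VALUE_THEN_DOC = new Comparator<Hit>() {
        public int compare(Hit a, Hit b) {
            if (a.value != b.value) return a.value < b.value ? -1 : 1;
            return a.doc < b.doc ? -1 : (a.doc == b.doc ? 0 : 1);
        }
    };

    private final int numHits;              // must be >= 1
    private final Comparator<Hit> order;    // "smaller" means "ranks earlier"
    private final PriorityQueue<Hit> queue; // head is the worst hit currently kept

    TopNCollector(int numHits, Comparator<Hit> order) {
        this.numHits = numHits;
        this.order = order;
        this.queue = new PriorityQueue<Hit>(numHits, Collections.reverseOrder(order));
    }

    // Keep the hit only if the queue has room or it beats the current worst.
    void collect(Hit hit) {
        if (queue.size() < numHits) {
            queue.add(hit);
        } else if (order.compare(hit, queue.peek()) < 0) {
            queue.poll();
            queue.add(hit);
        }
    }

    // Returns the kept hits, best first.
    List<Hit> topHits() {
        List<Hit> result = new ArrayList<Hit>(queue);
        Collections.sort(result, order);
        return result;
    }
}
```

The per-hit cost here is one comparison against the queue head in the common case, which is why the queue size (numHits) is the knob varied in the timings discussed in this thread.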


[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Kay Kay (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated LUCENE-1257:


Attachment: LUCENE-1257_javacc_upgrade.patch

common-build.xml , build comments match those in build.txt 

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldCacheRangeFilter.patch, 
> LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch, 
> LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_javacc_upgrade.patch, 
> LUCENE-1257_messages.patch, LUCENE-1257_MultiFieldQueryParser.patch, 
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, LUCENE-1257_queryParser_jj.patch, 
> lucene1257surround1.patch, lucene1257surround1.patch, 
> shinglematrixfilter_generified.patch
>
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know 
> Java 5 migration had been planned for 2.1 someday in the past, but don't know 
> when it is planned now. This patch against the trunk includes:
> - most obvious generics usage (there are tons of usages of sets, ... Those 
> which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is in my opinion much more readable with those features (you 
> actually *know* what is stored in collections reading the code, without the 
> need to look up field definitions every time) and it simplifies many 
> algorithms.
> Note that this patch also includes an interface for the Query class. This has 
> been done for my company's needs for building custom Query classes which add 
> some behaviour to the base Lucene queries. It prevents multiple unnecessary 
> casts. I know this introduction is not wanted by the team, but it really 
> makes our developments easier to maintain. If you don't want to use this, 
> replace all /Queriable/ calls with standard /Query/.
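
The kinds of changes listed in the description can be illustrated with a small before/after sketch. It is not taken from any of the attached patches; the class and method names (Java5PortSketch, sumLegacy, sumGeneric) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical before/after illustration of a Java 5 port; not from the patches.
final class Java5PortSketch {
    // Pre-Java 5 style: raw List, indexed loop, cast, explicit unboxing.
    @SuppressWarnings("rawtypes")
    static int sumLegacy(List boosts) {
        int total = 0;
        for (int i = 0; i < boosts.size(); i++) {
            Integer boost = (Integer) boosts.get(i);
            total += boost.intValue();
        }
        return total;
    }

    // Java 5 style: generic List, for-each loop, auto-unboxing.
    static int sumGeneric(List<Integer> boosts) {
        int total = 0;
        for (int boost : boosts) {
            total += boost;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> boosts = new ArrayList<Integer>();
        boosts.add(1);
        boosts.add(2);
        boosts.add(3);
        System.out.println(sumLegacy(boosts) + " " + sumGeneric(boosts));
    }
}
```

Both methods compute the same sum; the generic version states the element type in its signature, so the cast disappears and a wrongly typed list is rejected at compile time instead of failing at run time.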

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Kay Kay (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated LUCENE-1257:


Attachment: LUCENE-1257_MultiFieldQueryParser.patch

MultiFieldQueryParser 


[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Kay Kay (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated LUCENE-1257:


Attachment: LUCENE-1257_queryParser_jj.patch

QueryParser.jj patch separately for generics 


[jira] Commented: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767617#action_12767617
 ] 

Uwe Schindler commented on LUCENE-1257:
---

bq. What's the version of javacc being used/suggested currently (the latest 
release seems to be 5.0)?

*From BUILD.txt* (I suggest using version 4.1; 4.2, for example, has a bug 
that somehow corrupts the parser):

Step 3) Install JavaCC

Building the Lucene distribution from the source does not require the JavaCC
parser generator, but if you wish to regenerate any of the pre-generated
parser pieces, you will need to install JavaCC. Version 4.1 is tested to
work correctly.

  http://javacc.dev.java.net

Follow the download links and download the zip file to a temporary
location on your file system.

After JavaCC is installed, create a build.properties file
(as in step 2), and add the line

  javacc.home=/javacc

where this points to the root directory of your javacc installation
(the directory that contains bin/lib/javacc.jar).



[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1257:
--

Attachment: LUCENE-1257-FieldCacheRangeFilter.patch

FieldCacheRangeFilter generified + type safe accessor methods.

Committed revision: 826883
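
As a rough illustration of what "generified + type safe accessor methods" can look like in general, here is a self-contained sketch. It is NOT the committed FieldCacheRangeFilter code; RangeSketch, its factory method and its inclusive-bounds behaviour are assumptions made up for this example.

```java
// Hypothetical illustration only; NOT the committed FieldCacheRangeFilter code.
final class RangeSketch<T extends Comparable<T>> {
    private final String field;
    private final T lowerVal; // null means no lower bound
    private final T upperVal; // null means no upper bound

    private RangeSketch(String field, T lowerVal, T upperVal) {
        this.field = field;
        this.lowerVal = lowerVal;
        this.upperVal = upperVal;
    }

    // Factory method so the type parameter is inferred from the bounds.
    static <T extends Comparable<T>> RangeSketch<T> of(String field, T lowerVal, T upperVal) {
        return new RangeSketch<T>(field, lowerVal, upperVal);
    }

    String getField() { return field; }

    // Typed accessors: callers get T back and need no cast from Object.
    T getLowerVal() { return lowerVal; }
    T getUpperVal() { return upperVal; }

    // Assumed semantics for this sketch: both bounds are inclusive.
    boolean accepts(T value) {
        return (lowerVal == null || lowerVal.compareTo(value) <= 0)
            && (upperVal == null || upperVal.compareTo(value) >= 0);
    }
}
```

With a raw, Object-based class the caller would have to cast the bounds back to Integer or String; here `RangeSketch.of("price", 10, 20).getLowerVal()` is an Integer by construction.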

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldCacheRangeFilter.patch, 
> LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch, 
> LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_messages.patch, 
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know 
> Java 5 migration had been planned for 2.1 at some point in the past, but I 
> don't know when it is planned now. This patch against the trunk includes:
> - the most obvious generics usage (there are tons of usages of sets, ...; 
> those which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> In my opinion, the code is much more readable with those features (you 
> actually *know* what is stored in collections when reading the code, without 
> needing to look up field definitions every time), and it simplifies many 
> algorithms.
> Note that this patch also includes an interface for the Query class. This was 
> done for my company's needs, for building custom Query classes which add 
> some behaviour to the base Lucene queries. It prevents multiple unnecessary 
> casts. I know this addition is not wanted by the team, but it really 
> makes our code easier to maintain. If you don't want to use it, 
> replace all /Queriable/ calls with standard /Query/.


[jira] Commented: (LUCENE-1257) Port to Java5

2009-10-19 Thread Kay Kay (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767611#action_12767611
 ] 

Kay Kay commented on LUCENE-1257:
-

| I updated the parser generator task to use Java 1.5. If you want to generify 
the other parts of QueryParser, update the .jj file and regenerate the java 
files. I will do this tomorrow. Will go to bed now.

Which version of JavaCC is currently used or recommended? (The latest 
release seems to be 5.0.)

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch, 
> LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_messages.patch, 
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch

[jira] Commented: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767603#action_12767603
 ] 

Uwe Schindler commented on LUCENE-1257:
---

Committed:
   LUCENE-1257-MTQWF.patch 2009-10-19 10:55 PM Uwe Schindler 5 kB 
   LUCENE-1257-TopDocsCollector.patch 2009-10-19 08:47 PM Kay Kay 8 kB 
   LUCENE-1257-FieldCacheImpl.patch 2009-10-19 08:23 PM Kay Kay 8 kB 

(with some modifications in FieldCacheImpl, where Class was not generified to 
Class).

At revision: 826857

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch, 
> LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_messages.patch, 
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1257:
--

Attachment: LUCENE-1257-MTQWF.patch

better generification of MultiTermQueryWrapperFilter (no more casts in 
sub-classes).

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch, 
> LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_messages.patch, 
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch

[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767564#action_12767564
 ] 

Michael McCandless commented on LUCENE-1995:


That's a nice large RAM buffer :)

bq. Mike - I think keeping the signed shift is the right thing to do... a 
zero-cost check against silent corruption.

Ahh good point, OK we'll keep it as is.

bq. But I'm not sure if 2048MiB is safe either

2048 probably won't be safe, because a large doc arriving just as the buffer 
is filling up could still overflow.  (Though RAM is also used for other things, 
e.g. norms, so you might squeak by.)

I'll update the javadocs to note the limitation!

> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-1995
> URL: https://issues.apache.org/jira/browse/LUCENE-1995
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Affects Versions: 2.9
>Reporter: Yonik Seeley
> Fix For: 2.9.1
>
>
> http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing


[jira] Assigned: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing

2009-10-19 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-1995:
--

Assignee: Michael McCandless

> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-1995
> URL: https://issues.apache.org/jira/browse/LUCENE-1995
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Affects Versions: 2.9
>Reporter: Yonik Seeley
>Assignee: Michael McCandless
> Fix For: 2.9.1
>
>
> http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing


[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing

2009-10-19 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767556#action_12767556
 ] 

Yonik Seeley commented on LUCENE-1995:
--

lol - well, there we go.  Looks like perhaps a JavaDoc fix (and a comment in 
solrconfig.xml)?  The buffered size was never meant to be quite so large :-)

Mike - I think keeping the signed shift is the right thing to do... a zero-cost 
check against silent corruption.
But I'm not sure 2048MiB is safe either... I'm not sure whether one could 
overflow the number of buffers somehow as well (is every buffer except the last 
fully utilized?)
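
To make the signed-shift argument concrete, here is a minimal sketch (not Lucene's actual code). It assumes a block shift of 15, taken from the ">> 15" mentioned elsewhere in this thread; the class name, the method names, and the small int[] standing in for the table of buffers are all hypothetical.

```java
// Illustrative sketch only; names are hypothetical, not Lucene's internals.
final class SignedShiftSketch {
    // Assumed block shift (32 KiB blocks), based on the ">> 15" quoted in this thread.
    static final int BLOCK_SHIFT = 15;

    // Signed shift keeps the sign bit, so an overflowed offset gives a negative index.
    static int signedBlock(int offset) {
        return offset >> BLOCK_SHIFT;
    }

    // Unsigned shift discards the sign, so the same offset gives a plausible-looking index.
    static int unsignedBlock(int offset) {
        return offset >>> BLOCK_SHIFT;
    }

    // Returns true if reading blocks[index] throws instead of returning some value.
    static boolean failsLoudly(int[] blocks, int index) {
        try {
            int unused = blocks[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int overflowed = Integer.MAX_VALUE + 1; // wraps around to Integer.MIN_VALUE
        int[] blocks = new int[1 << 17];        // stand-in for the table of buffers

        System.out.println("signed index:   " + signedBlock(overflowed));
        System.out.println("unsigned index: " + unsignedBlock(overflowed));
        System.out.println("signed fails loudly:   "
            + failsLoudly(blocks, signedBlock(overflowed)));
        System.out.println("unsigned fails loudly: "
            + failsLoudly(blocks, unsignedBlock(overflowed)));
    }
}
```

With the signed shift, the overflowed offset turns into a negative index and the array access throws immediately. With an unsigned shift, the same offset would map to a valid-looking slot and the wrong block could be read or written without any error, which is the silent corruption the signed shift guards against.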


> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-1995
> URL: https://issues.apache.org/jira/browse/LUCENE-1995
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Affects Versions: 2.9
>Reporter: Yonik Seeley
> Fix For: 2.9.1
>
>
> http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing


Parameter class and Java 5 Enums

2009-10-19 Thread DM Smith

Should the Parameter class be replaced with Java 5 enums? My only 
concern is backward compatibility. I noticed that Parameter is 
serializable. Is this used by Lucene? I wasn't able to find any place 
that depends on it. The only public method, Parameter.toString(), 
returns the same value that a Java 5 enum would.


It seems that an advanced form of enums would be helpful, too. I'm 
seeing a lot of "switch"-style if/else chains on their values:

e.g.
In AbstractField:
if (store == Field.Store.YES) {
  this.isStored = true;
} else if (store == Field.Store.NO) {
  this.isStored = false;
} else {
  throw new IllegalArgumentException("unknown store parameter " + store);
}


if (index == Field.Index.NO) {
  this.isIndexed = false;
  this.isTokenized = false;
} else if (index == Field.Index.ANALYZED) {
  this.isIndexed = true;
  this.isTokenized = true;
} else if (index == Field.Index.NOT_ANALYZED) {
  this.isIndexed = true;
  this.isTokenized = false;
} else if (index == Field.Index.NOT_ANALYZED_NO_NORMS) {
  this.isIndexed = true;
  this.isTokenized = false;
  this.omitNorms = true;
} else if (index == Field.Index.ANALYZED_NO_NORMS) {
  this.isIndexed = true;
  this.isTokenized = true;
  this.omitNorms = true;
} else {
  throw new IllegalArgumentException("unknown index parameter " + index);
}

This could be reduced to:
this.isStored = store.isStored();
this.isIndexed = index.isIndexed();
this.isTokenized = index.isTokenized();
this.omitNorms = index.omitNorms();

With the following:
public enum Store {
  YES { public boolean isStored() { return true; } },
  NO  { public boolean isStored() { return false; } };

  // Determine whether this is stored or not
  public abstract boolean isStored();
}

public enum Index {
  ANALYZED {
    public boolean isIndexed() { return true; }
    public boolean isTokenized() { return true; }
    public boolean omitNorms() { return false; }
    ...
  },
  ...

  public abstract boolean isIndexed();
  public abstract boolean isTokenized();
  public abstract boolean omitNorms();
  ...
}

What I like about this pattern is that it clearly documents what each 
member does. As it stands, that behavior is spread across several files.


One can add a "picker" method to these to serve as a factory, e.g. given 
indexed = true, tokenized = false, ..., return the appropriate value 
from the Index enum.
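
A rough sketch of such a picker follows. It is only an illustration: the enum here is a hypothetical stand-in for Field.Index, it stores the flags in constructor fields rather than using constant-specific method bodies, and the method name pick() is made up. The flag values mirror the if/else chain quoted from AbstractField; treating NO as omitNorms = false, and returning NO whenever indexed is false, are assumptions.

```java
// Illustrative sketch only; Index is a hypothetical stand-in for Field.Index.
final class IndexPickerSketch {
    enum Index {
        // Flags: indexed, tokenized, omitNorms (NO's omitNorms value is an assumption).
        NO(false, false, false),
        ANALYZED(true, true, false),
        NOT_ANALYZED(true, false, false),
        NOT_ANALYZED_NO_NORMS(true, false, true),
        ANALYZED_NO_NORMS(true, true, true);

        private final boolean indexed;
        private final boolean tokenized;
        private final boolean omitNorms;

        Index(boolean indexed, boolean tokenized, boolean omitNorms) {
            this.indexed = indexed;
            this.tokenized = tokenized;
            this.omitNorms = omitNorms;
        }

        public boolean isIndexed() { return indexed; }
        public boolean isTokenized() { return tokenized; }
        public boolean omitNorms() { return omitNorms; }

        // "Picker": map a combination of flags back to the matching constant.
        // Assumption: when indexed is false the other flags are ignored.
        public static Index pick(boolean indexed, boolean tokenized, boolean omitNorms) {
            if (!indexed) {
                return NO;
            }
            for (Index candidate : values()) {
                if (candidate.indexed
                        && candidate.tokenized == tokenized
                        && candidate.omitNorms == omitNorms) {
                    return candidate;
                }
            }
            // Not reachable with the five constants above, but keeps the method total.
            throw new IllegalArgumentException("no Index value for the given flags");
        }
    }

    public static void main(String[] args) {
        System.out.println(Index.pick(true, false, false));
        System.out.println(Index.pick(true, true, true));
        System.out.println(Index.pick(false, true, true));
    }
}
```

Keeping the flags next to each constant means the picker and the accessors read from the same table, so adding a new Index value would not require touching a separate if/else chain.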




-- DM



[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767548#action_12767548
 ] 

Uwe Schindler commented on LUCENE-1987:
---

To come back to my other problem:
How should we handle the LUCENE_29 setting and the posIncr of stopwords 
together with QueryParser, whose default setting is to ignore posIncr?

The consequence is that a phrase query does not match anything if you 
index with StandardAnalyzer=LUCENE_29 and QueryParser uses the same analyzer 
but with setEnablePositionIncrements(false) [the current default for 
QueryParser].

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-1987-StopFilter-backport29.patch, 
> LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch
>
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still 
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
> are deprecated, but we still have the VERSION constants. I do not know how to 
> proceed: keep the settings alive for index compatibility, or remove them 
> together with the version constants (which were undeprecated)?


[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing

2009-10-19 Thread Aaron McKee (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767541#action_12767541
 ] 

Aaron McKee commented on LUCENE-1995:
-

I make no claims about the reasonableness of these settings; I only recently 
began efforts to tune our prototype. =)

useCompoundFile: false
mergeFactor: 10
maxBufferedDocs: 500
ramBufferSizeMB: 8192 
maxFieldLength: 1
reopenReaders: true

My system has 24 GB of RAM and my index is typically ~16 GB, so I set some of 
these values a bit high. If the RAM buffer is being indexed with an int, that 
could certainly be my issue; I feel a bit silly for not having thought of that 
already.  I'll try setting it down to 2048 and see if the problem disappears.

> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-1995
> URL: https://issues.apache.org/jira/browse/LUCENE-1995
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Affects Versions: 2.9
>Reporter: Yonik Seeley
> Fix For: 2.9.1
>
>
> http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing


[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767532#action_12767532
 ] 

Michael McCandless commented on LUCENE-1995:


Spooky!  It does look likely we overflowed int, because (1 + Integer.MAX_VALUE) 
>> 15 is -65536.
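
The arithmetic can be checked with a few lines of Java. This is a sketch under the assumption that the buffer size in bytes ends up in a 32-bit int somewhere; the class and method names are hypothetical and are not taken from Lucene. The values 2048 and 8192 are the ramBufferSizeMB settings discussed in this thread.

```java
// Illustrative sketch only; names are hypothetical, not Lucene's internals.
final class RamBufferOverflowSketch {
    static final long BYTES_PER_MB = 1024L * 1024L;

    // Exact size in bytes, computed in 64-bit arithmetic.
    static long bytesAsLong(int megabytes) {
        return megabytes * BYTES_PER_MB;
    }

    // What remains if that byte count is squeezed into a 32-bit int.
    static int bytesAsInt(int megabytes) {
        return (int) bytesAsLong(megabytes);
    }

    // True if the byte count still fits in a non-negative int.
    static boolean fitsInInt(int megabytes) {
        return bytesAsLong(megabytes) <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // The expression from the comment above: the addition wraps before the shift.
        System.out.println((1 + Integer.MAX_VALUE) >> 15);

        for (int mb : new int[] {2047, 2048, 8192}) {
            System.out.println(mb + " MB -> " + bytesAsLong(mb) + " bytes, as int: "
                + bytesAsInt(mb) + ", fits: " + fitsInInt(mb));
        }
    }
}
```

Under that assumption, 2048 MB is exactly one byte past Integer.MAX_VALUE, so only sizes strictly below 2048 MB are representable at all, and 8192 MB is far outside the range.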

> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-1995
> URL: https://issues.apache.org/jira/browse/LUCENE-1995
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Affects Versions: 2.9
>Reporter: Yonik Seeley
> Fix For: 2.9.1
>
>
> http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing


[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Kay Kay (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated LUCENE-1257:


Attachment: (was: LUCENE-1257-FieldValueHitQueue.patch)

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_messages.patch, 
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Kay Kay (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated LUCENE-1257:


Attachment: LUCENE-1257-TopDocsCollector.patch

* FieldValueHitQueue
* TopDocsCollector
* TopScoreDocsCollector
* TopFieldHitsCollector


> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
> LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
> LUCENE-1257_BooleanFilter_Generics.patch, LUCENE-1257_messages.patch, 
> LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_search.patch, LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know
> Java 5 migration was at one point planned for 2.1, but I don't know
> when it is planned now. This patch against the trunk includes:
> - the most obvious generics usage (there are tons of usages of sets, ... Those
> which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is, in my opinion, much more readable with those features (you
> actually *know* what is stored in collections when reading the code, without
> needing to look up field definitions every time) and it simplifies many
> algorithms.
> Note that this patch also includes an interface for the Query class. This has
> been done for my company's needs for building custom Query classes which add
> some behaviour to the base Lucene queries. It prevents multiple unnecessary
> casts. I know this introduction is not wanted by the team, but it really
> makes our developments easier to maintain. If you don't want to use this,
> replace all /Queriable/ calls with standard /Query/.
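
To make the listed changes concrete, here is a minimal before/after sketch of the kind of rewrite the description refers to. It is not taken from any of the LUCENE-1257 patches; the class and method names (Java5Sketch, sumRaw, sumGeneric) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from the patches.
public class Java5Sketch {

  // Pre-Java 5 style: raw List, indexed loop, explicit cast and unboxing.
  @SuppressWarnings("rawtypes")
  static int sumRaw(List values) {
    int sum = 0;
    for (int i = 0; i < values.size(); i++) {
      Integer boxed = (Integer) values.get(i); // reader must know what is stored
      sum += boxed.intValue();
    }
    return sum;
  }

  // Java 5 style: generic List, for-each loop, auto-unboxing.
  static int sumGeneric(List<Integer> values) {
    int sum = 0;
    for (int value : values) {
      sum += value;
    }
    return sum;
  }

  public static void main(String[] args) {
    List<Integer> values = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
    System.out.println(sumRaw(values) + " " + sumGeneric(values)); // prints: 6 6
  }
}
```

Both methods compute the same result; the generic version states the element type in the signature, which is the readability gain the description claims.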

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Kay Kay (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated LUCENE-1257:


Attachment: LUCENE-1257-FieldValueHitQueue.patch

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldValueHitQueue.patch, 
> LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, 
> LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, 
> LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, 
> LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
> LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
>

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Kay Kay (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay updated LUCENE-1257:


Attachment: LUCENE-1257-FieldCacheImpl.patch

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, 
> LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, 
> LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, 
> LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
> LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
>

Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Michael McCandless

On Mon, Oct 19, 2009 at 4:00 PM, Yonik Seeley
 wrote:
> On Mon, Oct 19, 2009 at 3:45 PM, Mark Miller  wrote:
>> but there is some old source code here and
>> there that really bugs me
>
> Is it Doug's
>
>  if (foo)
>     bar()
>  else {
>    baz();
>  }
>
> or is it my single line
>
>  if (a==null) return 0;
>
> ;-)

Or my always doing this up until a while ago:

  if (foo)
    something;

but then suddenly [trying to] switch to the correct:

  if (foo) {
    something;
  }

?

> One of my personal pet peeves is more indentation than necessary for
> large blocks of code, rather than just immediately handling the
> exception cases and escaping. Example:

Hmm I think I tend to do this :)

But I agree, your way IS more readable so I'll try to switch!

Mike


Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Yonik Seeley

On Mon, Oct 19, 2009 at 3:45 PM, Mark Miller  wrote:
> but there is some old source code here and
> there that really bugs me

Is it Doug's

  if (foo)
     bar()
  else {
    baz();
  }

or is it my single line

  if (a==null) return 0;

;-)

One of my personal pet peeves is more indentation than necessary for
large blocks of code, rather than just immediately handling the
exception cases and escaping. Example:

void doSomething(MyObj obj) {
  if (obj != null) {
    // at this point, I'm wondering... hmmm, is there code that executes
    // *after* this huge "if" in the event that obj is null?
    [...]
    // same with this one... ya gotta go and try to match up braces
    // to see if there is code that executes in the opposite case...
    // and if it also falls through to execute the obj==null case or
    // simply returns.
    if (some other condition) {
      [ tons of code ]
      [ tons of code ]
    }
  }
}

A much more readable version (regardless of if one likes the
single-line syntax or not):

void doSomething(MyObj obj) {
  if (obj==null) return;  // immediately obvious handling of the exception case
  [...]
  // again, immediately obvious how the exception case was handled
  if (!some other condition) return;

  [ tons of code ]
  [ tons of code ]
}
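
For a version of the same contrast that actually compiles, here is a minimal sketch. The class and method names (GuardClauseSketch, greetNested, greetFlat) and the string results are invented for illustration; the example above only uses placeholders.

```java
// Hypothetical illustration of nested conditions versus early returns.
public class GuardClauseSketch {

  // Nested version: the reader has to match braces to learn what
  // happens when name is null or empty.
  static String greetNested(String name) {
    String result = "skipped";
    if (name != null) {
      if (name.length() != 0) {
        result = "hello " + name;
      }
    }
    return result;
  }

  // Early-return version: the exceptional cases are handled up front,
  // and the main path is not indented.
  static String greetFlat(String name) {
    if (name == null) return "skipped";
    if (name.length() == 0) return "skipped";
    return "hello " + name;
  }

  public static void main(String[] args) {
    System.out.println(greetNested("yonik"));
    System.out.println(greetFlat(null));
  }
}
```

The two methods behave identically; only the second makes the handling of the null and empty cases visible without scanning to the end of the method.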

-Yonik
http://www.lucidimagination.com


[jira] Commented: (LUCENE-1996) EnwikiContentSource isn't thread safe

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767492#action_12767492
 ] 

Michael McCandless commented on LUCENE-1996:


That IS really crazy.

> EnwikiContentSource isn't thread safe
> -
>
> Key: LUCENE-1996
> URL: https://issues.apache.org/jira/browse/LUCENE-1996
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/benchmark
>Reporter: Michael McCandless
>Priority: Minor
> Fix For: 3.1
>
>
> When I run this alg:
> {code}
> analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
> content.source=org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource
> docs.file=/x/lucene/enwiki-20090724-pages-articles.xml.bz2
> doc.tokenized = false
> ram.flush.mb=32.0
> doc.stored = false
> doc.term.vector = false
> log.step.AddDoc=1
> directory=FSDirectory
> autocommit=false
> compound=false
> work.dir=/lucene/work.wiki.nd0.02M
> { "BuildIndex"
>   - CreateIndex
>   [ { "AddDocs" AddDoc > : 1 } : 2
>   - CloseIndex
> }
> RepSumByPrefRound BuildIndex
> {code}
> I hit exceptions in each thread like this:
> {code}
> Exception in thread "Thread-2" java.lang.RuntimeException: 
> org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" 
> associated with an  element type  "mdiiki".
>   at 
> org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:189)
>   at java.lang.Thread.run(Thread.java:613)
> Caused by: org.xml.sax.SAXParseException: Open quote is expected for 
> attribute "msxi" associated with an  element type  "mdiiki".
>   at 
> com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
>   at 
> com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1441)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:802)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:578)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:222)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(XMLNSDocumentScannerImpl.java:779)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1794)
>   at 
> com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
>   at 
> com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
>   at 
> com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
>   at 
> org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:166)
>   ... 1 more
> {code}


Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Michael McCandless

On Mon, Oct 19, 2009 at 3:11 PM, Mark Miller  wrote:
> Uwe Schindler (JIRA) wrote:
>>
>> And please: next time when we deprecate APIs: remove all deprecated calls 
>> from tests and contrib and mark all deprecated-test as such!
>>
>>
> It's the nature of open source. Each of us takes the work that other
> contributors are willing/able/have time to provide - and fill in the rest
> ourselves or decide it's too much work and don't. I agree that it's a nice
> idea, but I don't think the issue is going away so easily myself ;) In
> which case it falls to the poor soul who decides to help later and
> remove the deprecated methods. Or perhaps it keeps someone from stepping
> up and doing that - nature of the beast.

I do agree this is the nature of the beast.

Also, thinking more about it... I think a good approach, for an issue
with a large number of deprecations, might be to open a separate issue
to fix the deprecations in contrib/test, and fix it after some delay.
This way we confirm that deprecated usage of the APIs is working, for
at least some time, before removing them all from the tests.

E.g., in LUCENE-1458 I waited until quite late to cut over usage to the flex API.
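
A minimal sketch of that idea, assuming a hypothetical utility with an old and a new method (none of these names are Lucene APIs): the deprecated method stays in place and delegates to its replacement, and a check keeps exercising the deprecated path until the later cleanup issue removes it.

```java
// Hypothetical illustration only; not taken from Lucene.
public class DeprecationSketch {

  /** New API: counts whitespace-separated tokens. */
  static int countTokens(String text) {
    String trimmed = text.trim();
    return trimmed.length() == 0 ? 0 : trimmed.split("\\s+").length;
  }

  /** Old API, kept for back-compat; delegates to the new one. */
  @Deprecated
  static int numTokens(String text) {
    return countTokens(text);
  }

  // Check that deliberately still calls the deprecated method, so a
  // break in the old path is noticed before the method is removed.
  @SuppressWarnings("deprecation")
  static boolean deprecatedPathStillWorks() {
    return numTokens("a b  c") == countTokens("a b  c");
  }

  public static void main(String[] args) {
    System.out.println(deprecatedPathStillWorks());
  }
}
```

Once the separate cleanup issue is handled, the check and the deprecated method would be deleted together.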

> But as long as we are making such requests, please no one commit any
> more funky source formatting either :) It hurts my eyes.

+1!

Mike


[jira] Commented: (LUCENE-1996) EnwikiContentSource isn't thread safe

2009-10-19 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767487#action_12767487
 ] 

Mark Miller commented on LUCENE-1996:
-

The scary part is that it's been around for some time and we both independently
hit it today ... quantum mechanics in action I guess ...


[jira] Resolved: (LUCENE-1996) EnwikiContentSource isn't thread safe

2009-10-19 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1996.


Resolution: Duplicate

Duh, yes, dup.  Must read email before opening issues ;)


Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Mark Miller

Uwe Schindler wrote:
>> Uwe Schindler (JIRA) wrote:
>> 
>>> And please: next time when we deprecate APIs: remove all deprecated
>>>   
>> calls from tests and contrib and mark all deprecated-test as such!
>> 
>>>   
>> It's the nature of open source. Each of us takes the work that other
>> contributors are willing/able/have time to provide - and fill in the rest
>> ourselves or decide it's too much work and don't. I agree that it's a nice
>> idea, but I don't think the issue is going away so easily myself ;) In
>> which case it falls to the poor soul who decides to help later and
>> remove the deprecated methods. Or perhaps it keeps someone from stepping
>> up and doing that - nature of the beast.
>> 
>
> Sorry, I was disappointed and somehow angry because nothing worked as
> expected when I removed the deprecated parts. I fixed one thing and 5 other
> problems appeared.
>   
Ha - no reason to be sorry - I agree it would be nice - just saying good
luck getting everyone to fall in line in the future :)
>   
>> But as long as we are making such requests, please no one commit any
>> more funky source formatting either :) It hurts my eyes.
>> 
>
> What was funky?
>
> I think I should stop working today and do something else...
>   
Ha again :) I actually reworded that because the first time I wrote it I
thought it sounded like I was saying you did it - guess I failed :) I
was commenting in general, not about you - I don't think anything too bad
has gotten in in some time - but there is some old source code here and
there that really bugs me - totally unrelated to your comment - just
adding a wish of my own - no more ugly source code :) !
> Uwe
>
>
>
>   


-- 
- Mark

http://www.lucidimagination.com





[jira] Assigned: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2009-10-19 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned LUCENE-1486:
---

Assignee: (was: Mark Miller)

> Wildcards, ORs etc inside Phrase queries
> 
>
> Key: LUCENE-1486
> URL: https://issues.apache.org/jira/browse/LUCENE-1486
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: QueryParser
>Affects Versions: 2.4
>Reporter: Mark Harwood
>Priority: Minor
> Fix For: 3.0, 3.1
>
> Attachments: ComplexPhraseQueryParser.java, 
> junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch, Lucene-1486 non default 
> field.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, TestComplexPhraseQuery.java
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>   checkMatches("\"j*   smyth~\"", "1,2"); // wildcards and fuzzies are OK in phrases
>   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
>   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works.
>
>   checkBadQuery("\"jo*  id:1 smith\""); // mixing fields in a phrase is bad
>   checkBadQuery("\"jo* \"smith\" \""); // phrases inside phrases is bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \""); // range queries inside phrases not supported
> Code plus Junit test to follow...


RE: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler

> Uwe Schindler (JIRA) wrote:
> >
> > And please: next time when we deprecate APIs: remove all deprecated
> calls from tests and contrib and mark all deprecated-test as such!
> >
> >
> It's the nature of open source. Each of us takes the work that other
> contributors are willing/able/have time to provide - and fill in the rest
> ourselves or decide it's too much work and don't. I agree that it's a nice
> idea, but I don't think the issue is going away so easily myself ;) In
> which case it falls to the poor soul who decides to help later and
> remove the deprecated methods. Or perhaps it keeps someone from stepping
> up and doing that - nature of the beast.

Sorry, I was disappointed and somehow angry because nothing worked as
expected when I removed the deprecated parts. I fixed one thing and 5 other
problems appeared.

> But as long as we are making such requests, please no one commit any
> more funky source formatting either :) It hurts my eyes.

What was funky?

I think I should stop working today and do something else...

Uwe



[jira] Commented: (LUCENE-1996) EnwikiContentSource isn't thread safe

2009-10-19 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767474#action_12767474
 ] 

Mark Miller commented on LUCENE-1996:
-

dupe? LUCENE-1994


[jira] Resolved: (LUCENE-1955) Fix Hits deprecation notice

2009-10-19 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved LUCENE-1955.
-

Resolution: Fixed

> Fix Hits deprecation notice
> ---
>
> Key: LUCENE-1955
> URL: https://issues.apache.org/jira/browse/LUCENE-1955
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Javadocs
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 2.9.1
>
>
> Just needs to be committed to 2.9 branch since hits is now removed.


[jira] Commented: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing

2009-10-19 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767467#action_12767467
 ] 

Yonik Seeley commented on LUCENE-1995:
--

The code at the point of the exception uses a signed shift instead of an 
unsigned one, but that shouldn't matter unless the buffer pool is huge?
Aaron, what are your index settings (e.g. ramBufferSizeMB)?
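
To illustrate the signed-versus-unsigned point, here is a minimal Java sketch. It assumes a hypothetical pool that addresses bytes with a single int and selects the buffer from the upper bits; the class name and BUFFER_SHIFT = 15 are made up for illustration and are not taken from the report. For ordinary addresses both shifts agree; they only differ once the address has wrapped past Integer.MAX_VALUE, which is why the pool would have to be huge.

```java
class ShiftDemo {
    // Hypothetical: 32 KB buffers, so the upper bits of an address select the buffer.
    static final int BUFFER_SHIFT = 15;

    /** Buffer index computed with a signed (arithmetic) shift. */
    public static int signedIndex(int address) {
        return address >> BUFFER_SHIFT;
    }

    /** Buffer index computed with an unsigned (logical) shift. */
    public static int unsignedIndex(int address) {
        return address >>> BUFFER_SHIFT;
    }

    public static void main(String[] args) {
        int small = 5 << BUFFER_SHIFT;   // an address inside buffer 5
        int huge = Integer.MIN_VALUE;    // what an address of 2^31 wraps to in an int

        // Both shifts agree for non-negative addresses: prints "5 5"
        System.out.println(signedIndex(small) + " " + unsignedIndex(small));
        // They diverge once the sign bit is set: prints "-65536 65536"
        System.out.println(signedIndex(huge) + " " + unsignedIndex(huge));
    }
}
```

A negative result from the signed variant, used as an array index, would surface as an ArrayIndexOutOfBoundsException.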


> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-1995
> URL: https://issues.apache.org/jira/browse/LUCENE-1995
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Affects Versions: 2.9
>Reporter: Yonik Seeley
> Fix For: 2.9.1
>
>
> http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing


[jira] Created: (LUCENE-1996) EnwikiContentSource isn't thread safe

2009-10-19 Thread Michael McCandless (JIRA)

EnwikiContentSource isn't thread safe
-

 Key: LUCENE-1996
 URL: https://issues.apache.org/jira/browse/LUCENE-1996
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/benchmark
Reporter: Michael McCandless
Priority: Minor
 Fix For: 3.1


When I run this alg:
{code}
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer

content.source=org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource
docs.file=/x/lucene/enwiki-20090724-pages-articles.xml.bz2
doc.tokenized = false
ram.flush.mb=32.0


doc.stored = false
doc.term.vector = false
log.step.AddDoc=1

directory=FSDirectory
autocommit=false
compound=false

work.dir=/lucene/work.wiki.nd0.02M

{ "BuildIndex"
  - CreateIndex
  [ { "AddDocs" AddDoc > : 1 } : 2
  - CloseIndex
}

RepSumByPrefRound BuildIndex
{code}

I hit exceptions in each thread like this:

{code}
Exception in thread "Thread-2" java.lang.RuntimeException: org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" associated with an  element type  "mdiiki".
    at org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:189)
    at java.lang.Thread.run(Thread.java:613)
Caused by: org.xml.sax.SAXParseException: Open quote is expected for attribute "msxi" associated with an  element type  "mdiiki".
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1441)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:802)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(XMLNSDocumentScannerImpl.java:578)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:222)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(XMLNSDocumentScannerImpl.java:779)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1794)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1242)
    at org.apache.lucene.benchmark.byTask.feeds.EnwikiContentSource$Parser.run(EnwikiContentSource.java:166)
    ... 1 more
{code}


[jira] Resolved: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery

2009-10-19 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved LUCENE-1929.
-

   Resolution: Fixed
Lucene Fields: [New, Patch Available]  (was: [New])

> Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
> --
>
> Key: LUCENE-1929
> URL: https://issues.apache.org/jira/browse/LUCENE-1929
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/highlighter
>Affects Versions: 2.9
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 2.9.1
>
> Attachments: LUCENE-1929.patch
>
>
> Sucks. Will throw a NullPointer exception. 
> Only NumericRangeQuery will throw the exception.
> RangeQuery just won't highlight.


[jira] Created: (LUCENE-1995) ArrayIndexOutOfBoundsException during indexing

2009-10-19 Thread Yonik Seeley (JIRA)

ArrayIndexOutOfBoundsException during indexing
--

 Key: LUCENE-1995
 URL: https://issues.apache.org/jira/browse/LUCENE-1995
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.9
Reporter: Yonik Seeley
 Fix For: 2.9.1


http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing


[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767463#action_12767463
 ] 

Robert Muir commented on LUCENE-1987:
-

bq. The problem is that this is not very different from saying "the onus is on 
the user to call the setXYZ method to get back to the old buggy behavior", 
which at least last time we discussed back-compat was controversial (ie, it's a 
change to our drop-in back-compat policy).

Michael, yes I agree with you. What I am wondering is: is it really working in 
practice/in spirit? Forcing the user to supply the version, well it does make 
them look at the warning in the Version class, which is good.  But nothing 
stops them from just using CURRENT.

{noformat}
Use this to get the latest & greatest settings, bug fixes, etc, for Lucene.
{noformat}

followed by the big bold warning about backwards compatibility. Just curious 
what most users are doing: sacrificing drop-in for "latest and greatest"?
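
As a rough illustration of what the Version argument buys, here is a minimal Java sketch of a version-gated default. The Version enum and VersionedAnalyzer class below are stand-ins written for this example, not the real Lucene classes; the only behavior taken from this thread is that stop-word position increments are respected in 2.9 mode and not in 2.4 mode.

```java
class VersionGate {
    /** Stand-in for Lucene's Version constants; ordering follows declaration order. */
    public enum Version {
        LUCENE_24, LUCENE_29, LUCENE_CURRENT;

        public boolean onOrAfter(Version other) {
            return compareTo(other) >= 0;
        }
    }

    /** Hypothetical analyzer whose stop-word handling depends on the requested version. */
    public static class VersionedAnalyzer {
        public final boolean enablePositionIncrements;

        public VersionedAnalyzer(Version matchVersion) {
            // Old behavior before 2.9, fixed behavior from 2.9 onwards.
            this.enablePositionIncrements = matchVersion.onOrAfter(Version.LUCENE_29);
        }
    }

    public static void main(String[] args) {
        // Pinned to 2.4: keeps the old behavior, so an existing index stays usable. Prints "false".
        System.out.println(new VersionedAnalyzer(Version.LUCENE_24).enablePositionIncrements);
        // CURRENT: always the newest behavior, which may change under the user on upgrade. Prints "true".
        System.out.println(new VersionedAnalyzer(Version.LUCENE_CURRENT).enablePositionIncrements);
    }
}
```

The sketch also shows the concern raised here: a caller who always passes the newest constant gets no compatibility protection from the parameter at all.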

I do think we should, at some point, do things to improve the contrib analyzers 
that are still stuck with this buggy behavior, e.g. LUCENE-1373.
But maybe we don't need the Version with contrib analyzers, since you should be 
able to use an older lucene-analyzers jar file with a new Lucene if you want the 
back compat.

(sorry to stray somewhat off-topic)


> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-1987-StopFilter-backport29.patch, 
> LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch
>
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still 
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
> are deprecated, but we still have the VERSION constants. I do not know how to 
> proceed. Keep the settings alive for index compatibility? Or remove them 
> together with the version constants (which were undeprecated)?


Re: [jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Mark Miller

Uwe Schindler (JIRA) wrote:
>
> And please: next time when we deprecate APIs: remove all deprecated calls 
> from tests and contrib and mark all deprecated-test as such!
>
>   
It's the nature of open source. Each of us takes the work that other
contributors are willing/able/have time to provide - and fills in the rest
ourselves, or decides it's too much work and doesn't. I agree that it's a nice
idea, but I don't think the issue is going away so easily myself ;) In
which case it falls to the poor soul who decides to help later and
remove the deprecated methods. Or perhaps it keeps someone from stepping
up and doing that - nature of the beast.

But as long as we are making such requests, please no one commit any
more funky source formatting either :) It hurts my eyes.

-- 
- Mark

http://www.lucidimagination.com





[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767453#action_12767453
 ] 

Robert Muir commented on LUCENE-1987:
-

bq. Ugh, this is because they embed StopFilter, right? One option might be to 
simply keep StopFilter's deprecated static methods for setting the default? 
Though I think adding Version to them over time is the right thing to do 
(though more work, today).

Not just this. Many use StandardTokenizer, so they have the same issues (invalid 
acronyms, etc.) that StandardAnalyzer has. But this versioning is all managed at 
the StandardAnalyzer level (system properties, version numbers, etc.)... when it 
also affects these other analyzers.



[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767450#action_12767450
 ] 

Michael McCandless commented on LUCENE-1987:


bq. maybe the default should really be LUCENE_CURRENT, and if you want the back 
compat-buggy behavior, the onus is on you as the user to set the flag right if 
you don't want to reindex?

The problem is that this is not very different from saying "the onus is on the 
user to call the setXYZ method to get back to the old buggy behavior", which 
at least last time we discussed back-compat was controversial (i.e., it's a 
change to our drop-in back-compat policy).



Re: 2.9.1

2009-10-19 Thread Michael McCandless

OK, so now we're up to three 2.9.1 issues to be resolved.

Mike

On Mon, Oct 19, 2009 at 1:56 PM, Uwe Schindler  wrote:
> Please wait and look at https://issues.apache.org/jira/browse/LUCENE-1987
>
> We have some inconsistencies between QueryParser and the new
> StandardAnalyzer with stop word posIncr.
>
> There is also a patch for 2.9 there!
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -Original Message-
>> From: Michael McCandless [mailto:luc...@mikemccandless.com]
>> Sent: Monday, October 19, 2009 6:03 PM
>> To: java-dev@lucene.apache.org; yo...@lucidimagination.com
>> Subject: Re: 2.9.1
>>
>> On Mon, Oct 19, 2009 at 11:54 AM, Yonik Seeley
>>  wrote:
>> > On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless
>> >  wrote:
>> >> I can cut the 2.9.1 release, but... should we wait a bit to see
>> >> whether other issues come up?  Or do it, now?
>> >
>> > Other issues came up, and were quickly fixed - nice job guys!.
>> > I don't see anything else serious lurking about... seems like the
>> > 2.9.1 release process could be started soon?
>>
>> +1, I'll try to get an RC out tomorrow.
>>
>> Mike


[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767449#action_12767449
 ] 

Michael McCandless commented on LUCENE-1987:


bq. All contrib analyzers have stopWordPosIncr turned off (backwards 
compatibility). Maybe we need a Version Parameter in all analyzers there too!

Ugh, this is because they embed StopFilter, right?  One option might be to 
simply keep StopFilter's deprecated static methods for setting the default?  
Though I think adding Version to them over time is the right thing to do 
(though more work, today).

bq. benchmark does not work any longer, because StandardAnalyzer has no default 
ctor anymore and cannot be instantiated by reflection, same with StopAnalyzer

When the no-arg ctor is unavailable, can we fall back to looking for a ctor that 
takes Version?  For now we should just pass LUCENE_CURRENT; a future 
enhancement to benchmark can allow specifying version compat.
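
The fallback suggested above could look roughly like the following Java sketch. It is self-contained, so Version, NoArgAnalyzer and VersionedAnalyzer are stand-ins invented for the example rather than the real Lucene or benchmark classes, and AnalyzerFactory.create is a hypothetical helper, not an existing benchmark method.

```java
import java.lang.reflect.Constructor;

class AnalyzerFactory {
    /** Stand-in for Lucene's Version constants. */
    public enum Version { LUCENE_24, LUCENE_29, LUCENE_CURRENT }

    /** Stand-in for an analyzer that still has a no-arg ctor. */
    public static class NoArgAnalyzer {
        public NoArgAnalyzer() {}
    }

    /** Stand-in for an analyzer that can only be built with a Version. */
    public static class VersionedAnalyzer {
        public final Version matchVersion;

        public VersionedAnalyzer(Version matchVersion) {
            this.matchVersion = matchVersion;
        }
    }

    /** Try the no-arg ctor first; if there is none, fall back to a ctor taking Version. */
    public static Object create(Class<?> clazz) {
        try {
            try {
                return clazz.getConstructor().newInstance();
            } catch (NoSuchMethodException noDefaultCtor) {
                Constructor<?> ctor = clazz.getConstructor(Version.class);
                // For now always pass LUCENE_CURRENT, as proposed above.
                return ctor.newInstance(Version.LUCENE_CURRENT);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create(NoArgAnalyzer.class).getClass().getSimpleName());
        System.out.println(((VersionedAnalyzer) create(VersionedAnalyzer.class)).matchVersion);
    }
}
```

Making the version configurable later would only mean replacing the hard-coded constant with a value read from the benchmark properties.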

bq. The default of QueryParser is to ignore position increments, but the 
current version of StandardAnalyzer uses posIncr for stop words

Hmm.  How about adding Version to QP ctor?

bq. And please: next time when we deprecate APIs: remove all deprecated calls 
from tests and contrib and mark all deprecated-test as such!

OK, I agree.  I'll try to do this in the future!




[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767435#action_12767435
 ] 

Robert Muir commented on LUCENE-1987:
-

bq. All contrib analyzers have stopWordPosIncr turned off (backwards 
compatibility). Maybe we need a Version Parameter in all analyzers there too! 

Personally I would not be against this, though I'm not sure yet... The downside 
would be more complexity and maintenance.
The upside would be that we could improve these analyzers in various ways without 
annoying users.

bq. benchmark does not work any longer, because StandardAnalyzer has no default 
ctor anymore and cannot be instantiated by reflection, same with StopAnalyzer 

I also personally like having a default ctor... it's convenient and nice to be 
able to look at what these analyzers do in Luke, etc.
But I think this goes against the version flag concept? (Because if users just 
set it to LUCENE_CURRENT, then it's doing nothing?)
But I wonder if users do this anyway... maybe the default should really be 
LUCENE_CURRENT, and if you want the back compat-buggy behavior, the onus is on 
you as the user to set the flag right if you don't want to reindex?





[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1987:
--

Fix Version/s: 2.9.1



RE: 2.9.1

2009-10-19 Thread Uwe Schindler

Please wait and look at https://issues.apache.org/jira/browse/LUCENE-1987

We have some inconsistencies between QueryParser and the new
StandardAnalyzer with stop word posIncr.

There is also a patch for 2.9 there!

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Monday, October 19, 2009 6:03 PM
> To: java-dev@lucene.apache.org; yo...@lucidimagination.com
> Subject: Re: 2.9.1
> 
> On Mon, Oct 19, 2009 at 11:54 AM, Yonik Seeley
>  wrote:
> > On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless
> >  wrote:
> >> I can cut the 2.9.1 release, but... should we wait a bit to see
> >> whether other issues come up?  Or do it, now?
> >
> > Other issues came up, and were quickly fixed - nice job guys!.
> > I don't see anything else serious lurking about... seems like the
> > 2.9.1 release process could be started soon?
> 
> +1, I'll try to get an RC out tomorrow.
> 
> Mike




[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1987:
--

Attachment: LUCENE-1987-StopFilter-backport29.patch
LUCENE-1987-StopFilter-BW.patch
LUCENE-1987-StopFilter.patch

Here are 2 mega patches and one backport to 2.9 (want to get this in before 2.9.1):

All core tests pass, all bw tests pass. Most contrib tests also pass, but we 
have the following problems and inconsistencies:

- benchmark does not work any longer, because StandardAnalyzer has no default 
ctor anymore and cannot be instantiated by reflection, same with StopAnalyzer
- Highlighter only works if StandardAnalyzer is in 2.4 mode; in 2.9 mode 
(current) it fails because the position increments of stop words are not 
correctly respected. This fails in addition/combination with the following:
- Very bad inconsistency: The default of QueryParser is to ignore position 
increments, but the current version of StandardAnalyzer uses posIncr for stop 
words -> bang. We should change the default for QueryParser (+ contrib QP), too. 
There is much rework needed and much documentation. The tests in core now 
pass, as most parts use StandardAnalyzer in 2.9 mode but have no stop words. 
And the special tests explicitly set the posIncr flag. This is totally 
messed up, it needs fixing! (It also affects 2.9.0, if somebody uses the new 
StandardAnalyzer with LUCENE_CURRENT.) 
- XMLQueryParser also fails with the latest StandardAnalyzer version, because it 
cannot set the flag in QueryParser. In my opinion, the query parser should take 
the flag from the analyzer, but this is not easy to fix.
- All contrib analyzers have stopWordPosIncr turned off (backwards 
compatibility). Maybe we need a Version Parameter in all analyzers there too!
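
The QueryParser/StandardAnalyzer inconsistency in the list above can be pictured with a small Java sketch. It is a toy model, not Lucene code: the StopWordPositions class, its stop-word set and the positions helper are assumptions made for illustration. It computes term positions for the same text once with stop-word position increments respected (the index side in 2.9 mode) and once with them ignored (the QueryParser default).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class StopWordPositions {
    // Tiny illustrative stop-word list, not Lucene's real one.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("a", "of", "the"));

    /**
     * Maps each kept term to its position. With enablePosIncr, a removed stop
     * word leaves a gap; without it, the remaining terms are packed together.
     * (A repeated term keeps only its last position; enough for this sketch.)
     */
    public static Map<String, Integer> positions(String text, boolean enablePosIncr) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int position = -1;
        int increment = 1;  // increment to apply to the next kept token
        for (String token : text.toLowerCase().split("\\s+")) {
            if (STOP_WORDS.contains(token)) {
                if (enablePosIncr) {
                    increment++;  // remember the hole left by the stop word
                }
                continue;
            }
            position += increment;
            increment = 1;
            result.put(token, position);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(positions("wizard of oz", true));   // prints {wizard=0, oz=2}
        System.out.println(positions("wizard of oz", false));  // prints {wizard=0, oz=1}
    }
}
```

In this model the indexed text has "oz" two positions after "wizard", while a phrase query built with increments ignored expects it one position after, so an exact phrase match fails even though both sides used the same stop-word list.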

What to do? After this StopFilter/StandardAnalyzer-hell-day, Aspirin, 
Paracetamol and beer are not enough to think clearly again...

And please: next time when we deprecate APIs: remove all deprecated calls from 
tests and contrib and mark all deprecated-test as such!

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter-backport29.patch, 
> LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch
>
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still 
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
> are deprecated, but we still have the VERSION constants. I do not know how to 
> proceed. Keep the settings alive for index compatibility? Or remove them 
> together with the version constants (which were undeprecated)?


[jira] Commented: (LUCENE-1955) Fix Hits deprecation notice

2009-10-19 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767397#action_12767397
 ] 

Mark Miller commented on LUCENE-1955:
-

Sorry again ;) I'm slowing everything up - feel free - if you don't, I'll do it 
when I commit the Highlighter fix in a bit. Just have to throw my noisy laptop 
out the window and into a brick wall first ...

> Fix Hits deprecation notice
> ---
>
> Key: LUCENE-1955
> URL: https://issues.apache.org/jira/browse/LUCENE-1955
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Javadocs
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 2.9.1
>
>
> Just needs to be committed to 2.9 branch since hits is now removed.


[jira] Commented: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery

2009-10-19 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767394#action_12767394
 ] 

Mark Miller commented on LUCENE-1929:
-

Yeah - sorry - has been for some time. I can commit it shortly.

> Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
> --
>
> Key: LUCENE-1929
> URL: https://issues.apache.org/jira/browse/LUCENE-1929
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/highlighter
>Affects Versions: 2.9
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 2.9.1
>
> Attachments: LUCENE-1929.patch
>
>
> Sucks. Will throw a NullPointer exception. 
> Only NumericRangeQuery will throw the exception.
> RangeQuery just won't highlight.


[jira] Commented: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767392#action_12767392
 ] 

Michael McCandless commented on LUCENE-1929:


Mark, is this one ready to go into 2.9.1?

> Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
> --
>
> Key: LUCENE-1929
> URL: https://issues.apache.org/jira/browse/LUCENE-1929
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/highlighter
>Affects Versions: 2.9
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 2.9.1
>
> Attachments: LUCENE-1929.patch
>
>
> Sucks. Will throw a NullPointer exception. 
> Only NumericRangeQuery will throw the exception.
> RangeQuery just won't highlight.


[jira] Commented: (LUCENE-1955) Fix Hits deprecation notice

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767391#action_12767391
 ] 

Michael McCandless commented on LUCENE-1955:


Mark, do you want to commit this?  Or I can.  I want to cut an RC tomorrow...

> Fix Hits deprecation notice
> ---
>
> Key: LUCENE-1955
> URL: https://issues.apache.org/jira/browse/LUCENE-1955
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Javadocs
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 2.9.1
>
>
> Just needs to be committed to 2.9 branch since hits is now removed.


[jira] Commented: (LUCENE-1986) NPE in NearSpansUnordered from PayloadNearQuery

2009-10-19 Thread Peter Keegan (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767373#action_12767373
 ] 

Peter Keegan commented on LUCENE-1986:
--

+  if (!more) {
+return false;
+  }
I was about to submit this same patch today, but I see you beat me to it :) 
Thanks Mark.

> NPE in NearSpansUnordered from PayloadNearQuery
> ---
>
> Key: LUCENE-1986
> URL: https://issues.apache.org/jira/browse/LUCENE-1986
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Search
>Affects Versions: 2.9
>Reporter: Peter Keegan
>Assignee: Michael McCandless
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-1986.patch, LUCENE-1986.patch, 
> TestPayloadNearQuery1.java
>
>
> The following query causes an NPE in NearSpansUnordered, and is reproducible 
> with the attached unit test. The failure occurs on the last document 
> scored.


Re: [jira] Commented: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks

2009-10-19 Thread Mark Miller

I don't think some of the stat tracking works right with parallel either
- to get the total time, it's adding up when each thread finished - e.g. if
thread one finishes at second 30 and thread two at second 32, it's saying it
took 62 seconds total.

   [java] > algorithm:
 [java] Seq {
 [java]     Rounds_2 {
 [java]         ResetSystemErase
 [java]         Populate {
 [java]             CreateIndex
 [java]             Par_8 [
 [java]                 MAddDocs_2500 {
 [java]                     AddDoc
 [java]                 } * 2500
 [java]             ] * 8
 [java]             Optimize
 [java]             CommitIndex
 [java]             CloseIndex
 [java]         }
 [java]         RepSumByPref MAddDocs
 [java]         NewRound
 [java]     } * 2
 [java]     RepSumByNameRound
 [java]     RepSumByName
 [java]     RepSumByPrefRound MAddDocs
 [java] }
 [java] > starting task: Seq
 [java] > starting task: Rounds_2
 [java] > starting task: ResetSystemErase
 [java] > starting task: Populate
 [java] 55.84 sec --> Thread-2 added 2000 docs
 [java] 60.94 sec --> Thread-6 added 2000 docs
 [java] 74.82 sec --> Thread-0 added 2000 docs
 [java] 77.48 sec --> Thread-3 added 2000 docs
 [java] 81.21 sec --> Thread-1 added 2000 docs
 [java] 90.72 sec --> Thread-5 added 2000 docs
 [java] 96.46 sec --> Thread-7 added 2000 docs
 [java] 97.17 sec --> Thread-4 added 2000 docs
 [java] > Report Sum By Prefix (MAddDocs) (1 about 8 out of 20016)
 [java] Operation      round  mrg  flush  cmpnd  runCnt  recsPerRun  rec/s  elapsedSec   avgUsedMem  avgTotalMem
 [java] MAddDocs_2500      0   20  48.00  false       8        2500  28.01      713.99  135,359,120  273,850,368
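
The discrepancy described above (summing per-thread finish times instead of taking the wall-clock span) can be sketched in a few lines. This is only an illustration under the assumption that all threads start at second 0; the class and method names are hypothetical and are not part of contrib/benchmark.

```java
// Hypothetical illustration; not code from contrib/benchmark.
final class ElapsedTime {

    /** Adds up the finish time of every thread (what the report appears to do). */
    static double summed(double[] finishSeconds) {
        double total = 0.0;
        for (double t : finishSeconds) {
            total += t;
        }
        return total;
    }

    /** Wall-clock time of the parallel run: the finish time of the slowest thread. */
    static double wallClock(double[] finishSeconds) {
        double max = 0.0;
        for (double t : finishSeconds) {
            max = Math.max(max, t);
        }
        return max;
    }

    public static void main(String[] args) {
        // The two-thread example from the message: finishes at second 30 and second 32.
        double[] finish = {30.0, 32.0};
        System.out.println("summed=" + summed(finish) + " wallClock=" + wallClock(finish));
    }
}
```

Under that assumption, rec/s computed from the summed figure understates throughput by roughly the number of threads.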

Shai Erera (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/LUCENE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767343#action_12767343
>  ] 
>
> Shai Erera commented on LUCENE-1994:
> 
>
> Yes I agree (to both comments). Basically for a ContentSource to be supported 
> by parallel tasks, its getNextDocData should be made synchronized, or it 
> finds another way to sync on the important stuff (for example 
> TrecContentSource).
>
>   
>> EnwikiConentSource does not work with parallel tasks
>> 
>>
>> Key: LUCENE-1994
>> URL: https://issues.apache.org/jira/browse/LUCENE-1994
>> Project: Lucene - Java
>>  Issue Type: Bug
>>  Components: contrib/benchmark
>>Affects Versions: 2.9
>>Reporter: Mark Miller
>>Priority: Minor
>>
>> 
>
>
>   


-- 
- Mark

http://www.lucidimagination.com





Re: 2.9.1

2009-10-19 Thread Michael McCandless

On Mon, Oct 19, 2009 at 11:54 AM, Yonik Seeley
 wrote:
> On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless
>  wrote:
>> I can cut the 2.9.1 release, but... should we wait a bit to see
>> whether other issues come up?  Or do it, now?
>
> Other issues came up, and were quickly fixed - nice job guys!
> I don't see anything else serious lurking about... seems like the
> 2.9.1 release process could be started soon?

+1, I'll try to get an RC out tomorrow.

Mike


Re: 2.9.1

2009-10-19 Thread Yonik Seeley

On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless
 wrote:
> I can cut the 2.9.1 release, but... should we wait a bit to see
> whether other issues come up?  Or do it, now?

Other issues came up, and were quickly fixed - nice job guys!
I don't see anything else serious lurking about... seems like the
2.9.1 release process could be started soon?

-Yonik
http://www.lucidimagination.com


[jira] Commented: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks

2009-10-19 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767343#action_12767343
 ] 

Shai Erera commented on LUCENE-1994:


Yes, I agree (to both comments). Basically, for a ContentSource to be supported 
by parallel tasks, its getNextDocData should be made synchronized, or it should 
find another way to sync on the important stuff (for example TrecContentSource).
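
A minimal sketch of the "make getNextDocData synchronized" idea, assuming a source that hands out documents from shared, non-thread-safe state. SharedDocSource and drainInParallel are hypothetical names for illustration only; the real ContentSource API has a different signature.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a content source; not the contrib/benchmark API.
final class SharedDocSource {
    private final Iterator<String> docs; // shared state, not thread-safe on its own

    SharedDocSource(List<String> docs) {
        this.docs = docs.iterator();
    }

    /** Returns the next document, or null when exhausted. Only one thread advances the iterator at a time. */
    synchronized String getNextDocData() {
        return docs.hasNext() ? docs.next() : null;
    }

    /** Drains the source with several threads and returns the distinct documents handed out. */
    static Set<String> drainInParallel(SharedDocSource source, int threads) {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                String doc;
                while ((doc = source.getNextDocData()) != null) {
                    seen.add(doc);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while draining", e);
            }
        }
        return seen;
    }
}
```

Without the synchronized keyword, two threads could interleave inside hasNext()/next() and skip, duplicate, or fail on a document.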

> EnwikiConentSource does not work with parallel tasks
> 
>
> Key: LUCENE-1994
> URL: https://issues.apache.org/jira/browse/LUCENE-1994
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/benchmark
>Affects Versions: 2.9
>Reporter: Mark Miller
>Priority: Minor
>



[jira] Commented: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks

2009-10-19 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767341#action_12767341
 ] 

Mark Miller commented on LUCENE-1994:
-

bq. I believe this was the original behavior of EnwikiDocMaker

Probably - but we should make it work right?

bq. But anyway, I think that if getNextDocData will be synchronized, this 
should do it?

That's actually what I did locally as a quick fix - seems to work out alright.

> EnwikiConentSource does not work with parallel tasks
> 
>
> Key: LUCENE-1994
> URL: https://issues.apache.org/jira/browse/LUCENE-1994
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/benchmark
>Affects Versions: 2.9
>Reporter: Mark Miller
>Priority: Minor
>



[jira] Commented: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks

2009-10-19 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767335#action_12767335
 ] 

Shai Erera commented on LUCENE-1994:


I believe this was the original behavior of EnwikiDocMaker. But anyway, I think 
that if getNextDocData will be synchronized, this should do it?

> EnwikiConentSource does not work with parallel tasks
> 
>
> Key: LUCENE-1994
> URL: https://issues.apache.org/jira/browse/LUCENE-1994
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/benchmark
>Affects Versions: 2.9
>Reporter: Mark Miller
>Priority: Minor
>



[jira] Created: (LUCENE-1994) EnwikiConentSource does not work with parallel tasks

2009-10-19 Thread Mark Miller (JIRA)

EnwikiConentSource does not work with parallel tasks


 Key: LUCENE-1994
 URL: https://issues.apache.org/jira/browse/LUCENE-1994
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/benchmark
Affects Versions: 2.9
Reporter: Mark Miller
Priority: Minor





[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767304#action_12767304
 ] 

Uwe Schindler commented on LUCENE-1987:
---

OK, I'll fix the tests using find/grep/sed :-)

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch
>
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still 
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
> are deprecated, but we still have the VERSION constants. Do not know, how to 
> proceed. Keep the settings alive for index compatibility? Or remove it 
> together with the version constants (which were undeprecated).


[jira] Created: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)

2009-10-19 Thread Christian Steinert (JIRA)

MoreLikeThis - allow to exclude terms that appear in too many documents (patch 
included)


 Key: LUCENE-1993
 URL: https://issues.apache.org/jira/browse/LUCENE-1993
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/*
Affects Versions: 2.9
Reporter: Christian Steinert
 Attachments: MoreLikeThis.java.patch

The MoreLikeThis class generates a likeness query based on a given 
document. So far, it is impossible to suppress words that appear in almost 
all documents from the likeness query, making it necessary to use extensive 
lists of stop words.

Therefore I suggest allowing words to be excluded when they exceed a certain 
absolute document count or a certain percentage of documents. Depending on 
the corpus of text, words that appear in more than 50 or even 70% of documents 
can usually be considered insignificant for classifying a document.  
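
The proposed filter can be sketched roughly as follows. This is not the attached patch; FrequentTermFilter, keepInformative, and the percentage parameter are hypothetical names chosen for illustration, and the document frequencies are supplied as a plain map rather than read from an index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the idea; not the MoreLikeThis.java.patch attachment.
final class FrequentTermFilter {

    /**
     * Keeps only the terms whose document frequency is at most
     * maxDocFreqPct percent of numDocs.
     */
    static List<String> keepInformative(Map<String, Integer> docFreqs, int numDocs, int maxDocFreqPct) {
        long maxDocFreq = (long) numDocs * maxDocFreqPct / 100; // absolute threshold
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : docFreqs.entrySet()) {
            if (entry.getValue() <= maxDocFreq) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Integer> docFreqs = new LinkedHashMap<>();
        docFreqs.put("the", 950);        // appears in 95% of 1000 docs
        docFreqs.put("lucene", 120);
        docFreqs.put("highlighter", 7);
        System.out.println(keepInformative(docFreqs, 1000, 50));
    }
}
```

An absolute document-count limit would work the same way, with the threshold passed in directly instead of derived from a percentage.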


[jira] Updated: (LUCENE-1993) MoreLikeThis - allow to exclude terms that appear in too many documents (patch included)

2009-10-19 Thread Christian Steinert (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Steinert updated LUCENE-1993:
---

Attachment: MoreLikeThis.java.patch

suggested patch against current SVN head

> MoreLikeThis - allow to exclude terms that appear in too many documents 
> (patch included)
> 
>
> Key: LUCENE-1993
> URL: https://issues.apache.org/jira/browse/LUCENE-1993
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: Christian Steinert
> Attachments: MoreLikeThis.java.patch
>
>   Original Estimate: 0.17h
>  Remaining Estimate: 0.17h
>
> The MoreLikeThis class generates a likeness query based on a given 
> document. So far, it is impossible to suppress words that appear in almost 
> all documents from the likeness query, making it necessary to use extensive 
> lists of stop words.
> Therefore I suggest allowing words to be excluded when they exceed a certain 
> absolute document count or a certain percentage of documents. Depending on 
> the corpus of text, words that appear in more than 50 or even 70% of 
> documents can usually be considered insignificant for classifying a document. 
>  


[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767299#action_12767299
 ] 

Michael McCandless commented on LUCENE-1987:


bq. I did not remove the other version constants, because then we have them and 
can use them anywhere else. And a user coming from e.g. 2.2 to 3.0 can just use 
LUCENE_22 to match his old behaviour. The user should be free to give his 
version he used before for this backwards compatibility.

OK I think that's reasonable.

bq. Mike: Should I backport the setting for 2.4 to 2.9 to enable 
plugin-replacements from 2.9.1 to 3.0?

+1

{quote}
bq. Going forward, when we fix a bug but need to conditionally preserve the bug 
for back compat, we should use the version switching so that by default for new 
users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is 
fixed.

Do you mean I should add the default ctor of StandardAnalyzer() and rewire it 
to LUCENE_CURRENT?
{quote}

Sorry, I wasn't clear...

No -- I don't think we should ever have a ctor that defaults to LUCENE_CURRENT. 
 That's a back compat trap (and it just gets us back to where we started when 
we had no explicit version).  Users must be explicit about which version they 
want.

What I meant was: when fixing some sneaky bug in the future, we should never 
set the default so that the bug is still present (as we did on the first go of 
"invalid acronyms"), expecting new users to realize they have to go out of 
their way to tell Lucene not to emulate the bug.  Instead, the default going 
forward (if version >= next-release-version) should be "the bug is fixed".
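
A rough sketch of that policy, assuming a simplified stand-in for the version constants: there is no no-arg constructor, and the bug is emulated only for versions older than the release in which the fixed behaviour became the default (2.4 for the acronym case). The Version enum and ExampleAnalyzer below are illustrative only and are not the Lucene classes.

```java
// Hypothetical sketch; the enum and class are stand-ins, not Lucene's API.
final class VersionedDefaults {

    enum Version {
        LUCENE_22, LUCENE_23, LUCENE_24, LUCENE_29, LUCENE_30;

        boolean onOrAfter(Version other) {
            return compareTo(other) >= 0;
        }
    }

    static final class ExampleAnalyzer {
        private final boolean acronymBugFixed;

        /** No default constructor: callers must say which version's behaviour they want. */
        ExampleAnalyzer(Version matchVersion) {
            // Emulate the old bug only for versions before the fixed default shipped.
            this.acronymBugFixed = matchVersion.onOrAfter(Version.LUCENE_24);
        }

        boolean isAcronymBugFixed() {
            return acronymBugFixed;
        }
    }

    public static void main(String[] args) {
        System.out.println(new ExampleAnalyzer(Version.LUCENE_23).isAcronymBugFixed());
        System.out.println(new ExampleAnalyzer(Version.LUCENE_30).isAcronymBugFixed());
    }
}
```

A user upgrading from 2.3 passes LUCENE_23 and keeps the old tokens; a new user passing the current release gets the fix without having to know the bug ever existed.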

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch
>
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still 
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
> are deprecated, but we still have the VERSION constants. Do not know, how to 
> proceed. Keep the settings alive for index compatibility? Or remove it 
> together with the version constants (which were undeprecated).


Highlighting - catering for all query types

2009-10-19 Thread mark harwood

I've been putting together some code to support highlighting of opaque query 
clauses (cached filters, trie range, spatial etc etc) which shows some promise.

This is not intended as a replacement for the existing highlighter(s) which 
deal with free-text but is instead concentrating on the hard-to-highlight 
clauses and has the benefit of working in-line with the query process.
Summarisation is not a requirement here - I simply need to know if a given 
query clause matched on a result.

The approach I have come up with is to wrap query clauses with lightweight 
(processing and RAM-wise) instrumenting objects in order to record which 
clauses matched.
The recorded matches are encoded as a byte in the document score which 
unfortunately requires some loss of precision in the scores - more on this 
later.

The general approach for use looks like this:

    //Wrap *any* type of query object for highlight flagging and allocate a
    //flag number between 1 and 8 for the clauses of interest
    FlagRecordingQuery frqA=new FlagRecordingQuery(
        new TermQuery(new Term("statusField","published")),1);
    FlagRecordingQuery frqB=new FlagRecordingQuery(
        new XyzLtd3rdPartyQuery("imageDataField", "unknown magic to find 'sunset'"),2);

    BooleanQuery bq=new BooleanQuery();
    bq.add(new BooleanClause(frqA,Occur.SHOULD));
    bq.add(new BooleanClause(frqB,Occur.SHOULD));

    //Parent query must be a FlagCombiningQuery to encode child match info
    //in the doc scores
    FlagCombiningQuery fcq=new FlagCombiningQuery(bq);

    //Run search
    TopDocs td = s.search(fcq,10);
    ScoreDoc[] sd = td.scoreDocs;
    for (ScoreDoc scoreDoc : sd)
    {
        float score=scoreDoc.score;

        //Check to see which flags are encoded in the score.
        if(FlagCombiningQuery.hasFlag(1, score))
        {
            System.out.println("woot! "+scoreDoc.doc+" matched clause 1 ");
        }
        if(FlagCombiningQuery.hasFlag(2, score))
        {
            System.out.println("woot! "+scoreDoc.doc+" matched clause 2 ");
        }
    }


The FlagRecordingQuery child clauses introduce themselves to the 
FlagCombiningQuery through a thread local at "rewrite" time.
The FlagCombiningQuery at the root adjusts the scores as follows:

    static final float DEFAULT_MULTIPLIER=1000f;
    float multiplier=DEFAULT_MULTIPLIER;

    public float score() throws IOException
    {
        float score = delegateScorer.score();
        byte flags=0;
        int d=doc();
        //encode all matched child clauses into a "flags" byte.
        for (FlagRecordingQuery frq : thisThreadsFlags)
        {
            if(frq.matched(d))
            {
                byte mask=flagMasks[frq.flag-1];
                flags=setFlag(flags, mask);
            }
        }

        //Multiply score to turn float into int with sufficient fractions in score.
        int shiftedI=(int) (score*multiplier);
        //Shift int to make space for byte holding flags
        int iPlusSpaceForByte=shiftedI<<8;
        //Add match flags (masked so a set flag 8 does not sign-extend into the score bits)
        int iCombinedScoreAndFlags=iPlusSpaceForByte|(flags&0xFF);
        System.out.println("combined score="+iCombinedScoreAndFlags+" for doc#"+doc());
        return iCombinedScoreAndFlags;
    }

The mechanism works but relies on original score values that:
a) Are not too big - i.e. do not lose significant digits when multiplied by 
"multiplier" and then shifted left 8 bits.
b) Are not too similar - i.e. do not differ only in very small fractions, e.g. 
all scores falling in the range 0.1234 to 0.1235

To give an indication of restrictions this imposes here are the usable score 
ranges for various settings of "multiplier":

multiplier   max score   fraction precision
==========   =========   ==================
10           838860      0.x
100          83886       0.xx
1000         8388        0.xxx
10000        838         0.xxxx

I would imagine the majority of Lucene query results would still rank sensibly 
with a 1,000 or 10,000 multiplier.
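
The "max score" column can be checked with a small round-trip sketch: the scaled score must stay below 2^23 so that shifting it left by 8 bits still fits in a signed 32-bit int. The names pack, hasFlag, unpackScore, and maxScore are hypothetical, and the bit layout (flag n stored in bit n-1) is an assumption, since flagMasks is not shown in the message. The sketch keeps the packed value as an int; a Java float carries only a 24-bit significand, so how the packed value survives the trip through a float score is left open here.

```java
// Hypothetical sketch of the packing arithmetic; not the FlagCombiningQuery code.
final class ScoreFlagPacking {

    /** Scaled scores must be below 2^23 so that (value << 8) fits in a signed int. */
    static final int SCALED_LIMIT = 1 << 23; // 8,388,608

    /** Largest whole-number score usable with the given multiplier. */
    static int maxScore(int multiplier) {
        return SCALED_LIMIT / multiplier;
    }

    /** Packs the scaled score into the upper 24 bits and the flags into the low 8 bits. */
    static int pack(float score, float multiplier, int flags) {
        int scaled = (int) (score * multiplier);
        if (scaled < 0 || scaled >= SCALED_LIMIT) {
            throw new IllegalArgumentException("score out of range for multiplier: " + score);
        }
        return (scaled << 8) | (flags & 0xFF);
    }

    /** Tests flag number 1..8 (assumed to live in bit flagNumber-1). */
    static boolean hasFlag(int flagNumber, int packed) {
        return (packed & (1 << (flagNumber - 1))) != 0;
    }

    /** Recovers the score, truncated to the precision the multiplier allows. */
    static float unpackScore(int packed, float multiplier) {
        return (packed >>> 8) / multiplier;
    }

    public static void main(String[] args) {
        int packed = pack(0.1234f, 1000f, 0b00000011); // clauses 1 and 2 matched
        System.out.println(hasFlag(1, packed) + " " + hasFlag(2, packed) + " " + hasFlag(3, packed));
        System.out.println(maxScore(10) + " " + maxScore(10000));
    }
}
```

With a multiplier of 1000, scores 0.1234 and 0.1235 both scale to 123, which is the loss of ranking precision described in point (b).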

However, all this potentially dangerous bit twiddling could of course be 
avoided if the Lucene search API was expanded to include docid, score AND a 
completely separate field for recording match flags. 


Thoughts?





[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767273#action_12767273
 ] 

Uwe Schindler commented on LUCENE-1987:
---

bq. Going forward, when we fix a bug but need to conditionally preserve the bug 
for back compat, we should use the version switching so that by default for new 
users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is 
fixed.

Do you mean I should add the default ctor of StandardAnalyzer() and rewire it 
to LUCENE_CURRENT? We have to put this in the docs, that from 3.0 on, the 
standard analyzer's default ctor now no longer behaves like 2.4, but always 
uses the newest features.

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch
>
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still 
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
> are deprecated, but we still have the VERSION constants. Do not know, how to 
> proceed. Keep the settings alive for index compatibility? Or remove it 
> together with the version constants (which were undeprecated).


[jira] Issue Comment Edited: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767273#action_12767273
 ] 

Uwe Schindler edited comment on LUCENE-1987 at 10/19/09 3:14 AM:
-

bq. Going forward, when we fix a bug but need to conditionally preserve the bug 
for back compat, we should use the version switching so that by default for new 
users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is 
fixed.

Do you mean I should add the default ctor of StandardAnalyzer() and rewire it 
to LUCENE_CURRENT? We have to put this in the docs, that from 3.0 on, the 
standard analyzer's default ctor now no longer behaves like 2.4, but always 
uses the newest features.

That would help me a lot with the tests.

  was (Author: thetaphi):
bq. Going forward, when we fix a bug but need to conditionally preserve the 
bug for back compat, we should use the version switching so that by default for 
new users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is 
fixed.

Do you mean I should add the default ctor of StandardAnalyzer() and rewire it 
to LUCENE_CURRENT? We have to put this in the docs, that from 3.0 on, the 
standard analyzer's default ctor now no longer behaves like 2.4, but always 
uses the newest features.
  
> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch
>
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still 
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
> are deprecated, but we still have the VERSION constants. Do not know, how to 
> proceed. Keep the settings alive for index compatibility? Or remove it 
> together with the version constants (which were undeprecated).


[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1987:
--

Attachment: LUCENE-1987-StopFilter.patch

Updated patch with LUCENE_24. I did not remove the other version constants, 
because then we have them and can use them anywhere else. And a user coming 
from e.g. 2.2 to 3.0 can just use LUCENE_22 to match his old behaviour. The 
user should be free to give his version he used before for this backwards 
compatibility.

Mike: Should I backport the setting for 2.4 to 2.9 to enable 
plugin-replacements from 2.9.1 to 3.0?

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch
>
>
> This removes the rest of the deprecations in the analysis package:
> - -Token's termText field-- (DONE)
> - -eventually un-deprecate ctors of Token taking Strings (they are still 
> useful) -> if yes remove deprec in 2.9.1- (DONE)
> - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
> - Stopwords lists
> - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
> are deprecated, but we still have the VERSION constants. Do not know, how to 
> proceed. Keep the settings alive for index compatibility? Or remove it 
> together with the version constants (which were undeprecated).


Re: Call the authorities

2009-10-19 Thread Michael McCandless

Indeed!  It's doing nothing now.  It just creates Sort objects but
never actually searches with them.  Hmm.

Unfortunately, the test relied heavily on the deprecated
"setUseLegacySearch" API to compare old vs new sorting.  I suppose
its time has passed, given that it has had a good amount of time, now,
to assert that old and new were producing identical results.

Should we just remove it?

Mike

On Sun, Oct 18, 2009 at 11:20 PM, Mark Miller  wrote:
> Mark Miller wrote:
>> TestStressSort has been butchered.
>>
>>
> I suppose we could just pull it since it wouldn't check for much any
> more - looks awful funny as is.
>
> --
> - Mark
>
> http://www.lucidimagination.com

[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767270#action_12767270
 ] 

Michael McCandless commented on LUCENE-1987:


bq. LUCENE-1068 says: Fix version 2.3

Right, that bug was fixed in 2.3, however with that fix the buggy behavior was 
kept by default.  In 2.4 we then fixed the default to be true, ie, the bug 
would be fixed by default.  So if I were to specify VERSION_23, I should get 
the buggy behavior, but if I specify VERSION_24, I should get the correct 
behavior.

Going forward, when we fix a bug but need to conditionally preserve the bug for 
back compat, we should use the version switching so that by default for new 
users (VERSION_CURRENT or VERSION_XX if XX is the next release) the bug is 
fixed.
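
For illustration, here is a minimal sketch of the version switch described 
above. The names MatchVersion and AcronymSettings are hypothetical stand-ins 
and not Lucene's actual classes; only the gating rule (bug preserved when 
matching 2.3, fixed by default from 2.4 on) is taken from the comment.

```java
// Hypothetical stand-in for a release-version enum (not Lucene's Version class).
// Constants are declared in release order, so compareTo() orders them by age.
enum MatchVersion {
    V22, V23, V24, V29, V30;

    // True if this version is the same as, or newer than, 'other'.
    boolean onOrAfter(MatchVersion other) {
        return compareTo(other) >= 0;
    }
}

// Hypothetical holder for a setting derived from the version the caller wants to match.
final class AcronymSettings {
    final boolean replaceInvalidAcronym;

    AcronymSettings(MatchVersion matchVersion) {
        // The fix existed in 2.3 but was off by default there; 2.4 enabled it by default.
        // So matching 2.3 must preserve the bug, and matching 2.4 or later must not.
        this.replaceInvalidAcronym = matchVersion.onOrAfter(MatchVersion.V24);
    }
}

class VersionSwitchDemo {
    public static void main(String[] args) {
        for (MatchVersion v : MatchVersion.values()) {
            boolean fixed = new AcronymSettings(v).replaceInvalidAcronym;
            System.out.println(v + " -> replaceInvalidAcronym=" + fixed);
        }
    }
}
```

The point of gating on the version where the default changed, rather than on 
the version where the fix was first committed, is that a user who passes their 
old version gets exactly the tokens their existing index was built with.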

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch

[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767269#action_12767269
 ] 

Uwe Schindler commented on LUCENE-1987:
---

I also just added 20 and 21; I can remove those two again.
22 is needed because the invalidAcronym bug is present in 2.2 and was fixed in 2.3 
(according to LUCENE-1068).

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch

[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767267#action_12767267
 ] 

Michael McCandless commented on LUCENE-1987:


Why add 2.0, 2.1, 2.2 versions?  We don't anywhere emulate bugs based on those, 
right?  Otherwise, patch looks great!  Thanks Uwe.  Nice to see 
StandardAnalyzer clean again.

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch

[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1987:
--

Attachment: LUCENE-1987-StopFilter.patch

Javadocs fixes.

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch

[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767265#action_12767265
 ] 

Uwe Schindler commented on LUCENE-1987:
---

LUCENE-1068 says: Fix version 2.3

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, 
> LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch

[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767264#action_12767264
 ] 

Michael McCandless commented on LUCENE-1987:


I'll have a look, but one thing is invalid acronym replacement should be 
enabled if version >= 2.4, not >= 2.3.  Ie, if version is 2.3, the bug is still 
present.

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch

[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1987:
--

Attachment: (was: LUCENE-1987-StopFilter.patch)

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch

[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1987:
--

Attachment: LUCENE-1987-StopFilter.patch

Correct patch.

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch

[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767262#action_12767262
 ] 

Uwe Schindler commented on LUCENE-1987:
---

If we are fine with that, I would backport the version constants and the 
default setting to 2.9.x

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch

[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1987:
--

Attachment: LUCENE-1987-StopFilter.patch

Hello Mike,

attached is a patch with all deprecated methods removed (only 
setOverridesTokenStream is still there; making Analyzers final is a separate 
task).

StopFilter and its stopWord sets were also generified (to Set<?>, which is fine 
for every type of set: CharArraySet uses toString() to convert everything to a 
string when testing, so any set works).

The only problems I had, and my solution for them, are here (StandardAnalyzer):
{code}
enableStopPositionIncrements = matchVersion.onOrAfter(Version.LUCENE_29);
replaceInvalidAcronym = matchVersion.onOrAfter(Version.LUCENE_23);
{code}

The static defaultPosIncr setting was removed, so there is no changeable 
default anymore. The pre-2.9 default was false, so I set posIncr to false for 
all older versions (this was the default before, and it is now fixed because 
there is no static setter/sysprop anymore).

For the invalid acronyms I added the LUCENE_23 version constant, so the fix is 
enabled for all versions >= 2.3. If you want the old behaviour, use LUCENE_22 or below.

Mike: Can you review this?

If you're ok with it I have to change 175 "new StandardAnalyzer()" occurrences 
in tests :(

> Remove rest of analysis deprecations (Token, CharacterCache)
> 
>
> Key: LUCENE-1987
> URL: https://issues.apache.org/jira/browse/LUCENE-1987
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Analysis
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, 
> LUCENE-1987.patch, LUCENE-1987.patch

[jira] Commented: (LUCENE-1992) intermittent failure in TestIndexWriter. testExceptionDuringSync

2009-10-19 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767258#action_12767258
 ] 

Michael McCandless commented on LUCENE-1992:


bq. Should this also applied to bw branch (2.4 for) 2.9 and (2.9 for) 3.0? 

No, it can only be applied on trunk.

That call tells ConcurrentMergeScheduler to expect exceptions during this test. 
When autoCommit is true (which is the case for this test everywhere except 
trunk), those exceptions will happen: when a merge completes, it commits and 
calls Directory.sync, which throws the intentional exception.

> intermittent failure in TestIndexWriter. testExceptionDuringSync 
> -
>
> Key: LUCENE-1992
> URL: https://issues.apache.org/jira/browse/LUCENE-1992
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-1992.patch
>
>
> {code}
> common.test:
> [mkdir] Created dir: C:\Projects\lucene\trunk-full1\build\test
> [junit] Testsuite: org.apache.lucene.index.TestIndexWriter
> [junit] Tests run: 102, Failures: 0, Errors: 1, Time elapsed: 100,297sec
> [junit]
> [junit] Testcase: 
> testExceptionDuringSync(org.apache.lucene.index.TestIndexWriter): Caused an 
> ERROR
> [junit] _a.fnm
> [junit] java.io.FileNotFoundException: _a.fnm
> [junit] at 
> org.apache.lucene.store.MockRAMDirectory.openInput(MockRAMDirectory.java:226)
> [junit] at 
> org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:68)
> [junit] at 
> org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:116)
> [junit] at 
> org.apache.lucene.index.SegmentReader.get(SegmentReader.java:620)
> [junit] at 
> org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
> [junit] at 
> org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:104)
> [junit] at 
> org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:27)
> [junit] at 
> org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:74)
> [junit] at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:704)
> [junit] at 
> org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69)
> [junit] at 
> org.apache.lucene.index.IndexReader.open(IndexReader.java:307)
> [junit] at 
> org.apache.lucene.index.IndexReader.open(IndexReader.java:193)
> [junit] at 
> org.apache.lucene.index.TestIndexWriter.testExceptionDuringSync(TestIndexWriter.java:2723)
> [junit] at 
> org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:206)
> [junit]
> [junit]
> [junit] Test org.apache.lucene.index.TestIndexWriter FAILED
> {code}


[jira] Commented: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767242#action_12767242
 ] 

Uwe Schindler commented on LUCENE-1257:
---

One note: I do not want to apply any test-related generics patches for now, as 
they make it harder to port patches to the backwards branch.
As soon as all deprecations are removed, we can start fixing the tests. 
Until then, changes often also need to be applied to the backwards branch, 
which is Java 1.4 for backwards testing with 2.9.

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, 
> LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, 
> LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, 
> LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
> LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know 
> Java 5 migration had been planned for 2.1 someday in the past, but don't know 
> when it is planned now. This patch against the trunk includes:
> - most obvious generics usage (there are tons of usages of sets, ... Those 
> which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is in my opinion much more readable with those features (you 
> actually *know* what is stored in collections when reading the code, without 
> needing to look up field definitions every time) and it simplifies many 
> algorithms.
> Note that this patch also includes an interface for the Query class. This has 
> been done for my company's needs for building custom Query classes which add 
> some behaviour to the base Lucene queries. It prevents multiple unnecessary 
> casts. I know this introduction is not wanted by the team, but it really 
> makes our developments easier to maintain. If you don't want to use this, 
> replace all /Queriable/ calls with standard /Query/.
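
As an illustration of the kind of change the issue description lists (generics, 
for-each loops, dropping explicit unboxing), here is a small before/after 
sketch. The class and method names are made up for this example and do not 
come from any of the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

class Java5ConstructsDemo {
    // Pre-Java 5 style: raw collection, indexed loop, explicit cast and unboxing.
    @SuppressWarnings("rawtypes")
    static int sumLegacy(List values) {
        int sum = 0;
        for (int i = 0; i < values.size(); i++) {
            sum += ((Integer) values.get(i)).intValue();
        }
        return sum;
    }

    // Java 5 style: the element type is visible in the signature,
    // the for-each loop replaces the index, and unboxing is implicit.
    static int sumGeneric(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> counts = new ArrayList<Integer>();
        counts.add(3);
        counts.add(5);
        counts.add(8);
        // Both variants compute the same total; only the source style differs.
        System.out.println(sumLegacy(counts) + " " + sumGeneric(counts));
    }
}
```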


[jira] Commented: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767241#action_12767241
 ] 

Uwe Schindler commented on LUCENE-1257:
---

Committed:
   LUCENE-1257-CloseableThreadLocal.patch 2009-10-18 06:31 PM Kay Kay 4 kB 
   LUCENE-1257_analysis.patch 2009-10-18 05:41 PM Robert Muir 8 kB 

At revision: 826601

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, 
> LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, 
> LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, 
> LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
> LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch

[jira] Commented: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767240#action_12767240
 ] 

Uwe Schindler commented on LUCENE-1257:
---

I removed some unneeded patches.

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, 
> LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, 
> LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, 
> LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
> LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know 
> Java 5 migration was planned for 2.1 at some point in the past, but I don't know 
> when it is planned now. This patch against the trunk includes:
> - the most obvious generics usage (there are tons of usages of sets, ... Those 
> which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is, in my opinion, much more readable with those features (you 
> actually *know* what is stored in collections when reading the code, without 
> needing to look up field definitions every time) and it simplifies many 
> algorithms.
> Note that this patch also includes an interface for the Query class. This has 
> been done for my company's needs for building custom Query classes which add 
> some behaviour to the base Lucene queries. It prevents multiple unnecessary 
> casts. I know this introduction is not wanted by the team, but it really 
> makes our developments easier to maintain. If you don't want to use this, 
> replace all /Queriable/ calls with standard /Query/.
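The Java 5 constructs listed in the description above can be illustrated with a small sketch. It is not taken from any of the attached patches: the class Java5Sketch, the nested MiniQueue and the helper totalLength are hypothetical names, and MiniQueue only imitates the general shape of a heap with a lessThan hook, not Lucene's actual PriorityQueue.

```java
import java.util.ArrayList;
import java.util.List;

public class Java5Sketch {

    /** Hypothetical minimal min-heap with a type parameter; not Lucene's PriorityQueue. */
    public abstract static class MiniQueue<T> {
        private final List<T> heap = new ArrayList<T>();

        /** Subclasses define the ordering on T directly, so no casts are needed. */
        protected abstract boolean lessThan(T a, T b);

        public void add(T element) {
            heap.add(element);
            int i = heap.size() - 1;
            // Sift the new element up while it is less than its parent.
            while (i > 0) {
                int parent = (i - 1) / 2;
                if (!lessThan(heap.get(i), heap.get(parent))) {
                    break;
                }
                T tmp = heap.get(i);
                heap.set(i, heap.get(parent));
                heap.set(parent, tmp);
                i = parent;
            }
        }

        /** Returns the least element without removing it, or null if the queue is empty. */
        public T top() {
            return heap.isEmpty() ? null : heap.get(0);
        }
    }

    /** Sums term lengths with a for-each loop and a primitive accumulator (no boxing). */
    public static int totalLength(List<String> terms) {
        int total = 0;
        for (String term : terms) {
            total += term.length();
        }
        return total;
    }

    public static void main(String[] args) {
        MiniQueue<String> queue = new MiniQueue<String>() {
            @Override
            protected boolean lessThan(String a, String b) {
                return a.compareTo(b) < 0;
            }
        };
        List<String> terms = new ArrayList<String>();
        terms.add("lucene");
        terms.add("index");
        terms.add("query");
        for (String term : terms) {
            queue.add(term);
        }
        String least = queue.top(); // typed result, no (String) cast
        System.out.println(least + " " + totalLength(terms)); // prints "index 16"
    }
}
```

With a raw, pre-generics queue the call to top() would return Object and need a cast at every call site, which is the readability point the description makes.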

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1257:
--

Attachment: (was: o.a.l.analysis.patch)

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, 
> LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, 
> LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, 
> LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
> LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, shinglematrixfilter_generified.patch
>
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know 
> Java 5 migration was planned for 2.1 at some point in the past, but I don't know 
> when it is planned now. This patch against the trunk includes:
> - the most obvious generics usage (there are tons of usages of sets, ... Those 
> which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is, in my opinion, much more readable with those features (you 
> actually *know* what is stored in collections when reading the code, without 
> needing to look up field definitions every time) and it simplifies many 
> algorithms.
> Note that this patch also includes an interface for the Query class. This has 
> been done for my company's needs for building custom Query classes which add 
> some behaviour to the base Lucene queries. It prevents multiple unnecessary 
> casts. I know this introduction is not wanted by the team, but it really 
> makes our developments easier to maintain. If you don't want to use this, 
> replace all /Queriable/ calls with standard /Query/.

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-19 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1257:
--

Attachment: (was: java5.patch)

> Port to Java5
> -
>
> Key: LUCENE-1257
> URL: https://issues.apache.org/jira/browse/LUCENE-1257
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis, Examples, Index, Other, Query/Scoring, 
> QueryParser, Search, Store, Term Vectors
>Affects Versions: 3.0
>Reporter: Cédric Champeau
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 3.0
>
> Attachments: instantiated_fieldable.patch, 
> LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
> LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
> LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
> LUCENE-1257-CompoundFileReaderWriter.patch, 
> LUCENE-1257-ConcurrentMergeScheduler.patch, 
> LUCENE-1257-DirectoryReader.patch, 
> LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
> LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
> LUCENE-1257-IndexDeleter.patch, 
> LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
> LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, 
> LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
> LUCENE-1257-StringBuffer.patch, LUCENE-1257-WordListLoader.patch, 
> LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch, 
> LUCENE-1257_messages.patch, LUCENE-1257_o.a.l.queryParser.patch, 
> LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
> LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
> LUCENE-1257_o_a_l_search_spans.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, 
> LUCENE-1257_org_apache_lucene_index.patch, lucene1257surround1.patch, 
> lucene1257surround1.patch, o.a.l.analysis.patch, 
> shinglematrixfilter_generified.patch
>
>
> For my needs I've updated Lucene so that it uses Java 5 constructs. I know 
> Java 5 migration was planned for 2.1 at some point in the past, but I don't know 
> when it is planned now. This patch against the trunk includes:
> - the most obvious generics usage (there are tons of usages of sets, ... Those 
> which are commonly used have been generified)
> - PriorityQueue generification
> - replacement of indexed for loops with for-each constructs
> - removal of unnecessary unboxing
> The code is, in my opinion, much more readable with those features (you 
> actually *know* what is stored in collections when reading the code, without 
> needing to look up field definitions every time) and it simplifies many 
> algorithms.
> Note that this patch also includes an interface for the Query class. This has 
> been done for my company's needs for building custom Query classes which add 
> some behaviour to the base Lucene queries. It prevents multiple unnecessary 
> casts. I know this introduction is not wanted by the team, but it really 
> makes our developments easier to maintain. If you don't want to use this, 
> replace all /Queriable/ calls with standard /Query/.
