[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768118#action_12768118
 ] 

Uwe Schindler commented on LUCENE-1998:
---

I tested it here; we have no backwards-compatibility problem (at least with 
normal usage). When old Java 1.4 code runs against the new enum classes, Java's 
dynamic linker has no problem with the replaced superclass: old code compiled 
to use Field.Store.XXX against lucene-core-2.9.jar (where the superclass is 
Parameter) works perfectly with the new lucene-core-3.0.jar. This works because 
we only used the Parameter class as a type-safe enumeration and did not call 
any of its methods (except maybe toString()). So the linker has no problem.
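
To make the point above concrete, here is a minimal sketch of the two styles 
side by side. OldParameter, OldStore, NewStore and EnumMigrationDemo are 
hypothetical stand-ins invented for illustration, not the actual Lucene 
sources. In both styles the constant is a public static final field of its own 
type, and the caller only reads that field and (at most) calls toString(), 
which is presumably why already-compiled callers keep linking.

```java
// Hypothetical stand-in for the old pattern; NOT the real o.a.l.util.Parameter.
abstract class OldParameter {
    private final String name;

    protected OldParameter(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return name;
    }
}

// 2.9-style type-safe enumeration: constants are public static final fields.
final class OldStore extends OldParameter {
    private OldStore(String name) {
        super(name);
    }

    public static final OldStore YES = new OldStore("YES");
    public static final OldStore NO = new OldStore("NO");
}

// 3.0-style: the same constant names, now declared as Java 5 enum constants
// (which are implicitly public static final fields of the enum type).
enum NewStore { YES, NO }

class EnumMigrationDemo {
    // A caller that only passes the constant around and prints it.
    static String describe(Object store) {
        return "store=" + store;
    }

    public static void main(String[] args) {
        System.out.println(describe(OldStore.YES));
        System.out.println(describe(NewStore.YES));
    }
}
```

Code that called methods declared only on the old superclass would be a 
different story; the sketch deliberately avoids that case.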

I would simply apply this patch to trunk. I would also remove the Parameter 
class completely, as that breaks no code (unless somebody has used that class 
for their own enums). Maybe we should deprecate Parameter in 2.9.1 and say that 
it will be removed in 3.0, as this version uses Java 5's enum. But it also does 
not hurt if we keep it and mark it deprecated, as in the patch.

To your patch: I only added the license header back in the Version class. It 
must be there.

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1998:
--

Attachment: LUCENE-1998_enum.patch

Patch with license header restored.

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Assigned: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-1998:
-

Assignee: Uwe Schindler

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Updated: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1998:
--

Attachment: LUCENE-1998_enum.patch

Some fine tuning: You defined package-private abstract methods, but made them 
public in the enum constants. I changed them all to public. This was also a 
backwards break in contrib/queryParser.
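
As a rough illustration of the "all-public" shape described above, the sketch 
below declares the abstract method public on the enum and gives each constant a 
public body. DemoOperator and symbol() are made-up names, not taken from the 
patch or from contrib/queryParser.

```java
// Hypothetical enum with constant-specific method bodies.
enum DemoOperator {
    AND {
        @Override
        public String symbol() {
            return "&&";
        }
    },
    OR {
        @Override
        public String symbol() {
            return "||";
        }
    };

    // Declared public so the declaration and every constant body agree;
    // a package-private declaration here would hide the method from
    // callers in other packages even though the bodies are public.
    public abstract String symbol();
}
```

With the declaration public, code in any package can call 
DemoOperator.AND.symbol() through the enum type.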

I think this is ready to commit.

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768132#action_12768132
 ] 

Uwe Schindler commented on LUCENE-1998:
---

A small problem that may appear in the future: we renamed some enum constants 
in 2.9 (TOKENIZED -> ANALYZED). This is not a problem now, because the 
deprecated constants have been removed.

If we want to do the same in the future, we can do it the same way, but we need 
a hack (because it is not officially supported by Java 5):
[http://forums.sun.com/thread.jspa?threadID=5137742]

So it works, but not with switch statements. Just as a comment. But in my 
opinion, renaming enum constants is a bad thing... 
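
The linked forum thread is not reproduced here, so the following is only an 
assumption about what such a hack could look like: keep the old name as a 
deprecated static final alias inside the enum. DemoIndex and RenameDemo are 
hypothetical names. The alias works for ordinary field access but is not an 
enum constant, so it cannot be used as a case label, which matches the switch 
limitation mentioned above.

```java
// Hypothetical enum after a rename from TOKENIZED to ANALYZED.
enum DemoIndex {
    NO, ANALYZED, NOT_ANALYZED;

    /** @deprecated hypothetical alias for callers still using the old name */
    @Deprecated
    public static final DemoIndex TOKENIZED = ANALYZED;
}

class RenameDemo {
    static String describe(DemoIndex index) {
        switch (index) {
            case ANALYZED:
                return "analyzed";
            // "case TOKENIZED:" would not compile here: case labels of an
            // enum switch must name real enum constants, not alias fields.
            case NOT_ANALYZED:
                return "not analyzed";
            default:
                return "not indexed";
        }
    }
}
```

Note that the alias is also invisible to DemoIndex.values() and 
DemoIndex.valueOf("TOKENIZED"), since it is just a field that points at the 
ANALYZED constant.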




[jira] Updated: (LUCENE-2000) Use covariant clone() return types

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2000:
--

Attachment: LUCENE-2000-clone_covariance.patch

 Use covariant clone() return types
 --

 Key: LUCENE-2000
 URL: https://issues.apache.org/jira/browse/LUCENE-2000
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 3.0
Reporter: Uwe Schindler
 Attachments: LUCENE-2000-clone_covariance.patch


 *Paul Cown wrote in LUCENE-1257:*
 OK, thought I'd jump in and help out here with one of my Java 5 favourites. 
 Haven't seen anyone discuss this, and don't believe any of the patches 
 address this, so thought I'd throw a patch out there (against SVN HEAD @ 
 revision 827821) which uses Java 5 covariant return types for (almost) all of 
 the Object#clone() implementations in core. 
 i.e. this:
   public Object clone() {
 changes to:
   public SpanNotQuery clone() {
 which lets us get rid of a whole bunch of now-unnecessary casts, so e.g.
   if (clone == null) clone = (SpanNotQuery) this.clone();
 becomes
   if (clone == null) clone = this.clone();
 Almost everything has been done and all downcasts removed, in core, with the 
 exception of
 * Some SpanQuery stuff, where it's assumed that it's safe to cast the clone() 
 of a SpanQuery to a SpanQuery - this can't be made covariant without 
 declaring abstract SpanQuery clone() in SpanQuery itself, which breaks 
 those SpanQuerys that don't declare their own clone() 
 * Some IndexReaders, e.g. DirectoryReader - we can't be more specific than 
 changing .clone() to return IndexReader, because it returns the result of 
 IndexReader.clone(boolean). We could use covariant types for THAT, which 
 would work fine, but that didn't follow the pattern of the others so that 
 could be a later commit. 
 Two changes were also made in contrib/, where not making the changes would 
 have broken code by trying to widen IndexInput#clone() back out to returning 
 Object, which is not permitted. contrib/ was otherwise left untouched.
 Let me know what you think, or if you have any other questions.
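
The change described above can be sketched in a self-contained form. DemoQuery 
and DemoSpanNotQuery are simplified, hypothetical stand-ins for the Lucene 
classes; only the covariant clone() signatures mirror what the description 
says.

```java
// Hypothetical base class; stands in for a Lucene query type.
class DemoQuery implements Cloneable {
    float boost = 1.0f;

    // Covariant return type: narrows Object#clone() to DemoQuery.
    @Override
    public DemoQuery clone() {
        try {
            return (DemoQuery) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical subclass that narrows the return type once more.
class DemoSpanNotQuery extends DemoQuery {
    String field = "body";

    @Override
    public DemoSpanNotQuery clone() {
        // The one remaining cast lives inside clone() itself.
        return (DemoSpanNotQuery) super.clone();
    }
}

class CloneDemo {
    static DemoSpanNotQuery copyOf(DemoSpanNotQuery original) {
        // No cast needed at the call site any more.
        return original.clone();
    }
}
```

The cast does not disappear entirely; it moves into the clone() 
implementation, so each caller no longer has to repeat it.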




[jira] Created: (LUCENE-2000) Use covariant clone() return types

2009-10-21 Thread Uwe Schindler (JIRA)
Use covariant clone() return types
--

 Key: LUCENE-2000
 URL: https://issues.apache.org/jira/browse/LUCENE-2000
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 3.0
Reporter: Uwe Schindler
 Attachments: LUCENE-2000-clone_covariance.patch

*Paul Cown wrote in LUCENE-1257:*

OK, thought I'd jump in and help out here with one of my Java 5 favourites. 
Haven't seen anyone discuss this, and don't believe any of the patches address 
this, so thought I'd throw a patch out there (against SVN HEAD @ revision 
827821) which uses Java 5 covariant return types for (almost) all of the 
Object#clone() implementations in core. 
i.e. this:

  public Object clone() {
changes to:
  public SpanNotQuery clone() {

which lets us get rid of a whole bunch of now-unnecessary casts, so e.g.

  if (clone == null) clone = (SpanNotQuery) this.clone();
becomes
  if (clone == null) clone = this.clone();

Almost everything has been done and all downcasts removed, in core, with the 
exception of

* Some SpanQuery stuff, where it's assumed that it's safe to cast the clone() of 
a SpanQuery to a SpanQuery - this can't be made covariant without declaring 
abstract SpanQuery clone() in SpanQuery itself, which breaks those SpanQuerys 
that don't declare their own clone() 
* Some IndexReaders, e.g. DirectoryReader - we can't be more specific than 
changing .clone() to return IndexReader, because it returns the result of 
IndexReader.clone(boolean). We could use covariant types for THAT, which would 
work fine, but that didn't follow the pattern of the others so that could be a 
later commit. 

Two changes were also made in contrib/, where not making the changes would have 
broken code by trying to widen IndexInput#clone() back out to returning Object, 
which is not permitted. contrib/ was otherwise left untouched.

Let me know what you think, or if you have any other questions.
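
The remark about contrib/ relies on a language rule: once a superclass narrows 
the return type of clone(), a subclass override may narrow it further but may 
not widen it back to Object. A minimal sketch follows, with DemoInput and 
DemoBufferedInput as hypothetical stand-ins for IndexInput and one of its 
subclasses.

```java
// Hypothetical stand-in for a class whose clone() was made covariant.
class DemoInput implements Cloneable {
    @Override
    public DemoInput clone() {
        try {
            return (DemoInput) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical subclass, such as one that might live in contrib/.
class DemoBufferedInput extends DemoInput {
    int position = 7;

    // An override declared as "public Object clone()" would not compile
    // here: Object is not a subtype of DemoInput, so the return type is
    // incompatible with the overridden method.

    @Override
    public DemoBufferedInput clone() {
        return (DemoBufferedInput) super.clone();
    }
}
```

This is why subclasses that still declared Object as the return type had to be 
touched in the same change.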






[jira] Updated: (LUCENE-2000) Use covariant clone() return types

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2000:
--

Description: 
*Paul Cown wrote in LUCENE-1257:*

OK, thought I'd jump in and help out here with one of my Java 5 favourites. 
Haven't seen anyone discuss this, and don't believe any of the patches address 
this, so thought I'd throw a patch out there (against SVN HEAD @ revision 
827821) which uses Java 5 covariant return types for (almost) all of the 
Object#clone() implementations in core. 
i.e. this:

  public Object clone() {
changes to:
  public SpanNotQuery clone() {

which lets us get rid of a whole bunch of now-unnecessary casts, so e.g.

  if (clone == null) clone = (SpanNotQuery) this.clone();
becomes
  if (clone == null) clone = this.clone();

Almost everything has been done and all downcasts removed, in core, with the 
exception of

* Some SpanQuery stuff, where it's assumed that it's safe to cast the clone() of 
a SpanQuery to a SpanQuery - this can't be made covariant without declaring 
abstract SpanQuery clone() in SpanQuery itself, which breaks those SpanQuerys 
that don't declare their own clone() 
* Some IndexReaders, e.g. DirectoryReader - we can't be more specific than 
changing .clone() to return IndexReader, because it returns the result of 
IndexReader.clone(boolean). We could use covariant types for THAT, which would 
work fine, but that didn't follow the pattern of the others so that could be a 
later commit. 

Two changes were also made in contrib/, where not making the changes would have 
broken code by trying to widen IndexInput#clone() back out to returning Object, 
which is not permitted. contrib/ was otherwise left untouched.

Let me know what you think, or if you have any other questions.

  was:
*Paul Cown wrote in LUCENE-1257:*

OK, thought I'd jump in and help out here with one of my Java 5 favourites. 
Haven't seen anyone discuss this, and don't believe any of the patches address 
this, so thought I'd throw a patch out there (against SVN HEAD @ revision 
827821) which uses Java 5 covariant return types for (almost) all of the 
Object#clone() implementations in core. 
i.e. this:

public Object clone() {
changes to:
public SpanNotQuery clone() {

which lets us get rid of a whole bunch of now-unnecessary casts, so e.g.

if (clone == null) clone = (SpanNotQuery) this.clone();
becomes
if (clone == null) clone = this.clone();

Almost everything has been done and all downcasts removed, in core, with the 
exception of

Some SpanQuery stuff, where it's assumed that it's safe to cast the clone() of 
a SpanQuery to a SpanQuery - this can't be made covariant without declaring 
abstract SpanQuery clone() in SpanQuery itself, which breaks those SpanQuerys 
that don't declare their own clone() 
Some IndexReaders, e.g. DirectoryReader - we can't be more specific than 
changing .clone() to return IndexReader, because it returns the result of 
IndexReader.clone(boolean). We could use covariant types for THAT, which would 
work fine, but that didn't follow the pattern of the others so that could be a 
later commit. 
Two changes were also made in contrib/, where not making the changes would have 
broken code by trying to widen IndexInput#clone() back out to returning Object, 
which is not permitted. contrib/ was otherwise left untouched.

Let me know what you think, or if you have any other questions.

[ Show » ] Paul Cowan added a comment - 21/Oct/09 03:01 AM OK, thought I'd jump 
in and help out here with one of my Java 5 favourites. Haven't seen anyone 
discuss this, and don't believe any of the patches address this, so thought I'd 
throw a patch out there (against SVN HEAD @ revision 827821) which uses Java 5 
covariant return types for (almost) all of the Object#clone() implementations 
in core. i.e. this: public Object clone() { changes to: public SpanNotQuery 
clone() { which lets us get rid of a whole bunch of now-unnecessary casts, so 
e.g. if (clone == null) clone = (SpanNotQuery) this.clone(); becomes if (clone 
== null) clone = this.clone(); Almost everything has been done and all 
downcasts removed, in core, with the exception of 
Some SpanQuery stuff, where it's assumed that it's safe to cast the clone() of 
a SpanQuery to a SpanQuery - this can't be made covariant without declaring 
abstract SpanQuery clone() in SpanQuery itself, which breaks those SpanQuerys 
that don't declare their own clone() 
Some IndexReaders, e.g. DirectoryReader - we can't be more specific than 
changing .clone() to return IndexReader, because it returns the result of 
IndexReader.clone(boolean). We could use covariant types for THAT, which would 
work fine, but that didn't follow the pattern of the others so that could be a 
later commit. 
Two changes were also made in contrib/, where not making the changes would have 
broken code by trying to widen IndexInput#clone() back out to returning Object, 
which is not 

[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1257:
--

Attachment: (was: LUCENE-1257-clone_covariance.patch)

 Port to Java5
 -

 Key: LUCENE-1257
 URL: https://issues.apache.org/jira/browse/LUCENE-1257
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Analysis, Examples, Index, Other, Query/Scoring, 
 QueryParser, Search, Store, Term Vectors
Affects Versions: 3.0
Reporter: Cédric Champeau
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: instantiated_fieldable.patch, 
 LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
 LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
 LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
 LUCENE-1257-CompoundFileReaderWriter.patch, 
 LUCENE-1257-ConcurrentMergeScheduler.patch, 
 LUCENE-1257-DirectoryReader.patch, 
 LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
 LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
 LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldCacheRangeFilter.patch, 
 LUCENE-1257-IndexDeleter.patch, 
 LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
 LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch, 
 LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch, 
 LUCENE-1257-org_apache_lucene_document.patch, 
 LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
 LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
 LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
 LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
 LUCENE-1257_BooleanFilter_Generics.patch, 
 LUCENE-1257_contrib_highlighting.patch, LUCENE-1257_javacc_upgrade.patch, 
 LUCENE-1257_messages.patch, LUCENE-1257_more_unnecessary_casts.patch, 
 LUCENE-1257_MultiFieldQueryParser.patch, LUCENE-1257_o.a.l.queryParser.patch, 
 LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
 LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
 LUCENE-1257_o_a_l_search_spans.patch, 
 LUCENE-1257_org_apache_lucene_index.patch, 
 LUCENE-1257_org_apache_lucene_index.patch, LUCENE-1257_queryParser_jj.patch, 
 LUCENE-1257_unnecessary_casts.patch, lucene1257surround1.patch, 
 lucene1257surround1.patch, shinglematrixfilter_generified.patch


 For my needs I've updated Lucene so that it uses Java 5 constructs. I know 
 Java 5 migration had been planned for 2.1 someday in the past, but don't know 
 when it is planned now. This patch against the trunk includes :
 - most obvious generics usage (there are tons of usages of sets, ... Those 
 which are commonly used have been generified)
 - PriorityQueue generification
 - replacement of indexed for loops with for each constructs
 - removal of unnecessary unboxing
 In my opinion, the code is much more readable with those features (you 
 actually *know* what is stored in collections when reading the code, without 
 needing to look up field definitions every time), and it simplifies many 
 algorithms.
 Note that this patch also includes an interface for the Query class. This has 
 been done for my company's needs for building custom Query classes which add 
 some behaviour to the base Lucene queries. It prevents multiple unnecessary 
 casts. I know this introduction is not wanted by the team, but it really 
 makes our developments easier to maintain. If you don't want to use this, 
 replace all /Queriable/ calls with standard /Query/.
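
The kinds of changes listed above (generified collections, for-each loops, no 
manual casts or unboxing calls) can be illustrated with a small sketch. 
GenericsDemo and its methods are hypothetical and do not come from any of the 
attached patches; the code sticks to Java 5 syntax.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GenericsDemo {
    // The element type is visible in the signature, and the for-each loop
    // needs neither an index variable nor a cast.
    static int totalLength(List<String> terms) {
        int total = 0;
        for (String term : terms) {
            total += term.length();
        }
        return total;
    }

    // Autoboxing replaces explicit wrapper handling such as
    // new Integer(...) and intValue().
    static Map<String, Integer> countTerms(List<String> terms) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String term : terms) {
            Integer old = counts.get(term);
            counts.put(term, old == null ? 1 : old + 1);
        }
        return counts;
    }
}
```

Before Java 5 the same methods would take a raw List, iterate by index or 
Iterator, and cast every element back to String.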




[jira] Updated: (LUCENE-1257) Port to Java5

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1257:
--

Comment: was deleted

(was: OK, thought I'd jump in and help out here with one of my Java 5 
favourites. Haven't seen anyone discuss this, and don't believe any of the 
patches address this, so thought I'd throw a patch out there (against SVN HEAD 
@ revision 827821) which uses Java 5 covariant return types for (almost) all of 
the Object#clone() implementations in core.

i.e. this:

  public Object clone() {
changes to:
  public SpanNotQuery clone() {

which lets us get rid of a whole bunch of now-unnecessary casts, so e.g.

  if (clone == null) clone = (SpanNotQuery) this.clone();
becomes
  if (clone == null) clone = this.clone();

Almost everything has been done and all downcasts removed, in core, with the 
exception of

* Some SpanQuery stuff, where it's assumed that it's safe to cast the clone() 
of a SpanQuery to a SpanQuery -- this can't be made covariant without declaring 
abstract SpanQuery clone() in SpanQuery itself, which breaks those SpanQuerys 
that don't declare their own clone()
* Some IndexReaders, e.g. DirectoryReader -- we can't be more specific than 
changing .clone() to return IndexReader, because it returns the result of 
IndexReader.clone(boolean). We could use covariant types for THAT, which would 
work fine, but that didn't follow the pattern of the others so that could be a 
later commit.

Two changes were also made in contrib/, where not making the changes would have 
broken code by trying to widen IndexInput#clone() back out to returning Object, 
which is not permitted. contrib/ was otherwise left untouched.

Let me know what you think, or if you have any other questions.
)


[jira] Commented: (LUCENE-1257) Port to Java5

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768136#action_12768136
 ] 

Uwe Schindler commented on LUCENE-1257:
---

Created a new issue out of the clone covariance patch: LUCENE-2000




[jira] Updated: (LUCENE-2000) Use covariant clone() return types

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2000:
--

Description: 
*Paul Cowan wrote in LUCENE-1257:*

OK, thought I'd jump in and help out here with one of my Java 5 favourites. 
Haven't seen anyone discuss this, and don't believe any of the patches address 
this, so thought I'd throw a patch out there (against SVN HEAD @ revision 
827821) which uses Java 5 covariant return types for (almost) all of the 
Object#clone() implementations in core. 
i.e. this:

  public Object clone() {
changes to:
  public SpanNotQuery clone() {

which lets us get rid of a whole bunch of now-unnecessary casts, so e.g.

  if (clone == null) clone = (SpanNotQuery) this.clone();
becomes
  if (clone == null) clone = this.clone();

Almost everything has been done and all downcasts removed, in core, with the 
exception of

* Some SpanQuery stuff, where it's assumed that it's safe to cast the clone() of 
a SpanQuery to a SpanQuery - this can't be made covariant without declaring 
abstract SpanQuery clone() in SpanQuery itself, which breaks those SpanQuerys 
that don't declare their own clone() 
* Some IndexReaders, e.g. DirectoryReader - we can't be more specific than 
changing .clone() to return IndexReader, because it returns the result of 
IndexReader.clone(boolean). We could use covariant types for THAT, which would 
work fine, but that didn't follow the pattern of the others so that could be a 
later commit. 

Two changes were also made in contrib/, where not making the changes would have 
broken code by trying to widen IndexInput#clone() back out to returning Object, 
which is not permitted. contrib/ was otherwise left untouched.

Let me know what you think, or if you have any other questions.
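
To make the proposal concrete, here is a minimal, self-contained sketch of a 
covariant clone() override. BaseQuery and NotQuery are hypothetical stand-ins 
and not Lucene classes. Note that the cast of super.clone() stays inside the 
subclass; only the callers lose their casts.

```java
// Hypothetical stand-in for a base class that keeps the Object return type.
class BaseQuery implements Cloneable {
    float boost = 1.0f;

    @Override
    public Object clone() {
        try {
            return super.clone(); // shallow copy made by Object.clone()
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: we implement Cloneable
        }
    }
}

// Hypothetical subclass that narrows the return type (allowed since Java 5).
class NotQuery extends BaseQuery {
    @Override
    public NotQuery clone() {
        return (NotQuery) super.clone(); // the cast moves here, it does not disappear
    }
}

class CovariantCloneDemo {
    public static void main(String[] args) {
        NotQuery original = new NotQuery();
        original.boost = 2.0f;
        NotQuery copy = original.clone(); // no cast needed at the call site
        System.out.println(copy != original && copy.boost == 2.0f);
    }
}
```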

   Priority: Minor  (was: Major)

 Use covariant clone() return types
 --

 Key: LUCENE-2000
 URL: https://issues.apache.org/jira/browse/LUCENE-2000
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 3.0
Reporter: Uwe Schindler
Priority: Minor
 Attachments: LUCENE-2000-clone_covariance.patch



[jira] Issue Comment Edited: (LUCENE-2000) Use covariant clone() return types

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768141#action_12768141
 ] 

Uwe Schindler edited comment on LUCENE-2000 at 10/21/09 9:14 AM:
-

I moved this to an extra issue, because there is some discussion needed.

I am strongly against this for various reasons:
- Java 5 itself does not override clone() with a covariant return type 
(nowhere!). So e.g. String.clone() always returns jl.Object.
- This is because of backwards-compatibility problems (which are not easy to 
explain) -- it has something to do with the case where a subclass compiled 
against the Java 1.4 version of Lucene overrides clone() and calls 
super.clone(). Because of this, the JDK does not provide a String.clone() 
returning String. javac does its best to prevent problems here, but for APIs 
that need to be backwards compatible, clone() should return Object as always.
- Covariant clone return types require that *all* subclasses of a class that 
originally implemented a covariant clone() also override it covariantly to be 
consistent. And because of this you have consistency problems (see your 
IndexReader problem). This is not possible while keeping backwards 
compatibility. Because of this, covariant clone should only be done for 
internal classes (package-private, private) or final classes. Another example 
of this problem is AttributeImpl, which defines a clone() method. Subclasses 
would need to override this covariant clone() method. Custom Attributes 
compiled against Lucene 2.9 would fail to do this - MethodNotFoundException (I 
tried it out, it breaks)

Because of all these problems, I prefer to always cast the return value of 
clone(). This is not unsafe (and because of this you get no unchecked warning), 
because you always know how to cast the clone result. By the way: You still 
have to always cast the super.clone() call, so you do not get any benefit from 
using covariant return types.

I do not want to start a flame war here, but we should not do this.


  

Re: lucene 2.9 sorting algorithm

2009-10-21 Thread Michael McCandless
OK, thanks.

I can help out if you've got questions on the python code... it's
rather straightforward: it just iterates over each set of params to
test, writes an alg file, runs it, opens the resulting output & parses
it for the best run, confirms both single & multi PQ gave precisely
the same doc IDs, and prints the results.

It's remotely possible the difference in the results is a bug/overhead
in contrib/benchmark itself, which'd be good to get to the bottom of
anyway.

Mike

On Tue, Oct 20, 2009 at 9:17 PM, John Wang john.w...@gmail.com wrote:
 Hi Mike:
     That's weird. Let me take a look at the patch. Need to brush up on
 python though :)
 Thanks
 -John

 On Tue, Oct 20, 2009 at 10:25 AM, Michael McCandless
 luc...@mikemccandless.com wrote:

 OK I posted a patch that folds the MultiPQ approach into
 contrib/benchmark, plus a simple python wrapper to run old/new tests
 across different queries, sort, topN, etc.

 But I got different results... MultiPQ looks generally slower than
 SinglePQ.  So I think we now need to reconcile what's different
 between our tests.

 Mike

 On Mon, Oct 19, 2009 at 9:28 PM, John Wang john.w...@gmail.com wrote:
  Hi Michael:
       Was wondering if you got a chance to take a look at this.
        Since deprecated APIs are being removed in 3.0, I was wondering
  if/when we would decide on keeping the ScoreDocComparator API, so that it
  is kept for Lucene 3.0.
  Thanks
  -John
 
  On Fri, Oct 16, 2009 at 9:53 AM, Michael McCandless
  luc...@mikemccandless.com wrote:
 
  Oh, no problem...
 
  Mike
 
  On Fri, Oct 16, 2009 at 12:33 PM, John Wang john.w...@gmail.com
  wrote:
   Mike, just a clarification on my first perf report email.
   The first section, numHits is incorrectly labeled, it should be 20 instead
   of 50. Sorry about the possible confusion.
   Thanks
   -John
  
   On Fri, Oct 16, 2009 at 3:21 AM, Michael McCandless
   luc...@mikemccandless.com wrote:
  
   Thanks John; I'll have a look.
  
   Mike
  
   On Fri, Oct 16, 2009 at 12:57 AM, John Wang john.w...@gmail.com
   wrote:
Hi Michael:
    I added classes: ScoreDocComparatorQueue and OneSortNoScoreCollector as
a more general case. I think keeping the old api for ScoreDocComparator and
SortComparatorSource would work.
  Please take a look.
Thanks
-John
   
On Thu, Oct 15, 2009 at 6:52 PM, John Wang john.w...@gmail.com
wrote:
   
Hi Michael:
     It is open, http://code.google.com/p/lucene-book/source/checkout
     I think I sent the https url instead, sorry.
    The multi PQ sorting is fairly self-contained, I have 2 versions, 1
for string and 1 for int, each are Collector impls.
     I shouldn't say the Multi PQ is faster on int sort, it is within the
error boundary. The diff is very very small, I would say they are more
equal.
     If you think it is a good thing to go this way, (if not for the perf,
just for the simpler api) I'd be happy to work on a patch.
Thanks
-John
On Thu, Oct 15, 2009 at 5:18 PM, Michael McCandless
luc...@mikemccandless.com wrote:
   
John, looks like this requires login -- any plans to open that up, or,
post the code on an issue?

How self-contained is your Multi PQ sorting?  EG is it a standalone
Collector impl that I can test?
   
Mike
   
On Thu, Oct 15, 2009 at 6:33 PM, John Wang john.w...@gmail.com
wrote:
 BTW, we have a little sandbox for these experiments, and all my test code
 is there. It is not very polished.

 https://lucene-book.googlecode.com/svn/trunk

 -John

 On Thu, Oct 15, 2009 at 3:29 PM, John Wang
 john.w...@gmail.com
 wrote:

 Numbers Mike requested for Int types:

 only the time/cputime are posted, others are all the same since the
 algorithm is the same.

 Lucene 2.9:
 numhits: 10
 time: 14619495
 cpu: 146126

 numhits: 20
 time: 14550568
 cpu: 163242

 numhits: 100
 time: 16467647
 cpu: 178379


 my test:
 numHits: 10
 time: 14101094
 cpu: 144715

 numHits: 20
 time: 14804821
 cpu: 151305

 numHits: 100
 time: 15372157
 cpu time: 158842

 Conclusions:
 They are very similar, the differences are all within error bounds,
 especially with lower PQ sizes, with the second sort alg again slightly
 faster.

 Hope this helps.

 -John


 On Thu, Oct 15, 2009 at 3:04 PM, Yonik Seeley
 yo...@lucidimagination.com
 wrote:

 On Thu, Oct 15, 2009 at 5:33 PM, Michael McCandless
 luc...@mikemccandless.com wrote:
  Though it'd be odd if the switch to searching by segment
  really was most of the gains here.

 I had assumed that much of the improvement was due to
 ditching
 

Re: lucene 2.9 sorting algorithm

2009-10-21 Thread Michael McCandless
On Tue, Oct 20, 2009 at 11:55 AM, John Wang john.w...@gmail.com wrote:

 the simpler api places less restriction on the type of custom
 sorting that can be done.

Just to verify: this is not a back-compat break, right?

Because, in 2.4, such an interesting custom sort must've been
operating at the top-level index reader level, which is easy to carry
over to 2.9 (you just rebase the docIDs).

But, of course in moving to 2.9, you would like to also switch your
custom sort to be per-segment (for faster reopen/near real-time perf),
but the new sort API makes this more difficult because it requires
that you are able to compare hits across different segments during the
search, not just at the end.

But then I don't understand the difficulty of doing that: if we had a
Collector with the MultiPQ approach, at the end during merge, you'd
also have to compare results across segments, ie, upgrade your ords to
their real values.  The MultiPQ approach does this by calling
sortValue (returns Comparable) in the end.
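
A rough sketch of that MultiPQ idea, for illustration only: each segment keeps 
its own queue ordered by segment-local ord, and ords are turned into real 
values only once, at merge time. MultiPQSketch, Hit, topOfSegment and merge 
are hypothetical names, the sort is assumed to be ascending on String values, 
and this is not the code from the patch under discussion.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class MultiPQSketch {
    /** A hit carrying its real sort value, resolved only at merge time. */
    static final class Hit {
        final int doc;          // top-level doc ID (docBase + segment-local ID)
        final String sortValue; // real value, comparable across segments
        Hit(int doc, String sortValue) { this.doc = doc; this.sortValue = sortValue; }
    }

    /** Collects the top n docs of one segment, comparing segment-local ords only. */
    static List<Hit> topOfSegment(int docBase, int[] ords, String[] sortedValues, int n) {
        // Max-heap on ord: the weakest kept hit sits on top and is evicted first.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(b[1], a[1]));
        for (int doc = 0; doc < ords.length; doc++) {
            pq.add(new int[] {doc, ords[doc]});
            if (pq.size() > n) {
                pq.poll();
            }
        }
        // Only now are ords upgraded to real values (the "sortValue" step).
        List<Hit> hits = new ArrayList<>();
        for (int[] entry : pq) {
            hits.add(new Hit(docBase + entry[0], sortedValues[entry[1]]));
        }
        return hits;
    }

    /** Merges the per-segment results by real value and keeps the best n. */
    static List<Hit> merge(List<List<Hit>> perSegment, int n) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perSegment) {
            all.addAll(hits);
        }
        all.sort(Comparator.comparing((Hit h) -> h.sortValue).thenComparingInt(h -> h.doc));
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }
}
```

The cross-segment comparison happens exactly once per surviving hit, in 
merge(), which is the trade-off being weighed against the single PQ API.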

Putting performance aside for now... when comparing bottom, you don't
actually have to truly invert Comparable -> ord on segment
transition.  You could, instead, get the Comparable for each and
compare, but then note the smallest ord for the current segment that
has failed to compete, and short-circuit the compareBottom test by
checking against that ord. That should enable carrying over the custom
sort to the single PQ API without needing to invert ord->value.
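
A minimal sketch of that short-circuit, under stated assumptions: the sort is 
ascending, ords within a segment are order-consistent with their values, the 
bottom only ever becomes more competitive while a segment is being collected, 
and a fresh checker is created per segment. BottomShortCircuitSketch, Segment 
and Checker are hypothetical names, not part of any Lucene API.

```java
class BottomShortCircuitSketch {
    /** Hypothetical per-segment view: values[ords[doc]] is the real sort value of doc. */
    static final class Segment {
        final int[] ords;
        final String[] values; // sorted ascending, indexed by ord
        Segment(int[] ords, String[] values) { this.ords = ords; this.values = values; }
    }

    /** Decides whether a doc beats the current bottom of the queue. */
    static final class Checker {
        private final Segment segment;
        private String bottom;                               // weakest value currently in the queue
        private int smallestFailedOrd = Integer.MAX_VALUE;   // nothing has failed yet
        int fullCompares = 0;                                // counts the expensive comparisons

        Checker(Segment segment, String bottom) {
            this.segment = segment;
            this.bottom = bottom;
        }

        /** May only be called with a bottom that is at least as competitive as before. */
        void setBottom(String newBottom) { this.bottom = newBottom; }

        boolean competes(int doc) {
            int ord = segment.ords[doc];
            if (ord >= smallestFailedOrd) {
                return false; // a smaller-or-equal ord already lost, so this one loses too
            }
            fullCompares++;
            boolean competes = segment.values[ord].compareTo(bottom) < 0;
            if (!competes) {
                smallestFailedOrd = ord; // ord < smallestFailedOrd is guaranteed here
            }
            return competes;
        }
    }
}
```

Whether the saved comparisons pay for the extra bookkeeping is exactly the 
kind of thing that would need measuring.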

We'd obviously have to test performance...

Or, we could commit the MultiPQ approach as another sorting collector?
I know it's not great having two wildly different sort APIs, but both
APIs seem to have their strengths in different cases.

Mike




[jira] Commented: (LUCENE-1999) Match spotter for all query types

2009-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768163#action_12768163
 ] 

Michael McCandless commented on LUCENE-1999:


Very clever!

Since you are wrapping arbitrary query objs, couldn't the wrapper make a 
separate data structure for tracking which clause matched (instead of encoding 
it into the score)?
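
As a sketch of what such a separate structure could look like (purely 
illustrative; MatchFlagRecorder and its methods are hypothetical and not taken 
from the matchflagger patch), match flags can be kept per doc ID next to the 
score instead of inside it:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical side structure: one bit per query clause, stored per doc ID. */
class MatchFlagRecorder {
    private final Map<Integer, Integer> flagsByDoc = new HashMap<>();

    /** Marks clause number clauseIndex (0..31) as having matched docId. */
    void recordMatch(int docId, int clauseIndex) {
        if (clauseIndex < 0 || clauseIndex >= 32) {
            throw new IllegalArgumentException("clauseIndex out of range: " + clauseIndex);
        }
        flagsByDoc.merge(docId, 1 << clauseIndex, (a, b) -> a | b);
    }

    /** Returns true if the given clause was recorded as matching docId. */
    boolean matched(int docId, int clauseIndex) {
        return (flagsByDoc.getOrDefault(docId, 0) & (1 << clauseIndex)) != 0;
    }
}
```

Because the flags never touch the float score, no score precision is lost; 
the cost is the extra memory for the map.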

Also: doesn't highlighter run, separately, on each doc?  And so it's OK if the 
scores are affected?  Ie, I would run my main search with a normal query, get 
the 10 results for the current page, then step through each of those 10 doc IDs, 
make a single-doc IndexSearcher, and run this wrapper?

{quote}
Avoiding these precision issues would require a change to Lucene core to record 
docId, score AND a matchFlag byte in ScoreDoc objects and collector APIs.
This may be something we should consider.
{quote}

+1  I would love to see the Scorer API extended to optionally provide details 
on matches.  Not just which clause matched which docs/fields, but the positions 
within the field where the match occurred.  I think we could do this by 
absorbing *SpanQuery into their normal Query counterparts, making the getSpans 
API [somehow] optional so that if you didn't invoke it you don't pay a 
performance price.

 Match spotter for all query types
 -

 Key: LUCENE-1999
 URL: https://issues.apache.org/jira/browse/LUCENE-1999
 Project: Lucene - Java
  Issue Type: New Feature
Affects Versions: 2.9
Reporter: Mark Harwood
 Attachments: matchflagger.patch


 Related to LUCENE-1929 and the current inability to highlight 
 NumericRangeQuery, spatial, cached term filters and other exotica.
 This patch provides the ability to wrap *any* Query objects and record match 
 info as flags encoded in the overall document score.
 Using this approach it would be possible to understand (and therefore 
 highlight) which fields matched clauses in a query.
 The match encoding approach loses some precision in scores as noted here: 
 http://tinyurl.com/ykt8nx7
 Avoiding these precision issues would require a change to Lucene core to 
 record docId, score AND a matchFlag byte in ScoreDoc objects and collector 
 APIs.
 This may be something we should consider.




[jira] Updated: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1987:
--

Attachment: LUCENE-1987-StopFilter.patch

A new patch which resolves the Benchmark problem by adding a static method in 
NewAnalyzerTask that loads an analyzer by class name:
{code}
public static final Analyzer createAnalyzer(String className) throws Exception {
  final Class<? extends Analyzer> clazz =
    Class.forName(className).asSubclass(Analyzer.class);
  try {
    // first try to use a ctor with version parameter (needed for many new
    // Analyzers that have no default one anymore)
    Constructor<? extends Analyzer> cnstr =
      clazz.getConstructor(Version.class);
    return cnstr.newInstance(Version.LUCENE_CURRENT);
  } catch (NoSuchMethodException nsme) {
    // otherwise use default ctor
    return clazz.newInstance();
  }
}
{code}

This method is reused at other places where an Analyzer is created by a config 
property.

This patch now passes all tests. There are still the problems with Analyzer and 
QueryParser with wrong default properties, but I would like to commit this first 
and then solve the problems, also in 2.9.1.

Mike, are you OK with that?

 Remove rest of analysis deprecations (Token, CharacterCache)
 

 Key: LUCENE-1987
 URL: https://issues.apache.org/jira/browse/LUCENE-1987
 Project: Lucene - Java
  Issue Type: Task
  Components: Analysis
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9.1, 3.0

 Attachments: LUCENE-1987-StopFilter-backport29.patch, 
 LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, 
 LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
 LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
 LUCENE-1987.patch, LUCENE-1987.patch, LUCENE-1987.patch


 This removes the rest of the deprecations in the analysis package:
 - -Token's termText field-- (DONE)
 - -eventually un-deprecate ctors of Token taking Strings (they are still 
 useful) - if yes remove deprec in 2.9.1- (DONE)
 - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
 - Stopwords lists
 - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
 are deprecated, but we still have the VERSION constants. Do not know, how to 
 proceed. Keep the settings alive for index compatibility? Or remove it 
 together with the version constants (which were undeprecated).




[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768167#action_12768167
 ] 

Michael McCandless commented on LUCENE-1987:


bq. Mike, are you OK with that?

Looks great!  Not only am I OK with it, it's exactly what I proposed (above -- 
https://issues.apache.org/jira/browse/LUCENE-1987?focusedCommentId=12767449&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12767449).
  Maybe you missed my response there?  (I also suggested adding Version to QP 
ctor).





[jira] Commented: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768178#action_12768178
 ] 

Uwe Schindler commented on LUCENE-1987:
---

I saw your comment yesterday and implemented the benchmark thing that way.

The QP ctor with Version param also looks good, but we have to add this to 2.9, 
too, to be able to remove the no-arg ctor.

My patch still has a failing test in the ant task (missing no-arg ctor); I will 
look into it, but the fix is the same as for benchmark.





[jira] Created: (LUCENE-2001) wordnet parsing bug

2009-10-21 Thread Robert Muir (JIRA)
wordnet parsing bug
---

 Key: LUCENE-2001
 URL: https://issues.apache.org/jira/browse/LUCENE-2001
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/*
Affects Versions: 2.9
Reporter: Robert Muir
Priority: Minor


A user reported that wordnet parses the prolog file incorrectly.

Also need to check the wordnet parser in the memory contrib for this problem.

If this is a false alarm, I'm not worried, because the test will be the first 
unit test the wordnet package has ever had.

{noformat}
For example, looking up the synsets for the
word "king", we get:

java SynLookup wnindex king
baron
magnate
mogul
power
queen
rex
scrofula
struma
tycoon

Here, "scrofula" and "struma" are extraneous. This happens because the line
parser code in Syns2Index.java interprets the two consecutive single quotes
in entry s(114144247,3,'king''s evil',n,1,1) in the wn_s.pl file as
termination of the string and separates it into "king". This entry concerns
the synset of the words "scrofula" and "struma", and thus they get inserted in
the synset of "king". *There are 1382 such entries in wn_s.pl* and more in
other WordNet Prolog database files where such use of two consecutive single
quotes appears.

We have resolved this by adding a statement in the line parsing portion of
Syns2Index.java, as follows:

   // parse line
   line = line.substring(2);
   line = line.replaceAll("\'\'", "`"); // added statement
   int comma = line.indexOf(',');
   String num = line.substring(0, comma);  ... ... etc.

In short, we replace '' by ` (a back-quote). Then on recreating the
index, we get:

java SynLookup zwnindex king
baron
magnate
mogul
power
queen
rex
tycoon
{noformat}
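
For a unit test, it may help to have a reference for how such a line should be 
read. The sketch below is an assumption-laden alternative to the reported 
workaround: instead of replacing '' with a back-quote, it treats two 
consecutive single quotes inside a quoted Prolog atom as one literal quote. 
PrologSynsetLine and parseWord are hypothetical names, not part of 
Syns2Index.java.

```java
class PrologSynsetLine {
    /**
     * Extracts the first quoted word from a WordNet Prolog fact such as
     * s(114144247,3,'king''s evil',n,1,1). and unescapes '' to a single quote.
     */
    static String parseWord(String line) {
        int start = line.indexOf('\'');
        if (start < 0) {
            throw new IllegalArgumentException("no quoted word in: " + line);
        }
        StringBuilder word = new StringBuilder();
        int i = start + 1;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c == '\'') {
                if (i + 1 < line.length() && line.charAt(i + 1) == '\'') {
                    word.append('\''); // doubled quote: an escaped literal quote
                    i += 2;
                    continue;
                }
                return word.toString(); // single quote: end of the atom
            }
            word.append(c);
            i++;
        }
        throw new IllegalArgumentException("unterminated quoted word in: " + line);
    }
}
```

With this reading, the entry from the report yields the single word 
"king's evil" rather than "king".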




[jira] Commented: (LUCENE-1999) Match spotter for all query types

2009-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768191#action_12768191
 ] 

Michael McCandless commented on LUCENE-1999:


I see, it sounds like your use case is different from the typical
highlighting use case in that 1) you don't need the positions of the
matches (just whether a given clause matched the doc or not), and 2)
you need it for every single doc visited by the query, not just for
the handful of docs that are being presented to the user on the
current page.

bq. This would suggest that you might need 2 query expressions - one for 
execution and one for adding highlighter instrumentation.

I'm thinking it's the same query, but we fix the Scorer API for all
queries (= big change!!) to be able to produce match details on
demand, where those match details look something like what getSpans
now returns.  But for the normal case (only highlighting the docs
being shown on current page), we'd only get the match details for that
small set of docs.

Then we ideally would not need a separate mirrored set of span
queries.  Ie, SpanTermQuery would be absorbed into TermQuery, etc.

But I could easily be being too naive here :) Maybe there is some
serious performance cost to even adding the optional API in.





[jira] Resolved: (LUCENE-1987) Remove rest of analysis deprecations (Token, CharacterCache)

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-1987.
---

Resolution: Fixed

Committed in 2.9, 3.0, backwards branch.

For the QueryParser problems and other additions of version constants I will 
open another issue.

 Remove rest of analysis deprecations (Token, CharacterCache)
 

 Key: LUCENE-1987
 URL: https://issues.apache.org/jira/browse/LUCENE-1987
 Project: Lucene - Java
  Issue Type: Task
  Components: Analysis
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 2.9.1, 3.0

 Attachments: LUCENE-1987-StopFilter-backport29.patch, 
 LUCENE-1987-StopFilter-BW.patch, LUCENE-1987-StopFilter.patch, 
 LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
 LUCENE-1987-StopFilter.patch, LUCENE-1987-StopFilter.patch, 
 LUCENE-1987-StopFilter.patch, LUCENE-1987.patch, LUCENE-1987.patch, 
 LUCENE-1987.patch


 This removes the rest of the deprecations in the analysis package:
 - -Token's termText field-- (DONE)
 - -eventually un-deprecate ctors of Token taking Strings (they are still 
 useful) - if yes remove deprec in 2.9.1- (DONE)
 - -remove CharacterCache and use Character.valueOf() from Java5- (DONE)
 - Stopwords lists
 - Remove the backwards settings from analyzers (acronym, posIncr,...). They 
 are deprecated, but we still have the VERSION constants. Do not know, how to 
 proceed. Keep the settings alive for index compatibility? Or remove it 
 together with the version constants (which were undeprecated).




[jira] Updated: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1998:
--

Attachment: LUCENE-1998_enum.patch

Updated patch (merged with StandardAnalyzer version constants). Also added 
Lucene version 3.0 for completeness, so that users can build apps without 
needing to use the CURRENT constant.

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Created: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Uwe Schindler (JIRA)
Add oal.util.Version ctor to QueryParser


 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
 Fix For: 3.0, 2.9


This is a followup of LUCENE-1987:

If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses
QueryParser, phrase queries will not work, because the StopFilter enables
position increments for stop words, but QueryParser ignores them by default.
The user has to explicitly enable them.

This issue would add a ctor taking the Version constant and automatically 
enable this setting. The same applies to the contrib queryparser. Eventually 
also StopAnalyzer should add this version ctor.

To be able to remove the default ctor for 3.0 (to remove a possible trap for 
users of QueryParser), it must be deprecated and the new one also added to 
2.9.1.
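
A minimal sketch of the proposed ctor, not the actual patch. SketchVersion and
SketchQueryParser are hypothetical stand-ins for o.a.l.util.Version and
QueryParser, and the LUCENE_29 threshold is an assumption derived from the
description above (StopFilter enabling position increments as of 2.9).

```java
// Hypothetical stand-in for o.a.l.util.Version; constants ordered oldest to newest.
enum SketchVersion {
    LUCENE_24, LUCENE_29, LUCENE_30;

    boolean onOrAfter(SketchVersion other) {
        return compareTo(other) >= 0;
    }
}

// Hypothetical stand-in for QueryParser, reduced to the one setting discussed here.
final class SketchQueryParser {
    private final String field;
    private boolean enablePositionIncrements;

    /** Old-style ctor: position increments stay disabled (the trap described above). */
    SketchQueryParser(String field) {
        this.field = field;
        this.enablePositionIncrements = false;
    }

    /** Version-aware ctor: assumed to enable position increments from 2.9 on. */
    SketchQueryParser(SketchVersion matchVersion, String field) {
        this.field = field;
        this.enablePositionIncrements = matchVersion.onOrAfter(SketchVersion.LUCENE_29);
    }

    String getField() {
        return field;
    }

    boolean getEnablePositionIncrements() {
        return enablePositionIncrements;
    }

    /** The setter stays, so users can still override the version-based default. */
    void setEnablePositionIncrements(boolean enable) {
        this.enablePositionIncrements = enable;
    }
}
```

With this shape, a caller who passes the same Version constant to the analyzer
and to the parser gets matching behaviour without an extra setter call.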




[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread DM Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768215#action_12768215
 ] 

DM Smith commented on LUCENE-1998:
--

.bq I only added the license header back in the Version class. It must be there.

Sorry about whacking the license on Version. It must have been an accident. I
know it needs to be there.

.bq Some fine tuning: You defined package protected abstract methods, but made 
them public in the enum constant. Changed to all-public. This was also a 
backwards-break in contrib/queryParser.

Thanks. Inadvertently,  I was following the pattern for an Interface, where 
scoping does not matter.

.bq So it works, but not with switch statements.
IMHO: Having a switch statement (or cascading if-then-else) over the collection 
of values is generally indicative of a bad design (or an opportunity for an 
improved design :) By adding methods to each enum that return literals, we can 
eliminate this and at the same time, improve performance.

There is another tuning opportunity, which I didn't take. We are marshaling out 
the flags from the enums into member variables. I'm not sure how efficient the 
storage of a boolean vs an enum is. If it is a wash, then having an enum value 
as replacement would be a good thing. It should clearly document what controls
the flag.

The only complication would be the set/get for some of the flags. (E.g. 
AbstractField.setOmitNorms.) What's with that? Are the enum values merely a 
hint??? Does it make sense to allow omitNorms to be changed after an 
AbstractField is being used?
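
The "methods that return literals instead of a switch" idea could look roughly
like the following sketch. IndexMode and FieldFlags are hypothetical names that
only loosely resemble Field.Index and AbstractField; the flag values are chosen
for illustration and are not copied from the patch. The abstract methods are
declared public, matching the "all-public" fine tuning quoted above.

```java
// Hypothetical enum; each constant answers for itself, so callers need no switch.
enum IndexMode {
    NO {
        @Override public boolean isIndexed()  { return false; }
        @Override public boolean isAnalyzed() { return false; }
        @Override public boolean omitNorms()  { return true; }
    },
    ANALYZED {
        @Override public boolean isIndexed()  { return true; }
        @Override public boolean isAnalyzed() { return true; }
        @Override public boolean omitNorms()  { return false; }
    },
    NOT_ANALYZED_NO_NORMS {
        @Override public boolean isIndexed()  { return true; }
        @Override public boolean isAnalyzed() { return false; }
        @Override public boolean omitNorms()  { return true; }
    };

    public abstract boolean isIndexed();
    public abstract boolean isAnalyzed();
    public abstract boolean omitNorms();
}

// Hypothetical field holder that copies the flags out of the enum once.
final class FieldFlags {
    final boolean indexed;
    final boolean tokenized;
    final boolean omitNorms;

    FieldFlags(IndexMode mode) {
        this.indexed = mode.isIndexed();
        this.tokenized = mode.isAnalyzed();
        this.omitNorms = mode.omitNorms();
    }
}
```

Adding a new constant then forces the author to implement all three methods,
whereas a switch in client code would silently fall through to its default.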



 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Commented: (LUCENE-1257) Port to Java5

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768234#action_12768234
 ] 

Uwe Schindler commented on LUCENE-1257:
---

Committed:
- LUCENE-1257_more_unnecessary_casts.patch
- Removed the rest of the unchecked warnings. I added a TODO wherever I do not
understand the code and do not know for sure what is inside the collections.
This can be fixed some time later, but the core code now compiles without any
unchecked warnings.

Revision: 828011


 Port to Java5
 -

 Key: LUCENE-1257
 URL: https://issues.apache.org/jira/browse/LUCENE-1257
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Analysis, Examples, Index, Other, Query/Scoring, 
 QueryParser, Search, Store, Term Vectors
Affects Versions: 3.0
Reporter: Cédric Champeau
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: instantiated_fieldable.patch, 
 LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch, 
 LUCENE-1257-BufferedDeletes_DocumentsWriter.patch, 
 LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch, 
 LUCENE-1257-CompoundFileReaderWriter.patch, 
 LUCENE-1257-ConcurrentMergeScheduler.patch, 
 LUCENE-1257-DirectoryReader.patch, 
 LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch, 
 LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch, 
 LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldCacheRangeFilter.patch, 
 LUCENE-1257-IndexDeleter.patch, 
 LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch, LUCENE-1257-iw.patch, 
 LUCENE-1257-MTQWF.patch, LUCENE-1257-NormalizeCharMap.patch, 
 LUCENE-1257-o.a.l.util.patch, LUCENE-1257-org_apache_lucene_document.patch, 
 LUCENE-1257-org_apache_lucene_document.patch, 
 LUCENE-1257-org_apache_lucene_document.patch, LUCENE-1257-SegmentInfos.patch, 
 LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch, 
 LUCENE-1257-StringBuffer.patch, LUCENE-1257-TopDocsCollector.patch, 
 LUCENE-1257-WordListLoader.patch, LUCENE-1257_analysis.patch, 
 LUCENE-1257_BooleanFilter_Generics.patch, 
 LUCENE-1257_contrib_highlighting.patch, LUCENE-1257_javacc_upgrade.patch, 
 LUCENE-1257_messages.patch, LUCENE-1257_more_unnecessary_casts.patch, 
 LUCENE-1257_MultiFieldQueryParser.patch, LUCENE-1257_o.a.l.queryParser.patch, 
 LUCENE-1257_o.a.l.store.patch, LUCENE-1257_o_a_l_index_test.patch, 
 LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch, 
 LUCENE-1257_o_a_l_search_spans.patch, 
 LUCENE-1257_org_apache_lucene_index.patch, 
 LUCENE-1257_org_apache_lucene_index.patch, LUCENE-1257_queryParser_jj.patch, 
 LUCENE-1257_unnecessary_casts.patch, lucene1257surround1.patch, 
 lucene1257surround1.patch, shinglematrixfilter_generified.patch


 For my needs I've updated Lucene so that it uses Java 5 constructs. I know
 Java 5 migration had been planned for 2.1 someday in the past, but don't know
 when it is planned now. This patch against the trunk includes:
 - most obvious generics usage (there are tons of usages of sets, ... Those
 which are commonly used have been generified)
 - PriorityQueue generification
 - replacement of indexed for loops with for-each constructs
 - removal of unnecessary unboxing
 The code is, in my opinion, much more readable with those features (you
 actually *know* what is stored in collections when reading the code, without
 the need to look up field definitions every time) and it simplifies many
 algorithms.
 Note that this patch also includes an interface for the Query class. This has
 been done for my company's needs for building custom Query classes which add
 some behaviour to the base Lucene queries. It prevents multiple unnecessary
 casts. I know this introduction is not wanted by the team, but it really
 makes our developments easier to maintain. If you don't want to use this,
 replace all /Queriable/ calls with standard /Query/.
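
For readers unfamiliar with the constructs listed above, here is a small
sketch of what such a port typically produces. The TermCounts class is made up
for this example and is not taken from any of the attached patches.

```java
import java.util.List;

// Hypothetical helper showing generics, for-each and auto-unboxing together.
final class TermCounts {
    private TermCounts() {}

    /** Sums the counts; the element type is visible in the signature. */
    static int total(List<Integer> counts) {
        int sum = 0;
        // for-each replaces the indexed loop; unboxing replaces an explicit
        // cast to Integer followed by intValue().
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }
}
```

The pre-Java-5 equivalent would take a raw List, loop over indexes, and cast
each element, which is the kind of code the patches remove.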




[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768242#action_12768242
 ] 

Uwe Schindler commented on LUCENE-1998:
---

(it's bq. not .bq :-) )

{quote}
bq. So it works, but not with switch statements.
IMHO: Having a switch statement (or cascading if-then-else) over the collection 
of values is generally indicative of a bad design (or an opportunity for an 
improved design :) By adding methods to each enum that return literals, we can
eliminate this and at the same time, improve performance.
{quote}

You are right, my concern was more about client code of Lucene that may, for
example, have a switch statement on Field.Index (e.g. Solr) to control some
further indexing steps. If we rename a constant, such a switch statement would
still work in already compiled code, but not once the code is recompiled
against the modified version. That was my problem. In 3.0 this will not happen
as there are no deprecated enum constants, but maybe later. In that case, a
CHANGES.txt entry should be added.

bq. There is another tuning opportunity, which I didn't take. We are marshaling 
out the flags from the enums into member variables. I'm not sure how efficient 
the storage of a boolean vs an enum is. If it is a wash, then having an enum 
value as replacement would be a good thing. It should clearly document what
controls the flag.

This is currently not possible for backwards-compatibility reasons: the fields
are protected and not deprecated in 2.9. I think with your change we are fine.

bq. The only complication would be the set/get for some of the flags. (E.g. 
AbstractField.setOmitNorms.) What's with that? Are the enum values merely a 
hint??? Does it make sense to allow omitNorms to be changed after an 
AbstractField is being used?

It is perfectly legal to change these flags after creating the field, so the
setters must be there.

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Issue Comment Edited: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread DM Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768215#action_12768215
 ] 

DM Smith edited comment on LUCENE-1998 at 10/21/09 2:22 PM:


bq. I only added the license header back in the Version class. It must be there.

Sorry about whacking the license on Version. It must have been an accident. I
know it needs to be there.

bq. Some fine tuning: You defined package protected abstract methods, but made
them public in the enum constant. Changed to all-public. This was also a 
backwards-break in contrib/queryParser.

Thanks. Inadvertently,  I was following the pattern for an Interface, where 
scoping does not matter.

bq. So it works, but not with switch statements.
IMHO: Having a switch statement (or cascading if-then-else) over the collection 
of values is generally indicative of a bad design (or an opportunity for an 
improved design :) By adding methods to each enum that return literals, we can 
eliminate this and at the same time, improve performance.

There is another tuning opportunity, which I didn't take. We are marshaling out 
the flags from the enums into member variables. I'm not sure how efficient the 
storage of a boolean vs an enum is. If it is a wash, then having an enum value 
as replacement would be a good thing. It should clearly document what controls
the flag.

The only complication would be the set/get for some of the flags. (E.g. 
AbstractField.setOmitNorms.) What's with that? Are the enum values merely a 
hint??? Does it make sense to allow omitNorms to be changed after an 
AbstractField is being used?



  was (Author: dmsmith):
.bq I only added the license header back in the Version class. It must be 
there.

Sorry about wacking the license on Version. It must have been an accident. I 
know it needs to be there.

.bq Some fine tuning: You defined package protected abstract methods, but made 
them public in the enum constant. Changed to all-public. This was also a 
backwards-break in contrib/queryParser.

Thanks. Inadvertently,  I was following the pattern for an Interface, where 
scoping does not matter.

.bq So it works, but not with switch statements.
IMHO: Having a switch statement (or cascading if-then-else) over the collection 
of values is generally indicative of a bad design (or an opportunity for an 
improved design :) By adding methods to each enum that return literals, we can 
eliminate this and at the same time, improve performance.

There is another tuning opportunity, which I didn't take. We are marshaling out 
the flags from the enums into member variables. I'm not sure how efficient the 
storage of a boolean vs an enum is. If it is a wash, then having an enum value 
as replacement would be a good thing. It sould clearly document what controls 
the flag.

The only complication would be the set/get for some of the flags. (E.g. 
AbstractField.setOmitNorms.) What's with that? Are the enum values merely a 
hint??? Does it make sense to allow omitNorms to be changed after an 
AbstractField is being used?


  
 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Commented: (LUCENE-1999) Match spotter for all query types

2009-10-21 Thread Mark Harwood (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768257#action_12768257
 ] 

Mark Harwood commented on LUCENE-1999:
--

bq. and 2) you need it for every single doc visited by the query

Actually I don't need it for every doc, only the top ones - it just happens to
be so cheap to produce that I can afford to run this in-line with the query. (I
haven't actually benchmarked it at scale, but my gut feeling is that it would
be fast.)

I was thinking that this might be orthogonal to the existing free-text based 
highlighter. The logic for this being roughly that

1) Highlighting of free-text fields is reasonably well-catered for with 
summarisation etc.
2) The remaining problem areas for highlighting (NumericRangeQuery, Spatial, 
Cached term filters on enums eg gender:male/female) are all likely to be 
non-free-text fields which don't require summarisation and only contain a 
single value.

I may be wrong in these assumptions about the existing state of play (any 
thoughts, Mark M?) but it might be useful to think of attacking the problem 
with these 2 different requirements in mind.

Regardless of type e.g. int, long etc I tend to think of fields as falling into 
these broad usage categories:

a) Identifiers (e.g. primary keys)
b) Quantifiers (e.g numerics, dates, spatial)
c) Free-text 
d) Controlled vocabularies (e.g. enums such as gender:m/f)

Type a) is catered for with a straight TermQuery and therefore can be handled
with the existing highlighter
Type b) needs special indexes/queries (spatial/trie) and isn't catered for by 
the existing term/span-based Highlighter
Type c) is catered for with the existing highlighter and its summarising 
features
Type d) involves many TermDoc.next reads so is usefully cached as filters and 
therefore not catered for by existing Highlighter

So this patch helps cater for types b) and d) where simply knowing the field 
matched is all that is required to highlight.


 Match spotter for all query types
 -

 Key: LUCENE-1999
 URL: https://issues.apache.org/jira/browse/LUCENE-1999
 Project: Lucene - Java
  Issue Type: New Feature
Affects Versions: 2.9
Reporter: Mark Harwood
 Attachments: matchflagger.patch


 Related to LUCENE-1929 and the current inability to highlight 
 NumericRangeQuery, spatial, cached term filters and other exotica.
 This patch provides the ability to wrap *any* Query objects and record match 
 info as flags encoded in the overall document score.
 Using this approach it would be possible to understand (and therefore 
 highlight) which fields matched clauses in a query.
 The match encoding approach loses some precision in scores as noted here: 
 http://tinyurl.com/ykt8nx7
 Avoiding these precision issues would require a change to Lucene core to 
 record docId, score AND a matchFlag byte in ScoreDoc objects and collector 
 APIs.
 This may be something we should consider.
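
To make the precision trade-off concrete, here is one way match flags could be
packed into a float score. This is only an illustrative sketch under stated
assumptions: MatchFlagCodec is a made-up class, the choice of the 8 lowest
mantissa bits is arbitrary, and nothing here claims to reproduce what
matchflagger.patch actually does.

```java
// Hypothetical codec: stores up to 8 match flags in the low mantissa bits of a score.
final class MatchFlagCodec {
    private static final int FLAG_MASK = 0xFF;

    private MatchFlagCodec() {}

    /** Overwrites the 8 least significant mantissa bits of score with flags. */
    static float encode(float score, int flags) {
        int bits = Float.floatToIntBits(score);
        return Float.intBitsToFloat((bits & ~FLAG_MASK) | (flags & FLAG_MASK));
    }

    /** Reads the flags back out of an encoded score. */
    static int decode(float encodedScore) {
        return Float.floatToIntBits(encodedScore) & FLAG_MASK;
    }
}
```

The score keeps its sign and exponent but loses its 8 lowest mantissa bits,
which is the kind of precision loss the description refers to. A separate
matchFlag byte in ScoreDoc would avoid that loss entirely.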




[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread DM Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768270#action_12768270
 ] 

DM Smith commented on LUCENE-1998:
--

I just noticed that enums are comparable. For the enum Version, we could take
advantage of this and not store a number for each value. It would be important
to maintain the order of versions in the file from earliest to latest.

Should we do this?

Then the current patch's (comments removed for clarity):

public enum Version {
  LUCENE_CURRENT (0),
  LUCENE_20  (2000),
  LUCENE_21  (2100),
  LUCENE_22  (2200),
  LUCENE_23  (2300),
  LUCENE_24  (2400),
  LUCENE_29  (2900),
  LUCENE_30  (3000);

  private Version(int v) {
    this.v = v;
  }

  public boolean onOrAfter(Version other) {
    return v == 0 || v >= other.v;
  }

  private final int v;
}

Would become (the comment on strict ordering is necessary):

public enum Version {

  // These have to be ordered from the oldest to the newest version
  LUCENE_20,
  LUCENE_21,
  LUCENE_22,
  LUCENE_23,
  LUCENE_24,
  LUCENE_29,
  LUCENE_30,
  // This needs to be last
  LUCENE_CURRENT;

  /** A convenience method merely calling this.compareTo(other) >= 0 */
  public boolean onOrAfter(Version other) {
    return compareTo(other) >= 0;
  }

}


 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768273#action_12768273
 ] 

Uwe Schindler commented on LUCENE-1998:
---

I thought about that, too: I would not do this. Especially because I want to 
have the 0-version (current) as first element for serialization purposes 
(changing the order of enum constants is bad, you should always add them at the 
end).

Eventually we may want to expose the integer v through some public accessor
(for more specific comparisons and so on).

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Updated: (LUCENE-2001) wordnet parsing bug

2009-10-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2001:


Attachment: LUCENE-2001.patch

Fix and tests for the bug.
This only affects the wordnet contrib; the bug does not exist in the wordnet
synonymfilter from the memory package, but I added a test there too.


 wordnet parsing bug
 ---

 Key: LUCENE-2001
 URL: https://issues.apache.org/jira/browse/LUCENE-2001
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/*
Affects Versions: 2.9
Reporter: Robert Muir
Priority: Minor
 Attachments: LUCENE-2001.patch


 A user reported that wordnet parses the prolog file incorrectly.
 Also need to check the wordnet parser in the memory contrib for this problem.
 If this is a false alarm, I'm not worried, because the test will be the first
 unit test the wordnet package has ever had.
 {noformat}
 For example, looking up the synsets for the
 word king, we get:
 java SynLookup wnindex king
 baron
 magnate
 mogul
 power
 queen
 rex
 scrofula
 struma
 tycoon
 Here, scrofula and struma are extraneous. This happens because the line
 parser code in Syns2Index.java interprets the two consecutive single quotes
 in the entry s(114144247,3,'king''s evil',n,1,1) in the wn_s.pl file as the
 termination of the string and separates it into king. This entry concerns the
 synset of the words scrofula and struma, and thus they get inserted in the
 synset of king. *There are 1382 such entries in wn_s.pl*, and more in other
 WordNet Prolog database files where two consecutive single quotes appear.
 We have resolved this by adding a statement in the line parsing portion of
 Syns2Index.java, as follows:
// parse line
line = line.substring(2);
line = line.replaceAll("''", "`"); // added statement
int comma = line.indexOf(',');
String num = line.substring(0, comma);  ... ... etc.
 In short we replace '' by ` (a back-quote). Then on recreating the
 index, we get:
 java SynLookup zwnindex king
 baron
 magnate
 mogul
 power
 queen
 rex
 tycoon
 {noformat}
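
As an alternative to substituting a back-quote, the doubled quote can be
unescaped while scanning the line. The following is a sketch only:
PrologWordParser and quotedWord are hypothetical names, not part of
Syns2Index.java or of the attached patch, and the code assumes each line
carries exactly one single-quoted atom, as in the wn_s.pl entry quoted above.

```java
// Hypothetical helper that extracts the quoted word from a wn_s.pl style line.
final class PrologWordParser {
    private PrologWordParser() {}

    /** Returns the first single-quoted atom, turning '' back into one quote. */
    static String quotedWord(String line) {
        int start = line.indexOf('\'');
        if (start < 0) {
            throw new IllegalArgumentException("no quoted atom in: " + line);
        }
        StringBuilder word = new StringBuilder();
        int i = start + 1;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c == '\'') {
                // Two consecutive quotes are an escaped quote, not the end of the atom.
                if (i + 1 < line.length() && line.charAt(i + 1) == '\'') {
                    word.append('\'');
                    i += 2;
                    continue;
                }
                return word.toString();
            }
            word.append(c);
            i++;
        }
        throw new IllegalArgumentException("unterminated quoted atom in: " + line);
    }
}
```

For the entry above this yields the single word "king's evil", so scrofula and
struma no longer end up in the synset of king.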




[jira] Updated: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2002:
---

Fix Version/s: (was: 3.0)
   (was: 2.9)
   2.9.1

 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
 Fix For: 2.9.1


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position increments for stop words, but QueryParser ignores them by default.
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




[jira] Commented: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768294#action_12768294
 ] 

Michael McCandless commented on LUCENE-2002:


Uwe I can take this if you want?  Have you started?

 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
 Fix For: 2.9.1


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position increments for stop words, but QueryParser ignores them by default.
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




[jira] Commented: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768301#action_12768301
 ] 

Uwe Schindler commented on LUCENE-2002:
---

While working on LUCENE-1987, I also found a bug in Highlighter, which is
likewise unable to handle the posIncr of stopwords correctly. Add another issue?

 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
 Fix For: 2.9.1


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position increments for stop words, but QueryParser ignores them by default.
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




[jira] Commented: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread DM Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12768304#action_12768304
 ] 

DM Smith commented on LUCENE-1998:
--

bq. changing the order of enum constants is bad, you should always add them at 
the end
Is this true?

I did not know how Java serializes enums so I went looking:
See: http://java.sun.com/j2se/1.5.0/docs/guide/serialization/relnotes15.html

Turns out it serializes the text representation of the enum constant and class 
info. This is just like the Parameter class.

If I understand it correctly, with this, an enum is resilient to changes in 
order. New constants can go in any place (for example, we can later add 
LUCENE_291 before LUCENE_30) and not break serialization compatibility.

This is especially good for the future as it allows a path for deprecations. 
(E.g. deprecation of o.a.l.d.Field.Index.COMPRESS)

So having LUCENE_CURRENT at the end is fine.
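
The serialization behaviour described above can be checked with a small
experiment. DemoVersion and EnumSerializationDemo are names invented for this
sketch; only the java.io classes are real. The sketch shows that the constant's
name travels in the stream and that deserialization resolves it back to the
same constant.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical enum standing in for Version.
enum DemoVersion { LUCENE_29, LUCENE_30, LUCENE_CURRENT }

final class EnumSerializationDemo {
    private EnumSerializationDemo() {}

    /** Serializes one constant with standard Java serialization. */
    static byte[] serialize(DemoVersion value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads the constant back; enums are resolved by name. */
    static DemoVersion deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (DemoVersion) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if the constant's name appears literally in the serialized bytes. */
    static boolean streamContainsName(DemoVersion value) {
        String raw = new String(serialize(value), StandardCharsets.ISO_8859_1);
        return raw.contains(value.name());
    }
}
```

Because the name rather than the ordinal is written, inserting a constant in
the middle does not change what old streams deserialize to. Renaming or
removing a constant, on the other hand, would break old streams.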

If we wanted it first (or anywhere else), onOrAfter could be:
public boolean onOrAfter(Version other) {
  return other == LUCENE_CURRENT || compareTo(other) >= 0;
}

If we wanted to expose version numbering info in the future, I'd suggest the 
following pattern (names are unimportant):
LUCENE_29 {
   public int getMajor() { return 2; }
   public int getMinor() { return 9; }
   public int getFix()  { return 0; }
}
because it does not require storage and, unlike 2900, does not rely on
positional notation (PIC code), e.g. public int getMajor() { return
2900 / 1000; }
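
Spelled out as a compilable sketch, the pattern needs the accessors declared
on the enum itself so that each constant can override them. NumberedVersion is
a hypothetical name, and making getFix() a shared default is an assumption for
brevity, not something proposed in the patch.

```java
// Hypothetical enum showing constant-specific version accessors.
enum NumberedVersion {
    LUCENE_24 {
        @Override public int getMajor() { return 2; }
        @Override public int getMinor() { return 4; }
    },
    LUCENE_29 {
        @Override public int getMajor() { return 2; }
        @Override public int getMinor() { return 9; }
    },
    LUCENE_30 {
        @Override public int getMajor() { return 3; }
        @Override public int getMinor() { return 0; }
    };

    public abstract int getMajor();
    public abstract int getMinor();

    /** Fix level; constants for bug-fix releases would override this. */
    public int getFix() { return 0; }

    /** Ordering still comes from declaration order, oldest first. */
    public boolean onOrAfter(NumberedVersion other) {
        return compareTo(other) >= 0;
    }
}
```

No per-constant field is stored; the numbers live in the method bodies, and
comparison keeps relying on the declaration order.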

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Assigned: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-2002:
--

Assignee: Michael McCandless

 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
Assignee: Michael McCandless
 Fix For: 2.9.1


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position Increments for stop words, but QueryParser ignores them per default. 
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




[jira] Commented: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768308#action_12768308
 ] 

Michael McCandless commented on LUCENE-2002:


bq. Add another issue?

+1!

 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
Assignee: Michael McCandless
 Fix For: 2.9.1


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position Increments for stop words, but QueryParser ignores them per default. 
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




[jira] Updated: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1998:
--

Attachment: LUCENE-1998_enum_BW.patch
LUCENE-1998_enum.patch

I changed the Version enum. All tests still pass. I also added a test for the 
backwards branch that checks that the transition from Parameter -> enum is 
binary compatible and supported by Java's linker.

I will commit soon.
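
Why old call sites can keep linking is easier to see with a small example. A compiled reference to a constant such as Field.Store.YES names the owning class, the field name and the field's type, not the owner's superclass. The sketch below uses two hypothetical classes, OldStore (a pre-Java-5 type-safe enumeration) and NewStore (a Java 5 enum), and checks by reflection that both expose constants of the same shape. It is an illustration of the idea only, not the backwards test mentioned above.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical pre-Java-5 type-safe enumeration, in the style of a Parameter subclass.
class OldStore {
    private final String name;
    private OldStore(String name) { this.name = name; }
    @Override public String toString() { return name; }

    public static final OldStore YES = new OldStore("YES");
    public static final OldStore NO = new OldStore("NO");
}

// Hypothetical Java 5 replacement with the same constant names.
enum NewStore { YES, NO }

class EnumLinkageSketch {
    /** True if {@code type} has a public static final field {@code name} whose type is {@code type}. */
    static boolean isSelfTypedConstant(Class<?> type, String name) {
        try {
            Field f = type.getField(name);
            int m = f.getModifiers();
            return Modifier.isPublic(m) && Modifier.isStatic(m)
                && Modifier.isFinal(m) && f.getType() == type;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSelfTypedConstant(OldStore.class, "YES"));
        System.out.println(isSelfTypedConstant(NewStore.class, "YES"));
    }
}
```

Code that calls methods declared only on the old superclass would not be covered by this argument, which matches the caveat about "normal usage" earlier in the thread.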

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum_BW.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Updated: (LUCENE-2001) wordnet parsing bug

2009-10-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2001:


Lucene Fields: [New, Patch Available]  (was: [New])
Fix Version/s: 3.0
   2.9.1

Committed revision 828091 to trunk.

I set fix for 2.9.1 here, in case someone has some free time to commit the 
patch.

Thanks Parag! 

 wordnet parsing bug
 ---

 Key: LUCENE-2001
 URL: https://issues.apache.org/jira/browse/LUCENE-2001
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/*
Affects Versions: 2.9
Reporter: Robert Muir
Priority: Minor
 Fix For: 2.9.1, 3.0

 Attachments: LUCENE-2001.patch, LUCENE-2001_branch.patch, 
 LUCENE-2001_branch.patch


 A user reported that wordnet parses the prolog file incorrectly.
 Also need to check the wordnet parser in the memory contrib for this problem.
 If this is a false alarm, i'm not worried, because the test will be the first 
 unit test wordnet package ever had.
 {noformat}
 For example, looking up the synsets for the
 word king, we get:
 java SynLookup wnindex king
 baron
 magnate
 mogul
 power
 queen
 rex
 scrofula
 struma
 tycoon
 Here, scrofula and struma are extraneous. This happens because the line
 parser code in Syns2Index.java interprets the two consecutive single quotes
 in the entry s(114144247,3,'king''s evil',n,1,1) in the wn_s.pl file as the
 termination of the string and separates out "king". This entry concerns the
 synset of the words scrofula and struma, and thus they get inserted in the
 synset of king. *There are 1382 such entries in wn_s.pl* and more in other
 WordNet Prolog database files where two consecutive single quotes appear.
 We have resolved this by adding a statement in the line parsing portion of
 Syns2Index.java, as follows:
// parse line
line = line.substring(2);
   * line = line.replaceAll("\'\'", "`"); // added statement*
int comma = line.indexOf(',');
String num = line.substring(0, comma);  ... ... etc.
 In short we replace '' by ` (a back-quote). Then on recreating the
 index, we get:
 java SynLookup zwnindex king
 baron
 magnate
 mogul
 power
 queen
 rex
 tycoon
 {noformat}
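
The quoting problem described in the report can be reproduced in isolation. The sketch below is not the Syns2Index code and not the committed patch; PrologAtomSketch and firstQuotedAtom are hypothetical names. Unlike the reported workaround, which substitutes a back-quote, it treats a doubled single quote inside a Prolog atom as an escaped quote.

```java
// Hypothetical helper showing how '' inside a quoted Prolog atom can be handled.
class PrologAtomSketch {
    /**
     * Returns the first single-quoted atom on the line, turning each doubled
     * quote into one literal quote, or null if there is no complete atom.
     */
    static String firstQuotedAtom(String line) {
        int i = line.indexOf('\'');
        if (i < 0) {
            return null;
        }
        StringBuilder atom = new StringBuilder();
        i++; // skip the opening quote
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c == '\'') {
                boolean doubled = i + 1 < line.length() && line.charAt(i + 1) == '\'';
                if (!doubled) {
                    return atom.toString(); // a lone quote ends the atom
                }
                atom.append('\''); // '' stands for one literal quote
                i += 2;
            } else {
                atom.append(c);
                i++;
            }
        }
        return null; // closing quote never found
    }

    public static void main(String[] args) {
        System.out.println(firstQuotedAtom("s(114144247,3,'king''s evil',n,1,1)."));
    }
}
```

A parser that stops at the first quote character after the opening one would return "king" for the same line, which is the behavior the report describes.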




[jira] Updated: (LUCENE-2001) wordnet parsing bug

2009-10-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2001:


Attachment: LUCENE-2001_branch.patch

Updated patch for the branch; I forgot about String.replace(String,String) 
being Java 5 only... sorry guys.

 wordnet parsing bug
 ---

 Key: LUCENE-2001
 URL: https://issues.apache.org/jira/browse/LUCENE-2001
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/*
Affects Versions: 2.9
Reporter: Robert Muir
Priority: Minor
 Attachments: LUCENE-2001.patch, LUCENE-2001_branch.patch, 
 LUCENE-2001_branch.patch


 A user reported that wordnet parses the prolog file incorrectly.
 Also need to check the wordnet parser in the memory contrib for this problem.
 If this is a false alarm, i'm not worried, because the test will be the first 
 unit test wordnet package ever had.
 {noformat}
 For example, looking up the synsets for the
 word king, we get:
 java SynLookup wnindex king
 baron
 magnate
 mogul
 power
 queen
 rex
 scrofula
 struma
 tycoon
 Here, scrofula and struma are extraneous. This happens because the line
 parser code in Syns2Index.java interprets the two consecutive single quotes
 in the entry s(114144247,3,'king''s evil',n,1,1) in the wn_s.pl file as the
 termination of the string and separates out "king". This entry concerns the
 synset of the words scrofula and struma, and thus they get inserted in the
 synset of king. *There are 1382 such entries in wn_s.pl* and more in other
 WordNet Prolog database files where two consecutive single quotes appear.
 We have resolved this by adding a statement in the line parsing portion of
 Syns2Index.java, as follows:
// parse line
line = line.substring(2);
   * line = line.replaceAll("\'\'", "`"); // added statement*
int comma = line.indexOf(',');
String num = line.substring(0, comma);  ... ... etc.
 In short we replace '' by ` (a back-quote). Then on recreating the
 index, we get:
 java SynLookup zwnindex king
 baron
 magnate
 mogul
 power
 queen
 rex
 tycoon
 {noformat}




[jira] Updated: (LUCENE-2001) wordnet parsing bug

2009-10-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-2001:


Attachment: LUCENE-2001_branch.patch

patch for the 2.9 branch (same just without java 5 constructs).

I will commit the one to trunk shortly, can someone help with this one, if we 
think it should be fixed in 2.9.1 also?

 wordnet parsing bug
 ---

 Key: LUCENE-2001
 URL: https://issues.apache.org/jira/browse/LUCENE-2001
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/*
Affects Versions: 2.9
Reporter: Robert Muir
Priority: Minor
 Attachments: LUCENE-2001.patch, LUCENE-2001_branch.patch


 A user reported that wordnet parses the prolog file incorrectly.
 Also need to check the wordnet parser in the memory contrib for this problem.
 If this is a false alarm, i'm not worried, because the test will be the first 
 unit test wordnet package ever had.
 {noformat}
 For example, looking up the synsets for the
 word king, we get:
 java SynLookup wnindex king
 baron
 magnate
 mogul
 power
 queen
 rex
 scrofula
 struma
 tycoon
 Here, scrofula and struma are extraneous. This happens because the line
 parser code in Syns2Index.java interprets the two consecutive single quotes
 in the entry s(114144247,3,'king''s evil',n,1,1) in the wn_s.pl file as the
 termination of the string and separates out "king". This entry concerns the
 synset of the words scrofula and struma, and thus they get inserted in the
 synset of king. *There are 1382 such entries in wn_s.pl* and more in other
 WordNet Prolog database files where two consecutive single quotes appear.
 We have resolved this by adding a statement in the line parsing portion of
 Syns2Index.java, as follows:
// parse line
line = line.substring(2);
   * line = line.replaceAll("\'\'", "`"); // added statement*
int comma = line.indexOf(',');
String num = line.substring(0, comma);  ... ... etc.
 In short we replace '' by ` (a back-quote). Then on recreating the
 index, we get:
 java SynLookup zwnindex king
 baron
 magnate
 mogul
 power
 queen
 rex
 tycoon
 {noformat}




[jira] Updated: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1998:
--

Attachment: LUCENE-1998_enum_BW.patch

Better BW test

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum_BW.patch, LUCENE-1998_enum_BW.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Resolved: (LUCENE-1998) Use Java 5 enums

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-1998.
---

Resolution: Fixed

Committed revision: 828156

Thanks DM Smith!

 Use Java 5 enums
 

 Key: LUCENE-1998
 URL: https://issues.apache.org/jira/browse/LUCENE-1998
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.0
Reporter: DM Smith
Assignee: Uwe Schindler
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, LUCENE-1998_enum.patch, 
 LUCENE-1998_enum_BW.patch, LUCENE-1998_enum_BW.patch


 Replace the use of o.a.l.util.Parameter with Java 5 enums, deprecating 
 Parameter.
 Replace other custom enum patterns with Java 5 enums.




[jira] Created: (LUCENE-2003) Highlighter ahs problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-21 Thread Uwe Schindler (JIRA)
Highlighter ahs problems when you use StandardAnalyzer with LUCENE_29 or 
simplier StopFilter with stopWordsPosIncr mode switched on
---

 Key: LUCENE-2003
 URL: https://issues.apache.org/jira/browse/LUCENE-2003
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
 Fix For: 2.9.1, 3.0


This is a followup on LUCENE-1987:

If you change the constant static final Version TEST_VERSION = 
Version.LUCENE_24 in HighlighterTest to LUCENE_29 or LUCENE_CURRENT, the test 
testSimpleQueryScorerPhraseHighlighting fails. Note that currently 
(before LUCENE-2002 is fixed) you must also set the QueryParser to respect 
posIncr.




[jira] Commented: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768393#action_12768393
 ] 

Uwe Schindler commented on LUCENE-2002:
---

Issue created!

 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
Assignee: Michael McCandless
 Fix For: 2.9.1


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position Increments for stop words, but QueryParser ignores them per default. 
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




[jira] Updated: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2003:
--

Summary: Highlighter has problems when you use StandardAnalyzer with 
LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on  (was: 
Highlighter ahs problems when you use StandardAnalyzer with LUCENE_29 or 
simplier StopFilter with stopWordsPosIncr mode switched on)

 Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or 
 simplier StopFilter with stopWordsPosIncr mode switched on
 ---

 Key: LUCENE-2003
 URL: https://issues.apache.org/jira/browse/LUCENE-2003
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
 Fix For: 2.9.1, 3.0


 This is a followup on LUCENE-1987:
 If you change the constant static final Version TEST_VERSION = 
 Version.LUCENE_24 in HighlighterTest to LUCENE_29 or LUCENE_CURRENT, the test 
 testSimpleQueryScorerPhraseHighlighting fails. Note that currently 
 (before LUCENE-2002 is fixed) you must also set the QueryParser to respect 
 posIncr.




[jira] Updated: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2002:
---

Attachment: LUCENE-2002-29.patch

Attached patch, for 2.9.x

I added a required Version param to QueryParser, MultiFieldQueryParser
and ComplexPhraseQueryParser (contrib), which enables position
increments when matchVersion >= LUCENE_29.

For the deprecated ctors it defaults to Version.LUCENE_24 for back
compat.

Unfortunately, JavaCC generates two public ctors for QueryParser (one taking
CharStream, another taking QueryParserTokenManager) that I don't know
how to override to take a Version param.
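
The constructor change described in this comment can be sketched roughly as follows. SketchQueryParser and SketchMatchVersion are hypothetical stand-ins rather than the classes touched by the patch, and the sketch assumes the threshold is the 2.9 constant, in line with the issue description.

```java
// Hypothetical version enum for this sketch only.
enum SketchMatchVersion {
    LUCENE_24, LUCENE_29, LUCENE_30;

    boolean onOrAfter(SketchMatchVersion other) {
        return compareTo(other) >= 0;
    }
}

// Hypothetical parser showing how a version argument can pick the default.
class SketchQueryParser {
    private final String defaultField;
    private boolean enablePositionIncrements;

    /** Old-style constructor: keeps the pre-2.9 default for back compat. */
    @Deprecated
    SketchQueryParser(String defaultField) {
        this(SketchMatchVersion.LUCENE_24, defaultField);
    }

    /** New-style constructor: the caller states which release to match. */
    SketchQueryParser(SketchMatchVersion matchVersion, String defaultField) {
        this.defaultField = defaultField;
        // Position increments are on by default from the 2.9 constant onwards.
        this.enablePositionIncrements = matchVersion.onOrAfter(SketchMatchVersion.LUCENE_29);
    }

    boolean getEnablePositionIncrements() { return enablePositionIncrements; }

    /** Explicit override, still available to callers who need it. */
    void setEnablePositionIncrements(boolean enable) { this.enablePositionIncrements = enable; }

    String getDefaultField() { return defaultField; }
}
```

The point of the design is that a caller who passes the same version constant to the analyzer and to the parser gets matching position-increment behavior without an extra setter call.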

 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
Assignee: Michael McCandless
 Fix For: 2.9.1

 Attachments: LUCENE-2002-29.patch


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position Increments for stop words, but QueryParser ignores them per default. 
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




[jira] Assigned: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-21 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-2003:
--

Assignee: Michael McCandless

 Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or 
 simplier StopFilter with stopWordsPosIncr mode switched on
 ---

 Key: LUCENE-2003
 URL: https://issues.apache.org/jira/browse/LUCENE-2003
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
Assignee: Michael McCandless
 Fix For: 2.9.1, 3.0


 This is a followup on LUCENE-1987:
 If you change the constant static final Version TEST_VERSION = 
 Version.LUCENE_24 in HighlighterTest to LUCENE_29 or LUCENE_CURRENT, the test 
 testSimpleQueryScorerPhraseHighlighting fails. Note that currently 
 (before LUCENE-2002 is fixed) you must also set the QueryParser to respect 
 posIncr.




[jira] Commented: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768459#action_12768459
 ] 

Robert Muir commented on LUCENE-2002:
-

Mike, saw a couple of these and laughed a little :)

@param matchVersion Lucene version to *patch*; this is passed through to 
QueryParser.


 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
Assignee: Michael McCandless
 Fix For: 2.9.1

 Attachments: LUCENE-2002-29.patch


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position Increments for stop words, but QueryParser ignores them per default. 
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




[jira] Commented: (LUCENE-2002) Add oal.util.Version ctor to QueryParser

2009-10-21 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768467#action_12768467
 ] 

Michael McCandless commented on LUCENE-2002:


Eek!  My fingers are doing the thinking, apparently :)  Been typing that word a 
bit too much!!  I'll fix.  Thanks.

 Add oal.util.Version ctor to QueryParser
 

 Key: LUCENE-2002
 URL: https://issues.apache.org/jira/browse/LUCENE-2002
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 2.9, 3.0
Reporter: Uwe Schindler
Assignee: Michael McCandless
 Fix For: 2.9.1

 Attachments: LUCENE-2002-29.patch


 This is a followup of LUCENE-1987:
 If somebody uses StandardAnalyzer with Version.LUCENE_CURRENT and then uses 
 QueryParser, phrase queries will not work, because the StopFilter enables 
 position Increments for stop words, but QueryParser ignores them per default. 
 The user has to explicitly enable them.
 This issue would add a ctor taking the Version constant and automatically 
 enable this setting. The same applies to the contrib queryparser. Eventually 
 also StopAnalyzer should add this version ctor.
 To be able to remove the default ctor for 3.0 (to remove a possible trap for 
 users of QueryParser), it must be deprecated and the new one also added to 
 2.9.1.




Build failed in Hudson: Lucene-trunk #986

2009-10-21 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/986/changes

Changes:

[uschindler] remove unneeded import

[uschindler] LUCENE-1998: Parameter -> Java 5 enum transition

[rmuir] LUCENE-2001: Fix parsing bug in wordnet contrib

[uschindler] Add varargs to MultiSearcher

[uschindler] Fix test failure because of wrong cast. Hard stuff :( Could be 
implemented better, the hq is used for 2 different types

[uschindler] LUCENE-1257: Remove the rest of unchecked warnings and some 
unneeded casts. I added a TODO, where I do not understand the code and not for 
sure know, whats inside the collections. This could be fixed some time later. 
But the core code now compiles without any unchecked warning.

[uschindler] LUCENE-1987: Remove rest of analysis deprecations 
(StandardAnalyzer, StopAnalyzer)

--
[...truncated 16035 lines...]
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 2.643 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestSort
[junit] Tests run: 22, Failures: 0, Errors: 0, Time elapsed: 9.88 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestSpanQueryFilter
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.66 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestStressSort
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 9.973 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestTermRangeFilter
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 6.51 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestTermRangeQuery
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.968 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestTermScorer
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.761 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestTermVectors
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 3.004 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestThreadSafe
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.951 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestTimeLimitingCollector
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 8.635 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestTopDocsCollector
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.565 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestTopScoreDocCollector
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.637 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.TestWildcard
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.751 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.function.TestCustomScoreQuery
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 146.423 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.function.TestDocValues
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.309 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.function.TestFieldScoreQuery
[junit] Tests run: 12, Failures: 0, Errors: 0, Time elapsed: 3.117 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.function.TestOrdValues
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 1.529 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.function.TestValueSource
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.542 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.payloads.TestPayloadNearQuery
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 2.331 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.payloads.TestPayloadTermQuery
[junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 7.347 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.spans.TestBasics
[junit] Tests run: 20, Failures: 0, Errors: 0, Time elapsed: 38.507 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.spans.TestFieldMaskingSpanQuery
[junit] Tests run: 11, Failures: 0, Errors: 0, Time elapsed: 7.041 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.spans.TestNearSpansOrdered
[junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 1.397 sec
[junit] 
[junit] Testsuite: org.apache.lucene.search.spans.TestPayloadSpans
[junit] Tests run: 10, Failures: 0, Errors: 0, Time elapsed: 4.653 sec
[junit] 
[junit] - Standard Output ---
[junit] 
[junit] Spans Dump --
[junit] payloads for span:2
[junit] doc:0 s:3 e:6 three:Noise:5
[junit] doc:0 s:3 e:6 one:Entity:3
[junit] 
[junit] Spans Dump --
[junit] payloads for span:3
[junit] doc:0 s:0 e:3 xx:Entity:0
[junit] doc:0 s:0 e:3 rr:Noise:1
[junit] doc:0 s:0 e:3 yy:Noise:2
[junit] 

[jira] Updated: (LUCENE-1359) FrenchAnalyzer's tokenStream method does not honour the contract of Analyzer

2009-10-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-1359:


Lucene Fields: [New, Patch Available]  (was: [New])
Fix Version/s: 3.0
 Assignee: Robert Muir

 FrenchAnalyzer's tokenStream method does not honour the contract of Analyzer
 

 Key: LUCENE-1359
 URL: https://issues.apache.org/jira/browse/LUCENE-1359
 Project: Lucene - Java
  Issue Type: Bug
  Components: Analysis
Affects Versions: 2.2
Reporter: Andrew Lynch
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1359.patch


 In {{Analyzer}} :
 {code}
 /** Creates a TokenStream which tokenizes all the text in the provided
 Reader.  Default implementation forwards to tokenStream(Reader) for 
 compatibility with older version.  Override to allow Analyzer to choose 
 strategy based on document and/or field.  Must be able to handle null
 field name for backward compatibility. */
   public abstract TokenStream tokenStream(String fieldName, Reader reader);
 {code}
 and in {{FrenchAnalyzer}}
 {code}
 public final TokenStream tokenStream(String fieldName, Reader reader) {
 if (fieldName == null) throw new IllegalArgumentException("fieldName must not be null");
 if (reader == null) throw new IllegalArgumentException("reader must not be null");
 {code}
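
The mismatch reported here can be shown with a self-contained sketch. The classes below are hypothetical stand-ins, not Lucene's Analyzer or FrenchAnalyzer: the base class documents that a null field name must be accepted, one subclass rejects it, and a small check reports which implementations honour the documented contract. How the committed patch resolves this is not shown in this thread; dropping the field-name check is only one possible approach.

```java
import java.io.Reader;
import java.io.StringReader;

// Hypothetical base class with the documented contract.
abstract class SketchAnalyzer {
    /** Must be able to handle a null field name for backward compatibility. */
    abstract String describeStream(String fieldName, Reader reader);
}

// Violates the contract by rejecting a null field name.
class StrictSketchAnalyzer extends SketchAnalyzer {
    @Override
    String describeStream(String fieldName, Reader reader) {
        if (fieldName == null) throw new IllegalArgumentException("fieldName must not be null");
        if (reader == null) throw new IllegalArgumentException("reader must not be null");
        return "stream for " + fieldName;
    }
}

// Honours the contract: only the reader is required.
class LenientSketchAnalyzer extends SketchAnalyzer {
    @Override
    String describeStream(String fieldName, Reader reader) {
        if (reader == null) throw new IllegalArgumentException("reader must not be null");
        return "stream for " + (fieldName == null ? "<any field>" : fieldName);
    }
}

class ContractCheckSketch {
    /** True if the analyzer accepts a null field name, as the base class documents. */
    static boolean honoursNullFieldName(SketchAnalyzer analyzer) {
        try {
            analyzer.describeStream(null, new StringReader("text"));
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```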




[jira] Resolved: (LUCENE-1359) FrenchAnalyzer's tokenStream method does not honour the contract of Analyzer

2009-10-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-1359.
-

Resolution: Fixed

Committed revision 828298.

this inconsistency annoyed me too.

thanks Andrew!

 FrenchAnalyzer's tokenStream method does not honour the contract of Analyzer
 

 Key: LUCENE-1359
 URL: https://issues.apache.org/jira/browse/LUCENE-1359
 Project: Lucene - Java
  Issue Type: Bug
  Components: Analysis
Affects Versions: 2.2
Reporter: Andrew Lynch
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1359.patch


 In {{Analyzer}} :
 {code}
 /** Creates a TokenStream which tokenizes all the text in the provided
 Reader.  Default implementation forwards to tokenStream(Reader) for 
 compatibility with older version.  Override to allow Analyzer to choose 
 strategy based on document and/or field.  Must be able to handle null
 field name for backward compatibility. */
   public abstract TokenStream tokenStream(String fieldName, Reader reader);
 {code}
 and in {{FrenchAnalyzer}}
 {code}
 public final TokenStream tokenStream(String fieldName, Reader reader) {
   if (fieldName == null) throw new IllegalArgumentException("fieldName must not be null");
   if (reader == null) throw new IllegalArgumentException("reader must not be null");
 {code}
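
To make the mismatch described above concrete, here is a minimal, self-contained sketch. It does not use Lucene's classes: SketchAnalyzer, StrictAnalyzer, LenientAnalyzer and the whitespace splitting are all hypothetical stand-ins invented for illustration, and the sketch is not the change committed for this issue. It only shows the difference between an implementation that rejects a null field name and one that treats the field name as an optional hint, as the Analyzer Javadoc quoted above requires.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-ins for illustration only; these are not Lucene's real classes. */
class AnalyzerContractSketch {

  /** Stand-in for the quoted contract: implementations must accept a null fieldName. */
  abstract static class SketchAnalyzer {
    abstract List<String> tokens(String fieldName, Reader reader);
  }

  /** Mirrors the reported behaviour: a null field name is rejected. */
  static final class StrictAnalyzer extends SketchAnalyzer {
    @Override
    List<String> tokens(String fieldName, Reader reader) {
      if (fieldName == null) {
        throw new IllegalArgumentException("fieldName must not be null");
      }
      return split(reader);
    }
  }

  /** Honours the contract: the field name is only a hint and may be null. */
  static final class LenientAnalyzer extends SketchAnalyzer {
    @Override
    List<String> tokens(String fieldName, Reader reader) {
      return split(reader);
    }
  }

  /** Reads the whole reader, then splits on whitespace and lower-cases each token. */
  static List<String> split(Reader reader) {
    StringBuilder text = new StringBuilder();
    char[] buffer = new char[1024];
    try {
      int read;
      while ((read = reader.read(buffer)) != -1) {
        text.append(buffer, 0, read);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    List<String> out = new ArrayList<>();
    for (String token : text.toString().trim().split("\\s+")) {
      if (!token.isEmpty()) {
        out.add(token.toLowerCase(Locale.ROOT));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Callers written against the base contract may legitimately pass null.
    System.out.println(new LenientAnalyzer().tokens(null, new StringReader("Bonjour le monde")));
    try {
      new StrictAnalyzer().tokens(null, new StringReader("Bonjour le monde"));
    } catch (IllegalArgumentException e) {
      System.out.println("strict variant rejected null field name: " + e.getMessage());
    }
  }
}
```

Code written against the base type can pass a null field name to either subclass, so only the lenient variant is safe to substitute. The sketch leaves open whether a null reader should still be rejected, since the quoted contract speaks only about the field name.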




[jira] Updated: (LUCENE-1904) move wordnet based synonym code out of contrib/memory and into contrib/wordnet (or somewhere else)

2009-10-21 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-1904:


Fix Version/s: 3.0
 Assignee: Robert Muir

I will bring this patch up to speed. It's silly for this code to be in the memory 
contrib instead of wordnet, where it belongs.

 move wordnet based synonym code out of contrib/memory and into 
 contrib/wordnet (or somewhere else)
 --

 Key: LUCENE-1904
 URL: https://issues.apache.org/jira/browse/LUCENE-1904
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/*
Reporter: Hoss Man
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.0

 Attachments: LUCENE-1904.patch, LUCENE-1904.patch


 see LUCENE-387 ... some synonym related code has been living in 
 contrib/memory for a very long time ... it should be refactored out.
