[jira] Resolved: (LUCENE-1529) back-compat tests (ant test-tag) should test JAR drop-in-ability

2009-03-20 Thread Michael Busch (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch resolved LUCENE-1529.
---

Resolution: Fixed

Committed revision 756374.

 back-compat tests (ant test-tag) should test JAR drop-in-ability
 --

 Key: LUCENE-1529
 URL: https://issues.apache.org/jira/browse/LUCENE-1529
 Project: Lucene - Java
  Issue Type: New Feature
  Components: Build
Affects Versions: 2.9
Reporter: Michael McCandless
Assignee: Michael Busch
Priority: Minor
 Fix For: 2.9

 Attachments: lucene-1529.patch


 We now test back-compat with ant test-tag, which is very useful for
 catching breaks in back compat before committing.
 However, that currently checks out src/test sources and then
 compiles them against the trunk JAR, and runs the tests.  Whereas our
 back compat policy:
   http://wiki.apache.org/lucene-java/BackwardsCompatibility
 states that no recompilation is required on upgrading to a new JAR.
 Ie you should be able to drop in the new JAR in place of your old one
 and things should work fine.
 So... we should fix ant test-tag to:
   * Do full checkout of core sources & tests from the back-compat-tag
   * Compile the JAR from the back-compat sources
   * Compile the tests against that back-compat JAR
   * Swap in the trunk JAR
   * Run the tests

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1567) New flexible query parser

2009-03-20 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683879#action_12683879
 ] 

Grant Ingersoll commented on LUCENE-1567:
-

OK, I have started the IP Clearance in incubation.  Please send in the software 
grant ASAP and make sure you CC me on it (gsing...@a.o)

 New flexible query parser
 -

 Key: LUCENE-1567
 URL: https://issues.apache.org/jira/browse/LUCENE-1567
 Project: Lucene - Java
  Issue Type: New Feature
  Components: QueryParser
 Environment: N/A
Reporter: Luis Alves
Assignee: Michael Busch

 From the New flexible query parser thread by Michael Busch
 in my team at IBM we have used a different query parser than Lucene's in
 our products for quite a while. Recently we spent a significant amount
 of time in refactoring the code and designing a very generic
 architecture, so that this query parser can be easily used for different
 products with varying query syntaxes.
 This work was originally driven by Andreas Neumann (who, however, left
 our team); most of the code was written by Luis Alves, who has been a
 bit active in Lucene in the past, and Adriano Campos, who joined our
 team at IBM half a year ago. Adriano is Apache committer and PMC member
 on the Tuscany project and getting familiar with Lucene now too.
 We think this code is much more flexible and extensible than the current
 Lucene query parser, and would therefore like to contribute it to
 Lucene. I'd like to give a very brief architecture overview here,
 Adriano and Luis can then answer more detailed questions as they're much
 more familiar with the code than I am.
 The goal was to separate the syntax and semantics of a query. E.g. 'a AND
 b', '+a +b', 'AND(a,b)' could be different syntaxes for the same query.
 We distinguish the semantics of the different query components, e.g.
 whether and how to tokenize/lemmatize/normalize the different terms or
 which Query objects to create for the terms. We wanted to be able to
 write a parser with a new syntax, while reusing the underlying
 semantics, as quickly as possible.
 In fact, Adriano is currently working on a 100% Lucene-syntax compatible
 implementation to make it easy for people who are using Lucene's query
 parser to switch.
 The query parser has three layers and its core is what we call the
 QueryNodeTree. It is a tree that initially represents the syntax of the
 original query, e.g. for 'a AND b':
    AND
   /   \
  A     B
 The three layers are:
 1. QueryParser
 2. QueryNodeProcessor
 3. QueryBuilder
 1. The upper layer is the parsing layer which simply transforms the
 query text string into a QueryNodeTree. Currently our implementations of
 this layer use javacc.
 2. The query node processors do most of the work. It is in fact a
 configurable chain of processors. Each processor can walk the tree and
 modify nodes or even the tree's structure. That makes it possible to
 e.g. do query optimization before the query is executed or to tokenize
 terms.
 3. The third layer is also a configurable chain of builders, which
 transform the QueryNodeTree into Lucene Query objects.
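 As a rough sketch (all names below are hypothetical placeholders, not the
 actual API of the contributed code), the three layers might be wired
 together like this:
{code}
import org.apache.lucene.search.Query;

// Hypothetical names, used only to illustrate the three-layer flow described above.
interface QueryNode { }                                  // node of the QueryNodeTree

interface SyntaxParser {                                 // layer 1: text -> QueryNodeTree
  QueryNode parse(String queryText, String defaultField);
}

interface QueryNodeProcessor {                           // layer 2: walk/rewrite the tree
  QueryNode process(QueryNode root);
}

interface QueryBuilder {                                 // layer 3: QueryNodeTree -> Query
  Query build(QueryNode root);
}

final class FlexibleParserSketch {
  private final SyntaxParser parser;
  private final QueryNodeProcessor[] processors;         // configurable processor chain
  private final QueryBuilder builder;

  FlexibleParserSketch(SyntaxParser parser, QueryNodeProcessor[] processors, QueryBuilder builder) {
    this.parser = parser;
    this.processors = processors;
    this.builder = builder;
  }

  Query parse(String queryText, String defaultField) {
    QueryNode tree = parser.parse(queryText, defaultField);   // 1. syntax only
    for (int i = 0; i < processors.length; i++) {
      tree = processors[i].process(tree);                     // 2. semantics: tokenize, optimize, ...
    }
    return builder.build(tree);                               // 3. build Lucene Query objects
  }
}
{code}
 In such a setup, writing a new syntax only means providing a new first-layer
 parser; the processor chain and builders are reused unchanged.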
 Furthermore the query parser uses flexible configuration objects, which
 are based on AttributeSource/Attribute. It also uses message classes that
 allow attaching resource bundles. This makes it possible to translate
 messages, which is an important feature of a query parser.
 This design allows us to develop different query syntaxes very quickly.
 Adriano wrote the Lucene-compatible syntax in a matter of hours, and the
 underlying processors and builders in a few days. We now have a 100%
 compatible Lucene query parser, which means the syntax is identical and
 all query parser test cases pass on the new one too using a wrapper.
 Recent posts show that there is demand for query syntax improvements,
 e.g. improved range query syntax or operator precedence. There are
 already different QP implementations in Lucene+contrib, however I think
 we did not keep them all up to date and in sync. This is not too
 surprising, because usually when fixes and changes are made to the main
 query parser, people don't make the corresponding changes in the contrib
 parsers. (I'm guilty here too)
 With this new architecture it will be much easier to maintain different
 query syntaxes, as the first layer involves relatively little code.
 All syntaxes would benefit from patches and improvements we make to the
 underlying layers, which will make supporting different syntaxes much
 more manageable.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: 

[jira] Created: (LUCENE-1568) Fix for NPE's in Spatial Lucene for searching bounding box only

2009-03-20 Thread patrick o'leary (JIRA)
Fix for NPE's in Spatial Lucene for searching bounding box only
---

 Key: LUCENE-1568
 URL: https://issues.apache.org/jira/browse/LUCENE-1568
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Reporter: patrick o'leary
Assignee: patrick o'leary
Priority: Minor


NPE occurs when using DistanceQueryBuilder for minimal bounding box search 
without the distance filter.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1568) Fix for NPE's in Spatial Lucene for searching bounding box only

2009-03-20 Thread patrick o'leary (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

patrick o'leary updated LUCENE-1568:


Attachment: LUCENE-1568.patch

Fixes an NPE when using DistanceQueryBuilder for just minimal bounding box 
searches
e.g.
{code}
final DistanceQueryBuilder dq = new DistanceQueryBuilder(
    latitude, longitude,
    radius,
    latField,   // name of latitude field in index
    lngField,   // name of longitude field in index
    tierPrefix, // prefix of tier fields in index
    false       /* filter by radius, false means mbb search */ );
{code}

 Fix for NPE's in Spatial Lucene for searching bounding box only
 ---

 Key: LUCENE-1568
 URL: https://issues.apache.org/jira/browse/LUCENE-1568
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Reporter: patrick o'leary
Assignee: patrick o'leary
Priority: Minor
 Attachments: LUCENE-1568.patch


 NPE occurs when using DistanceQueryBuilder for minimal bounding box search 
 without the distance filter.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-1568) Fix for NPE's in Spatial Lucene for searching bounding box only

2009-03-20 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1568:
---

Fix Version/s: 2.9

 Fix for NPE's in Spatial Lucene for searching bounding box only
 ---

 Key: LUCENE-1568
 URL: https://issues.apache.org/jira/browse/LUCENE-1568
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Reporter: patrick o'leary
Assignee: patrick o'leary
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1568.patch


 NPE occurs when using DistanceQueryBuilder for minimal bounding box search 
 without the distance filter.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-652.
---

Resolution: Fixed

 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1568) Fix for NPE's in Spatial Lucene for searching bounding box only

2009-03-20 Thread patrick o'leary (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683951#action_12683951
 ] 

patrick o'leary commented on LUCENE-1568:
-

If nobody objects I'll commit this later today

 Fix for NPE's in Spatial Lucene for searching bounding box only
 ---

 Key: LUCENE-1568
 URL: https://issues.apache.org/jira/browse/LUCENE-1568
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Reporter: patrick o'leary
Assignee: patrick o'leary
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1568.patch


 NPE occurs when using DistanceQueryBuilder for minimal bounding box search 
 without the distance filter.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-1561) Maybe rename Field.omitTf, and strengthen the javadocs

2009-03-20 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1561.


Resolution: Fixed

 Maybe rename Field.omitTf, and strengthen the javadocs
 --

 Key: LUCENE-1561
 URL: https://issues.apache.org/jira/browse/LUCENE-1561
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 2.4.1
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 2.9

 Attachments: LUCENE-1561.patch


 Spinoff from here:
   
 http://www.nabble.com/search-problem-when-indexed-using-Field.setOmitTf()-td22456141.html
 Maybe rename omitTf to something like omitTermPositions, and make it clear 
 what queries will silently fail to work as a result.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-1327) TermSpans skipTo() doesn't always move forwards

2009-03-20 Thread Michael Busch (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch resolved LUCENE-1327.
---

Resolution: Fixed

Committed revision 756669.

 TermSpans skipTo() doesn't always move forwards
 ---

 Key: LUCENE-1327
 URL: https://issues.apache.org/jira/browse/LUCENE-1327
 Project: Lucene - Java
  Issue Type: Bug
  Components: Query/Scoring, Search
Affects Versions: 1.9, 2.0.0, 2.1, 2.2, 2.3, 2.3.1, 2.3.2, 2.4
Reporter: Moti Nisenson
Assignee: Michael Busch
Priority: Minor
 Fix For: 2.9

 Attachments: lucene-1327.patch


 In TermSpans (or the anonymous Spans class returned by SpansTermQuery, 
 depending on the version), the skipTo() method is improperly implemented if 
 the target doc is less than or equal to the current doc:
   public boolean skipTo(int target) throws IOException {
     // are we already at the correct position?
     if (doc >= target) {
       return true;
     }
     ...
 This violates the correct behavior (as described in the Spans interface 
 documentation), that skipTo() should always move forwards, in other words the 
 correct implementation would be:
  if (doc >= target) {
    return next();
  }
 This bug causes particular problems if one wants to use the payloads feature 
 - this is because if one loads a payload, then performs a skipTo() to the 
 same document, then tries to load the next payload, the spans hasn't 
 changed position and it attempts to load the same payload again (which is an 
 error).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683962#action_12683962
 ] 

Uwe Schindler commented on LUCENE-652:
--

Is an index compressed with Store.COMPRESS still readable? Can I uncompress 
fields compressed using the old tools by retrieving the byte array and 
using CompressionTools? There should be some documentation about that.

Another question: compression was also used for string fields, so maybe 
CompressionTools should also supply a method to compress strings (and convert them to 
UTF-8 during that, to be backwards compatible). This would prevent people from 
calling String.getBytes() without a charset and then wondering why they cannot 
read their index again...

 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683967#action_12683967
 ] 

Michael McCandless commented on LUCENE-652:
---

Good questions!

bq. Is an index compressed with Store.COMPRESS still readable?

Yes, we have to support that until Lucene 4.0.  But
Field.Store.COMPRESS will be removed in 3.0 (ie you can read previous
compressed fields, interact w/ an index that has compressed fields in
it, etc., just not add docs with Field.Store.COMPRESS to an index as
of 3.0).

bq. Can I uncompress fields compressed using the old tools by retrieving 
the byte array and using CompressionTools?

Well... yes, but: you can't actually get the compressed byte[]
(because Lucene will decompress it for you).

bq. Compression was also used for string fields, so maybe CompressionTools should 
also supply a method to compress strings (and convert them to UTF-8 during that, to 
be backwards compatible). This would prevent people from calling 
String.getBytes() without a charset and then wondering why they cannot read 
their index again...

OK I'll add them.  I'll name them compressString and decompressString.
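
A minimal usage sketch (the "body" field name and the wrapper class below are
assumptions for illustration only, not part of the patch):

{code}
import java.util.zip.DataFormatException;

import org.apache.lucene.document.CompressionTools;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// Sketch only: store the compressed UTF-8 bytes as an opaque binary stored
// field, and decompress back to a String at read time.
class CompressedStringFieldSketch {

  static void addCompressed(Document doc, String text) {
    doc.add(new Field("body", CompressionTools.compressString(text), Field.Store.YES));
  }

  static String readCompressed(Document doc) throws DataFormatException {
    byte[] stored = doc.getBinaryValue("body");
    return CompressionTools.decompressString(stored);
  }
}
{code}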


 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683969#action_12683969
 ] 

Uwe Schindler commented on LUCENE-652:
--

bq. OK I'll add them. I'll name them compressString and decompressString.

Maybe it is better to use the new UTF-8 tools to encode/decode (instead of 
toBytes()). This would be consistent with the rest of Lucene.
But for the old deprecated Field.Store.COMPRESS, keep it how it is (backwards 
compatibility).

 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683977#action_12683977
 ] 

Uwe Schindler commented on LUCENE-652:
--

Yes, should I prepare a patch for trunk and add these methods?

 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Reopened: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reopened LUCENE-652:
---


 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch, LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683985#action_12683985
 ] 

Michael McCandless commented on LUCENE-652:
---

If we switch to UnicodeUtil we may want to allow instantiation of 
CompressionTools, since UnicodeUtil is optimized for re-use.

And if we do that we have to think about thread safety & concurrency, probably 
using CloseableThreadLocal under the hood, and then add a close() method.
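
Roughly along these lines, purely as a sketch of the idea (class name and
details are assumptions, not a committed design):

{code}
import org.apache.lucene.document.CompressionTools;
import org.apache.lucene.util.CloseableThreadLocal;
import org.apache.lucene.util.UnicodeUtil;

// Sketch: an instantiable helper that reuses a per-thread UTF8Result scratch
// buffer via CloseableThreadLocal and exposes close() to release it.
final class ReusableStringCompressor {

  private final CloseableThreadLocal utf8Scratch = new CloseableThreadLocal() {
    protected Object initialValue() {
      return new UnicodeUtil.UTF8Result();          // reused per thread
    }
  };

  byte[] compress(String value) {
    UnicodeUtil.UTF8Result utf8 = (UnicodeUtil.UTF8Result) utf8Scratch.get();
    char[] chars = value.toCharArray();
    UnicodeUtil.UTF16toUTF8(chars, 0, chars.length, utf8);   // UTF-16 -> UTF-8 into the scratch buffer
    return CompressionTools.compress(utf8.result, 0, utf8.length);
  }

  public void close() {
    utf8Scratch.close();                            // drop the per-thread buffers
  }
}
{code}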


 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch, LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683981#action_12683981
 ] 

Michael McCandless commented on LUCENE-652:
---

bq. Yes, should I prepare a patch for trunk and add these methods?

You mean to switch to UnicodeUtil?  That would be great!


 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch, LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-652:
--

Attachment: LUCENE-652.patch

Added compress/decompressString, and improved javadocs to say this compression 
format matches Field.Store.COMPRESS.

 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch, LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-652:
-

Attachment: LUCENE-652.patch

This is a first version using UnicodeUtils. The deprecated Store.COMPRESS part 
still uses String.getBytes() because of backwards compatibility (otherwise it 
would be a change in index format).
This version currently creates a new UTFxResult each time, because there is no 
state and thus no close() method. It could also be synchronized instead of using a 
ThreadLocal, but this may not be so good.
The current version has a little performance impact because of array copying.

 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch, LUCENE-652.patch, LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683993#action_12683993
 ] 

Uwe Schindler edited comment on LUCENE-652 at 3/20/09 12:09 PM:


This is a first version using UnicodeUtils. The deprecated Store.COMPRESS part 
still uses String.getBytes() because of backwards compatibility (otherwise it 
would be a change in index format).
This version currently creates a new UTFxResult each time, because there is no state 
and no close() method. It could also be synchronized instead of using a ThreadLocal, 
but this may not be so good.
The current version has a little performance impact because of array copying.

  was (Author: thetaphi):
This is a first version using UnicodeUtils. The deprecated Store.COMPRESS 
part still uses String.getBytes() because of backwards compatibility (otherwise 
it would be a change in index format).
This version currenty creates a new UTFxResult, because no state, so not close 
method. It can also be synchronized or without ThreadLocal, but this may not be 
so good.
The current version has a little performance impact because of array copying.
  
 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch, LUCENE-652.patch, LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12684067#action_12684067
 ] 

Michael McCandless commented on LUCENE-652:
---

OK thanks Uwe, it looks good.  We can leave the other changes I
suggested to future optimizations.  I'll commit soon!


 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch, LUCENE-652.patch, LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: New flexible query parser

2009-03-20 Thread Chris Hostetter

: My vote for contrib would depend on the state of the code - if it passes all
: the tests and is truly back compat, and is not crazy slower, I don't see why
: we don't move it in right away depending on confidence levels. That would
: ensure use and attention that contrib often misses. The old parser could hang
: around in deprecation.

FWIW: It's always bugged me that the existing queryParser is in the core 
anyway ... as i've mentioned before: I'd love to see us move towards 
putting more features and add-on functionality in contribs and keeping the 
core as lean as possible: just the core functionality for indexing & 
searching ... when things are split up, it's easy for people who want 
every lucene feature to include a bunch of jars; it's harder for people 
who want to run lucene in a small footprint (embedded apps?) to extract 
classes from a big jar.

so my vote would be to make it a contrib ... even if we do deprecate the 
current query parser because this can be 100% back compatible -- it just 
makes it a great opportunity to get query parsing out of the core.




-Hoss


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-652) Compressed fields should be externalized (from Fields into Document)

2009-03-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12685385#action_12685385
 ] 

Uwe Schindler commented on LUCENE-652:
--

Fine! In my opinion the little overhead of UnicodeUtils is far lower than the 
overhead of compression and the ByteArrayStreams.

{quote}
bq. Can i uncompress fields compressed using the old tools also by retrieving 
the byte array and using CompressionTools?

Well... yes, but: you can't actually get the compressed byte[]
(because Lucene will decompress it for you).
{quote}

You can: with a FieldSelector that loads the fields for merge, you get the raw 
binary values (found out from the code of FieldsReader).
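
Roughly like this (a sketch only; the field name is an assumption): a
FieldSelector returning LOAD_FOR_MERGE makes FieldsReader hand back the stored
bytes as-is, so a COMPRESS-stored field comes back still compressed.

{code}
import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.FieldSelector;
import org.apache.lucene.document.FieldSelectorResult;
import org.apache.lucene.index.IndexReader;

// Sketch: read the raw (still compressed) bytes of a COMPRESS-stored field.
class RawStoredFieldSketch {
  static byte[] rawStoredBytes(IndexReader reader, int docId, String field) throws IOException {
    FieldSelector rawSelector = new FieldSelector() {
      public FieldSelectorResult accept(String fieldName) {
        return FieldSelectorResult.LOAD_FOR_MERGE;   // skip decompression, return bytes as stored
      }
    };
    Document rawDoc = reader.document(docId, rawSelector);
    return rawDoc.getBinaryValue(field);
  }
}
{code}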

 Compressed fields should be externalized (from Fields into Document)
 --

 Key: LUCENE-652
 URL: https://issues.apache.org/jira/browse/LUCENE-652
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 1.9, 2.0.0, 2.1
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-652.patch, LUCENE-652.patch, LUCENE-652.patch


 Right now, as of 2.0 release, Lucene supports compressed stored fields.  
 However, after discussion on java-dev, the suggestion arose, from Robert 
 Engels, that it would be better if this logic were moved into the Document 
 level.  This way the indexing level just stores opaque binary fields, and 
 then Document handles compress/uncompressing as needed.
 This approach would have prevented issues like LUCENE-629 because merging of 
 segments would never need to decompress.
 See this thread for the recent discussion:
 http://www.gossamer-threads.com/lists/lucene/java-dev/38836
 When we do this we should also work on related issue LUCENE-648.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: New flexible query parser

2009-03-20 Thread Michael Busch

On 3/20/09 10:58 PM, Chris Hostetter wrote:

: My vote for contrib would depend on the state of the code - if it passes all
: the tests and is truly back compat, and is not crazy slower, I don't see why
: we don't move it in right away depending on confidence levels. That would
: ensure use and attention that contrib often misses. The old parser could hang
: around in deprecation.

FWIW: It's always bugged me that the existing queryParser is in the core
anyway ... as i've mentioned before: I'd love to see us move towards
putting more features and add-on functionality in contribs and keeping the
core as lean as possible: just the core functionality for indexing &
searching ... when things are split up, it's easy for people who want
every lucene feature to include a bunch of jars; it's harder for people
who want to run lucene in a small footprint (embedded apps?) to extract
classes from a big jar.
   

+1. I'd love to see Lucene going in such a direction.

However, I'm a little worried about contrib's reputation. I think it 
contains components with differing levels of activity, maturity and support.
So maybe instead of moving things from core into contrib to achieve the 
goal you mentioned, we could create a new folder named e.g. 
'components', which will contain stuff that we claim is as stable, 
mature and supported as the core, just packaged into separate jars. 
Those jars should then only have dependencies on the core, but not on 
each other. They would also follow the same backwards-compatibility and 
other requirements as the core. Thoughts?


-Michael


so my vote would be to make it a contrib ... even if we do deprecate the
current query parser because this can be 100% back compatible -- it just
makes it a great opportunity to get query parsing out of the core.




-Hoss


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org


   



-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Closed: (LUCENE-1568) Fix for NPE's in Spatial Lucene for searching bounding box only

2009-03-20 Thread patrick o'leary (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

patrick o'leary closed LUCENE-1568.
---


resolved

 Fix for NPE's in Spatial Lucene for searching bounding box only
 ---

 Key: LUCENE-1568
 URL: https://issues.apache.org/jira/browse/LUCENE-1568
 Project: Lucene - Java
  Issue Type: Bug
  Components: contrib/spatial
Reporter: patrick o'leary
Assignee: patrick o'leary
Priority: Minor
 Fix For: 2.9

 Attachments: LUCENE-1568.patch


 NPE occurs when using DistanceQueryBuilder for minimal bounding box search 
 without the distance filter.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: Is TopDocCollector's collect() implementation correct?

2009-03-20 Thread Chris Hostetter


(resending msg from earlier today during @apache mail outage -- i didn't 
get a copy from the list, so i'm assuming no one did)


-- Forwarded message --
Date: Fri, 20 Mar 2009 15:29:13 -0700 (PDT)

: TopDocCollector's (TDC) implementation of collect() seems a bit problematic
: to me.

This code isn't an area i'm very familiar with, but your assessment seems
correct ... it looks like when LUCENE-1356 introduced the ability to
provide a PriorityQueue to the constructor, the existing optimization when
the score was obviously too low was overlooked.

It looks like this same bug got propagated to TopScoreDocCollector
when it was introduced as well.

: Introduce in TDC a private boolean which signals whether the default PQ is
: used or not. If it's not used, don't do the 'else if' at all. If it is used,
: then the 'else if' is safe. Then code could look like:

my vote would just be to change the >= comparison to a hq.lessThan call
... but i can understand how your proposal might be more efficient -- I'll
let the performance experts fight it out ... but i definitely think you
should file a bug.
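
For reference, a rough sketch of the boolean-guard idea (class and field names
are assumptions, not the eventual fix):

{code}
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.util.PriorityQueue;

// Sketch: only apply the score short-circuit when the default score-ordered
// HitQueue is used; a user-supplied PriorityQueue may order entries differently.
class GuardedTopDocCollectorSketch extends HitCollector {
  private final PriorityQueue hq;
  private final boolean defaultPQ;    // true only for the built-in HitQueue
  private ScoreDoc reusableSD;
  int totalHits;

  GuardedTopDocCollectorSketch(PriorityQueue hq, boolean defaultPQ) {
    this.hq = hq;
    this.defaultPQ = defaultPQ;
  }

  public void collect(int doc, float score) {
    if (score > 0.0f) {
      totalHits++;
      if (reusableSD == null) {
        reusableSD = new ScoreDoc(doc, score);
      } else if (defaultPQ && score < reusableSD.score) {
        return;   // safe only when the queue orders purely by score
      } else {
        reusableSD.doc = doc;
        reusableSD.score = score;
      }
      reusableSD = (ScoreDoc) hq.insertWithOverflow(reusableSD);
    }
  }
}
{code}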



-Hoss


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: Using Highlighter for highlighting Phrase query

2009-03-20 Thread Chris Hostetter

(resending msg from earlier today during @apache mail outage -- i didn't 
get a copy from the list, so i'm assuming no one did)

: Date: Fri, 20 Mar 2009 15:30:27 -0700 (PDT)
: 
: http://people.apache.org/~hossman/#java-dev
: Please Use java-u...@lucene Not java-...@lucene
: 
: Your question is better suited for the java-u...@lucene mailing list ...
: not the java-...@lucene list.  java-dev is for discussing development of
: the internals of the Lucene Java library ... it is *not* the appropriate
: place to ask questions about how to use the Lucene Java library when
: developing your own applications.  Please resend your message to
: the java-user mailing list, where you are likely to get more/better
: responses since that list also has a larger number of subscribers.
: 
: 
: 
: : Date: Tue, 17 Mar 2009 07:38:08 -0700 (PDT)
: : From: mitu2009 musicfrea...@gmail.com
: : Reply-To: java-dev@lucene.apache.org
: : To: java-dev@lucene.apache.org
: : Subject: Using Highlighter for highlighting Phrase query
: : 
: : 
: : Am using this version of the Lucene Highlighter.Net API. I want to get a phrase
: : highlighted only when ALL of its words are present in the search
: : results. But am not able to do so... for example, if my input search string
: : is "Leading telecom company", then the API only highlights "telecom" in the
: : results if the result does not contain the words "leading" and "company"...
: : 
: : Here is the code i'm using:
: : 
: : SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
: : 
: : var appData =
: : (string)AppDomain.CurrentDomain.GetData("DataDirectory");
: : var folderpath = System.IO.Path.Combine(appData, "MyFolder");
: : 
: : indexReader = IndexReader.Open(folderpath);
: : 
: : Highlighter highlighter = new Highlighter(htmlFormatter, new
: : QueryScorer(finalQuery.Rewrite(indexReader)));
: : 
: : 
: : highlighter.SetTextFragmenter(new SimpleFragmenter(800));
: : 
: : int maxNumFragmentsRequired = 5;
: : 
: : string highlightedText = string.Empty;
: : 
: : TokenStream tokenStream = this._analyzer.TokenStream(fieldName,
: : new System.IO.StringReader(fieldText));
: : 
: : highlightedText = highlighter.GetBestFragments(tokenStream,
: : fieldText, maxNumFragmentsRequired, ...);
: : 
: : return highlightedText;
: : 
: : -- 
: : View this message in context: 
http://www.nabble.com/Using-Highlighter-for-highlighting-Phrase-query-tp22560334p22560334.html
: : Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
: : 
: : 
: : -
: : To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
: : For additional commands, e-mail: java-dev-h...@lucene.apache.org
: : 
: 
: 
: 
: -Hoss
: 
: 



-Hoss


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: Using MultiFieldQueryParser

2009-03-20 Thread Chris Hostetter

(resending msg from earlier today during @apache mail outage -- i didn't 
get a copy from the list, so i'm assuming no one did)

: Date: Fri, 20 Mar 2009 15:30:59 -0700 (PDT)
: 
: http://people.apache.org/~hossman/#java-dev
: Please Use java-u...@lucene Not java-...@lucene
: 
: Your question is better suited for the java-u...@lucene mailing list ...
: not the java-...@lucene list.  java-dev is for discussing development of
: the internals of the Lucene Java library ... it is *not* the appropriate
: place to ask questions about how to use the Lucene Java library when
: developing your own applications.  Please resend your message to
: the java-user mailing list, where you are likely to get more/better
: responses since that list also has a larger number of subscribers.
: 
: 
: : Date: Tue, 17 Mar 2009 08:47:05 -0700 (PDT)
: : From: mitu2009 musicfrea...@gmail.com
: : Reply-To: java-dev@lucene.apache.org
: : To: java-dev@lucene.apache.org
: : Subject: Using MultiFieldQueryParser
: : 
: : 
: : Hi,
: : 
: : Am working on a book search API using Lucene. A user can search for a book
: : whose title or description field contains "C.F.A.".
: : Am using Lucene's MultiFieldQueryParser. But after parsing, it's removing the
: : dots in the string. 
: : 
: : What am i missing here?
: : 
: : Thanks.
: : 
: : -- 
: : View this message in context: 
http://www.nabble.com/Using-MultiFieldQueryParser-tp22562134p22562134.html
: : Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
: : 
: : 
: : -
: : To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
: : For additional commands, e-mail: java-dev-h...@lucene.apache.org
: : 
: 
: 
: 
: -Hoss
: 
: 



-Hoss


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: move TrieRange* to core?

2009-03-20 Thread Chris Hostetter

(resending msg from earlier today during @apache mail outage -- i didn't 
get a copy from the list, so i'm assuming no one did)

: Date: Fri, 20 Mar 2009 16:51:05 -0700 (PDT)
: 
: : I think we should move TrieRange* into core before 2.9?
: 
: -0
: 
: I think we should try to move more things *out* of the core in 3.0 (as 
: : i've mentioned in other threads) ... but i certainly understand the 
: arguments for going the other direction.
: 
: : It's received a lot of attention, from both developers (Uwe & Yonik did
: : lots of iterations, and Solr is folding it in) and user interest.
: 
: it's a chicken/egg problem that we move things into the core because they 
: : are very useful and we want to give them more visibility, but if we had 
: : fewer things in the core and more things in contribs (query parser, spans, 
: : standard analyzer, non-primitive Query impls, etc...) then contribs as a 
: whole would be more visible.  ... I'm getting a sense of deja-vu, ah 
: yes, here it is ...
: 
: 
http://www.nabble.com/Moving-SweetSpotSimilarity-out-of-contrib-to19267437.html#a19320894
: 
: 
: -Hoss
: 
: 



-Hoss


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: Is TopDocCollector's collect() implementation correct?

2009-03-20 Thread Shai Erera
Thanks Chris/Hoss (not sure who sent the original reply).

I don't like calling pq.lessThan, as pq.insert and pq.insertWithOverflow
call it internally anyway, and it would add a method call (something the
current implementation tries to avoid), so I prefer the code I proposed below.

BTW, I introduced 1356, so I take full responsibility for this oversight.
The main reason for 1356 was to allow creating extensions of TopDocCollector
so they can be of the same type and share the topDocs() and totalHits()
implementations.

I can file an issue. Any other comments?

Shai

On Sat, Mar 21, 2009 at 3:48 AM, Chris Hostetter
hossman_luc...@fucit.orgwrote:


 (resending msg from earlier today during @apache mail outage -- i didn't
 get a copy from the list, so i'm assuming no one did)

 -- Forwarded message --
 Date: Fri, 20 Mar 2009 15:29:13 -0700 (PDT)

 : TopDocCollector's (TDC) implementation of collect() seems a bit
 problematic
 : to me.

 This code isn't an area i'm very familiar with, but your assessment seems
 correct ... it looks like when LUCENE-1356 introduced the ability to
 provide a PriorityQueue to the constructor, the existing optimization when
 the score was obviously too low was overlooked.

 It looks like this same bug got propagated to TopScoreDocCollector
 when it was introduced as well.

 : Introduce in TDC a private boolean which signals whether the default PQ
 is
 : used or not. If it's not used, don't do the 'else if' at all. If it is
 used,
 : then the 'else if' is safe. Then code could look like:

 my vote would just be to change the >= comparison to a hq.lessThan call
 ... but i can understand how your proposal might be more efficient -- I'll
 let the performance experts fight it out ... but i definitely think you
 should file a bug.



 -Hoss


 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org