Re: [Lucene.Net] style cop, fx cop rules

2011-08-01 Thread Christopher Currens
I don't think there's any harm in putting StyleCop in the project at this
stage, but of course there's no harm in leaving it out either.  It would be
handy for people who already have VS2008/2010, as we could keep the same
style format across the Lucene project as a whole.

IMO, the Naming, Maintainability, and Layout rules are the most important.
I use R#, so many of its defaults are the ones I'm partial to.  For example,
I like my private fields to start with underscores; my private properties,
method names, and public fields to be in Pascal case; and my local variables
and method parameters to be in camel case.  I dislike Hungarian notation.
I like only one class per file and one namespace per file, those being in
the Maintainability rules.

I would like to hear other people's opinions on this, or maybe one of us
should just make a rule set and have everyone else look over it.

Thanks,
Christopher

On Wed, Jul 27, 2011 at 7:11 PM, Michael Herndon 
mhern...@wickedsoftware.net wrote:

 Does anyone have any preferred rules that they want ignored or want
 required
 for the project for either Fx Cop or Style Cop?

 It might be prudent to wait on putting StyleCop in the project; it
 currently doesn't have a command-line client, and if installed it would
 generate warnings each time someone builds locally.

 - Michael.



[jira] [Created] (LUCENE-3352) ParametricRangeQueryNodeProcessor support for time zones

2011-08-01 Thread Trejkaz (JIRA)
ParametricRangeQueryNodeProcessor support for time zones


 Key: LUCENE-3352
 URL: https://issues.apache.org/jira/browse/LUCENE-3352
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/queryparser
Reporter: Trejkaz


It would be nice if there were a config attribute for setting a time zone for 
dates in the query.  At the moment I am using my own query node processor to 
implement this, but I stumbled upon ParametricRangeQueryNodeProcessor and it is 
very close to being usable as-is.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3353) ParametricRangeQueryNodeProcessor uses incorrect logic at the lower bound

2011-08-01 Thread Trejkaz (JIRA)
ParametricRangeQueryNodeProcessor uses incorrect logic at the lower bound
-

 Key: LUCENE-3353
 URL: https://issues.apache.org/jira/browse/LUCENE-3353
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 3.3
Reporter: Trejkaz


ParametricRangeQueryNodeProcessor currently works as follows:

# If the operator was LE or GE, set inclusive = true.
# Set up a calendar
# If inclusive, set the second time to 23:59:59:999
# Convert that to a string using the DateResolution.

The problem is, this breaks for *exclusive* queries.  For instance, if the user 
types in {20100110 TO 20100120} they would expect to get the 10th to the 20th 
exclusive, i.e. the 11th to the 19th.  But in reality, the 10th will be 
*inclusive*.

To get an actually-exclusive range for the lower bound, the time should be set 
to 23:59:59:999, much the same as what is done for the inclusive upper bound.

I suspect the original query parser has the same issue, though possibly in 
different words.
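The adjustment proposed above can be sketched with a plain java.util.Calendar. This is only an illustration of the intended behavior under the report's assumptions, not Lucene's actual ParametricRangeQueryNodeProcessor code; the class and method names are hypothetical.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

public class ExclusiveLowerBound {
    // Push an exclusive lower bound to the last millisecond of its day,
    // so the effective range starts on the following day, mirroring what
    // is already done for the inclusive upper bound.
    static Calendar adjustLowerBound(Calendar cal, boolean inclusive) {
        if (!inclusive) {
            cal.set(Calendar.HOUR_OF_DAY, 23);
            cal.set(Calendar.MINUTE, 59);
            cal.set(Calendar.SECOND, 59);
            cal.set(Calendar.MILLISECOND, 999);
        }
        return cal;
    }

    public static void main(String[] args) {
        // {20100110 TO 20100120}: the exclusive lower bound 20100110
        // gets pushed to the end of Jan 10, so matching starts on the 11th.
        Calendar lower = new GregorianCalendar(2010, Calendar.JANUARY, 10);
        adjustLowerBound(lower, false);
        System.out.println(lower.get(Calendar.HOUR_OF_DAY)); // 23
    }
}
```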





[jira] [Commented] (SOLR-2337) Solr needs hits= added to the log when using grouping

2011-08-01 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073447#comment-13073447
 ] 

Martijn van Groningen commented on SOLR-2337:
-

Yes, the trunk and branch_3x do log the request when using grouping.
The number of matches of the first command is actually logged now.

 Solr needs hits= added to the log when using grouping 
 --

 Key: SOLR-2337
 URL: https://issues.apache.org/jira/browse/SOLR-2337
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.0
Reporter: Bill Bell
 Fix For: 4.0

 Attachments: SOLR.2337.patch


 We monitor the Solr logs to try to review queries that have hits=0. This 
 enables us to improve relevancy since they are easy to find and review.
 When using group=true, hits= does not show up:
 {code}
 2011-01-27 01:10:16,117 INFO  core.SolrCore  - [collection1] webapp= 
 path=/select params={group=true&group.field=gender&group.field=id&q=*:*} 
 status=0 QTime=15
 {code}
 The code in QueryComponent.java needs to add the matches() after calling 
 grouping.execute() and add up the total.
 It does return hits= in the log for mainResult, but not for standard grouping.
 This should be easy to add since matches are defined...
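The summing step the report asks for can be sketched as follows. Command and getMatches() are illustrative stand-ins for the per-grouping-command match count, not Solr's actual QueryComponent API:

```java
import java.util.Arrays;
import java.util.List;

class GroupingHitsLog {
    // Hypothetical stand-in for a grouping command that knows how many
    // documents it matched after grouping.execute() has run.
    interface Command {
        int getMatches();
    }

    // Sum the matches across all commands; the total would be what gets
    // logged as hits= for grouped requests.
    static int totalHits(List<Command> commands) {
        int total = 0;
        for (Command c : commands) {
            total += c.getMatches();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Command> cmds = Arrays.asList(() -> 3, () -> 7);
        System.out.println("hits=" + totalHits(cmds)); // hits=10
    }
}
```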




[JENKINS] Solr-trunk - Build # 1586 - Failure

2011-08-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Solr-trunk/1586/

1 tests failed.
REGRESSION:  org.apache.solr.core.TestJmxIntegration.testJmxOnCoreReload

Error Message:
Number of registered MBeans is not the same as info registry size expected:<51> but was:<46>

Stack Trace:
junit.framework.AssertionFailedError: Number of registered MBeans is not the same as info registry size expected:<51> but was:<46>
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1522)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1427)
at 
org.apache.solr.core.TestJmxIntegration.testJmxOnCoreReload(TestJmxIntegration.java:158)




Build Log (for compile errors):
[...truncated 9329 lines...]






[jira] [Commented] (LUCENE-1879) Parallel incremental indexing

2011-08-01 Thread Eks Dev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073462#comment-13073462
 ] 

Eks Dev commented on LUCENE-1879:
-

The user mentioned above in the comment was me, I guess. Commenting here just to 
add an interesting use case that would be perfectly solved by this issue.

Imagine a Solr master/slave setup where the full document contains CONTENT and ID 
fields, e.g. a 200Mio+ collection. On the master, we need the ID field indexed in 
order to process delete/update commands. On the slaves, we do not need lookup on 
ID and would like to keep our terms dictionary small, without exploding it with 
200Mio+ unique ID terms (ouch, this is a lot compared to 5Mio unique terms 
in CONTENT, with or without pulsing).

With this issue, this could be achieved natively by modifying the Solr 
UpdateHandler not to transfer the ID index to the slaves at all.

There are other ways to fix it, but this would be the best. (I am currently 
investigating an option to transfer the full index on update, but to filter out 
the terms dictionary at the IndexReader level: it remains on disk, but this part 
never gets accessed on the slaves. I do not know yet if this is possible at all 
in general, e.g. once an FST-based terms dictionary is already built; a 
prefix-compressed terms dictionary would be doable.)

 Parallel incremental indexing
 -

 Key: LUCENE-1879
 URL: https://issues.apache.org/jira/browse/LUCENE-1879
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/index
Reporter: Michael Busch
Assignee: Michael Busch
 Fix For: 4.0

 Attachments: parallel_incremental_indexing.tar


 A new feature that allows building parallel indexes and keeping them in sync 
 on a docID level, independent of the choice of the MergePolicy/MergeScheduler.
 Find details on the wiki page for this feature:
 http://wiki.apache.org/lucene-java/ParallelIncrementalIndexing 
 Discussion on java-dev:
 http://markmail.org/thread/ql3oxzkob7aqf3jd




[jira] [Commented] (LUCENE-3353) ParametricRangeQueryNodeProcessor uses incorrect logic at the lower bound

2011-08-01 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073474#comment-13073474
 ] 

Uwe Schindler commented on LUCENE-3353:
---

This seems to be related to LUCENE-1768 and LUCENE-2979, as we need to change 
the config API, so the timezone would only be another param. Maybe that can be 
done with the work on those two issues.

 ParametricRangeQueryNodeProcessor uses incorrect logic at the lower bound
 -

 Key: LUCENE-3353
 URL: https://issues.apache.org/jira/browse/LUCENE-3353
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 3.3
Reporter: Trejkaz

 ParametricRangeQueryNodeProcessor currently works as follows:
 # If the operator was LE or GE, set inclusive = true.
 # Set up a calendar
 # If inclusive, set the second time to 23:59:59:999
 # Convert that to a string using the DateResolution.
 The problem is, this breaks for *exclusive* queries.  For instance, if the 
 user types in {20100110 TO 20100120} they would expect to get the 10th to the 
 20th exclusive, i.e. the 11th to the 19th.  But in reality, the 10th will be 
 *inclusive*.
 To get an actually-exclusive range for the lower bound, the time should be 
 set to 23:59:59:999, much the same as what is done for the inclusive upper 
 bound.
 I suspect the original query parser has the same issue, though possibly in 
 different words.




[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-08-01 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073475#comment-13073475
 ] 

Uwe Schindler commented on LUCENE-2979:
---

There was an issue opened today: LUCENE-3353

Maybe that is related to the config changes here, perhaps already fixed?

 Simplify configuration API of contrib Query Parser
 --

 Key: LUCENE-2979
 URL: https://issues.apache.org/jira/browse/LUCENE-2979
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 2.9, 3.0
Reporter: Adriano Crestani
Assignee: Adriano Crestani
  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, 
 LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_ramalho_3.patch, 
 LUCENE-2979_phillipe_reamalho.patch


 The current configuration API is very complicated and inherits the concept 
 used by the Attribute API to store token information in token streams. However, 
 the requirements for the two (QP config and token streams) are not the same, so 
 they shouldn't be using the same thing.
 I propose to simplify the QP config and make it less scary for people intending 
 to use the contrib QP. The task is not difficult; it will just require a lot of 
 code change and figuring out the best way to do it. That's why it's a good 
 candidate for a GSoC project.
 I would like to hear good proposals about how to make the API more friendly 
 and less scary :)




[jira] [Updated] (LUCENE-3353) ParametricRangeQueryNodeProcessor uses incorrect logic at the lower bound

2011-08-01 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3353:
--

Comment: was deleted

(was: This seems to be related to LUCENE-1768 and LUCENE-2979, as we need to 
change the config API, so the timezone would only be another param. Maybe that 
can be done with the work on those two issues.)

 ParametricRangeQueryNodeProcessor uses incorrect logic at the lower bound
 -

 Key: LUCENE-3353
 URL: https://issues.apache.org/jira/browse/LUCENE-3353
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 3.3
Reporter: Trejkaz

 ParametricRangeQueryNodeProcessor currently works as follows:
 # If the operator was LE or GE, set inclusive = true.
 # Set up a calendar
 # If inclusive, set the second time to 23:59:59:999
 # Convert that to a string using the DateResolution.
 The problem is, this breaks for *exclusive* queries.  For instance, if the 
 user types in {20100110 TO 20100120} they would expect to get the 10th to the 
 20th exclusive, i.e. the 11th to the 19th.  But in reality, the 10th will be 
 *inclusive*.
 To get an actually-exclusive range for the lower bound, the time should be 
 set to 23:59:59:999, much the same as what is done for the inclusive upper 
 bound.
 I suspect the original query parser has the same issue, though possibly in 
 different words.




[jira] [Issue Comment Edited] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-08-01 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073475#comment-13073475
 ] 

Uwe Schindler edited comment on LUCENE-2979 at 8/1/11 9:40 AM:
---

There was an issue opened today: LUCENE-3352

Maybe that is related to the config changes here, perhaps already fixed?

  was (Author: thetaphi):
There was an issue opened today: LUCENE-3353

Maybe that is related to the config changes here, perhaps already fixed?
  
 Simplify configuration API of contrib Query Parser
 --

 Key: LUCENE-2979
 URL: https://issues.apache.org/jira/browse/LUCENE-2979
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 2.9, 3.0
Reporter: Adriano Crestani
Assignee: Adriano Crestani
  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, 
 LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_ramalho_3.patch, 
 LUCENE-2979_phillipe_reamalho.patch


 The current configuration API is very complicated and inherits the concept 
 used by the Attribute API to store token information in token streams. However, 
 the requirements for the two (QP config and token streams) are not the same, so 
 they shouldn't be using the same thing.
 I propose to simplify the QP config and make it less scary for people intending 
 to use the contrib QP. The task is not difficult; it will just require a lot of 
 code change and figuring out the best way to do it. That's why it's a good 
 candidate for a GSoC project.
 I would like to hear good proposals about how to make the API more friendly 
 and less scary :)




[jira] [Commented] (LUCENE-3352) ParametricRangeQueryNodeProcessor support for time zones

2011-08-01 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073476#comment-13073476
 ] 

Uwe Schindler commented on LUCENE-3352:
---

This seems to be related to LUCENE-1768 and LUCENE-2979, as we need to change 
the config API, so the timezone would only be another param. Maybe that can be 
done with the work on those two issues.

 ParametricRangeQueryNodeProcessor support for time zones
 

 Key: LUCENE-3352
 URL: https://issues.apache.org/jira/browse/LUCENE-3352
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/queryparser
Reporter: Trejkaz

 It would be nice if there were a config attribute for setting a time zone for 
 dates in the query.  At the moment I am using my own query node processor to 
 implement this, but I stumbled upon ParametricRangeQueryNodeProcessor and it 
 is very close to being usable as-is.




[jira] [Closed] (SOLR-2337) Solr needs hits= added to the log when using grouping

2011-08-01 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen closed SOLR-2337.
---

Resolution: Fixed

 Solr needs hits= added to the log when using grouping 
 --

 Key: SOLR-2337
 URL: https://issues.apache.org/jira/browse/SOLR-2337
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 4.0
Reporter: Bill Bell
 Fix For: 4.0

 Attachments: SOLR.2337.patch


 We monitor the Solr logs to try to review queries that have hits=0. This 
 enables us to improve relevancy since they are easy to find and review.
 When using group=true, hits= does not show up:
 {code}
 2011-01-27 01:10:16,117 INFO  core.SolrCore  - [collection1] webapp= 
 path=/select params={group=true&group.field=gender&group.field=id&q=*:*} 
 status=0 QTime=15
 {code}
 The code in QueryComponent.java needs to add the matches() after calling 
 grouping.execute() and add up the total.
 It does return hits= in the log for mainResult, but not for standard grouping.
 This should be easy to add since matches are defined...




[jira] [Closed] (SOLR-1682) Implement CollapseComponent

2011-08-01 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen closed SOLR-1682.
---

Resolution: Fixed

 Implement CollapseComponent
 ---

 Key: SOLR-1682
 URL: https://issues.apache.org/jira/browse/SOLR-1682
 Project: Solr
  Issue Type: Sub-task
  Components: search
Reporter: Martijn van Groningen
Assignee: Shalin Shekhar Mangar
 Fix For: 3.4, 4.0

 Attachments: SOLR-1682.patch, SOLR-1682.patch, SOLR-1682.patch, 
 SOLR-1682.patch, SOLR-1682.patch, SOLR-1682.patch, SOLR-1682.patch, 
 SOLR-1682_prototype.patch, SOLR-1682_prototype.patch, 
 SOLR-1682_prototype.patch, SOLR-236.patch, field-collapsing.patch


 Child issue of SOLR-236. This issue is dedicated to field collapsing in 
 general and all its code (CollapseComponent, DocumentCollapsers and 
 CollapseCollectors). The main goal is the finalize the request parameters and 
 response format.




[jira] [Commented] (SOLR-2686) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073515#comment-13073515
 ] 

Michael McCandless commented on SOLR-2686:
--

+1, though really this should be a Lucene issue (FieldCache is in Lucene).

We actually have a start at this: the core part of UnInvertedField was factored 
into Lucene as oal.index.DocTermOrds.  I think all we need to do is make this 
accessible through FieldCache.
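The accessor shape being suggested might look like the sketch below: exposing the uninverted per-field structure through a cache keyed weakly on the reader, built once per (reader, field) pair. The names and types here are hypothetical stand-ins, not Lucene's actual FieldCache or DocTermOrds code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

class DocTermOrdsCacheSketch {
    // Weak keys let cache entries die with their reader, mirroring how
    // FieldCache entries were held at the time. Object stands in for
    // the reader and for DocTermOrds.
    static final Map<Object, Map<String, Object>> cache = new WeakHashMap<>();

    static Object getDocTermOrds(Object reader, String field) {
        Map<String, Object> perReader =
            cache.computeIfAbsent(reader, r -> new HashMap<>());
        // Build the uninverted structure once per (reader, field).
        return perReader.computeIfAbsent(field, f -> uninvert(reader, f));
    }

    static Object uninvert(Object reader, String field) {
        return new Object(); // stand-in for new DocTermOrds(reader, field)
    }

    public static void main(String[] args) {
        Object reader = new Object();
        // Second lookup returns the cached instance.
        System.out.println(getDocTermOrds(reader, "id") == getDocTermOrds(reader, "id")); // true
    }
}
```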

 Extend FieldCache architecture to multiple Values
 -

 Key: SOLR-2686
 URL: https://issues.apache.org/jira/browse/SOLR-2686
 Project: Solr
  Issue Type: Bug
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation, so 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can easily do things like 
 geodist() on a multiValued field.
 Thoughts?





[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073518#comment-13073518
 ] 

Martijn van Groningen commented on LUCENE-3354:
---

+1. If DocTermOrds is available in FieldCache, then Grouping (Term based impl) 
can also use DocTermOrds.

 Extend FieldCache architecture to multiple Values
 -

 Key: LUCENE-3354
 URL: https://issues.apache.org/jira/browse/LUCENE-3354
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation, so 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can easily do things like 
 geodist() on a multiValued field.
 Thoughts?




[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073522#comment-13073522
 ] 

Ryan McKinley commented on LUCENE-3354:
---

What are thoughts on using DocValues rather than FieldCache?

If we do choose to extend the FieldCache architecture, it would be so much 
cleaner if it were a simple Map<K,V> directly on the Reader rather than a 
static thing holding a WeakHashMap<Reader,Cache>.
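The alternative being contrasted here can be sketched as a reader that owns a plain per-field cache map, so the cache's lifecycle is bounded by the reader's with no weak references or static state involved. ReaderWithCache is a hypothetical stand-in, not Lucene's IndexReader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ReaderWithCache {
    // Each reader owns its own per-field cache, instead of a global
    // WeakHashMap keyed on reader identity.
    private final Map<String, Object> fieldCache = new HashMap<>();

    // Build the entry for a field once; later lookups hit the map.
    Object cached(String field, Function<String, Object> loader) {
        return fieldCache.computeIfAbsent(field, loader);
    }

    public static void main(String[] args) {
        ReaderWithCache reader = new ReaderWithCache();
        Object a = reader.cached("gender", f -> new Object());
        System.out.println(a == reader.cached("gender", f -> new Object())); // true
    }
}
```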


 Extend FieldCache architecture to multiple Values
 -

 Key: LUCENE-3354
 URL: https://issues.apache.org/jira/browse/LUCENE-3354
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation, so 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can easily do things like 
 geodist() on a multiValued field.
 Thoughts?




[jira] [Commented] (SOLR-2676) Add a welcome-file-list with a welcome-file index.jsp to web.xml in Solr servlet war.

2011-08-01 Thread Jay R. Jaeger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073527#comment-13073527
 ] 

Jay R. Jaeger commented on SOLR-2676:
-

Wow.  That was fast.  Thanks.



 Add a welcome-file-list with a welcome-file index.jsp to web.xml in Solr 
 servlet war.
 -

 Key: SOLR-2676
 URL: https://issues.apache.org/jira/browse/SOLR-2676
 Project: Solr
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.1
Reporter: Jay R. Jaeger
Assignee: Hoss Man
Priority: Trivial
 Fix For: 3.4, 4.0


 Some web application servers (e.g., IBM WebSphere application server) do not 
 have a default welcome file list.  The Solr servlet and related JSPs 
 currently depend upon a default welcome file list.  Adding a 
 welcome-file-list entry to web.xml will rectify this problem in a compatible 
 way:
 <welcome-file-list>
   <welcome-file>index.jsp</welcome-file>
 </welcome-file-list>




Re: svn commit: r1152673 - /lucene/dev/branches/branch_3x/lucene/src/test/org/apache/lucene/index/TestPayloads.java

2011-08-01 Thread Michael McCandless
OK, I found the problem: 3.x can't handle the U+FFFF character (we
replace it on indexing), while trunk can.  So I think we just have to
fix randomFixedByteLengthUnicodeString to never use that char.  I'll
commit...
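The fix described amounts to keeping U+FFFF out of randomly generated test strings. A minimal sketch of that guard, with an arbitrary substitute character chosen purely for illustration:

```java
class UnicodeTestStrings {
    // Replace U+FFFF, which Lucene 3.x substitutes at indexing time,
    // so round-trip comparisons in tests stay exact. The replacement
    // character 'a' is an arbitrary illustrative choice.
    static String replaceFFFF(String s) {
        return s.replace('\uFFFF', 'a');
    }

    public static void main(String[] args) {
        System.out.println(replaceFFFF("ab\uFFFFcd")); // abacd
    }
}
```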

Mike McCandless

http://blog.mikemccandless.com

On Sun, Jul 31, 2011 at 10:33 PM, Robert Muir rcm...@gmail.com wrote:
 On Sun, Jul 31, 2011 at 10:24 PM, Chris Hostetter
 hossman_luc...@fucit.org wrote:

 : fix test to not create invalid unicode

 I'm confused ... when/why does randomFixedByteLengthUnicodeString not
 return valid unicode?


 I'm confused too, but all of the issues are with replacement chars for
 invalid unicode:

 Since the intent of this test is to test thread safety, not to test
 unicode enc/dec back and forth, I switched it to ascii until the test
 grows up (e.g. trunk, which now uses full unicode range correctly,
 maybe I backported this wrong before)

 --
 lucidimagination.com







[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073530#comment-13073530
 ] 

Robert Muir commented on LUCENE-3354:
-

+1, die insanity, die.

 Extend FieldCache architecture to multiple Values
 -

 Key: LUCENE-3354
 URL: https://issues.apache.org/jira/browse/LUCENE-3354
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation, so 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can easily do things like 
 geodist() on a multiValued field.
 Thoughts?




[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073533#comment-13073533
 ] 

Michael McCandless commented on LUCENE-3354:


+1 to moving FC to atomic readers only, and let SlowMultiReaderWrapper absorb 
the insanity.

 Extend FieldCache architecture to multiple Values
 -

 Key: LUCENE-3354
 URL: https://issues.apache.org/jira/browse/LUCENE-3354
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation, so 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can easily do things like 
 geodist() on a multiValued field.
 Thoughts?




[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073532#comment-13073532
 ] 

Martijn van Groningen commented on LUCENE-3354:
---

bq. What are thoughts on using DocValues rather than FieldCache?
Maybe both should be available. Not all fields have indexed docvalues.

bq. We should start with this in 4.0! For backwards compatibility we could 
still have the FieldCache class, but just delegating.
Changing the architecture seems like a big task to me. Maybe that should be 
done in a different issue. This issue will then depend on it.

 Extend FieldCache architecture to multiple Values
 -

 Key: LUCENE-3354
 URL: https://issues.apache.org/jira/browse/LUCENE-3354
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation, so 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can easily do things like 
 geodist() on a multiValued field.
 Thoughts?




order by function

2011-08-01 Thread Gastone Penzo
Hi,
I need to sort by a function, like:

sort=sum(field1,field2,field3)+desc

but Solr gives me this error:
Missing sort order.
Why does this happen? I read that it is possible to sort by function from
version 1.3
(http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function).

I use version 1.4.

Does nobody have an idea?

Thanks,


Gastone



[jira] [Commented] (LUCENE-3348) IndexWriter applies wrong deletes during concurrent flush-all

2011-08-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073535#comment-13073535
 ] 

Michael McCandless commented on LUCENE-3348:


Thanks Simon; I'll make both of those fixes.

Unfortunately there is still at least one more thread safety issue that I'm 
trying to track down... beasting uncovered a good seed.

 IndexWriter applies wrong deletes during concurrent flush-all
 -

 Key: LUCENE-3348
 URL: https://issues.apache.org/jira/browse/LUCENE-3348
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3348.patch, LUCENE-3348.patch


 Yonik uncovered this with the TestRealTimeGet test: if a flush-all is
 underway, it is possible for an incoming update to pick a DWPT that is
 stale, ie, not yet pulled/marked for flushing, yet the DW has cutover
 to a new deletes queue.  If this happens, and the deleted term was
 also updated in one of the non-stale DWPTs, then the wrong document is
 deleted and the test fails by detecting the wrong value.
 There's a 2nd failure mode that I haven't figured out yet, whereby 2
 docs are returned when searching by id (there should only ever be 1
 doc since the test uses updateDocument which is atomic wrt
 commit/reopen).
 Yonik verified the test passes pre-DWPT, so my guess is (but I
 have yet to verify) this test also passes on 3.x.  I'll backport
 the test to 3.x to be sure.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073536#comment-13073536
 ] 

Yonik Seeley commented on LUCENE-3354:
--

bq. (including the broken Solr parts still using TopLevel FieldCache entries).

Some top-level field cache uses are very much by design in Solr.
If that ability is removed from Lucene, I guess we could always move some of 
the old FieldCache logic to Solr though.

 Extend FieldCache architecture to multiple Values
 -

 Key: LUCENE-3354
 URL: https://issues.apache.org/jira/browse/LUCENE-3354
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation, 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can do things like easily 
 geodist() on a multiValued field.
 Thoughts?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073538#comment-13073538
 ] 

Uwe Schindler commented on LUCENE-3354:
---

bq. If that ability is removed from Lucene, I guess we could always move some 
of the old FieldCache logic to Solr though.

Solr can always use SlowMultiReaderWrapper (see above)

 Extend FieldCache architecture to multiple Values
 -

 Key: LUCENE-3354
 URL: https://issues.apache.org/jira/browse/LUCENE-3354
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation, 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can do things like easily 
 geodist() on a multiValued field.
 Thoughts?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[Lucene.Net] Incubator PMC/Board report for August 2011 (lucene-net-...@lucene.apache.org)

2011-08-01 Thread no-reply
Dear Lucene.NET Developers,

This email was sent by an automated system on behalf of the Apache Incubator 
PMC.
It is an initial reminder to give you plenty of time to prepare your quarterly
board report.

The board meeting is scheduled for  Wed, 17 August 2011, 10 am Pacific. The 
report 
for your podling will form a part of the Incubator PMC report. The Incubator 
PMC 
requires your report to be submitted one week before the board meeting, to 
allow 
sufficient time for review.

Please submit your report with sufficient time to allow the incubator PMC, and 
subsequently board members to review and digest. Again, the very latest you 
should submit your report is one week prior to the board meeting.

Thanks,

The Apache Incubator PMC

Submitting your Report
--

Your report should contain the following:

 * Your project name
 * A brief description of your project, which assumes no knowledge of the 
project
   or necessarily of its field
 * A list of the three most important issues to address in the move towards 
   graduation.
 * Any issues that the Incubator PMC or ASF Board might wish/need to be aware of
 * How has the community developed since the last report
 * How has the project developed since the last report.
 
This should be appended to the Incubator Wiki page at:

  http://wiki.apache.org/incubator/August2011

Note: This is manually populated. You may need to wait a little before this page is
  created from a template.

Mentors
---
Mentors should review reports for their project(s) and sign them off on the 
Incubator wiki page. Signing off reports shows that you are following the 
project - reports that are not signed off may raise alarms for the Incubator PMC.

Incubator PMC



Re: order by function

2011-08-01 Thread Grant Ingersoll
Sort by function is not available in 1.4.  It's in 3.1.
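
A hedged sketch of the request that works once you are on Solr 3.1+ (host, core, and field names here are hypothetical; on 1.4 the function in "sort" is not parsed, hence the "Missing sort order" error):

```java
public class SortByFunctionUrl {
    // Compose the select URL with a function query as the sort expression.
    static String buildUrl(String function, String order) {
        return "http://localhost:8983/solr/select?q=*:*&sort="
                + function + "+" + order; // '+' is the URL-encoded space
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("sum(field1,field2,field3)", "desc"));
    }
}
```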


On Aug 1, 2011, at 6:43 AM, Gastone Penzo wrote:

 Hi,
 i need to order by function like:
 
 sort=sum(field1,field2,field3)+desc
 
 but solr gives me this error:
 Missing sort order.
 why is this possible? i read that is possible to order by function, from 
 version 1.3
 (http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function)
 
 i use version 1.4
 
 nobody has an idea?
 
 thanx
 
 
 Gastone
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 


Grant Ingersoll



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3348) IndexWriter applies wrong deletes during concurrent flush-all

2011-08-01 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073556#comment-13073556
 ] 

Simon Willnauer commented on LUCENE-3348:
-

bq. Unfortunately there is still at least one more thread safety issue that I'm 
trying to track down... beasting uncovered a good seed.

argh! can you post it here?

simon


 IndexWriter applies wrong deletes during concurrent flush-all
 -

 Key: LUCENE-3348
 URL: https://issues.apache.org/jira/browse/LUCENE-3348
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3348.patch, LUCENE-3348.patch


 Yonik uncovered this with the TestRealTimeGet test: if a flush-all is
 underway, it is possible for an incoming update to pick a DWPT that is
 stale, ie, not yet pulled/marked for flushing, yet the DW has cutover
 to a new deletes queue.  If this happens, and the deleted term was
 also updated in one of the non-stale DWPTs, then the wrong document is
 deleted and the test fails by detecting the wrong value.
 There's a 2nd failure mode that I haven't figured out yet, whereby 2
 docs are returned when searching by id (there should only ever be 1
 doc since the test uses updateDocument which is atomic wrt
 commit/reopen).
 Yonik verified the test passes pre-DWPT, so my guess is (but I
 have yet to verify) this test also passes on 3.x.  I'll backport
 the test to 3.x to be sure.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9905 - Failure

2011-08-01 Thread Yonik Seeley
Caused by: java.io.IOException: Cannot create directory:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/example/multicore/core1/data/index
at 
org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:121)
at org.apache.lucene.store.Lock.obtain(Lock.java:72)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1116)


It feels like we've been experiencing a lot of failures lately in the
example jetty tests.
Although it may be a coincidence, it feels like it coincided with the
solr build rewrite (and IIRC there were some changes made to the
example test framework at that time?)

-Yonik
http://www.lucidimagination.com



On Mon, Aug 1, 2011 at 12:24 AM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9905/

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: JCC CompileError -- incorrect generic parameter detection

2011-08-01 Thread Andi Vajda


 Hi Lukasz,

On Sun, 31 Jul 2011, Łukasz Jancewicz wrote:


On Fri, Jul 29, 2011 at 17:09, Andi Vajda va...@apache.org wrote:

For example, is there a piece of gentyref code that I could 'borrow' (with
attribution of course) and include that in the JCC sources to fix this
particular problem ?


If you look at this file:

http://code.google.com/p/gentyref/source/browse/src/main/java/com/googlecode/gentyref/GenericTypeReflector.java

you'll see that it's completely independent from the rest of gentyref
library, so I guess that, technically, you could just copy & paste it
to JCC.


Not really, no. It depends on the whole thing but that's no big deal. The 
licensing is also compatible since it's Apache 2.0 licensed (like you said 
originally but I missed that, sorry). So it's probably ok to use/include it.



I don't know much about Apache licensing, so I can't tell you
if it's legal/appropriate to do so. But the technical possibility
obviously exists. The advantage of including the library as a whole is
that any future Java changes (Java 7, etc.) and bugs can be
potentially taken care of by developers of gentyref.


This code could be written in Java (and wrapped by JCC for itself)


That's what I did in my patch. I included gentyref.jar in the
classpath and generated JCC wrappers for it.


So I did a custom build of JCC with that gentyref class wrapped and it does 
fix the problem you encountered but it then no longer compiles Lucene :-(


I get this detailed error message from gentyref:

 jcc.cpp.JavaError: com.googlecode.gentyref.UnresolvedTypeVariableException:
 An exact type is requested, but the type contains a type variable that
 cannot be resolved.
Variable: A from public org.apache.lucene.util.Attribute
 org.apache.lucene.util.AttributeSource.addAttribute(java.lang.Class)
Hint: This is usually caused by trying to get an exact type when a
 generic method who's type parameters are not given is involved.

Hacking it a bit, I catch the error and fall back to the original reflection code 
when gentyref fails, to see how far I get. I get a bit further, but I hit 
more problems with too-specific types being resolved (like array of bool 
into [B). I could probably fix this too but I'm not yet convinced that 
gentyref is actually needed to solve the original problem. It feels like 
gentyref, cool as it is, is actually doing too much.


Clearly, I see the bug you reported but I'm not sure where it is yet. Is it 
in the java.lang.reflect code or is it in jcc itself ? For example, the same 
problem happens if you just define DirectoryEntry as:


 public interface DirectoryEntry extends Entry, Iterable {
 }

But not when I define it thus:

 import java.util.Iterator;
 public interface DirectoryEntry extends Entry, Iterable {
 Iterator iterator();
 }

Or thus:

 import java.util.Iterator;
 public interface DirectoryEntry extends Entry, Iterable<Entry> {
 Iterator<Entry> iterator();
 }

It looks like the absence of an iterator() method definition triggers this.
Maybe all I need to do is make the iterator method code generation a bit 
smarter, like not generate it if it's inherited from above anyway ?

Or see if it's inherited and its return type is overridden by the extends ?

I'm not quite sure yet what to do about this bug...

Andi..
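
The inheritance behavior Andi describes can be seen with plain java.lang.reflect, no gentyref needed. The types below (Entry, DirectoryEntry) are stand-ins for the ones in the original report: when an interface inherits iterator() from Iterable without redeclaring it, the Method that reflection finds belongs to Iterable, so its generic return type still mentions the unresolved variable T.

```java
import java.lang.reflect.Type;
import java.util.Iterator;

public class GenericReturnDemo {
    interface Entry {}
    // Inherits iterator() from Iterable -- reflection resolves it against Iterable<T>.
    interface DirectoryEntry extends Entry, Iterable<Entry> {}
    // Redeclares iterator() with a concrete return type.
    interface DirectoryEntry2 extends Entry, Iterable<Entry> {
        Iterator<Entry> iterator();
    }

    // Returns the generic return type of c's iterator() method as a string.
    static String returnTypeOfIterator(Class<?> c) {
        try {
            Type t = c.getMethod("iterator").getGenericReturnType();
            return t.toString();
        } catch (NoSuchMethodException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        // Inherited declaration still mentions Iterable's type variable T:
        System.out.println(returnTypeOfIterator(DirectoryEntry.class));
        // Redeclared one is fully resolved to Entry:
        System.out.println(returnTypeOfIterator(DirectoryEntry2.class));
    }
}
```

This matches the observation that the absence of an iterator() declaration is what triggers the bug: the redeclaration is what gives the wrapper generator a resolved type to work with.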


[jira] [Commented] (LUCENE-3348) IndexWriter applies wrong deletes during concurrent flush-all

2011-08-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073588#comment-13073588
 ] 

Michael McCandless commented on LUCENE-3348:


Here's what I run with the while1 tester in luceneutil: {{TestStressNRT -iters 
3 -verbose -seed -6208047570437556381:-3138230871915238634}}

I think what's special about the seed is maxBufferedDocs is 3, so we are doing 
tons of segment flushing.  I dumbed down the test somewhat (turned off merging 
entirely, only 1 reader thread, up to 5 writer threads), and it still fails.

 IndexWriter applies wrong deletes during concurrent flush-all
 -

 Key: LUCENE-3348
 URL: https://issues.apache.org/jira/browse/LUCENE-3348
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3348.patch, LUCENE-3348.patch, LUCENE-3348.patch


 Yonik uncovered this with the TestRealTimeGet test: if a flush-all is
 underway, it is possible for an incoming update to pick a DWPT that is
 stale, ie, not yet pulled/marked for flushing, yet the DW has cutover
 to a new deletes queue.  If this happens, and the deleted term was
 also updated in one of the non-stale DWPTs, then the wrong document is
 deleted and the test fails by detecting the wrong value.
 There's a 2nd failure mode that I haven't figured out yet, whereby 2
 docs are returned when searching by id (there should only ever be 1
 doc since the test uses updateDocument which is atomic wrt
 commit/reopen).
 Yonik verified the test passes pre-DWPT, so my guess is (but I
 have yet to verify) this test also passes on 3.x.  I'll backport
 the test to 3.x to be sure.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3030) Block tree terms dict & index

2011-08-01 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3030:
---

Attachment: LUCENE-3030.patch

Checkpointing my current state here -- the big change is I added a 
Terms.intersect(CompiledAutomaton) method, which returns a TermsEnum, but 
there's something still wrong with it -- it seems to give the right results but makes 
LEV2 FuzzyQ slower.
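
A toy stand-in for what Terms.intersect(CompiledAutomaton) computes (a compiled regex plays the role of Lucene's CompiledAutomaton here): the set of indexed terms accepted by an automaton. The real implementation walks the block tree and the automaton together rather than testing every term, which is where the FuzzyQ speed is won or lost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class IntersectSketch {
    // Keep only the sorted terms the automaton accepts.
    static List<String> intersect(List<String> sortedTerms, Pattern automaton) {
        List<String> out = new ArrayList<>();
        for (String term : sortedTerms) {
            if (automaton.matcher(term).matches()) {
                out.add(term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("fob", "foo", "fop", "fuzz");
        System.out.println(intersect(terms, Pattern.compile("fo.")));
    }
}
```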

 Block tree terms dict & index
 -

 Key: LUCENE-3030
 URL: https://issues.apache.org/jira/browse/LUCENE-3030
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3030.patch, LUCENE-3030.patch, LUCENE-3030.patch, 
 LUCENE-3030.patch


 Our default terms index today breaks terms into blocks of fixed size
 (ie, every 32 terms is a new block), and then we build an index on top
 of that (holding the start term for each block).
 But, it should be better to instead break terms according to how they
 share prefixes.  This results in variable sized blocks, but means
 within each block we maximize the shared prefix and minimize the
 resulting terms index.  It should also be a speedup for terms dict
 intensive queries because the terms index becomes a true prefix
 trie, and can be used to fast-fail on term lookup (ie returning
 NOT_FOUND without having to seek/scan a terms block).
 Having a true prefix trie should also enable much faster intersection
 with automaton (but this will be a new issue).
 I've made an initial impl for this (called
 BlockTreeTermsWriter/Reader).  It's still a work in progress... lots
 of nocommits, and hairy code, but tests pass (at least once!).
 I made two new codecs, temporarily called StandardTree, PulsingTree,
 that are just like their counterparts but use this new terms dict.
 I added a new exactOnly boolean to TermsEnum.seek.  If that's true
 and the term is NOT_FOUND, we will (quickly) return NOT_FOUND and the
 enum is unpositioned (ie you should not call next(), docs(), etc.).
 In this approach the index and dict are tightly connected, so it does
 not support a pluggable index impl like BlockTermsWriter/Reader.
 Blocks are stored on certain nodes of the prefix trie, and can contain
 both terms and pointers to sub-blocks (ie, if the block is not a leaf
 block).  So there are two trees, tied to one another -- the index
 trie, and the blocks.  Only certain nodes in the trie map to a block
 in the block tree.
 I think this algorithm is similar to burst tries
 (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499),
 except it allows terms to be stored on inner blocks (not just leaf
 blocks).  This is important for Lucene because an [accidental]
 adversary could produce a terms dict with way too many blocks (way
 too much RAM used by the terms index).  Still, with my current patch,
 an adversary can produce too-big blocks... which we may need to fix,
 by letting the terms index not be a true prefix trie on its leaf
 edges.
 Exactly how the blocks are picked can be factored out as its own
 policy (but I haven't done that yet).  Then, burst trie is one policy,
 my current approach is another, etc.  The policy can be tuned to
 the terms' expected distribution, eg if it's a primary key field and
 you only use base 10 for each character then you want block sizes of
 size 10.  This can make a sizable difference on lookup cost.
 I modified the FST Builder to allow for a plugin that freezes the
 tail (changed suffix) of each added term, because I use this to find
 the blocks.
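
The prefix-based splitting described above can be sketched in miniature. This is not BlockTreeTermsWriter; it uses only the first character as the shared prefix and a hard size cap, both simplifications, but it shows how variable-sized blocks fall out of grouping by prefix instead of counting to a fixed 32:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrefixBlocksDemo {
    // Group consecutive sorted terms sharing the same first character into one
    // block; start a new block when the prefix changes or the cap is reached.
    static List<List<String>> blocks(List<String> sorted, int maxBlockSize) {
        List<List<String>> out = new ArrayList<>();
        List<String> cur = new ArrayList<>();
        for (String t : sorted) {
            boolean samePrefix = !cur.isEmpty() && cur.get(0).charAt(0) == t.charAt(0);
            if (!cur.isEmpty() && (!samePrefix || cur.size() == maxBlockSize)) {
                out.add(cur);
                cur = new ArrayList<>();
            }
            cur.add(t);
        }
        if (!cur.isEmpty()) {
            out.add(cur);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("apple", "apply", "banana", "band", "bank", "cat");
        System.out.println(blocks(terms, 3));
    }
}
```

The index only needs one entry per block, so maximizing the shared prefix per block is what shrinks the terms index.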

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9905 - Failure

2011-08-01 Thread Steven A Rowe
I agree, these events do feel coincidental.

Not sure what constitutes the example test framework, but I did move the 
ExternalPaths utility class into the solr test-framework, because it's used 
from both the Solr core tests and the Solrj tests, which don't depend (anymore) 
on the Solr core tests. (branch_3x:7/13/2011:r1146191)

I also had to change ExternalPaths.determineSourceHome() to handle the 
situation where there is no solr/conf/ dir in the classpath, e.g. the situation 
for Solrj & common tests (which are now housed with the Solrj internal module) 
-- these tests do not have solr/core/src/test-files/solr/conf/ in their 
classpath.  (branch_3x:7/22/2011:r1149691)

That all said, it's not clear to me how these changes could have affected 
directory creation?

Steve

 -Original Message-
 From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
 Seeley
 Sent: Monday, August 01, 2011 11:49 AM
 To: dev@lucene.apache.org
 Subject: Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9905 -
 Failure
 
 Caused by: java.io.IOException: Cannot create directory:
 /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-
 3.x/checkout/solr/example/multicore/core1/data/index
   at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:121)
   at org.apache.lucene.store.Lock.obtain(Lock.java:72)
   at
 org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1116)
 
 
 It feels like we've been experiencing a lot of failures lately in the
 example jetty tests.
 Although it may be a coincidence, it feels like it coincided with the
 solr build rewrite (and IIRC there were some changes made to the
 example test framework at that time?)
 
 -Yonik
 http://www.lucidimagination.com
 
 
 
 On Mon, Aug 1, 2011 at 12:24 AM, Apache Jenkins Server
 jenk...@builds.apache.org wrote:
  https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9905/
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1223) Query Filter fq with OR operator

2011-08-01 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073611#comment-13073611
 ] 

Shawn Heisey commented on SOLR-1223:


bq. I'd rather see a new filterQuery type like ofq than being stuck with the 
current options. Nested filterQueries including variables would obviously be 
the most flexible solution, but imho having two different filter types would 
add enough benefit in the meantime.

I see that someone else had the same idea a long time before I did.  I just 
brought this up on the solr-user list a few days ago, but I couldn't think of a 
good parameter name.  The parameter name I came up with (fqu, filter query 
union) is not as good as ofq.

I like Brian and Frederik's idea.

 Query Filter fq with OR operator
 

 Key: SOLR-1223
 URL: https://issues.apache.org/jira/browse/SOLR-1223
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Brian Pearson
Priority: Minor

 See this 
 [issue|http://www.nabble.com/Query-Filter-fq-with-OR-operator-td23895837.html]
 for some background. Today, all of the Query filters specified with the 
 fq parameter are AND'd together.  
 This issue is about allowing a set of filters to be OR'd together (in 
 addition to having another set of filters that are AND'd). The OR'd 
 filters would of course be applied before any scoring is done.  
 The advantage of this feature is that you will be able to break up complex 
 filters into simple, more cacheable filters, which should improve performance.
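
The proposed semantics can be sketched with BitSets standing in for cached per-filter DocSets (a simplification; Solr's real filters are DocSets resolved against the index): plain fq filters are intersected, while the proposed OR'd set ("ofq") is unioned first and then intersected with the rest.

```java
import java.util.BitSet;

public class OrFilterSketch {
    // result = base AND fq1 AND fq2 ... AND (ofq1 OR ofq2 OR ...)
    static BitSet apply(BitSet base, BitSet[] andFilters, BitSet[] orFilters) {
        BitSet result = (BitSet) base.clone();
        for (BitSet f : andFilters) {
            result.and(f);
        }
        if (orFilters.length > 0) {
            BitSet union = new BitSet();
            for (BitSet f : orFilters) {
                union.or(f);
            }
            result.and(union);
        }
        return result;
    }

    static BitSet bits(int... set) {
        BitSet b = new BitSet();
        for (int i : set) {
            b.set(i);
        }
        return b;
    }

    public static void main(String[] args) {
        BitSet base = bits(0, 1, 2, 3, 4);       // all live docs
        BitSet fq = bits(0, 1, 2, 3);            // AND'd filter
        BitSet ofq1 = bits(1);                   // OR'd filters, each cacheable
        BitSet ofq2 = bits(3, 4);
        System.out.println(apply(base, new BitSet[] {fq}, new BitSet[] {ofq1, ofq2}));
    }
}
```

Each simple filter stays individually cacheable; only the cheap union/intersection is recomputed per query.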

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-2684) ConcurrentModificationException from BinaryResponseWriter

2011-08-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-2684.


Resolution: Fixed
  Assignee: Hoss Man

Committed revision 1152885.

CHANGES.txt attribution added to SOLR-1566 since this was a bug in unreleased code.

Thanks Arul!

 ConcurrentModificationException from BinaryResponseWriter
 -

 Key: SOLR-2684
 URL: https://issues.apache.org/jira/browse/SOLR-2684
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.0
Reporter: Arul Kalaipandian
Assignee: Hoss Man
Priority: Critical
  Labels: ConcurrentModificationException
 Fix For: 4.0

 Attachments: SOLR-2684.patch, SOLR-2684.patch


 ConcurrentModificationException  thrown from BinaryResponseWriter  while 
 writing SolrDocument to the response.
 SEVERE: java.util.ConcurrentModificationException
   at 
 java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:373)
   at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:384)
   at 
 org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:98)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:242)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:139)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeArray(JavaBinCodec.java:377)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeSolrDocumentList(JavaBinCodec.java:340)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:226)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:139)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:134)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:222)
   at 
 org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:139)
   at 
 org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:87)
   at 
 org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:49)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:333)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:261)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
   at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
   at 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at 
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at 
 org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
   at 
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
   at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
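
A minimal standalone repro of the failure mode in the stack trace above: structurally modifying a LinkedHashMap while an iterator over it is live makes the next next() call throw ConcurrentModificationException. (In the reported bug the modification presumably comes from another thread sharing the SolrDocument while BinaryResponseWriter iterates it; the single-threaded version below just shows the fail-fast mechanism.)

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

public class CmeRepro {
    // Returns true if mutating the map mid-iteration throws CME, as expected.
    static boolean triggerCme() {
        Map<String, Integer> doc = new LinkedHashMap<>();
        doc.put("id", 1);
        doc.put("score", 2);
        try {
            for (String key : doc.keySet()) {
                doc.put("added-" + key, 0); // structural modification mid-iteration
            }
        } catch (ConcurrentModificationException expected) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(triggerCme());
    }
}
```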

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9905 - Failure

2011-08-01 Thread Yonik Seeley
On trunk currently, it looks like the following data directories
(outside of /build)  are used by the tests:

./core/src/test-files/solr/data
./example/multicore/core0/data
./example/multicore/core1/data

Not sure if this was the case in the past or not... I'll verify.

-Yonik
http://www.lucidimagination.com



On Mon, Aug 1, 2011 at 12:15 PM, Steven A Rowe sar...@syr.edu wrote:
 I agree, these events do feel coincidental.

 Not sure what constitutes the example test framework, but I did move the 
 ExternalPaths utility class into the solr test-framework, because it's used 
 from both the Solr core tests and the Solrj tests, which don't depend 
 (anymore) on the Solr core tests. (branch_3x:7/13/2011:r1146191)

 I also had to change ExternalPaths.determineSourceHome() to handle the 
 situation where there is no solr/conf/ dir in the classpath, e.g. the 
 situation for Solrj & common tests (which are now housed with the Solrj 
 internal module) -- these tests do not have 
 solr/core/src/test-files/solr/conf/ in their classpath.  
 (branch_3x:7/22/2011:r1149691)

 That all said, it's not clear to me how these changes could have affected 
 directory creation?

 Steve

 -Original Message-
 From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
 Seeley
 Sent: Monday, August 01, 2011 11:49 AM
 To: dev@lucene.apache.org
 Subject: Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9905 -
 Failure

 Caused by: java.io.IOException: Cannot create directory:
 /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-
 3.x/checkout/solr/example/multicore/core1/data/index
       at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:121)
       at org.apache.lucene.store.Lock.obtain(Lock.java:72)
       at
 org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1116)


 It feels like we've been experiencing a lot of failures lately in the
 example jetty tests.
 Although it may be a coincidence, it feels like it coincided with the
 solr build rewrite (and IIRC there were some changes made to the
 example test framework at that time?)

 -Yonik
 http://www.lucidimagination.com



 On Mon, Aug 1, 2011 at 12:24 AM, Apache Jenkins Server
 jenk...@builds.apache.org wrote:
  https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9905/

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-2308) Separately specify a field's type

2011-08-01 Thread Nikola Tankovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikola Tankovic updated LUCENE-2308:


Attachment: LUCENE-2308-19.patch

Removed old oal.Document, except in documentation. Tests pass!

 Separately specify a field's type
 -

 Key: LUCENE-2308
 URL: https://issues.apache.org/jira/browse/LUCENE-2308
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Fix For: 4.0

 Attachments: LUCENE-2308-10.patch, LUCENE-2308-11.patch, 
 LUCENE-2308-12.patch, LUCENE-2308-13.patch, LUCENE-2308-14.patch, 
 LUCENE-2308-15.patch, LUCENE-2308-16.patch, LUCENE-2308-17.patch, 
 LUCENE-2308-18.patch, LUCENE-2308-19.patch, LUCENE-2308-2.patch, 
 LUCENE-2308-3.patch, LUCENE-2308-4.patch, LUCENE-2308-5.patch, 
 LUCENE-2308-6.patch, LUCENE-2308-7.patch, LUCENE-2308-8.patch, 
 LUCENE-2308-9.patch, LUCENE-2308-ltc.patch, LUCENE-2308.patch, 
 LUCENE-2308.patch


 This came up from discussions on IRC.  I'm summarizing here...
 Today when you make a Field to add to a document you can set things
 indexed or not, stored or not, analyzed or not, details like omitTfAP,
 omitNorms, index term vectors (separately controlling
 offsets/positions), etc.
 I think we should factor these out into a new class (FieldType?).
 Then you could re-use this FieldType instance across multiple fields.
 The Field instance would still hold the actual value.
 We could then do per-field analyzers by adding a setAnalyzer on the
 FieldType, instead of the separate PerFieldAnalzyerWrapper (likewise
 for per-field codecs (with flex), where we now have
 PerFieldCodecWrapper).
 This would NOT be a schema!  It's just refactoring what we already
 specify today.  EG it's not serialized into the index.
 This has been discussed before, and I know Michael Busch opened a more
 ambitious (I think?) issue.  I think this is a good first baby step.  We could
 consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold
 off on that for starters...
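The refactoring described above can be sketched in a few lines. This is an illustrative sketch only, not the actual Lucene API; the class and field names are hypothetical. It shows the core idea: type settings live in a reusable FieldType object, while each Field instance holds just its name and value.

```java
// Illustrative sketch only -- not the actual Lucene API. Type settings
// live in a reusable FieldType; each Field holds just its value.
public class FieldTypeSketch {

    // Holds the per-field settings that today live on Field itself.
    static final class FieldType {
        final boolean stored;
        final boolean indexed;
        final boolean omitNorms;

        FieldType(boolean stored, boolean indexed, boolean omitNorms) {
            this.stored = stored;
            this.indexed = indexed;
            this.omitNorms = omitNorms;
        }
    }

    // A field is now just a name, a value, and a shared type.
    static final class Field {
        final String name;
        final String value;
        final FieldType type;

        Field(String name, String value, FieldType type) {
            this.name = name;
            this.value = value;
            this.type = type;
        }
    }

    public static void main(String[] args) {
        // One FieldType instance reused across many fields:
        FieldType storedIndexed = new FieldType(true, true, false);
        Field title = new Field("title", "Lucene in Action", storedIndexed);
        Field body = new Field("body", "...", storedIndexed);

        // Both fields share the exact same type object.
        System.out.println(title.type == body.type); // prints "true"
    }
}
```

Per-field analyzers or codecs would then hang off the shared FieldType instance instead of needing a wrapper class.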

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-3338) Flexible query parser does not support open ranges and range queries with mixed inclusive and exclusive ranges

2011-08-01 Thread Adriano Crestani (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adriano Crestani resolved LUCENE-3338.
--

Resolution: Fixed
  Assignee: Adriano Crestani  (was: Uwe Schindler)

 Flexible query parser does not support open ranges and range queries with 
 mixed inclusive and exclusive ranges
 --

 Key: LUCENE-3338
 URL: https://issues.apache.org/jira/browse/LUCENE-3338
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 3.3
Reporter: Vinicius Barros
Assignee: Adriano Crestani
 Fix For: 4.0

 Attachments: LUCENE_3338_and_3343_2011_07_30.patch, 
 week9-merged-nosurround.patch, 
 week9-merged-nosurround_with_failing_junit.patch, week9-merged.patch, 
 week9.patch


 Flexible query parser does not support open ranges and range queries with 
 mixed inclusive and exclusive ranges.
 These two problems were found while developing LUCENE-1768.




[jira] [Commented] (LUCENE-3343) Comparison operators <,<=,>,>= and = support as RangeQuery syntax in QueryParser

2011-08-01 Thread Adriano Crestani (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073663#comment-13073663
 ] 

Adriano Crestani commented on LUCENE-3343:
--

The code for 4.0 was just committed to the repository (rev 1152892)

 Comparison operators <,<=,>,>= and = support as RangeQuery syntax in 
 QueryParser
 

 Key: LUCENE-3343
 URL: https://issues.apache.org/jira/browse/LUCENE-3343
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/queryparser
Reporter: Olivier Favre
Assignee: Adriano Crestani
Priority: Minor
  Labels: parser, query
 Fix For: 3.4, 4.0

 Attachments: NumCompQueryParser-3x.patch, NumCompQueryParser.patch

   Original Estimate: 96h
  Remaining Estimate: 96h

 To offer better interoperability with other search engines and to provide an 
 easier and more straight forward syntax,
 the operators <, <=, >, >= and = should be available to express an open range 
 query.
 They should at least work for numeric queries.
 '=' can be made a synonym for ':'.
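The mapping this feature implies can be sketched as follows. This is a hypothetical illustration, not the committed parser code: each comparison operator becomes one bound of an open-ended range, with null standing in for an unbounded end, and `=` behaving like `:`.

```java
// Hypothetical sketch (not the committed LUCENE-3343 code): mapping a
// comparison operator plus a value to the bounds of an open range query,
// where a null bound means "unbounded on that side".
public class RangeOpSketch {

    static final class Range {
        final String lower, upper;              // null = open end
        final boolean includeLower, includeUpper;

        Range(String lower, String upper, boolean il, boolean iu) {
            this.lower = lower;
            this.upper = upper;
            this.includeLower = il;
            this.includeUpper = iu;
        }
    }

    // Translate "<", "<=", ">", ">=", or "=" into a range with one open end.
    static Range toRange(String op, String value) {
        switch (op) {
            case "<":  return new Range(null, value, false, false);
            case "<=": return new Range(null, value, false, true);
            case ">":  return new Range(value, null, false, false);
            case ">=": return new Range(value, null, true, false);
            case "=":  return new Range(value, value, true, true); // synonym for ':'
            default:   throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        Range r = toRange(">=", "10");
        System.out.println(r.lower + ".." + r.upper); // prints "10..null"
    }
}
```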




Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9905 - Failure

2011-08-01 Thread Yonik Seeley
On Mon, Aug 1, 2011 at 2:49 PM, Yonik Seeley yo...@lucidimagination.com wrote:
 On trunk currently, it looks like the following data directories
 (outside of /build)  are used by the tests:

 ./core/src/test-files/solr/data
 ./example/multicore/core0/data
 ./example/multicore/core1/data

 Not sure if this was the case in the past or not... I'll verify.

Yep, older versions of Solr seem to have the same behavior.

-Yonik
http://www.lucidimagination.com




[jira] [Commented] (LUCENE-3348) IndexWriter applies wrong deletes during concurrent flush-all

2011-08-01 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073678#comment-13073678
 ] 

Simon Willnauer commented on LUCENE-3348:
-

Mike, I cannot reproduce this failure.. what exactly is failing there? Maybe 
you can put the output in a text file and attach it?

Regarding the latest patch, I think we can call 
DWFlushControl#addFlushableState() from DWFlushControl#markForFullFlush() and 
use a global list to collect the DWPT for the full flush. 

I think we should move the getAndLock call into DWFlushControl, something like 
DWFlushControl#obtainAndLock(); this would allow us to make the check and the 
DWFlushControl#addFlushableState() method private to DWFC. Further, we can 
simplify the deleteQueue check a little: since we already obtained a 
ThreadState, we don't need to unlock the state again after calling 
addFlushableState(). Something like this:

{code}
ThreadState obtainAndLock() {
  final ThreadState perThread = perThreadPool.getAndLock(Thread
      .currentThread(), documentsWriter);
  if (perThread.isActive()
      && perThread.perThread.deleteQueue != documentsWriter.deleteQueue) {
    // There is a flush-all in process and this DWPT is
    // now stale -- enroll it for flush and try for
    // another DWPT:
    addFlushableState(perThread);
  }
  return perThread;
}
{code}

We may also be spending too much time in full flush, since we lock all 
ThreadStates at least once even though some indexing threads might already have 
helped out by swapping out DWPT instances. I think we could collect the already 
swapped-out ThreadStates during a full flush and only check the ones that have 
not been processed yet?


 IndexWriter applies wrong deletes during concurrent flush-all
 -

 Key: LUCENE-3348
 URL: https://issues.apache.org/jira/browse/LUCENE-3348
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3348.patch, LUCENE-3348.patch, LUCENE-3348.patch


 Yonik uncovered this with the TestRealTimeGet test: if a flush-all is
 underway, it is possible for an incoming update to pick a DWPT that is
 stale, ie, not yet pulled/marked for flushing, yet the DW has cutover
 to a new deletes queue.  If this happens, and the deleted term was
 also updated in one of the non-stale DWPTs, then the wrong document is
 deleted and the test fails by detecting the wrong value.
 There's a 2nd failure mode that I haven't figured out yet, whereby 2
 docs are returned when searching by id (there should only ever be 1
 doc since the test uses updateDocument which is atomic wrt
 commit/reopen).
 Yonik verified the test passes pre-DWPT, so my guess is (but I
 have yet to verify) this test also passes on 3.x.  I'll backport
 the test to 3.x to be sure.




[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9924 - Failure

2011-08-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9924/

No tests ran.

Build Log (for compile errors):
[...truncated 12464 lines...]






[jira] [Commented] (LUCENE-3030) Block tree terms dict & index

2011-08-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073800#comment-13073800
 ] 

Michael McCandless commented on LUCENE-3030:


I created a branch 
https://svn.apache.org/repos/asf/lucene/dev/branches/blocktree_3030 for 
iterating on this.

 Block tree terms dict & index
 -

 Key: LUCENE-3030
 URL: https://issues.apache.org/jira/browse/LUCENE-3030
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3030.patch, LUCENE-3030.patch, LUCENE-3030.patch, 
 LUCENE-3030.patch


 Our default terms index today breaks terms into blocks of fixed size
 (ie, every 32 terms is a new block), and then we build an index on top
 of that (holding the start term for each block).
 But, it should be better to instead break terms according to how they
 share prefixes.  This results in variable sized blocks, but means
 within each block we maximize the shared prefix and minimize the
 resulting terms index.  It should also be a speedup for terms dict
 intensive queries because the terms index becomes a true prefix
 trie, and can be used to fast-fail on term lookup (ie returning
 NOT_FOUND without having to seek/scan a terms block).
 Having a true prefix trie should also enable much faster intersection
 with automaton (but this will be a new issue).
 I've made an initial impl for this (called
 BlockTreeTermsWriter/Reader).  It's still a work in progress... lots
 of nocommits, and hairy code, but tests pass (at least once!).
 I made two new codecs, temporarily called StandardTree, PulsingTree,
 that are just like their counterparts but use this new terms dict.
 I added a new exactOnly boolean to TermsEnum.seek.  If that's true
 and the term is NOT_FOUND, we will (quickly) return NOT_FOUND and the
 enum is unpositioned (ie you should not call next(), docs(), etc.).
 In this approach the index and dict are tightly connected, so it does
 not support a pluggable index impl like BlockTermsWriter/Reader.
 Blocks are stored on certain nodes of the prefix trie, and can contain
 both terms and pointers to sub-blocks (ie, if the block is not a leaf
 block).  So there are two trees, tied to one another -- the index
 trie, and the blocks.  Only certain nodes in the trie map to a block
 in the block tree.
 I think this algorithm is similar to burst tries
 (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499),
 except it allows terms to be stored on inner blocks (not just leaf
 blocks).  This is important for Lucene because an [accidental]
 adversary could produce a terms dict with way too many blocks (way
 too much RAM used by the terms index).  Still, with my current patch,
 an adversary can produce too-big blocks... which we may need to fix,
 by letting the terms index not be a true prefix trie on its leaf 
 edges.
 Exactly how the blocks are picked can be factored out as its own
 policy (but I haven't done that yet).  Then, burst trie is one policy,
 my current approach is another, etc.  The policy can be tuned to
 the terms' expected distribution, eg if it's a primary key field and
 you only use base 10 for each character then you want block sizes of
 size 10.  This can make a sizable difference on lookup cost.
 I modified the FST Builder to allow for a plugin that freezes the
 tail (changed suffix) of each added term, because I use this to find
 the blocks.
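The core idea above, variable-sized blocks cut on shared prefixes instead of every fixed N terms, can be shown with a toy sketch. This is an illustration of the concept only, not the LUCENE-3030 code; it groups consecutive sorted terms by a one-character prefix, with a minimum block size, where the real implementation works on full shared prefixes over an FST.

```java
// Toy sketch of the LUCENE-3030 idea (not the actual code): break a sorted
// term list into variable-sized blocks so terms in a block share a prefix,
// rather than cutting a new block every fixed N terms.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrefixBlockSketch {

    // Start a new block when the 1-char prefix changes, but only once the
    // current block has reached minBlockSize terms.
    static List<List<String>> blocks(List<String> sortedTerms, int minBlockSize) {
        List<List<String>> result = new ArrayList<>();
        List<String> current = new ArrayList<>();
        String prefix = null;
        for (String term : sortedTerms) {
            String p = term.isEmpty() ? "" : term.substring(0, 1);
            if (prefix != null && !p.equals(prefix) && current.size() >= minBlockSize) {
                result.add(current);          // flush the finished block
                current = new ArrayList<>();
            }
            current.add(term);
            prefix = p;
        }
        if (!current.isEmpty()) {
            result.add(current);              // flush the trailing block
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("abc", "abd", "abe", "bar", "baz", "cat");
        System.out.println(blocks(terms, 2));
        // prints "[[abc, abd, abe], [bar, baz], [cat]]"
    }
}
```

Note how block sizes follow the term distribution: three "a" terms land in one block, while "cat" gets a block of its own. That is what lets the index act as a true prefix trie for fast-fail lookups.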




[jira] [Updated] (LUCENE-3348) IndexWriter applies wrong deletes during concurrent flush-all

2011-08-01 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3348:
---

Attachment: fail.txt.bz2

Full output from a failure.

 IndexWriter applies wrong deletes during concurrent flush-all
 -

 Key: LUCENE-3348
 URL: https://issues.apache.org/jira/browse/LUCENE-3348
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3348.patch, LUCENE-3348.patch, LUCENE-3348.patch, 
 fail.txt.bz2


 Yonik uncovered this with the TestRealTimeGet test: if a flush-all is
 underway, it is possible for an incoming update to pick a DWPT that is
 stale, ie, not yet pulled/marked for flushing, yet the DW has cutover
 to a new deletes queue.  If this happens, and the deleted term was
 also updated in one of the non-stale DWPTs, then the wrong document is
 deleted and the test fails by detecting the wrong value.
 There's a 2nd failure mode that I haven't figured out yet, whereby 2
 docs are returned when searching by id (there should only ever be 1
 doc since the test uses updateDocument which is atomic wrt
 commit/reopen).
 Yonik verified the test passes pre-DWPT, so my guess is (but I
 have yet to verify) this test also passes on 3.x.  I'll backport
 the test to 3.x to be sure.




[jira] [Commented] (LUCENE-3348) IndexWriter applies wrong deletes during concurrent flush-all

2011-08-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073804#comment-13073804
 ] 

Michael McCandless commented on LUCENE-3348:


OK, I attached the output of a failure -- it's 400K lines.  Search for the 
AssertionError, where id:26 couldn't find a doc or a tombstone.

 IndexWriter applies wrong deletes during concurrent flush-all
 -

 Key: LUCENE-3348
 URL: https://issues.apache.org/jira/browse/LUCENE-3348
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.4, 4.0

 Attachments: LUCENE-3348.patch, LUCENE-3348.patch, LUCENE-3348.patch, 
 fail.txt.bz2


 Yonik uncovered this with the TestRealTimeGet test: if a flush-all is
 underway, it is possible for an incoming update to pick a DWPT that is
 stale, ie, not yet pulled/marked for flushing, yet the DW has cutover
 to a new deletes queue.  If this happens, and the deleted term was
 also updated in one of the non-stale DWPTs, then the wrong document is
 deleted and the test fails by detecting the wrong value.
 There's a 2nd failure mode that I haven't figured out yet, whereby 2
 docs are returned when searching by id (there should only ever be 1
 doc since the test uses updateDocument which is atomic wrt
 commit/reopen).
 Yonik verified the test passes pre-DWPT, so my guess is (but I
 have yet to verify) this test also passes on 3.x.  I'll backport
 the test to 3.x to be sure.




[jira] [Commented] (LUCENE-3354) Extend FieldCache architecture to multiple Values

2011-08-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073828#comment-13073828
 ] 

Hoss Man commented on LUCENE-3354:
--

bq. This would also remove the insanity issues. 

FWIW: the WeakHashMap isn't the sole source of insanity - that can also come 
about from inconsistent usage for a single field (ie: asking for string and int 
caches for the same field)
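That second source of insanity can be shown with a toy sketch. This is an illustration only, not the FieldCache code; the names are hypothetical. Because entries are keyed by (field, value type), asking for a String cache and an int cache on the same field populates two independent entries over the same underlying data.

```java
// Toy illustration (not the FieldCache code) of the "insanity" described
// above: entries are keyed by (field, type), so requesting two value types
// for one field creates two independent cache entries over the same data.
import java.util.HashMap;
import java.util.Map;

public class FieldCacheSketch {
    private final Map<String, Object> entries = new HashMap<>();

    // Each distinct (field, type) pair gets its own entry -- duplicated memory.
    Object get(String field, Class<?> type) {
        String key = field + "/" + type.getSimpleName();
        return entries.computeIfAbsent(key, k -> new Object());
    }

    int entryCount() {
        return entries.size();
    }

    public static void main(String[] args) {
        FieldCacheSketch cache = new FieldCacheSketch();
        cache.get("price", String.class);
        cache.get("price", Integer.class);      // same field, second entry
        System.out.println(cache.entryCount()); // prints "2"
    }
}
```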

 Extend FieldCache architecture to multiple Values
 -

 Key: LUCENE-3354
 URL: https://issues.apache.org/jira/browse/LUCENE-3354
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Bill Bell

 I would consider this a bug. It appears lots of people are working around 
 this limitation; 
 why don't we just change the underlying data structures to natively support 
 multiValued fields in the FieldCache architecture?
 Then functions() will work properly, and we can do things like easily 
 geodist() on a multiValued field.
 Thoughts?




[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-08-01 Thread Phillipe Ramalho (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phillipe Ramalho updated LUCENE-2979:
-

Attachment: LUCENE-2979_phillipe_ramalho_4_trunk.patch
LUCENE-2979_phillipe_ramalho_4_3x.patch

Here is a patch that backports the new configuration API to 3.x. I did exactly 
as I described in my proposal and it seems to be working as expected. I changed 
the documentation as well (I hope I got everything; can you double check that, 
Adriano?).

I also created a simple example of how to use the new API in package.html and 
added to both 3.x and trunk.

Please, let me know if everything looks good and if I didn't break any API.

 Simplify configuration API of contrib Query Parser
 --

 Key: LUCENE-2979
 URL: https://issues.apache.org/jira/browse/LUCENE-2979
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 2.9, 3.0
Reporter: Adriano Crestani
Assignee: Adriano Crestani
  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, 
 LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_ramalho_3.patch, 
 LUCENE-2979_phillipe_ramalho_4_3x.patch, 
 LUCENE-2979_phillipe_ramalho_4_trunk.patch, 
 LUCENE-2979_phillipe_reamalho.patch


 The current configuration API is very complicated and inherits the concept 
 used by the Attribute API to store token information in token streams. However, 
 the requirements for both (QP config and token stream) are not the same, so 
 they shouldn't be using the same thing.
 I propose to simplify QP config and make it less scary for people intending 
 to use contrib QP. The task is not difficult, it will just require a lot of 
 code change and figure out the best way to do it. That's why it's a good 
 candidate for a GSoC project.
 I would like to hear good proposals about how to make the API more friendly 
 and less scary :)




[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-08-01 Thread Phillipe Ramalho (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13075974#comment-13075974
 ] 

Phillipe Ramalho commented on LUCENE-2979:
--

Hi Uwe,

Is there anything to be fixed in 3352? I see it's a new feature JIRA. Am I 
missing something?

Currently, I am only working on migrating from the old API to the new one, 
making no changes to how the configuration is used. So nothing here changes (at 
least it should not) how ParametricQueryNodeProcessor works.

 Simplify configuration API of contrib Query Parser
 --

 Key: LUCENE-2979
 URL: https://issues.apache.org/jira/browse/LUCENE-2979
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 2.9, 3.0
Reporter: Adriano Crestani
Assignee: Adriano Crestani
  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
 Fix For: 3.4, 4.0

 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, 
 LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_ramalho_3.patch, 
 LUCENE-2979_phillipe_ramalho_4_3x.patch, 
 LUCENE-2979_phillipe_ramalho_4_trunk.patch, 
 LUCENE-2979_phillipe_reamalho.patch


 The current configuration API is very complicated and inherits the concept 
 used by the Attribute API to store token information in token streams. However, 
 the requirements for both (QP config and token stream) are not the same, so 
 they shouldn't be using the same thing.
 I propose to simplify QP config and make it less scary for people intending 
 to use contrib QP. The task is not difficult, it will just require a lot of 
 code change and figure out the best way to do it. That's why it's a good 
 candidate for a GSoC project.
 I would like to hear good proposals about how to make the API more friendly 
 and less scary :)




[jira] [Commented] (LUCENE-2308) Separately specify a field's type

2011-08-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13075978#comment-13075978
 ] 

Michael McCandless commented on LUCENE-2308:


Patch looks good Nikola; I'll commit to the branch!

I think the next step is to remove the oal.document package and any related 
classes (eg, DocumentStoredFieldVisitor), and then do a massive rename of 
doc/document2 back to doc/document?

 Separately specify a field's type
 -

 Key: LUCENE-2308
 URL: https://issues.apache.org/jira/browse/LUCENE-2308
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Fix For: 4.0

 Attachments: LUCENE-2308-10.patch, LUCENE-2308-11.patch, 
 LUCENE-2308-12.patch, LUCENE-2308-13.patch, LUCENE-2308-14.patch, 
 LUCENE-2308-15.patch, LUCENE-2308-16.patch, LUCENE-2308-17.patch, 
 LUCENE-2308-18.patch, LUCENE-2308-19.patch, LUCENE-2308-2.patch, 
 LUCENE-2308-3.patch, LUCENE-2308-4.patch, LUCENE-2308-5.patch, 
 LUCENE-2308-6.patch, LUCENE-2308-7.patch, LUCENE-2308-8.patch, 
 LUCENE-2308-9.patch, LUCENE-2308-ltc.patch, LUCENE-2308.patch, 
 LUCENE-2308.patch


 This came up from discussions on IRC.  I'm summarizing here...
 Today when you make a Field to add to a document you can set things
 index or not, stored or not, analyzed or not, details like omitTfAP,
 omitNorms, index term vectors (separately controlling
 offsets/positions), etc.
 I think we should factor these out into a new class (FieldType?).
 Then you could re-use this FieldType instance across multiple fields.
 The Field instance would still hold the actual value.
 We could then do per-field analyzers by adding a setAnalyzer on the
 FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise
 for per-field codecs (with flex), where we now have
 PerFieldCodecWrapper).
 This would NOT be a schema!  It's just refactoring what we already
 specify today.  EG it's not serialized into the index.
 This has been discussed before, and I know Michael Busch opened a more
 ambitious (I think?) issue.  I think this is a good first baby step.  We could
 consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold
 off on that for starters...




[jira] [Commented] (SOLR-2565) Prevent IW#close and cut over to IW#commit

2011-08-01 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13075980#comment-13075980
 ] 

Jason Rutherglen commented on SOLR-2565:


This issue says committed in the comments, however its status is: Unresolved?

 Prevent IW#close and cut over to IW#commit
 --

 Key: SOLR-2565
 URL: https://issues.apache.org/jira/browse/SOLR-2565
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-2565.patch


 Spinoff from SOLR-2193. We already have a branch to work on this issue here: 
 https://svn.apache.org/repos/asf/lucene/dev/branches/solr2193 
 The main goal here is to prevent Solr from closing the IW and use IW#commit 
 instead. AFAIK the main issues here are:
 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 2. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 3. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 4. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.
 Eventually this is a preparation for NRT support in Solr which I will create 
 a followup issue for.
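Goal 2 above, lazy writer init with commit instead of close, can be sketched as follows. This is a hedged sketch under assumed names; it is not Solr's actual update handler, and the Writer stand-in only models the two calls that matter here.

```java
// Hedged sketch of goal 2 (lazy IndexWriter init + IW#commit instead of
// IW#close). Names are hypothetical, not Solr's actual update handler.
public class CommitNotCloseSketch {

    // Minimal stand-in for IndexWriter with just the calls the sketch needs.
    static class Writer {
        int commits = 0;
        boolean closed = false;
        void commit() { commits++; }
        void close()  { closed = true; }
    }

    private Writer writer;   // created lazily on first use, as before

    Writer writer() {
        if (writer == null) {
            writer = new Writer();
        }
        return writer;
    }

    // Old behavior closed (and later re-opened) the writer on each commit;
    // new behavior just commits, keeping one writer instance alive.
    void onCommit() {
        writer().commit();
    }

    public static void main(String[] args) {
        CommitNotCloseSketch handler = new CommitNotCloseSketch();
        handler.onCommit();
        handler.onCommit();
        System.out.println(handler.writer().commits); // prints "2"
    }
}
```

Keeping the writer open is also what makes the iwAccess/iwCommit locks droppable and avoids two "reloaded" cores racing to re-open writers on the same index.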




[jira] [Commented] (SOLR-2565) Prevent IW#close and cut over to IW#commit

2011-08-01 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13075993#comment-13075993
 ] 

Mark Miller commented on SOLR-2565:
---

Yeah, sorry - it's open as a reminder for me to write that changes note (or at 
least evaluate whether something should be done) and do the wiki documentation. 
I'll try to do that tomorrow if I can and get this closed.

 Prevent IW#close and cut over to IW#commit
 --

 Key: SOLR-2565
 URL: https://issues.apache.org/jira/browse/SOLR-2565
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Mark Miller
 Fix For: 4.0

 Attachments: SOLR-2565.patch


 Spinoff from SOLR-2193. We already have a branch to work on this issue here: 
 https://svn.apache.org/repos/asf/lucene/dev/branches/solr2193 
 The main goal here is to prevent Solr from closing the IW and use IW#commit 
 instead. AFAIK the main issues here are:
 The update handler needs an overhaul.
 A few goals I think we might want to look at:
 1. Expose the SolrIndexWriter in the api or add the proper abstractions to 
 get done what we now do with special casing:
 2. Stop closing the IndexWriter and start using commit (still lazy IW init 
 though).
 3. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
 4. Address the current issues we face because multiple original/'reloaded' 
 cores can have a different IndexWriter on the same index.
 Eventually this is a preparation for NRT support in Solr which I will create 
 a followup issue for.




[jira] [Commented] (LUCENE-2308) Separately specify a field's type

2011-08-01 Thread Nikola Tankovic (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13076033#comment-13076033
 ] 

Nikola Tankovic commented on LUCENE-2308:
-

Yes, exactly! That is my next step, shouldn't take long.

 Separately specify a field's type
 -

 Key: LUCENE-2308
 URL: https://issues.apache.org/jira/browse/LUCENE-2308
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
  Labels: gsoc2011, lucene-gsoc-11, mentor
 Fix For: 4.0

 Attachments: LUCENE-2308-10.patch, LUCENE-2308-11.patch, 
 LUCENE-2308-12.patch, LUCENE-2308-13.patch, LUCENE-2308-14.patch, 
 LUCENE-2308-15.patch, LUCENE-2308-16.patch, LUCENE-2308-17.patch, 
 LUCENE-2308-18.patch, LUCENE-2308-19.patch, LUCENE-2308-2.patch, 
 LUCENE-2308-3.patch, LUCENE-2308-4.patch, LUCENE-2308-5.patch, 
 LUCENE-2308-6.patch, LUCENE-2308-7.patch, LUCENE-2308-8.patch, 
 LUCENE-2308-9.patch, LUCENE-2308-ltc.patch, LUCENE-2308.patch, 
 LUCENE-2308.patch


 This came up from discussions on IRC.  I'm summarizing here...
 Today when you make a Field to add to a document you can set things
 index or not, stored or not, analyzed or not, details like omitTfAP,
 omitNorms, index term vectors (separately controlling
 offsets/positions), etc.
 I think we should factor these out into a new class (FieldType?).
 Then you could re-use this FieldType instance across multiple fields.
 The Field instance would still hold the actual value.
 We could then do per-field analyzers by adding a setAnalyzer on the
 FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise
 for per-field codecs (with flex), where we now have
 PerFieldCodecWrapper).
 This would NOT be a schema!  It's just refactoring what we already
 specify today.  EG it's not serialized into the index.
 This has been discussed before, and I know Michael Busch opened a more
 ambitious (I think?) issue.  I think this is a good first baby step.  We could
 consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold
 off on that for starters...




[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9930 - Failure

2011-08-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9930/

1 tests failed.
REGRESSION:  
org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest.testMergeIndexesByCoreName

Error Message:
org.apache.solr.client.solrj.SolrServerException: org.apache.lucene.store.LockReleaseFailedException: failed to delete /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/example/multicore/core1/data/index/org.apache.solr.core.RefCntRamDirectory@46b4be3d lockFactory=org.apache.lucene.store.SimpleFSLockFactory@235f4a7f-write.lock

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.SolrServerException: org.apache.lucene.store.LockReleaseFailedException: failed to delete /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/example/multicore/core1/data/index/org.apache.solr.core.RefCntRamDirectory@46b4be3d lockFactory=org.apache.lucene.store.SimpleFSLockFactory@235f4a7f-write.lock
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:153)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.MergeIndexesExampleTestBase.setupCores(MergeIndexesExampleTestBase.java:90)
    at org.apache.solr.client.solrj.MergeIndexesExampleTestBase.testMergeIndexesByCoreName(MergeIndexesExampleTestBase.java:145)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1335)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1240)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.lucene.store.LockReleaseFailedException: failed to delete /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/example/multicore/core1/data/index/org.apache.solr.core.RefCntRamDirectory@46b4be3d lockFactory=org.apache.lucene.store.SimpleFSLockFactory@235f4a7f-write.lock
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:142)
Caused by: org.apache.lucene.store.LockReleaseFailedException: failed to delete /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/solr/example/multicore/core1/data/index/org.apache.solr.core.RefCntRamDirectory@46b4be3d lockFactory=org.apache.lucene.store.SimpleFSLockFactory@235f4a7f-write.lock
    at org.apache.lucene.store.SimpleFSLock.release(SimpleFSLockFactory.java:133)
    at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1885)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1815)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1779)
    at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:143)
    at org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:183)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:416)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
    at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:107)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:71)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:140)
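The root cause here is a stale `write.lock` that `SimpleFSLock.release()` could not delete when the writer closed. As a rough illustration only (plain `java.io.File`, not Lucene's API; the class and method names below are invented for this sketch), a manual cleanup of a leftover lock file, assuming no other writer is active, could look like:

```java
import java.io.File;

public class StaleLockCleanup {

    /**
     * Attempts to remove a leftover write.lock from an index directory.
     * Returns true if the lock is absent or was deleted; false if the
     * delete failed (e.g. another process still holds the file open).
     */
    public static boolean clearStaleLock(File indexDir) {
        File lock = new File(indexDir, "write.lock");
        if (!lock.exists()) {
            return true;       // nothing to clean up
        }
        return lock.delete();  // mirrors what SimpleFSLock.release() does
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println("lock cleared: " + clearStaleLock(dir));
    }
}
```

On a real index, `IndexWriter.unlock(Directory)` is the supported way to clear a stale lock; deleting the file by hand is only safe when you are certain no writer is still running against that directory.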




Build Log (for compile errors):
[...truncated 14167 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3355) Incorrect behaviour of MultiFieldQueryNodeProcessor when default operator is 'AND'

2011-08-01 Thread Trejkaz (JIRA)
Incorrect behaviour of MultiFieldQueryNodeProcessor when default operator is 'AND'
--

 Key: LUCENE-3355
 URL: https://issues.apache.org/jira/browse/LUCENE-3355
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/queryparser
Affects Versions: 3.3
Reporter: Trejkaz


StandardQueryNodeProcessorPipeline runs MultiFieldQueryNodeProcessor before 
GroupQueryNodeProcessor.

MultiFieldQueryNodeProcessor, if it encounters a node with no field, will do 
this:
{code}
return new GroupQueryNode(new BooleanQueryNode(children));
{code}

GroupQueryNodeProcessor comes along later on, sees that no operator is 
specified, so it applies the default operator, which, if set to 'AND', results 
in:

{code}
+properties:text +text:text
{code}

Which I don't think matches the intent of the multi-field processor.
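To make the interaction concrete, here is a small self-contained sketch (plain strings, not Lucene's query node API; the class and method names are invented for illustration) of the two readings described above:

```java
import java.util.ArrayList;
import java.util.List;

public class MultiFieldExpansionDemo {

    /** Expands an unfielded term across the configured fields,
     *  one clause per field, as the multi-field step does. */
    public static List<String> expand(String term, String[] fields) {
        List<String> clauses = new ArrayList<String>();
        for (String f : fields) {
            clauses.add(f + ":" + term);
        }
        return clauses;
    }

    /** Models the group step applying the default AND operator to
     *  clauses that carry no explicit operator: every clause becomes
     *  a required (+) clause. */
    public static String applyDefaultAnd(List<String> clauses) {
        StringBuilder sb = new StringBuilder();
        for (String c : clauses) {
            if (sb.length() > 0) sb.append(' ');
            sb.append('+').append(c);
        }
        return sb.toString();
    }

    /** The arguably intended reading: the expansion is a disjunction,
     *  so the term may match in any one of the fields. */
    public static String asDisjunction(List<String> clauses) {
        return "(" + String.join(" OR ", clauses) + ")";
    }

    public static void main(String[] args) {
        List<String> clauses = expand("text", new String[] {"properties", "text"});
        System.out.println(applyDefaultAnd(clauses));  // +properties:text +text:text
        System.out.println(asDisjunction(clauses));    // (properties:text OR text:text)
    }
}
```

The first output requires the term in every field, the second in any field, which is the difference at issue.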


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira






[jira] [Commented] (LUCENE-3335) jrebug causes porter stemmer to sigsegv

2011-08-01 Thread Shay Banon (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076053#comment-13076053
 ] 

Shay Banon commented on LUCENE-3335:


@Uwe I actually forgot about this, and did not think it was because of the 
porter stemmer at the time, especially since I tried to reproduce it and 
never managed to (I thought it was a coincidence that it crashed there). In my 
experience, you get very little help from Sun/Oracle when using unorthodox 
flags like aggressive opts without a proper reproduction. Well, you get very little 
help there even when you do provide a reproduction... (see this issue that I opened, 
for example: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7066129). I am 
the reason behind the Lucene 1.9.1 release, with the major buffering bug 
introduced in 1.9, way back in the day; do you really think I would not make 
contact if I thought there really was a problem associated with Lucene?

 jrebug causes porter stemmer to sigsegv
 ---

 Key: LUCENE-3335
 URL: https://issues.apache.org/jira/browse/LUCENE-3335
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 1.9, 1.9.1, 2.0.0, 2.1, 2.2, 2.3, 2.3.1, 2.3.2, 2.4, 
 2.4.1, 2.9, 2.9.1, 2.9.2, 2.9.3, 2.9.4, 3.0, 3.0.1, 3.0.2, 3.0.3, 3.1, 3.2, 
 3.3, 3.4, 4.0
 Environment: - JDK 7 Preview Release, GA (may also affect update _1, 
 targeted fix is JDK 1.7.0_2)
 - JDK 1.6.0_20+ with -XX:+OptimizeStringConcat or -XX:+AggressiveOpts
Reporter: Robert Muir
Assignee: Robert Muir
  Labels: Java7
 Attachments: LUCENE-3335.patch, LUCENE-3335_slow.patch, 
 patch-0uwe.patch


 happens easily on java7: ant test -Dtestcase=TestPorterStemFilter 
 -Dtests.iter=100
 might happen on 1.6.0_u26 too, a user reported something that looks like the 
 same bug already:
 http://www.lucidimagination.com/search/document/3beaa082c4d2fdd4/porterstemfilter_kills_jvm



