[jira] [Created] (SOLR-2539) VectorValueSource returns wrong floatVal from DocValues

2011-05-24 Thread tom liu (JIRA)
VectorValueSource returns wrong floatVal from DocValues
-

 Key: SOLR-2539
 URL: https://issues.apache.org/jira/browse/SOLR-2539
 Project: Solr
  Issue Type: Bug
  Components: search
 Environment: JDK1.6/Tomcat6
Reporter: tom liu


{code}
@Override
public void floatVal(int doc, float[] vals) {
  vals[0] = x.byteVal(doc);
  vals[1] = y.byteVal(doc);
}
{code}
should be:
{code}
@Override
public void floatVal(int doc, float[] vals) {
  vals[0] = x.floatVal(doc);
  vals[1] = y.floatVal(doc);
}
{code}
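For illustration, here is a self-contained sketch of why the byteVal calls corrupt the result: the float-to-byte cast truncates the fractional part and wraps any value outside [-128, 127]. FakeValueSource below is a hypothetical stand-in for the real DocValues sources, not Lucene code.

```java
// Hypothetical stand-in for a per-document value source; not the real Lucene class.
class FakeValueSource {
    private final float[] values;
    FakeValueSource(float... values) { this.values = values; }
    float floatVal(int doc) { return values[doc]; }
    // Mirrors DocValues.byteVal: narrows the per-doc value to a byte.
    byte byteVal(int doc) { return (byte) values[doc]; }
}

public class VectorValueSourceBug {
    // Bug: the byte narrowing drops the fraction and wraps large values.
    public static float[] buggyFloatVal(FakeValueSource x, FakeValueSource y, int doc) {
        float[] vals = new float[2];
        vals[0] = x.byteVal(doc);
        vals[1] = y.byteVal(doc);
        return vals;
    }

    // Fix: pass the float values through unchanged.
    public static float[] fixedFloatVal(FakeValueSource x, FakeValueSource y, int doc) {
        float[] vals = new float[2];
        vals[0] = x.floatVal(doc);
        vals[1] = y.floatVal(doc);
        return vals;
    }

    public static void main(String[] args) {
        FakeValueSource x = new FakeValueSource(300.5f);
        FakeValueSource y = new FakeValueSource(2.75f);
        System.out.println(buggyFloatVal(x, y, 0)[0]); // 44.0: byte cast wrapped 300 to 44
        System.out.println(fixedFloatVal(x, y, 0)[0]); // 300.5
    }
}
```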


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

2011-05-24 Thread Ron Mayer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038437#comment-13038437
 ] 

Ron Mayer commented on SOLR-2058:
-

Nick, thanks!  Glad you like it.

I'm keeping a version that's kept more up-to-date with trunk on github here:
https://github.com/ramayer/lucene-solr/tree/solr_2058_edismax_pf2_phrase_slop



> Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with 
> field~slop^boost syntax
> 
>
> Key: SOLR-2058
> URL: https://issues.apache.org/jira/browse/SOLR-2058
> Project: Solr
>  Issue Type: Improvement
>  Components: SearchComponents - other
>Affects Versions: 3.1, 4.0
> Environment: n/a
>Reporter: Ron Mayer
>Priority: Minor
> Attachments: edismax_pf_with_slop_v2.1.patch, 
> edismax_pf_with_slop_v2.patch, pf2_with_slop.patch
>
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3E
> {quote}
> From  Ron Mayer 
> ... my results might be even better if I had a couple different "pf2"s with 
> different "ps"'s at the same time. In particular, one with ps=0 to put a 
> high boost on ones that have the right ordering of words, for example 
> ensuring that [the query]:
>   "red hat black jacket"
> boosts only documents with "red hats" and not "black hats". And another 
> pf2 with a more modest boost, with ps=5 or so, to handle the query above also 
> boosting docs with 
>   "red baseball hat".
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3E]
> {quote}
> From  Yonik Seeley 
> Perhaps fold it into the pf/pf2 syntax?
> pf=text^2// current syntax... makes phrases with a boost of 2
> pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
> a boost of 2
> That actually seems pretty natural given the lucene query syntax - an
> actual boosted sloppy phrase query already looks like
> {{text:"foo bar"~1^2}}
> -Yonik
> {quote}
> [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3E]
> {quote}
> From  Chris Hostetter 
> Big +1 to this idea ... the existing "ps" param can stick around as the 
> default for any field that doesn't specify its own slop in the pf/pf2/pf3 
> fields using the "~" syntax.
> -Hoss
> {quote}
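To make the proposed syntax concrete, here is a hypothetical tokenizer for a single pf entry of the form field~slop^boost. The actual edismax parsing lives in the attached patches; this is only an illustrative sketch, and the class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one pf/pf2/pf3 entry like "text~1^2";
// not the real edismax implementation from the SOLR-2058 patch.
public class PfEntry {
    final String field;
    final Integer slop;   // null -> fall back to the global ps param
    final Float boost;    // null -> no explicit boost

    // field, then optional ~slop, then optional ^boost
    private static final Pattern SYNTAX =
        Pattern.compile("([^~^]+)(?:~(\\d+))?(?:\\^([\\d.]+))?");

    PfEntry(String field, Integer slop, Float boost) {
        this.field = field; this.slop = slop; this.boost = boost;
    }

    static PfEntry parse(String entry) {
        Matcher m = SYNTAX.matcher(entry);
        if (!m.matches()) throw new IllegalArgumentException(entry);
        return new PfEntry(m.group(1),
                m.group(2) == null ? null : Integer.valueOf(m.group(2)),
                m.group(3) == null ? null : Float.valueOf(m.group(3)));
    }

    public static void main(String[] args) {
        PfEntry e = parse("text~1^2");
        System.out.println(e.field + " slop=" + e.slop + " boost=" + e.boost);
    }
}
```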




[jira] [Issue Comment Edited] (SOLR-2058) Adds optional "phrase slop" to edismax "pf2", "pf3" and "pf" parameters with field~slop^boost syntax

2011-05-24 Thread Ron Mayer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038437#comment-13038437
 ] 

Ron Mayer edited comment on SOLR-2058 at 5/24/11 8:18 AM:
--

Nick, thanks!  Glad you like it.

I'm keeping a version that's kept more up-to-date with trunk on github here:
https://github.com/ramayer/lucene-solr/tree/solr_2058_edismax_pf2_phrase_slop

(though I must admit I've tested that less than an internal fork of trunk we 
made in Sep 2010 and deployed with only a few additional cherry-picked patches)

  was (Author: ramayer):
Nick, thanks!  Glad you like it.

I'm keeping a version that's kept more up-to-date with trunk on github here:
https://github.com/ramayer/lucene-solr/tree/solr_2058_edismax_pf2_phrase_slop


  




[jira] [Commented] (LUCENE-3133) Fix QueryParser to handle nested fields

2011-05-24 Thread Mark Harwood (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038448#comment-13038448
 ] 

Mark Harwood commented on LUCENE-3133:
--

bq.  I wonder if LUCENE-2454 could be extended to allow recursive 
ChildDocumentQuery 

No need to extend. This can be done today by nesting one NestedDocumentQuery 
inside another.
The only thing you need to do is set the "ParentsFilter" to roll results up to 
the appropriate level, e.g. parent/child/grandchild.

bq. is there a "normal" use case where you would want to put same field name on 
both parent & child docs?

I wouldn't want to rule that possibility out, e.g. a person has a name and age, 
and their sons and daughters have names and ages too.



> Fix QueryParser to handle nested fields
> ---
>
> Key: LUCENE-3133
> URL: https://issues.apache.org/jira/browse/LUCENE-3133
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: 3.2, 4.0
>
>
> Once we commit LUCENE-2454, we need to make it easy for apps to enable this 
> with QueryParser.
> It seems like it's a "schema" like behavior, ie we need to be able to express 
> the join structure of the related fields.
> And then whenever QP produces a query that spans fields requiring a join, the 
> NestedDocumentQuery is used to wrap the child fields?




[JENKINS] Lucene-Solr-tests-only-3.x - Build # 8325 - Failure

2011-05-24 Thread Apache Jenkins Server
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/8325/

1 tests failed.
REGRESSION:  
org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads

Error Message:
thread Indexer 2: hit unexpected failure

Stack Trace:
junit.framework.AssertionFailedError: thread Indexer 2: hit unexpected failure
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1191)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1109)
at 
org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:238)




Build Log (for compile errors):
[...truncated 4948 lines...]






[jira] [Commented] (LUCENE-2454) Nested Document query support

2011-05-24 Thread Mark Harwood (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038460#comment-13038460
 ] 

Mark Harwood commented on LUCENE-2454:
--

Thanks for the patch work, Mike. I'll need to check LUCENE-3129 for equivalence 
with PerParentLimitQuery. It's certainly a central part of what I typically 
deploy for nested queries - pass 1 is usually a NestedDocumentQuery to get the 
best parents and pass 2 uses PerParentLimitQuery to get the best children for 
these best parents. Of course some apps can simply fetch ALL children for the 
top parents but in some cases summarising children is required (note: this is 
potentially a great solution for performance issues on highlighting big docs 
e.g. entire books).

I haven't benchmarked nextSetBit vs the existing "rewind" implementation, but I 
imagine it may be quicker. Parent-followed-by-children seems more natural from 
a user's point of view, however. I guess you could always keep the 
parent-then-child insertion order but flip the bitset (then cache it) for query 
execution if that was faster. Benchmarking rewind vs nextSetBit vs 
flip-then-nextSetBit would reveal all.

Thomas - maintaining a strict order of parent/child docs is important and the 
recently-committed LUCENE-3112 should help with this.
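For reference, scanning set bits forward with nextSetBit (shown here with java.util.BitSet; Lucene's OpenBitSet API is analogous) looks like this. The block layout in the comments reflects the parent-follows-children ordering discussed above.

```java
import java.util.Arrays;
import java.util.BitSet;

public class ParentBitsDemo {
    // Collect the positions of all set bits, scanning forward with nextSetBit.
    static int[] setBits(BitSet bits) {
        int[] out = new int[bits.cardinality()];
        int n = 0;
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            out[n++] = i;
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet parents = new BitSet();
        // In the nested-document layout, a set bit marks the last doc of each
        // block (the parent); the child docs fill the gaps before it.
        parents.set(3);  // parent of children 0..2
        parents.set(7);  // parent of children 4..6
        System.out.println(Arrays.toString(setBits(parents))); // [3, 7]
    }
}
```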

> Nested Document query support
> -
>
> Key: LUCENE-2454
> URL: https://issues.apache.org/jira/browse/LUCENE-2454
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: core/search
>Affects Versions: 3.0.2
>Reporter: Mark Harwood
>Assignee: Mark Harwood
>Priority: Minor
> Attachments: LUCENE-2454.patch, LuceneNestedDocumentSupport.zip
>
>
> A facility for querying nested documents in a Lucene index as outlined in 
> http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene




[JENKINS] Lucene-Solr-tests-only-3.x - Build # 8327 - Failure

2011-05-24 Thread Apache Jenkins Server
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/8327/

1 tests failed.
REGRESSION:  
org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads

Error Message:
thread Indexer 2: hit unexpected failure

Stack Trace:
junit.framework.AssertionFailedError: thread Indexer 2: hit unexpected failure
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1191)
at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1109)
at 
org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:238)




Build Log (for compile errors):
[...truncated 4940 lines...]






[jira] [Created] (LUCENE-3137) Benchmark's ExtractReuters creates its temp dir wrongly if the provided out-dir param ends with a slash

2011-05-24 Thread Doron Cohen (JIRA)
Benchmark's ExtractReuters creates its temp dir wrongly if the provided out-dir 
param ends with a slash
---

 Key: LUCENE-3137
 URL: https://issues.apache.org/jira/browse/LUCENE-3137
 Project: Lucene - Java
  Issue Type: Bug
  Components: modules/benchmark
Affects Versions: 3.2, 4.0
Reporter: Doron Cohen
Assignee: Doron Cohen
Priority: Minor


See LUCENE-929 for context.
As a result, it might fail to create the temp dir at all.
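The failure mode can be sketched as follows, assuming (as an illustration, not the actual ExtractReuters code) that the temp dir name is derived from the output path by plain string concatenation; with a trailing slash the suffix lands on an empty path component.

```java
import java.io.File;

// Sketch of the trailing-slash problem when deriving a sibling temp dir
// name from a user-supplied output path.
public class TrailingSlashDemo {
    // Naive derivation, illustrating the bug.
    static String tempDirName(String outDirParam) {
        return outDirParam + "-tmp";
    }

    // Normalizing through java.io.File drops the trailing slash first.
    static String fixedTempDirName(String outDirParam) {
        return new File(outDirParam).getPath() + "-tmp";
    }

    public static void main(String[] args) {
        System.out.println(tempDirName("work/reuters-out/"));      // work/reuters-out/-tmp
        System.out.println(fixedTempDirName("work/reuters-out/")); // work/reuters-out-tmp
    }
}
```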




Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 8327 - Failure

2011-05-24 Thread Michael McCandless
I'll dig -- likely this is from LUCENE-3112.

Mike

http://blog.mikemccandless.com

On Tue, May 24, 2011 at 6:34 AM, Apache Jenkins Server
 wrote:
> Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/8327/
>
> 1 tests failed.
> REGRESSION:  
> org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads
>
> Error Message:
> thread Indexer 2: hit unexpected failure
>
> Stack Trace:
> junit.framework.AssertionFailedError: thread Indexer 2: hit unexpected failure
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1191)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1109)
>        at 
> org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:238)
>
>
>
>
> Build Log (for compile errors):
> [...truncated 4940 lines...]
>
>
>



[jira] [Updated] (SOLR-2519) Improve the defaults for the "text" field type in default schema.xml

2011-05-24 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated SOLR-2519:
-

Attachment: SOLR-2519.patch

New patch, fixes the tutorial (mainly the text analysis section, but also lots 
of little "collateral improvements").

I think it's ready to commit!

> Improve the defaults for the "text" field type in default schema.xml
> 
>
> Key: SOLR-2519
> URL: https://issues.apache.org/jira/browse/SOLR-2519
> Project: Solr
>  Issue Type: Bug
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 3.2, 4.0
>
> Attachments: SOLR-2519.patch, SOLR-2519.patch, SOLR-2519.patch
>
>
> Spinoff from: http://lucene.markmail.org/thread/ww6mhfi3rfpngmc5
> The text fieldType in schema.xml is unusable for non-whitespace
> languages, because it has the dangerous auto-phrase feature (of
> Lucene's QP -- see LUCENE-2458) enabled.
> Lucene leaves this off by default, as does ElasticSearch
> (http://http://www.elasticsearch.org/).
> Furthermore, the "text" fieldType uses WhitespaceTokenizer when
> StandardTokenizer is a better cross-language default.
> Until we have language specific field types, I think we should fix
> the "text" fieldType to work well for all languages, by:
>   * Switching from WhitespaceTokenizer to StandardTokenizer
>   * Turning off auto-phrase
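For orientation, the proposal could look roughly like this in schema.xml. This is only a sketch and the attached SOLR-2519.patch is authoritative; the field type name and filter chain here are illustrative assumptions.

```xml
<!-- Sketch only: StandardTokenizer as the cross-language default,
     auto-phrase turned off via autoGeneratePhraseQueries="false". -->
<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```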




[jira] [Updated] (LUCENE-3137) Benchmark's ExtractReuters creates its temp dir wrongly if the provided out-dir param ends with a slash

2011-05-24 Thread Doron Cohen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doron Cohen updated LUCENE-3137:


Attachment: LUCENE-3137.patch

Simple patch solving this slash problem.





[jira] [Commented] (LUCENE-929) contrib/benchmark build doesn't handle checking if content is properly extracted

2011-05-24 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038498#comment-13038498
 ] 

Doron Cohen commented on LUCENE-929:


bq. Note, this fix doesn't work if the output dir has a trailing slash

I think this is a separate issue - I mean, the trailing-slash handling. 
Created LUCENE-3137 for it.

> contrib/benchmark build doesn't handle checking if content is properly 
> extracted
> 
>
> Key: LUCENE-929
> URL: https://issues.apache.org/jira/browse/LUCENE-929
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: modules/benchmark
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 3.1, 4.0
>
>
> The contrib/benchmark build does not properly handle checking to see if the 
> content (such as Reuters coll.) is properly extracted.  It only checks to see 
> if the directory exists.  Thus, it is possible that the directory gets 
> created and the extraction fails.  Then, the next time it is run, it skips 
> the extraction part and tries to continue on running the benchmark.
> The workaround is to manually delete the extraction directory.




[jira] [Commented] (LUCENE-929) contrib/benchmark build doesn't handle checking if content is properly extracted

2011-05-24 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038502#comment-13038502
 ] 

Doron Cohen commented on LUCENE-929:


There's now a simple patch for this in LUCENE-3137. 
I think this one can be closed?





[JENKINS-MAVEN] Lucene-Solr-Maven-3.x #132: POMs out of sync

2011-05-24 Thread Apache Jenkins Server
Build: https://builds.apache.org/hudson/job/Lucene-Solr-Maven-3.x/132/

No tests ran.

Build Log (for compile errors):
[...truncated 7388 lines...]






Solr Highlight Component Config

2011-05-24 Thread Lord Khan Han
Hi,

Can I limit the terms that the HighlightComponent uses? My queries are
generally long, and I want only specific terms to be highlighted, not the
rest. Is there an option like the SpellCheckComponent's, which uses q unless
spellcheck.q is specified? Is an hl.q parameter possible?


Or is there any other way to work around this?


PS: I need this by tomorrow (hopefully) to show my boss, who insists on some
other well-known commercial search engines.


Regards

PS: I also sent this message to the user list; I am not sure whether it is a
user or developer issue. Do I need to modify the component code myself?


Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 8327 - Failure

2011-05-24 Thread Michael McCandless
I reverted LUCENE-3112 from 3.x for now...

Mike

http://blog.mikemccandless.com

On Tue, May 24, 2011 at 7:00 AM, Michael McCandless
 wrote:
> I'll dig -- likely this is from LUCENE-3112.
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Tue, May 24, 2011 at 6:34 AM, Apache Jenkins Server
>  wrote:
>> Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/8327/
>>
>> 1 tests failed.
>> REGRESSION:  
>> org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads
>>
>> Error Message:
>> thread Indexer 2: hit unexpected failure
>>
>> Stack Trace:
>> junit.framework.AssertionFailedError: thread Indexer 2: hit unexpected 
>> failure
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1191)
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1109)
>>        at 
>> org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:238)
>>
>>
>>
>>
>> Build Log (for compile errors):
>> [...truncated 4940 lines...]
>>
>>
>>



[jira] [Commented] (LUCENE-2762) Don't leak deleted open file handles with pooled readers

2011-05-24 Thread Josef (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038533#comment-13038533
 ] 

Josef commented on LUCENE-2762:
---

Since I am still observing references to deleted index files, I have a 
quick question regarding this issue:

We are using Lucene 3.0.3 with the standard configuration and a near-real-time 
reader, as the following pseudo code shows:

{code}
...
// initially obtain and reference a near-real-time searcher
IndexSearcher currentSearcher = new IndexSearcher(writer.getReader());
...

// subsequently obtain a new near-real-time searcher if necessary
if (!currentSearcher.getIndexReader().isCurrent()) {
  IndexReader newReader = currentSearcher.getIndexReader().reopen();
  IndexSearcher newSearcher = new IndexSearcher(newReader);

  // release the old searcher (by decreasing its reference count)
  currentSearcher.getIndexReader().decRef();

  currentSearcher = newSearcher;
}
...
{code}

After running the application for a while with index updates, performing new 
searches using the newly obtained IndexSearcher, we noticed that the JVM still 
holds references to already deleted index files.
For example:
{code}
java  20742  xxx  394r   REG  8,6  366  3412376  /home/xxx/Desktop/mp.home/dev/index/Artist/_fc.cfs (deleted)
java  20742  xxx  398r   REG  8,6  366  3412375  /home/xxx/Desktop/mp.home/dev/index/Artist/_fq.cfs
java  20742  xxx  401uw  REG  8,6    0  3412333  /home/xxx/Desktop/mp.home/dev/index/Artist/write.lock
java  20742  xxx  415r   REG  8,6  128  3412349  /home/xxx/Desktop/mp.home/dev/index/Artist/_fp.tis
java  20742  xxx  416r   REG  8,6  366  3412341  /home/xxx/Desktop/mp.home/dev/index/Artist/_fd.cfs (deleted)
java  20742  xxx  417r   REG  8,6  366  3412344  /home/xxx/Desktop/mp.home/dev/index/Artist/_fe.cfs (deleted)
java  20742  xxx  418r   REG  8,6   71  3412356  /home/xxx/Desktop/mp.home/dev/index/Artist/_fb.tis (deleted)
java  20742  xxx  424r   REG  8,6    7  3412362  /home/xxx/Desktop/mp.home/dev/index/Artist/_fb.frq (deleted)
java  20742  xxx  425r   REG  8,6    7  3412363  /home/xxx/Desktop/mp.home/dev/index/Artist/_fb.prx (deleted)
java  20742  xxx  426r   REG  8,6   23  3412351  /home/xxx/Desktop/mp.home/dev/index/Artist/_fb.fdt (deleted)
java  20742  xxx  427r   REG  8,6   12  3412352  /home/xxx/Desktop/mp.home/dev/index/Artist/_fb.fdx (deleted)
java  20742  xxx  428r   REG  8,6   10  3412365  /home/xxx/Desktop/mp.home/dev/index/Artist/_fb.nrm (deleted)
java  20742  xxx  429r   REG  8,6   21  3412357  /home/xxx/Desktop/mp.home/dev/index/Artist/_fp.frq
java  20742  xxx  432r   REG  8,6   21  3412358  /home/xxx/Desktop/mp.home/dev/index/Artist/_fp.prx
java  20742  xxx  433r   REG  8,6   61  3412347  /home/xxx/Desktop/mp.home/dev/index/Artist/_fp.fdt
java  20742  xxx  434r   REG  8,6   28  3412348  /home/xxx/Desktop/mp.home/dev/index/Artist/_fp.fdx
java  20742  xxx  445r   REG  8,6   22  3412360  /home/xxx/Desktop/mp.home/dev/index/Artist/_fp.nrm
{code}

The application eventually hits the limit on the maximum number of open files 
and then stops.

Are we doing anything wrong here, or does the bug still exist in version 3.0.3?
Any advice is welcome.
Thanks!
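The discipline that avoids this kind of leak is strict reference counting: drop the old reader only after the reopened one is safely referenced, and let the last decRef release the underlying files. Below is a generic sketch with a hypothetical RefCounted class; in Lucene the real methods are IndexReader.incRef()/decRef(), and reopen() may return the same instance, which must not be released twice.

```java
// Generic sketch of the acquire/release discipline for swapping readers.
// Hypothetical class; stands in for IndexReader's incRef()/decRef().
class RefCounted {
    private int refCount = 1;      // the creator holds the first reference
    private boolean closed = false;
    synchronized void incRef() {
        if (closed) throw new IllegalStateException("already closed");
        refCount++;
    }
    synchronized void decRef() {
        if (--refCount == 0) closed = true;  // last reference: release resources
    }
    synchronized boolean isClosed() { return closed; }
}

public class SwapDemo {
    public static void main(String[] args) {
        RefCounted current = new RefCounted();
        RefCounted reopened = new RefCounted();  // e.g. from reader.reopen()
        // Swap: only after the new reader is safely referenced, drop the old.
        RefCounted old = current;
        current = reopened;
        old.decRef();
        System.out.println(old.isClosed());     // true: old reader fully released
        System.out.println(current.isClosed()); // false: new reader still live
    }
}
```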

> Don't leak deleted open file handles with pooled readers
> 
>
> Key: LUCENE-2762
> URL: https://issues.apache.org/jira/browse/LUCENE-2762
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 2.9.4, 3.0.3, 3.1, 4.0
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 2.9.4, 3.0.3, 3.1, 4.0
>
> Attachments: LUCENE-2762.patch
>
>
> If you have CFS enabled today, and pooling is enabled (either directly
> or because you've pulled an NRT reader), IndexWriter will hold open
> SegmentReaders against the non-CFS format of each merged segment.
> So even if you close all NRT readers you've pulled from the writer,
> you'll still see file handles open against files that have been
> deleted.
> This count will not grow unbounded, since it's limited by the number
> of segments in the index, but it's still a serious problem since the
> app had turned off CFS in the first place presumably to avoid risk of
> too-many-open-files.  It's also bad because it ties up disk space
> since these files would otherwise be deleted.


[jira] [Commented] (LUCENE-929) contrib/benchmark build doesn't handle checking if content is properly extracted

2011-05-24 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038534#comment-13038534
 ] 

Grant Ingersoll commented on LUCENE-929:


Doron, that's fine to open a new issue and close this one, but it was this 
issue's fix that introduced the bug.





[jira] [Commented] (LUCENE-3112) Add IW.add/updateDocuments to support nested documents

2011-05-24 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038538#comment-13038538
 ] 

Steven Rowe commented on LUCENE-3112:
-

Mike, when you put the branch_3x port back: contrib/misc's IndexSplitter.java 
has javadoc references to IndexWriter#addDocuments(Iterable) instead of 
IW#aDs(Collection) - this triggers javadoc warnings which fail the build.

> Add IW.add/updateDocuments to support nested documents
> --
>
> Key: LUCENE-3112
> URL: https://issues.apache.org/jira/browse/LUCENE-3112
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3112.patch, LUCENE-3112.patch
>
>
> I think nested documents (LUCENE-2454) is a very compelling addition
> to Lucene.  It's also a popular (many votes) issue.
> Beyond supporting nested document querying, which is already an
> incredible addition since it preserves the relational model on
> indexing normalized content (eg, DB tables, XML docs), LUCENE-2454
> should also enable speedups in grouping implementation when you group
> by a nested field.
> For the same reason, it can also enable very fast post-group facet
> counting impl (LUCENE-3097) when you want to
> count(distinct(nestedField)), instead of unique documents, as your
> "identifier".  I expect many apps that use faceting need this ability
> (to count(distinct(nestedField)) not distinct(docID)).
> To support these use cases, I believe the only core change needed is
> the ability to atomically add or update multiple documents, which you
> cannot do today since in between add/updateDocument calls a flush (eg
> due to commit or getReader()) could occur.
> This new API (addDocuments(Iterable), updateDocuments(Term
> delTerm, Iterable) would also further guarantee that the
> documents are assigned sequential docIDs in the order the iterator
> provided them, and that the docIDs all reside in one segment.
> Segment merging never splits segments apart, so this invariant would
> hold even as merges/optimizes take place.
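As a toy model of the atomicity requirement described above (hypothetical classes; the real logic lives in IndexWriter and DocumentsWriter): the block append and the segment cut synchronize on the same lock, so a flush can only ever observe whole blocks and the docs of one block always land in one segment, in order.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of atomic addDocuments: a concurrent flush (a segment cut)
// must never land between the documents of one block.
public class BlockAddDemo {
    private final List<String> currentSegment = new ArrayList<>();
    private final List<List<String>> segments = new ArrayList<>();

    // Atomic: all docs of the block get sequential positions in one segment.
    public synchronized void addDocuments(List<String> block) {
        currentSegment.addAll(block);
    }

    // A flush observes only whole blocks, never a partially added one.
    public synchronized void flush() {
        if (!currentSegment.isEmpty()) {
            segments.add(new ArrayList<>(currentSegment));
            currentSegment.clear();
        }
    }

    public synchronized int segmentCount() { return segments.size(); }

    public synchronized int segmentSize(int i) { return segments.get(i).size(); }
}
```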




[jira] [Commented] (LUCENE-3126) IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already

2011-05-24 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038558#comment-13038558
 ] 

Shai Erera commented on LUCENE-3126:


bq. Right, these too only live outside a CFS. You create them by opening a 
writable IndexReader (I know: confusing!) and calling setNorm, then closing it. 
They are not only for old indices... 4.0 creates them too.

Thanks Mike! I was able to reproduce it and fix it (+ added a test). Are there 
other files that are normally created outside the .cfs? I've sometimes seen 
the stored fields of a CFS created outside. Was that only for shared doc 
stores?

bq. More generally: does addIndexes properly refuse to import a too-old index? 
We should throw IndexFormatTooOldExc in this case? (And, maybe also 
IndexFormatTooNewExc?).

Not today. I believe it will fail in later stages (e.g. commit()), but we'd 
better fail up front. I think it's a separate issue though, only for 4.0 (b/c 
3x supports all formats today)?

> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't 
> already
> -
>
> Key: LUCENE-3126
> URL: https://issues.apache.org/jira/browse/LUCENE-3126
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming 
> segments. However, if IndexWriter's MP wants to create CFS (in general), 
> there's no reason not to turn the incoming non-CFS segments into CFS. We 
> copy them anyway, and if MP is not against CFS, we should create a CFS out of 
> them.
> Will need to use CFW; not sure it's ready for that w/ the current API (I'll need 
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes 
> MP.noCFSRatio) b/c, like I wrote above, we copy it anyway. However, if you 
> think otherwise, speak up :).
> I'll take a look at this in the next few days.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2280) commitWithin ignored for a delete query

2011-05-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038560#comment-13038560
 ] 

Jan Høydahl commented on SOLR-2280:
---

+1. I have client-side code explicitly working around this bug; it would be 
nice to have a fix in 3.2.

> commitWithin ignored for a delete query
> ---
>
> Key: SOLR-2280
> URL: https://issues.apache.org/jira/browse/SOLR-2280
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Reporter: David Smiley
>Priority: Minor
>
> The commitWithin option on an UpdateRequest is only honored for requests 
> containing new documents.  It does not, for example, work with a delete 
> query.  The following doesn't work as expected:
> {code:java}
> UpdateRequest request = new UpdateRequest();
> request.deleteById("id123");
> request.setCommitWithin(1000);
> solrServer.request(request);
> {code}
> In my opinion, the commitWithin attribute should be permitted on the 
> <delete> xml tag as well as <add>.  Such a change would go in 
> XMLLoader.java and it would have some ramifications elsewhere too.  Once 
> this is done, then UpdateRequest.getXml() can be updated to generate the 
> right XML.
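As a sketch of what the proposed change would emit, here is a delete payload carrying a commitWithin attribute, built with Python's standard library. Note that honoring commitWithin on a delete is the change this issue proposes, not current Solr behavior, and the helper name is invented:

```python
import xml.etree.ElementTree as ET

def delete_by_id_xml(doc_id, commit_within_ms):
    # Builds <delete commitWithin="1000"><id>id123</id></delete>,
    # the form UpdateRequest.getXml() could emit once the attribute
    # is honored for deletes as well as adds.
    delete = ET.Element("delete", commitWithin=str(commit_within_ms))
    ET.SubElement(delete, "id").text = doc_id
    return ET.tostring(delete, encoding="unicode")

payload = delete_by_id_xml("id123", 1000)
print(payload)  # → <delete commitWithin="1000"><id>id123</id></delete>
```

Until then, the client-side workaround is to follow the delete with an explicit commit request.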

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2921) Now that we track the code version at the segment level, we can stop tracking it also in each file level

2011-05-24 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038561#comment-13038561
 ] 

Shai Erera commented on LUCENE-2921:


On this thread (http://lucene.markmail.org/thread/2puabohgbkgtbq7o) we 
discussed the option of using Version in SI.getVersion(), and then we'll be 
able to easily compare segment versions. Also, after we do this, 
StringHelper.versionComparator can be deleted.
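For context, the reason a dedicated comparator (or a Version type) is needed at all: version strings must be compared component-wise, not lexicographically, or "3.10" sorts before "3.2". A minimal sketch of the numeric comparison (illustrative only, not StringHelper's actual code):

```python
def version_key(version):
    # "3.2" -> (3, 2); tuple comparison is numeric per component,
    # unlike plain string comparison where "3.10" < "3.2".
    return tuple(int(part) for part in version.split("."))

assert version_key("3.10") > version_key("3.2")
assert "3.10" < "3.2"  # the lexicographic trap

print(sorted(["4.0", "3.10", "3.2"], key=version_key))  # → ['3.2', '3.10', '4.0']
```

A first-class Version value on SegmentInfo would make this comparison free and type-safe, which is the point of the thread quoted above.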

> Now that we track the code version at the segment level, we can stop tracking 
> it also in each file level
> 
>
> Key: LUCENE-2921
> URL: https://issues.apache.org/jira/browse/LUCENE-2921
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
> Fix For: 3.2, 4.0
>
>
> Now that we track the code version that created the segment at the segment 
> level, we can stop tracking versions in each file. This has several major 
> benefits:
> # Today the constant names used to track versions are confusing: they do 
> not state from which version they apply, and so it's harder to determine 
> which formats we can stop supporting when working on the next major release.
> # Those format numbers are usually negative, but in some cases positive 
> (inconsistent); for the negative ones we need to remember that "increasing" 
> the format means going one down, which I always find confusing.
> # It will remove the format tracking from all the *Writers, and the *Reader 
> will receive the code format (String) and work w/ the appropriate constant 
> (e.g. Constants.LUCENE_30). Centralizing version tracking to SegmentInfo is 
> an advantage IMO.
> It's not urgent that we do it for 3.1 (though it requires an index format 
> change), because starting from 3.1 all segments track their version number 
> anyway (or migrated to track it), so we can safely release it in a follow-on 3x 
> release.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #130: POMs out of sync

2011-05-24 Thread Apache Jenkins Server
Build: https://builds.apache.org/hudson/job/Lucene-Solr-Maven-trunk/130/

No tests ran.

Build Log (for compile errors):
[...truncated 9316 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3126) IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already

2011-05-24 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3126:
---

Attachment: LUCENE-3126.patch

Patch w/ all fixes. Tests pass. No CHANGES entry yet, I'll add it in the next 
patch (after some comments).

> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't 
> already
> -
>
> Key: LUCENE-3126
> URL: https://issues.apache.org/jira/browse/LUCENE-3126
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3126.patch, LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming 
> segments. However, if IndexWriter's MP wants to create CFS (in general), 
> there's no reason not to turn the incoming non-CFS segments into CFS. We 
> copy them anyway, and if MP is not against CFS, we should create a CFS out of 
> them.
> Will need to use CFW; not sure it's ready for that w/ the current API (I'll need 
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes 
> MP.noCFSRatio) b/c, like I wrote above, we copy it anyway. However, if you 
> think otherwise, speak up :).
> I'll take a look at this in the next few days.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3138) IW.addIndexes should fail fast if an index is too old/new

2011-05-24 Thread Shai Erera (JIRA)
IW.addIndexes should fail fast if an index is too old/new
-

 Key: LUCENE-3138
 URL: https://issues.apache.org/jira/browse/LUCENE-3138
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Shai Erera
Priority: Minor
 Fix For: 4.0


Today IW.addIndexes (both the Dir and IR versions) does not check the format of 
the incoming indexes. Therefore it could add a too-old/too-new index, and the 
app would discover that only later, maybe during commit() or segment merges. We 
should check that up front and fail fast.

This issue is relevant only to 4.0 at the moment, which will not support 2.x 
indexes anymore.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3112) Add IW.add/updateDocuments to support nested documents

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038583#comment-13038583
 ] 

Michael McCandless commented on LUCENE-3112:


Argh!  Thanks Steven -- I will fix those jdocs 2nd time around.

> Add IW.add/updateDocuments to support nested documents
> --
>
> Key: LUCENE-3112
> URL: https://issues.apache.org/jira/browse/LUCENE-3112
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3112.patch, LUCENE-3112.patch
>
>
> I think nested documents (LUCENE-2454) is a very compelling addition
> to Lucene.  It's also a popular (many votes) issue.
> Beyond supporting nested document querying, which is already an
> incredible addition since it preserves the relational model on
> indexing normalized content (eg, DB tables, XML docs), LUCENE-2454
> should also enable speedups in grouping implementation when you group
> by a nested field.
> For the same reason, it can also enable very fast post-group facet
> counting impl (LUCENE-3097) when you want to
> count(distinct(nestedField)), instead of unique documents, as your
> "identifier".  I expect many apps that use faceting need this ability
> (to count(distinct(nestedField)) not distinct(docID)).
> To support these use cases, I believe the only core change needed is
> the ability to atomically add or update multiple documents, which you
> cannot do today since in between add/updateDocument calls a flush (eg
> due to commit or getReader()) could occur.
> This new API (addDocuments(Iterable<Document> docs), updateDocuments(Term
> delTerm, Iterable<Document> docs)) would also further guarantee that the
> documents are assigned sequential docIDs in the order the iterator
> provided them, and that the docIDs all reside in one segment.
> Segment merging never splits segments apart, so this invariant would
> hold even as merges/optimizes take place.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2280) commitWithin ignored for a delete query

2011-05-24 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-2280:
--

Fix Version/s: 3.2

> commitWithin ignored for a delete query
> ---
>
> Key: SOLR-2280
> URL: https://issues.apache.org/jira/browse/SOLR-2280
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Reporter: David Smiley
>Priority: Minor
> Fix For: 3.2
>
>
> The commitWithin option on an UpdateRequest is only honored for requests 
> containing new documents.  It does not, for example, work with a delete 
> query.  The following doesn't work as expected:
> {code:java}
> UpdateRequest request = new UpdateRequest();
> request.deleteById("id123");
> request.setCommitWithin(1000);
> solrServer.request(request);
> {code}
> In my opinion, the commitWithin attribute should be permitted on the 
> <delete> xml tag as well as <add>.  Such a change would go in 
> XMLLoader.java and it would have some ramifications elsewhere too.  Once 
> this is done, then UpdateRequest.getXml() can be updated to generate the 
> right XML.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Shai Erera (JIRA)
LuceneTestCase.afterClass does not print enough information if a temp-test-dir 
fails to delete
--

 Key: LUCENE-3139
 URL: https://issues.apache.org/jira/browse/LUCENE-3139
 Project: Lucene - Java
  Issue Type: Test
  Components: general/test
Reporter: Shai Erera
Priority: Minor
 Fix For: 3.2, 4.0


I've hit an exception from LTC.afterClass when _TestUtil.rmDir failed (on 
write.lock, as if some test did not release resources). However, I had no idea 
which test caused that (i.e. opened the temp directory and did not release 
resources).

I think we should do the following:
* Track in LTC a map from dirName -> StackTraceElement
* In afterClass, if _TestUtil.rmDir fails, print the STE of that particular dir, 
so we know where this directory was created
* Make tempDirs private and create an accessor method, so that we control the 
inserts to this map (today the Set is updated by LTC, _TestUtils and 
TestBackwards!)
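The tracking idea can be sketched in a few lines of Python (illustrative of the mechanism only; LuceneTestCase's real code is Java and the names here, like register_temp_dir, are invented):

```python
import traceback

_temp_dirs = {}  # dir name -> stack frames captured at creation time

def register_temp_dir(name):
    # Record where the directory was created so a failed cleanup
    # can point back at the offending test.
    _temp_dirs[name] = traceback.format_stack()

def report_cleanup_failure(name):
    # On rmDir failure, show the creation site instead of just
    # "could not delete".
    stack = _temp_dirs.get(name, ["<unknown origin>"])
    return "could not delete %s, created at:\n%s" % (name, "".join(stack))

register_temp_dir("lucene-test-123")
msg = report_cleanup_failure("lucene-test-123")
print(msg.splitlines()[0])  # → could not delete lucene-test-123, created at:
```

Making the registry private with a single accessor, as proposed, guarantees every creation site actually gets recorded.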

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038589#comment-13038589
 ] 

Michael McCandless commented on SOLR-2524:
--

bq. The only part I couldn't port was the random testing. The API is different 
in 3x.

Martijn, could you shed some more light on this one...?  Like is there 
something in trunk's test infra that we should back port to 3.x...?  Maybe we 
should open a separate issue for this?

> Adding grouping to Solr 3x
> --
>
> Key: SOLR-2524
> URL: https://issues.apache.org/jira/browse/SOLR-2524
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 3.2
>Reporter: Martijn van Groningen
>Assignee: Michael McCandless
> Attachments: SOLR-2524.patch, SOLR-2524.patch, SOLR-2524.patch
>
>
> Grouping was recently added to Lucene 3x. See LUCENE-1421 for more 
> information.
> I think it would be nice if we expose this functionality also to the Solr 
> users that are bound to a 3.x version.
> The grouping feature added to Lucene is currently a subset of the 
> functionality that Solr 4.0-trunk offers. Mainly it doesn't support grouping 
> by function / query.
> The work involved in getting the grouping contrib to work on Solr 3x is 
> acceptable. I have it more or less running here. It supports the response 
> format and request parameters (except group.query and group.func) described 
> on the FieldCollapse page on the Solr wiki.
> I think it would be great if this is included in the Solr 3.2 release. Many 
> people are using grouping as a patch now and this would help them a lot. Any 
> thoughts?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038591#comment-13038591
 ] 

Robert Muir commented on LUCENE-3139:
-

+1

> LuceneTestCase.afterClass does not print enough information if a 
> temp-test-dir fails to delete
> --
>
> Key: LUCENE-3139
> URL: https://issues.apache.org/jira/browse/LUCENE-3139
> Project: Lucene - Java
>  Issue Type: Test
>  Components: general/test
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
>
> I've hit an exception from LTC.afterClass when _TestUtil.rmDir failed (on 
> write.lock, as if some test did not release resources). However, I had no 
> idea which test caused that (i.e. opened the temp directory and did not 
> release resources).
> I think we should do the following:
> * Track in LTC a map from dirName -> StackTraceElement
> * In afterClass, if _TestUtil.rmDir fails, print the STE of that particular 
> dir, so we know where this directory was created
> * Make tempDirs private and create accessor method, so that we control the 
> inserts to this map (today the Set is updated by LTC, _TestUtils and 
> TestBackwards !)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

2011-05-24 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038593#comment-13038593
 ] 

Robert Muir commented on SOLR-2524:
---

There is a little random testing framework in SolrTestCaseJ4, but only in trunk.

I tried really quickly (just a few minutes) to backport it and started chasing 
crazy things, so I agree we should open a separate issue for this.


> Adding grouping to Solr 3x
> --
>
> Key: SOLR-2524
> URL: https://issues.apache.org/jira/browse/SOLR-2524
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 3.2
>Reporter: Martijn van Groningen
>Assignee: Michael McCandless
> Attachments: SOLR-2524.patch, SOLR-2524.patch, SOLR-2524.patch
>
>
> Grouping was recently added to Lucene 3x. See LUCENE-1421 for more 
> information.
> I think it would be nice if we expose this functionality also to the Solr 
> users that are bound to a 3.x version.
> The grouping feature added to Lucene is currently a subset of the 
> functionality that Solr 4.0-trunk offers. Mainly it doesn't support grouping 
> by function / query.
> The work involved in getting the grouping contrib to work on Solr 3x is 
> acceptable. I have it more or less running here. It supports the response 
> format and request parameters (except group.query and group.func) described 
> on the FieldCollapse page on the Solr wiki.
> I think it would be great if this is included in the Solr 3.2 release. Many 
> people are using grouping as a patch now and this would help them a lot. Any 
> thoughts?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1127017 - /lucene/dev/trunk/lucene/contrib/misc/src/java/org/apache/lucene/index/IndexSplitter.java

2011-05-24 Thread Michael McCandless
Thanks Steven!

Mike

http://blog.mikemccandless.com

On Tue, May 24, 2011 at 8:56 AM,   wrote:
> Author: sarowe
> Date: Tue May 24 12:56:56 2011
> New Revision: 1127017
>
> URL: http://svn.apache.org/viewvc?rev=1127017&view=rev
> Log:
> LUCENE-3112: Make javadocs work under 1.5 JDK
>
> Modified:
>    
> lucene/dev/trunk/lucene/contrib/misc/src/java/org/apache/lucene/index/IndexSplitter.java
>
> Modified: 
> lucene/dev/trunk/lucene/contrib/misc/src/java/org/apache/lucene/index/IndexSplitter.java
> URL: 
> http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/contrib/misc/src/java/org/apache/lucene/index/IndexSplitter.java?rev=1127017&r1=1127016&r2=1127017&view=diff
> ==
> --- 
> lucene/dev/trunk/lucene/contrib/misc/src/java/org/apache/lucene/index/IndexSplitter.java
>  (original)
> +++ 
> lucene/dev/trunk/lucene/contrib/misc/src/java/org/apache/lucene/index/IndexSplitter.java
>  Tue May 24 12:56:56 2011
> @@ -26,6 +26,7 @@ import java.text.DecimalFormat;
>  import java.util.ArrayList;
>  import java.util.List;
>
> +import org.apache.lucene.index.IndexWriter;  // Required for javadocs
>  import org.apache.lucene.index.codecs.CodecProvider;
>  import org.apache.lucene.store.FSDirectory;
>
>
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038598#comment-13038598
 ] 

Shai Erera commented on LUCENE-3139:


Perhaps we should also fail the test if that happens? Was there a reason why only 
the stacktrace was printed but the tests were considered successful?

> LuceneTestCase.afterClass does not print enough information if a 
> temp-test-dir fails to delete
> --
>
> Key: LUCENE-3139
> URL: https://issues.apache.org/jira/browse/LUCENE-3139
> Project: Lucene - Java
>  Issue Type: Test
>  Components: general/test
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
>
> I've hit an exception from LTC.afterClass when _TestUtil.rmDir failed (on 
> write.lock, as if some test did not release resources). However, I had no 
> idea which test caused that (i.e. opened the temp directory and did not 
> release resources).
> I think we should do the following:
> * Track in LTC a map from dirName -> StackTraceElement
> * In afterClass, if _TestUtil.rmDir fails, print the STE of that particular 
> dir, so we know where this directory was created
> * Make tempDirs private and create accessor method, so that we control the 
> inserts to this map (today the Set is updated by LTC, _TestUtils and 
> TestBackwards !)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3139:
---

Attachment: LUCENE-3139.patch

Patch adds registerTempFile to LTC plus prints stack information if rmDir fails.

I think we should also fail the test if that happens?

> LuceneTestCase.afterClass does not print enough information if a 
> temp-test-dir fails to delete
> --
>
> Key: LUCENE-3139
> URL: https://issues.apache.org/jira/browse/LUCENE-3139
> Project: Lucene - Java
>  Issue Type: Test
>  Components: general/test
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3139.patch
>
>
> I've hit an exception from LTC.afterClass when _TestUtil.rmDir failed (on 
> write.lock, as if some test did not release resources). However, I had no 
> idea which test caused that (i.e. opened the temp directory and did not 
> release resources).
> I think we should do the following:
> * Track in LTC a map from dirName -> StackTraceElement
> * In afterClass, if _TestUtil.rmDir fails, print the STE of that particular 
> dir, so we know where this directory was created
> * Make tempDirs private and create accessor method, so that we control the 
> inserts to this map (today the Set is updated by LTC, _TestUtils and 
> TestBackwards !)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038601#comment-13038601
 ] 

Robert Muir commented on LUCENE-3139:
-

some tests are still problematic, at least on windows... I think perhaps some 
of the crazier ones like DiskFull, TestCrash, anything that has to disable 
MockDirectoryWrapper's checks because they must create corrupt indexes or 
other scary things.


> LuceneTestCase.afterClass does not print enough information if a 
> temp-test-dir fails to delete
> --
>
> Key: LUCENE-3139
> URL: https://issues.apache.org/jira/browse/LUCENE-3139
> Project: Lucene - Java
>  Issue Type: Test
>  Components: general/test
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3139.patch
>
>
> I've hit an exception from LTC.afterClass when _TestUtil.rmDir failed (on 
> write.lock, as if some test did not release resources). However, I had no 
> idea which test caused that (i.e. opened the temp directory and did not 
> release resources).
> I think we should do the following:
> * Track in LTC a map from dirName -> StackTraceElement
> * In afterClass, if _TestUtil.rmDir fails, print the STE of that particular 
> dir, so we know where this directory was created
> * Make tempDirs private and create accessor method, so that we control the 
> inserts to this map (today the Set is updated by LTC, _TestUtils and 
> TestBackwards !)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038604#comment-13038604
 ] 

Shai Erera commented on LUCENE-3139:


bq. some tests are still problematic, at least on windows... 

Ok, I didn't notice your comment when I posted the patch. So let's keep it as-is.

I think it's ready to commit

> LuceneTestCase.afterClass does not print enough information if a 
> temp-test-dir fails to delete
> --
>
> Key: LUCENE-3139
> URL: https://issues.apache.org/jira/browse/LUCENE-3139
> Project: Lucene - Java
>  Issue Type: Test
>  Components: general/test
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3139.patch
>
>
> I've hit an exception from LTC.afterClass when _TestUtil.rmDir failed (on 
> write.lock, as if some test did not release resources). However, I had no 
> idea which test caused that (i.e. opened the temp directory and did not 
> release resources).
> I think we should do the following:
> * Track in LTC a map from dirName -> StackTraceElement
> * In afterClass, if _TestUtil.rmDir fails, print the STE of that particular 
> dir, so we know where this directory was created
> * Make tempDirs private and create accessor method, so that we control the 
> inserts to this map (today the Set is updated by LTC, _TestUtils and 
> TestBackwards !)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3097) Post grouping faceting

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038605#comment-13038605
 ] 

Michael McCandless commented on LUCENE-3097:


{quote}
bq. I think you need to hold the docBase from each setNextReader and re-base 
your docs stored in the GroupHead?

I think I'm doing that. If you look at the updateHead() methods, you can see 
that I'm rebasing the ids.
{quote}

Ahh excellent, I missed that.  Looks good!

{quote}
bq. Once docs within one can have different values for field X then we need a 
different approach for counting their facets...

But that would only happen if an update happens during a search?
Then all collectors can have this problem, right?
{quote}

This is independent of updating during search I think.

I don't think the existing collectors have a problem here?  Ie the
grouping collectors aren't normally concerned w/ multivalued fields of
the docs within each group.

It's only because we intend for these new group collectors to make
"post-grouping facet counting" work in Solr that we have a problem.
Ie, these collectors won't properly count facets of fields that have
different values w/in one group?

Say this is my original content:

{noformat}
  name=3-wolf shirt
size=M, color=red
size=S, color=red
size=L, color=blue

  name=frog shirt
size=M, color=white
size=S, color=red
{noformat}

But, I'm not using nested docs (LUCENE-2454), so I had to fully
denormalize into these docs:

{noformat}
  name=3-wolf shirt, size=M, color=red
  name=3-wolf shirt, size=S, color=red
  name=3-wolf shirt, size=L, color=blue
  name=frog shirt,   size=M, color=white
  name=frog shirt,   size=S, color=red
{noformat}

Now, if a user does a search for "color=red"... without post-group
faceting (ie what Solr has today), you incorrectly see count=3 for
color=red.

With post-group faceting, you should see count=2 for color=red (which
these collectors will do, correctly, I think?), but you should also
see count=2 for size=S, which I think these collectors will fail to
do?  (Ie, because they only retain the top doc per group...?).
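Working the shirt example through numerically may help. Below is a Python sketch of per-document vs. post-group (count-distinct-group) facet counting over the denormalized docs; it is an illustration of the counting semantics, not the collectors' implementation:

```python
docs = [
    {"name": "3-wolf shirt", "size": "M", "color": "red"},
    {"name": "3-wolf shirt", "size": "S", "color": "red"},
    {"name": "3-wolf shirt", "size": "L", "color": "blue"},
    {"name": "frog shirt",   "size": "M", "color": "white"},
    {"name": "frog shirt",   "size": "S", "color": "red"},
]

def facet_counts(docs, facet_field, group_field=None):
    # group_field=None: classic per-document counts (Solr today).
    # group_field set: count distinct groups per facet value, i.e.
    # count(distinct(group)) instead of counting documents.
    counts = {}
    for d in docs:
        key = d[facet_field]
        member = d[group_field] if group_field else id(d)
        counts.setdefault(key, set()).add(member)
    return {k: len(v) for k, v in counts.items()}

print(facet_counts(docs, "color"))                      # → {'red': 3, 'blue': 1, 'white': 1}
print(facet_counts(docs, "color", group_field="name"))  # → {'red': 2, 'blue': 1, 'white': 1}
print(facet_counts(docs, "size",  group_field="name"))  # → {'M': 2, 'S': 2, 'L': 1}
```

With group_field="name" the red count drops from 3 to 2 as expected, and size=S also counts 2 groups: the number that collectors retaining only the top doc per group would miss.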


> Post grouping faceting
> --
>
> Key: LUCENE-3097
> URL: https://issues.apache.org/jira/browse/LUCENE-3097
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: modules/grouping
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3097.patch
>
>
> This issue focuses on implementing post-grouping faceting.
> * How to handle multivalued fields. What field value to show with the facet.
> * What the facet counts should be based on:
> ** Facet counts can be based on the normal documents. Ungrouped counts. 
> ** Facet counts can be based on the groups. Grouped counts.
> ** Facet counts can be based on the combination of group value and facet 
> value. Matrix counts.   
> And probably more implementation options.
> The first two methods are implemented in the SOLR-236 patch. For the first 
> option it calculates a DocSet based on the individual documents from the 
> query result. For the second option it calculates a DocSet for all the most 
> relevant documents of a group. Once the DocSet is computed, the FacetComponent 
> and StatsComponent use one of the DocSets to create facets and statistics.
> This last one is a bit more complex. I think it is best explained with an 
> example. Lets say we search on travel offers:
> ||hotel||departure_airport||duration||
> |Hotel a|AMS|5|
> |Hotel a|DUS|10|
> |Hotel b|AMS|5|
> |Hotel b|AMS|10|
> If we group by hotel and have a facet for airport. Most end users expect 
> (according to my experience, of course) the following airport facet:
> AMS: 2
> DUS: 1
> The above result can't be achieved by the first two methods. You either get 
> counts AMS:3 and DUS:1 or 1 for both airports.
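A small sketch of the two counting options on this table (hypothetical Python, not part of any patch here):

```python
# Travel offers from the example: (hotel, departure_airport) per document.
offers = [
    ("Hotel a", "AMS"), ("Hotel a", "DUS"),
    ("Hotel b", "AMS"), ("Hotel b", "AMS"),
]

# Ungrouped counts: every document contributes once.
ungrouped = {}
for hotel, airport in offers:
    ungrouped[airport] = ungrouped.get(airport, 0) + 1
# ungrouped == {"AMS": 3, "DUS": 1}

# Matrix counts: one contribution per distinct (group value, facet value)
# combination -- the counts most end users expect here.
matrix = {}
for hotel, airport in set(offers):
    matrix[airport] = matrix.get(airport, 0) + 1
# matrix == {"AMS": 2, "DUS": 1}
```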

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3138) IW.addIndexes should fail fast if an index is too old/new

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038606#comment-13038606
 ] 

Michael McCandless commented on LUCENE-3138:


It also applies to 3.x?  Ie if you try to add a 4.x index in a 3.x world you 
should get too-new-exc?

> IW.addIndexes should fail fast if an index is too old/new
> -
>
> Key: LUCENE-3138
> URL: https://issues.apache.org/jira/browse/LUCENE-3138
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 4.0
>
>
> Today IW.addIndexes (both Dir and IR versions) do not check the format of the 
> incoming indexes. Therefore it could add a too old/new index and the app will 
> discover that only later, maybe during commit() or segment merges. We should 
> check that up front and fail fast.
> This issue is relevant only to 4.0 at the moment, which will not support 2.x 
> indexes anymore.




[jira] [Updated] (LUCENE-3138) IW.addIndexes should fail fast if an index is too old/new

2011-05-24 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3138:
---

Fix Version/s: 3.2

bq. It also applies to 3.x? Ie if you try to add a 4.x index in a 3.x world you 
should get too-new-exc?

You're right :). Added 3.2 as fix version too.

> IW.addIndexes should fail fast if an index is too old/new
> -
>
> Key: LUCENE-3138
> URL: https://issues.apache.org/jira/browse/LUCENE-3138
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
>
> Today IW.addIndexes (both Dir and IR versions) do not check the format of the 
> incoming indexes. Therefore it could add a too old/new index and the app will 
> discover that only later, maybe during commit() or segment merges. We should 
> check that up front and fail fast.
> This issue is relevant only to 4.0 at the moment, which will not support 2.x 
> indexes anymore.




[jira] [Commented] (LUCENE-3126) IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038611#comment-13038611
 ] 

Michael McCandless commented on LUCENE-3126:


bq. Are there other files that are normally created outside the .cfs? I've seen 
sometimes that the stored fields of CFS are created outside. Was it only for 
shared doc stores?

I think just separate norms (_N.sM) and deletions (_N.del) live
outside CFS?  Yes, only w/ shared doc stores did the shared doc store
files live outside cfs...

{quote}
bq. More generally: does addIndexes properly refuse to import a too-old index? 
We should throw IndexFormatTooOldExc in this case? (And, maybe also 
IndexFormatTooNewExc?).

Not today. I believe it will fail in later stages (e.g. commit()), but
we better fail up front. I think it's a separate issue though, only
for 4.0 (b/c 3x supports all formats today)?
{quote}

OK I see LUCENE-3138 for this... thanks.


> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't 
> already
> -
>
> Key: LUCENE-3126
> URL: https://issues.apache.org/jira/browse/LUCENE-3126
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3126.patch, LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming 
> segments. However, if IndexWriter's MP wants to create CFS (in general), 
> there's no reason not to turn the incoming non-CFS segments into CFS. We 
> anyway copy them, and if MP is not against CFS, we should create a CFS out of 
> them.
> Will need to use CFW, not sure it's ready for that w/ current API (I'll need 
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes 
> MP.noCFSRatio) b/c like I wrote above, we anyway copy it. However, if you 
> think otherwise, speak up :).
> I'll take a look at this in the next few days.




[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-24 Thread Ben West (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben West updated LUCENENET-415:
---

Attachment: facet performance.xls

Everything is exactly as DIGY predicted. I will never disagree with him again 
:-) 
See tab "Round 3".

I commented out the Cardinality() call, and enabled caching but with unique 
queries. The bitset way is much faster now.

> Contrib/Faceted Search
> --
>
> Key: LUCENENET-415
> URL: https://issues.apache.org/jira/browse/LUCENENET-415
> Project: Lucene.Net
>  Issue Type: New Feature
>Affects Versions: Lucene.Net 2.9.4
>Reporter: Digy
>Priority: Minor
> Attachments: PerformanceTest.cs, PerformanceTest.cs, 
> PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
> SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
> TestSimpleFacetedSearch.cs, facet performance.xls, facet performance.xls, 
> facet performance.xls
>
>
> Since I see a lot of questions about faceted search in these days, I plan to 
> add a Faceted-Search project to contrib.
> DIGY



[jira] [Commented] (SOLR-2524) Adding grouping to Solr 3x

2011-05-24 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038617#comment-13038617
 ] 

Martijn van Groningen commented on SOLR-2524:
-

bq. So once Lucene is patched to support POST facets in LUCENE-3097, are you 
planning on adding that into this (or a new ticket for Solr) ?
Yes, I am!

bq. Maybe we should open a separate issue for this?
I'll open a separate issue for backporting the SolrTestCaseJ4 random testing 
behaviour to 3x. 

> Adding grouping to Solr 3x
> --
>
> Key: SOLR-2524
> URL: https://issues.apache.org/jira/browse/SOLR-2524
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 3.2
>Reporter: Martijn van Groningen
>Assignee: Michael McCandless
> Attachments: SOLR-2524.patch, SOLR-2524.patch, SOLR-2524.patch
>
>
> Grouping was recently added to Lucene 3x. See LUCENE-1421 for more 
> information.
> I think it would be nice if we expose this functionality also to the Solr 
> users that are bound to a 3.x version.
> The grouping feature added to Lucene is currently a subset of the 
> functionality that Solr 4.0-trunk offers. Mainly it doesn't support grouping 
> by function / query.
> The work involved getting the grouping contrib to work on Solr 3x is 
> acceptable. I have it more or less running here. It supports the response 
> format and request parameters (except group.query and group.func) described 
> in the FieldCollapse page on the Solr wiki.
> I think it would be great if this is included in the Solr 3.2 release. Many 
> people are using grouping as patch now and this would help them a lot. Any 
> thoughts?




[jira] [Commented] (LUCENE-3126) IndexWriter.addIndexes can make any incoming segment into CFS if it isn't already

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038618#comment-13038618
 ] 

Michael McCandless commented on LUCENE-3126:


Patch looks good!

Maybe we should add asserts to SegmentMerger.createCompoundFile that 
SI.files() did not return del/separate norms?  I don't like the ambiguity 
here... and then strengthen the comment saying SI.files should never return 
these in this context?  Hopefully this does not cause any test failures!  (We 
can do this as a separate issue...).

> IndexWriter.addIndexes can make any incoming segment into CFS if it isn't 
> already
> -
>
> Key: LUCENE-3126
> URL: https://issues.apache.org/jira/browse/LUCENE-3126
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3126.patch, LUCENE-3126.patch
>
>
> Today, IW.addIndexes(Directory) does not modify the CFS-mode of the incoming 
> segments. However, if IndexWriter's MP wants to create CFS (in general), 
> there's no reason not to turn the incoming non-CFS segments into CFS. We 
> anyway copy them, and if MP is not against CFS, we should create a CFS out of 
> them.
> Will need to use CFW, not sure it's ready for that w/ current API (I'll need 
> to check), but luckily we're allowed to change it (@lucene.internal).
> This should be done, IMO, even if the incoming segment is large (i.e., passes 
> MP.noCFSRatio) b/c like I wrote above, we anyway copy it. However, if you 
> think otherwise, speak up :).
> I'll take a look at this in the next few days.




[jira] [Updated] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3139:
---

Attachment: LUCENE-3139.patch

Patch applies the same changes to backwards' LTC.

> LuceneTestCase.afterClass does not print enough information if a 
> temp-test-dir fails to delete
> --
>
> Key: LUCENE-3139
> URL: https://issues.apache.org/jira/browse/LUCENE-3139
> Project: Lucene - Java
>  Issue Type: Test
>  Components: general/test
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3139.patch, LUCENE-3139.patch
>
>
> I've hit an exception from LTC.afterClass when _TestUtil.rmDir failed (on 
> write.lock, as if some test did not release resources). However, I had no 
> idea which test caused that (i.e. opened the temp directory and did not 
> release resources).
> I think we should do the following:
> * Track in LTC a map from dirName -> StackTraceElement
> * In afterClass if _TestUtil.rmDir fails, print the STE of that particular 
> dir, so we know where was this directory created from
> * Make tempDirs private and create accessor method, so that we control the 
> inserts to this map (today the Set is updated by LTC, _TestUtils and 
> TestBackwards !)
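The tracking proposed above amounts to something like this (a Python sketch of the idea, not the actual LuceneTestCase change; the helper names are illustrative):

```python
import traceback

# dir name -> formatted stack captured where the temp dir was allocated
temp_dirs = {}

def register_temp_file(name):
    # remember the allocation site so a failed delete can be blamed later
    temp_dirs[name] = "".join(traceback.format_stack(limit=10))

def on_rmdir_failure(name):
    # build the diagnostic printed from afterClass when rmDir fails
    return "could not delete %s\n allocated from\n%s" % (
        name, temp_dirs.get(name, "<unknown>"))

register_temp_file("test4293913517498927234tmp")
msg = on_rmdir_failure("test4293913517498927234tmp")
print(msg)
```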




[jira] [Commented] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038625#comment-13038625
 ] 

Shai Erera commented on LUCENE-3139:


When I run 'ant test-backwards' I see these exceptions:

{noformat}
   [junit] - Standard Error -
   [junit] java.io.IOException: could not delete 
D:\dev\lucene\lucene-3x\lucene\build\backwards\test\1\test4293913517498927234tmp\_1.fdt
   [junit] at org.apache.lucene.util._TestUtil.rmDir(_TestUtil.java:65)
   [junit] at 
org.apache.lucene.util.LuceneTestCase.afterClassLuceneTestCaseJ4(LuceneTestCase.java:291)
   [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
   [junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
   [junit] at java.lang.reflect.Method.invoke(Method.java:611)
   [junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
   [junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   [junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
   [junit] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
   [junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
   [junit] at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
   [junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
   [junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
   [junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
   [junit] path 
D:\dev\lucene\lucene-3x\lucene\build\backwards\test\1\test4293913517498927234tmp
 allocated from
   [junit] 
org.apache.lucene.util.LuceneTestCase.registerTempFile(LuceneTestCase.java:930)
   [junit] 
org.apache.lucene.util.LuceneTestCase.newDirectoryImpl(LuceneTestCase.java:945)
   [junit] 
org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:733)
   [junit] 
org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:725)
   [junit] 
org.apache.lucene.index.TestIndexWriterWithThreads._testMultipleThreadsFailure(TestIndexWriterWithThreads.java:212)
   [junit] 
org.apache.lucene.index.TestIndexWriterWithThreads.testIOExceptionDuringWriteSegmentWithThreads(TestIndexWriterWithThreads.java:381)
{noformat}

and

{noformat}
[junit] - Standard Error -
[junit] java.io.IOException: could not delete 
D:\dev\lucene\lucene-3x\lucene\build\backwards\test\5\test6976265647485126574tmp\write.lock
[junit] at org.apache.lucene.util._TestUtil.rmDir(_TestUtil.java:65)
[junit] at 
org.apache.lucene.util.LuceneTestCase.afterClassLuceneTestCaseJ4(LuceneTestCase.java:291)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
[junit] at java.lang.reflect.Method.invoke(Method.java:611)
[junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
[junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
[junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
[junit] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
[junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
[junit] at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
[junit] path 
D:\dev\lucene\lucene-3x\lucene\build\backwards\test\5\test6976265647485126574tmp
 allocated from
[junit] 
org.apache.lucene.util.LuceneTestCase.registerTempFile(LuceneTestCase.java:930)
[junit] 
org.apache.lucene.util.LuceneTestCase.newDirectoryImpl(LuceneTestCase.java:945)
[junit] 
org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:733)
[junit] 
org.apache.lucene.util.LuceneTestCase.newDirectory(LuceneTestCase.java:725)
[junit] 
org.apache.lucene.index.TestIndexReader.testReopenChangeReadonly(TestIndexReader.java:1622)
{noformat}


[jira] [Commented] (LUCENE-2762) Don't leak deleted open file handles with pooled readers

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038629#comment-13038629
 ] 

Michael McCandless commented on LUCENE-2762:


Hmmm... the fix here was backported to 3.0.x, so it should be fixed in 3.0.3.

The code fragment looks correct, but your usage of decRef (not close) likely 
means you have other places that also incRef the reader?  Are you sure all 
incRefs are balanced by a decRef?

Are you sure you really have leaking file handles and not just too many open 
files?  EG what mergeFactor are you using...?
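The balanced-reference discipline being asked about looks roughly like this (a toy Python model of reference counting; Lucene's actual incRef/decRef live on IndexReader):

```python
class RefCountedReader:
    """Toy model of a pooled reader: files are released only at refcount 0."""
    def __init__(self):
        self.ref_count = 1      # the owner (e.g. the reader pool) holds one ref
        self.closed = False

    def inc_ref(self):
        assert not self.closed, "inc_ref on a closed reader"
        self.ref_count += 1

    def dec_ref(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.closed = True  # underlying (possibly deleted) files freed here

reader = RefCountedReader()
reader.inc_ref()                # searcher takes a reference ...
try:
    pass                        # ... runs the search ...
finally:
    reader.dec_ref()            # ... and must release it in a finally block

reader.dec_ref()                # owner releases its ref; handles now freed
print(reader.closed)            # prints: True
```

A single unbalanced inc_ref anywhere keeps the count above zero forever, which is exactly the symptom of handles to deleted files accumulating.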

> Don't leak deleted open file handles with pooled readers
> 
>
> Key: LUCENE-2762
> URL: https://issues.apache.org/jira/browse/LUCENE-2762
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 2.9.4, 3.0.3, 3.1, 4.0
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 2.9.4, 3.0.3, 3.1, 4.0
>
> Attachments: LUCENE-2762.patch
>
>
> If you have CFS enabled today, and pooling is enabled (either directly
> or because you've pulled an NRT reader), IndexWriter will hold open
> SegmentReaders against the non-CFS format of each merged segment.
> So even if you close all NRT readers you've pulled from the writer,
> you'll still see file handles open against files that have been
> deleted.
> This count will not grow unbounded, since it's limited by the number
> of segments in the index, but it's still a serious problem since the
> app had turned off CFS in the first place presumably to avoid risk of
> too-many-open-files.  It's also bad because it ties up disk space
> since these files would otherwise be deleted.




[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-24 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch2.cs
SimpleFacetedSearch2.cs

I took it one step further: multi-field faceting.
It still requires many code cleanups, but it works.

>>  "SimpleFacetedSearch2"

DIGY

> Contrib/Faceted Search
> --
>
> Key: LUCENENET-415
> URL: https://issues.apache.org/jira/browse/LUCENENET-415
> Project: Lucene.Net
>  Issue Type: New Feature
>Affects Versions: Lucene.Net 2.9.4
>Reporter: Digy
>Priority: Minor
> Attachments: PerformanceTest.cs, PerformanceTest.cs, 
> PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
> SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
> TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs, 
> TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
> facet performance.xls
>
>
> Since I see a lot of questions about faceted search in these days, I plan to 
> add a Faceted-Search project to contrib.
> DIGY



[jira] [Commented] (LUCENE-3112) Add IW.add/updateDocuments to support nested documents

2011-05-24 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038641#comment-13038641
 ] 

Steven Rowe commented on LUCENE-3112:
-

bq. IndexSplitter.java has javadocs references

Oops, I got the trunk javadocs problem mixed up with the branch_3x javadocs 
problem...

The incorrect references are actually in IndexWriter.java itself, on 
updateDocuments() methods.

> Add IW.add/updateDocuments to support nested documents
> --
>
> Key: LUCENE-3112
> URL: https://issues.apache.org/jira/browse/LUCENE-3112
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3112.patch, LUCENE-3112.patch
>
>
> I think nested documents (LUCENE-2454) is a very compelling addition
> to Lucene.  It's also a popular (many votes) issue.
> Beyond supporting nested document querying, which is already an
> incredible addition since it preserves the relational model on
> indexing normalized content (eg, DB tables, XML docs), LUCENE-2454
> should also enable speedups in grouping implementation when you group
> by a nested field.
> For the same reason, it can also enable very fast post-group facet
> counting impl (LUCENE-3097) when you what to
> count(distinct(nestedField)), instead of unique documents, as your
> "identifier".  I expect many apps that use faceting need this ability
> (to count(distinct(nestedField)) not distinct(docID)).
> To support these use cases, I believe the only core change needed is
> the ability to atomically add or update multiple documents, which you
> cannot do today since in between add/updateDocument calls a flush (eg
> due to commit or getReader()) could occur.
> This new API (addDocuments(Iterable), updateDocuments(Term
> delTerm, Iterable) would also further guarantee that the
> documents are assigned sequential docIDs in the order the iterator
> provided them, and that the docIDs all reside in one segment.
> Segment merging never splits segments apart, so this invariant would
> hold even as merges/optimizes take place.
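Why atomicity matters can be shown with a toy model (illustrative Python, not the Lucene API): a flush landing between two addDocument calls would split a nested-document block across segments.

```python
class ToyWriter:
    def __init__(self):
        self.segments = [[]]            # last entry is the in-memory segment

    def add_document(self, doc):
        self.segments[-1].append(doc)

    def flush(self):                    # e.g. triggered by commit()/getReader()
        self.segments.append([])

    def add_documents(self, docs):
        # atomic block add: no flush can occur between these appends, so the
        # docs get consecutive positions in a single segment
        for doc in docs:
            self.segments[-1].append(doc)

w = ToyWriter()
w.add_document("other doc")
w.flush()                               # a flush here cannot split the block below
w.add_documents(["child1", "child2", "parent"])
print(w.segments[-1])                   # prints: ['child1', 'child2', 'parent']
```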




[jira] [Created] (SOLR-2540) CommitWithin as an Update Request parameter

2011-05-24 Thread JIRA
CommitWithin as an Update Request parameter
---

 Key: SOLR-2540
 URL: https://issues.apache.org/jira/browse/SOLR-2540
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 3.1
Reporter: Jan Høydahl


It would be useful to support commitWithin HTTP GET request param on all 
UpdateRequestHandlers.
That way, you could set commitWithin on the request (for XML, JSON, CSV, Binary 
and Extracting handlers) with this syntax:

{code}
  curl 
http://localhost:8983/solr/update/extract?literal.id=123&commitWithin=1
   -H "Content-Type: application/pdf" --data-binary @file.pdf
{code}

PS: The JsonUpdateRequestHandler and BinaryUpdateRequestHandler already support 
this syntax.




[jira] [Updated] (SOLR-2540) CommitWithin as an Update Request parameter

2011-05-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-2540:
--

Attachment: SOLR-2540.patch

First patch, which solves this for the XML, CSV and Extracting handlers. No new 
tests added so far.

> CommitWithin as an Update Request parameter
> ---
>
> Key: SOLR-2540
> URL: https://issues.apache.org/jira/browse/SOLR-2540
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Affects Versions: 3.1
>Reporter: Jan Høydahl
>  Labels: commit, commitWithin
> Attachments: SOLR-2540.patch
>
>
> It would be useful to support commitWithin HTTP GET request param on all 
> UpdateRequestHandlers.
> That way, you could set commitWithin on the request (for XML, JSON, CSV, 
> Binary and Extracting handlers) with this syntax:
> {code}
>   curl 
> http://localhost:8983/solr/update/extract?literal.id=123&commitWithin=1
>-H "Content-Type: application/pdf" --data-binary @file.pdf
> {code}
> PS: The JsonUpdateRequestHandler and BinaryUpdateRequestHandler already 
> support this syntax.




[jira] [Created] (SOLR-2541) Plugininfo tries to load nodes of type "long"

2011-05-24 Thread Frank Wesemann (JIRA)
Plugininfo tries to load nodes of type "long"
-

 Key: SOLR-2541
 URL: https://issues.apache.org/jira/browse/SOLR-2541
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 3.1
 Environment: all
Reporter: Frank Wesemann


As of version 3.1 Plugininfo adds all nodes whose types are not 
"lst","str","int","bool","arr","float" or "double" to the children list.

The type "long" is missing in the NL_TAGS set.
I assume this is a bug because DOMUtil recognizes this type, so I consider it a 
valid tag in solrconfig.xml.
Maybe it's time for a DTD? Or one could define SolrConfig.nodetypes somewhere.
I'll add a patch that extends the NL_TAGS set.
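The described rule and fix can be sketched like this (illustrative Python; the NL_TAGS contents are taken from the report above, the helper names are hypothetical):

```python
# Node types that PluginInfo treats as init-arg entries; "long" is missing.
NL_TAGS = {"lst", "str", "int", "bool", "arr", "float", "double"}

def classify(node_type):
    # nodes with a recognized type become init-arg entries; everything else
    # ends up in the children list
    return "init-arg" if node_type in NL_TAGS else "child"

print(classify("int"))    # prints: init-arg
print(classify("long"))   # prints: child -- the bug: "long" is a valid type

NL_TAGS_FIXED = NL_TAGS | {"long"}    # the proposed patch: extend the set
```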




[jira] [Updated] (SOLR-2541) Plugininfo tries to load nodes of type "long"

2011-05-24 Thread Frank Wesemann (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank Wesemann updated SOLR-2541:
-

Component/s: (was: SearchComponents - other)

> Plugininfo tries to load nodes of type "long"
> -
>
> Key: SOLR-2541
> URL: https://issues.apache.org/jira/browse/SOLR-2541
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 3.1
> Environment: all
>Reporter: Frank Wesemann
>
> As of version 3.1 Plugininfo adds all nodes whose types are not 
> "lst","str","int","bool","arr","float" or "double" to the children list.
> The type "long" is missing in the NL_TAGS set.
> I assume this a bug because DOMUtil recognizes this type, so I consider it a 
> valid tag in solrconfig.xml
> Maybe it's time for a dtd? Or one may define SolrConfig.nodetypes somewhere.
> I'll add a patch, that extends the NL_TAGS Set.




[jira] [Updated] (SOLR-2541) Plugininfo tries to load nodes of type "long"

2011-05-24 Thread Frank Wesemann (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank Wesemann updated SOLR-2541:
-

Attachment: Solr-2541.patch

adds "long" to the NL_TAGS set

> Plugininfo tries to load nodes of type "long"
> -
>
> Key: SOLR-2541
> URL: https://issues.apache.org/jira/browse/SOLR-2541
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 3.1
> Environment: all
>Reporter: Frank Wesemann
> Attachments: Solr-2541.patch
>
>
> As of version 3.1 Plugininfo adds all nodes whose types are not 
> "lst","str","int","bool","arr","float" or "double" to the children list.
> The type "long" is missing in the NL_TAGS set.
> I assume this a bug because DOMUtil recognizes this type, so I consider it a 
> valid tag in solrconfig.xml
> Maybe it's time for a dtd? Or one may define SolrConfig.nodetypes somewhere.
> I'll add a patch, that extends the NL_TAGS Set.




Re: [jira] [Updated] (SOLR-2540) CommitWithin as an Update Request parameter

2011-05-24 Thread Mark Miller
Hey Jan - just realized you are not a JIRA contributor, so I've upped your JIRA 
status. You should be able to modify existing issues and such now.

- Mark



[jira] [Commented] (LUCENE-2762) Don't leak deleted open file handles with pooled readers

2011-05-24 Thread Josef (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038668#comment-13038668
 ] 

Josef commented on LUCENE-2762:
---

Thanks for the quick response.

You are right, we just increase the reference for the current Searcher when it 
is obtained and used. Decreasing it again is part of a finally block. I have 
triple-checked this code to make sure we are not doing anything wrong, and I am 
very sure we correctly decrease the reference in every case.

The output of the 'lsof' command above shows handles to files already marked 
deleted. The number of files marked deleted increases over time until we get a 
Java 'IOException: Too many open files'. 
If we don't hold the reference to the reader contained in our searcher, who does?

Regarding merge factor etc: we are using standard configurations, i.e. no 
tweaking (yet).
Besides, we do not call IndexWriter.optimize() in our code yet.


> Don't leak deleted open file handles with pooled readers
> 
>
> Key: LUCENE-2762
> URL: https://issues.apache.org/jira/browse/LUCENE-2762
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 2.9.4, 3.0.3, 3.1, 4.0
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 2.9.4, 3.0.3, 3.1, 4.0
>
> Attachments: LUCENE-2762.patch
>
>
> If you have CFS enabled today, and pooling is enabled (either directly
> or because you've pulled an NRT reader), IndexWriter will hold open
> SegmentReaders against the non-CFS format of each merged segment.
> So even if you close all NRT readers you've pulled from the writer,
> you'll still see file handles open against files that have been
> deleted.
> This count will not grow unbounded, since it's limited by the number
> of segments in the index, but it's still a serious problem since the
> app had turned off CFS in the first place presumably to avoid risk of
> too-many-open-files.  It's also bad because it ties up disk space
> since these files would otherwise be deleted.
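The "open handle to a deleted file" symptom the lsof output shows can be reproduced at the OS level (a POSIX-only Python demonstration, independent of Lucene's code):

```python
import os
import tempfile

# Create a file, keep a handle open, then delete the directory entry.
fd, path = tempfile.mkstemp()
os.write(fd, b"segment data")
os.remove(path)                  # unlink: the name is gone...

assert not os.path.exists(path)  # ...but the open handle still works:
os.lseek(fd, 0, os.SEEK_SET)
print(os.read(fd, 12))           # b'segment data'
os.close(fd)                     # only now is the disk space reclaimed
```

This is why pooled readers holding old segment files tie up both file descriptors and disk space until the last handle is closed.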

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [jira] [Updated] (SOLR-2540) CommitWithin as an Update Request parameter

2011-05-24 Thread Mark Miller
David Smiley added as well.

On May 24, 2011, at 10:05 AM, Mark Miller wrote:

> Hey Jan - just realized you are not a JIRA contributor, so I've upped your 
> JIRA status. You should be able to modify existing issues and such now.
> 
> - Mark

- Mark Miller
lucidimagination.com

Lucene/Solr User Conference
May 25-26, San Francisco
www.lucenerevolution.org









[jira] [Updated] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-1942:
--

Attachment: SOLR-1942.patch

updated patch: I moved the parsing out of SolrCore; instead, codec providers just 
get a NamedList so they can parse whatever they want (e.g. they might want more 
than just a list of class names, but also codec params, ...)

additionally, I removed the interaction with the default codec provider... I 
think if you specify a CodecProvider, that's the only one that should be used 
directly, instead of a "union" with CodecProvider.getDefault()
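The per-field selection this issue is about can be sketched as a mapping with a default fallback (illustrative Python; `PerFieldCodecProvider` and the codec names here are hypothetical, not Solr's actual configuration syntax):

```python
class PerFieldCodecProvider:
    """Hypothetical sketch: route each field to a configured codec,
    falling back to a default codec for unmapped fields."""

    def __init__(self, default_codec, per_field=None):
        self.default = default_codec
        self.per_field = dict(per_field or {})

    def codec_for(self, field):
        # Unmapped fields silently use the default.
        return self.per_field.get(field, self.default)


provider = PerFieldCodecProvider("Standard", {"id": "Pulsing"})
print(provider.codec_for("id"))     # Pulsing
print(provider.codec_for("body"))   # Standard
```

The "no union with the default provider" point above corresponds to constructing this object from the configured provider alone, rather than merging its map with a global default registry.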

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Grant Ingersoll
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Resolved] (LUCENE-2972) Intermittent failure in TestFieldCacheTermsFilter.testMissingTerms

2011-05-24 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2972.


Resolution: Fixed

> Intermittent failure in TestFieldCacheTermsFilter.testMissingTerms
> --
>
> Key: LUCENE-2972
> URL: https://issues.apache.org/jira/browse/LUCENE-2972
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-2972.patch
>
>
> Running tests in while(1) I hit this:
> {noformat}
> NOTE: reproduce with: ant test -Dtestcase=TestFieldCacheTermsFilter 
> -Dtestmethod=testMissingTerms 
> -Dtests.seed=-1046382732738729184:5855929314778232889
> 1) testMissingTerms(org.apache.lucene.search.TestFieldCacheTermsFilter)
> java.lang.AssertionError: Must match 1 expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:91)
>   at org.junit.Assert.failNotEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:126)
>   at org.junit.Assert.assertEquals(Assert.java:470)
>   at 
> org.apache.lucene.search.TestFieldCacheTermsFilter.testMissingTerms(TestFieldCacheTermsFilter.java:63)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>   at org.junit.rules.TestWatchman$1.evaluate(TestWatchman.java:48)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>   at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1214)
>   at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1146)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:24)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:136)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
>   at org.junit.runner.JUnitCore.runMain(JUnitCore.java:98)
>   at org.junit.runner.JUnitCore.runMainAndExit(JUnitCore.java:53)
>   at org.junit.runner.JUnitCore.main(JUnitCore.java:45)
> {noformat}
> Unfortunately the seed doesn't [consistently] repro for me...




[jira] [Assigned] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reassigned SOLR-1942:
-

Assignee: Robert Muir  (was: Grant Ingersoll)

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Updated] (SOLR-2136) Function Queries: if() function

2011-05-24 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-2136:
---

Attachment: SOLR-2136.patch

Here's a first cut at adding boolean support to function queries, if(), 
exists(), and(), or(), and not().

No tests yet.

> Function Queries: if() function
> ---
>
> Key: SOLR-2136
> URL: https://issues.apache.org/jira/browse/SOLR-2136
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4.1
>Reporter: Jan Høydahl
> Attachments: SOLR-2136.patch
>
>
> Add an if() function which will enable conditional function queries.
> The function could be modeled after a spreadsheet if function (e.g: 
> http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_IF_function)
> IF(test; value1; value2) where:
> test is or refers to a logical value or expression that returns a logical 
> value (TRUE or FALSE).
> value1 is the value that is returned by the function if test yields TRUE.
> value2 is the value that is returned by the function if test yields FALSE.
> If value2 is omitted it is assumed to be FALSE; if value1 is also omitted it 
> is assumed to be TRUE.
> Example use:
> if(color=="red"; 100; if(color=="green"; 50; 25))
> This function will check the document field "color", and if it is "red" 
> return 100, if it is "green" return 50, else return 25.
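The semantics described above can be sketched in plain Python (illustrative only; the `if_fn` and `score` names are hypothetical and Solr's actual function-query syntax differs):

```python
def if_fn(test, value1=True, value2=False):
    """Spreadsheet-style IF: an omitted value2 defaults to FALSE,
    and an omitted value1 defaults to TRUE, per the description."""
    return value1 if test else value2


def score(color):
    # Mirrors: if(color=="red"; 100; if(color=="green"; 50; 25))
    return if_fn(color == "red", 100, if_fn(color == "green", 50, 25))


print(score("red"), score("green"), score("blue"))   # 100 50 25
```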




[jira] [Commented] (LUCENE-2972) Intermittent failure in TestFieldCacheTermsFilter.testMissingTerms

2011-05-24 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038722#comment-13038722
 ] 

Yonik Seeley commented on LUCENE-2972:
--

To avoid slowing anything down, can we also maintain numBits in an assert?
Perhaps something like
assert (numBits = Math.max(other.numBits, numBits)) > 0;
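The trick here is doing the bookkeeping inside the assert itself, so it costs nothing when assertions are disabled. Python's walrus operator allows the same shape (a generic sketch, not Lucene code; `python -O` strips asserts the way disabling `-ea` does in Java):

```python
# In Java, `assert (numBits = Math.max(other.numBits, numBits)) > 0;`
# updates numBits only when assertions are enabled, so production runs
# pay nothing.  The Python analogue uses an assignment expression:

num_bits = 0
other_num_bits = 8
assert (num_bits := max(other_num_bits, num_bits)) > 0
print(num_bits)   # 8 with assertions on; stays 0 under `python -O`
```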

> Intermittent failure in TestFieldCacheTermsFilter.testMissingTerms
> --
>
> Key: LUCENE-2972
> URL: https://issues.apache.org/jira/browse/LUCENE-2972
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-2972.patch
>
>
> Running tests in while(1) I hit this:
> {noformat}
> NOTE: reproduce with: ant test -Dtestcase=TestFieldCacheTermsFilter 
> -Dtestmethod=testMissingTerms 
> -Dtests.seed=-1046382732738729184:5855929314778232889
> 1) testMissingTerms(org.apache.lucene.search.TestFieldCacheTermsFilter)
> java.lang.AssertionError: Must match 1 expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:91)
>   at org.junit.Assert.failNotEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:126)
>   at org.junit.Assert.assertEquals(Assert.java:470)
>   at 
> org.apache.lucene.search.TestFieldCacheTermsFilter.testMissingTerms(TestFieldCacheTermsFilter.java:63)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>   at org.junit.rules.TestWatchman$1.evaluate(TestWatchman.java:48)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>   at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1214)
>   at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1146)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:24)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:136)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
>   at org.junit.runner.JUnitCore.runMain(JUnitCore.java:98)
>   at org.junit.runner.JUnitCore.runMainAndExit(JUnitCore.java:53)
>   at org.junit.runner.JUnitCore.main(JUnitCore.java:45)
> {noformat}
> Unfortunately the seed doesn't [consistently] repro for me...




[jira] [Commented] (LUCENE-2972) Intermittent failure in TestFieldCacheTermsFilter.testMissingTerms

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038755#comment-13038755
 ] 

Michael McCandless commented on LUCENE-2972:


Good idea Yonik, I'll do that...

> Intermittent failure in TestFieldCacheTermsFilter.testMissingTerms
> --
>
> Key: LUCENE-2972
> URL: https://issues.apache.org/jira/browse/LUCENE-2972
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-2972.patch
>
>
> Running tests in while(1) I hit this:
> {noformat}
> NOTE: reproduce with: ant test -Dtestcase=TestFieldCacheTermsFilter 
> -Dtestmethod=testMissingTerms 
> -Dtests.seed=-1046382732738729184:5855929314778232889
> 1) testMissingTerms(org.apache.lucene.search.TestFieldCacheTermsFilter)
> java.lang.AssertionError: Must match 1 expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:91)
>   at org.junit.Assert.failNotEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:126)
>   at org.junit.Assert.assertEquals(Assert.java:470)
>   at 
> org.apache.lucene.search.TestFieldCacheTermsFilter.testMissingTerms(TestFieldCacheTermsFilter.java:63)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>   at org.junit.rules.TestWatchman$1.evaluate(TestWatchman.java:48)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>   at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1214)
>   at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1146)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:24)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:136)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
>   at org.junit.runner.JUnitCore.runMain(JUnitCore.java:98)
>   at org.junit.runner.JUnitCore.runMainAndExit(JUnitCore.java:53)
>   at org.junit.runner.JUnitCore.main(JUnitCore.java:45)
> {noformat}
> Unfortunately the seed doesn't [consistently] repro for me...




[jira] [Updated] (SOLR-2535) In Solr 3.1.0 the admin/file handler fails to show directory listings

2011-05-24 Thread Peter Wolanin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Wolanin updated SOLR-2535:


Description: 
In Solr 4.1.1, going to the path solr/admin/file I see an XML-formatted listing 
of the conf directory, like:
{noformat}
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int></lst>
<lst name="files">
  <lst name="..."><long name="size">1274</long><date name="modified">2011-03-06T20:42:54Z</date></lst>
  ...
</lst>
</response>
{noformat}

I can list the xslt sub-dir using solr/admin/files?file=/xslt


In Solr 3.1.0, both of these fail with a 500 error:
{noformat}
HTTP ERROR 500

Problem accessing /solr/admin/file/. Reason:

did not find a CONTENT object

java.io.IOException: did not find a CONTENT object
{noformat}

Looking at the code in class ShowFileRequestHandler, it seems like 3.1.0 should 
still handle directory listings if no file name is given, or if the file is a 
directory, so I am filing this as a bug.


  was:
In Solr 4.1.1, going to the path solr/admin/file I see and XML-formatted 
listing of the conf directory, like:
{noformat}
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int></lst>
<lst name="files">
  <lst name="..."><long name="size">1274</long><date name="modified">2011-03-06T20:42:54Z</date></lst>
  ...
</lst>
</response>
{noformat}

I can list the xslt sub-dir using solr/admin/files?file=/xslt


In Solr 3.1.0, both of these fail with a 500 error:
{noformat}
HTTP ERROR 500

Problem accessing /solr/admin/file/. Reason:

did not find a CONTENT object

java.io.IOException: did not find a CONTENT object
{noformat}

Looking at the code in class ShowFileRequestHandler, it seems like 3.1.0 should 
still handle directory listings if no file name is given, or if the file is a 
directory, so I am filing this as a bug.



> In Solr 3.1.0 the admin/file handler fails to show directory listings
> -
>
> Key: SOLR-2535
> URL: https://issues.apache.org/jira/browse/SOLR-2535
> Project: Solr
>  Issue Type: Bug
>  Components: SearchComponents - other
>Affects Versions: 3.1
> Environment: java 1.6, jetty
>Reporter: Peter Wolanin
> Fix For: 3.1.1, 3.2
>
>
> In Solr 4.1.1, going to the path solr/admin/file I see an XML-formatted 
> listing of the conf directory, like:
> {noformat}
> <response>
> <lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int></lst>
> <lst name="files">
>   <lst name="..."><long name="size">1274</long><date name="modified">2011-03-06T20:42:54Z</date></lst>
>   ...
> </lst>
> </response>
> {noformat}
> I can list the xslt sub-dir using solr/admin/files?file=/xslt
> In Solr 3.1.0, both of these fail with a 500 error:
> {noformat}
> HTTP ERROR 500
> Problem accessing /solr/admin/file/. Reason:
> did not find a CONTENT object
> java.io.IOException: did not find a CONTENT object
> {noformat}
> Looking at the code in class ShowFileRequestHandler, it seems like 3.1.0 
> should still handle directory listings if no file name is given, or if the 
> file is a directory, so I am filing this as a bug.
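The expected behaviour — fall back to a directory listing when no file name is given or the target is a directory — can be sketched as follows (hypothetical Python, not ShowFileRequestHandler's actual code; the function and return shapes are illustrative):

```python
import os


def handle_file_request(conf_dir, file_param=None):
    """Return ('listing', entries) for directories, ('content', bytes)
    for regular files -- the dispatch this bug report says is missing."""
    target = os.path.join(conf_dir, file_param) if file_param else conf_dir
    if os.path.isdir(target):            # no file given, or a directory
        return ("listing", sorted(os.listdir(target)))
    with open(target, "rb") as f:        # otherwise return the file body
        return ("content", f.read())
```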




Re: [jira] [Updated] (SOLR-2540) CommitWithin as an Update Request parameter

2011-05-24 Thread Jan Høydahl
Thanks,

What's your definition of a JIRA contributor vs just a member of the community 
uploading patches to JIRA? Would it make sense to assign issues I'm working on 
to me, even if I'm not a committer?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 24. mai 2011, at 19.05, Mark Miller wrote:

> Hey Jan - just realized you are not a JIRA contributor, so I've upped your 
> JIRA status. You should be able to modify existing issues and such now.
> 
> - Mark
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 





Apache Nutch / SOLR Ontology Plugin for Ontology-Focus Crawling?

2011-05-24 Thread Börteçin Ege
Hello, does anybody know whether Apache Nutch or Solr also has an *ontology 
plugin* for enabling *ontology-focused crawling*?

(http://wiki.apache.org/nutch/OntologyPlugin apparently only does *query 
refinement* to expand the possible result set.)


Re: [jira] [Updated] (SOLR-2540) CommitWithin as an Update Request parameter

2011-05-24 Thread Mark Miller
Yup - please feel free to assign to yourself and drive the issue to completion. 
At the moment, someone would then have to commit it for you - but that won't 
last long at all.

- Mark

Sent from my iPad

On May 24, 2011, at 4:30 PM, Jan Høydahl  wrote:

> Thanks,
> 
> What's your definition of a JIRA contributor vs just a member of the 
> community uploading patches to JIRA? Would it make sense to assign issues I'm 
> working on to me, even if I'm not a committer?
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
> On 24. mai 2011, at 19.05, Mark Miller wrote:
> 
>> Hey Jan - just realized you are not a JIRA contributor, so I've upped your 
>> JIRA status. You should be able to modify existing issues and such now.
>> 
>> - Mark
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 




[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-24 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch2.cs
SimpleFacetedSearch2.cs

Some comments.

> Contrib/Faceted Search
> --
>
> Key: LUCENENET-415
> URL: https://issues.apache.org/jira/browse/LUCENENET-415
> Project: Lucene.Net
>  Issue Type: New Feature
>Affects Versions: Lucene.Net 2.9.4
>Reporter: Digy
>Priority: Minor
> Attachments: PerformanceTest.cs, PerformanceTest.cs, 
> PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
> SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
> SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
> TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
> TestSimpleFacetedSearch2.cs, facet performance.xls, facet performance.xls, 
> facet performance.xls
>
>
> Since I see a lot of questions about faceted search in these days, I plan to 
> add a Faceted-Search project to contrib.
> DIGY



[jira] [Commented] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-24 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038777#comment-13038777
 ] 

Robert Muir commented on SOLR-2530:
---

bq. I think this is ready though, all tests pass.

Looks good to me. There is still the one nocommit about interned fields, but I 
don't know the answer to that one either! The rest of the patch looks great; 
good to finally get rid of the UTF16Result/incremental API and to have a CharsRef.

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch, 
> SOLR-2530.patch, SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes this extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtils. I 
> will upload a patch in a second.




[jira] [Commented] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-24 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038784#comment-13038784
 ] 

Simon Willnauer commented on SOLR-2530:
---

Yeah, I wonder if Yonik could look at this nocommit?

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch, 
> SOLR-2530.patch, SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes this extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtils. I 
> will upload a patch in a second.




[Lucene.Net] Adding proper System.IO.Stream support to Lucene.Net

2011-05-24 Thread Christopher Currens
All,

I've spent the past few days looking at what it would take to implement
proper streaming of data into and out of an index.  Fortunately, it hasn't
proven very difficult at all, leaving me with a solution that works very
nicely.  Now that I know it's possible, I wanted to discuss with the
community the best way to add this to the API.

Currently, it's set up so that a field can have a Stream value if it's binary 
(System.IO.Stream StreamValue()). I plan, wherever in Lucene a byte[] is used, 
to replace it with streaming functions internally. I think it's a good idea to 
keep the byte[] BinaryValue() as it is, but essentially replace it, by default, 
with a kind of lazy loading. In the current version of Lucene, if a user opens 
a document with a binary field, that entire field is loaded into memory.

The idea behind replacing the internals of FieldsReader.cs by passing a stream 
along instead of a byte[] is that people using the API to stream the data out 
will load no more into memory than they have to. People using the byte[] 
BinaryValue() function to get the binary data will actually see improved 
performance as well, since the byte array will be loaded when the method is 
called, instead of at the creation of the document.
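The lazy-loading idea — record where the bytes live at document-load time and materialize them only on demand — might look like this (an illustrative Python sketch; `LazyBinaryField` and its method names are hypothetical, not the proposed .NET API):

```python
import io


class LazyBinaryField:
    """Hold only (stream, offset, length) when the document is loaded;
    read the actual bytes only when a caller asks for them."""

    def __init__(self, stream, offset, length):
        self._stream, self._offset, self._length = stream, offset, length

    def stream_value(self):
        self._stream.seek(self._offset)   # caller reads incrementally
        return self._stream

    def binary_value(self):
        self._stream.seek(self._offset)   # materialized only on demand
        return self._stream.read(self._length)


data = io.BytesIO(b"headerPAYLOADtrailer")
field = LazyBinaryField(data, 6, 7)       # bytes 6..12 hold the value
print(field.binary_value())               # b'PAYLOAD'
```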

As a final note on binary data streaming: by streaming the data in, we 
obviously can't support compression of those fields. The compression in Lucene 
is poor anyway, as it isn't compression that can be done in blocks; it requires 
large amounts of memory, since it needs all the data in memory to do the 
compression, which is also done in a separate byte array. However, an ability I 
had briefly talked to Troy about in person was adding StreamFilters, so that 
data passed in is filtered first by a compression algorithm or similar before 
it's stored in the index. That doesn't really apply directly to the Lucene 
domain, but it does at least give the user the opportunity to do that while 
streaming data into Lucene.Net.

I also want to add proper TextReader support to Lucene.Net. A large difference 
between the Java and .NET versions of Lucene is that the Java version supports 
setting a field's value to a TextReader that is both analyzed and stored. 
Because the TextReader in .NET doesn't support resetting or seeking of the 
underlying stream, we can only analyze the text in Lucene; we can't store the 
field.

A solution that comes to mind would be creating a util class, something like 
SeekableTextReader, that inherits from TextReader and can be passed to the 
field, with special behavior that allows it to be reset and thus both analyzed 
and stored. Perhaps the largest downside to that solution is that, in order to 
keep the API the same while allowing the field to be stored, it would require 
fairly ugly checks like "if(reader is SeekableTextReader) //do this".

Perhaps a cleaner solution would be to add yet another value to the Field 
class that allows a SeekableTextReader to be passed. This way has its own 
downsides: now there are two methods that expect TextReaders, one that stores 
and one that doesn't, which seems rather confusing. But I suppose this is why I 
was looking for the community's opinion in the first place.


The more comments about this the better.  I think this could add some
much-needed functionality to Lucene.Net, and start setting its performance
apart from the Java version.


Thanks,
Christopher


[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038787#comment-13038787
 ] 

Simon Willnauer commented on SOLR-1942:
---

great work robert

here are some comments:

 * SchemaCodecProvider#listAll should also sync on the delegate to be consistent
 * CodecProviderFactory should get a jdoc string IMO
 * I think we need to fix SolrTestCase to enable the random codecs for Solr 
right? Or is it using the default provider if no provider is configured?



> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038797#comment-13038797
 ] 

Robert Muir commented on SOLR-1942:
---

{quote}
SchemaCodecProvider#listAll should also sync on the delegate to be consistent
CodecProviderFactory should get a jdoc string IMO
{quote}

I'll update the patch to fix both of these.

{quote}
I think we need to fix SolrTestCase to enable the random codecs for Solr right? 
Or is it using the default provider if no provider is configured?
{quote}

This is still working fine, in SolrCore if you do not configure a 
CodecProviderFactory, then we use CodecProvider.getDefault(), which is set 
randomly by LuceneTestCase. I added some prints to check and this is still ok.
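The fallback behavior described here (use the per-field codec if one is
configured, otherwise the default) can be sketched abstractly; the class
and codec names below are hypothetical, not the actual Solr/Lucene API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-field codec selection with a default
// fallback, mirroring the dispatch described in the comment above.
class CodecProviderSketch {
    private final Map<String, String> perField = new HashMap<>();
    private final String defaultCodec;

    CodecProviderSketch(String defaultCodec) { this.defaultCodec = defaultCodec; }

    void setFieldCodec(String field, String codec) { perField.put(field, codec); }

    /** Return the codec configured for a field, or the default. */
    String lookup(String field) {
        return perField.getOrDefault(field, defaultCodec);
    }
}
```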


> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[Lucene.Net] [jira] [Updated] (LUCENENET-415) Contrib/Faceted Search

2011-05-24 Thread Digy (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-415:
---

Attachment: TestSimpleFacetedSearch2.cs

cleaner test case

> Contrib/Faceted Search
> --
>
> Key: LUCENENET-415
> URL: https://issues.apache.org/jira/browse/LUCENENET-415
> Project: Lucene.Net
>  Issue Type: New Feature
>Affects Versions: Lucene.Net 2.9.4
>Reporter: Digy
>Priority: Minor
> Attachments: PerformanceTest.cs, PerformanceTest.cs, 
> PerformanceTest.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, 
> SimpleFacetedSearch.cs, SimpleFacetedSearch.cs, SimpleFacetedSearch2.cs, 
> SimpleFacetedSearch2.cs, TestSimpleFacetedSearch.cs, 
> TestSimpleFacetedSearch.cs, TestSimpleFacetedSearch2.cs, 
> TestSimpleFacetedSearch2.cs, TestSimpleFacetedSearch2.cs, facet 
> performance.xls, facet performance.xls, facet performance.xls
>
>
> Since I see a lot of questions about faceted search in these days, I plan to 
> add a Faceted-Search project to contrib.
> DIGY



[jira] [Updated] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-24 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated SOLR-2530:
--

Attachment: SOLR-2530.patch

updated patch to new suggest module and replaced the nocommit in SimpleFacets 
with a TODO. I will commit this soon if nobody objects.

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch, 
> SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes the extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtils. I 
> will upload a patch in a second.




[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038807#comment-13038807
 ] 

Simon Willnauer commented on SOLR-1942:
---

bq. I'll update the patch to fix both of these.

everything else is looking fine! +1 to commit once updated

{quote}

This is still working fine, in SolrCore if you do not configure a 
CodecProviderFactory, then we use CodecProvider.getDefault(), which is set 
randomly by LuceneTestCase. I added some prints to check and this is still ok.

{quote}

cool, thanks for ensuring that :)

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Updated] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-1942:
--

Attachment: SOLR-1942.patch

Updated patch with Simon's suggestions

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038813#comment-13038813
 ] 

Robert Muir commented on SOLR-1942:
---

Thanks for reviewing Simon, I'll commit this soon.



> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Commented] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-24 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038817#comment-13038817
 ] 

Mark Miller commented on SOLR-2530:
---

bq. nocommit in SimpleFacets with a TODO.

I've looked into this in the past - it's fine. Perhaps it's safer to change 
it (equals or enum or whatever), but all current usage uses the same 
constant.

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch, 
> SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes the extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtils. I 
> will upload a patch in a second.




[jira] [Resolved] (SOLR-2539) VectorValueSource returnes floatVal of DocValues is wrong

2011-05-24 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-2539.


Resolution: Fixed

committed, thanks Tom!
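For anyone skimming the thread: the byteVal calls were wrong because
narrowing a float to a byte drops the fractional part and wraps values
outside the byte range. A minimal, self-contained illustration (hypothetical
helper, not Solr code):

```java
// Illustrates why returning byteVal where floatVal is expected corrupts
// values: Java's float -> byte narrowing truncates and wraps.
class ByteVsFloat {
    static float viaByte(float v) { return (byte) v; }  // the buggy path
    static float viaFloat(float v) { return v; }        // the fixed path
}
```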

> VectorValueSource returnes floatVal of DocValues is wrong
> -
>
> Key: SOLR-2539
> URL: https://issues.apache.org/jira/browse/SOLR-2539
> Project: Solr
>  Issue Type: Bug
>  Components: search
> Environment: JDK1.6/Tomcat6
>Reporter: tom liu
>
> @Override
> public void floatVal(int doc, float[] vals) {
>   vals[0] = x.byteVal(doc);
>   vals[1] = y.byteVal(doc);
> }
> should be:
> @Override
> public void floatVal(int doc, float[] vals) {
>   vals[0] = x.floatVal(doc);
>   vals[1] = y.floatVal(doc);
> }




[jira] [Updated] (SOLR-2539) VectorValueSource returnes floatVal of DocValues is wrong

2011-05-24 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-2539:
---

Fix Version/s: 3.2

> VectorValueSource returnes floatVal of DocValues is wrong
> -
>
> Key: SOLR-2539
> URL: https://issues.apache.org/jira/browse/SOLR-2539
> Project: Solr
>  Issue Type: Bug
>  Components: search
> Environment: JDK1.6/Tomcat6
>Reporter: tom liu
> Fix For: 3.2
>
>
> @Override
> public void floatVal(int doc, float[] vals) {
>   vals[0] = x.byteVal(doc);
>   vals[1] = y.byteVal(doc);
> }
> should be:
> @Override
> public void floatVal(int doc, float[] vals) {
>   vals[0] = x.floatVal(doc);
>   vals[1] = y.floatVal(doc);
> }




[jira] [Commented] (LUCENE-2762) Don't leak deleted open file handles with pooled readers

2011-05-24 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038847#comment-13038847
 ] 

Michael McCandless commented on LUCENE-2762:


We have a test, TestNRTThreads, that stress-tests NRT to try to catch issues 
like this.

I've backported to 3.0.x, and was able to run it for 30 minutes with max 1024 
file handles, successfully... so if there is a file handle leak, this test 
failed to uncover it.

Not sure what's up... can you try to boil down your test case to a small test 
showing the problem?

Note that when you incRef the searcher, you have to do that sync'd somehow 
with your reopen logic, to prevent the final decRef from running at the 
same time.  (I don't think that's related to the file handle issue, but 
it's important to get right, or else you have a thread hazard.)
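The acquire/release discipline described in the last paragraph can be
sketched with plain reference counting (illustrative classes only, not
Lucene's actual SearcherManager or IndexSearcher):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the pattern: acquire() must incRef and return the
// current searcher atomically with respect to swaps, so the final decRef
// from a reopen can never close a searcher another thread just obtained.
class RefCounted<T> {
    final T resource;
    private final AtomicInteger refs = new AtomicInteger(1); // manager's own ref
    private volatile boolean closed = false;
    RefCounted(T resource) { this.resource = resource; }
    void incRef() { refs.incrementAndGet(); }
    void decRef() { if (refs.decrementAndGet() == 0) closed = true; }
    boolean isClosed() { return closed; }
}

class SearcherManagerSketch<T> {
    private RefCounted<T> current;
    SearcherManagerSketch(T initial) { current = new RefCounted<>(initial); }

    /** Atomically incRef and hand out the current searcher. */
    synchronized RefCounted<T> acquire() {
        current.incRef();
        return current;
    }

    /** Swap in a new searcher; release the manager's ref on the old one. */
    synchronized void swap(T next) {
        RefCounted<T> old = current;
        current = new RefCounted<>(next);
        old.decRef();   // closes only after all outstanding acquires release
    }

    static void release(RefCounted<?> ref) { ref.decRef(); }
}
```

Making acquire() synchronized with swap() is the "sync'd somehow" part:
without it, a thread could grab the old searcher between the swap and its
incRef.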

> Don't leak deleted open file handles with pooled readers
> 
>
> Key: LUCENE-2762
> URL: https://issues.apache.org/jira/browse/LUCENE-2762
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 2.9.4, 3.0.3, 3.1, 4.0
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 2.9.4, 3.0.3, 3.1, 4.0
>
> Attachments: LUCENE-2762.patch
>
>
> If you have CFS enabled today, and pooling is enabled (either directly
> or because you've pulled an NRT reader), IndexWriter will hold open
> SegmentReaders against the non-CFS format of each merged segment.
> So even if you close all NRT readers you've pulled from the writer,
> you'll still see file handles open against files that have been
> deleted.
> This count will not grow unbounded, since it's limited by the number
> of segments in the index, but it's still a serious problem since the
> app had turned off CFS in the first place presumably to avoid risk of
> too-many-open-files.  It's also bad because it ties up disk space
> since these files would otherwise be deleted.




[jira] [Resolved] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved SOLR-1942.
---

Resolution: Fixed

Committed revision 1127313.


> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Commented] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-24 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038852#comment-13038852
 ] 

Simon Willnauer commented on SOLR-2530:
---

bq. I've looked into this in the past - its fine. Perhaps it's safer to change 
it (equals or enum or whatever), but all current usage uses the same constant.

alright, I changed the TODO to point out that we should cut over to an enum 
or something similar. This is totally unrelated, but I wanted to make sure 
it doesn't get lost.

I will commit in a minute.

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch, 
> SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes the extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtils. I 
> will upload a patch in a second.




[jira] [Commented] (SOLR-1942) Ability to select codec per field

2011-05-24 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038864#comment-13038864
 ] 

Simon Willnauer commented on SOLR-1942:
---

Awesome! :) finally it's in...

> Ability to select codec per field
> -
>
> Key: SOLR-1942
> URL: https://issues.apache.org/jira/browse/SOLR-1942
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0
>Reporter: Yonik Seeley
>Assignee: Robert Muir
> Fix For: 4.0
>
> Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, 
> SOLR-1942.patch
>
>
> We should use PerFieldCodecWrapper to allow users to select the codec 
> per-field.




[jira] [Assigned] (LUCENE-3018) Lucene Native Directory implementation need automated build

2011-05-24 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer reassigned LUCENE-3018:
---

Assignee: Simon Willnauer  (was: Varun Thacker)

> Lucene Native Directory implementation need automated build
> ---
>
> Key: LUCENE-3018
> URL: https://issues.apache.org/jira/browse/LUCENE-3018
> Project: Lucene - Java
>  Issue Type: Wish
>  Components: general/build
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
> Fix For: 4.0
>
> Attachments: LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, 
> LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, 
> LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, 
> cpptasks-1.0b5.jar, cpptasks-LICENSE-ASL.txt, cpptasks.jar, cpptasks.jar
>
>
> Currently the native directory impl in contrib/misc requires manual action 
> to compile the C code, (partially) documented in 
> https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html
> It would be nice if we had an Ant task and documentation for all platforms 
> on how to compile it and set up the prerequisites.




[jira] [Resolved] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-24 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved SOLR-2530.
---

Resolution: Fixed

Committed in revision 1127326.


> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch, 
> SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes the extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtils. I 
> will upload a patch in a second.




[jira] [Resolved] (LUCENE-3018) Lucene Native Directory implementation need automated build

2011-05-24 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-3018.
-

Resolution: Fixed

Committed in revision 1127328.

thanks varun


> Lucene Native Directory implementation need automated build
> ---
>
> Key: LUCENE-3018
> URL: https://issues.apache.org/jira/browse/LUCENE-3018
> Project: Lucene - Java
>  Issue Type: Wish
>  Components: general/build
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
> Fix For: 4.0
>
> Attachments: LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, 
> LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, 
> LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, 
> cpptasks-1.0b5.jar, cpptasks-LICENSE-ASL.txt, cpptasks.jar, cpptasks.jar
>
>
> Currently the native directory impl in contrib/misc requires manual action 
> to compile the C code, (partially) documented in 
> https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html
> It would be nice if we had an Ant task and documentation for all platforms 
> on how to compile it and set up the prerequisites.




[jira] [Commented] (SOLR-2530) Remove Noggit CharArr from FieldType

2011-05-24 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038875#comment-13038875
 ] 

Simon Willnauer commented on SOLR-2530:
---

I am backporting this to 3.x now

> Remove Noggit CharArr from FieldType
> 
>
> Key: SOLR-2530
> URL: https://issues.apache.org/jira/browse/SOLR-2530
> Project: Solr
>  Issue Type: Improvement
>  Components: Schema and Analysis
>Affects Versions: 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Minor
>  Labels: api-change
> Fix For: 4.0
>
> Attachments: SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch, 
> SOLR-2530.patch, SOLR-2530.patch, SOLR-2530.patch
>
>
> FieldType#indexedToReadable(BytesRef, CharArr) uses a noggit dependency that 
> also spreads into ByteUtils. The uses of this method all convert to String, 
> which makes the extra reference and the dependency unnecessary. I refactored 
> it to simply return String and removed ByteUtils entirely. The only leftover 
> from ByteUtils is a constant; I moved that one to Lucene's UnicodeUtils. I 
> will upload a patch in a second.




[jira] [Created] (LUCENE-3140) Backport FSTs to 3.x

2011-05-24 Thread Michael McCandless (JIRA)
Backport FSTs to 3.x


 Key: LUCENE-3140
 URL: https://issues.apache.org/jira/browse/LUCENE-3140
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 3.2







[jira] [Updated] (LUCENE-3140) Backport FSTs to 3.x

2011-05-24 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3140:
---

Attachment: LUCENE-3140.patch

Initial patch.  TestFSTs passes...

I pulled back DataInput/Output too.

Lucene backwards tests failed because IO.copyBytes changed from IndexInput to 
DataInput...

> Backport FSTs to 3.x
> 
>
> Key: LUCENE-3140
> URL: https://issues.apache.org/jira/browse/LUCENE-3140
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 3.2
>
> Attachments: LUCENE-3140.patch
>
>





[jira] [Created] (LUCENE-3141) FastVectorHighlighter - expose FieldFragList.fragInfo for user-customizable FragmentsBuilder

2011-05-24 Thread Sujit Pal (JIRA)
FastVectorHighlighter - expose FieldFragList.fragInfo for user-customizable 
FragmentsBuilder


 Key: LUCENE-3141
 URL: https://issues.apache.org/jira/browse/LUCENE-3141
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 3.1
 Environment: Lucene 3.1
Reporter: Sujit Pal
Priority: Minor


I needed to build a custom highlightable snippet: the snippet should start 
with the sentence containing the first match, then continue for 250 
characters.

So I created a custom FragmentsBuilder extending SimpleFragmentsBuilder and 
overrode the createFragments(IndexReader reader, int docId, String 
fieldName, FieldFragList fieldFragList) method; a unit test containing the 
code is attached to the JIRA.

To get this to work, I needed to expose (make public) the 
FieldFragList.fragInfo member variable. It is currently package-private, so 
only FragmentsBuilder implementations within the lucene-highlighter 
o.a.l.s.vectorhighlight package (such as SimpleFragmentsBuilder) can access 
it. Since I am just using lucene-highlighter.jar as an external dependency 
of my application, the simplest way to access FieldFragList.fragInfo from 
my class was to make it public.
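The snippet rule described above (start at the sentence containing the
first match, run for up to 250 characters) can be sketched independently of
the highlighter API. This standalone helper is hypothetical and only
illustrates the fragment-selection logic a custom FragmentsBuilder would
implement:

```java
// Hypothetical sketch of the snippet rule: begin at the start of the
// sentence containing the first occurrence of the term, then take up to
// maxLen characters. Not the actual FastVectorHighlighter code.
class SnippetSketch {
    static String snippet(String text, String term, int maxLen) {
        int hit = text.indexOf(term);
        if (hit < 0) return text.substring(0, Math.min(maxLen, text.length()));
        // Walk back to the start of the sentence containing the match.
        int start = hit;
        while (start > 0 && ".!?".indexOf(text.charAt(start - 1)) < 0) start--;
        while (start < text.length() && text.charAt(start) == ' ') start++;
        int end = Math.min(start + maxLen, text.length());
        return text.substring(start, end);
    }
}
```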





[jira] [Updated] (LUCENE-3141) FastVectorHighlighter - expose FieldFragList.fragInfo for user-customizable FragmentsBuilder

2011-05-24 Thread Sujit Pal (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sujit Pal updated LUCENE-3141:
--

Attachment: lucene-3141-patch.diff

1) Patch of the change to FieldFragList, taken from the root of the 3.1 
release branch (svn co 
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_1/).



> FastVectorHighlighter - expose FieldFragList.fragInfo for user-customizable 
> FragmentsBuilder
> 
>
> Key: LUCENE-3141
> URL: https://issues.apache.org/jira/browse/LUCENE-3141
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/highlighter
>Affects Versions: 3.1
> Environment: Lucene 3.1
>Reporter: Sujit Pal
>Priority: Minor
>  Labels: features, lucene,
> Attachments: lucene-3141-patch.diff
>
>
> Needed to build a custom highlightable snippet - snippet should start with 
> the sentence containing the first match, then continue for 250 characters.
> So created a custom FragmentsBuilder extending SimpleFragmentsBuilder and 
> overriding the createFragments(IndexReader reader, int docId, String 
> fieldName, FieldFragList fieldFragList) method - unit test containing the 
> code is attached to the JIRA.
> To get this to work, needed to expose (make public) the 
> FieldFragList.fragInfo member variable. This is currently package private, so 
> only FragmentsBuilder implementations within the lucene-highlighter 
> o.a.l.s.vectorhighlight package (such as SimpleFragmentsBuilder) can access 
> it. Since I am just using the lucene-highlighter.jar as an external 
> dependency to my application, the simplest way to access 
> FieldFragList.fragInfo in my class was to make it public.




[jira] [Updated] (LUCENE-3141) FastVectorHighlighter - expose FieldFragList.fragInfo for user-customizable FragmentsBuilder

2011-05-24 Thread Sujit Pal (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sujit Pal updated LUCENE-3141:
--

Attachment: LIABookTest.java

Unit test demonstrating use of the FieldFragList.fragInfo by an external (out 
of vectorhighlight package) FragmentsBuilder.


> FastVectorHighlighter - expose FieldFragList.fragInfo for user-customizable 
> FragmentsBuilder
> 
>
> Key: LUCENE-3141
> URL: https://issues.apache.org/jira/browse/LUCENE-3141
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/highlighter
>Affects Versions: 3.1
> Environment: Lucene 3.1
>Reporter: Sujit Pal
>Priority: Minor
>  Labels: features, lucene,
> Attachments: LIABookTest.java, lucene-3141-patch.diff
>
>
> Needed to build a custom highlightable snippet - snippet should start with 
> the sentence containing the first match, then continue for 250 characters.
> So created a custom FragmentsBuilder extending SimpleFragmentsBuilder and 
> overriding the createFragments(IndexReader reader, int docId, String 
> fieldName, FieldFragList fieldFragList) method - unit test containing the 
> code is attached to the JIRA.
> To get this to work, needed to expose (make public) the 
> FieldFragList.fragInfo member variable. This is currently package private, so 
> only FragmentsBuilder implementations within the lucene-highlighter 
> o.a.l.s.vectorhighlight package (such as SimpleFragmentsBuilder) can access 
> it. Since I am just using the lucene-highlighter.jar as an external 
> dependency to my application, the simplest way to access 
> FieldFragList.fragInfo in my class was to make it public.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (SOLR-2539) VectorValueSource returns floatVal of DocValues is wrong

2011-05-24 Thread tom liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tom liu closed SOLR-2539.
-


fixed

> VectorValueSource returns floatVal of DocValues is wrong
> -
>
> Key: SOLR-2539
> URL: https://issues.apache.org/jira/browse/SOLR-2539
> Project: Solr
>  Issue Type: Bug
>  Components: search
> Environment: JDK1.6/Tomcat6
>Reporter: tom liu
> Fix For: 3.2
>
>
> @Override
> public void floatVal(int doc, float[] vals) {
>   vals[0] = x.byteVal(doc);
>   vals[1] = y.byteVal(doc);
> }
> should be:
> @Override
> public void floatVal(int doc, float[] vals) {
>   vals[0] = x.floatVal(doc);
>   vals[1] = y.floatVal(doc);
> }
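The root cause of the wrong values is the implicit float-to-byte narrowing in the delegation. A minimal standalone sketch (illustrative class, not Solr's actual VectorValueSource/DocValues API) of what byteVal loses:

```java
// Illustrative sketch only (not Solr's actual API): shows why delegating
// floatVal to byteVal silently corrupts values. A float-to-byte cast first
// truncates toward zero, then keeps only the low 8 bits.
public class VectorValueBug {
  // stand-ins for the per-document x/y value sources
  static final float[] X = {1.75f, 300.5f};
  static final float[] Y = {-2.25f, 0.5f};

  static float floatVal(float[] src, int doc) { return src[doc]; }
  static byte byteVal(float[] src, int doc) { return (byte) src[doc]; }

  // buggy shape: fills the vector from byteVal
  static void buggyFloatVal(int doc, float[] vals) {
    vals[0] = byteVal(X, doc);
    vals[1] = byteVal(Y, doc);
  }

  // fixed shape: delegates to floatVal, as the report suggests
  static void fixedFloatVal(int doc, float[] vals) {
    vals[0] = floatVal(X, doc);
    vals[1] = floatVal(Y, doc);
  }

  public static void main(String[] args) {
    float[] buggy = new float[2];
    float[] fixed = new float[2];
    buggyFloatVal(1, buggy);  // X[1] = 300.5f: truncated to 300, wrapped to 44
    fixedFloatVal(1, fixed);
    System.out.println(buggy[0] + " vs " + fixed[0]);  // prints: 44.0 vs 300.5
  }
}
```

Fractional parts are dropped and any magnitude above 127 wraps, so the bug corrupts most real coordinate values, not just large ones.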

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3139) LuceneTestCase.afterClass does not print enough information if a temp-test-dir fails to delete

2011-05-24 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038948#comment-13038948
 ] 

Shai Erera commented on LUCENE-3139:


I think I've found the problem: MockIndexOutputWrapper did not close the 
delegate if dir.maybeThrowDeterministicException() actually threw an exception. 
Here's a patch that fixes it:

{code}
Index: lucene/src/test-framework/org/apache/lucene/store/MockIndexOutputWrapper.java
===================================================================
--- lucene/src/test-framework/org/apache/lucene/store/MockIndexOutputWrapper.java	(revision 1127062)
+++ lucene/src/test-framework/org/apache/lucene/store/MockIndexOutputWrapper.java	(working copy)
@@ -45,20 +45,23 @@
 
   @Override
   public void close() throws IOException {
-    dir.maybeThrowDeterministicException();
-    delegate.close();
-    if (dir.trackDiskUsage) {
-      // Now compute actual disk usage & track the maxUsedSize
-      // in the MockDirectoryWrapper:
-      long size = dir.getRecomputedActualSizeInBytes();
-      if (size > dir.maxUsedSize) {
-        dir.maxUsedSize = size;
+    try {
+      dir.maybeThrowDeterministicException();
+    } finally {
+      delegate.close();
+      if (dir.trackDiskUsage) {
+        // Now compute actual disk usage & track the maxUsedSize
+        // in the MockDirectoryWrapper:
+        long size = dir.getRecomputedActualSizeInBytes();
+        if (size > dir.maxUsedSize) {
+          dir.maxUsedSize = size;
+        }
       }
+      synchronized(dir) {
+        dir.openFileHandles.remove(this);
+        dir.openFilesForWrite.remove(name);
+      }
     }
-    synchronized(dir) {
-      dir.openFileHandles.remove(this);
-      dir.openFilesForWrite.remove(name);
-    }
   }
 
   @Override
{code}

Maybe we should solve it by moving delegate.close() before 
dir.maybeThrowDeterministicException(), instead of using the try-finally?
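For reference, the resource-leak pattern in isolation, as a minimal standalone sketch (not the actual Lucene test-framework classes): if a fault-injection hook can throw before delegate.close(), the delegate stays open unless the close happens in a finally.

```java
import java.io.Closeable;
import java.io.IOException;

// Illustrative sketch only (not Lucene code) of the leak the patch fixes:
// when a fault-injection hook throws before delegate.close(), the delegate
// is never closed unless the close is moved into a finally block.
public class CloseLeakDemo {
  static class Delegate implements Closeable {
    boolean closed = false;
    public void close() { closed = true; }
  }

  static void maybeThrow(boolean doThrow) throws IOException {
    if (doThrow) throw new IOException("injected fault");
  }

  // buggy shape: the delegate leaks when maybeThrow fires
  static void buggyClose(Delegate d, boolean fault) throws IOException {
    maybeThrow(fault);
    d.close();
  }

  // patched shape: the delegate is closed even when maybeThrow fires
  static void fixedClose(Delegate d, boolean fault) throws IOException {
    try {
      maybeThrow(fault);
    } finally {
      d.close();
    }
  }

  public static void main(String[] args) {
    Delegate a = new Delegate();
    try { buggyClose(a, true); } catch (IOException ignored) {}
    Delegate b = new Delegate();
    try { fixedClose(b, true); } catch (IOException ignored) {}
    System.out.println("buggy leaked=" + !a.closed + " fixed leaked=" + !b.closed);
    // prints: buggy leaked=true fixed leaked=false
  }
}
```

Reordering (close first, then throw) also fixes the leak, at the cost of no longer simulating a failure during close itself.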

> LuceneTestCase.afterClass does not print enough information if a 
> temp-test-dir fails to delete
> --
>
> Key: LUCENE-3139
> URL: https://issues.apache.org/jira/browse/LUCENE-3139
> Project: Lucene - Java
>  Issue Type: Test
>  Components: general/test
>Reporter: Shai Erera
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3139.patch, LUCENE-3139.patch
>
>
> I've hit an exception from LTC.afterClass when _TestUtil.rmDir failed (on 
> write.lock, as if some test did not release resources). However, I had no 
> idea which test caused that (i.e. opened the temp directory and did not 
> release resources).
> I think we should do the following:
> * Track in LTC a map from dirName -> StackTraceElement
> * In afterClass, if _TestUtil.rmDir fails, print the STE of that particular 
> dir, so we know where this directory was created
> * Make tempDirs private and create an accessor method, so that we control 
> the inserts to this map (today the Set is updated by LTC, _TestUtils and 
> TestBackwards!)
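The suggested tracking could look roughly like this minimal sketch (illustrative names, not the actual LuceneTestCase implementation): record the creation stack trace when a temp dir is registered, and report it when cleanup fails.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: map each temp-dir name to the stack trace at
// registration time, so a failed rmDir in afterClass can report which test
// created the directory.
public class TempDirTracker {
  private static final Map<String, StackTraceElement[]> tempDirs = new HashMap<>();

  // accessor-controlled registration, as the issue proposes for tempDirs
  public static void registerTempDir(String dirName) {
    tempDirs.put(dirName, Thread.currentThread().getStackTrace());
  }

  // called when deletion fails, to point at the creating test
  public static String creationSiteOf(String dirName) {
    StackTraceElement[] trace = tempDirs.get(dirName);
    if (trace == null || trace.length < 3) return "(unknown)";
    // trace[0] is getStackTrace and trace[1] is registerTempDir;
    // the caller is the first interesting frame
    return trace[2].toString();
  }

  public static void main(String[] args) {
    registerTempDir("lucenetest-1234.tmp");
    System.out.println("could not delete lucenetest-1234.tmp, created at: "
        + creationSiteOf("lucenetest-1234.tmp"));
  }
}
```

Funneling all registrations through one method is what makes the map reliable; direct inserts into the Set from several classes would bypass the trace capture.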

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org