[jira] [Commented] (SOLR-2690) Date Faceting or Range Faceting doesn't take timezone into account.

2011-12-07 Thread Shotaro Kamio (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165058#comment-13165058
 ] 

Shotaro Kamio commented on SOLR-2690:
-

David, we also faced the date facet gap (rounding) issue. If you could post your 
patch here, it would be very helpful.


> Date Faceting or Range Faceting doesn't take timezone into account.
> ---
>
> Key: SOLR-2690
> URL: https://issues.apache.org/jira/browse/SOLR-2690
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 3.3
>Reporter: David Schlotfeldt
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Timezone needs to be taken into account when doing date math. Currently it 
> isn't: DateMathParser instances are always constructed with UTC. This is a 
> huge issue when it comes to faceting. Depending on your timezone, 
> daylight-saving time changes the length of a month, so a facet gap of 
> +1MONTH differs depending on the timezone and the time of the year.
> I believe the issue is very simple to fix. There are three places in the code 
> where DateMathParser is created, and all three are configured with UTC. If a 
> user could specify the TimeZone to pass into DateMathParser, this faceting 
> issue would be resolved.
> Though it would be nice if we could always specify the timezone 
> DateMathParser uses (since date math DOES depend on timezone), it's really 
> only essential that we can affect the DateMathParser that SimpleFacets uses 
> when dealing with the gap of the date facets.
> Another solution is to expand the syntax of the expressions DateMathParser 
> understands. For example, we could allow "(?timeZone=VALUE)" to be added 
> anywhere within an expression, where VALUE is the id of the timezone. When 
> DateMathParser reads this, it sets the timezone on the Calendar it is using.
> Two examples:
> - "(?timeZone=America/Chicago)NOW/YEAR"
> - "(?timeZone=America/Chicago)+1MONTH"
> I would be more than happy to modify DateMathParser and provide a patch. I 
> just need a committer to agree this needs to be resolved, and a decision 
> needs to be made on the syntax used.
> Thanks!
> David
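The month-length effect described in the report can be reproduced with plain
java.util.Calendar. This is a standalone illustration (not the patch discussed
on the issue): it measures the number of hours in March 2011 in UTC versus
America/Chicago, where daylight-saving time begins on March 13.

```java
import java.util.Calendar;
import java.util.TimeZone;

public class DateMathTz {
    /** Hours between the 1st of the given month and the 1st of the next, in tz. */
    static long hoursInMonth(String tz, int year, int month) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone(tz));
        cal.clear();
        cal.set(year, month, 1);                 // local midnight, 1st of month
        long start = cal.getTimeInMillis();
        cal.add(Calendar.MONTH, 1);              // local midnight, 1st of next month
        return (cal.getTimeInMillis() - start) / 3600000L;
    }

    public static void main(String[] args) {
        // March 2011 has 31 days = 744 hours in UTC, but the spring-forward
        // transition makes the same local month one hour shorter in Chicago.
        System.out.println(hoursInMonth("UTC", 2011, Calendar.MARCH));             // 744
        System.out.println(hoursInMonth("America/Chicago", 2011, Calendar.MARCH)); // 743
    }
}
```

Because DateMathParser always rounds and adds against a UTC Calendar, a +1MONTH
facet gap silently uses the 744-hour month even for users in Chicago.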

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2953) Introducing hit Count as an alternative to score

2011-12-07 Thread Kaleem Ahmed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165050#comment-13165050
 ] 

Kaleem Ahmed commented on SOLR-2953:


I don't think changing the Similarity does it all; see this link: 
http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/package-summary.html#changingSimilarity

It says:
{quote}

Changing Scoring — Expert Level

Changing scoring is an expert level task, so tread carefully and be prepared to 
share your code if you want help.

With the warning out of the way, it is possible to change a lot more than just 
the Similarity when it comes to scoring in Lucene. Lucene's scoring is a 
complex mechanism that is grounded by three main classes:

Query — The abstract object representation of the user's information need.
Weight — The internal interface representation of the user's Query, so that 
Query objects may be reused.
Scorer — An abstract class containing common functionality for scoring. 
Provides both scoring and explanation capabilities.{quote}
 I mainly changed the Scorer and Query classes of the different query types to 
achieve it

> Introducing hit Count as an alternative to score 
> -
>
> Key: SOLR-2953
> URL: https://issues.apache.org/jira/browse/SOLR-2953
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.0
>Reporter: Kaleem Ahmed
>  Labels: features
> Fix For: 4.0
>
>   Original Estimate: 1,008h
>  Remaining Estimate: 1,008h
>
> As of now we have the score as the relevancy factor for a query against a 
> document, and this score is relative to the number of documents in the index. 
> Why not have another relevancy measure, say "hitCount", which is absolute for 
> a given document and a given query and doesn't depend on the number of 
> documents in the index? This would help a lot for frequently changing 
> indexes, where the search rules are predefined along with the relevancy 
> threshold a document must reach to qualify for that query (search rule). 
> Ex: consider a use case where a list of queries is formed with a threshold 
> number for each query, and these are searched on a frequently updated index 
> to get the documents that score above the threshold, i.e. when a document's 
> relevancy factor crosses the threshold for a query, the document is said to 
> be qualified for that query. 
> For the above use case to be satisfied, the score shouldn't change every time 
> the index gets updated with new documents. So we introduce a new feature 
> called "hitCount", which represents the relevancy of a document against a 
> query and is absolute (won't change with index size). 
> This hitCount is a positive integer and is calculated as follows. 
> Ex: a document with the text "the quick fox jumped over the lazy dog, while 
> the lazy dog was too lazy to care": 
> 1. for the query "lazy AND dog", the hitCount will be (number of occurrences 
> of "lazy" in the document) + (number of occurrences of "dog" in the document) 
> => 3 + 2 => 5 
> 2. for the phrase query \"lazy dog\", the hitCount will be (number of 
> occurrences of the exact phrase "lazy dog" in the document) => 2
> This would be very useful as an alternative scoring mechanism.
> I already implemented this whole thing in the Solr source code (that I 
> downloaded) and we are using it. So far it's going well. 
> It would be really great if this feature were added to trunk (original Solr) 
> so that we don't have to reimplement the changes every time a new version is 
> released, and others could benefit from it as well. 
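The arithmetic in the two examples can be sketched with plain string tokens.
This is a minimal illustration of the proposed counting rule only (the poster's
actual patch is not attached to the issue, so the names here are hypothetical):

```java
import java.util.Arrays;

public class HitCount {
    /** Number of tokens equal to term. */
    static int countOccurrences(String[] tokens, String term) {
        int n = 0;
        for (String t : tokens) if (t.equals(term)) n++;
        return n;
    }

    /** Number of exact, consecutive occurrences of the phrase. */
    static int countPhrase(String[] tokens, String[] phrase) {
        int n = 0;
        for (int i = 0; i + phrase.length <= tokens.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(tokens, i, i + phrase.length), phrase)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String[] doc = ("the quick fox jumped over the lazy dog while the "
                + "lazy dog was too lazy to care").split(" ");
        // "lazy AND dog": 3 occurrences of "lazy" + 2 of "dog" = 5
        System.out.println(countOccurrences(doc, "lazy") + countOccurrences(doc, "dog"));
        // phrase "lazy dog": 2 exact occurrences
        System.out.println(countPhrase(doc, new String[]{"lazy", "dog"}));
    }
}
```

Unlike the Lucene score, these counts depend only on the document and the query,
so they stay fixed as the index grows.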




[jira] [Issue Comment Edited] (SOLR-2953) Introducing hit Count as an alternative to score

2011-12-07 Thread Kaleem Ahmed (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165050#comment-13165050
 ] 

Kaleem Ahmed edited comment on SOLR-2953 at 12/8/11 6:58 AM:
-

I don't think changing the Similarity does it all; see this link: 
http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/package-summary.html#changingSimilarity

It says:
{quote}

Changing Scoring — Expert Level

Changing scoring is an expert level task, so tread carefully and be prepared to 
share your code if you want help.

With the warning out of the way, it is possible to change a lot more than just 
the Similarity when it comes to scoring in Lucene. Lucene's scoring is a 
complex mechanism that is grounded by three main classes:

Query — The abstract object representation of the user's information need.
Weight — The internal interface representation of the user's Query, so that 
Query objects may be reused.
Scorer — An abstract class containing common functionality for scoring. 
Provides both scoring and explanation capabilities.{quote}
 I mainly changed the Scorer and Query classes of the different query types to 
achieve it.

I will post the patch soon.

  was (Author: kaleemxy):
I don't think changing the similarity does it all.. refer this link 
http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/package-summary.html#changingSimilarity

it says
{quote}

Changing Scoring — Expert Level

Changing scoring is an expert level task, so tread carefully and be prepared to 
share your code if you want help.

With the warning out of the way, it is possible to change a lot more than just 
the Similarity when it comes to scoring in Lucene. Lucene's scoring is a 
complex mechanism that is grounded by three main classes:

Query — The abstract object representation of the user's information need.
Weight — The internal interface representation of the user's Query, so that 
Query objects may be reused.
Scorer — An abstract class containing common functionality for scoring. 
Provides both scoring and explanation capabilities.{quote}
 I mainly changed the scorers and query classes of different queries to achieve 
it
  
> Introducing hit Count as an alternative to score 
> -
>
> Key: SOLR-2953
> URL: https://issues.apache.org/jira/browse/SOLR-2953
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.0
>Reporter: Kaleem Ahmed
>  Labels: features
> Fix For: 4.0
>
>   Original Estimate: 1,008h
>  Remaining Estimate: 1,008h
>
> As of now we have the score as the relevancy factor for a query against a 
> document, and this score is relative to the number of documents in the index. 
> Why not have another relevancy measure, say "hitCount", which is absolute for 
> a given document and a given query and doesn't depend on the number of 
> documents in the index? This would help a lot for frequently changing 
> indexes, where the search rules are predefined along with the relevancy 
> threshold a document must reach to qualify for that query (search rule). 
> Ex: consider a use case where a list of queries is formed with a threshold 
> number for each query, and these are searched on a frequently updated index 
> to get the documents that score above the threshold, i.e. when a document's 
> relevancy factor crosses the threshold for a query, the document is said to 
> be qualified for that query. 
> For the above use case to be satisfied, the score shouldn't change every time 
> the index gets updated with new documents. So we introduce a new feature 
> called "hitCount", which represents the relevancy of a document against a 
> query and is absolute (won't change with index size). 
> This hitCount is a positive integer and is calculated as follows. 
> Ex: a document with the text "the quick fox jumped over the lazy dog, while 
> the lazy dog was too lazy to care": 
> 1. for the query "lazy AND dog", the hitCount will be (number of occurrences 
> of "lazy" in the document) + (number of occurrences of "dog" in the document) 
> => 3 + 2 => 5 
> 2. for the phrase query \"lazy dog\", the hitCount will be (number of 
> occurrences of the exact phrase "lazy dog" in the document) => 2
> This would be very useful as an alternative scoring mechanism.
> I already implemented this whole thing in the Solr source code (that I 
> downloaded) and we are using it. So far it's going well. 
> It would be really great if this feature were added to trunk (original Solr) 
> so that we don't have to reimplement the changes every time a new version is 
> released, and others could benefit from it as well. 


[jira] [Commented] (SOLR-2409) edismax unescaped colon returns no results

2011-12-07 Thread Michael Watts (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164978#comment-13164978
 ] 

Michael Watts commented on SOLR-2409:
-

I guess \_val\_ is another 'special' field to account for. Does anyone know of 
any others?

> edismax unescaped colon returns no results
> --
>
> Key: SOLR-2409
> URL: https://issues.apache.org/jira/browse/SOLR-2409
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Reporter: Ryan McKinley
>Assignee: Yonik Seeley
>Priority: Minor
> Fix For: 3.2
>
> Attachments: SOLR-2409-unescapedcolon.patch, SOLR-2409.patch, 
> SOLR-2409.patch
>
>
> The edismax query parser should behave OK when a colon is in the query, but 
> does not refer to a field name.




[JENKINS] Lucene-Solr-tests-only-trunk - Build # 11712 - Failure

2011-12-07 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/11712/

1 tests failed.
REGRESSION:  org.apache.solr.search.TestRealTimeGet.testStressGetRealtime

Error Message:
java.lang.AssertionError: Some threads threw uncaught exceptions!

Stack Trace:
java.lang.RuntimeException: java.lang.AssertionError: Some threads threw uncaught exceptions!
        at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:657)
        at org.apache.solr.SolrTestCaseJ4.tearDown(SolrTestCaseJ4.java:86)
        at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
        at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
        at org.apache.lucene.util.LuceneTestCase.checkUncaughtExceptionsAfter(LuceneTestCase.java:685)
        at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:629)




Build Log (for compile errors):
[...truncated 8534 lines...]






[jira] [Updated] (SOLR-2802) Toolkit of UpdateProcessors for modifying document values

2011-12-07 Thread Hoss Man (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-2802:
---

Attachment: SOLR-2802_update_processor_toolkit.patch

I had some time to revisit this issue again today.

Improvements in this patch:

* exclude options - you can now specify one or more sets of "exclude" lists 
which are parsed just like the main list of field specifiers (examples below)
* improved defaults for ConcatFieldUpdateProcessorFactory - default behavior is 
now to only concat values for fields that the schema says are multiValued=false 
and (StrField or TextField)
* new RemoveBlankFieldUpdateProcessorFactory - removes any zero-length 
CharSequence values it finds; by default it looks at all fields
* new FieldLengthUpdateProcessorFactory - replaces any CharSequence values it 
finds with their length; by default it looks at no fields

As part of this work, I tweaked the abstract classes so that the default 
assumption about which fields a subclass should match is still "all fields", 
but it's easy for subclasses to override this -- the user still has the final 
say, and the abstract class handles that, but if the user doesn't configure 
anything the subclass can easily say "my default should be ___"

bq. I think I don't completely follow the explicit ruling

I explained myself really terribly before - I was conflating what should 
really be two orthogonal things:

1) the *field names* that a processor looks at -- the user should have lots of 
options for configuring the field selector explicitly, and if they don't, then 
a sensible default based on the specifics of the processor should be applied, 
and the user should still have the ability to configure exclusion rules on top 
of that default

2) the *value types* that a processor will deal with -- regardless of what field 
names a processor is configured with, it should be logical about the types of 
values it finds in those fields.  The FieldLengthUpdateProcessorFactory I just 
added, for example, only pays attention to values that are CharSequence; if the 
SolrInputField already contained an Integer, it wouldn't make sense to 
toString() that and then take the length of the resulting String value.

bq. I think Date/Number parsing should only be done on compatible fields only. 
I think if a subsequent parser moves / renames fields, then this processor 
should have been configured before the processor that does the Date/Number 
parsing.

But that could easily lead to a chicken-and-egg problem.  I think ideally you 
should be able to have field names in your SolrInputDocuments (and in your 
processor configurations) that don't exist in your schema at all, so you can 
have "transitory" names that exist purely for passing info around.

Imagine a situation where you want to let clients submit documents containing a 
"publishDate" field, but you want to be able to cleanly accept real Date 
objects (from java clients) or Strings in a variety of formats, and then you 
want the final index to contain two versions of that date: one indexed 
TrieDateField called "pubDate", and one non indexed StrField called 
"prettyDate" -- ie, there is no  "publishDate" in your schema at all.  You 
could then configure some "ParseDateFieldUpdateProcessor" on the "publishDate" 
even though that field name isn't in your schema, so that you have consistent 
Date objects, and then use a CloneFieldUpdateProcessor and/or 
RenameFieldUpdateProcessor to get that Date object into both your "pubDate" and 
"prettyDate" fields, and then use some sort of FormatDateFieldUpdateProcessor 
on the "prettyDate" field.

There may be other solutions to that type of problem, but I guess the bottom 
line from my perspective is: why bother making a processor deliberately fail if 
the user configures it to do something unexpected but still viable?  If they 
want to parse Strings -> Dates on a TrieIntField, why not just let them do it?  
Maybe they've got another processor later that is going to convert that Date to 
"days since epoch" as an integer.


{panel}
Examples of the exclude configuration...

{code}
<updateRequestProcessorChain name="trim-few">
  <processor class="solr.TrimFieldUpdateProcessorFactory">
    <str name="fieldRegex">foo.*</str>
    <str name="fieldRegex">bar.*</str>
    <lst name="exclude">
      <str name="typeClass">solr.DateField</str>
    </lst>
    <lst name="exclude">
      <str name="fieldRegex">.*HOSS.*</str>
    </lst>
  </processor>
</updateRequestProcessorChain>

<updateRequestProcessorChain name="trim-some">
  <processor class="solr.TrimFieldUpdateProcessorFactory">
    <str name="fieldRegex">foo.*</str>
    <str name="fieldRegex">bar.*</str>
    <lst name="exclude">
      <str name="typeClass">solr.DateField</str>
      <str name="fieldRegex">.*HOSS.*</str>
    </lst>
  </processor>
</updateRequestProcessorChain>
{code}

In the "trim-few" case, field names will be excluded if they are DateFields 
_or_ match the "HOSS" regex.  In the "trim-some" case, field names will be 
excluded only if they are _both_ a DateField _and_ match the "HOSS" regex.
{panel}
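The OR-vs-AND semantics of the two exclude styles can be sketched as plain
predicates. The field-name checks below are hypothetical stand-ins (a `_dt`
suffix standing in for the typeClass test), chosen only to make the boolean
logic concrete:

```java
import java.util.function.Predicate;

public class ExcludeSemantics {
    // Stand-ins for the typeClass and fieldRegex checks in the config above.
    static final Predicate<String> IS_DATE_FIELD = name -> name.endsWith("_dt");
    static final Predicate<String> MATCHES_HOSS  = name -> name.matches(".*HOSS.*");

    // "trim-few": two separate exclude lists -- excluded if EITHER matches.
    static final Predicate<String> TRIM_FEW_EXCLUDED = IS_DATE_FIELD.or(MATCHES_HOSS);
    // "trim-some": one exclude list with both conditions -- excluded only if BOTH match.
    static final Predicate<String> TRIM_SOME_EXCLUDED = IS_DATE_FIELD.and(MATCHES_HOSS);

    public static void main(String[] args) {
        System.out.println(TRIM_FEW_EXCLUDED.test("HOSS_s"));   // true: matches the regex
        System.out.println(TRIM_SOME_EXCLUDED.test("HOSS_s"));  // false: not a date field
        System.out.println(TRIM_SOME_EXCLUDED.test("HOSS_dt")); // true: both conditions hold
    }
}
```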

> Toolkit of UpdateProcessors for modifying document values
> -
>
> Key: SOLR-2802
> URL: https://issues.apache.org/jira/browse/SOLR-2802
> Project: Solr
>  Issue Type: New Feature
>Reporter: Hoss Man
> Attachments: SOLR-2802_update_processor_toolkit.

[jira] [Commented] (SOLR-2409) edismax unescaped colon returns no results

2011-12-07 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164930#comment-13164930
 ] 

Yonik Seeley commented on SOLR-2409:


Hmmm, you're right... this was definitely not intended.

> edismax unescaped colon returns no results
> --
>
> Key: SOLR-2409
> URL: https://issues.apache.org/jira/browse/SOLR-2409
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Reporter: Ryan McKinley
>Assignee: Yonik Seeley
>Priority: Minor
> Fix For: 3.2
>
> Attachments: SOLR-2409-unescapedcolon.patch, SOLR-2409.patch, 
> SOLR-2409.patch
>
>
> The edismax query parser should behave OK when a colon is in the query, but 
> does not refer to a field name.




[jira] [Commented] (SOLR-2409) edismax unescaped colon returns no results

2011-12-07 Thread Michael Watts (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164924#comment-13164924
 ] 

Michael Watts commented on SOLR-2409:
-

(Sorry, I'm not familiar with the Jira syntax, there should be underscores on 
the extremes of 'query')

> edismax unescaped colon returns no results
> --
>
> Key: SOLR-2409
> URL: https://issues.apache.org/jira/browse/SOLR-2409
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Reporter: Ryan McKinley
>Assignee: Yonik Seeley
>Priority: Minor
> Fix For: 3.2
>
> Attachments: SOLR-2409-unescapedcolon.patch, SOLR-2409.patch, 
> SOLR-2409.patch
>
>
> The edismax query parser should behave OK when a colon is in the query, but 
> does not refer to a field name.




[jira] [Commented] (SOLR-2409) edismax unescaped colon returns no results

2011-12-07 Thread Michael Watts (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164923#comment-13164923
 ] 

Michael Watts commented on SOLR-2409:
-

This seems to give up support for '_query_:{!parser p1=v1 ... pn=vn}'. Is this 
intended? (as far as I know, this would decrease the expressiveness of edismax)

> edismax unescaped colon returns no results
> --
>
> Key: SOLR-2409
> URL: https://issues.apache.org/jira/browse/SOLR-2409
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Reporter: Ryan McKinley
>Assignee: Yonik Seeley
>Priority: Minor
> Fix For: 3.2
>
> Attachments: SOLR-2409-unescapedcolon.patch, SOLR-2409.patch, 
> SOLR-2409.patch
>
>
> The edismax query parser should behave OK when a colon is in the query, but 
> does not refer to a field name.




[jira] [Updated] (LUCENE-3627) CorruptIndexException on indexing after a failure occurs after segments file creation but before any bytes are written

2011-12-07 Thread Ken McCracken (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ken McCracken updated LUCENE-3627:
--

  Description: 
FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
createOutput is called on a segments_* file and a crash occurs after the 
RandomAccessFile is created (the file system shows a segments_* file that 
exists but has zero bytes) but before any bytes are written to it, subsequent 
IndexWriters cannot proceed.  The difficulty is that they do not know how to 
clear the empty segments_* file.  None of the file deletions will happen on 
such a segments file because the opening bytes cannot be read to determine 
format and version.

An initial proposed patch file is attached below.



  was:
FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
createOutput is called on a segments_* file and a crash occurs between 
RandomAccessFile creation (file system shows a segments_* file exists but has 
zero bytes) but before any bytes are written to the file, subsequent 
IndexWriters cannot proceed.  The difficulty is that it does not know how to 
clear the empty segments_* file.  None of the file deletions will happen on 
such a segment file because the opening bytes cannot not be read to determine 
format and version.

I will attempt to attach a Test file demonstrates the issue; place it in your 
src/test/org/apache/lucene/store/
directory and run the unit tests with JUnit4.



Lucene Fields: New,Patch Available  (was: New)

> CorruptIndexException on indexing after a failure occurs after segments file 
> creation but before any bytes are written
> --
>
> Key: LUCENE-3627
> URL: https://issues.apache.org/jira/browse/LUCENE-3627
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 3.5
> Environment: lucene-3.5.0, src download from GA release 
> lucene.apache.org.
> Mac OS X 10.6.5, running tests in Eclipse Build id: 20100218-1602, 
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
>Reporter: Ken McCracken
>Priority: Critical
> Attachments: LUCENE-3627_initial_proposal.txt, 
> TestCrashCausesCorruptIndex.java
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
> system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
> createOutput is called on a segments_* file and a crash occurs between 
> RandomAccessFile creation (file system shows a segments_* file exists but has 
> zero bytes) but before any bytes are written to the file, subsequent 
> IndexWriters cannot proceed.  The difficulty is that it does not know how to 
> clear the empty segments_* file.  None of the file deletions will happen on 
> such a segments file because the opening bytes cannot be read to determine 
> format and version.
> An initial proposed patch file is attached below.
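The crash window described above can be sketched with plain java.io, independent
of Lucene and of the attached patch. Opening a RandomAccessFile in "rw" mode
creates the file on disk immediately, so a crash before the first write leaves a
zero-byte segments_* file with no format/version header for a reader to inspect:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class ZeroByteWindow {
    /**
     * Returns the file's length immediately after createOutput-style creation,
     * before any bytes are written -- the window in which a crash leaves a
     * zero-byte segments_* file that cannot be parsed or cleaned up.
     */
    static long lengthAtCreation(File f) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(f, "rw"); // file now exists on disk
        try {
            return raf.length(); // 0: nothing written yet
        } finally {
            raf.close();
            f.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(lengthAtCreation(new File("segments_1"))); // prints 0
    }
}
```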




[jira] [Issue Comment Edited] (LUCENE-3627) CorruptIndexException on indexing after a failure occurs after segments file creation but before any bytes are written

2011-12-07 Thread Ken McCracken (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164837#comment-13164837
 ] 

Ken McCracken edited comment on LUCENE-3627 at 12/7/11 11:36 PM:
-

Initial proposed patch.  I'm not sure if this is the correct place and scope, 
but it does fix my test case.  The test case and the proposed code change are 
attached.

aaa:lucene kmccrack$ svn info
Path: .
URL: http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_5/lucene
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1211687
Node Kind: directory
Schedule: normal
Last Changed Author: sarowe
Last Changed Rev: 1207561
Last Changed Date: 2011-11-28 15:11:35 -0500 (Mon, 28 Nov 2011)


  was (Author: ken.mccracken):
Initial proposed patch.  I'm not sure if this is the correct place and 
scope.  But it does fix my test case.
The test case and the proposed code change are attached.
  
> CorruptIndexException on indexing after a failure occurs after segments file 
> creation but before any bytes are written
> --
>
> Key: LUCENE-3627
> URL: https://issues.apache.org/jira/browse/LUCENE-3627
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 3.5
> Environment: lucene-3.5.0, src download from GA release 
> lucene.apache.org.
> Mac OS X 10.6.5, running tests in Eclipse Build id: 20100218-1602, 
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
>Reporter: Ken McCracken
>Priority: Critical
> Attachments: LUCENE-3627_initial_proposal.txt, 
> TestCrashCausesCorruptIndex.java
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
> system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
> createOutput is called on a segments_* file and a crash occurs between 
> RandomAccessFile creation (file system shows a segments_* file exists but has 
> zero bytes) but before any bytes are written to the file, subsequent 
> IndexWriters cannot proceed.  The difficulty is that it does not know how to 
> clear the empty segments_* file.  None of the file deletions will happen on 
> such a segments file because the opening bytes cannot be read to determine 
> format and version.
> I will attempt to attach a test file that demonstrates the issue; place it in 
> your src/test/org/apache/lucene/store/ directory and run the unit tests with 
> JUnit4.




[jira] [Updated] (LUCENE-3627) CorruptIndexException on indexing after a failure occurs after segments file creation but before any bytes are written

2011-12-07 Thread Ken McCracken (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ken McCracken updated LUCENE-3627:
--

Attachment: LUCENE-3627_initial_proposal.txt

Initial proposed patch.  I'm not sure if this is the correct place and scope, 
but it does fix my test case.  The test case and the proposed code change are 
attached.

> CorruptIndexException on indexing after a failure occurs after segments file 
> creation but before any bytes are written
> --
>
> Key: LUCENE-3627
> URL: https://issues.apache.org/jira/browse/LUCENE-3627
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 3.5
> Environment: lucene-3.5.0, src download from GA release 
> lucene.apache.org.
> Mac OS X 10.6.5, running tests in Eclipse Build id: 20100218-1602, 
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
>Reporter: Ken McCracken
>Priority: Critical
> Attachments: LUCENE-3627_initial_proposal.txt, 
> TestCrashCausesCorruptIndex.java
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
> system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
> createOutput is called on a segments_* file and a crash occurs between 
> RandomAccessFile creation (file system shows a segments_* file exists but has 
> zero bytes) but before any bytes are written to the file, subsequent 
> IndexWriters cannot proceed.  The difficulty is that it does not know how to 
> clear the empty segments_* file.  None of the file deletions will happen on 
> such a segments file because the opening bytes cannot be read to determine 
> format and version.
> I will attempt to attach a Test file demonstrates the issue; place it in your 
> src/test/org/apache/lucene/store/
> directory and run the unit tests with JUnit4.
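The failure mode quoted above (a zero-length segments_* file left behind by a crash between file creation and the first write) suggests a simple recovery step. The sketch below is illustrative only: it uses plain java.io rather than Lucene's Directory API, and the class and method names are invented for this example; a real fix would live inside Lucene's SegmentInfos/IndexFileDeleter machinery.

```java
import java.io.File;
import java.io.IOException;

// Sketch only: treats a zero-length segments_* file as a crash artifact
// (created but never written) and removes it before any parsing is attempted.
public class ZeroLengthSegmentsCleanup {

    /**
     * Returns true if the file was a zero-length crash artifact and was deleted.
     * Throws IOException if it is zero-length but cannot be removed.
     */
    public static boolean deleteIfEmptySegmentsFile(File segmentsFile) throws IOException {
        if (!segmentsFile.exists()) {
            return false;                // nothing to clean up
        }
        if (segmentsFile.length() > 0) {
            return false;                // has content; parse it normally
        }
        if (!segmentsFile.delete()) {
            throw new IOException("cannot delete empty segments file: " + segmentsFile);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        // A freshly created temp file has zero bytes, like segments_* after the crash.
        File empty = File.createTempFile("segments_", null, dir);
        System.out.println(deleteIfEmptySegmentsFile(empty));   // true: artifact removed
        System.out.println(empty.exists());                     // false
    }
}
```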

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164836#comment-13164836
 ] 

Mark Miller commented on SOLR-1730:
---

I have not had a chance to apply it and look thoroughly, but I read the patch 
and comment earlier today and it all looks good to me.

> Solr fails to start if QueryElevationComponent config is missing
> 
>
> Key: SOLR-1730
> URL: https://issues.apache.org/jira/browse/SOLR-1730
> Project: Solr
>  Issue Type: Bug
>  Components: SearchComponents - other
>Affects Versions: 1.4
>Reporter: Mark Miller
>Assignee: Grant Ingersoll
>  Labels: newdev
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-1730.patch, SOLR-1730.patch
>
>
> QueryElevationComponent tries to preload some data if its config file does 
> not exist:
> {code}
> if (!exists){
>   // preload the first data
>   RefCounted searchHolder = null;
>   try {
> searchHolder = core.getNewestSearcher(false);
> IndexReader reader = searchHolder.get().getReader();
> getElevationMap( reader, core );
>   } finally {
> if (searchHolder != null) searchHolder.decref();
>   }
> }
> {code}
> This does not work, though, as asking for the newest searcher causes a request 
> to be submitted to Solr before it's ready to handle it:
> {code}
>  [java] SEVERE: java.lang.NullPointerException
>  [java]   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
>  [java]   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>  [java]   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
>  [java]   at 
> org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:52)
>  [java]   at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1147)
>  [java]   at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>  [java]   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> {code}
> The SearchHandler has not yet been informed of the core (the 
> QueryElevationComponent triggers this while it is being informed of the core, 
> just before the SearchHandler is), so its components ArrayList is still null.
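The ordering bug described above is an instance of a general pattern: work that needs a fully initialized core must be deferred until every component has been informed. The sketch below is a hypothetical, stdlib-only illustration of that pattern; Core, CoreReadyListener, and finishInit are invented names for this example, not Solr APIs, and the actual patch may take a different approach.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the ordering fix: instead of issuing a searcher request while
// components are still being initialized, a component registers a callback
// that the core runs once *all* components have been informed, so nothing
// observes a half-initialized core.
public class DeferredPreload {

    interface CoreReadyListener { void onCoreReady(); }

    static class Core {
        private final List<CoreReadyListener> listeners = new ArrayList<>();
        private boolean ready = false;

        void register(CoreReadyListener l) { listeners.add(l); }

        // Called after every component's inform() has completed.
        void finishInit() {
            ready = true;
            for (CoreReadyListener l : listeners) l.onCoreReady();
        }

        boolean isReady() { return ready; }
    }

    public static void main(String[] args) {
        Core core = new Core();
        List<String> log = new ArrayList<>();
        // A QueryElevationComponent-like component defers its preload rather
        // than calling getNewestSearcher() during its own inform().
        core.register(() -> log.add("preload ran, core ready=" + core.isReady()));
        log.add("components informed");
        core.finishInit();
        System.out.println(log);  // [components informed, preload ran, core ready=true]
    }
}
```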




[jira] [Issue Comment Edited] (LUCENE-3606) Make IndexReader really read-only in Lucene 4.0

2011-12-07 Thread Uwe Schindler (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164833#comment-13164833
 ] 

Uwe Schindler edited comment on LUCENE-3606 at 12/7/11 11:28 PM:
-

I tried to fix the remaining tests today, but this seems impossible without 
IndexReader.deleteDocument(int docId). Some of the tests commented out with 
nocommits are so old that it's impossible to understand what they are really 
testing (especially TestAddIndexes and TestIndexWriterMerging). I would simply 
delete them, because all this functionality is heavily random-tested elsewhere 
(those "old tests" have no randomization at all).

The remaining nocommits are:

{noformat}
./src/java/org/apache/lucene/index/codecs/lucene40/Lucene40NormsReader.java:
  // nocommit: change to a real check? see LUCENE-3619
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: move the 
whole modification stuff to IW
./src/java/org/apache/lucene/index/SegmentReader.java:  // end nocommit
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: is this 
needed anymore by IndexWriter?
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/test/org/apache/lucene/index/TestAddIndexes.java:  /* nocommit: 
reactivate these tests
./src/test/org/apache/lucene/index/TestDeletionPolicy.java:  /* nocommit: fix 
this test, I don't understand it!
./src/test/org/apache/lucene/index/TestIndexWriterMerging.java:  /* nocommit: 
Fix tests to use an id and delete by term
./src/test/org/apache/lucene/index/TestParallelReaderEmptyIndex.java:  /* 
nocommit: Fix tests to use an id and delete by term
./src/test/org/apache/lucene/index/TestSizeBoundedForceMerge.java:  /* 
nocommit: Fix tests to use an id and delete by term
./src/test/org/apache/lucene/index/TestSizeBoundedForceMerge.java:  /* 
nocommit: Fix tests to use an id and delete by term
{noformat}

The parts in SegmentReader should be made TODOs, and a new issue should be 
opened to remove RW support from SegmentReader (Mike?). The tests should be 
deleted as described above. Otherwise the branch seems finalized, so I would 
like to merge back to trunk asap.

[jira] [Commented] (LUCENE-3606) Make IndexReader really read-only in Lucene 4.0

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164833#comment-13164833
 ] 

Uwe Schindler commented on LUCENE-3606:
---

I tried to fix the remaining tests today, but this seems impossible without 
IndexReader.deleteDocument(int docId). Some of the tests commented out with 
nocommits are so old that it's impossible to understand what they are really 
testing (especially TestAddIndexes and TestIndexWriterMerging). I would simply 
delete them, because all this functionality is heavily random-tested elsewhere 
(those "old tests" have no randomization at all).

The remaining nocommits are:

{noformat}
./src/java/org/apache/lucene/index/codecs/lucene40/Lucene40NormsReader.java:
  // nocommit: change to a real check? see LUCENE-3619
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: move the 
whole modification stuff to IW
./src/java/org/apache/lucene/index/SegmentReader.java:  // end nocommit
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: is this 
needed anymore by IndexWriter?
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/java/org/apache/lucene/index/SegmentReader.java:  // nocommit: remove 
deletions from SR
./src/test/org/apache/lucene/index/TestAddIndexes.java:  /* nocommit: 
reactivate these tests
./src/test/org/apache/lucene/index/TestDeletionPolicy.java:  /* nocommit: fix 
this test, I don't understand it!
./src/test/org/apache/lucene/index/TestIndexWriterMerging.java:  /* nocommit: 
Fix tests to use an id and delete by term
./src/test/org/apache/lucene/index/TestParallelReaderEmptyIndex.java:  /* 
nocommit: Fix tests to use an id and delete by term
./src/test/org/apache/lucene/index/TestSizeBoundedForceMerge.java:  /* 
nocommit: Fix tests to use an id and delete by term
./src/test/org/apache/lucene/index/TestSizeBoundedForceMerge.java:  /* 
nocommit: Fix tests to use an id and delete by term
{noformat}

The parts in SegmentReader should be made TODOs, and a new issue should be 
opened to remove RW support from SegmentReader (Mike?). The tests should be 
deleted as described above. Otherwise the branch seems finalized, so I would 
like to merge back to trunk asap.

> Make IndexReader really read-only in Lucene 4.0
> ---
>
> Key: LUCENE-3606
> URL: https://issues.apache.org/jira/browse/LUCENE-3606
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/index
>Affects Versions: 4.0
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>
> As we change API completely in Lucene 4.0 we are also free to remove 
> read-write access and commits from IndexReader. This code is so hairy and 
> buggy (as investigated by Robert and Mike today) when you work on 
> SegmentReader level but forget to flush in the DirectoryReader, so its better 
> to really make IndexReaders readonly.
> Currently with IndexReader you can do things like:
> - delete/undelete Documents -> Can be done with IndexWriter, too (using 
> deleteByQuery)
> - change norms -> this is a bad idea in general, but once we remove norms 
> entirely and replace them with DocValues this is obsolete anyway. Changing 
> DocValues should also be done using IndexWriter in trunk (once it is ready)




[jira] [Commented] (LUCENE-3627) CorruptIndexException on indexing after a failure occurs after segments file creation but before any bytes are written

2011-12-07 Thread Ken McCracken (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164816#comment-13164816
 ] 

Ken McCracken commented on LUCENE-3627:
---

I have been reviewing https://issues.apache.org/jira/browse/LUCENE-3255, which 
seems related in that the error is encountered in the same section of 
SegmentInfos.java.
One way to fix this might be to change SegmentInfos.java, replacing the line 
"int format = input.readInt();" with:

{code}
int format;
try {
  format = input.readInt();
} catch (IOException ioe) {
  if (input.length() == 0) {
    try {
      input.close();
    } finally {
      directory.deleteFile(segmentFileName);
    }
    return;
  }
  throw ioe;
}
{code}

However, there are unit tests that seem to verify that no file deletions happen 
at this low a level.


> CorruptIndexException on indexing after a failure occurs after segments file 
> creation but before any bytes are written
> --
>
> Key: LUCENE-3627
> URL: https://issues.apache.org/jira/browse/LUCENE-3627
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 3.5
> Environment: lucene-3.5.0, src download from GA release 
> lucene.apache.org.
> Mac OS X 10.6.5, running tests in Eclipse Build id: 20100218-1602, 
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
>Reporter: Ken McCracken
>Priority: Critical
> Attachments: TestCrashCausesCorruptIndex.java
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
> system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
> createOutput is called on a segments_* file and a crash occurs between 
> RandomAccessFile creation (file system shows a segments_* file exists but has 
> zero bytes) but before any bytes are written to the file, subsequent 
> IndexWriters cannot proceed.  The difficulty is that they do not know how to 
> clear the empty segments_* file.  None of the file deletions will happen on 
> such a segments file because the opening bytes cannot be read to determine 
> format and version.
> I will attempt to attach a test file that demonstrates the issue; place it in your 
> src/test/org/apache/lucene/store/
> directory and run the unit tests with JUnit4.




[jira] [Updated] (SOLR-1520) QueryElevationComponent does not work when unique key is of type 'sint'

2011-12-07 Thread Grant Ingersoll (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1520:
--

Attachment: SOLR-1520.patch

Adds tests for this against schema11.xml.  Should be close to ready to go.  
For now, it requires a non-tokenized field for the id.

> QueryElevationComponent does not work when unique key is of type 'sint'
> ---
>
> Key: SOLR-1520
> URL: https://issues.apache.org/jira/browse/SOLR-1520
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3
> Environment: Gentoo Linux, Apache Tomcat 6.0.20, Java 1.6.0_15-b03
>Reporter: Simon Lachinger
>Assignee: Grant Ingersoll
> Attachments: SOLR-1520.patch, SOLR-1520.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> QueryElevationComponent does not work when the unique key of the documents is 
> of type 'sint'. I did not try any other combination, but looking at the 
> source in QueryElevationComponent.java I doubt it will work with any type 
> other than string.
> I propose to either make it work with other data types or at least to post a 
> warning.




[jira] [Commented] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Grant Ingersoll (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164779#comment-13164779
 ] 

Grant Ingersoll commented on SOLR-1730:
---

I'm pretty comfortable with this solution and would like to commit in the 
coming day or two, if others want to review it.

> Solr fails to start if QueryElevationComponent config is missing
> 
>
> Key: SOLR-1730
> URL: https://issues.apache.org/jira/browse/SOLR-1730
> Project: Solr
>  Issue Type: Bug
>  Components: SearchComponents - other
>Affects Versions: 1.4
>Reporter: Mark Miller
>Assignee: Grant Ingersoll
>  Labels: newdev
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-1730.patch, SOLR-1730.patch
>
>
> QueryElevationComponent tries to preload some data if its config file does 
> not exist:
> {code}
> if (!exists){
>   // preload the first data
>   RefCounted searchHolder = null;
>   try {
> searchHolder = core.getNewestSearcher(false);
> IndexReader reader = searchHolder.get().getReader();
> getElevationMap( reader, core );
>   } finally {
> if (searchHolder != null) searchHolder.decref();
>   }
> }
> {code}
> This does not work, though, as asking for the newest searcher causes a request 
> to be submitted to Solr before it's ready to handle it:
> {code}
>  [java] SEVERE: java.lang.NullPointerException
>  [java]   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
>  [java]   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>  [java]   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
>  [java]   at 
> org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:52)
>  [java]   at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1147)
>  [java]   at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>  [java]   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> {code}
> The SearchHandler has not yet been informed of the core (the 
> QueryElevationComponent triggers this while it is being informed of the core, 
> just before the SearchHandler is), so its components ArrayList is still null.




[jira] [Commented] (SOLR-2953) Introducing hit Count as an alternative to score

2011-12-07 Thread Commented

[ 
https://issues.apache.org/jira/browse/SOLR-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164764#comment-13164764
 ] 

Jan Høydahl commented on SOLR-2953:
---

Can't you do this simply by plugging in your own Similarity class in Schema?
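If I understand the suggestion, a custom Similarity is wired in globally in schema.xml. The snippet below is a sketch only; com.example.HitCountSimilarity is a placeholder class name, not an existing class:

```xml
<!-- schema.xml: hypothetical custom Similarity registered globally,
     replacing the default scoring for the whole schema -->
<similarity class="com.example.HitCountSimilarity"/>
```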

> Introducing hit Count as an alternative to score 
> -
>
> Key: SOLR-2953
> URL: https://issues.apache.org/jira/browse/SOLR-2953
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.0
>Reporter: Kaleem Ahmed
>  Labels: features
> Fix For: 4.0
>
>   Original Estimate: 1,008h
>  Remaining Estimate: 1,008h
>
> As of now we have score as the relevancy factor for a query against a 
> document, and this score is relative to the number of documents in the index. 
> Why not also have another relevancy measure, say "hitCount", which is 
> absolute for a given document and a given query and does not depend on the 
> number of documents in the index? This would help a lot with frequently 
> changing indexes, where search rules are predefined along with the relevancy 
> threshold a document must reach to qualify for that query (search rule).
> Ex: consider a use case where a list of queries is formed with a threshold 
> number for each query, and these are run against a frequently updated index 
> to get the documents that score above the threshold; i.e., when a document's 
> relevancy factor crosses the threshold for a query, the document is said to 
> qualify for that query.
> For this use case to work, the score shouldn't change every time the index is 
> updated with new documents. So we introduce a new feature called "hitCount", 
> which represents the relevancy of a document against a query and is absolute 
> (it won't change with index size).
> This hitCount is a positive integer and is calculated as follows.
> Ex: Document with text "the quick fox jumped over the lazy dog, while the 
> lazy dog was too lazy to care"
> 1. for the query "lazy AND dog" the hitCount will be (no. of occurrences of 
> "lazy" in the document) + (no. of occurrences of "dog" in the document) => 
> 3 + 2 => 5
> 2. for the phrase query \"lazy dog\" the hitCount will be (no. of occurrences 
> of the exact phrase "lazy dog" in the document) => 2
> This will be very useful as an alternative scoring mechanism.
> I already implemented this whole thing in the Solr source code (that I 
> downloaded) and we are using it. So far it's going well.
> It would be really great if this feature were added to trunk (original Solr) 
> so that we don't have to reapply the changes every time a new version is 
> released, and others could benefit from it as well.
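The arithmetic in the quoted examples can be sketched as plain token counting. The class below is an illustration of the proposed hitCount semantics only (not the poster's actual patch, which would hook into Solr's scoring instead):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: computes the proposed "hitCount" by counting raw term
// and phrase occurrences in a single document, independent of index size.
public class HitCount {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** hitCount for an AND query: sum of each term's occurrence count. */
    static int hitCountAnd(String doc, String... terms) {
        List<String> tokens = tokenize(doc);
        int count = 0;
        for (String term : terms) {
            for (String tok : tokens) {
                if (tok.equals(term)) count++;
            }
        }
        return count;
    }

    /** hitCount for a phrase query: occurrences of the exact token sequence. */
    static int hitCountPhrase(String doc, String phrase) {
        List<String> tokens = tokenize(doc);
        List<String> p = tokenize(phrase);
        int count = 0;
        for (int i = 0; i + p.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + p.size()).equals(p)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String doc = "the quick fox jumped over the lazy dog, "
                   + "while the lazy dog was too lazy to care";
        System.out.println(hitCountAnd(doc, "lazy", "dog"));   // 3 + 2 = 5
        System.out.println(hitCountPhrase(doc, "lazy dog"));   // 2
    }
}
```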




[jira] [Updated] (SOLR-2718) Lazy load response writers

2011-12-07 Thread Erik Hatcher (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-2718:
---

Fix Version/s: 3.6

Fixed already on 4.0, but needs to be ported to 3.x too.

> Lazy load response writers
> --
>
> Key: SOLR-2718
> URL: https://issues.apache.org/jira/browse/SOLR-2718
> Project: Solr
>  Issue Type: Improvement
>  Components: Response Writers
>Reporter: Erik Hatcher
>Assignee: Erik Hatcher
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-2718-velocity-test-cleanup.patch, SOLR-2718.patch
>
>
> This stems from issues with SOLR-2588, moving the Velocity response writer 
> back to contrib.  We still want the example app to use the 
> VelocityResponseWriter for the /browse interface.  Many of Solr's core tests 
> use the example Solr configuration.  There are other contribs that are 
> brought into the example app (extract, clustering, DIH, for example) but 
> these are request handlers that lazy load.  Response writers don't currently 
> lazy load, thus causing core tests that use the example config to fail unless 
> "ant dist" is run.




[jira] [Reopened] (SOLR-2718) Lazy load response writers

2011-12-07 Thread Erik Hatcher (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher reopened SOLR-2718:



I've seen two reports from Solr 3.5 users today (via #solr in IRC) where 
they've deployed Solr into Tomcat and copied the example configuration.  The 
lazy loading was not backported to 3.x and thus the registration of VRW causes 
a startup error when the libs aren't found.

This needs to be ported/adapted to 3.x to ensure 3.6 doesn't have this issue.

> Lazy load response writers
> --
>
> Key: SOLR-2718
> URL: https://issues.apache.org/jira/browse/SOLR-2718
> Project: Solr
>  Issue Type: Improvement
>  Components: Response Writers
>Reporter: Erik Hatcher
>Assignee: Erik Hatcher
> Fix For: 4.0
>
> Attachments: SOLR-2718-velocity-test-cleanup.patch, SOLR-2718.patch
>
>
> This stems from issues with SOLR-2588, moving the Velocity response writer 
> back to contrib.  We still want the example app to use the 
> VelocityResponseWriter for the /browse interface.  Many of Solr's core tests 
> use the example Solr configuration.  There are other contribs that are 
> brought into the example app (extract, clustering, DIH, for examples) but 
> these are request handlers that lazy load.  Response writers don't currently 
> lazy load, thus causing core tests that use the example config to fail unless 
> "ant dist" is run.




[jira] [Commented] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164698#comment-13164698
 ] 

Uwe Schindler commented on LUCENE-3626:
---

Change the relations if you like and find better ones. I am out of this issue 
now; I have unassigned myself and will unassign from other Lucene issues 
because of this stupidity. Goodbye.

> Make PKIndexSplitter and MultiPassIndexSplitter work per segment
> 
>
> Key: LUCENE-3626
> URL: https://issues.apache.org/jira/browse/LUCENE-3626
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/other
>Affects Versions: 4.0
>Reporter: Uwe Schindler
> Fix For: 4.0
>
>
> Spinoff from LUCENE-3624: the DocValues merger throws an exception on 
> IW.addIndexes(SlowMultiReaderWrapper), as string-index-like docvalues cannot 
> provide asSortedSource.




[jira] [Commented] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment

2011-12-07 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164696#comment-13164696
 ] 

Robert Muir commented on LUCENE-3626:
-

I'm not complaining, but you marked my issue as breaking this functionality? 
That's insanity!

Let's revisit the situation:
* As of yesterday, if you used one of these tools, *all* docvalues fields were 
*silently dropped* and the merge succeeded (data loss).
* I fixed this in LUCENE-3623, but testing exposed the fact that, with the 
data loss fixed, if you had a sortedsource you would get a non-obvious 
NullPointerException deep in the docvalues codec stack.
* Because of this, LUCENE-3624 changes the NullPointerException to an 
UnsupportedOperationException, so that it's clear that this isn't working, and 
why.


> Make PKIndexSplitter and MultiPassIndexSplitter work per segment
> 
>
> Key: LUCENE-3626
> URL: https://issues.apache.org/jira/browse/LUCENE-3626
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/other
>Affects Versions: 4.0
>Reporter: Uwe Schindler
> Fix For: 4.0
>
>
> Spinoff from LUCENE-3624: the DocValues merger throws an exception on 
> IW.addIndexes(SlowMultiReaderWrapper), as string-index-like docvalues cannot 
> provide asSortedSource.




[jira] [Assigned] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment

2011-12-07 Thread Uwe Schindler (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-3626:
-

Assignee: (was: Uwe Schindler)

> Make PKIndexSplitter and MultiPassIndexSplitter work per segment
> 
>
> Key: LUCENE-3626
> URL: https://issues.apache.org/jira/browse/LUCENE-3626
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/other
>Affects Versions: 4.0
>Reporter: Uwe Schindler
> Fix For: 4.0
>
>
> Spinoff from LUCENE-3624: the DocValues merger throws an exception on 
> IW.addIndexes(SlowMultiReaderWrapper), as string-index-like docvalues cannot 
> provide asSortedSource.




[jira] [Commented] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164689#comment-13164689
 ] 

Uwe Schindler commented on LUCENE-3626:
---

Instead of complaining, help fix it. There was no better relation for it.

> Make PKIndexSplitter and MultiPassIndexSplitter work per segment
> 
>
> Key: LUCENE-3626
> URL: https://issues.apache.org/jira/browse/LUCENE-3626
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/other
>Affects Versions: 4.0
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 4.0
>
>
> Spinoff from LUCENE-3624: the DocValues merger throws an exception on 
> IW.addIndexes(SlowMultiReaderWrapper) because string-index-like docvalues cannot 
> provide asSortedSource.




[jira] [Commented] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment

2011-12-07 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164687#comment-13164687
 ] 

Robert Muir commented on LUCENE-3626:
-

this issue is *not* broken by LUCENE-3624.

Prior to LUCENE-3624, you just got a NullPointerException instead of an 
UnsupportedOperationException.


> Make PKIndexSplitter and MultiPassIndexSplitter work per segment
> 
>
> Key: LUCENE-3626
> URL: https://issues.apache.org/jira/browse/LUCENE-3626
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/other
>Affects Versions: 4.0
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 4.0
>
>
> Spinoff from LUCENE-3624: the DocValues merger throws an exception on 
> IW.addIndexes(SlowMultiReaderWrapper) because string-index-like docvalues cannot 
> provide asSortedSource.




[jira] [Updated] (LUCENE-3627) CorruptIndexException on indexing after a failure occurs after segments file creation but before any bytes are written

2011-12-07 Thread Ken McCracken (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ken McCracken updated LUCENE-3627:
--

  Environment: 
lucene-3.5.0, src download from GA release lucene.apache.org.
Mac OS X 10.6.5, running tests in Eclipse Build id: 20100218-1602, 
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)


  was:
Mac OS X 10.6.5, running tests in Eclipse Build id: 20100218-1602, 
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)


Affects Version/s: 3.5

> CorruptIndexException on indexing after a failure occurs after segments file 
> creation but before any bytes are written
> --
>
> Key: LUCENE-3627
> URL: https://issues.apache.org/jira/browse/LUCENE-3627
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 3.5
> Environment: lucene-3.5.0, src download from GA release 
> lucene.apache.org.
> Mac OS X 10.6.5, running tests in Eclipse Build id: 20100218-1602, 
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
>Reporter: Ken McCracken
>Priority: Critical
> Attachments: TestCrashCausesCorruptIndex.java
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
> system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
> createOutput is called on a segments_* file and a crash occurs after 
> RandomAccessFile creation (the file system shows that a segments_* file exists 
> but has zero bytes) but before any bytes are written to the file, subsequent 
> IndexWriters cannot proceed.  The difficulty is that they do not know how to 
> clear the empty segments_* file.  None of the file deletions will happen on 
> such a segments file because the opening bytes cannot be read to determine 
> format and version.
> I will attempt to attach a test file that demonstrates the issue; place it in 
> your src/test/org/apache/lucene/store/
> directory and run the unit tests with JUnit4.




[jira] [Updated] (LUCENE-3627) CorruptIndexException on indexing after a failure occurs after segments file creation but before any bytes are written

2011-12-07 Thread Ken McCracken (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ken McCracken updated LUCENE-3627:
--

Attachment: TestCrashCausesCorruptIndex.java

Drop this file in src/test/org/apache/lucene/store for the Lucene 3.5.0 source 
release, and run your unit tests with JUnit 4.

> CorruptIndexException on indexing after a failure occurs after segments file 
> creation but before any bytes are written
> --
>
> Key: LUCENE-3627
> URL: https://issues.apache.org/jira/browse/LUCENE-3627
> Project: Lucene - Java
>  Issue Type: Bug
> Environment: Mac OS X 10.6.5, running tests in Eclipse Build id: 
> 20100218-1602, 
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
>Reporter: Ken McCracken
>Priority: Critical
> Attachments: TestCrashCausesCorruptIndex.java
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
> system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
> createOutput is called on a segments_* file and a crash occurs after 
> RandomAccessFile creation (the file system shows that a segments_* file exists 
> but has zero bytes) but before any bytes are written to the file, subsequent 
> IndexWriters cannot proceed.  The difficulty is that they do not know how to 
> clear the empty segments_* file.  None of the file deletions will happen on 
> such a segments file because the opening bytes cannot be read to determine 
> format and version.
> I will attempt to attach a test file that demonstrates the issue; place it in 
> your src/test/org/apache/lucene/store/
> directory and run the unit tests with JUnit4.




[jira] [Commented] (LUCENE-3298) FST has hard limit max size of 2.1 GB

2011-12-07 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164671#comment-13164671
 ] 

Dawid Weiss commented on LUCENE-3298:
-

Sudarshan,

If you take a look at the trunk version of FSTLookup, it uses FSTCompletion 
underneath, and that class in turn stores arbitrary byte sequences (text is 
converted to UTF-8). It does not store byte outputs, but you could create your 
"suggestions" by concatenating input with output, divided by a marker or 
something. This will bloat the automaton, but if your data is relatively small 
it's not a problem, and you can still extract your "outputs" after suggestions 
are retrieved from the FST. Take a look at FSTCompletion and FSTCompletionBuilder 
(and their tests); they'll be helpful.

> FST has hard limit max size of 2.1 GB
> -
>
> Key: LUCENE-3298
> URL: https://issues.apache.org/jira/browse/LUCENE-3298
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/FSTs
>Reporter: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-3298.patch
>
>
> The FST uses a single contiguous byte[] under the hood, which in java is 
> indexed by int so we cannot grow this over Integer.MAX_VALUE.  It also 
> internally encodes references to this array as vInt.
> We could switch this to a paged byte[] and make the FST far larger.
> But I think this is low priority... I'm not going to work on it any time soon.
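The "paged byte[]" idea mentioned above can be sketched as follows: address bytes with a long split into a page index and an in-page offset, so capacity is no longer bound by Integer.MAX_VALUE. This is an illustrative stand-in, not Lucene code; the class name and page size are arbitrary.

```java
public class PagedBytes {
    private static final int PAGE_BITS = 15;            // 32 KB pages (assumed size)
    private static final int PAGE_SIZE = 1 << PAGE_BITS;
    private static final int PAGE_MASK = PAGE_SIZE - 1;
    private final byte[][] pages;

    PagedBytes(long capacity) {
        // Round the requested capacity up to whole pages.
        int numPages = (int) ((capacity + PAGE_SIZE - 1) >>> PAGE_BITS);
        pages = new byte[numPages][PAGE_SIZE];
    }

    // High bits of the long index pick the page, low bits the offset.
    void set(long index, byte b) {
        pages[(int) (index >>> PAGE_BITS)][(int) (index & PAGE_MASK)] = b;
    }

    byte get(long index) {
        return pages[(int) (index >>> PAGE_BITS)][(int) (index & PAGE_MASK)];
    }
}
```

The vInt-encoded internal references the issue mentions would still need widening, which is the harder part of the change.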




[jira] [Created] (LUCENE-3627) CorruptIndexException on indexing after a failure occurs after segments file creation but before any bytes are written

2011-12-07 Thread Ken McCracken (Created) (JIRA)
CorruptIndexException on indexing after a failure occurs after segments file 
creation but before any bytes are written
--

 Key: LUCENE-3627
 URL: https://issues.apache.org/jira/browse/LUCENE-3627
 Project: Lucene - Java
  Issue Type: Bug
 Environment: Mac OS X 10.6.5, running tests in Eclipse Build id: 
20100218-1602, 
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)

Reporter: Ken McCracken
Priority: Critical


FSDirectory.createOutput(..) uses a RandomAccessFile to do its work.  On my 
system the default FSDirectory.open(..) creates an NIOFSDirectory.  If 
createOutput is called on a segments_* file and a crash occurs after 
RandomAccessFile creation (the file system shows that a segments_* file exists 
but has zero bytes) but before any bytes are written to the file, subsequent 
IndexWriters cannot proceed.  The difficulty is that they do not know how to 
clear the empty segments_* file.  None of the file deletions will happen on 
such a segments file because the opening bytes cannot be read to determine 
format and version.

I will attempt to attach a test file that demonstrates the issue; place it in 
your src/test/org/apache/lucene/store/
directory and run the unit tests with JUnit4.
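The failure window can be reproduced with plain java.io: creating a RandomAccessFile makes the file visible on disk immediately, so a crash before any write leaves a zero-byte file behind. This sketch only demonstrates the window; the class and method names are hypothetical, and detecting/removing such files is not Lucene's actual recovery logic.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class EmptySegmentsFileDemo {
    // True when the file exists but holds zero bytes, i.e. the state in
    // which a format/version header cannot be read from it.
    static boolean isEmptySegmentsFile(File f) {
        return f.exists() && f.length() == 0;
    }

    // Open a file the way createOutput would, then "crash" before writing:
    // the file is on disk, but no bytes were ever written to it.
    static File simulateCrashDuringCreate() throws IOException {
        File f = File.createTempFile("segments_", null);
        RandomAccessFile raf = new RandomAccessFile(f, "rw");
        raf.close(); // simulated crash point: nothing was written
        return f;
    }
}
```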






[jira] [Updated] (SOLR-2947) DIH caching bug - EntityRunner destroys child entity processor

2011-12-07 Thread Mikhail Khludnev (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Khludnev updated SOLR-2947:
---

Attachment: SOLR-2947.patch

Cleaned up a comment. Renamed the patch to conform to HowToContribute.

> DIH caching bug - EntityRunner destroys child entity processor
> --
>
> Key: SOLR-2947
> URL: https://issues.apache.org/jira/browse/SOLR-2947
> Project: Solr
>  Issue Type: Sub-task
>  Components: contrib - DataImportHandler
>Affects Versions: 4.0
>Reporter: Mikhail Khludnev
>  Labels: noob
> Fix For: 4.0
>
> Attachments: SOLR-2947.patch, dih-cache-destroy-on-threads-fix.patch, 
> dih-cache-threads-enabling-bug.patch
>
>
> My intention is to fix multithreaded import with the SQL cache. Here is the 2nd 
> stage. If I enable the DocBuilder.EntityRunner flow even for a single thread, it 
> breaks pretty basic functionality: the parent-child join.
> The reason is that [line 473 
> entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659&view=markup]
>  breaks the child entityProcessor.
> See the attachment comments for more details.




[jira] [Commented] (LUCENE-3298) FST has hard limit max size of 2.1 GB

2011-12-07 Thread Carlos González-Cadenas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164632#comment-13164632
 ] 

Carlos González-Cadenas commented on LUCENE-3298:
-

Hi Sudarshan,

I don't believe that my implementation is going to be of much practical value 
for the general public. Note that, as described above, in my implementation I 
store custom data that is useful for my application, but it almost certainly 
won't make sense for other applications.

I'm happy to tell you how to modify the code to store your own outputs, it's 
quite easy: 
1) First you have to enable it at the code level: replace NoOutputs with 
ByteSequenceOutputs, and then change all references to Arc<Object> and 
FST<Object> into Arc<BytesRef> and FST<BytesRef>. 
2) At build time, you need to store something in the output. You can do it by 
creating the appropriate BytesRef and including it in the builder.add() call 
instead of the placeholder value that is present now.
3) At query time, you need to collect the output while traversing the FST (note 
that the output may be scattered through the whole arc chain) and then you can 
process it in the way specific to your app. Probably you want to do it in the 
collect() method (when the LookupResults are created).

I believe that's all. If you have any questions, let me know.

Thanks
Carlos
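Step 3 above, collecting an output that is scattered along the arc chain, boils down to concatenating the fragments attached to each traversed arc (concatenation is the "add" operation of byte-sequence outputs). A minimal self-contained sketch, with a hypothetical class name and a plain byte[][] standing in for the per-arc output fragments:

```java
public class OutputCollector {
    // Accumulate the output fragments seen along the arc chain, in
    // traversal order, into the complete output value.
    static byte[] collect(byte[][] arcOutputs) {
        int total = 0;
        for (byte[] fragment : arcOutputs) total += fragment.length;
        byte[] acc = new byte[total];
        int pos = 0;
        for (byte[] fragment : arcOutputs) {
            System.arraycopy(fragment, 0, acc, pos, fragment.length);
            pos += fragment.length;
        }
        return acc; // interpret in an application-specific way afterwards
    }
}
```

In the real FST the fragments would be read from each Arc's output field during traversal rather than passed in as an array.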

> FST has hard limit max size of 2.1 GB
> -
>
> Key: LUCENE-3298
> URL: https://issues.apache.org/jira/browse/LUCENE-3298
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/FSTs
>Reporter: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-3298.patch
>
>
> The FST uses a single contiguous byte[] under the hood, which in java is 
> indexed by int so we cannot grow this over Integer.MAX_VALUE.  It also 
> internally encodes references to this array as vInt.
> We could switch this to a paged byte[] and make the FST far larger.
> But I think this is low priority... I'm not going to work on it any time soon.




[jira] [Commented] (LUCENE-3298) FST has hard limit max size of 2.1 GB

2011-12-07 Thread Sudarshan Gaikaiwari (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164621#comment-13164621
 ] 

Sudarshan Gaikaiwari commented on LUCENE-3298:
--

Hi Carlos

I am interested in your implementation of FSTLookup where you are using an FST 
with ByteSequenceOutputs. Would it be possible for you to share your 
implementation?

Thanks
Sudarshan

> FST has hard limit max size of 2.1 GB
> -
>
> Key: LUCENE-3298
> URL: https://issues.apache.org/jira/browse/LUCENE-3298
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/FSTs
>Reporter: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-3298.patch
>
>
> The FST uses a single contiguous byte[] under the hood, which in java is 
> indexed by int so we cannot grow this over Integer.MAX_VALUE.  It also 
> internally encodes references to this array as vInt.
> We could switch this to a paged byte[] and make the FST far larger.
> But I think this is low priority... I'm not going to work on it any time soon.




[jira] [Commented] (LUCENE-3606) Make IndexReader really read-only in Lucene 4.0

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164615#comment-13164615
 ] 

Uwe Schindler commented on LUCENE-3606:
---

Robert and I committed more fixes to the remaining tests to the branch.
I also added the test of LUCENE-3620 to this branch and fixed FilterIndexReader.

> Make IndexReader really read-only in Lucene 4.0
> ---
>
> Key: LUCENE-3606
> URL: https://issues.apache.org/jira/browse/LUCENE-3606
> Project: Lucene - Java
>  Issue Type: Task
>  Components: core/index
>Affects Versions: 4.0
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
>
> As we change the API completely in Lucene 4.0, we are also free to remove 
> read-write access and commits from IndexReader. This code is so hairy and 
> buggy (as investigated by Robert and Mike today) when you work at the 
> SegmentReader level but forget to flush in the DirectoryReader, so it's better 
> to really make IndexReaders read-only.
> Currently with IndexReader you can do things like:
> - delete/undelete Documents -> can be done with IndexWriter, too (using 
> deleteByQuery)
> - change norms -> this is a bad idea in general, but once we remove norms 
> altogether and replace them with DocValues this is obsolete anyway. Changing 
> DocValues should also be done through IndexWriter in trunk (once it is ready)




[jira] [Commented] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164611#comment-13164611
 ] 

Uwe Schindler commented on LUCENE-3620:
---

I will do it once the branch is merged back.

> FilterIndexReader does not override all of IndexReader methods
> --
>
> Key: LUCENE-3620
> URL: https://issues.apache.org/jira/browse/LUCENE-3620
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3620-trunk.patch, LUCENE-3620.patch, 
> LUCENE-3620.patch, LUCENE-3620.patch
>
>
> FilterIndexReader does not override all of IndexReader's methods. We hit an 
> error in LUCENE-3573 (and fixed it), so I thought to write a simple test 
> which asserts that FIR overrides all methods of IR (and we can filter out 
> methods that we don't think it should override). The test is very simple 
> (attached), and it currently fails over these methods:
> {code}
> getRefCount
> incRef
> tryIncRef
> decRef
> reopen
> reopen
> reopen
> reopen
> clone
> numDeletedDocs
> document
> setNorm
> setNorm
> termPositions
> deleteDocument
> deleteDocuments
> undeleteAll
> getIndexCommit
> getUniqueTermCount
> getTermInfosIndexDivisor
> {code}
> I didn't yet fix anything in FIR -- if you spot a method that you think we 
> should not override and delegate, please comment.
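The kind of reflection test described above can be sketched in a few lines: collect every non-private, non-static method a base class declares and check that the filter class declares its own override. The Base/Filter classes and the helper name below are hypothetical stand-ins, not the actual IndexReader/FilterIndexReader test.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class OverrideChecker {
    // Names of base-class methods that `filter` does not declare itself.
    static List<String> missingOverrides(Class<?> base, Class<?> filter) {
        List<String> missing = new ArrayList<>();
        for (Method m : base.getDeclaredMethods()) {
            int mods = m.getModifiers();
            if (Modifier.isPrivate(mods) || Modifier.isStatic(mods)) continue;
            try {
                filter.getDeclaredMethod(m.getName(), m.getParameterTypes());
            } catch (NoSuchMethodException e) {
                missing.add(m.getName());
            }
        }
        return missing;
    }

    // Illustrative classes: Filter overrides foo() but forgets bar().
    static class Base { public void foo() {} public void bar() {} }
    static class Filter extends Base { @Override public void foo() {} }
}
```

A real version would also whitelist methods the filter is deliberately allowed to inherit, as the comment above suggests.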




[jira] [Commented] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-07 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164609#comment-13164609
 ] 

Shai Erera commented on LUCENE-3620:


bq. I added the patch to the LUCENE-3606 branch and fixed FilterIndexReader 
there

That's great, thanks ! So can I resolve this issue?

> FilterIndexReader does not override all of IndexReader methods
> --
>
> Key: LUCENE-3620
> URL: https://issues.apache.org/jira/browse/LUCENE-3620
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3620-trunk.patch, LUCENE-3620.patch, 
> LUCENE-3620.patch, LUCENE-3620.patch
>
>
> FilterIndexReader does not override all of IndexReader's methods. We hit an 
> error in LUCENE-3573 (and fixed it), so I thought to write a simple test 
> which asserts that FIR overrides all methods of IR (and we can filter out 
> methods that we don't think it should override). The test is very simple 
> (attached), and it currently fails over these methods:
> {code}
> getRefCount
> incRef
> tryIncRef
> decRef
> reopen
> reopen
> reopen
> reopen
> clone
> numDeletedDocs
> document
> setNorm
> setNorm
> termPositions
> deleteDocument
> deleteDocuments
> undeleteAll
> getIndexCommit
> getUniqueTermCount
> getTermInfosIndexDivisor
> {code}
> I didn't yet fix anything in FIR -- if you spot a method that you think we 
> should not override and delegate, please comment.




[jira] [Commented] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Grant Ingersoll (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164594#comment-13164594
 ] 

Grant Ingersoll commented on SOLR-1730:
---

All tests pass for me locally.

> Solr fails to start if QueryElevationComponent config is missing
> 
>
> Key: SOLR-1730
> URL: https://issues.apache.org/jira/browse/SOLR-1730
> Project: Solr
>  Issue Type: Bug
>  Components: SearchComponents - other
>Affects Versions: 1.4
>Reporter: Mark Miller
>Assignee: Grant Ingersoll
>  Labels: newdev
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-1730.patch, SOLR-1730.patch
>
>
> QueryElevationComponent tries to preload some data if its config file does 
> not exist:
> {code}
> if (!exists){
>   // preload the first data
> RefCounted<SolrIndexSearcher> searchHolder = null;
>   try {
> searchHolder = core.getNewestSearcher(false);
> IndexReader reader = searchHolder.get().getReader();
> getElevationMap( reader, core );
>   } finally {
> if (searchHolder != null) searchHolder.decref();
>   }
> }
> {code}
> This does not work though, as asking for the newest searcher causes a request 
> to be submitted to Solr before it's ready to handle it:
> {code}
>  [java] SEVERE: java.lang.NullPointerException
>  [java]   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
>  [java]   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>  [java]   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
>  [java]   at 
> org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:52)
>  [java]   at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1147)
>  [java]   at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>  [java]   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> {code}
> The SearchHandler has not yet been informed of the core (the 
> QueryElevationComponent triggers this while it is itself being informed of the 
> core, right before the SearchHandler), and so its components ArrayList is 
> still null.




[jira] [Updated] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Grant Ingersoll (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1730:
--

Attachment: SOLR-1730.patch

This patch should fix the problem.  I did a couple of things:

1. Addressed the solrconfig issue Yonik raised (i.e. use a sys property)
2. I logged that the core can't be created.
3. If there is only 1 core being created, then this throws an exception up and 
out of Solr to the container.  Based on the docs, it seems different containers 
will deal with this as they see fit.  Jetty simply displays an error message.
4. I marked the exception coming out of QEC as not logged yet.
5. In the SolrCore initialization code, I changed from only catching 
IOException to catching Throwable.  I also then release the latch and close 
down any resources the core has allocated so far.  I had to release the latch 
in the catch block there; otherwise the ExecutorService can't shut down because 
it is blocked on the latch.

I'm running full tests now.
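The latch interaction in point 5 can be reproduced in isolation: a listener task submitted to an executor blocks on a CountDownLatch, and unless the failing init path counts the latch down in its catch block, the pool can never terminate. This is a simplified, hypothetical stand-in for the SolrCore initialization code, not Solr's actual classes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LatchReleaseDemo {
    // Simulates a failing core init. Returns true if the executor managed
    // to shut down within the timeout.
    static boolean initAndShutdown(boolean releaseLatchOnFailure) throws Exception {
        CountDownLatch latch = new CountDownLatch(1);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // The "newSearcher listener": waits until init signals the latch.
        pool.submit(() -> { latch.await(); return null; });
        try {
            throw new RuntimeException("core init failed"); // simulated init failure
        } catch (Throwable t) {
            if (releaseLatchOnFailure) latch.countDown(); // the fix from point 5
        }
        pool.shutdown();
        boolean done = pool.awaitTermination(2, TimeUnit.SECONDS);
        pool.shutdownNow(); // interrupt the stuck task in the failure case
        return done;
    }
}
```

With the latch released the pool drains normally; without it, awaitTermination times out because the listener task is still blocked.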

> Solr fails to start if QueryElevationComponent config is missing
> 
>
> Key: SOLR-1730
> URL: https://issues.apache.org/jira/browse/SOLR-1730
> Project: Solr
>  Issue Type: Bug
>  Components: SearchComponents - other
>Affects Versions: 1.4
>Reporter: Mark Miller
>Assignee: Grant Ingersoll
>  Labels: newdev
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-1730.patch, SOLR-1730.patch
>
>
> QueryElevationComponent tries to preload some data if its config file does 
> not exist:
> {code}
> if (!exists){
>   // preload the first data
> RefCounted<SolrIndexSearcher> searchHolder = null;
>   try {
> searchHolder = core.getNewestSearcher(false);
> IndexReader reader = searchHolder.get().getReader();
> getElevationMap( reader, core );
>   } finally {
> if (searchHolder != null) searchHolder.decref();
>   }
> }
> {code}
> This does not work though, as asking for the newest searcher causes a request 
> to be submitted to Solr before it's ready to handle it:
> {code}
>  [java] SEVERE: java.lang.NullPointerException
>  [java]   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
>  [java]   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>  [java]   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
>  [java]   at 
> org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:52)
>  [java]   at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1147)
>  [java]   at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>  [java]   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> {code}
> The SearchHandler has not yet been informed of the core (the 
> QueryElevationComponent triggers this while it is itself being informed of the 
> core, right before the SearchHandler), and so its components ArrayList is 
> still null.




[jira] [Commented] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Grant Ingersoll (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164575#comment-13164575
 ] 

Grant Ingersoll commented on SOLR-1730:
---

The NPE is due to the fact that we call initListeners() and then getSearcher(), 
which creates the new searcher and registers a Future/Callable on those 
listeners, passing in "this" (i.e. the partially constructed core that is about 
to fail). Later, when the core fails, there is still a thread/future/callable 
waiting to fire off the newSearcher event, which it does as soon as the 
CountDownLatch is released.  Little does it know, the core is actually dead.  I 
don't particularly think we need to fix this other than perhaps documenting it 
here; since things are undefined at this point because the core is dead, we 
shouldn't care too much about these side consequences.

> Solr fails to start if QueryElevationComponent config is missing
> 
>
> Key: SOLR-1730
> URL: https://issues.apache.org/jira/browse/SOLR-1730
> Project: Solr
>  Issue Type: Bug
>  Components: SearchComponents - other
>Affects Versions: 1.4
>Reporter: Mark Miller
>Assignee: Grant Ingersoll
>  Labels: newdev
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-1730.patch
>
>
> QueryElevationComponent tries to preload some data if its config file does 
> not exist:
> {code}
> if (!exists){
>   // preload the first data
> RefCounted<SolrIndexSearcher> searchHolder = null;
>   try {
> searchHolder = core.getNewestSearcher(false);
> IndexReader reader = searchHolder.get().getReader();
> getElevationMap( reader, core );
>   } finally {
> if (searchHolder != null) searchHolder.decref();
>   }
> }
> {code}
> This does not work though, as asking for the newest searcher causes a request 
> to be submitted to Solr before it's ready to handle it:
> {code}
>  [java] SEVERE: java.lang.NullPointerException
>  [java]   at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
>  [java]   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>  [java]   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1317)
>  [java]   at 
> org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:52)
>  [java]   at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1147)
>  [java]   at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>  [java]   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> {code}
> The SearchHandler has not yet been informed of the core (the 
> QueryElevationComponent triggers this while it is itself being informed of the 
> core, right before the SearchHandler), and so its components ArrayList is 
> still null.




[jira] [Commented] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164565#comment-13164565
 ] 

Yonik Seeley commented on SOLR-1730:


Is there an easy way we could avoid yet another solrconfig.xml file?  Perhaps 
making the elevate file a system property in the existing 
solrconfig-elevate.xml and just change it for the "bad" test?




[jira] [Commented] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164560#comment-13164560
 ] 

Uwe Schindler commented on LUCENE-3620:
---

In general, like reopen, getSequentialSubReaders should default to returning 
null in FilterIndexReader. If it delegates, the filter has no chance to filter 
the segments, leading to bugs...

> FilterIndexReader does not override all of IndexReader methods
> --
>
> Key: LUCENE-3620
> URL: https://issues.apache.org/jira/browse/LUCENE-3620
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3620-trunk.patch, LUCENE-3620.patch, 
> LUCENE-3620.patch, LUCENE-3620.patch
>
>
> FilterIndexReader does not override all of IndexReader methods. We've hit an 
> error in LUCENE-3573 (and fixed it). So I thought to write a simple test 
> which asserts that FIR overrides all methods of IR (and we can filter our 
> methods that we don't think that it should override). The test is very simple 
> (attached), and it currently fails over these methods:
> {code}
> getRefCount
> incRef
> tryIncRef
> decRef
> reopen
> reopen
> reopen
> reopen
> clone
> numDeletedDocs
> document
> setNorm
> setNorm
> termPositions
> deleteDocument
> deleteDocuments
> undeleteAll
> getIndexCommit
> getUniqueTermCount
> getTermInfosIndexDivisor
> {code}
> I didn't yet fix anything in FIR -- if you spot a method that you think we 
> should not override and delegate, please comment.
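The "simple test" described above can be sketched with plain reflection: list the public/protected methods a subclass inherits without overriding. This is an illustrative stand-alone version, not the attached patch; `OverrideChecker` and the demo classes are hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Report methods declared on a base class that a subclass does not override.
class OverrideChecker {
    static List<String> notOverridden(Class<?> base, Class<?> sub, Set<String> exempt) {
        List<String> missing = new ArrayList<>();
        for (Method m : base.getDeclaredMethods()) {
            int mod = m.getModifiers();
            // skip methods a subclass cannot or need not override
            if (Modifier.isStatic(mod) || Modifier.isFinal(mod) || Modifier.isPrivate(mod)) continue;
            if (exempt.contains(m.getName())) continue;
            try {
                sub.getDeclaredMethod(m.getName(), m.getParameterTypes());
            } catch (NoSuchMethodException e) {
                missing.add(m.getName());   // declared on base only
            }
        }
        return missing;
    }
}

// Tiny demo hierarchy standing in for IndexReader / FilterIndexReader.
class DemoBase { public void foo() {} public void bar() {} }
class DemoSub extends DemoBase { @Override public void foo() {} }
```

The exempt set plays the role of "methods that we don't think it should override" from the description.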




[jira] [Commented] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164558#comment-13164558
 ] 

Uwe Schindler commented on LUCENE-3620:
---

I added the patch to the LUCENE-3606 branch and fixed FilterIndexReader there 
(it was missing two methods: getIndexCommit and getTermInfosIndexDivisor). The 
CHANGES.txt entry was merged to trunk.




[jira] [Commented] (LUCENE-3298) FST has hard limit max size of 2.1 GB

2011-12-07 Thread Carlos González-Cadenas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164546#comment-13164546
 ] 

Carlos González-Cadenas commented on LUCENE-3298:
-

Hello James,

Now we're using it and for the moment we haven't noticed any problems (although 
I must say that we haven't done extensive testing). I'll let you know if we 
find any.

I haven't updated the patch to sync with the current trunk; I've just reverted 
to the appropriate version of Lucene identified in the patch and applied it 
there. If you have some time, it would be great if you could sync the patch 
with the current trunk. 

As you suggest, I'll also take a look at the sections you mention to see if we 
can make it more efficient.

Thanks
Carlos



> FST has hard limit max size of 2.1 GB
> -
>
> Key: LUCENE-3298
> URL: https://issues.apache.org/jira/browse/LUCENE-3298
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/FSTs
>Reporter: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-3298.patch
>
>
> The FST uses a single contiguous byte[] under the hood, which in Java is 
> indexed by int, so we cannot grow it over Integer.MAX_VALUE.  It also 
> internally encodes references into this array as vInt.
> We could switch this to a paged byte[] and make the FST far larger.
> But I think this is low priority... I'm not going to work on it any time soon.
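The paged byte[] idea can be sketched in isolation: address the store with a long, and split each address into a page index and an in-page offset with shifts and masks. A hedged illustration only; `LongPagedBytes` and its page size are made up, and Lucene's actual FST storage differs in detail.

```java
// Sketch of a paged byte[] addressed by long rather than int.
class LongPagedBytes {
    private static final int PAGE_BITS = 15;            // 32 KB pages
    private static final int PAGE_SIZE = 1 << PAGE_BITS;
    private static final int PAGE_MASK = PAGE_SIZE - 1;
    private byte[][] pages = new byte[1][PAGE_SIZE];

    private void ensurePage(int page) {
        if (page >= pages.length) {
            byte[][] grown = new byte[page + 1][];
            System.arraycopy(pages, 0, grown, 0, pages.length);
            for (int i = pages.length; i <= page; i++) grown[i] = new byte[PAGE_SIZE];
            pages = grown;
        }
    }

    void set(long addr, byte b) {
        int page = (int) (addr >>> PAGE_BITS);          // which page
        ensurePage(page);
        pages[page][(int) (addr & PAGE_MASK)] = b;      // offset within page
    }

    byte get(long addr) {
        return pages[(int) (addr >>> PAGE_BITS)][(int) (addr & PAGE_MASK)];
    }
}
```

The vInt-encoded intra-FST references mentioned above would also need a wider encoding; that part is not shown here.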




[jira] [Commented] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164539#comment-13164539
 ] 

Uwe Schindler commented on LUCENE-3626:
---

Correction: it could work, but it does not justify the complexity.

> Make PKIndexSplitter and MultiPassIndexSplitter work per segment
> 
>
> Key: LUCENE-3626
> URL: https://issues.apache.org/jira/browse/LUCENE-3626
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/other
>Affects Versions: 4.0
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 4.0
>
>
> Spinoff from LUCENE-3624: the DocValues merger throws an exception on 
> IW.addIndexes(SlowMultiReaderWrapper), as string-index-like docvalues cannot 
> provide asSortedSource.




[jira] [Commented] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164538#comment-13164538
 ] 

Uwe Schindler commented on LUCENE-3626:
---

This only affects Lucene 4.0, as 3.x has no DocValues. The PKIndexSplitter 
per-segment variant could be backported to 3.x, but MultiPassIndexSplitter, 
which works on absolute docIds, cannot handle that, as the AtomicReaderContext 
containing the docBase is not available in 3.x.
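The docBase bookkeeping mentioned here boils down to mapping a global docId to a (segment, local docId) pair. A minimal sketch, assuming a sorted array of per-segment docBases; `DocBaseMap` is a hypothetical helper, not a Lucene class.

```java
import java.util.Arrays;

// Map a global docId to its segment and segment-local docId, given each
// segment's starting docBase (sorted ascending, first entry 0).
class DocBaseMap {
    static int segmentOf(int[] docBases, int globalDoc) {
        int idx = Arrays.binarySearch(docBases, globalDoc);
        // exact hit: first doc of that segment; miss: insertion point - 1
        return idx >= 0 ? idx : -idx - 2;
    }

    static int localDoc(int[] docBases, int globalDoc) {
        return globalDoc - docBases[segmentOf(docBases, globalDoc)];
    }
}
```

In 4.0 the per-segment reader context carries this base directly, which is what makes a per-segment splitter straightforward there.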




[jira] [Commented] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Grant Ingersoll (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164537#comment-13164537
 ] 

Grant Ingersoll commented on SOLR-1730:
---

OK, so per IRC discussion w/ Mark and looking at the code, this exception 
actually causes the core to fail to construct and be registered.

It seems to me, then, that the way forward is: if this is the only core (or 
would be the only core), Solr should fail and exit.  If there are other cores, 
it should log that the core cannot be created and then proceed.  One core 
failure should not cause the others to be out of service.
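The policy described in this comment (abort startup only when no core can start, otherwise log the failure and continue) could look roughly like the following; `CoreStartupPolicy` and the Callable-based loaders are hypothetical stand-ins for Solr's core loading, not its real API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

// One core's failure is logged and skipped; startup fails only if
// no core at all could be brought up.
class CoreStartupPolicy {
    static List<String> loadCores(Map<String, Callable<Object>> loaders) {
        List<String> started = new ArrayList<>();
        for (Map.Entry<String, Callable<Object>> e : loaders.entrySet()) {
            try {
                e.getValue().call();
                started.add(e.getKey());
            } catch (Exception ex) {
                // log and keep going; other cores stay in service
                System.err.println("Core '" + e.getKey() + "' failed to load: " + ex);
            }
        }
        if (started.isEmpty()) {
            throw new IllegalStateException("No core could be started; failing startup");
        }
        return started;
    }
}
```

With a single core whose loader throws, `started` is empty and the whole startup fails, matching the "only core" case above.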




[jira] [Commented] (LUCENE-3624) Throw exception for "Multi-SortedSource" instead of returning null

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164534#comment-13164534
 ] 

Uwe Schindler commented on LUCENE-3624:
---

I will take care of making PKIndexSplitter (easy) and MultiPassIndexSplitter 
(messy, because it splits by absolute docId) work per segment. I opened 
LUCENE-3626.

> Throw exception for "Multi-SortedSource" instead of returning null
> --
>
> Key: LUCENE-3624
> URL: https://issues.apache.org/jira/browse/LUCENE-3624
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-3624.patch
>
>
> Spinoff of LUCENE-3623: currently if you addIndexes(FIR) or similar, you get 
> an NPE deep within codecs during merge.
> I think the NPE is confusing; it looks like a bug, but a clearer exception 
> would be an improvement.




[jira] [Created] (LUCENE-3626) Make PKIndexSplitter and MultiPassIndexSplitter work per segment

2011-12-07 Thread Uwe Schindler (Created) (JIRA)
Make PKIndexSplitter and MultiPassIndexSplitter work per segment


 Key: LUCENE-3626
 URL: https://issues.apache.org/jira/browse/LUCENE-3626
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/other
Affects Versions: 4.0
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 4.0


Spinoff from LUCENE-3624: the DocValues merger throws an exception on 
IW.addIndexes(SlowMultiReaderWrapper), as string-index-like docvalues cannot 
provide asSortedSource.




[jira] [Commented] (LUCENE-3298) FST has hard limit max size of 2.1 GB

2011-12-07 Thread James Dyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164531#comment-13164531
 ] 

James Dyer commented on LUCENE-3298:


Carlos,

I'm not sure how much help this is, but you might be able to eke out a little 
performance if you can tighten RewritablePagedBytes.copyBytes().  You'll note 
it currently moves the From-Bytes into a temp array and then writes that back 
to the FST at the To-Bytes location.  Note also that the one place this gets 
called, it used to be a simple System.arraycopy.  So if you can make it copy 
in-place, that might claw back the performance loss a little.  Beyond this, a 
different pair of eyes might find more ways to optimize.  In the end, though, 
you will likely never make it perform quite as well as the simple array.

Also, it sounds as if you've maybe done work to sync this with the current 
trunk.  If so, would you mind uploading the updated patch?

Also, if you end up using this, be sure to test thoroughly.  I implemented this 
one just to gain a little familiarity with the code and I do not claim any sort 
of expertise in this area, so beware!  But all of the regular unit tests did 
pass for me.  I was meaning to try to run test2bpostings against this but 
wasn't able to get it set up.  If I remember correctly, this issue came up 
originally because someone wanted to run test2bpostings with MemoryCodec and it 
was going past the limit.
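The in-place copy suggestion amounts to a memmove: pick the copy direction based on which way the source and destination ranges overlap, so no temp array is needed. A simplified sketch on a flat byte[]; the real code would operate on paged storage, and `InPlaceCopy` is purely illustrative.

```java
// Move len bytes from src to dst within one buffer without a temporary
// array, choosing the direction so overlapping ranges copy correctly.
class InPlaceCopy {
    static void copy(byte[] buf, int src, int dst, int len) {
        if (dst < src) {
            // destination is before source: copy forward
            for (int i = 0; i < len; i++) buf[dst + i] = buf[src + i];
        } else if (dst > src) {
            // destination is after source: copy backward to avoid clobbering
            for (int i = len - 1; i >= 0; i--) buf[dst + i] = buf[src + i];
        }
    }
}
```

Across page boundaries the same direction rule applies; only the indexing gets more involved.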




[jira] [Resolved] (LUCENE-3624) Throw exception for "Multi-SortedSource" instead of returning null

2011-12-07 Thread Robert Muir (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-3624.
-

   Resolution: Fixed
Fix Version/s: 4.0




[jira] [Commented] (LUCENE-3624) Throw exception for "Multi-SortedSource" instead of returning null

2011-12-07 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164526#comment-13164526
 ] 

Michael McCandless commented on LUCENE-3624:


+1




[jira] [Commented] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Grant Ingersoll (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164523#comment-13164523
 ] 

Grant Ingersoll commented on SOLR-1730:
---

{quote}
The SearchHandler has not yet been core informed (as the 
QueryElevationComponent causes this as its getting core informed right before 
the SearchHandler) and so its components arraylist is still null.
{quote}

I believe this is no longer the case, at least in 4.  I think this all works 
correctly other than what to do if an inform() actually fails.  For QEC, it's 
probably enough to log and silently not elevate anything, but I'm not sure if 
that makes sense with other components.




[jira] [Commented] (LUCENE-3624) Throw exception for "Multi-SortedSource" instead of returning null

2011-12-07 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164507#comment-13164507
 ] 

Simon Willnauer commented on LUCENE-3624:
-

+1 to commit




[jira] [Updated] (SOLR-1730) Solr fails to start if QueryElevationComponent config is missing

2011-12-07 Thread Grant Ingersoll (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1730:
--

Attachment: SOLR-1730.patch

A little bit of progress, namely in setting up some tests for this as well as 
fixing the logging of the main exception.

The BadComponentTest shows the error (as well as some issue with either the 
harness or the core itself when it comes to bad components).  The QEC is just 
the symptom of what's wrong here, as all components produce similar errors if 
inform() fails.  The real question is what we should do about it; since 
inform() is called on reloads, not just at startup, it gets a bit trickier with 
the fail-early approach that one often wants.




[jira] [Commented] (LUCENE-3097) Post grouping faceting

2011-12-07 Thread Martijn van Groningen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164496#comment-13164496
 ] 

Martijn van Groningen commented on LUCENE-3097:
---

Yes, if you're using Solr you can try to apply the patch; it should work for 
field facets. 

> Post grouping faceting
> --
>
> Key: LUCENE-3097
> URL: https://issues.apache.org/jira/browse/LUCENE-3097
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: modules/grouping
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-3097.patch, 
> LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-30971.patch
>
>
> This issues focuses on implementing post grouping faceting.
> * How to handle multivalued fields. What field value to show with the facet.
> * What the facet counts should be based on:
> ** Facet counts can be based on the normal documents. Ungrouped counts. 
> ** Facet counts can be based on the groups. Grouped counts.
> ** Facet counts can be based on the combination of group value and facet 
> value. Matrix counts.   
> And probably more implementation options.
> The first two methods are implemented in the SOLR-236 patch. For the first 
> option it calculates a DocSet based on the individual documents from the 
> query result. For the second option it calculates a DocSet for all the most 
> relevant documents of a group. Once the DocSet is computed, the 
> FacetComponent and StatsComponent use the DocSet to create facets and 
> statistics.  
> This last one is a bit more complex. I think it is best explained with an 
> example. Let's say we search on travel offers:
> |||hotel||departure_airport||duration||
> |Hotel a|AMS|5
> |Hotel a|DUS|10
> |Hotel b|AMS|5
> |Hotel b|AMS|10
> If we group by hotel and have a facet for airport, most end users expect 
> (according to my experience, of course) the following airport facet:
> AMS: 2
> DUS: 1
> The above result can't be achieved by the first two methods. You either get 
> counts AMS:3 and DUS:1 or 1 for both airports.
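The difference between the first two counting strategies and the expected result can be sketched outside Solr with plain Python (an illustrative toy, not Solr's implementation; the duration column is omitted as irrelevant to the counts):

```python
from collections import Counter

# Travel offers from the example: (hotel = group value, airport = facet value).
offers = [
    ("Hotel a", "AMS"),
    ("Hotel a", "DUS"),
    ("Hotel b", "AMS"),
    ("Hotel b", "AMS"),
]

# Ungrouped counts: every matching document contributes one count.
ungrouped = Counter(airport for _, airport in offers)

# Grouped counts: each (group, facet value) pair is counted at most once,
# which is what most end users expect for the airport facet.
grouped = Counter(airport for _, airport in set(offers))

print(ungrouped)  # AMS: 3, DUS: 1
print(grouped)    # AMS: 2, DUS: 1
```

This makes the mismatch concrete: the per-document counts give AMS:3, while deduplicating per group gives the expected AMS:2, DUS:1.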




[jira] [Commented] (LUCENE-3298) FST has hard limit max size of 2.1 GB

2011-12-07 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164495#comment-13164495
 ] 

Yonik Seeley commented on LUCENE-3298:
--

Perhaps we should just have two implementations (a 32 bit one, and a 64 bit 
one)?

> FST has hard limit max size of 2.1 GB
> -
>
> Key: LUCENE-3298
> URL: https://issues.apache.org/jira/browse/LUCENE-3298
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/FSTs
>Reporter: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-3298.patch
>
>
> The FST uses a single contiguous byte[] under the hood, which in java is 
> indexed by int so we cannot grow this over Integer.MAX_VALUE.  It also 
> internally encodes references to this array as vInt.
> We could switch this to a paged byte[] and make the FST far larger.
> But I think this is low priority... I'm not going to work on it any time soon.
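The paged byte[] idea above amounts to splitting a long offset into a page index and an in-page offset, so addressing is no longer capped at Integer.MAX_VALUE. A minimal sketch under assumed names (`PagedBytes`, a 32 KiB page size chosen only for illustration; not the eventual Lucene code):

```python
PAGE_BITS = 15            # 32 KiB pages for this sketch; a real impl would tune this
PAGE_SIZE = 1 << PAGE_BITS
PAGE_MASK = PAGE_SIZE - 1

class PagedBytes:
    """Growable byte store addressed by an arbitrarily large offset,
    backed by a list of fixed-size pages instead of one contiguous array."""

    def __init__(self):
        self.pages = [bytearray(PAGE_SIZE)]

    def _ensure(self, offset):
        # Grow the page list until the page containing `offset` exists.
        while (offset >> PAGE_BITS) >= len(self.pages):
            self.pages.append(bytearray(PAGE_SIZE))

    def put(self, offset, value):
        self._ensure(offset)
        self.pages[offset >> PAGE_BITS][offset & PAGE_MASK] = value

    def get(self, offset):
        return self.pages[offset >> PAGE_BITS][offset & PAGE_MASK]

store = PagedBytes()
store.put(5 * PAGE_SIZE + 7, 42)     # offset beyond the first page
print(store.get(5 * PAGE_SIZE + 7))  # 42
```

The trade-off discussed in the thread is exactly the extra shift/mask (and indirection) on every access, which is where the reported 10-20% slowdown comes from.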




Re: [VOTE] Release PyLucene 3.5.0

2011-12-07 Thread Andi Vajda


On Wed, 7 Dec 2011, Bill Janssen wrote:


Here's an issue with the new IndexPolicy class:

compile:
   [mkdir] Created dir: /private/tmp/pylucene-3.5.0-1/build/classes
   [javac] /private/tmp/pylucene-3.5.0-1/extensions.xml:19: warning: 
'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to 
false for repeatable builds
   [javac] Compiling 30 source files to 
/private/tmp/pylucene-3.5.0-1/build/classes
   [javac] 
/private/tmp/pylucene-3.5.0-1/java/org/apache/pylucene/index/PythonIndexDeletionPolicy.java:48:
 method does not override a method from its superclass
   [javac] @Override
   [javac]  ^
   [javac] 
/private/tmp/pylucene-3.5.0-1/java/org/apache/pylucene/index/PythonIndexDeletionPolicy.java:52:
 method does not override a method from its superclass
   [javac] @Override
   [javac]  ^
   [javac] Note: Some input files use or override a deprecated API.
   [javac] Note: Recompile with -Xlint:deprecation for details.
   [javac] 2 errors

BUILD FAILED
/private/tmp/pylucene-3.5.0-1/extensions.xml:19: Compile failed; see the 
compiler error output for details.

Total time: 0 seconds
make: *** [build/jar/extensions.jar] Error 1
/tmp/pylucene-3.5.0-1 941 % java -version
java -version
java version "1.5.0_30"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_30-b03-389-9M3425)
Java HotSpot(TM) Client VM (build 1.5.0_30-161, mixed mode, sharing)
/tmp/pylucene-3.5.0-1 942 %

This is OS X 10.5 with the 32-bit Python 2.5 and corresponding 32-bit Java 1.5.


Yes, that's the problem. Java 1.5 doesn't allow @Override on methods that 
implement interface methods.


Thank you for reporting this !

Andi..


Re: [VOTE] Release PyLucene 3.5.0

2011-12-07 Thread Andi Vajda


On Wed, 7 Dec 2011, Bill Janssen wrote:


Andi Vajda  wrote:


A list of Lucene Java changes can be seen at:
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_5/lucene/CHANGES.txt


``The requested URL
/repos/asf/lucene/dev/tags/lucene_solr_3_5/lucene/CHANGES.txt was not
found on this server.''


Ouch, I even used the wrong thing for the release.
I should have used the 3_5_0 lucene release tag, not the branch.

Rolling a new RC.

Andi..



Re: [VOTE] Release PyLucene 3.5.0

2011-12-07 Thread Andi Vajda


On Wed, 7 Dec 2011, Christian Heimes wrote:


Am 07.12.2011 03:39, schrieb Andi Vajda:


The PyLucene 3.5.0-1 release closely tracking the recent release of
Apache Lucene 3.5.0 is ready.

A release candidate is available from:
http://people.apache.org/~vajda/staging_area/

A list of changes in this release can be seen at:
http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_5/CHANGES

PyLucene 3.5.0 is built with JCC 2.12 included in these release artifacts.


Hello Andi,

The CHANGES file doesn't mention JCC 2.12.


Fixed.

Thanks !

Andi..



[jira] [Commented] (LUCENE-3097) Post grouping faceting

2011-12-07 Thread Ian Grainger (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164482#comment-13164482
 ] 

Ian Grainger commented on LUCENE-3097:
--

Oh, sorry- I just read the previous comment _properly_ - So the case I need 
fixing is [SOLR-2898|https://issues.apache.org/jira/browse/SOLR-2898]?

> Post grouping faceting
> --
>
> Key: LUCENE-3097
> URL: https://issues.apache.org/jira/browse/LUCENE-3097
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: modules/grouping
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-3097.patch, 
> LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-30971.patch
>
>
> This issue focuses on implementing post-grouping faceting.
> * How to handle multivalued fields. What field value to show with the facet.
> * Where the facet counts should be based on
> ** Facet counts can be based on the normal documents. Ungrouped counts. 
> ** Facet counts can be based on the groups. Grouped counts.
> ** Facet counts can be based on the combination of group value and facet 
> value. Matrix counts.   
> And probably more implementation options.
> The first two methods are implemented in the SOLR-236 patch. For the first 
> option it calculates a DocSet based on the individual documents from the 
> query result. For the second option it calculates a DocSet for all the most 
> relevant documents of a group. Once the DocSet is computed, the FacetComponent 
> and StatsComponent use the DocSet to create facets and statistics.  
> This last one is a bit more complex. I think it is best explained with an 
> example. Let's say we search on travel offers:
> |||hotel||departure_airport||duration||
> |Hotel a|AMS|5
> |Hotel a|DUS|10
> |Hotel b|AMS|5
> |Hotel b|AMS|10
> If we group by hotel and have a facet for airport, most end users expect 
> (according to my experience, of course) the following airport facet:
> AMS: 2
> DUS: 1
> The above result can't be achieved by the first two methods. You either get 
> counts AMS:3 and DUS:1 or 1 for both airports.




[jira] [Commented] (LUCENE-3298) FST has hard limit max size of 2.1 GB

2011-12-07 Thread Carlos González-Cadenas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164479#comment-13164479
 ] 

Carlos González-Cadenas commented on LUCENE-3298:
-

Thanks for the presentation. It's very interesting. 

Now that we've invested significant time in this approach, we'd like to stick 
with it a little longer and see where we can get. The FST approach, given that 
it is much more low-level, will give us more control over the functionality 
down the road, which will definitely prove beneficial mid-term. If needed due 
to space requirements, we can think of replacing the FST with an LZTrie for 
more infix compression of the permutations.

Re: next steps, you commented above that you may consider including this patch 
in the codebase when you have people who need it. We obviously would be very 
interested in this patch getting into trunk. 

In terms of performance, James is speaking about a 20% performance loss on a 
32-bit machine; we're seeing less degradation on a 64-bit machine, something 
around 10-15% depending on the specific FST and query. If you or James 
envision any way to optimize it, let me know; we can give a hand here if you 
tell us the potential paths to make it more efficient.  
 



> FST has hard limit max size of 2.1 GB
> -
>
> Key: LUCENE-3298
> URL: https://issues.apache.org/jira/browse/LUCENE-3298
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: core/FSTs
>Reporter: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-3298.patch
>
>
> The FST uses a single contiguous byte[] under the hood, which in java is 
> indexed by int so we cannot grow this over Integer.MAX_VALUE.  It also 
> internally encodes references to this array as vInt.
> We could switch this to a paged byte[] and make the FST far larger.
> But I think this is low priority... I'm not going to work on it any time soon.




[jira] [Commented] (LUCENE-2208) Token div exceeds length of provided text sized 4114

2011-12-07 Thread Vadim Kisselmann (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164478#comment-13164478
 ] 

Vadim Kisselmann commented on LUCENE-2208:
--

The attached patch doesn't work, unfortunately. Removing 
HTMLStripCharFilterFactory avoids the problem, but that is not a solution: I 
need this filter.

> Token div exceeds length of provided text sized 4114
> 
>
> Key: LUCENE-2208
> URL: https://issues.apache.org/jira/browse/LUCENE-2208
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: modules/highlighter
>Affects Versions: 3.0
> Environment:  diagnostics = {os.version=5.1, os=Windows XP, 
> lucene.version=3.0.0 883080 - 2009-11-22 15:43:58, source=flush, os.arch=x86, 
> java.version=1.6.0_12, java.vendor=Sun Microsystems Inc.}
>
>Reporter: Ramazan VARLIKLI
> Attachments: LUCENE-2208.patch, LUCENE-2208_test.patch
>
>
> I have a doc which contains HTML. I want to strip the HTML tags to get 
> clean text and then apply the highlighter on the clean text. But the 
> highlighter throws an exception if I strip out the HTML characters; if I 
> don't strip them out, it works fine. It just confuses me at the moment. 
> I copy-paste 3 things here from the console as they may contain special 
> characters which might cause the problem.
> 1 -) Here is the html text 
>   Starter
>   
> 
> 
>  Learning path: History
>   Key question
>   Did transport fuel the industrial revolution?
>   Learning Objective
> 
>   To categorise points as for or against an argument
>   
> 
>   What to do?
>   
> Watch the clip: Transport fuelled the industrial 
> revolution.
>   
>   The clips claims that transport fuelled the industrial 
> revolution. Some historians argue that the industrial revolution only 
> happened because of developments in transport.
> 
>   Read the statements below and decide which 
> points are for and which points are against the argument 
> that industry expanded in the 18th and 19th centuries because of developments 
> in transport.
>   
>   
>   
>   Industry expanded because of inventions and 
> the discovery of steam power.
>   Improvements in transport allowed goods to 
> be sold all over the country and all over the world so there were more 
> customers to develop industry for.
>   Developments in transport allowed 
> resources, such as coal from mines and cotton from America to come together 
> to manufacture products.
>   Transport only developed because industry 
> needed it. It was slow to develop as money was spent on improving roads, then 
> building canals and the replacing them with railways in order to keep up with 
> industry.
>   
>   
>   Now try to think of 2 more statements of your 
> own.
>   
> 
> 
>   
>   Main activity
>   
> 
> Learning path: 
> History
>   Learning Objective
>   
> To select evidence to support points
>   
>   What to do?
>   
>   Choose the 4 points that you think are most important - 
> try to be balanced by having two for and two 
> against.
> Write one in each of the point boxes of the 
> paragraphs on the sheet  class="link-internal">Constructing a balanced argument. You 
> might like to re write the points in your own words and use connectives to 
> link the paragraphs.
>   
> In history and in any argument, you need evidence 
> to support your points.
> Find evidence from these sources and from 
> your own knowledge to support each of your points:
> 
href="../servlet/link?template=vid&macro=setResource&resourceID=2044" 
> class="link-internal">At a toll gate
>  href="../servlet/link?macro=setResource&template=vid&resourceID=2046" 
> class="link-internal">Canals
>  href="../servlet/link?macro=setResource&template=vid&resourceID=2043" 
> class="link-internal">Growing cities: traffic
>href="../servlet/link?macro=setResource&template=vid&resourceID=2047" 
> class="link-internal">Impact of the railway 
>href="../servlet/link?macro=setRes

[jira] [Updated] (LUCENE-3624) Throw exception for "Multi-SortedSource" instead of returning null

2011-12-07 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3624:


Attachment: LUCENE-3624.patch

Here's a patch. I did the same for getArray(): don't call it if hasArray() 
returns false, or you get an UnsupportedOperationException, consistent with 
ByteBuffer.

Now if you try to merge a SlowMultiReaderWrapper, the error is more obvious:
{noformat}
[junit] java.lang.UnsupportedOperationException: asSortedSource is not 
supported
[junit] at 
org.apache.lucene.index.values.IndexDocValues$Source.asSortedSource(IndexDocValues.java:224)
[junit] at 
org.apache.lucene.index.values.SortedBytesMergeUtils.buildSlices(SortedBytesMergeUtils.java:89)
[junit] at 
org.apache.lucene.index.values.VarSortedBytesImpl$Writer.merge(VarSortedBytesImpl.java:68)
[junit] at 
org.apache.lucene.index.codecs.PerDocConsumer.merge(PerDocConsumer.java:84)
[junit] at 
org.apache.lucene.index.SegmentMerger.mergePerDoc(SegmentMerger.java:321)
[junit] at 
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:124)
[junit] at 
org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2429)
{noformat}
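The hasArray()/getArray() convention described above (a capability check guarding an accessor that otherwise fails loudly, as java.nio.ByteBuffer does) can be sketched in miniature. Names here (`Source`, `ArrayBackedSource`) are illustrative, not the actual IndexDocValues API:

```python
class Source:
    """Base source: the optional capability fails loudly instead of
    silently returning None/null, so misuse surfaces immediately."""

    def has_array(self):
        return False

    def get_array(self):
        # Mirrors ByteBuffer: calling get_array() when has_array() is
        # false raises instead of returning null, giving an obvious error.
        raise NotImplementedError("get_array is not supported")

class ArrayBackedSource(Source):
    def __init__(self, values):
        self._values = values

    def has_array(self):
        return True

    def get_array(self):
        return self._values

for src in (ArrayBackedSource([1, 2, 3]), Source()):
    if src.has_array():
        print(src.get_array())
    else:
        print("no backing array")
```

The same shape explains the stack trace above: asSortedSource() on the multi-source now throws a clear UnsupportedOperationException instead of a confusing NPE deep in the merge.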

> Throw exception for "Multi-SortedSource" instead of returning null
> --
>
> Key: LUCENE-3624
> URL: https://issues.apache.org/jira/browse/LUCENE-3624
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Robert Muir
> Attachments: LUCENE-3624.patch
>
>
> Spinoff of LUCENE-3623: currently if you addIndexes(FIR) or similar, you get 
> a NPE deep within codecs during merge.
> I think the NPE is confusing, it looks like a bug but a clearer exception 
> would be an improvement.




Re: [VOTE] Release PyLucene 3.5.0

2011-12-07 Thread Bill Janssen
Here's an issue with the new IndexPolicy class:

compile:
[mkdir] Created dir: /private/tmp/pylucene-3.5.0-1/build/classes
[javac] /private/tmp/pylucene-3.5.0-1/extensions.xml:19: warning: 
'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to 
false for repeatable builds
[javac] Compiling 30 source files to 
/private/tmp/pylucene-3.5.0-1/build/classes
[javac] 
/private/tmp/pylucene-3.5.0-1/java/org/apache/pylucene/index/PythonIndexDeletionPolicy.java:48:
 method does not override a method from its superclass
[javac] @Override
[javac]  ^
[javac] 
/private/tmp/pylucene-3.5.0-1/java/org/apache/pylucene/index/PythonIndexDeletionPolicy.java:52:
 method does not override a method from its superclass
[javac] @Override
[javac]  ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 2 errors

BUILD FAILED
/private/tmp/pylucene-3.5.0-1/extensions.xml:19: Compile failed; see the 
compiler error output for details.

Total time: 0 seconds
make: *** [build/jar/extensions.jar] Error 1
/tmp/pylucene-3.5.0-1 941 % java -version
java -version
java version "1.5.0_30"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_30-b03-389-9M3425)
Java HotSpot(TM) Client VM (build 1.5.0_30-161, mixed mode, sharing)
/tmp/pylucene-3.5.0-1 942 % 

This is OS X 10.5 with the 32-bit Python 2.5 and corresponding 32-bit Java 1.5.

Bill


[jira] [Commented] (SOLR-2880) Investigate adding an overseer that can assign shards, later do re-balancing, etc

2011-12-07 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164457#comment-13164457
 ] 

Yonik Seeley commented on SOLR-2880:


Some random comments so far... Naming:
 - numShards vs num_shards... we should try to make system properties 
consistent with the names that actually appear in ZK
 - _core, _collection? why the underscores? 

I'm not sure num_shards belongs as a configuration item anywhere (in solr.xml 
or as a collection property in ZK). The number of shards a collection has is 
always just the number you see in ZK under the collection. This will make it 
easier for people with custom sharding to just add another shard. Whoever is 
creating the initial layout should thus create all of the shards at once.
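Under the layout described above, the shard count is simply derived from what is registered under the collection node rather than stored as a separate property. A toy illustration with plain dicts standing in for ZooKeeper nodes (the tree layout and host names are hypothetical):

```python
# Hypothetical ZK-like tree: collection -> shard -> list of replica nodes.
cluster_state = {
    "collection1": {
        "shard1": ["host1:8983_solr"],
        "shard2": ["host2:8983_solr", "host3:8983_solr"],
    }
}

def num_shards(state, collection):
    # No stored num_shards property: the count is just the number of
    # shard children under the collection node.
    return len(state[collection])

print(num_shards(cluster_state, "collection1"))  # 2

# Custom sharding: registering another shard node is all that's needed.
cluster_state["collection1"]["shard3"] = ["host4:8983_solr"]
print(num_shards(cluster_state, "collection1"))  # 3
```

This is why a separate num_shards config item can drift: the ZK tree is already the source of truth.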

> Investigate adding an overseer that can assign shards, later do re-balancing, 
> etc
> -
>
> Key: SOLR-2880
> URL: https://issues.apache.org/jira/browse/SOLR-2880
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 4.0
>
> Attachments: SOLR-2880-merge-elections.patch, SOLR-2880.patch
>
>





Re: [VOTE] Release PyLucene 3.5.0

2011-12-07 Thread Bill Janssen
Andi Vajda  wrote:

> A list of Lucene Java changes can be seen at:
> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_5/lucene/CHANGES.txt

``The requested URL
/repos/asf/lucene/dev/tags/lucene_solr_3_5/lucene/CHANGES.txt was not
found on this server.''



Re: [VOTE] Release PyLucene 3.5.0

2011-12-07 Thread Christian Heimes
Am 07.12.2011 03:39, schrieb Andi Vajda:
> 
> The PyLucene 3.5.0-1 release closely tracking the recent release of
> Apache Lucene 3.5.0 is ready.
> 
> A release candidate is available from:
> http://people.apache.org/~vajda/staging_area/
> 
> A list of changes in this release can be seen at:
> http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_5/CHANGES
> 
> PyLucene 3.5.0 is built with JCC 2.12 included in these release artifacts.

Hello Andi,

The CHANGES file doesn't mention JCC 2.12.

Christian


[jira] [Updated] (LUCENE-3586) Choose a specific Directory implementation running the CheckIndex main

2011-12-07 Thread Luca Cavanna (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Cavanna updated LUCENE-3586:
-

Attachment: LUCENE-3586.patch

New patch against trunk according to Michael's hints.
It's now possible to use external FSDirectory implementations. The package 
oal.store is used if no package is specified. This isn't good if someone has 
the FSDirectory implementation in the default package, but I'm not sure that 
case is worth a fallback. Please let me know what you think.
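The resolution rule just described (use the class name as given if it carries a package, otherwise assume oal.store, i.e. org.apache.lucene.store) can be outlined like this. This is an illustrative sketch, not the patch's actual code; the function name is invented:

```python
DEFAULT_PACKAGE = "org.apache.lucene.store"

def resolve_directory_class(name):
    """Return the fully qualified class name for a directory-impl argument."""
    if "." in name:
        # Already qualified: external FSDirectory implementations work too.
        return name
    # Bare name: assume the standard Lucene store package. As noted above,
    # a class in Java's default (unnamed) package cannot be reached this way.
    return DEFAULT_PACKAGE + "." + name

print(resolve_directory_class("NIOFSDirectory"))
print(resolve_directory_class("com.example.MyFSDirectory"))
```

Whether the default-package corner case deserves a fallback is exactly the open question in the comment; the sketch simply makes the ambiguity visible.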

> Choose a specific Directory implementation running the CheckIndex main
> --
>
> Key: LUCENE-3586
> URL: https://issues.apache.org/jira/browse/LUCENE-3586
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Luca Cavanna
>Assignee: Luca Cavanna
>Priority: Minor
> Attachments: LUCENE-3586.patch, LUCENE-3586.patch
>
>
> It should be possible to choose a specific Directory implementation to use 
> during the CheckIndex process when we run it from its main.
> What about an additional main parameter?
> In fact, I'm experiencing some problems with MMapDirectory working with a big 
> segment, and after some failed attempts playing with maxChunkSize, I decided 
> to switch to another FSDirectory implementation but I needed to do that on my 
> own main.
> Should we also consider to use a FileSwitchDirectory?
> I'm willing to contribute, could you please let me know your thoughts about 
> it?




[jira] [Resolved] (SOLR-2509) spellcheck: StringIndexOutOfBoundsException: String index out of range: -1

2011-12-07 Thread Erick Erickson (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-2509.
--

   Resolution: Fixed
Fix Version/s: 4.0
   3.6


Trunk r: 1211456
3x r:1211457

> spellcheck: StringIndexOutOfBoundsException: String index out of range: -1
> --
>
> Key: SOLR-2509
> URL: https://issues.apache.org/jira/browse/SOLR-2509
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 3.1
> Environment: Debian Lenny
> JAVA Version "1.6.0_20"
>Reporter: Thomas Gambier
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-2509.patch, SOLR-2509.patch, SOLR-2509.patch, 
> document.xml, schema.xml, solrconfig.xml
>
>
> Hi,
> I'm a French user of Solr, and I've encountered a problem since I installed 
> Solr 3.1.
> I get an error with this query: 
> cle_frbr:"LYSROUGE1149-73190"
> *SEE COMMENTS BELOW*
> I tried escaping the minus char, and the query worked:
> cle_frbr:"LYSROUGE1149(BACKSLASH)-73190"
> But, strangely, if I change one letter in my query it works:
> cle_frbr:"LASROUGE1149-73190"
> I've tested the same query on Solr 1.4 and it works!
> Can someone test the query on the next line on a Solr 3.1 version and tell 
> me if they have the same problem? 
> yourfield:"LYSROUGE1149-73190"
> Where does the problem come from?
> Thank you in advance for your help.
> Tom
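The escaping workaround in the report generalizes: the classic Lucene/Solr query parser reserves a set of syntax characters, and escaping each with a backslash makes a literal term safe to embed. A small helper sketch (the function name is invented; SolrJ ships a similar utility):

```python
# Characters reserved by the classic Lucene/Solr query parser syntax.
SPECIAL = set('+-&|!(){}[]^"~*?:\\/')

def escape_query_chars(term):
    """Backslash-escape query-syntax characters in a literal term."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in term)

print(escape_query_chars("LYSROUGE1149-73190"))  # LYSROUGE1149\-73190
```

With the minus escaped, the term from the report no longer trips the parser path that triggered the exception.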




[jira] [Updated] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-07 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3620:
---

Attachment: LUCENE-3620-trunk.patch

Patch adds the test to TestFilterIndexReader. Uwe asked that I not commit 
these changes (test + FIR/IR fixes) until he merges in the IR-read-only 
branch. We decided that Uwe will apply that patch to the branch, fix FIR/IR 
there, and merge the branch afterwards.

> FilterIndexReader does not override all of IndexReader methods
> --
>
> Key: LUCENE-3620
> URL: https://issues.apache.org/jira/browse/LUCENE-3620
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3620-trunk.patch, LUCENE-3620.patch, 
> LUCENE-3620.patch, LUCENE-3620.patch
>
>
> FilterIndexReader does not override all of IndexReader methods. We've hit an 
> error in LUCENE-3573 (and fixed it). So I thought to write a simple test 
> which asserts that FIR overrides all methods of IR (and we can filter our 
> methods that we don't think that it should override). The test is very simple 
> (attached), and it currently fails over these methods:
> {code}
> getRefCount
> incRef
> tryIncRef
> decRef
> reopen
> reopen
> reopen
> reopen
> clone
> numDeletedDocs
> document
> setNorm
> setNorm
> termPositions
> deleteDocument
> deleteDocuments
> undeleteAll
> getIndexCommit
> getUniqueTermCount
> getTermInfosIndexDivisor
> {code}
> I didn't yet fix anything in FIR -- if you spot a method that you think we 
> should not override and delegate, please comment.
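The test idea above (assert that the filter class overrides every public method of its base, minus an explicit exclusion list) is straightforward with reflection. A Python analogue of the Java test, with toy stand-in classes rather than the real IndexReader API:

```python
def unoverridden_methods(base, sub, excluded=frozenset()):
    """Public methods of `base` that `sub` inherits without overriding."""
    missing = []
    for name in dir(base):
        if name.startswith("_") or name in excluded:
            continue
        base_attr = getattr(base, name, None)
        if not callable(base_attr):
            continue
        # If sub resolves to the very same function object, it was not
        # overridden and the call would bypass the filter's delegation.
        if getattr(sub, name) is base_attr:
            missing.append(name)
    return missing

class IndexReader:            # toy stand-in, not the Lucene class
    def doc_count(self): ...
    def document(self, n): ...
    def close(self): ...

class FilterIndexReader(IndexReader):
    def doc_count(self): ...
    def close(self): ...      # document() deliberately left unoverridden

print(unoverridden_methods(IndexReader, FilterIndexReader))  # ['document']
```

The Java version does the same walk over IndexReader's declared methods, which is how the list in the {code} block above was produced.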




[jira] [Commented] (LUCENE-3623) SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES

2011-12-07 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164426#comment-13164426
 ] 

Robert Muir commented on LUCENE-3623:
-

{quote}
This means you cannot use PKIndexSplitter and MultiPassIndexSplitter with 
docValues?
{quote}

Not entirely: since this commit, most docvalues will work for those cases, 
just not the SORTED_BYTES variants.


> SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES
> --
>
> Key: LUCENE-3623
> URL: https://issues.apache.org/jira/browse/LUCENE-3623
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-3623.patch, LUCENE-3623.patch, LUCENE-3623.patch, 
> LUCENE-3623_test.patch
>
>
> We use this getFieldNames API in SegmentMerger if we merge something that 
> isn't a SegmentReader (e.g. FilterIndexReader).
> It looks to me that if you use a FilterIndexReader and call 
> addIndexes(Reader...), the docvalues will simply be dropped.
> I don't think it's enough to just note that the field has docvalues either, 
> right? We need to also set the type 
> correctly in the merged field infos. This would imply that instead of 
> FieldOption.DOCVALUES, we need to have a 
> FieldOption for each ValueType so that we correctly update the type.
> But looking at FI.update/setDocValues, it doesn't look like we 'type-promote' 
> here anyway?




[jira] [Commented] (LUCENE-3623) SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES

2011-12-07 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164421#comment-13164421
 ] 

Uwe Schindler commented on LUCENE-3623:
---

bq. MultiSource should just override asSortedSource() and throw an 
UnsupportedOperationException.

This means you cannot use PKIndexSplitter and MultiPassIndexSplitter with 
docValues? We should open another issue to make it work per-segment (by 
implementing a per-segment FilteredReader), which is possible. Currently it 
wraps the source index by a SlowMultiReaderWrapper.

> SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES
> --
>
> Key: LUCENE-3623
> URL: https://issues.apache.org/jira/browse/LUCENE-3623
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-3623.patch, LUCENE-3623.patch, LUCENE-3623.patch, 
> LUCENE-3623_test.patch
>
>
> We use this getFieldNames API in SegmentMerger if we merge something that 
> isn't a SegmentReader (e.g. FilterIndexReader).
> It looks to me that if you use a FilterIndexReader and call 
> addIndexes(Reader...), the docvalues will simply be dropped.
> I don't think it's enough to just note that the field has docvalues either, 
> right? We need to also set the type 
> correctly in the merged field infos. This would imply that instead of 
> FieldOption.DOCVALUES, we need to have a 
> FieldOption for each ValueType so that we correctly update the type.
> But looking at FI.update/setDocValues, it doesn't look like we 'type-promote' 
> here anyway?




[jira] [Created] (LUCENE-3625) FieldValueFilter should expose the field it uses

2011-12-07 Thread Simon Willnauer (Created) (JIRA)
FieldValueFilter should expose the field it uses


 Key: LUCENE-3625
 URL: https://issues.apache.org/jira/browse/LUCENE-3625
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 3.6, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Trivial
 Fix For: 3.6, 4.0
 Attachments: LUCENE-3625.patch

FieldValueFilter should expose the field it uses. It currently hides this 
entirely.




[jira] [Updated] (LUCENE-3625) FieldValueFitler should expose the field it uses

2011-12-07 Thread Simon Willnauer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-3625:


Attachment: LUCENE-3625.patch

> FieldValueFitler should expose the field it uses
> 
>
> Key: LUCENE-3625
> URL: https://issues.apache.org/jira/browse/LUCENE-3625
> Project: Lucene - Java
>  Issue Type: Task
>Affects Versions: 3.6, 4.0
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
>Priority: Trivial
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3625.patch
>
>
> FieldValueFilter should expose the field it uses. It currently hides this 
> entirely.




[jira] [Commented] (SOLR-2072) Search Grouping: expand group sort options

2011-12-07 Thread George P. Stathis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164399#comment-13164399
 ] 

George P. Stathis commented on SOLR-2072:
-

Thanks Erick. That's a good enough start for me. I'll also look at the patches 
attached to SOLR-1297 since it's referenced in this ticket.

> Search Grouping: expand group sort options
> --
>
> Key: SOLR-2072
> URL: https://issues.apache.org/jira/browse/SOLR-2072
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Yonik Seeley
>
> Ability to specify functions over group documents when sorting groups.  
> max(score) or avg(popularity), etc.




[jira] [Commented] (SOLR-2580) Create Components to Support Using Business Rules in Solr

2011-12-07 Thread Grant Ingersoll (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164393#comment-13164393
 ] 

Grant Ingersoll commented on SOLR-2580:
---

Wiki page is up: http://wiki.apache.org/solr/Business%20Rules

> Create Components to Support Using Business Rules in Solr
> -
>
> Key: SOLR-2580
> URL: https://issues.apache.org/jira/browse/SOLR-2580
> Project: Solr
>  Issue Type: New Feature
>Reporter: Tomás Fernández Löbbe
>Assignee: Grant Ingersoll
> Fix For: 4.0
>
>
> The goal is to be able to adjust the relevance of documents based on 
> user-defined business rules.
> For example, in an e-commerce site, when the user chooses the "shoes" 
> category, we may be interested in boosting products from a certain brand. 
> This can be expressed as a rule in the following way:
> rule "Boost Adidas products when searching shoes"
> when
> $qt : QueryTool()
> TermQuery(term.field=="category", term.text=="shoes")
> then
> $qt.boost("{!lucene}brand:adidas");
> end
> The QueryTool object should be used to alter the main query in an easy way. 
> Even more human-like rules can be written:
> rule "Boost Adidas products when searching shoes"
>  when
> Query has term "shoes" in field "product"
>  then
> Add boost query "{!lucene}brand:adidas"
> end
> These rules are written in a text file in the config directory and can be 
> modified at runtime. Rules will be managed using JBoss Drools: 
> http://www.jboss.org/drools/drools-expert.html
> In a first stage, it will allow adding boost queries or changing sort fields 
> based on the user query, but it could be extended to allow more options.
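As a rough illustration of the rule-matching idea described in this issue, a condition on the query's terms can select boost queries to add. This is only a toy model with invented names and structure, not the Drools-based implementation the issue proposes:

```python
# Toy model of the proposed rule engine: each rule pairs a (field, text)
# condition with a boost query to add. Names here are illustrative only;
# the real design delegates rule matching to JBoss Drools.
rules = [
    {"when": ("category", "shoes"), "then": "{!lucene}brand:adidas"},
]

def boost_queries(query_terms, rules):
    """Return the boost queries whose condition matches a query term."""
    return [r["then"] for r in rules if r["when"] in query_terms]

print(boost_queries([("category", "shoes")], rules))
# ['{!lucene}brand:adidas']
```

A rule engine adds value over this sketch mainly by letting the conditions be edited at runtime without redeploying code.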




[jira] [Commented] (SOLR-2953) Introducing hit Count as an alternative to score

2011-12-07 Thread Erick Erickson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164386#comment-13164386
 ] 

Erick Erickson commented on SOLR-2953:
--

Can you make a patch and upload it? See: 
http://wiki.apache.org/solr/HowToContribute#Generating_a_patch

Then people can take a look and see how you implemented it and discuss.

> Introducing hit Count as an alternative to score 
> -
>
> Key: SOLR-2953
> URL: https://issues.apache.org/jira/browse/SOLR-2953
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 4.0
>Reporter: Kaleem Ahmed
>  Labels: features
> Fix For: 4.0
>
>   Original Estimate: 1,008h
>  Remaining Estimate: 1,008h
>
> As of now we have score as the relevancy factor for a query against a 
> document, and this score is relative to the number of documents in the 
> index. In the same way, why not have some other relevancy measure, say 
> "hitCount", which is absolute for a given doc and a given query? It 
> shouldn't depend on the number of documents in the index. This will help a 
> lot for frequently changing indexes, where the search rules are predefined 
> along with the relevancy factor required for a document to qualify for that 
> query (search rule).
> Ex: consider a use case where a list of queries is formed with a threshold 
> number for each query, and these are searched on a frequently updated index 
> to get the documents that score above the threshold; i.e., when a document's 
> relevancy factor crosses the threshold for a query, the document is said to 
> be qualified for that query.
> For the above use case to be satisfied, the score shouldn't change every 
> time the index gets updated with new documents. So we introduce a new 
> feature called "hitCount", which represents the relevancy of a document 
> against a query and is absolute (it won't change with index size).
> This hitCount is a positive integer and is calculated as follows.
> Ex: a document with the text "the quick fox jumped over the lazy dog, while 
> the lazy dog was too lazy to care":
> 1. For the query "lazy AND dog", the hitCount will be (no. of occurrences of 
> "lazy" in the document) + (no. of occurrences of "dog" in the document) => 
> 3 + 2 => 5.
> 2. For the phrase query \"lazy dog\", the hitCount will be (no. of 
> occurrences of the exact phrase "lazy dog" in the document) => 2.
> This will be very useful as an alternative scoring mechanism.
> I have already implemented this whole thing in the Solr source code (that I 
> downloaded) and we are using it. So far it's going well.
> It would be really great if this feature were added to trunk (original Solr) 
> so that we don't have to implement the changes every time a new version is 
> released, and others could benefit from it as well.
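The counting in the example above can be sketched in a few lines. This is only a model of the described arithmetic (the naive tokenizer and the function name are invented), not the poster's actual Solr patch:

```python
import re

def hit_count(text, terms=None, phrase=None):
    # Tokenize naively on letter runs, lowercased (a stand-in for analysis).
    tokens = re.findall(r"[a-z]+", text.lower())
    if phrase is not None:
        words = phrase.lower().split()
        n = len(words)
        # Count exact-phrase occurrences by sliding a window over the tokens.
        return sum(1 for i in range(len(tokens) - n + 1)
                   if tokens[i:i + n] == words)
    # For a boolean AND query, sum the per-term occurrence counts.
    return sum(tokens.count(t.lower()) for t in terms)

doc = ("the quick fox jumped over the lazy dog, "
       "while the lazy dog was too lazy to care")
print(hit_count(doc, terms=["lazy", "dog"]))  # 3 + 2 = 5
print(hit_count(doc, phrase="lazy dog"))      # 2
```

Note that, unlike score, nothing here is normalized by index statistics, which is exactly why the value stays stable as the index grows.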




[jira] [Commented] (SOLR-2072) Search Grouping: expand group sort options

2011-12-07 Thread Erick Erickson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164384#comment-13164384
 ] 

Erick Erickson commented on SOLR-2072:
--

George:

This isn't going to be all that much help, but you might try stepping through 
the test cases to get a feel for how grouping works; see TestGroupingSearch, 
for instance. The files referenced in some of the patches might also be 
useful, e.g. in SOLR-2072.

Warning: I know little to nothing about the code in question, but that's how 
I'd start to get a feel for it, hopefully enough to propose an approach...

> Search Grouping: expand group sort options
> --
>
> Key: SOLR-2072
> URL: https://issues.apache.org/jira/browse/SOLR-2072
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Yonik Seeley
>
> Ability to specify functions over group documents when sorting groups.  
> max(score) or avg(popularity), etc.




[jira] [Commented] (SOLR-2952) InterruptedException during SorlCore instatiation

2011-12-07 Thread Erik Hatcher (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164376#comment-13164376
 ] 

Erik Hatcher commented on SOLR-2952:


That stack trace shows custom classes that don't come with Solr. The first 
thing is to investigate that custom code, as that is likely where the issue is.

> InterruptedException during SorlCore instatiation
> -
>
> Key: SOLR-2952
> URL: https://issues.apache.org/jira/browse/SOLR-2952
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 3.4
>Reporter: Krzysztof Kwiatosz
>
> We have the following exception during SolrCore initialization:
> {code}
> org.apache.lucene.util.ThreadInterruptedException: 
> java.lang.InterruptedException: sleep interrupted
>  [java] java.lang.RuntimeException: 
> org.apache.lucene.util.ThreadInterruptedException: 
> java.lang.InterruptedException: sleep interrupted
>  [java] at 
> org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1099)
>  [java] at org.apache.solr.core.SolrCore.(SolrCore.java:585)
>  [java] at 
> de.hybris.platform.solrfacetsearch.solr.impl.SolrCoreRegistry.registerServer(SolrCoreRegistry.java:263)
>  [java] at 
> de.hybris.platform.solrfacetsearch.solr.impl.SolrCoreRegistry.getEmbeddedSolrServer(SolrCoreRegistry.java:203)
>  [java] at 
> de.hybris.platform.solrfacetsearch.solr.impl.SolrCoreRegistry.getEmbeddedSolrServer(SolrCoreRegistry.java:217)
>  [java] at 
> de.hybris.platform.solrfacetsearch.jalo.SolrfacetsearchManager$1.afterTenantStartUp(SolrfacetsearchManager.java:104)
>  [java] at 
> de.hybris.platform.core.AbstractTenant.executeStartupNotifyIfNecessary(AbstractTenant.java:601)
>  [java] at 
> de.hybris.platform.core.AbstractTenant.executeInitsIfNecessary(AbstractTenant.java:1012)
>  [java] at 
> de.hybris.platform.core.Registry.assureTenantStarted(Registry.java:478)
>  [java] at 
> de.hybris.platform.core.Registry.activateMasterTenant(Registry.java:423)
>  [java] at 
> de.hybris.platform.core.Registry.activateMasterTenantAndFailIfAlreadySet(Registry.java:388)
>  [java] at 
> de.hybris.platform.core.Registry.setCurrentTenantByID(Registry.java:497)
>  [java] at 
> de.hybris.platform.task.impl.DefaultTaskService$Poll.activateTenant(DefaultTaskService.java:1072)
>  [java] at 
> de.hybris.platform.task.impl.DefaultTaskService$Poll.run(DefaultTaskService.java:947)
>  [java] Caused by: org.apache.lucene.util.ThreadInterruptedException: 
> java.lang.InterruptedException: sleep interrupted
>  [java] at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:696)
>  [java] at 
> org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:75)
>  [java] at 
> org.apache.lucene.index.IndexReader.open(IndexReader.java:421)
>  [java] at 
> org.apache.lucene.index.IndexReader.open(IndexReader.java:364)
>  [java] at 
> org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:38)
>  [java] at 
> org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1088)
>  [java] ... 13 more
>  [java] Caused by: java.lang.InterruptedException: sleep interrupted
>  [java] at java.lang.Thread.sleep(Native Method)
>  [java] at 
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
>  [java] ... 18 more
> {code}
> What could be the reason?




[jira] [Commented] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-07 Thread Shai Erera (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164355#comment-13164355
 ] 

Shai Erera commented on LUCENE-3620:


Committed rev 1211413 to 3x. Porting to trunk.

> FilterIndexReader does not override all of IndexReader methods
> --
>
> Key: LUCENE-3620
> URL: https://issues.apache.org/jira/browse/LUCENE-3620
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3620.patch, LUCENE-3620.patch, LUCENE-3620.patch
>
>
> FilterIndexReader does not override all of IndexReader's methods. We hit an 
> error in LUCENE-3573 (and fixed it), so I thought I'd write a simple test 
> which asserts that FIR overrides all methods of IR (and we can filter out 
> methods that we don't think it should override). The test is very simple 
> (attached), and it currently fails over these methods:
> {code}
> getRefCount
> incRef
> tryIncRef
> decRef
> reopen
> reopen
> reopen
> reopen
> clone
> numDeletedDocs
> document
> setNorm
> setNorm
> termPositions
> deleteDocument
> deleteDocuments
> undeleteAll
> getIndexCommit
> getUniqueTermCount
> getTermInfosIndexDivisor
> {code}
> I didn't yet fix anything in FIR -- if you spot a method that you think we 
> should not override and delegate, please comment.
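The idea behind the attached test — reflect over the base class and flag methods the subclass fails to redefine — can be sketched outside Java. The Python analog below uses invented toy classes purely for illustration; the real test inspects IndexReader/FilterIndexReader via Java reflection:

```python
def non_overridden(base, sub, ignore=()):
    """Names defined on `base` that `sub` does not redefine itself.

    A Python analog of the reflection check described above; `ignore`
    plays the role of the filter for methods that need no delegation.
    """
    base_methods = {n for n, v in vars(base).items()
                    if callable(v) and not n.startswith("_")}
    sub_methods = set(vars(sub))
    return sorted(base_methods - sub_methods - set(ignore))

# Toy stand-ins for IndexReader / FilterIndexReader (hypothetical names).
class Reader:
    def close(self): ...
    def doc(self, i): ...
    def num_docs(self): ...

class FilterReader(Reader):
    def doc(self, i): ...  # only one method is delegated explicitly

print(non_overridden(Reader, FilterReader))  # ['close', 'num_docs']
```

The same pattern makes a cheap regression test: whenever the base class grows a method, the assertion fails until the wrapper decides to delegate or ignore it.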




[jira] [Resolved] (LUCENE-3623) SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES

2011-12-07 Thread Robert Muir (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-3623.
-

   Resolution: Fixed
Fix Version/s: 4.0

I committed this to trunk, and opened an issue for the Multi SortedSource stuff.

> SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES
> --
>
> Key: LUCENE-3623
> URL: https://issues.apache.org/jira/browse/LUCENE-3623
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Fix For: 4.0
>
> Attachments: LUCENE-3623.patch, LUCENE-3623.patch, LUCENE-3623.patch, 
> LUCENE-3623_test.patch
>
>
> We use this getFieldNames API in SegmentMerger if we merge something that 
> isn't a SegmentReader (e.g. FilterIndexReader).
> It looks to me that if you use a FilterIndexReader and call 
> addIndexes(Reader...), the docvalues will simply be dropped.
> I don't think it's enough to just note that the field has docvalues either, 
> right? We need to also set the type correctly in the merged field infos. 
> This would imply that instead of FieldOption.DOCVALUES, we need to have a 
> FieldOption for each ValueType so that we correctly update the type.
> But looking at FI.update/setDocValues, it doesn't look like we 'type-promote' 
> here anyway?




[jira] [Created] (LUCENE-3624) Throw exception for "Multi-SortedSource" instead of returning null

2011-12-07 Thread Robert Muir (Created) (JIRA)
Throw exception for "Multi-SortedSource" instead of returning null
--

 Key: LUCENE-3624
 URL: https://issues.apache.org/jira/browse/LUCENE-3624
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir


Spinoff of LUCENE-3623: currently, if you addIndexes(FIR) or similar, you get 
an NPE deep within the codecs during merge.

I think the NPE is confusing: it looks like a bug, but a clearer exception 
would be an improvement.




[jira] [Commented] (LUCENE-3623) SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES

2011-12-07 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164347#comment-13164347
 ] 

Simon Willnauer commented on LUCENE-3623:
-

MultiSource should just override asSortedSource() and throw an 
UnsupportedOperationException.

> SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES
> --
>
> Key: LUCENE-3623
> URL: https://issues.apache.org/jira/browse/LUCENE-3623
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: LUCENE-3623.patch, LUCENE-3623.patch, LUCENE-3623.patch, 
> LUCENE-3623_test.patch
>
>
> We use this getFieldNames API in SegmentMerger if we merge something that 
> isn't a SegmentReader (e.g. FilterIndexReader).
> It looks to me that if you use a FilterIndexReader and call 
> addIndexes(Reader...), the docvalues will simply be dropped.
> I don't think it's enough to just note that the field has docvalues either, 
> right? We need to also set the type correctly in the merged field infos. 
> This would imply that instead of FieldOption.DOCVALUES, we need to have a 
> FieldOption for each ValueType so that we correctly update the type.
> But looking at FI.update/setDocValues, it doesn't look like we 'type-promote' 
> here anyway?




[jira] [Closed] (SOLR-2954) org.apache.lucene.util.bytesref exception in analysis.jsp

2011-12-07 Thread Okke Klein (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Okke Klein closed SOLR-2954.



Probably Tomcat giving me a hard time.

Replacing with the old file and reloading resolved the issue.

> org.apache.lucene.util.bytesref exception in analysis.jsp
> -
>
> Key: SOLR-2954
> URL: https://issues.apache.org/jira/browse/SOLR-2954
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Okke Klein
>Priority: Minor
>
> After reverting to the previous revision, analysis.jsp worked again. Can 
> someone fix this?
> SEVERE: Servlet.service() for servlet jsp threw exception
> java.lang.NoSuchMethodError: 
> org.apache.lucene.util.BytesRef.(Lorg/apache/lucene/util/BytesRef;)V
>   at org.apache.jsp.admin.analysis_jsp._jspService(analysis_jsp.java:702)
>   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:68)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
>   at 
> org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:416)
>   at 
> org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:389)
>   at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:332)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:306)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:671)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:462)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:401)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:329)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:271)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
>   at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
>   at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
>   at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
>   at 
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:558)
>   at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>   at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:379)
>   at 
> org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:282)
>   at 
> org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:357)
>   at 
> org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1687)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> 7-dec-2011 11:47:16 org.apache.solr.common.SolrException log
> SEVERE: org.apache.jasper.JasperException: javax.servlet.ServletException: 
> java.lang.NoSuchMethodError: 
> org.apache.lucene.util.BytesRef.(Lorg/apache/lucene/util/BytesRef;)V
>   at 
> org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:531)
>   at 
> org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:439)
>   at 
> org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:389)
>   at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:332)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:306)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:671)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:462)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:401)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:329)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrD

[jira] [Commented] (LUCENE-3623) SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES

2011-12-07 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164335#comment-13164335
 ] 

Robert Muir commented on LUCENE-3623:
-

+1, at a glance it wasn't obvious to me what caused the NPE.

So a clear exception would be a big improvement.

> SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES
> --
>
> Key: LUCENE-3623
> URL: https://issues.apache.org/jira/browse/LUCENE-3623
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: LUCENE-3623.patch, LUCENE-3623.patch, LUCENE-3623.patch, 
> LUCENE-3623_test.patch
>
>
> We use this getFieldNames API in SegmentMerger if we merge something that 
> isn't a SegmentReader (e.g. FilterIndexReader).
> It looks to me that if you use a FilterIndexReader and call 
> addIndexes(Reader...), the docvalues will simply be dropped.
> I don't think it's enough to just note that the field has docvalues either, 
> right? We need to also set the type correctly in the merged field infos. 
> This would imply that instead of FieldOption.DOCVALUES, we need to have a 
> FieldOption for each ValueType so that we correctly update the type.
> But looking at FI.update/setDocValues, it doesn't look like we 'type-promote' 
> here anyway?




[jira] [Commented] (LUCENE-3623) SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES

2011-12-07 Thread Simon Willnauer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164334#comment-13164334
 ] 

Simon Willnauer commented on LUCENE-3623:
-

bq. swapping in SlowMultiReaderWrapper to TestTypePromotion found another bug:
That is because SMRW returns a MultiSource, which doesn't support 
asSortedSource(). Maybe we should throw an exception here?


> SegmentReader.getFieldNames ignores FieldOption.DOC_VALUES
> --
>
> Key: LUCENE-3623
> URL: https://issues.apache.org/jira/browse/LUCENE-3623
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Robert Muir
> Attachments: LUCENE-3623.patch, LUCENE-3623.patch, LUCENE-3623.patch, 
> LUCENE-3623_test.patch
>
>
> we use this getFieldNames api in segmentmerger if we merge something that 
> isn't a SegmentReader (e.g. FilterIndexReader)
> it looks to me that if you use a FilterIndexReader, call 
> addIndexes(Reader...) the docvalues will be simply dropped.
> I dont think its enough to just note that the field has docvalues either 
> right? We need to also set the type 
> correctly in the merged field infos? This would imply that instead of 
> FieldOption.DOCVALUES, we need to have a 
> FieldOption for each ValueType so that we correctly update the type.
> But looking at FI.update/setDocValues, it doesn't look like we 'type-promote' 
> here anyway?




[jira] [Commented] (LUCENE-3097) Post grouping faceting

2011-12-07 Thread Ian Grainger (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164331#comment-13164331
 ] 

Ian Grainger commented on LUCENE-3097:
--

Hi - is the matrix count feature available in Solr 3.5? Seeing as this is 
marked as closed, I assume it is. If so, do I need to do anything to use this 
feature?

> Post grouping faceting
> --
>
> Key: LUCENE-3097
> URL: https://issues.apache.org/jira/browse/LUCENE-3097
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: modules/grouping
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Fix For: 3.4, 4.0
>
> Attachments: LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-3097.patch, 
> LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-30971.patch
>
>
> This issue focuses on implementing post-grouping faceting.
> * How to handle multivalued fields. What field value to show with the facet.
> * What the facet counts should be based on:
> ** Facet counts can be based on the normal documents. Ungrouped counts.
> ** Facet counts can be based on the groups. Grouped counts.
> ** Facet counts can be based on the combination of group value and facet 
> value. Matrix counts.
> And probably more implementation options.
> The first two methods are implemented in the SOLR-236 patch. For the first 
> option it calculates a DocSet based on the individual documents from the 
> query result. For the second option it calculates a DocSet for the most 
> relevant documents of the groups. Once the DocSet is computed, the 
> FacetComponent and StatsComponent use the DocSet to create facets and 
> statistics.
> The last one is a bit more complex. I think it is best explained with an 
> example. Let's say we search on travel offers:
> ||hotel||departure_airport||duration||
> |Hotel a|AMS|5|
> |Hotel a|DUS|10|
> |Hotel b|AMS|5|
> |Hotel b|AMS|10|
> If we group by hotel and have a facet for airport, most end users expect 
> (according to my experience, of course) the following airport facet:
> AMS: 2
> DUS: 1
> The above result can't be achieved by the first two methods. You either get 
> counts AMS:3 and DUS:1, or 1 for both airports.
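The difference between ungrouped and matrix counts for the hotel example can be modeled in a few lines. This is only a sketch of the counting semantics described in the issue (the data layout is invented), not Solr's implementation:

```python
from collections import Counter

# The travel offers from the example table above.
offers = [
    {"hotel": "Hotel a", "airport": "AMS"},
    {"hotel": "Hotel a", "airport": "DUS"},
    {"hotel": "Hotel b", "airport": "AMS"},
    {"hotel": "Hotel b", "airport": "AMS"},
]

# Ungrouped counts: every matching document contributes one count.
ungrouped = Counter(o["airport"] for o in offers)

# Matrix counts: each distinct (group value, facet value) pair counts once,
# so a hotel listed twice for AMS still adds only one to AMS.
pairs = {(o["hotel"], o["airport"]) for o in offers}
matrix = Counter(airport for _, airport in pairs)

print(dict(ungrouped))  # {'AMS': 3, 'DUS': 1}
print(dict(matrix))     # {'AMS': 2, 'DUS': 1}
```

This reproduces the expected AMS: 2, DUS: 1 result that neither ungrouped nor per-group-head counting can deliver.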




[jira] [Resolved] (SOLR-2954) org.apache.lucene.util.bytesref exception in analysis.jsp

2011-12-07 Thread Robert Muir (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved SOLR-2954.
---

Resolution: Not A Problem

NoSuchMethodError

> org.apache.lucene.util.bytesref exception in analysis.jsp
> -
>
> Key: SOLR-2954
> URL: https://issues.apache.org/jira/browse/SOLR-2954
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.0
>Reporter: Okke Klein
>Priority: Minor
>
> After reverting to the previous revision, analysis.jsp worked again. Can 
> someone fix this?
> SEVERE: Servlet.service() for servlet jsp threw exception
> java.lang.NoSuchMethodError: 
> org.apache.lucene.util.BytesRef.<init>(Lorg/apache/lucene/util/BytesRef;)V
>   at org.apache.jsp.admin.analysis_jsp._jspService(analysis_jsp.java:702)
>   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:68)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
>   at 
> org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:416)
>   at 
> org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:389)
>   at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:332)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:306)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:671)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:462)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:401)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:329)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:271)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
>   at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
>   at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
>   at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
>   at 
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:558)
>   at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>   at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:379)
>   at 
> org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:282)
>   at 
> org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:357)
>   at 
> org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1687)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> 7-dec-2011 11:47:16 org.apache.solr.common.SolrException log
> SEVERE: org.apache.jasper.JasperException: javax.servlet.ServletException: 
> java.lang.NoSuchMethodError: 
> org.apache.lucene.util.BytesRef.<init>(Lorg/apache/lucene/util/BytesRef;)V
>   at 
> org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:531)
>   at 
> org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:439)
>   at 
> org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:389)
>   at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:332)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:306)
>   at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:671)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:462)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:401)
>   at 
> org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:329)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:271)
>   a

[jira] [Created] (SOLR-2954) org.apache.lucene.util.bytesref exception in analysis.jsp

2011-12-07 Thread Okke Klein (Created) (JIRA)
org.apache.lucene.util.bytesref exception in analysis.jsp
-

 Key: SOLR-2954
 URL: https://issues.apache.org/jira/browse/SOLR-2954
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Okke Klein
Priority: Minor


After reverting to the previous revision, analysis.jsp worked again. Can someone 
fix this?


SEVERE: Servlet.service() for servlet jsp threw exception
java.lang.NoSuchMethodError: 
org.apache.lucene.util.BytesRef.<init>(Lorg/apache/lucene/util/BytesRef;)V
at org.apache.jsp.admin.analysis_jsp._jspService(analysis_jsp.java:702)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:68)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at 
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:416)
at 
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:389)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:332)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:306)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:671)
at 
org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:462)
at 
org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:401)
at 
org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:329)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:271)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
at 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:558)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:379)
at 
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:282)
at 
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:357)
at 
org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1687)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
7-dec-2011 11:47:16 org.apache.solr.common.SolrException log
SEVERE: org.apache.jasper.JasperException: javax.servlet.ServletException: 
java.lang.NoSuchMethodError: 
org.apache.lucene.util.BytesRef.<init>(Lorg/apache/lucene/util/BytesRef;)V
at 
org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:531)
at 
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:439)
at 
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:389)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:332)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:306)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:671)
at 
org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:462)
at 
org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:401)
at 
org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:329)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:271)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:244)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
at 
org.apache.catalina.core.StandardContextV

[jira] [Created] (SOLR-2953) Introducing hit Count as an alternative to score

2011-12-07 Thread Kaleem Ahmed (Created) (JIRA)
Introducing hit Count as an alternative to score 
-

 Key: SOLR-2953
 URL: https://issues.apache.org/jira/browse/SOLR-2953
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.0
Reporter: Kaleem Ahmed
 Fix For: 4.0


As of now we have score as the relevancy factor for a query against a document, 
and this score is relative to the number of documents in the index. In the same 
way, why not have another relevancy measure, say "hitCount", which is absolute 
for a given document and a given query and does not depend on the number of 
documents in the index? This would help a lot for frequently changing indexes, 
where search rules are predefined along with the relevancy threshold a document 
must reach to qualify for that query (search rule).

Ex: consider a use case where a list of queries is formed with a threshold 
number for each query, and these are run against a frequently updated index to 
find the documents that score above the threshold, i.e. when a document's 
relevancy factor crosses the threshold for a query, the document is said to 
qualify for that query.
For this use case to work, the score shouldn't change every time the index 
gets updated with new documents. So we introduce a new feature called 
"hitCount", which represents the relevancy of a document against a query and is 
absolute (won't change with index size).

This hitCount is a positive integer and is calculated as follows.
Ex: document with text "the quick fox jumped over the lazy dog, while the lazy 
dog was too lazy to care":
1. For the query "lazy AND dog" the hitCount will be (no. of occurrences of 
"lazy" in the document) + (no. of occurrences of "dog" in the document) => 
3 + 2 => 5.
2. For the phrase query \"lazy dog\" the hitCount will be (no. of occurrences 
of the exact phrase "lazy dog" in the document) => 2.
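The arithmetic above can be sketched in a few lines. This is a hypothetical Python illustration of the proposed hitCount, not the poster's actual Solr implementation; the tokenization (lowercased letter runs) is an assumption for the example:

```python
import re

def hit_count(doc_text, terms=None, phrase=None):
    """Absolute relevancy: total occurrences of each query term (AND query),
    or occurrences of an exact phrase, independent of index size."""
    tokens = re.findall(r"[a-z]+", doc_text.lower())
    if phrase is not None:
        p = phrase.lower().split()
        # Slide a window over the token stream and count exact matches.
        return sum(tokens[i:i + len(p)] == p
                   for i in range(len(tokens) - len(p) + 1))
    return sum(tokens.count(t.lower()) for t in terms)

doc = ("the quick fox jumped over the lazy dog, "
       "while the lazy dog was too lazy to care")
print(hit_count(doc, terms=["lazy", "dog"]))   # 5  (3 + 2)
print(hit_count(doc, phrase="lazy dog"))       # 2
```

Note the contrast with TF-IDF-style scoring: nothing here reads collection statistics, so the value stays fixed as the index grows.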

This will be very useful as an alternative scoring mechanism.

I have already implemented this whole thing in the Solr source code (that I 
downloaded) and we are using it. So far it's going well.
It would be really great if this feature were added to trunk (the original 
Solr) so that we don't have to reapply the changes every time a new version is 
released, and others could benefit from it as well.










[jira] [Commented] (SOLR-919) Cache and reuse the SolrConfig

2011-12-07 Thread Noble Paul (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164247#comment-13164247
 ] 

Noble Paul commented on SOLR-919:
-

@Drew, the logic is mostly the same, but some more changes are required in the 
core to make it work, because a few references are kept in SolrConfig which 
should have been kept in the core.

> Cache and reuse the SolrConfig
> --
>
> Key: SOLR-919
> URL: https://issues.apache.org/jira/browse/SOLR-919
> Project: Solr
>  Issue Type: Improvement
>  Components: multicore
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-919.patch
>
>
> If there are thousands of cores, the number of times we need to load and parse 
> solrconfig.xml is going to be very expensive. It is desirable to create just 
> one instance of the SolrConfig object and re-use it across cores.
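The caching idea can be sketched as a map from config path to the one parsed instance. This is an illustrative Python analogy under assumed names (`ConfigCache`, `parse_fn`), not the attached SOLR-919 patch:

```python
import threading

class ConfigCache:
    """Hypothetical sketch: parse each solrconfig.xml once and share the
    parsed object across all cores, instead of re-parsing per core."""

    def __init__(self, parse_fn):
        self._parse = parse_fn
        self._cache = {}                  # config path -> parsed config
        self._lock = threading.Lock()     # cores may start concurrently

    def get(self, path):
        with self._lock:
            if path not in self._cache:
                self._cache[path] = self._parse(path)
            return self._cache[path]

# Demo: the expensive parse runs once, even for many "cores".
calls = []
cache = ConfigCache(lambda p: calls.append(p) or {"path": p})
a = cache.get("conf/solrconfig.xml")
b = cache.get("conf/solrconfig.xml")
assert a is b and len(calls) == 1         # parsed once, shared thereafter
```

The caveat Noble Paul raises still applies: this only works if the cached object holds no per-core state, which is exactly the core-reference cleanup he mentions.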




[jira] [Created] (SOLR-2952) InterruptedException during SolrCore instantiation

2011-12-07 Thread Krzysztof Kwiatosz (Created) (JIRA)
InterruptedException during SolrCore instantiation
-

 Key: SOLR-2952
 URL: https://issues.apache.org/jira/browse/SOLR-2952
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 3.4
Reporter: Krzysztof Kwiatosz


We have the following exception during SolrCore initialization:
{code}
org.apache.lucene.util.ThreadInterruptedException: 
java.lang.InterruptedException: sleep interrupted
 [java] java.lang.RuntimeException: 
org.apache.lucene.util.ThreadInterruptedException: 
java.lang.InterruptedException: sleep interrupted
 [java] at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1099)
 [java] at org.apache.solr.core.SolrCore.<init>(SolrCore.java:585)
 [java] at 
de.hybris.platform.solrfacetsearch.solr.impl.SolrCoreRegistry.registerServer(SolrCoreRegistry.java:263)
 [java] at 
de.hybris.platform.solrfacetsearch.solr.impl.SolrCoreRegistry.getEmbeddedSolrServer(SolrCoreRegistry.java:203)
 [java] at 
de.hybris.platform.solrfacetsearch.solr.impl.SolrCoreRegistry.getEmbeddedSolrServer(SolrCoreRegistry.java:217)
 [java] at 
de.hybris.platform.solrfacetsearch.jalo.SolrfacetsearchManager$1.afterTenantStartUp(SolrfacetsearchManager.java:104)
 [java] at 
de.hybris.platform.core.AbstractTenant.executeStartupNotifyIfNecessary(AbstractTenant.java:601)
 [java] at 
de.hybris.platform.core.AbstractTenant.executeInitsIfNecessary(AbstractTenant.java:1012)
 [java] at 
de.hybris.platform.core.Registry.assureTenantStarted(Registry.java:478)
 [java] at 
de.hybris.platform.core.Registry.activateMasterTenant(Registry.java:423)
 [java] at 
de.hybris.platform.core.Registry.activateMasterTenantAndFailIfAlreadySet(Registry.java:388)
 [java] at 
de.hybris.platform.core.Registry.setCurrentTenantByID(Registry.java:497)
 [java] at 
de.hybris.platform.task.impl.DefaultTaskService$Poll.activateTenant(DefaultTaskService.java:1072)
 [java] at 
de.hybris.platform.task.impl.DefaultTaskService$Poll.run(DefaultTaskService.java:947)
 [java] Caused by: org.apache.lucene.util.ThreadInterruptedException: 
java.lang.InterruptedException: sleep interrupted
 [java] at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:696)
 [java] at 
org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:75)
 [java] at 
org.apache.lucene.index.IndexReader.open(IndexReader.java:421)
 [java] at 
org.apache.lucene.index.IndexReader.open(IndexReader.java:364)
 [java] at 
org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:38)
 [java] at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1088)
 [java] ... 13 more
 [java] Caused by: java.lang.InterruptedException: sleep interrupted
 [java] at java.lang.Thread.sleep(Native Method)
 [java] at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
 [java] ... 18 more
{code}

What could be the reason?





[jira] [Updated] (LUCENE-3620) FilterIndexReader does not override all of IndexReader methods

2011-12-07 Thread Shai Erera (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-3620:
---

Attachment: LUCENE-3620.patch

Patch makes the mentioned methods final, modifies SolrIndexReader and other IR 
extensions (ParallelReader, Instantiated, MemoryIndex) to not override them.

Also added a CHANGES entry under backwards compatibility.

If there are no objections, I will commit it later today.

> FilterIndexReader does not override all of IndexReader methods
> --
>
> Key: LUCENE-3620
> URL: https://issues.apache.org/jira/browse/LUCENE-3620
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: core/search
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3620.patch, LUCENE-3620.patch, LUCENE-3620.patch
>
>
> FilterIndexReader does not override all of IndexReader methods. We've hit an 
> error in LUCENE-3573 (and fixed it). So I thought to write a simple test 
> which asserts that FIR overrides all methods of IR (and we can filter our 
> methods that we don't think that it should override). The test is very simple 
> (attached), and it currently fails over these methods:
> {code}
> getRefCount
> incRef
> tryIncRef
> decRef
> reopen
> reopen
> reopen
> reopen
> clone
> numDeletedDocs
> document
> setNorm
> setNorm
> termPositions
> deleteDocument
> deleteDocuments
> undeleteAll
> getIndexCommit
> getUniqueTermCount
> getTermInfosIndexDivisor
> {code}
> I didn't yet fix anything in FIR -- if you spot a method that you think we 
> should not override and delegate, please comment.
