[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-13 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604818#action_12604818
 ] 

Grant Ingersoll commented on SOLR-572:
--

{quote}
the spell checker component handling build/reload seems highly awkward to me. 
suggestion component really should just do that... and wrap the other 
operations as a /spellchecker/rebuild kinda thing and not even necessarily 
componentize those operations since they don't really necessarily need to be 
hooked together with other operations as a single request.
{quote}

I've thought about it a bit, too, as it bothers me as well, but I think the 
initialization, etc. gets a bit tricky, like all Solr initialization.  Not sure 
what to do.

 Spell Checker as a Search Component
 ---

 Key: SOLR-572
 URL: https://issues.apache.org/jira/browse/SOLR-572
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.3

 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
 SOLR-572.patch, SOLR-572.patch, SOLR-572.patch


 Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
 following features:
 * Allow creating a spell index on a given field and make it possible to have 
 multiple spell indices -- one for each field
 * Give suggestions on a per-field basis
 * Given a multi-word query, give only one consistent suggestion
 * Process the query with the same analyzer specified for the source field and 
 process each token separately
 * Allow the user to specify minimum length for a token (optional)
 Consistency criteria for a multi-word query can consist of the following:
 * Preserve the correct words in the original query as it is
 * Never give duplicate words in a suggestion
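
The consistency criteria above can be sketched in a few lines of Java. This is only an illustration, not the code in the attached patches: the class name, the `suggest` method, and the candidate map are all hypothetical, and a token with no entry in the map is assumed to be correctly spelled.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ConsistentSuggestionSketch {

    /**
     * Builds a single suggestion for a whitespace-separated query.
     * candidates (hypothetical data) maps a misspelled token to its ranked
     * alternatives; a token without an entry is treated as correctly spelled.
     */
    static String suggest(String query, Map<String, List<String>> candidates, int minTokenLength) {
        String[] tokens = query.trim().split("\\s+");
        Set<String> used = new LinkedHashSet<>();

        // First pass: reserve the words that are kept as-is, so that no
        // replacement chosen later duplicates one of them.
        for (String token : tokens) {
            if (token.length() < minTokenLength || !candidates.containsKey(token)) {
                used.add(token);
            }
        }

        // Second pass: keep correct or too-short tokens, replace the others with
        // the best-ranked alternative that has not been used yet.
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            List<String> alternatives = candidates.get(token);
            if (token.length() < minTokenLength || alternatives == null) {
                out.add(token);
                continue;
            }
            String chosen = token; // fall back to the original token
            for (String alternative : alternatives) {
                if (used.add(alternative)) {
                    chosen = alternative;
                    break;
                }
            }
            out.add(chosen);
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        Map<String, List<String>> candidates = Map.of(
            "documnt", List.of("document", "documents"),
            "serch", List.of("search"));
        System.out.println(suggest("solr documnt serch", candidates, 3));
    }
}
```

With the sample data in `main`, the sketch prints `solr document search`: the correct word is preserved and each misspelled token gets exactly one replacement.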

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-06-13 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604830#action_12604830
 ] 

Grant Ingersoll commented on SOLR-572:
--

Sean,

I see the issue and am working on it.  Good catch.  I'll have a patch shortly.




[jira] Updated: (SOLR-572) Spell Checker as a Search Component

2008-06-13 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-572:
-

Attachment: SOLR-572.patch

Fixes Sean's issue w/ extended results.  

Also, slightly modified the format of the extended results.  See 
{code}
testExtendedResultsCount()
{code}

in SpellCheckComponentTest for the new format.  Basically, it tries to 
normalize the map entries so that one can ask for specific things by name.




[jira] Updated: (SOLR-594) StopFilterFactory, SynonymFilterFactory and EnglishPorterFilterFactory not backwards compatible because of inform

2008-06-13 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-594:
--

Fix Version/s: 1.3
  Summary: StopFilterFactory, SynonymFilterFactory and 
EnglishPorterFilterFactory not backwards compatible because of inform  (was: 
StopWordFilter isn't backwards compatible)

(Updating title to reflect full scope of issue)

Original solr-user thread pointing out problem...
http://www.nabble.com/NullPointerException-at-lucene.analysis.StopFilter-with-1.3-to17564627.html#a17564627

Discussion on solr-dev about how to deal with this...
http://www.nabble.com/3-TokenFilter-factories-not-compatible-with-1.2-to17658628.html#a17658628

...current consensus seems to be that the best approach is...

{quote}
 3) Documentation and Education
 Since this wasn't exactly a use case we ever advertised, we could punt on
 the problem by putting a disclaimer in the CHANGES.txt that anyone directly
 constructing those 3 classes should explicitly call inform() on the
 instances after calling init.


 #3 is obviously the simplest approach as developers, and to be quite honest:
 probably impacts the fewest total number of people (since there are probably
 very few people constructing Factory instances themselves)
{quote}

but first we're going to try and get some feedback from solr-user to verify 
that this really will only impact a small population of users.

 StopFilterFactory, SynonymFilterFactory and EnglishPorterFilterFactory not 
 backwards compatible because of inform
 -

 Key: SOLR-594
 URL: https://issues.apache.org/jira/browse/SOLR-594
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Ronald Braun
Priority: Minor
 Fix For: 1.3


 Direct construction of StopWordFilter is not backwards compatible between 1.2 
 and 1.3.  Here is some test code that throws a null pointer exception in 1.3 
 but that functions correctly in 1.2.
   TokenizerFactory tokenizer = new WhitespaceTokenizerFactory();
   // Configure and initialize the stop filter factory directly
   Map<String, String> args = new HashMap<String, String>();
   args.put("ignoreCase", "true");
   args.put("words", "stopwords.txt");
   StopFilterFactory stopFilter = new StopFilterFactory();
   stopFilter.init(args);
   // Configure and initialize the word delimiter filter factory
   args = new HashMap<String, String>();
   args.put("generateWordParts", "1");
   args.put("generateNumberParts", "1");
   args.put("catenateWords", "0");
   args.put("catenateNumbers", "0");
   args.put("catenateAll", "0");
   WordDelimiterFilterFactory wordFilter = new WordDelimiterFilterFactory();
   wordFilter.init(args);
   TokenFilterFactory[] filters = new TokenFilterFactory[] { stopFilter, wordFilter };
   TokenizerChain pipeline = new TokenizerChain(tokenizer, filters);
   /*** throws a null pointer exception in 1.3: ***/
   boolean onlyStopWords = pipeline.tokenStream(null, new StringReader(query)).next() == null;
 Hoss commented as follows in the Solr forums (including a workaround):
 The short answer is: right after you call stopFilter.init(args), call
 stopFilter.inform(solrCore.getSolrConfig().getResourceLoader());
 This is an interesting use case that wasn't really considered when we
 switched away from using the SolrCore singleton and the
 ResourceLoaderAware interface was added.  We made sure things would still
 work for people who had their own custom Analysis Factories, but some of
 the functionality in *existing* Factories was moved from the init() method
 to inform() ... which means the classes aren't technically backwards
 compatible for people doing what you're doing: constructing them directly.
 When I have some more time, I'll spin up a thread on solr-dev to discuss
 what we should do about this -- in the meantime feel free to file a bug
 that StopFilter isn't backwards compatible.
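
The failure mode and the workaround described above can be reproduced without Solr on the classpath. The following is a minimal sketch only: `Loader` and `StopFactory` are hypothetical stand-ins for Solr's resource loader and for a factory whose resource loading moved from init() to inform(); they are not the real Solr classes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class InformAfterInitSketch {

    /** Hypothetical stand-in for a resource loader: resolves a resource name to its lines. */
    interface Loader {
        List<String> lines(String resource);
    }

    /** Hypothetical stand-in for a factory that loads its word list in inform(), not init(). */
    static class StopFactory {
        private Map<String, String> args;
        private Set<String> stopWords; // stays null until inform() runs

        void init(Map<String, String> args) {
            this.args = args; // only records the configuration
        }

        void inform(Loader loader) {
            stopWords = new HashSet<>(loader.lines(args.get("words")));
        }

        boolean isStopWord(String token) {
            return stopWords.contains(token); // NullPointerException if inform() was skipped
        }
    }

    public static void main(String[] argv) {
        Map<String, String> args = new HashMap<>();
        args.put("words", "stopwords.txt");
        Loader loader = resource -> Arrays.asList("a", "the"); // fake file contents

        StopFactory broken = new StopFactory();
        broken.init(args);
        try {
            broken.isStopWord("the");
        } catch (NullPointerException e) {
            System.out.println("init() alone: NullPointerException");
        }

        StopFactory fixed = new StopFactory();
        fixed.init(args);
        fixed.inform(loader); // the workaround: call inform() right after init()
        System.out.println("init() + inform(): " + fixed.isStopWord("the"));
    }
}
```

The point of the sketch is only the ordering: code that constructs such a factory directly has to perform the second step itself, because nothing else will call inform() for it.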




[jira] Updated: (SOLR-584) XSL for stats.jsp ignores core

2008-06-13 Thread Yousef Ourabi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yousef Ourabi updated SOLR-584:
---

Attachment: stats.xsl.patch

Tentative patch for stats.xsl: creates a new Multicore Mode table and displays 
either Single Core Mode or Multicore Mode, with the core name. We might want to 
think about just making this a row in the Core table, but this will at least 
start that discussion.

 XSL for stats.jsp ignores core
 

 Key: SOLR-584
 URL: https://issues.apache.org/jira/browse/SOLR-584
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Hoss Man
Assignee: Hoss Man
Priority: Trivial
 Fix For: 1.3

 Attachments: stats.xsl.patch


 stats.xsl doesn't do anything with the core info from the XML, so it gets 
 dumped unceremoniously into the middle of the page.
 This is particularly disconcerting in single core mode when the value is 
 null  (which should probably be changed in stats.jsp to something that 
 seems less like an error)
 http://www.nabble.com/%22null%22-in-admin-page-to17486312.html




[jira] Commented: (SOLR-424) The ruby output type produces incorrect output for numeric types without a value

2008-06-13 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604988#action_12604988
 ] 

Yonik Seeley commented on SOLR-424:
---

The latest patch would have a non-negligible performance impact (since it calls 
toString() and trim() on every object)
and incorrect output (since it would effectively disable zero length strings 
for any type, including strings).

 The ruby output type produces incorrect output for numeric types without a 
 value
 

 Key: SOLR-424
 URL: https://issues.apache.org/jira/browse/SOLR-424
 Project: Solr
  Issue Type: Bug
  Components: clients - ruby - flare
Affects Versions: 1.1.0, 1.2, 1.3
Reporter: Kurt Schrader
Assignee: Erik Hatcher
Priority: Critical
 Fix For: 1.3

 Attachments: fix_ruby_output.patch, 
 TextResponseWriter-424.java.patch, TextResponseWriter-SOLR-424.patch, 
 zero_length_int.patch


 When parsing the Ruby output returned from Solr, if a numerical value has no 
 value in the index, it causes an invalid Ruby hash to be returned.  For 
 instance:
 {code:xml} 
  'response'=>{'numFound'=>1,'start'=>0,'maxScore'=>4.951244,'docs'=>[
   {
'subclass_t'=>'Protocol',
'pk_i'=>1,
'id'=>'Protocol:1',
'name_t'=>'Falcipain IC50',
'group_id_i'=>,
'score'=>4.951244}]
  }}
 {code}
 is not a valid hash because 'group_id_i' does not resolve to anything.  It 
 should resolve to nil:
 {code:xml} 
  'response'=>{'numFound'=>1,'start'=>0,'maxScore'=>4.951244,'docs'=>[
   {
'subclass_t'=>'Protocol',
'pk_i'=>1,
'id'=>'Protocol:1',
'name_t'=>'Falcipain IC50',
'group_id_i'=>nil,
'score'=>4.951244}]
  }}
 {code}




[jira] Commented: (SOLR-424) The ruby output type produces incorrect output for numeric types without a value

2008-06-13 Thread Yousef Ourabi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604992#action_12604992
 ] 

Yousef Ourabi commented on SOLR-424:


The other place I was thinking would be the WriteInt implementation in 
JsonResponseWriter -- do the check there.  Would that make more sense to you? 
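
To make the suggestion concrete, here is a rough sketch of what "do the check there" could look like. It is not the code from any of the attached patches, and the method signatures are assumptions made up for illustration; they do not match Solr's actual response writer API. The idea is that only the numeric write path tests for an empty value, so string fields pay no extra cost and keep their zero-length values.

```java
class NilForEmptyIntSketch {

    /**
     * Hypothetical numeric writer: rawValue is the stored string form of an int field.
     * The empty check lives only here, so no other type is affected by it.
     */
    static void writeInt(StringBuilder out, String name, String rawValue) {
        out.append('\'').append(name).append("'=>");
        out.append(rawValue.isEmpty() ? "nil" : rawValue);
    }

    /** Hypothetical string writer: empty strings pass through unchanged (no escaping, for brevity). */
    static void writeStr(StringBuilder out, String name, String value) {
        out.append('\'').append(name).append("'=>'").append(value).append('\'');
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder("{");
        writeStr(out, "name_t", "");
        out.append(',');
        writeInt(out, "group_id_i", "");
        out.append(',');
        writeInt(out, "pk_i", "1");
        out.append('}');
        System.out.println(out);
    }
}
```

Running `main` prints `{'name_t'=>'','group_id_i'=>nil,'pk_i'=>1}`: the empty int becomes nil while the empty string stays an empty string.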
