[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12604818#action_12604818 ] Grant Ingersoll commented on SOLR-572: -- {quote} the spell checker component handling build/reload seems highly awkward to me. suggestion component really should just do that... and wrap the other operations as a /spellchecker/rebuild kinda thing and not even necessarily componentize those operations since they don't really necessarily need to be hooked together with other operations as a single request. {quote} I've thought about a bit, too, as it bothers me, too, but I think the initialization, etc. gets a bit tricky, like all Solr initialization. Not sure what to do. Spell Checker as a Search Component --- Key: SOLR-572 URL: https://issues.apache.org/jira/browse/SOLR-572 Project: Solr Issue Type: New Feature Components: spellchecker Affects Versions: 1.3 Reporter: Shalin Shekhar Mangar Assignee: Grant Ingersoll Priority: Minor Fix For: 1.3 Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch Expose the Lucene contrib SpellChecker as a Search Component. 
Provide the following features: * Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field * Give suggestions on a per-field basis * Given a multi-word query, give only one consistent suggestion * Process the query with the same analyzer specified for the source field and process each token separately * Allow the user to specify minimum length for a token (optional) Consistency criteria for a multi-word query can consist of the following: * Preserve the correct words in the original query as it is * Never give duplicate words in a suggestion -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
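The two consistency criteria for multi-word queries listed in the issue description can be sketched in Java roughly as follows. This is a minimal illustration only, not code from any SOLR-572 patch: the class ConsistentSuggestionSketch, the suggest() method, and the dictionary and candidates arguments are all hypothetical stand-ins (for the spell index and for the ranked corrections a spell checker would return per token).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ConsistentSuggestionSketch {

    /**
     * Builds a single suggestion for a multi-word query.
     *
     * tokens         query tokens in order (assumed already analyzed)
     * dictionary     hypothetical set of correctly spelled words
     * candidates     hypothetical ranked corrections for each misspelled token
     * minTokenLength tokens shorter than this are never corrected
     */
    static String suggest(List<String> tokens, Set<String> dictionary,
                          Map<String, List<String>> candidates, int minTokenLength) {
        // Reserve every word that is kept as-is, so a correction never duplicates it.
        Set<String> used = new HashSet<>();
        for (String token : tokens) {
            if (token.length() < minTokenLength || dictionary.contains(token)) {
                used.add(token);
            }
        }

        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            if (token.length() < minTokenLength || dictionary.contains(token)) {
                result.add(token); // criterion 1: preserve correct words
                continue;
            }
            String replacement = token; // fall back to the original token
            List<String> ranked =
                candidates.getOrDefault(token, Collections.<String>emptyList());
            for (String candidate : ranked) {
                if (!used.contains(candidate)) { // criterion 2: no duplicate words
                    replacement = candidate;
                    break;
                }
            }
            used.add(replacement);
            result.add(replacement);
        }
        return String.join(" ", result);
    }

    public static void main(String[] args) {
        Set<String> dictionary =
            new HashSet<>(Arrays.asList("solr", "search", "component"));
        Map<String, List<String>> candidates = new HashMap<>();
        candidates.put("serch", Arrays.asList("search", "perch"));
        candidates.put("componant", Arrays.asList("component"));

        // prints "solr search component"
        System.out.println(suggest(
            Arrays.asList("solr", "serch", "componant"), dictionary, candidates, 3));
    }
}
```

With the same inputs, the query "search serch" would come back as "search perch": the top-ranked correction is skipped because it would repeat a word already in the query. Whether that is the desired behavior is a design question the issue leaves open.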
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604830#action_12604830 ]

Grant Ingersoll commented on SOLR-572:
--------------------------------------

Sean, I see the issue and am working on it. Good catch. I'll have a patch shortly.
[jira] Updated: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-572:
---------------------------------

Attachment: SOLR-572.patch

Fixes Sean's issue w/ extended results. Also, slightly modified the extended results format. See {code}testExtendedResultsCount(){code} in SpellCheckComponentTest for the new format. Basically, though, it tries to normalize the map entries so that one can ask for specific things by name.
[jira] Updated: (SOLR-594) StopFilterFactory, SynonymFilterFactory and EnglishPorterFilterFactory not backwards compatible because of inform
[ https://issues.apache.org/jira/browse/SOLR-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-594:
--------------------------

Fix Version/s: 1.3
Summary: StopFilterFactory, SynonymFilterFactory and EnglishPorterFilterFactory not backwards compatible because of inform (was: StopWordFilter isn't backwards compatible)

(Updating title to reflect full scope of issue)

Original solr-user thread pointing out problem...
http://www.nabble.com/NullPointerException-at-lucene.analysis.StopFilter-with-1.3-to17564627.html#a17564627

Discussion on solr-dev about how to deal with this...
http://www.nabble.com/3-TokenFilter-factories-not-compatible-with-1.2-to17658628.html#a17658628

...current consensus seems to be that the best approach is...

{quote}
3) Documentation and Education

Since this wasn't exactly a use case we ever advertised, we could punt on the problem by putting a disclaimer in the CHANGES.txt that anyone directly constructing those 3 classes should explicitly call inform() on the instances after calling init.

#3 is obviously the simplest approach as developers, and to be quite honest: probably impacts the fewest total number of people (since there are probably very few people constructing Factory instances themselves)
{quote}

but first we're going to try and get some feedback from solr-user to verify that this really will only impact a small population of users.

StopFilterFactory, SynonymFilterFactory and EnglishPorterFilterFactory not backwards compatible because of inform
------------------------------------------------------------------------------------------------------------------

Key: SOLR-594
URL: https://issues.apache.org/jira/browse/SOLR-594
Project: Solr
Issue Type: Bug
Affects Versions: 1.3
Reporter: Ronald Braun
Priority: Minor
Fix For: 1.3

Direct construction of StopWordFilter is not backwards compatible between 1.2 and 1.3. Here is some test code that throws a null pointer exception in 1.3 but that functions correctly in 1.2.

{code}
TokenizerFactory tokenizer = new WhitespaceTokenizerFactory();

// Configure the stop filter factory.
Map<String, String> args = new HashMap<String, String>();
args.put("ignoreCase", "true");
args.put("words", "stopwords.txt");
StopFilterFactory stopFilter = new StopFilterFactory();
stopFilter.init(args);

// Configure the word delimiter filter factory.
args = new HashMap<String, String>();
args.put("generateWordParts", "1");
args.put("generateNumberParts", "1");
args.put("catenateWords", "0");
args.put("catenateNumbers", "0");
args.put("catenateAll", "0");
WordDelimiterFilterFactory wordFilter = new WordDelimiterFilterFactory();
wordFilter.init(args);

// Chain the tokenizer and both filters together.
TokenFilterFactory[] filters = new TokenFilterFactory[] { stopFilter, wordFilter };
TokenizerChain pipeline = new TokenizerChain(tokenizer, filters);

/*** throws a null pointer exception in 1.3: ***/
boolean onlyStopWords = pipeline.tokenStream(null, new StringReader(query)).next() == null;
{code}

Hoss commented thusly in the solr forums (including a workaround):

The short answer is: right after you call stopFilter.init(args), call stopFilter.inform(solrCore.getSolrConfig().getResourceLoader());

This is an interesting use case that wasn't really considered when we switched away from using the SolrCore singleton and the ResourceLoaderAware interface was added. We made sure things would still work for people who had their own custom Analysis Factories, but some of the functionality in *existing* Factories was moved from the init() method to inform() ... which means the classes aren't technically backwards compatible for people doing what you're doing: constructing them directly.

When I have some more time, I'll spin up a thread on solr-dev to discuss what we should do about this -- in the meantime feel free to file a bug that StopFilter isn't backwards compatible.
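The init()/inform() split described in the workaround can be illustrated without Solr on the classpath. The sketch below is an assumption-laden model of the pattern, not Solr code: ResourceLoader and StopFactory here are hypothetical, simplified stand-ins defined inside the example, and the real Solr classes have richer signatures. It only shows why a factory that defers resource loading to inform() fails when a caller constructs it directly and stops after init().

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class InformAfterInitSketch {

    /** Hypothetical stand-in for a resource loader: resource name to its lines. */
    interface ResourceLoader {
        List<String> getLines(String resource);
    }

    /** Hypothetical factory whose resource loading lives in inform(), not init(). */
    static class StopFactory {
        private Map<String, String> args;
        private Set<String> stopWords; // stays null until inform() has run

        void init(Map<String, String> args) {
            this.args = args; // only records the configuration
        }

        void inform(ResourceLoader loader) {
            // The stop word file is read here, in the second phase.
            stopWords = new HashSet<>(loader.getLines(args.get("words")));
        }

        boolean isStopWord(String token) {
            // Throws NullPointerException if inform() was skipped.
            return stopWords.contains(token);
        }
    }

    public static void main(String[] args) {
        ResourceLoader loader = name -> Arrays.asList("a", "the", "of");
        Map<String, String> config = new HashMap<>();
        config.put("words", "stopwords.txt");

        // Direct construction that stops after init(): fails on first use.
        StopFactory broken = new StopFactory();
        broken.init(config);
        try {
            broken.isStopWord("the");
        } catch (NullPointerException e) {
            System.out.println("init() only: NullPointerException");
        }

        // Workaround ordering: init(), then inform(), then use.
        StopFactory fixed = new StopFactory();
        fixed.init(config);
        fixed.inform(loader);
        System.out.println("init() + inform(): " + fixed.isStopWord("the"));
    }
}
```

In this model the only change a direct caller needs is the extra inform() call between init() and first use, which matches the workaround quoted above.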
[jira] Updated: (SOLR-584) XSL for stats.jsp ignores core
[ https://issues.apache.org/jira/browse/SOLR-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yousef Ourabi updated SOLR-584:
------------------------------

Attachment: stats.xsl.patch

Tentative patch for stats.xsl: creates a new Multicore Mode table and displays either Single Core Mode or multicore mode, with the core name. We might want to think about just making this a row in the Core table, but this will at least start that discussion.

XSL for stats.jsp ignores core
------------------------------

Key: SOLR-584
URL: https://issues.apache.org/jira/browse/SOLR-584
Project: Solr
Issue Type: Bug
Affects Versions: 1.3
Reporter: Hoss Man
Assignee: Hoss Man
Priority: Trivial
Fix For: 1.3
Attachments: stats.xsl.patch

stats.xsl doesn't do anything with the core info from the XML, so it gets dumped unceremoniously into the middle of the page. This is particularly disconcerting in single core mode when the value is null (which should probably be changed in stats.jsp to something that seems less like an error).

http://www.nabble.com/%22null%22-in-admin-page-to17486312.html
[jira] Commented: (SOLR-424) The ruby output type produces incorrect output for numeric types without a value
[ https://issues.apache.org/jira/browse/SOLR-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604988#action_12604988 ]

Yonik Seeley commented on SOLR-424:
-----------------------------------

The latest patch would have a non-negligible performance impact (since it calls toString() and trim() on every object) and incorrect output (since it would effectively disable zero length strings for any type, including strings).

The ruby output type produces incorrect output for numeric types without a value
--------------------------------------------------------------------------------

Key: SOLR-424
URL: https://issues.apache.org/jira/browse/SOLR-424
Project: Solr
Issue Type: Bug
Components: clients - ruby - flare
Affects Versions: 1.1.0, 1.2, 1.3
Reporter: Kurt Schrader
Assignee: Erik Hatcher
Priority: Critical
Fix For: 1.3
Attachments: fix_ruby_output.patch, TextResponseWriter-424.java.patch, TextResponseWriter-SOLR-424.patch, zero_length_int.patch

When parsing the Ruby output returned from Solr, if a numerical value has no value in the index, it causes an invalid Ruby hash to be returned. For instance:

{code:xml}
'response'=>{'numFound'=>1,'start'=>0,'maxScore'=>4.951244,'docs'=>[
  {
    'subclass_t'=>'Protocol',
    'pk_i'=>1,
    'id'=>'Protocol:1',
    'name_t'=>'Falcipain IC50',
    'group_id_i'=>,
    'score'=>4.951244}]
}}
{code}

is not a valid hash because 'group_id_i' does not resolve to anything. It should resolve to nil:

{code:xml}
'response'=>{'numFound'=>1,'start'=>0,'maxScore'=>4.951244,'docs'=>[
  {
    'subclass_t'=>'Protocol',
    'pk_i'=>1,
    'id'=>'Protocol:1',
    'name_t'=>'Falcipain IC50',
    'group_id_i'=>nil,
    'score'=>4.951244}]
}}
{code}
[jira] Commented: (SOLR-424) The ruby output type produces incorrect output for numeric types without a value
[ https://issues.apache.org/jira/browse/SOLR-424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604992#action_12604992 ]

Yousef Ourabi commented on SOLR-424:
------------------------------------

The other place I was thinking would be the writeInt implementation in JsonResponseWriter -- do the check there. Would that make more sense to you?
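The idea of doing the check only where integers are written can be sketched as below. This is a hedged illustration, not any of the attached patches: RubyIntWriterSketch and its writeInt() and writeStr() methods are hypothetical simplifications that append to a StringBuilder and do no escaping, and the real response writer methods may have different signatures. The point is that an empty-value test confined to the numeric path is a cheap length check, and string fields are left alone, so zero length strings stay valid.

```java
public class RubyIntWriterSketch {

    /**
     * Writes 'name'=>value for an integer field.
     * A missing or empty value is written as nil so the Ruby hash stays valid.
     */
    static void writeInt(StringBuilder out, String name, String val) {
        out.append('\'').append(name).append("'=>");
        if (val == null || val.isEmpty()) {
            out.append("nil"); // only the numeric path pays for this check
        } else {
            out.append(val);
        }
    }

    /**
     * Writes 'name'=>'value' for a string field.
     * An empty string is a legitimate value and is written unchanged.
     * (No escaping is done in this simplified sketch.)
     */
    static void writeStr(StringBuilder out, String name, String val) {
        out.append('\'').append(name).append("'=>'").append(val).append('\'');
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder("{");
        writeInt(out, "pk_i", "1");
        out.append(',');
        writeInt(out, "group_id_i", "");
        out.append(',');
        writeStr(out, "name_t", "");
        out.append('}');

        // prints {'pk_i'=>1,'group_id_i'=>nil,'name_t'=>''}
        System.out.println(out);
    }
}
```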