[jira] [Commented] (SOLR-2934) Problem with Solr Hunspell with French Dictionary

2012-07-16 Thread Stephan Meisinger (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414942#comment-13414942
 ] 

Stephan Meisinger commented on SOLR-2934:
-

Please consider looking at this again.
I can reproduce the original StringIndexOutOfBoundsException in 
DoubleASCIIFlagParsingStrategy.

I think this is caused by:

 for (int i = 0; i < rawFlags.length(); i += 2) {
   char cookedFlag = (char) ((int) rawFlags.charAt(i) + (int) rawFlags.charAt(i + 1)); // i + 1 here!!!

We have used the dictionary with Solr 3.3 (patched with files from 
LUCENE-3414/SOLR-2769) and changed this line to:

 for (int i = 0; i < rawFlags.length() - 1; i += 2) { // reduce size by 1 because of .charAt(i + 1)
   char cookedFlag = (char) ((int) rawFlags.charAt(i) + (int) rawFlags.charAt(i + 1));

This worked flawlessly for us.
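
To make the boundary case concrete, here is a minimal standalone sketch of the adjusted loop. It is not the actual Lucene class: the class name FlagParsingSketch and the char[] return type are assumptions made for illustration. With a 3-character flag string, the unpatched bound `i < rawFlags.length()` would reach `charAt(3)` on the second pass, which seems to match the "String index out of range: 3" in the trace. With the reduced bound, the unpaired trailing character is silently dropped.

```java
// Hypothetical standalone sketch; not the real HunspellDictionary code.
final class FlagParsingSketch {

    // Combines each pair of characters into one "cooked" flag, as in the quoted loop.
    static char[] parseFlags(String rawFlags) {
        StringBuilder cooked = new StringBuilder();
        // Stop at length() - 1 so that charAt(i + 1) is always in range,
        // even when rawFlags has an odd number of characters.
        for (int i = 0; i < rawFlags.length() - 1; i += 2) {
            char cookedFlag = (char) ((int) rawFlags.charAt(i) + (int) rawFlags.charAt(i + 1));
            cooked.append(cookedFlag);
        }
        return cooked.toString().toCharArray();
    }

    public static void main(String[] args) {
        System.out.println(parseFlags("ABCD").length); // two complete pairs
        System.out.println(parseFlags("ABC").length);  // one pair, trailing 'C' ignored
    }
}
```

Whether dropping the unpaired character is the right behaviour, or whether the affix file should be rejected instead, is a separate question that this sketch does not settle.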

 Problem with Solr Hunspell with French Dictionary
 -

 Key: SOLR-2934
 URL: https://issues.apache.org/jira/browse/SOLR-2934
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 3.5
 Environment: Windows 7
Reporter: Nathan Castelein
Assignee: Chris Male
 Fix For: 4.0

 Attachments: en_GB.aff, en_GB.dic


 I'm trying to add the HunspellStemFilterFactory to my Solr project. 
 I'm trying this on a fresh new download of Solr 3.5.
 I downloaded the French dictionary from here:
 http://www.dicollecte.org/download/fr/hunspell-fr-moderne-v4.3.zip
 But when I start Solr and open the Solr Analysis page, an error occurs.
 Here is the trace:
 java.lang.RuntimeException: Unable to load hunspell data! [dictionary=en_GB.dic,affix=fr-moderne.aff]
   at org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:82)
   at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:546)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:126)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:461)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
   at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
   at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
   at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
   at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
   at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
   at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
   at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
   at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
   at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
   at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
   at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
   at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
   at org.mortbay.jetty.Server.doStart(Server.java:224)
   at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
   at java.lang.reflect.Method.invoke(Unknown Source)
   at org.mortbay.start.Main.invokeMain(Main.java:194)
   at org.mortbay.start.Main.start(Main.java:534)
   at org.mortbay.start.Main.start(Main.java:441)
   at org.mortbay.start.Main.main(Main.java:119)
 Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: 3
   at java.lang.String.charAt(Unknown Source)
   at org.apache.lucene.analysis.hunspell.HunspellDictionary$DoubleASCIIFlagParsingStrategy.parseFlags(HunspellDictionary.java:382)
   at org.apache.lucene.analysis.hunspell.HunspellDictionary.parseAffix(HunspellDictionary.java:165)
   at org.apache.lucene.analysis.hunspell.HunspellDictionary.readAffixFile(HunspellDictionary.java:121)
   at 
 

[jira] [Created] (SOLR-2751) TermsComponent terms.regex and terms.upper does not always work

2011-09-09 Thread Stephan Meisinger (JIRA)
TermsComponent terms.regex and terms.upper does not always work
---

 Key: SOLR-2751
 URL: https://issues.apache.org/jira/browse/SOLR-2751
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Affects Versions: 3.3
 Environment: Solr 3.3
Reporter: Stephan Meisinger


TermsComponent with a regex checks the upper bound only when the regexp matches.

example:

terms.regex.flag=case_insensitive
terms.fl=suggest_fr
terms.limit=10
terms.regex=a.*
terms.lower=A
terms.upper=b

will also check terms starting with 'b' up to 'z', which is not needed. 
In this example no term past 'b' matches the regexp, so the upper check is 
never reached and upper is effectively ignored. Currently the checks are done 
in this order:

[lower] - start loop at
[regexp] - miss: continue
[upper] - miss: break
[freq] - miss: continue

They should be done in this order:

[lower] - start loop at
[upper] - miss: break
[freq] - miss: continue (I think a double compare is much faster than a std 
regexp)
[regexp] - miss: continue
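
The proposed ordering can be sketched as follows. This is a minimal illustration over an in-memory sorted map, not the TermsComponent code: the class name TermsOrderSketch, the collect method, its parameters and the regexChecks counter are all hypothetical. The upper bound is assumed to be exclusive, and the terms are assumed to be iterated in sorted order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical sketch of the proposed check order; not the real TermsComponent.
final class TermsOrderSketch {

    // Number of regexp evaluations in the last collect() call (illustration only).
    static int regexChecks = 0;

    static List<String> collect(SortedMap<String, Integer> index, String lower, String upper,
                                int minFreq, Pattern regex, int limit) {
        regexChecks = 0;
        List<String> out = new ArrayList<>();
        // [lower] - start loop at the first term >= lower
        for (Map.Entry<String, Integer> e : index.tailMap(lower).entrySet()) {
            String term = e.getKey();
            // [upper] - miss: break (terms are sorted, so no later term can qualify)
            if (term.compareTo(upper) >= 0) break;
            // [freq] - miss: continue (cheap integer comparison)
            if (e.getValue() < minFreq) continue;
            // [regexp] - miss: continue (most expensive check, done last)
            regexChecks++;
            if (!regex.matcher(term).matches()) continue;
            out.add(term);
            if (out.size() >= limit) break;
        }
        return out;
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> index = new TreeMap<>();
        index.put("abricot", 3);
        index.put("avion", 1);
        index.put("bateau", 5);
        index.put("zebre", 2);
        Pattern p = Pattern.compile("a.*", Pattern.CASE_INSENSITIVE);
        System.out.println(collect(index, "a", "b", 1, p, 10));
        System.out.println(regexChecks);
    }
}
```

With this sample data the loop stops at "bateau", so the regexp is evaluated only for the two terms inside the range instead of for every remaining term in the field.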




