JIRA's In Progress status

2008-05-15 Thread Otis Gospodnetic
Hi,

I was hunting for a way to set a JIRA issue status to "In Progress" (see 
http://confluence.atlassian.com/display/JIRA/Issue+status+and+workflow ), but 
couldn't find it.  It looks like that comes with "Workflow Actions" and Solr's 
workflow has only Resolve Issue and Close Issue.  I see this on JIRA's admin page for 
Solr:

Workflow Scheme: None 

But I can't see a place to change that.  Can somebody add the "In Progress" 
action/status to Solr's JIRA? (and Lucene's?)


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-15 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597358#action_12597358
 ] 

Otis Gospodnetic commented on SOLR-572:
---

I see (indexDir comment).  Might be better to make it more obvious then - 
"sourceIndex" (for the Lucene index that serves as the source of data) vs. 
"targetIndex" (or "spellcheckerIndex") for the resulting spellchecker index.

For Lucene indices to be used as sources of data, type="index", 
field="fieldName", location="path/to/lucene/index/directory" makes sense.

Ignore my comment about the schema, I'm just complicating things with that.  
Yes, one word per line for plain-text file data sources - that can easily be 
digested with PlainTextDictionary class (part of Lucene SC).
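
The one-word-per-line format mentioned above can be sketched with plain JDK code. This is only an illustration of the file format: WordListSketch and parseWords are hypothetical names, and the class is a stand-in for, not a copy of, Lucene's PlainTextDictionary.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not Lucene's PlainTextDictionary. */
final class WordListSketch {

    /** Splits the text into lines, trims each one, and skips blank lines. */
    static List<String> parseWords(String text) {
        List<String> words = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [toyota, aura, solr]
        System.out.println(parseWords("toyota\n\n aura \nsolr\n"));
    }
}
```

Every word ends up in the same list, which matches the idea that all words in the file go into a single dictionary.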


> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-15 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597354#action_12597354
 ] 

Shalin Shekhar Mangar commented on SOLR-572:


Otis, I agree that we should use "index" instead of "solr" for the type and 
"path" can be renamed to "location". But indexDir refers to the target for the 
spell check index whereas "path" currently refers to the source of the 
dictionary, so IMHO we should keep "indexDir" as it is (It can also be a 
relative path).

To support arbitrary Lucene indices, the user must specify type="index", 
field="fieldName", location="path/to/lucene/index/directory", which should be 
enough (TODO). In that case the analyzer can be fixed to something (say 
WhitespaceAnalyzer or StandardAnalyzer).

I'm not sure I understand your comment on the schema. If this is for text files 
then I was thinking more about having a text file which would have one word per 
line and all those words would go into the same dictionary.





[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-15 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597351#action_12597351
 ] 

Otis Gospodnetic commented on SOLR-572:
---

I had a quick look and it all looks nice and clean.
I like the config, though I think "solr" is too specific - the source field 
could be in a vanilla Lucene index that lives somewhere on disk, for example.  
Thus, I'd change "solr" to "index".  Oh, I see, you are reading field values 
from the index of the current core.  I think that is fine, but wouldn't it also 
be good to be able to read field values from a vanilla Lucene index? (but you 
wouldn't know the field type and thus would not be able to get the Analyzer for 
the field)

Also, and regardless of the above, instead of having "indexDir" and "path", why 
not call them both "location" and maybe even let them include the file: schema 
for consistency, if it works with the code that uses those locations?

Also on TODO:
* Read dictionary from plain-text files.





[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-15 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597345#action_12597345
 ] 

Noble Paul commented on SOLR-572:
-

 * The spellcheck.dictionary=default parameter must be optional in the query. 
The user must be able to name a dictionary 'default', and that dictionary is 
used when no value is passed.
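
A minimal sketch of the fallback described above, assuming the request parameters are available as a simple map. DictionaryNameSketch and resolve are hypothetical names for illustration and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: picks the dictionary name for a request. */
final class DictionaryNameSketch {
    static final String PARAM = "spellcheck.dictionary";
    static final String DEFAULT_NAME = "default";

    /** Returns the requested dictionary name, or "default" when none is passed. */
    static String resolve(Map<String, String> params) {
        String name = params.get(PARAM);
        if (name == null || name.trim().isEmpty()) {
            return DEFAULT_NAME;
        }
        return name.trim();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        System.out.println(resolve(params));   // prints default
        params.put(PARAM, "external");
        System.out.println(resolve(params));   // prints external
    }
}
```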
 







[jira] Assigned: (SOLR-319) changes SynonymFilterFactory to "Analyze" synonyms file

2008-05-15 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi reassigned SOLR-319:
---

Assignee: Koji Sekiguchi

> changes SynonymFilterFactory to "Analyze" synonyms file
> --
>
> Key: SOLR-319
> URL: https://issues.apache.org/jira/browse/SOLR-319
> Project: Solr
>  Issue Type: Improvement
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Attachments: SOLR-319.patch, SOLR-319.patch, SOLR-319.patch
>
>
> WHAT:
> Currently, SynonymFilterFactory works very well with N-gram tokenizer 
> (CJKTokenizer, for example).
> But we have to take care with the rules in synonyms.txt.
> For example, if I use CJKTokenizer (which works as a bi-gram tokenizer for CJK 
> chars) and want C1C2C3 to map to C4C5C6,
> I have to write the rule as follows:
> C1C2 C2C3 => C4C5 C5C6
> But I want to write it "C1C2C3=>C4C5C6". This patch allows it. It is also 
> helpful for sharing synonyms.txt.
> HOW:
> tokenFactory attribute is added to <filter class="solr.SynonymFilterFactory"/>.
> If the attribute is specified, SynonymFilterFactory uses the TokenizerFactory 
> to create Tokenizer.
> Then SynonymFilterFactory uses the Tokenizer to get tokens from the rules in 
> synonyms.txt file.
> sample-1: CJKTokenizer
>  positionIncrementGap="100">
>   
> 
>  synonyms="ngram_synonym_test_ja.txt"
>   ignoreCase="true" expand="true" 
> tokenFactory="solr.CJKTokenizerFactory"/>
> 
>   
>   
> 
> 
>   
> 
> sample-2: NGramTokenizer
>  positionIncrementGap="100">
>   
>  maxGramSize="2"/>
> 
>   
>   
>  maxGramSize="2"/>
>  synonyms="ngram_synonym_test_ngram.txt"
>   ignoreCase="true" expand="true"
>   tokenFactory="solr.NGramTokenizerFactory" 
> minGramSize="2" maxGramSize="2"/>
> 
>   
> 
> backward compatibility:
> Yes. If you omit tokenFactory attribute from <filter class="solr.SynonymFilterFactory"/> tag, it works as usual.
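
The rule rewriting described in the WHAT section can be sketched as follows. This is a simplified illustration, not the patch's code: BigramRuleSketch is a hypothetical name, and a plain character bi-gram stands in for whatever the configured TokenizerFactory would produce (it also assumes each character fits in one Java char).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: expands a compact synonym rule into its bi-gram form. */
final class BigramRuleSketch {

    /** Overlapping two-character grams; a shorter string is returned as-is. */
    static List<String> bigrams(String s) {
        List<String> out = new ArrayList<>();
        if (s.length() < 2) {
            if (!s.isEmpty()) {
                out.add(s);
            }
            return out;
        }
        for (int i = 0; i + 2 <= s.length(); i++) {
            out.add(s.substring(i, i + 2));
        }
        return out;
    }

    /** Tokenizes both sides of "lhs=>rhs" and joins the tokens with spaces. */
    static String expandRule(String rule) {
        String[] sides = rule.split("=>", 2);
        if (sides.length != 2) {
            throw new IllegalArgumentException("expected 'lhs=>rhs': " + rule);
        }
        return String.join(" ", bigrams(sides[0].trim()))
                + " => "
                + String.join(" ", bigrams(sides[1].trim()));
    }

    public static void main(String[] args) {
        // prints ab bc => xy yz
        System.out.println(expandRule("abc=>xyz"));
    }
}
```

Here "abc=>xyz" plays the role of "C1C2C3=>C4C5C6": the rule is written once in its natural form and the tokenizer-specific form is derived from it.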




[jira] Commented: (SOLR-576) Make DocSetHitCollector public

2008-05-15 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597288#action_12597288
 ] 

Mike Klaas commented on SOLR-576:
-

This is reasonable; I have a whole org/apache/solr/search directory structure 
in my project just to get access to this class.

ISTM that it should be documented before being made public, though.

> Make DocSetHitCollector public
> --
>
> Key: SOLR-576
> URL: https://issues.apache.org/jira/browse/SOLR-576
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.3
>Reporter: Jason Rutherglen
>Priority: Minor
>
> Make org.apache.solr.search.DocSetHitCollector public for use by other code




Re: svn commit: r656826 - in /lucene/solr/trunk/src: java/org/apache/solr/update/DirectUpdateHandler2.java test/org/apache/solr/update/AutoCommitTest.java

2008-05-15 Thread Yonik Seeley
On Thu, May 15, 2008 at 4:39 PM,  <[EMAIL PROTECTED]> wrote:
> remove last vestiges of maxPendingDeletes from DUH2

Oops, thanks - I guess I missed that (I previously did a quick grep
and didn't see anything).

-Yonik


[jira] Created: (SOLR-576) Make DocSetHitCollector public

2008-05-15 Thread Jason Rutherglen (JIRA)
Make DocSetHitCollector public
--

 Key: SOLR-576
 URL: https://issues.apache.org/jira/browse/SOLR-576
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.3
Reporter: Jason Rutherglen
Priority: Minor


Make org.apache.solr.search.DocSetHitCollector public for use by other code




[jira] Assigned: (SOLR-556) Highlighting of multi-valued fields returns snippets which span multiple different values

2008-05-15 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas reassigned SOLR-556:
---

Assignee: Mike Klaas

> Highlighting of multi-valued fields returns snippets which span multiple 
> different values
> -
>
> Key: SOLR-556
> URL: https://issues.apache.org/jira/browse/SOLR-556
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.3
> Environment: Tomcat 5.5
>Reporter: Lars Kotthoff
>Assignee: Mike Klaas
>Priority: Minor
> Attachments: solr-highlight-multivalued-example.xml, 
> solr-highlight-multivalued.patch
>
>
> When highlighting multi-valued fields, the highlighter sometimes returns 
> snippets which span multiple values, e.g. with values "foo" and "bar" and 
> search term "ba" the highlighter will create the snippet "foobar". 
> Furthermore it sometimes returns smaller snippets than it should, e.g. with 
> value "foobar" and search term "oo" it will create the snippet "oo" 
> regardless of hl.fragsize.
> I have been unable to determine the real cause for this, or indeed what 
> actually goes on at all. To reproduce the problem, I've used the following 
> steps:
> * create an index with multi-valued fields, one document should have at least 
> 3 values for these fields (in my case strings of length between 5 and 15 
> Japanese characters -- as far as I can tell plain old ASCII should produce 
> the same effect though)
> * search for part of a value in such a field with highlighting enabled, the 
> additional parameters I use are hl.fragsize=70, hl.requireFieldMatch=true, 
> hl.mergeContiguous=true (changing the parameters does not seem to have any 
> effect on the result though)
> * highlighted snippets should show effects described above
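
The first symptom in the report can be reproduced in miniature. The sketch below is not a diagnosis of the Solr highlighter and uses none of its code; MultiValuedSnippetSketch and both method names are hypothetical. It only shows how a snippet such as "foobar" could arise if values were treated as one concatenated text, and how handling each value separately avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of snippets crossing value boundaries. */
final class MultiValuedSnippetSketch {

    /** Treats all values as one text, so a snippet can straddle two values. */
    static List<String> snippetsFromConcatenation(List<String> values, String term) {
        String joined = String.join("", values);
        List<String> out = new ArrayList<>();
        if (joined.contains(term)) {
            out.add(joined);
        }
        return out;
    }

    /** Looks at each value on its own, so a snippet never crosses a boundary. */
    static List<String> snippetsPerValue(List<String> values, String term) {
        List<String> out = new ArrayList<>();
        for (String value : values) {
            if (value.contains(term)) {
                out.add(value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("foo", "bar");
        System.out.println(snippetsFromConcatenation(values, "ba")); // prints [foobar]
        System.out.println(snippetsPerValue(values, "ba"));          // prints [bar]
    }
}
```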




[jira] Commented: (SOLR-556) Highlighting of multi-valued fields returns snippets which span multiple different values

2008-05-15 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597226#action_12597226
 ] 

Mike Klaas commented on SOLR-556:
-

Thanks for the report, Lars.  I'll take a look at this shortly.

> Highlighting of multi-valued fields returns snippets which span multiple 
> different values
> -
>
> Key: SOLR-556
> URL: https://issues.apache.org/jira/browse/SOLR-556
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.3
> Environment: Tomcat 5.5
>Reporter: Lars Kotthoff
>Priority: Minor
> Attachments: solr-highlight-multivalued-example.xml, 
> solr-highlight-multivalued.patch
>
>
> When highlighting multi-valued fields, the highlighter sometimes returns 
> snippets which span multiple values, e.g. with values "foo" and "bar" and 
> search term "ba" the highlighter will create the snippet "foobar". 
> Furthermore it sometimes returns smaller snippets than it should, e.g. with 
> value "foobar" and search term "oo" it will create the snippet "oo" 
> regardless of hl.fragsize.
> I have been unable to determine the real cause for this, or indeed what 
> actually goes on at all. To reproduce the problem, I've used the following 
> steps:
> * create an index with multi-valued fields, one document should have at least 
> 3 values for these fields (in my case strings of length between 5 and 15 
> Japanese characters -- as far as I can tell plain old ASCII should produce 
> the same effect though)
> * search for part of a value in such a field with highlighting enabled, the 
> additional parameters I use are hl.fragsize=70, hl.requireFieldMatch=true, 
> hl.mergeContiguous=true (changing the parameters does not seem to have any 
> effect on the result though)
> * highlighted snippets should show effects described above




Re: Adding Koji to JIRA

2008-05-15 Thread Chris Hostetter

: I *think* Yonik or Hoss have to manually give you JIRA privileges...

...or Erik.

I've added Koji to the Jira committers list.  While I was in there, I gave 
all the other PMC members who are Solr committers the "Admin" permissions 
for Solr in Jira (so now any current PMC member can do stuff like this in 
the future)


-Hoss



Accessing IndexReader during core initialization hangs init

2008-05-15 Thread Shalin Shekhar Mangar
Hi,

While working on SOLR-572, I found that if I try to access the
IndexReader using SolrCore.getSearcher().get().getReader() within the
SolrCoreAware.inform method, the initialization process hangs.
Basically, the SolrCore.getSearcher halts at the searcherLock.wait()
call in the snippet below:

      // check to see if we can wait for someone else's searcher to be set
      if (onDeckSearchers>0 && !forceNew && _searcher==null) {
        try {
          searcherLock.wait();
        } catch (InterruptedException e) {
          log.info(SolrException.toStr(e));
        }
      }

Is this by design? Are SearchComponents not supposed to access the
IndexReader in this way? I needed access to the IndexReader so that I
can create the spell check index during core initialization. For now,
I've moved the index creation to the first query coming into
SpellCheckComponent (note to myself: review thread-safety in the init
code).
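
The workaround above (build on the first query rather than during init) needs the build to happen exactly once even when several queries arrive together. A minimal sketch of one way to do that, using double-checked locking on a volatile field; LazyInitSketch is a hypothetical name and nothing here comes from SpellCheckComponent itself.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical holder that builds its value once, on first use. */
final class LazyInitSketch<T> {
    private final Supplier<T> builder;
    // volatile so a fully built value is visible to other threads
    private volatile T value;

    LazyInitSketch(Supplier<T> builder) {
        this.builder = builder;
    }

    /** Builds the value on the first call; later calls return the same one. */
    T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {
                v = value;              // re-check under the lock
                if (v == null) {
                    v = builder.get();  // assumed to return non-null
                    value = v;
                }
            }
        }
        return v;
    }

    public static void main(String[] args) {
        AtomicInteger builds = new AtomicInteger();
        LazyInitSketch<String> index =
                new LazyInitSketch<>(() -> "spell-index-" + builds.incrementAndGet());
        System.out.println(index.get()); // prints spell-index-1
        System.out.println(index.get()); // prints spell-index-1
    }
}
```

The first query pays the cost of building the index; whether that delay is acceptable is a separate design question from the hang itself.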

--
Regards,
Shalin Shekhar Mangar.


[jira] Updated: (SOLR-572) Spell Checker as a Search Component

2008-05-15 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-572:
---

Attachment: SOLR-572.patch

A first cut for this issue. Please consider this as work in progress. I've 
posted this to get feedback on the approach and syntax.

The patch contains the following:
* SpellCheckComponent is an implementation of SearchComponent
* The configuration is specified in solrconfig.xml with multiple "dictionary" 
nodes. Each dictionary must have a name and a type. The name must be specified 
during query time. The type is needed to allow for more than one way of loading 
data into the spell index (solr field or file). For example:
{code:xml}


default
solr
word
c:/temp/spellindex


external
file
spellings.txt


{code}
* If indexDir is not present in the dictionary's configuration then a 
RAMDirectory is used, otherwise a FSDirectory is used.
* This patch supports dictionaries loaded from Solr fields.
* A separate Lucene SpellChecker is created for each configured dictionary
* Sample query syntax is as follows:
** 
{{/select/?q=aura&version=2.2&start=0&rows=10&indent=on&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10}}
** 
{{/select/?q=toyata&version=2.2&start=0&rows=10&indent=on&spellcheck=true&spellcheck.dictionary=default}}
* The value for "q" is analyzed with the Solr field's query analyzer. 
Suggestions for each token are fetched separately.
* Only one suggestion for a query is given by default. This should be used for 
multi-token queries.
* If spellcheck.count is specified then the response has a number of 
suggestions <= spellcheck.count for each token separately.
* Only unique words are returned in the suggestions.
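
The per-token behaviour in the last few bullets (at most spellcheck.count suggestions per token, no repeated words) can be sketched like this. SuggestionLimitSketch and its methods are hypothetical names, and the candidate lists are made-up inputs rather than output from the Lucene SpellChecker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of capping and de-duplicating suggestions per token. */
final class SuggestionLimitSketch {

    /** Keeps first-seen order, drops duplicates, returns at most count words. */
    static List<String> limitUnique(List<String> candidates, int count) {
        List<String> out = new ArrayList<>();
        for (String word : new LinkedHashSet<>(candidates)) {
            if (out.size() >= count) {
                break;
            }
            out.add(word);
        }
        return out;
    }

    /** Applies the same limit to every token independently. */
    static Map<String, List<String>> suggestPerToken(
            Map<String, List<String>> candidatesByToken, int count) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : candidatesByToken.entrySet()) {
            result.put(e.getKey(), limitUnique(e.getValue(), count));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> candidates = new LinkedHashMap<>();
        candidates.put("toyata", Arrays.asList("toyota", "toyota", "tonya"));
        // prints {toyata=[toyota, tonya]}
        System.out.println(suggestPerToken(candidates, 2));
    }
}
```

Calling it with a count of 1 corresponds to the default of a single suggestion described above.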

Things to be done:
* Add JUnit tests
* Reloading dictionaries. Currently the dictionary is loaded only once during 
the first request.
* Make things more configurable like SpellCheckerRequestHandler
* Add support for onlyMorePopular flag as in SpellCheckerRequestHandler





[jira] Commented: (SOLR-379) KStem Token Filter

2008-05-15 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597185#action_12597185
 ] 

Otis Gospodnetic commented on SOLR-379:
---

It would be great to have this available in Solr.  Because of KStem's 
incompatible license, I don't know how we can handle this.  Incompatible 
license really just means we cannot distribute the KStem code (and cannot have 
it in the Lucene/Solr svn repository).  Usually when incompatible licensing is 
a problem we say "modify the build script to download the needed library on 
demand if it's not present locally".  This is what some of the Lucene contrib 
components do, for example.

However, looking at your ZIP file I see:

  -rw-r--r--  2836  15-Oct-2007  17:16:46  
src/java/org/apache/solr/analysis/KStemFilterFactory.java
  -rw-r--r-- 4  15-Oct-2007  16:28:08  
src/java/org/apache/lucene/analysis/KStemmer.java
  -rw-r--r--  4501  15-Oct-2007  17:08:38  
src/java/org/apache/lucene/analysis/KStemFilter.java
  -rw-r--r-- 34259  15-Oct-2007  16:28:24  
src/java/org/apache/lucene/analysis/KStemData8.java
  -rw-r--r-- 39918  15-Oct-2007  16:28:28  
src/java/org/apache/lucene/analysis/KStemData7.java
  -rw-r--r-- 41412  15-Oct-2007  16:28:34  
src/java/org/apache/lucene/analysis/KStemData6.java
  -rw-r--r-- 40457  15-Oct-2007  16:28:40  
src/java/org/apache/lucene/analysis/KStemData5.java
  -rw-r--r-- 40823  15-Oct-2007  16:28:44  
src/java/org/apache/lucene/analysis/KStemData4.java
  -rw-r--r-- 39808  15-Oct-2007  16:28:50  
src/java/org/apache/lucene/analysis/KStemData3.java
  -rw-r--r-- 42696  15-Oct-2007  16:29:00  
src/java/org/apache/lucene/analysis/KStemData2.java
  -rw-r--r-- 40020  15-Oct-2007  16:29:14  
src/java/org/apache/lucene/analysis/KStemData1.java

But this is really just a duplicate of what's in 
http://ciir.cs.umass.edu/downloads/files/KStem.jar, plus the Solr-specific 
KStemFilterFactory.java.

So, could we simply download KStem.jar on demand?  And is 
KStemFilterFactory.java really copyright CIIR?  If we can change that to ASL 
then we can include it in the repo, and with a modified build that downloads 
KStem.jar before compiling, this class would compile.
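
The "download on demand if it's not present locally" step would normally live in the build script. The same logic is sketched here in Java purely as an illustration; FetchIfMissingSketch is a hypothetical name, and the source URL and target path are supplied by the caller rather than assumed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch of a fetch-only-if-missing build step. */
final class FetchIfMissingSketch {

    /** Copies source to target only when target is absent; true if it copied. */
    static boolean fetchIfMissing(URL source, Path target) throws IOException {
        if (Files.exists(target)) {
            return false; // already present locally, nothing to do
        }
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        try (InputStream in = source.openStream()) {
            Files.copy(in, target);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: FetchIfMissingSketch <source-url> <target-file>");
            return;
        }
        boolean downloaded = fetchIfMissing(URI.create(args[0]).toURL(), Paths.get(args[1]));
        System.out.println(downloaded ? "downloaded" : "already present");
    }
}
```

Because the jar is fetched at build time and never committed, the repository itself would contain only code under a compatible license.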


> KStem Token Filter
> --
>
> Key: SOLR-379
> URL: https://issues.apache.org/jira/browse/SOLR-379
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Pieter Berkel
>Priority: Minor
> Attachments: KStemSolr.zip
>
>
> A Lucene / Solr implementation of the KStem stemmer.  Full credit goes to 
> Harry Wagner for adapting the Lucene version found here:
> http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi
> Background discussion to this stemmer (including licensing issues) can be 
> found in this thread:
> http://www.nabble.com/Embedded-about-50--faster-for-indexing-tf4325720.html#a12376295
> I've made some minor changes to KStemFilterFactory so that it compiles 
> cleanly against trunk:
> 1) removed some unnecessary imports
> 2) changed the init() method parameters introduced by SOLR-215
> 3) moved KStemFilterFactory into package org.apache.solr.analysis
> Once compiled and included in your Solr war (or as a jar in your lib 
> directory), the KStem filter can be used in your schema very easily:
>   
> 
>  words="stopwords.txt"/>
> 
> 
> 
> 
>   




[jira] Updated: (SOLR-553) Highlighter does not match phrase queries correctly

2008-05-15 Thread Bojan Smid (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bojan Smid updated SOLR-553:


Attachment: Solr-553.patch

Added unit test for this fix to the patch.

> Highlighter does not match phrase queries correctly
> ---
>
> Key: SOLR-553
> URL: https://issues.apache.org/jira/browse/SOLR-553
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Affects Versions: 1.2
> Environment: all
>Reporter: Brian Whitman
>Assignee: Otis Gospodnetic
> Attachments: highlighttest.xml, Solr-553.patch, Solr-553.patch
>
>
> http://www.nabble.com/highlighting-pt2%3A-returning-tokens-out-of-order-from-PhraseQuery-to16156718.html
> Say we search for the band "I Love You But I've Chosen Darkness"
> .../select?rows=100&q=%22I%20Love%20You%20But%20I\'ve%20Chosen%20Darkness%22&fq=type:html&hl=true&hl.fl=content&hl.fragsize=500&hl.snippets=5&hl.simple.pre=%3Cspan%3E&hl.simple.post=%3C/span%3E
> The highlight returns a snippet that does have the name altogether:
> Lights (Live) : I Love You But 
> I've Chosen Darkness :
> But also returns unrelated snips from the same page:
> Black Francis Shop "I Think I Love 
> You"
> A correct highlighter should not return snippets that do not match the phrase 
> exactly.
> LUCENE-794 (not yet committed, but seems to be ready) fixes up the problem 
> from the Lucene end. Solr should get it too.
> Related: SOLR-575 
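
The difference between the two snippets in the report comes down to term matching versus phrase matching. The sketch below is a simplified illustration with whitespace tokenization; PhraseMatchSketch is a hypothetical name and this is not how the Lucene or Solr highlighter is implemented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch contrasting term matching with phrase matching. */
final class PhraseMatchSketch {

    /** Lower-cases the text and splits it on whitespace. */
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    /** True if the fragment contains at least one term of the phrase. */
    static boolean containsAnyTerm(String fragment, String phrase) {
        List<String> fragmentTokens = tokens(fragment);
        for (String term : tokens(phrase)) {
            if (fragmentTokens.contains(term)) {
                return true;
            }
        }
        return false;
    }

    /** True only if all phrase terms appear contiguously and in order. */
    static boolean containsPhrase(String fragment, String phrase) {
        return Collections.indexOfSubList(tokens(fragment), tokens(phrase)) >= 0;
    }

    public static void main(String[] args) {
        String phrase = "I Love You But I've Chosen Darkness";
        String unrelated = "Black Francis Shop \"I Think I Love You\"";
        System.out.println(containsAnyTerm(unrelated, phrase)); // prints true
        System.out.println(containsPhrase(unrelated, phrase));  // prints false
    }
}
```

A highlighter that behaves like containsAnyTerm would return the unrelated snippet; one that behaves like containsPhrase would return only the snippet containing the full band name.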




[jira] Updated: (SOLR-572) Spell Checker as a Search Component

2008-05-15 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-572:
---

Summary: Spell Checker as a Search Component  (was: Spell Checker as a 
Search Handler)

> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion
