
[jira] [Updated] (SOLR-2983) Unable to load custom MergePolicy

2012-03-29 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-2983:
--

Fix Version/s: 3.6

> Unable to load custom MergePolicy
> -
>
> Key: SOLR-2983
> URL: https://issues.apache.org/jira/browse/SOLR-2983
> Project: Solr
>  Issue Type: Bug
>Reporter: Mathias Herberts
>Assignee: Tommaso Teofili
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-2983.patch, SOLR-2983_2.patch
>
>
> As part of a recent upgrade to Solr 3.5.0 we encountered an error related to 
> our use of LinkedIn's ZoieMergePolicy.
> It seems the code that loads a custom MergePolicy was at some point moved 
> into SolrIndexConfig.java from SolrIndexWriter.java, but as this code was 
> copied verbatim it now contains a bug:
> try {
>   policy = (MergePolicy) schema.getResourceLoader().newInstance(
>       mpClassName, null, new Class[]{IndexWriter.class}, new Object[]{this});
> } catch (Exception e) {
>   policy = (MergePolicy) schema.getResourceLoader().newInstance(mpClassName);
> }
> 'this' is no longer an IndexWriter but a SolrIndexConfig, therefore the call 
> to newInstance will always throw an exception and the catch clause will be 
> executed. If the custom MergePolicy does not have a default constructor 
> (which is the case of ZoieMergePolicy), the second attempt to create the 
> MergePolicy will also fail and Solr won't start.
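
The failure mode described above can be reproduced outside Solr with plain
reflection. The following is only a minimal sketch: IndexWriter,
SolrIndexConfig and MergePolicy are empty hypothetical stand-ins for the real
classes, WriterBoundMergePolicy is an invented policy without a default
constructor, and load() imitates the try/catch from the report using
java.lang.reflect rather than Solr's resource loader.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-ins for the Lucene/Solr types named in the report.
class IndexWriter {}
class SolrIndexConfig {}
abstract class MergePolicy {}

// Invented policy that, like the one in the report, has no default constructor.
class WriterBoundMergePolicy extends MergePolicy {
  final IndexWriter writer;

  public WriterBoundMergePolicy(IndexWriter writer) {
    this.writer = writer;
  }
}

class MergePolicyLoading {
  // Imitates the reported logic: try the one-argument constructor first,
  // then fall back to the default constructor.
  static MergePolicy load(Class<? extends MergePolicy> type, Object arg) throws Exception {
    try {
      Constructor<? extends MergePolicy> ctor = type.getConstructor(IndexWriter.class);
      // Throws IllegalArgumentException when arg is not an IndexWriter.
      return ctor.newInstance(arg);
    } catch (Exception e) {
      // Throws NoSuchMethodException when there is no public default constructor.
      return type.getConstructor().newInstance();
    }
  }

  public static void main(String[] args) throws Exception {
    // Passing a real IndexWriter stand-in succeeds on the first attempt.
    MergePolicy policy = load(WriterBoundMergePolicy.class, new IndexWriter());
    System.out.println("loaded " + policy.getClass().getSimpleName());

    // Passing the config object (the 'this' of the report) fails twice.
    try {
      load(WriterBoundMergePolicy.class, new SolrIndexConfig());
    } catch (NoSuchMethodException e) {
      System.out.println("both attempts failed: " + e);
    }
  }
}
```

The first attempt fails because the argument has the wrong type, and the
fallback fails because the class offers no default constructor, which matches
the startup failure described in the issue.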

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-3671) Add a TypeTokenFilter

2012-01-19 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3671:


Attachment: LUCENE-3671.patch

> Add a TypeTokenFilter
> -
>
> Key: LUCENE-3671
> URL: https://issues.apache.org/jira/browse/LUCENE-3671
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: core/queryparser
>Reporter: Santiago M. Mola
> Attachments: LUCENE-3671.patch
>
>
> It would be convenient to have a TypeTokenFilter that filters tokens by their 
> type, either with an exclude or include list. This might be a stupid thing to 
> provide for people who use Lucene directly, but it would be very useful to 
> later expose it to Solr and other Lucene-backed search solutions.


[jira] [Updated] (LUCENE-3671) Add a TypeTokenFilter

2012-01-19 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3671:


Attachment: LUCENE-3671_2.patch

Thanks Robert, you're right.
Updated patch attached.

> Add a TypeTokenFilter
> -
>
> Key: LUCENE-3671
> URL: https://issues.apache.org/jira/browse/LUCENE-3671
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: core/queryparser
>Reporter: Santiago M. Mola
> Attachments: LUCENE-3671.patch, LUCENE-3671_2.patch
>
>
> It would be convenient to have a TypeTokenFilter that filters tokens by their 
> type, either with an exclude or include list. This might be a stupid thing to 
> provide for people who use Lucene directly, but it would be very useful to 
> later expose it to Solr and other Lucene-backed search solutions.


[jira] [Updated] (LUCENE-3671) Add a TypeTokenFilter

2012-01-20 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3671:


Attachment: LUCENE-3671_3.patch

updated patch with unit test

> Add a TypeTokenFilter
> -
>
> Key: LUCENE-3671
> URL: https://issues.apache.org/jira/browse/LUCENE-3671
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: core/queryparser
>Reporter: Santiago M. Mola
> Attachments: LUCENE-3671.patch, LUCENE-3671_2.patch, 
> LUCENE-3671_3.patch
>
>
> It would be convenient to have a TypeTokenFilter that filters tokens by their 
> type, either with an exclude or include list. This might be a stupid thing to 
> provide for people who use Lucene directly, but it would be very useful to 
> later expose it to Solr and other Lucene-backed search solutions.


[jira] [Updated] (SOLR-3054) Add a TypeTokenFilterFactory

2012-01-22 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-3054:
--

Attachment: SOLR-3054.patch

attached factory with unit test patch

> Add a TypeTokenFilterFactory
> 
>
> Key: SOLR-3054
> URL: https://issues.apache.org/jira/browse/SOLR-3054
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Reporter: Tommaso Teofili
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3054.patch
>
>
> Create a TypeTokenFilterFactory to make the TypeTokenFilter (filtering tokens 
> depending on token types, see LUCENE-3671) available in Solr too.


[jira] [Updated] (SOLR-3054) Add a TypeTokenFilterFactory

2012-01-22 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-3054:
--

Attachment: SOLR-3054_2.patch

Thanks Uwe, I've fixed the patch as per your comments.
TypeTokenFF now also implements the ResourceLoaderAware interface.

> Add a TypeTokenFilterFactory
> 
>
> Key: SOLR-3054
> URL: https://issues.apache.org/jira/browse/SOLR-3054
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Reporter: Tommaso Teofili
>Assignee: Uwe Schindler
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3054.patch, SOLR-3054_2.patch
>
>
> Create a TypeTokenFilterFactory to make the TypeTokenFilter (filtering tokens 
> depending on token types, see LUCENE-3671) available in Solr too.


[jira] [Updated] (SOLR-3054) Add a TypeTokenFilterFactory

2012-01-22 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-3054:
--

Attachment: SOLR-3054_3.patch

updated patch:
 - enablePosIncr set to false
 - fixed unit tests for throwing Exceptions rather than try/catching
 - 'types' TypeTokenFF parameter is now mandatory (SolrException raised if not 
supplied)

> Add a TypeTokenFilterFactory
> 
>
> Key: SOLR-3054
> URL: https://issues.apache.org/jira/browse/SOLR-3054
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Reporter: Tommaso Teofili
>Assignee: Uwe Schindler
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3054.patch, SOLR-3054_2.patch, SOLR-3054_3.patch
>
>
> Create a TypeTokenFilterFactory to make the TypeTokenFilter (filtering tokens 
> depending on token types, see LUCENE-3671) available in Solr too.


[jira] [Updated] (SOLR-3013) Add UIMA based tokenizers / filters that can be used in the schema.xml

2012-01-23 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-3013:
--

Attachment: SOLR-3013.patch

patch overview:
 - moved the 'ae' package out of 'processor' package since it's to be used by 
tokenizers too
 - created an 'analysis' package which contains 
tokenizers/analyzers/tokenizerfactories
 - updated the 'Introduction' section inside CHANGES.txt 
 

The UIMAAnnotationsTokenizer creates tokens using annotations created over the 
input Reader.
The UIMATypeAwareAnnotationsTokenizer creates tokens using annotations created 
over the input Reader adding also the TypeAttribute according to the specified 
UIMA FeaturePath.

> Add UIMA based tokenizers / filters that can be used in the schema.xml
> --
>
> Key: SOLR-3013
> URL: https://issues.apache.org/jira/browse/SOLR-3013
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 3.5
>Reporter: Tommaso Teofili
>Priority: Minor
>  Labels: uima, update_request_handler
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3013.patch
>
>
> Add UIMA based tokenizers / filters that can be declared and used directly 
> inside the schema.xml.
> Thus instead of using the UIMA UpdateRequestProcessor one could directly 
> define per-field NLP capable tokenizers / filters.


[jira] [Updated] (LUCENE-3744) Add support for type whitelist in TypeTokenFilter

2012-02-03 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3744:


Attachment: LUCENE-3744_2.patch

Thanks Santiago, I updated the patch to split the Lucene changes from the Solr 
changes (will open a new Jira for the Solr factories changes).


> Add support for type whitelist in TypeTokenFilter
> -
>
> Key: LUCENE-3744
> URL: https://issues.apache.org/jira/browse/LUCENE-3744
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Santiago M. Mola
>Priority: Trivial
> Attachments: LUCENE-3744_2.patch, TypeTokenFilter-whitelist.patch, 
> TypeTokenFilter_whitelst_lucene_and_solr.patch
>
>
> A usual use case for TypeTokenFilter is allowing only a set of token types. 
> That is, listing allowed types, instead of filtered ones. I'm attaching a 
> patch to add a useWhitelist option for that.
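
To illustrate the difference between the two modes, here is a small sketch in
plain Java that does not use the Lucene API. TypedToken and
TypeFilterSketch.filter are hypothetical names made up for this example; only
the useWhitelist flag name comes from the issue. The type strings are
illustrative values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical token carrying a term and a type label.
class TypedToken {
  final String term;
  final String type;

  TypedToken(String term, String type) {
    this.term = term;
    this.type = type;
  }
}

class TypeFilterSketch {
  // With useWhitelist == false, tokens whose type is listed are dropped.
  // With useWhitelist == true, only tokens whose type is listed are kept.
  static List<String> filter(List<TypedToken> tokens, Set<String> types, boolean useWhitelist) {
    List<String> kept = new ArrayList<>();
    for (TypedToken token : tokens) {
      if (types.contains(token.type) == useWhitelist) {
        kept.add(token.term);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<TypedToken> tokens = Arrays.asList(
        new TypedToken("solr", "<ALPHANUM>"),
        new TypedToken("42", "<NUM>"),
        new TypedToken("lucene", "<ALPHANUM>"));
    Set<String> types = new HashSet<>(Arrays.asList("<NUM>"));

    System.out.println("types as stop list: " + filter(tokens, types, false));
    System.out.println("types as whitelist: " + filter(tokens, types, true));
  }
}
```

A single boolean keeps one code path for both behaviours, so the same type
list can be read either as a stop list or as an allow list.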


[jira] [Updated] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-03 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3731:


Attachment: LUCENE-3731.patch

this patch adds a modules/analysis/uima module

> Create a analysis/uima module for UIMA based tokenizers/analyzers
> -
>
> Key: LUCENE-3731
> URL: https://issues.apache.org/jira/browse/LUCENE-3731
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3731.patch
>
>
> As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
> out in a separate module (modules/analysis/uima) as they can be used in plain 
> Lucene. Then the solr/contrib/uima will contain only the related factories.


[jira] [Updated] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-09 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3731:


Attachment: LUCENE-3731_2.patch

Updated patch which incorporates Robert's suggestions. 
The random strings testing highlights some corner cases where the endOffset is 
not set correctly, probably due to the explicit Reader to String conversion in 
BaseUIMATokenizer, which needs to get rid of the line.separator property.

New patch to fix the above will follow.

> Create a analysis/uima module for UIMA based tokenizers/analyzers
> -
>
> Key: LUCENE-3731
> URL: https://issues.apache.org/jira/browse/LUCENE-3731
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch
>
>
> As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
> out in a separate module (modules/analysis/uima) as they can be used in plain 
> Lucene. Then the solr/contrib/uima will contain only the related factories.


[jira] [Updated] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-13 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3731:


Attachment: LUCENE-3731_3.patch

Updated patch which fixes corner cases with wrong endOffsets.


> Create a analysis/uima module for UIMA based tokenizers/analyzers
> -
>
> Key: LUCENE-3731
> URL: https://issues.apache.org/jira/browse/LUCENE-3731
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, 
> LUCENE-3731_3.patch
>
>
> As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
> out in a separate module (modules/analysis/uima) as they can be used in plain 
> Lucene. Then the solr/contrib/uima will contain only the related factories.


[jira] [Updated] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-14 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3731:


Attachment: LUCENE-3731_4.patch

patch with finalOffset setting in the end() method.

> Create a analysis/uima module for UIMA based tokenizers/analyzers
> -
>
> Key: LUCENE-3731
> URL: https://issues.apache.org/jira/browse/LUCENE-3731
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, 
> LUCENE-3731_3.patch, LUCENE-3731_4.patch
>
>
> As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
> out in a separate module (modules/analysis/uima) as they can be used in plain 
> Lucene. Then the solr/contrib/uima will contain only the related factories.


[jira] [Updated] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-16 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3731:


Attachment: LUCENE-3731_rsrel.patch

bq. Because Tokenizer.close() is misleading/confusing, the instance is still 
reused after 
this for subsequent documents.

After calling close(), it looks like the correct way to reuse that Tokenizer 
instance is to call reset(someOtherInput) before doing anything else. So, 
after adding 

{code}
assert reader != null : "input has been closed, please reset it";
{code}

as the first line inside the toString(Reader reader) method in 
BaseUIMATokenizer, I tried this test:
{code}

  @Test
  public void testSetReaderAndClose() throws Exception {
    StringReader input = new StringReader("the big brown fox jumped on the wood");
    Tokenizer t = new UIMAAnnotationsTokenizer("/uima/AggregateSentenceAE.xml",
        "org.apache.uima.TokenAnnotation", input);
    assertTokenStreamContents(t, new String[]{"the", "big", "brown", "fox",
        "jumped", "on", "the", "wood"});
    t.close();
    try {
      t.incrementToken();
      fail("should've been failed as reader is not set");
    } catch (AssertionError error) {
      // ok
    }
    input = new StringReader("hi oh my");
    t = new UIMAAnnotationsTokenizer("/uima/TestAggregateSentenceAE.xml",
        "org.apache.lucene.uima.ts.TokenAnnotation", input);
    assertTrue("should've been incremented ", t.incrementToken());
    t.close();
    try {
      t.incrementToken();
      fail("should've been failed as reader is not set");
    } catch (AssertionError error) {
      // ok
    }
    t.reset(new StringReader("hey what do you say"));
    assertTrue("should've been incremented ", t.incrementToken());
  }

{code}

and it looks to me it's behaving correctly.
Still working on improving it and trying to catch possible corner cases.


> Create a analysis/uima module for UIMA based tokenizers/analyzers
> -
>
> Key: LUCENE-3731
> URL: https://issues.apache.org/jira/browse/LUCENE-3731
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, 
> LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_rsrel.patch, 
> LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch
>
>
> As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
> out in a separate module (modules/analysis/uima) as they can be used in plain 
> Lucene. Then the solr/contrib/uima will contain only the related factories.


[jira] [Updated] (LUCENE-3731) Create a analysis/uima module for UIMA based tokenizers/analyzers

2012-02-29 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated LUCENE-3731:


Fix Version/s: (was: 3.6)

> Create a analysis/uima module for UIMA based tokenizers/analyzers
> -
>
> Key: LUCENE-3731
> URL: https://issues.apache.org/jira/browse/LUCENE-3731
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 4.0
>
> Attachments: LUCENE-3731.patch, LUCENE-3731_2.patch, 
> LUCENE-3731_3.patch, LUCENE-3731_4.patch, LUCENE-3731_rsrel.patch, 
> LUCENE-3731_speed.patch, LUCENE-3731_speed.patch, LUCENE-3731_speed.patch
>
>
> As discussed in SOLR-3013 the UIMA Tokenizers/Analyzer should be refactored 
> out in a separate module (modules/analysis/uima) as they can be used in plain 
> Lucene. Then the solr/contrib/uima will contain only the related factories.


[jira] [Updated] (SOLR-2983) Unable to load custom MergePolicy

2012-03-22 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-2983:
--

Attachment: SOLR-2983_2.patch

new patch which adds the CHANGES.txt entry and adds tests for the 
toIndexWriterConfig() method (the one which caused the failures)

> Unable to load custom MergePolicy
> -
>
> Key: SOLR-2983
> URL: https://issues.apache.org/jira/browse/SOLR-2983
> Project: Solr
>  Issue Type: Bug
>Reporter: Mathias Herberts
>Assignee: Tommaso Teofili
>Priority: Minor
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-2983.patch, SOLR-2983_2.patch
>
>
> As part of a recent upgrade to Solr 3.5.0 we encountered an error related to 
> our use of LinkedIn's ZoieMergePolicy.
> It seems the code that loads a custom MergePolicy was at some point moved 
> into SolrIndexConfig.java from SolrIndexWriter.java, but as this code was 
> copied verbatim it now contains a bug:
> try {
>   policy = (MergePolicy) schema.getResourceLoader().newInstance(
>       mpClassName, null, new Class[]{IndexWriter.class}, new Object[]{this});
> } catch (Exception e) {
>   policy = (MergePolicy) schema.getResourceLoader().newInstance(mpClassName);
> }
> 'this' is no longer an IndexWriter but a SolrIndexConfig, therefore the call 
> to newInstance will always throw an exception and the catch clause will be 
> executed. If the custom MergePolicy does not have a default constructor 
> (which is the case of ZoieMergePolicy), the second attempt to create the 
> MergePolicy will also fail and Solr won't start.


[jira] [Updated] (SOLR-3021) Minor code performance improvements in SolrJ

2012-01-10 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-3021:
--

Attachment: SOLR-3021.patch

attached patch

> Minor code performance improvements in SolrJ
> 
>
> Key: SOLR-3021
> URL: https://issues.apache.org/jira/browse/SOLR-3021
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 3.5
>Reporter: Tommaso Teofili
>Priority: Minor
>  Labels: solrj
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-3021.patch
>
>
> Remove string concatenations and use Collections.addAll instead of looping to 
> add elements to Iterables.
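
As a rough illustration of the kind of change described, here is a sketch in
plain Java. The class and method names are hypothetical and are not taken from
SolrJ or from the attached patch; they only contrast the looping and
concatenating forms with Collections.addAll and StringBuilder.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SolrjCleanupSketch {
  // Before: copy an array into a list one element at a time.
  static List<String> copyWithLoop(String[] fields) {
    List<String> out = new ArrayList<>();
    for (String field : fields) {
      out.add(field);
    }
    return out;
  }

  // After: Collections.addAll performs the same copy in a single call.
  static List<String> copyWithAddAll(String[] fields) {
    List<String> out = new ArrayList<>();
    Collections.addAll(out, fields);
    return out;
  }

  // Before: concatenation in a loop builds a new String on every iteration.
  static String joinWithConcat(List<String> fields) {
    String joined = "";
    for (String field : fields) {
      joined += field + ",";
    }
    return joined;
  }

  // After: a single StringBuilder accumulates the same characters.
  static String joinWithBuilder(List<String> fields) {
    StringBuilder joined = new StringBuilder();
    for (String field : fields) {
      joined.append(field).append(',');
    }
    return joined.toString();
  }

  public static void main(String[] args) {
    String[] fields = {"id", "name"};
    System.out.println(copyWithLoop(fields).equals(copyWithAddAll(fields)));
    List<String> list = copyWithAddAll(fields);
    System.out.println(joinWithConcat(list).equals(joinWithBuilder(list)));
  }
}
```

Both pairs of methods produce identical results; the second form of each
simply avoids the per-element call overhead or the intermediate String
objects.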


[jira] [Updated] (SOLR-2983) Unable to load custom MergePolicy

2012-01-17 Thread Tommaso Teofili (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated SOLR-2983:
--

Attachment: SOLR-2983.patch

simple patch which just removes the (always failing) try clause, adding unit 
tests for a bad merge policy sample

> Unable to load custom MergePolicy
> -
>
> Key: SOLR-2983
> URL: https://issues.apache.org/jira/browse/SOLR-2983
> Project: Solr
>  Issue Type: Bug
>Reporter: Mathias Herberts
>Priority: Critical
> Fix For: 3.6, 4.0
>
> Attachments: SOLR-2983.patch
>
>
> As part of a recent upgrade to Solr 3.5.0 we encountered an error related to 
> our use of LinkedIn's ZoieMergePolicy.
> It seems the code that loads a custom MergePolicy was at some point moved 
> into SolrIndexConfig.java from SolrIndexWriter.java, but as this code was 
> copied verbatim it now contains a bug:
> try {
>   policy = (MergePolicy) schema.getResourceLoader().newInstance(
>       mpClassName, null, new Class[]{IndexWriter.class}, new Object[]{this});
> } catch (Exception e) {
>   policy = (MergePolicy) schema.getResourceLoader().newInstance(mpClassName);
> }
> 'this' is no longer an IndexWriter but a SolrIndexConfig, therefore the call 
> to newInstance will always throw an exception and the catch clause will be 
> executed. If the custom MergePolicy does not have a default constructor 
> (which is the case of ZoieMergePolicy), the second attempt to create the 
> MergePolicy will also fail and Solr won't start.

