Solr Analyzers and Filters

2009-05-29 Thread Chris Male
Hello,

I have noticed that in Lucene 2.9 Token is now a deprecated class and many
of the Lucene analyzers have been moved over to the new API.  Is there an
intention to do the same for the Solr analyzers and filters? Is it
still worthwhile creating a new Filter that uses the Token-based API?

Thank you,
Chris


[jira] Created: (SOLR-1194) Query Analyzer not Invoking for Custom FiledType - When we use Custom QParser Plugin

2009-05-29 Thread Nagarajan.shanmugam (JIRA)
Query Analyzer not Invoking for Custom FiledType - When we use Custom QParser 
Plugin


 Key: SOLR-1194
 URL: https://issues.apache.org/jira/browse/SOLR-1194
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.3
 Environment: Windows, Java 1.6. Solr 1.3
Reporter: Nagarajan.shanmugam


Hi, I created a custom Solr field type, kwd_names, in schema.xml:
<fieldType name="kwd_names" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="Metaphone" inject="true"/>
  </analyzer>
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="Metaphone" inject="true"/>
  </analyzer>
</fieldType>

I configured a requestHandler in solrconfig.xml with a custom QParserPlugin:

<requestHandler name="fperson" class="solr.SearchHandler">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="defType">fpersonQueryParser</str>
  </lst>
</requestHandler>

<queryParser name="fpersonQueryParser"
             class="com.thinkronize.edudym.search.analysis.FPersonQParserPlugin"/>

  SolrQuery q = new SolrQuery();
  q.setParam("q", "George");
  q.setParam("gender", "M");
  q.setQueryType(FPersonSearcher.QUERY_TYPE);
  server.query(q);

When I fire the query, the query analyzer is not invoked and no results are 
returned. But if I remove q.setQueryType, the query analyzer is invoked and 
results are returned.

That means the query analyzer for that field is not invoked when I use the 
custom QParserPlugin.




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-236) Field collapsing

2009-05-29 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated SOLR-236:
---

Attachment: field-collapse-solr-236.patch

Hi,

I have modified Thomas's latest patch and made two performance 
improvements: 
1) Improved normal field collapsing. I tested it with an index of 1.1 million 
documents. When collapsing on all documents with no sorting specified (so 
sorting on score), the query time is around 130 ms, compared with around 1.5 s 
for the previous patch. When I then add sorting on a string field, the query 
time is around 220 ms, compared with around 5.2 s for the previous patch. 

The reason it is faster is that the latest patch queries for a doclist 
instead of a docset. The normal collapse method keeps track of the most 
relevant documents, so the end result is the same. Also, creating a docList of 
1.1 million documents (and ordering it) is very expensive.

Note: I did not improve adjacent collapsing, because the adjacent method needs 
(as far as I understand it) a completely sorted list of documents (docList).

2) Slightly improved faceting in combination with field collapsing, by reusing 
the uncollapsed docset that is created during the collapsing process (the 
previous patch invoked a second search).
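
To illustrate point 1: the sketch below shows, in plain Java, the general idea 
of keeping track of only the most relevant documents instead of building and 
ordering a list of every match. It is not code from the patch. The names 
TopDocsSketch, ScoredDoc and topN are made up for this example, and the 
bounded heap is just one common way to do this kind of bookkeeping.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names exist in Solr or in the patch.
class TopDocsSketch {

    /** A matching document with its relevance score. */
    static final class ScoredDoc {
        final int id;
        final float score;

        ScoredDoc(int id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    /**
     * Returns the n highest-scoring documents, best first, without sorting
     * the full set of matches. The heap never holds more than n entries.
     */
    static List<ScoredDoc> topN(Iterable<ScoredDoc> matches, int n) {
        if (n <= 0) {
            return Collections.emptyList();
        }
        Comparator<ScoredDoc> byScore = Comparator.comparingDouble((ScoredDoc d) -> d.score);
        // Min-heap: the weakest of the current top n sits at the head.
        PriorityQueue<ScoredDoc> heap = new PriorityQueue<>(byScore);
        for (ScoredDoc d : matches) {
            if (heap.size() < n) {
                heap.add(d);
            } else if (d.score > heap.peek().score) {
                heap.poll();   // drop the weakest kept document
                heap.add(d);
            }
        }
        List<ScoredDoc> result = new ArrayList<>(heap);
        result.sort(byScore.reversed());   // only n entries are ever sorted
        return result;
    }
}
```

With 1.1 million matches and a small n, this does one pass over the matches 
and sorts only n entries at the end, which is the kind of saving described 
above.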

I also added documentation, added a few unit tests for the collapsing 
process itself, and made the debug information more readable.

I'm very interested in other people's experiences with this patch and feedback 
on the patch itself. 

Cheers,

Martijn 


 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch, 
 SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also "Duplicate detection".
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 - collapse.field: the field used to group results
 - collapse.type: normal (default value) or adjacent
 - collapse.max: how many continuous results are allowed before collapsing
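
 As a rough illustration of the difference between the two collapse.type 
 values, here is a small plain-Java sketch. It is not taken from any of the 
 attached patches. The class and method names (CollapseSketch, Hit, 
 collapseNormal, collapseAdjacent) are invented for the example, and it 
 assumes that collapse.max means "keep at most this many results per group", 
 which is one reading of the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these names do not come from the patch.
class CollapseSketch {

    /** A search hit: document id plus the value of the collapse field. */
    static final class Hit {
        final int id;
        final String fieldValue;

        Hit(int id, String fieldValue) {
            this.id = id;
            this.fieldValue = fieldValue;
        }

        @Override
        public String toString() {
            return Integer.toString(id);
        }
    }

    /** "normal": keep at most max hits per field value across the whole list. */
    static List<Hit> collapseNormal(List<Hit> ranked, int max) {
        Map<String, Integer> seen = new HashMap<>();
        List<Hit> out = new ArrayList<>();
        for (Hit h : ranked) {
            int count = seen.merge(h.fieldValue, 1, Integer::sum);
            if (count <= max) {
                out.add(h);
            }
        }
        return out;
    }

    /** "adjacent": keep at most max hits per run of consecutive equal values. */
    static List<Hit> collapseAdjacent(List<Hit> ranked, int max) {
        List<Hit> out = new ArrayList<>();
        String previous = null;
        int run = 0;
        for (Hit h : ranked) {
            run = h.fieldValue.equals(previous) ? run + 1 : 1;
            previous = h.fieldValue;
            if (run <= max) {
                out.add(h);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Hit> ranked = Arrays.asList(
            new Hit(1, "a.com"), new Hit(2, "a.com"), new Hit(3, "b.com"),
            new Hit(4, "a.com"), new Hit(5, "b.com"));
        System.out.println(collapseNormal(ranked, 1));    // prints [1, 3]
        System.out.println(collapseAdjacent(ranked, 1));  // prints [1, 3, 4, 5]
    }
}
```

 Note that the adjacent variant only looks at neighbouring results, so its 
 output depends on the order of the input list.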
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)




[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2009-05-29 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714442#action_12714442
 ] 

Martijn van Groningen edited comment on SOLR-236 at 5/29/09 6:02 AM:
-

Hi,

I have modified Thomas's latest patch and made two performance 
improvements: 
1) Improved normal field collapsing. I tested it with an index of 1.1 million 
documents. When collapsing on all documents with no sorting specified (so 
sorting on score), the query time is around 130 ms, compared with around 1.5 s 
for the previous patch. When I then add sorting on a string field, the query 
time is around 220 ms, compared with around 5.2 s for the previous patch. 

The reason it is faster is that the latest patch queries for a doclist 
instead of a docset. The normal collapse method keeps track of the most 
relevant documents, so the end result is the same. Also, creating a docList of 
1.1 million documents (and ordering it) is very expensive.

Note: I did not improve adjacent collapsing, because the adjacent method needs 
(as far as I understand it) a completely sorted list of documents (docList).

2) Slightly improved faceting in combination with field collapsing, by 
reusing the uncollapsed docset that is created during the collapsing process 
(the previous patch invoked a second search).

I also added documentation, added a few unit tests for the collapsing 
process itself, and made the debug information more readable.

I'm very interested in other people's experiences with this patch and feedback 
on the patch itself. 

Cheers,

Martijn 



[jira] Issue Comment Edited: (SOLR-236) Field collapsing

2009-05-29 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714442#action_12714442
 ] 

Martijn van Groningen edited comment on SOLR-236 at 5/29/09 11:38 AM:
--

Hi,

I have modified Thomas's latest patch and made two performance 
improvements: 
1) Improved normal field collapsing. I tested it with an index of 1.1 million 
documents. When collapsing on all documents with no sorting specified (so 
sorting on score), the query time is around 130 ms, compared with around 1.5 s 
for the previous patch. When I then add sorting on a string field, the query 
time is around 220 ms, compared with around 5.2 s for the previous patch. 

The reason it is faster is that the latest patch queries for a doclist 
instead of a docset. The normal collapse method keeps track of the most 
relevant documents, so the end result is the same. Also, creating a docList of 
1.1 million documents (and ordering it) is very expensive.

Note: I did not improve adjacent collapsing, because the adjacent method needs 
(as far as I understand it) a completely sorted list of documents (docList).

2) Slightly improved faceting in combination with field collapsing, by 
reusing the uncollapsed docset that is created during the collapsing process 
(the previous patch invoked a second search).

I also added documentation, added a few unit tests for the collapsing 
process itself, and made the debug information more readable.
This patch works from revision 779335 (last Wednesday) and up. This patch 
depends on some changes in Solr and a change inside Lucene.

I'm very interested in other people's experiences with this patch and feedback 
on the patch itself. 

Cheers,

Martijn 



Re: [jira] Updated: (SOLR-1155) Change DirectUpdateHandler2 to allow concurrent adds during an autocommit

2009-05-29 Thread Mike Klaas
I'd like to take a look at this but JIRA seems to be down. Is anyone else
experiencing this?

-Mike


On Wed, May 13, 2009 at 7:41 AM, Jayson Minard (JIRA) j...@apache.org wrote:


 [
 https://issues.apache.org/jira/browse/SOLR-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]

 Jayson Minard updated SOLR-1155:
 

 Attachment: Solr-1155.patch

 Resolve TODO for commitWithin, and updated AutoCommitTrackerTest to
 validate the fix.

  Change DirectUpdateHandler2 to allow concurrent adds during an autocommit
  -
 
  Key: SOLR-1155
  URL: https://issues.apache.org/jira/browse/SOLR-1155
  Project: Solr
   Issue Type: Improvement
   Components: search
 Affects Versions: 1.3
 Reporter: Jayson Minard
  Attachments: Solr-1155.patch, Solr-1155.patch
 
 
  Currently DirectUpdateHandler2 will block adds during a commit, and it
 seems to be possible with recent changes to Lucene to allow them to run
 concurrently.
  See:
 http://www.nabble.com/Autocommit-blocking-adds---AutoCommit-Speedup--td23435224.html





snapshot config value for Solr 1.4 Replication

2009-05-29 Thread mlathe

Hi All,
I'm doing some proof of concept work with Solr Replication
http://wiki.apache.org/solr/SolrReplication

If you dig through the ReplicationHandler code you will see that the master
node's config can include replicateAfter and snapshot, like this:
<lst name="master">
  <str name="replicateAfter">startup,commit</str>
  <str name="snapshot">startup,commit</str>
  <str name="confFiles">schema.xml,stopwords.txt,synonyms.txt</str>
</lst>

Does anyone understand what the snapshot values do? They are not described in
the wiki documentation.

Thanks
--Matthias

-- 
View this message in context: 
http://www.nabble.com/snapshot-config-value-for-Solr-1.4-Replication-tp23788960p23788960.html
Sent from the Solr - Dev mailing list archive at Nabble.com.