[jira] Updated: (SOLR-236) Field collapsing

2007-05-19 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated SOLR-236:
-

Attachment: field_collapsing.patch

The latest version of the patch.

- Results are now cached using CollapseCache (a new SolrCache instance 
declared in solrconfig.xml)
- The parameter collapse has been removed.

This version has been fully tested.

Feedback is welcome.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.2
Reporter: Emmanuel Keller
 Attachments: collapse_field.patch, collapse_field.patch, 
 field_collapsing.patch, field_collapsing.patch, field_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 4 new query parameters (SolrParams):
 collapse: set to true to enable collapsing.
 collapse.field: the field used to group results.
 collapse.type: normal (default value) or adjacent.
 collapse.max: how many consecutive results are allowed before 
 collapsing.
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
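
 The parameters described above can be illustrated with a minimal sketch. This is not the patch's code: the function name `collapse` and its arguments are hypothetical, and the semantics are an assumption read from the description (normal groups every document sharing a field value, adjacent groups only consecutive runs, and collapse.max is the number of documents kept per group).

```python
from collections import defaultdict


def collapse(docs, field, collapse_type="normal", collapse_max=1):
    """Keep at most collapse_max documents per group, preserving rank order.

    Assumed semantics (not taken from the patch itself):
      normal   - a group is every document sharing the field value
      adjacent - a group is a run of consecutive documents sharing the value
    Documents missing the field share the value None and form one group.
    """
    if collapse_type not in ("normal", "adjacent"):
        raise ValueError("collapse_type must be 'normal' or 'adjacent'")

    kept = []
    seen = defaultdict(int)                # normal: occurrences per field value
    run_value, run_length = object(), 0    # adjacent: current run (sentinel start)

    for doc in docs:
        value = doc.get(field)
        if collapse_type == "adjacent":
            if value == run_value:
                run_length += 1
            else:
                run_value, run_length = value, 1
            count = run_length
        else:
            seen[value] += 1
            count = seen[value]
        if count <= collapse_max:
            kept.append(doc)
    return kept


results = [
    {"id": 1, "site": "a"},
    {"id": 2, "site": "a"},
    {"id": 3, "site": "b"},
    {"id": 4, "site": "a"},
]
print([d["id"] for d in collapse(results, "site", "normal")])    # [1, 3]
print([d["id"] for d in collapse(results, "site", "adjacent")])  # [1, 3, 4]
```

 In this sketch the collapsed documents are simply dropped. A real implementation would presumably also report how many documents were collapsed per group so that a "more documents from this site" link can be rendered.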

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-236) Field collapsing

2007-05-19 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated SOLR-236:
-

Attachment: field_collapsing_1.1.0.patch

I still maintain a version of the patch for release 1.1.0 (the version used in 
our production environment).

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.2
Reporter: Emmanuel Keller
 Attachments: collapse_field.patch, collapse_field.patch, 
 field_collapsing.patch, field_collapsing.patch, field_collapsing.patch, 
 field_collapsing_1.1.0.patch






Re: Code style

2007-05-19 Thread Otis Gospodnetic
+1 to sharing styles.  Next week, I'll see if I can dig up the Eclipse 3.2 
style that I customized to fit the Lucene style.

Otis
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

----- Original Message -----
From: Chris Hostetter [EMAIL PROTECTED]
To: solr-dev@lucene.apache.org
Sent: Saturday, May 19, 2007 2:05:33 AM
Subject: Re: Code style


: thinking might be useful (in Lucene as well) is to have downloadable
: codestyle templates for IntelliJ and Eclipse defined and linked to
: from the developer section of the website (or it could even be

I certainly have no objection to hosting style files for whatever editors
people like on the wiki, or even in SVN, as long as it's clear these are
just aids for people who want help being more consistent, and not
an indication that using certain editors/IDEs to ensure consistent code
is mandatory for submitting patches or committing changes.


-Hoss






[jira] Commented: (SOLR-69) PATCH:MoreLikeThis support

2007-05-19 Thread Brian Whitman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12497178
 ] 

Brian Whitman commented on SOLR-69:
---

R, one useful feature would be mlt.fq=query, where query is a filter query, 
like type:book. Or, since we're moving to a standalone handler for mlt, just 
supporting fq would be good.

For example:

/mlt?q=id:BOOK01&mlt.fl=contents&fq=type:BOOK

(Because in a single Solr instance you've got information about books and 
authors, and you only want the mlt results to be books.)
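
A request like the one above could be assembled on the client side as in the following sketch. The helper name `mlt_url` is hypothetical, and both the /mlt handler path and its fq support are the proposal under discussion here, not a confirmed API.

```python
from urllib.parse import urlencode


def mlt_url(doc_query, fields, filter_query=None, handler="/mlt"):
    """Build a MoreLikeThis request path with an optional filter query.

    handler and the fq parameter reflect the proposal in this thread and
    are assumptions, not a documented Solr interface.
    """
    params = [("q", doc_query), ("mlt.fl", ",".join(fields))]
    if filter_query:
        params.append(("fq", filter_query))
    # Keep ':' and ',' readable; everything else is percent-encoded.
    return handler + "?" + urlencode(params, safe=":,")


print(mlt_url("id:BOOK01", ["contents"], "type:BOOK"))
# /mlt?q=id:BOOK01&mlt.fl=contents&fq=type:BOOK
```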


 PATCH:MoreLikeThis support
 --

 Key: SOLR-69
 URL: https://issues.apache.org/jira/browse/SOLR-69
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Bertrand Delacretaz
Priority: Minor
 Attachments: lucene-queries-2.0.0.jar, lucene-queries-2.1.1-dev.jar, 
 SOLR-69-MoreLikeThisRequestHandler.patch, 
 SOLR-69-MoreLikeThisRequestHandler.patch, SOLR-69.patch, SOLR-69.patch, 
 SOLR-69.patch, SOLR-69.patch


 Here's a patch that implements simple support of Lucene's MoreLikeThis class.
 The MoreLikeThisHelper code is heavily based on (hmm..."lifted from" might be 
 more appropriate ;-) Erik Hatcher's example mentioned in 
 http://www.mail-archive.com/[EMAIL PROTECTED]/msg00878.html
 To use it, add at least the following parameters to a standard or dismax 
 query:
   mlt=true
   mlt.fl=list,of,fields,which,define,similarity
 See the MoreLikeThisHelper source code for more parameters.
 Here are two URLs that work with the example config, after loading all 
 documents found in exampledocs in the index (just to show that it seems to 
 work - of course you need a larger corpus to make it interesting):
 http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=standard&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score
 http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=dismax&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score
 Results are added to the output like this:
 <response>
   ...
   <lst name="moreLikeThis">
     <result name="UTF8TEST" numFound="1" start="0" maxScore="1.5293242">
       <doc>
         <float name="score">1.5293242</float>
         <str name="id">SOLR1000</str>
       </doc>
     </result>
     <result name="SOLR1000" numFound="1" start="0" maxScore="1.5293242">
       <doc>
         <float name="score">1.5293242</float>
         <str name="id">UTF8TEST</str>
       </doc>
     </result>
   </lst>
 I haven't tested this extensively yet, but will do so in the next few days. 
 Comments are welcome, of course.
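
 A client could read the moreLikeThis section of such a response roughly as follows. This is a sketch only: the function name `parse_more_like_this` is hypothetical, and the sample document assumes the response is well-formed XML shaped like the excerpt above, with a closing response tag added so that it parses.

```python
import xml.etree.ElementTree as ET

# Sample shaped like the excerpt above; the closing </response> is assumed.
SAMPLE = """<response>
  <lst name="moreLikeThis">
    <result name="UTF8TEST" numFound="1" start="0" maxScore="1.5293242">
      <doc>
        <float name="score">1.5293242</float>
        <str name="id">SOLR1000</str>
      </doc>
    </result>
    <result name="SOLR1000" numFound="1" start="0" maxScore="1.5293242">
      <doc>
        <float name="score">1.5293242</float>
        <str name="id">UTF8TEST</str>
      </doc>
    </result>
  </lst>
</response>"""


def parse_more_like_this(xml_text):
    """Map each source document id to a list of (similar id, score) pairs."""
    root = ET.fromstring(xml_text)
    similar = {}
    for lst in root.iter("lst"):
        if lst.get("name") != "moreLikeThis":
            continue
        for result in lst.findall("result"):
            hits = []
            for doc in result.findall("doc"):
                doc_id = doc.findtext("str[@name='id']")
                score = doc.findtext("float[@name='score']")
                hits.append((doc_id, float(score) if score is not None else None))
            similar[result.get("name")] = hits
    return similar


for source_id, hits in parse_more_like_this(SAMPLE).items():
    print(source_id, hits)
```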




[jira] Commented: (SOLR-242) tr parameter implies XSL, no wt=xslt necessary

2007-05-19 Thread Brian Whitman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12497179
 ] 

Brian Whitman commented on SOLR-242:


OK, I guess I am over-complicating by trying to simplify. XSLT to me seems like 
something that happens at the very end of everything, and the response writers 
seem to be a different class of things.

But if tr is only used by xslt, shouldn't it be xslt.tr (to follow the 
standard, like json.nl?) 

Also, if wt=xslt is set but no tr=, what happens? 




 tr parameter implies XSL, no wt=xslt necessary
 --

 Key: SOLR-242
 URL: https://issues.apache.org/jira/browse/SOLR-242
 Project: Solr
  Issue Type: Improvement
Reporter: Brian Whitman
Priority: Trivial

 Perhaps the most trivial issue ever, but tr=file.xsl should imply that the 
 XML from whichever response writer is being used gets parsed by the given 
 transform. The wt=xslt is somewhat redundant. And maybe change the tr 
 parameter to xslt. 
 Imagine in the future there's a response writer that outputs a different kind 
 of XML. That shouldn't preclude the use of a transform on top of that 
 response. 




[jira] Resolved: (SOLR-221) faceting memory and performance improvement

2007-05-19 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-221.
---

Resolution: Fixed

 faceting memory and performance improvement
 ---

 Key: SOLR-221
 URL: https://issues.apache.org/jira/browse/SOLR-221
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Assigned To: Yonik Seeley
 Attachments: facet.patch, facet.patch


 1) compare minimum count currently needed to the term df and avoid 
 unnecessary intersection count
 2) set a minimum term df in order to use the filterCache, otherwise iterate 
 over TermDocs
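
 The two optimizations can be sketched as below. This is an illustration under assumptions, not the committed code: `top_facets`, `min_df_for_cache` and the dict standing in for the filterCache are hypothetical names, postings are plain lists of document ids, and the base result set is a Python set.

```python
import heapq


def top_facets(term_postings, base_docs, limit, min_df_for_cache=2, filter_cache=None):
    """Return the top `limit` (count, term) pairs for the documents in base_docs.

    term_postings: dict mapping term -> list of doc ids (its length is the df)
    base_docs:     set of doc ids matching the main query
    filter_cache:  dict standing in for a cache of per-term doc sets
    """
    if limit <= 0:
        return []
    filter_cache = {} if filter_cache is None else filter_cache
    heap = []  # min-heap of (count, term); heap[0] is the count to beat

    for term, postings in term_postings.items():
        df = len(postings)
        # 1) The intersection count can never exceed df, so skip the term
        #    if even its df cannot beat the current minimum.
        if len(heap) == limit and df <= heap[0][0]:
            continue
        if df >= min_df_for_cache:
            # 2a) Frequent term: use (and populate) the cached doc set.
            if term not in filter_cache:
                filter_cache[term] = frozenset(postings)
            count = len(filter_cache[term] & base_docs)
        else:
            # 2b) Rare term: walk its postings instead of caching a set.
            count = sum(1 for doc in postings if doc in base_docs)
        if count == 0:
            continue
        if len(heap) < limit:
            heapq.heappush(heap, (count, term))
        elif count > heap[0][0]:
            heapq.heapreplace(heap, (count, term))

    return sorted(heap, key=lambda entry: (-entry[0], entry[1]))


postings = {"a": [1, 2, 3, 4], "b": [2, 3], "c": [5], "d": [1, 2, 3]}
print(top_facets(postings, {1, 2, 3}, limit=2))  # [(3, 'a'), (3, 'd')]
```

 In this example the term c is never intersected: once two terms are in the heap, its df of 1 cannot beat the current minimum count, which is the saving described in point 1.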
