[jira] Commented: (SOLR-195) Wildcard/prefix queries not highlighted

2007-06-05 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501790
 ] 

Hoss Man commented on SOLR-195:
---

a follow-up note: as mentioned in the email thread linked to in the issue 
report, one workaround people may want to consider if highlighting is 
important (at the expense of the PrefixFilter optimization) is to force the use 
of a WildcardQuery in what would otherwise be interpreted as a PrefixQuery by 
putting a "?" before the "*"

i.e.: auto?* instead of auto*

(yes, this does require that at least one character follow the prefix)
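
For anyone who wants to apply this workaround on the client side before the query string reaches Solr, here is a rough sketch. The helper name force_wildcard is made up for illustration, it is not part of Solr or Lucene, and it only handles a single bare term with one trailing "*".

```python
def force_wildcard(term: str) -> str:
    """Rewrite a plain prefix term (auto*) into the auto?* form.

    Hypothetical client-side helper: it only touches terms whose single
    wildcard is one trailing "*", and returns everything else unchanged.
    """
    body = term[:-1]
    is_plain_prefix = (
        term.endswith("*")
        and body != ""          # a lone "*" is left alone
        and "*" not in body     # no other wildcards in the term
        and "?" not in body
    )
    return body + "?*" if is_plain_prefix else term
```

With this, force_wildcard("auto*") gives auto?*, while a term that already contains a "?" or has no trailing "*" comes back untouched. Note that the rewritten query no longer matches the bare word "auto", as pointed out above.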

> Wildcard/prefix queries not highlighted
> ---
>
> Key: SOLR-195
> URL: https://issues.apache.org/jira/browse/SOLR-195
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.1.0, 1.2
>Reporter: Mike Klaas
>Priority: Minor
> Fix For: 1.2
>
>
> Possible bug in query rewrite()ing:
> http://www.nabble.com/return-matched-terms---fuzzy-or-wildcard-searches-tf3452757.html#a9640214

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-143) Support for PMD and Clover

2007-06-05 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501725
 ] 

Ryan McKinley commented on SOLR-143:


> 
> Ryan, i haven't looked at your updated patch, but i don't understand your 
> last comment...
> 

There is nothing interesting in the changes, it just does not conflict with the 
recently added init-forrest-entities

The patch also removes your attempts to fix the PMD serious errors (4 months 
later, most of the .java files have conflicts).  We should probably fix them when 
this does get added.


>> Is there a reason to have the -Drun.clover configuration rather than the 
>> target specifying if clover is used or not?
> 
> how would that work exactly?  the run.clover property is what's used to 
> ensure that code is compiled with clover hooks (in the existing "compile" 
> target) and that the clover db is initialized prior to running the unit tests 
> (in the existing "test" target).
> 

magic!

I just looked into other projects I thought did this - they have different 
'compile' tasks for test and dist and require everyone to have clover.  This is 
not appropriate for solr, so the -Drun.clover option is better.


> Support for PMD and Clover
> --
>
> Key: SOLR-143
> URL: https://issues.apache.org/jira/browse/SOLR-143
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>Priority: Minor
> Attachments: pmd-and-clover.diff, SOLR-143-CloverAndPMD.patch
>
>
> had some time on a plane this weekend, so I adapted some of the clover hooks 
> from Java-Lucene to Solr's build.xml and also put in hooks for running PMD (a 
> bug pattern finding tool).
> the PMD hook actually tests the PMD ruleset twice, once warning about any 
> violations, and once failing the build if any serious violations were found 
> ... the goal would be to hook this into the "ant test" target so you can't 
> successfully build if you have any serious rule violations.
> i started with a custom ruleset based on some of the bigger rules from PMD 
> ... the theory being that as we clean up the code base we can add more 
> nit-picky rules if we want to :)
> User is required to provide their own copy of PMD and/or clover in an 
> ANT_LIB. Clover requires (ASF committer) license, PMD is freely available...
> http://pmd.sourceforge.net/




[jira] Commented: (SOLR-143) Support for PMD and Clover

2007-06-05 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501708
 ] 

Hoss Man commented on SOLR-143:
---

Ryan, i haven't looked at your updated patch, but i don't understand your last 
comment...

> Is there a reason to have the -Drun.clover configuration rather than the 
> target specifying if clover is used or not?

how would that work exactly?  the run.clover property is what's used to ensure 
that code is compiled with clover hooks (in the existing "compile" target) and 
that the clover db is initialized prior to running the unit tests (in the 
existing "test" target).




[jira] Resolved: (SOLR-191) Integrate code coverage. Clover / cobertura

2007-06-05 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-191.


Resolution: Duplicate

duplicate of SOLR-143

> Integrate code coverage.  Clover / cobertura
> 
>
> Key: SOLR-191
> URL: https://issues.apache.org/jira/browse/SOLR-191
> Project: Solr
>  Issue Type: Test
>Reporter: Ryan McKinley
> Attachments: lib-cobertura.zip, SOLR-191-cobertura-build.patch
>
>
> Solr needs some sort of code coverage tool for testing.  




[jira] Updated: (SOLR-143) Support for PMD and Clover

2007-06-05 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-143:
---

Attachment: SOLR-143-CloverAndPMD.patch

Updated to apply with trunk -- unlike the original patch, this does not try to 
fix the serious errors (we can do that later)

For anyone trying to run - this does not require that you have PMD or clover, 
it just generates reports if you ask for them (and have it configured)

For anyone trying to run, these are the command lines:
 ant clean
 ant test -Drun.clover=true
 ant clover-reports -Drun.clover=true
 ant pmd-reports 

Is there a reason to have the -Drun.clover configuration rather than the target 
specifying if clover is used or not?




Re: [VOTE] release rc4 as Solr 1.2

2007-06-05 Thread Doug Cutting

+1 This looks good to me.

Doug

Yonik Seeley wrote:

> OK, this one is it!
> Please vote to release the artifacts at
> http://people.apache.org/~yonik/staging_area/solr/1.2rc4/
> as Apache Solr 1.2
>
> +1
>
> -Yonik


broken link on homepage

2007-06-05 Thread Ryan McKinley
I know this was mentioned recently... but it seems important to fix pre 
1.2 official release.


The *"Faceted Searching With Apache Solr"* is a big bold link on the 
front page that links to 404.


If we are uncomfortable updating the link to an existing file (so as not 
to "edit history") we should at least remove the link.  Linking to a 404 
on the front page is a bad sign to anyone starting to check out solr.


ryan




Re: [VOTE] release rc4 as Solr 1.2

2007-06-05 Thread Sami Siren
Yonik Seeley wrote:
> OK, this one is it!
> Please vote to release the artifacts at
> http://people.apache.org/~yonik/staging_area/solr/1.2rc4/
> as Apache Solr 1.2

+1

Checked out the .tgz package, followed basic tutorial steps successfully
on fc5.

-- 
 Sami Siren


[jira] Commented: (SOLR-236) Field collapsing

2007-06-05 Thread Emmanuel Keller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501583
 ] 

Emmanuel Keller commented on SOLR-236:
--

Correct, except that the collapse result is only used as a filter on the final 
result, to hide collapsed documents.

P.S.: Sorry if my answers are a little short, I am not perfectly fluent in 
English.

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.2
>Reporter: Emmanuel Keller
> Attachments: field_collapsing_1.1.0.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version (1.2)
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)




[jira] Commented: (SOLR-236) Field collapsing

2007-06-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501582
 ] 

Yonik Seeley commented on SOLR-236:
---

Oh I see... the modified sort is *just* to build the filter.
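
To make that concrete, here is a rough sketch of how I understand the approach: all matching docs are sorted with the collapse field as the primary key, runs longer than collapse.max are marked as hidden, and the resulting set is applied as a filter to the results in their original order. This is not the patch's code. The function names, the dict-based documents, and the reading of collapse.max as "number of docs kept per group" are assumptions for illustration.

```python
def build_collapse_filter(docs, field, collapse_max=1):
    """Return the ids of documents to hide after collapsing on `field`.

    `docs` must hold *all* matching documents in their original sort
    order, which is why this approach is expensive on large result sets.
    """
    # Python's sort is stable, so within one field value the original
    # order is kept (the original sort acts as the secondary key).
    by_field = sorted(docs, key=lambda d: d[field])
    hidden = set()
    run_value, run_length = object(), 0  # sentinel never equals a field value
    for doc in by_field:
        if doc[field] == run_value:
            run_length += 1
        else:
            run_value, run_length = doc[field], 1
        if run_length > collapse_max:
            hidden.add(doc["id"])
    return hidden


def apply_collapse_filter(docs, hidden):
    """Drop hidden documents while keeping the original result order."""
    return [d for d in docs if d["id"] not in hidden]
```

The sort only feeds the filter, so the order the user asked for is untouched. The cost is in build_collapse_filter, which has to see every match.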

The building-the-filter part is a problem though... asking for *all* matching 
docs in sorted order isn't that scalable.
If we get the interface right though, more efficient implementations can follow.
For that reason, it might be good for implementation details like 
"collapseCache" to be private.




[jira] Commented: (SOLR-236) Field collapsing

2007-06-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501579
 ] 

Yonik Seeley commented on SOLR-236:
---

> The sorting is not modified. I copy the current sort to do a new search. 

Perhaps if you outlined the algorithm you use, it would clear up some things.

It looks like you make a copy of the Sort and insert a primary sort on the 
field to be collapsed, and then process the same way as you would for the 
"ADJACENT" option.  If the original sort was by relevance, this doesn't give 
you the groups sorted by relevance, right?





[jira] Commented: (SOLR-236) Field collapsing

2007-06-05 Thread Emmanuel Keller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501578
 ] 

Emmanuel Keller commented on SOLR-236:
--

Adjacent collapsing is useful because it preserves the pertinence of the sort.
The sorting is not modified. I copy the current sort to do a new search.

I am currently working on taking care of type field (int).
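
As an illustration of the adjacent mode mentioned above, here is a small sketch in Python. It is not taken from the patch. The function name, the dict-based documents, and the meaning given to collapse_max (how many consecutive results with the same value are kept) are assumptions.

```python
def collapse_adjacent(docs, field, collapse_max=1):
    """Collapse runs of consecutive results that share the same field value.

    The input order is never changed: only documents that extend a run
    beyond `collapse_max` entries are dropped.
    """
    kept = []
    run_value, run_length = object(), 0  # sentinel never equals a field value
    for doc in docs:
        if doc[field] == run_value:
            run_length += 1
        else:
            run_value, run_length = doc[field], 1
        if run_length <= collapse_max:
            kept.append(doc)
    return kept
```

Because only neighbouring results are compared, a value that shows up again later in the list starts a new run and is kept, so the ranking of the remaining results is exactly the ranking of the original search.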




[jira] Commented: (SOLR-236) Field collapsing

2007-06-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501557
 ] 

Yonik Seeley commented on SOLR-236:
---

Will Johnson brings up other use-cases:
[...] 
> it's also heavily used in
> ecommerce settings.  Check out BestBuy.com/circuitcity/etc and do a
> search for some really generic word like 'cable' and notice all the
> groups of items; BB shows 3 per group, CC shows 1 per group.  In each
> case it's not clear that the number of docs is really limited at all, ie
> it's more important to get back all the categories with n docs per
> category and the counts per category than it is to get back a fixed
> number of results or even categories for that matter.  Also notice that
> neither of these sites allow you to page through the categorized
> results.

Some of this seems very closely related to faceted search, and much of it could 
be implemented that way now on the client side, but it would take multiple 
queries to do so.

One could also think about supporting multi-valued fields in the same manner 
that faceting does.
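
A rough sketch of the "n docs per category plus counts per category" idea from the use case above, done on the client side over an already ranked result list. The function name and the dict-based documents are made up for illustration, and a real client would likely need several Solr queries to gather the input.

```python
from collections import Counter, defaultdict


def top_n_per_category(docs, field, n):
    """Return (top docs per category, total count per category).

    `docs` is assumed to be in ranked order already, so the first `n`
    documents seen for a category are its best `n`.
    """
    counts = Counter()
    top = defaultdict(list)
    for doc in docs:
        category = doc[field]
        counts[category] += 1
        if len(top[category]) < n:
            top[category].append(doc["id"])
    return dict(top), dict(counts)
```

The counts are what faceting already provides. The per-category top lists are the part that needs either collapsing support or one extra query per category.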





Re: [jira] Commented: Field collapsing

2007-06-05 Thread Yonik Seeley

On 6/5/07, Will Johnson <[EMAIL PROTECTED]> wrote:

> I haven't looked at any of the patches but I can comment on some other uses
> for the feature that are in production today with major vendors.  While
> it's used for site collapsing ala google it's also heavily used in
> ecommerce settings.  Check out BestBuy.com/circuitcity/etc and do a
> search for some really generic word like 'cable' and notice all the
> groups of items; BB shows 3 per group, CC shows 1 per group.  In each
> case it's not clear that the number of docs is really limited at all, ie
> it's more important to get back all the categories with n docs per
> category and the counts per category than it is to get back a fixed
> number of results or even categories for that matter.  Also notice that
> neither of these sites allow you to page through the categorized
> results.


Thanks for the use cases... I've added some of this info to the bug to
keep track of it more easily.


> I'd also point out that many vendors require the collapsing field to be
> an int instead of a string and then force the front end to do the
> mapping.


Now that just seems silly... must be lucene is just a superior
technology to build a search platform from ;-)

-Yonik


RE: [jira] Commented: (SOLR-236) Field collapsing

2007-06-05 Thread Will Johnson
I haven't looked at any of the patches but I can comment on some other uses
for the feature that are in production today with major vendors.  While
it's used for site collapsing ala google it's also heavily used in
ecommerce settings.  Check out BestBuy.com/circuitcity/etc and do a
search for some really generic word like 'cable' and notice all the
groups of items; BB shows 3 per group, CC shows 1 per group.  In each
case it's not clear that the number of docs is really limited at all, ie
it's more important to get back all the categories with n docs per
category and the counts per category than it is to get back a fixed
number of results or even categories for that matter.  Also notice that
neither of these sites allow you to page through the categorized
results.

I'd also point out that many vendors require the collapsing field to be
an int instead of a string and then force the front end to do the
mapping.  just one more thing to consider

- will

 

-Original Message-
From: Yonik Seeley (JIRA) [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 05, 2007 9:01 AM
To: solr-dev@lucene.apache.org
Subject: [jira] Commented: (SOLR-236) Field collapsing


[
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501550
 ] 

Yonik Seeley commented on SOLR-236:
---

I guess adjacent collapsing can make sense when one is sorting by the
field that is being collapsed.

For the normal collapsing though, this patch appears to implement it by
changing the sort order to the collapsing field (normally not desired).
For example, if sorting by relevance and collapsing on a field, one
would normally want the groups sorted by relevance (with the group
relevance defined as the max score of its members).

As far as how to do paging, it makes sense to rigidly define it in terms
of number of documents, regardless of how many documents are in each
group.  Going back to google, it always displays the first 10 documents,
but a variable number of groups.   That does mean that a group could be
split across pages.  It would actually be much simpler (IMO) to always
return a fixed number of groups rather than a fixed number of documents,
but I don't think this would be less useful to people.  Thoughts?




[jira] Commented: (SOLR-236) Field collapsing

2007-06-05 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501550
 ] 

Yonik Seeley commented on SOLR-236:
---

I guess adjacent collapsing can make sense when one is sorting by the field 
that is being collapsed.

For the normal collapsing though, this patch appears to implement it by 
changing the sort order to the collapsing field (normally not desired).  For 
example, if sorting by relevance and collapsing on a field, one would normally 
want the groups sorted by relevance (with the group relevance defined as the 
max score of its members).

As far as how to do paging, it makes sense to rigidly define it in terms of 
number of documents, regardless of how many documents are in each group.  Going 
back to google, it always displays the first 10 documents, but a variable 
number of groups.   That does mean that a group could be split across pages.  
It would actually be much simpler (IMO) to always return a fixed number of 
groups rather than a fixed number of documents, but I don't think this would be 
less useful to people.  Thoughts?
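
To illustrate the group ordering described above (group relevance defined as the max score of its members) together with the simpler "fixed number of groups per page" variant, here is a sketch. It is not the patch's implementation. The function names and the dict-based documents with a "score" key are assumptions.

```python
def group_by_relevance(docs, field):
    """Group docs on `field` and order the groups by their best score."""
    groups = {}
    for doc in docs:
        groups.setdefault(doc[field], []).append(doc)
    for members in groups.values():
        members.sort(key=lambda d: d["score"], reverse=True)
    # After the sort above, members[0] carries the group's max score.
    return sorted(groups.items(),
                  key=lambda item: item[1][0]["score"],
                  reverse=True)


def page_of_groups(ranked_groups, start, rows):
    """Page by number of groups, regardless of how many docs each holds."""
    return ranked_groups[start:start + rows]
```

Paging by documents instead would mean flattening the ranked groups and slicing the flat list, which is where a group can end up split across two pages.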




[jira] Updated: (SOLR-236) Field collapsing

2007-06-05 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated SOLR-236:
-

Attachment: (was: SOLR-236-FieldCollapsing.patch)




[jira] Updated: (SOLR-236) Field collapsing

2007-06-05 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated SOLR-236:
-

Attachment: (was: field_collapsing.patch)




[jira] Updated: (SOLR-236) Field collapsing

2007-06-05 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated SOLR-236:
-

Attachment: (was: collapse_field.patch)

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.2
>Reporter: Emmanuel Keller
> Attachments: field_collapsing_1.1.0.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version (1.2)
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling corrections are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-236) Field collapsing

2007-06-05 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated SOLR-236:
-

Attachment: SOLR-236-FieldCollapsing.patch

Sorry, my last post was buggy. Here is the correct one. The exception no 
longer occurs.
About tokens: if any token matches within the field, the documents will 
collapse. When I started implementing collapsing, my need was to group 
documents having exactly identical field values.

I believe that faceting has identical behavior. Look at "Graphic card" as an 
example:
http://localhost:8983/solr/select/?q=cat:graphic%20card&version=2.2&start=0&rows=10&indent=on&facet=true&facet.field=cat

I will try to maintain the wiki page.
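
For comparison with the faceting URL above, a request using the collapse parameters might be built as in the sketch below. The parameter names come from the issue description; the helper name collapse_url, the choice of "cat" as the collapse field, and the host and port (copied from the example URL) are assumptions, and the request would only mean anything to a Solr instance with the patch applied.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def collapse_url(base, query, field, collapse_type="normal",
                 collapse_max=1, rows=10):
    """Build a select URL carrying the three collapse.* parameters
    named in the issue description (hypothetical helper)."""
    params = {
        "q": query,
        "start": 0,
        "rows": rows,
        "indent": "on",
        "collapse.field": field,        # field used to group results
        "collapse.type": collapse_type,  # "normal" or "adjacent"
        "collapse.max": collapse_max,    # results allowed before collapsing
    }
    return base + "?" + urlencode(params)


url = collapse_url("http://localhost:8983/solr/select/",
                   "cat:graphic card", "cat")
decoded = parse_qs(urlsplit(url).query)
```

urlencode takes care of escaping the space and colon in the query, so the raw query string can be passed in unescaped.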

> Field collapsing
> 
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.2
>Reporter: Emmanuel Keller
> Attachments: collapse_field.patch, collapse_field.patch, 
> field_collapsing.patch, field_collapsing.patch, field_collapsing.patch, 
> field_collapsing_1.1.0.patch, SOLR-236-FieldCollapsing.patch, 
> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given 
> field to a single entry in the result set. Site collapsing is a special case 
> of this, where all results for a given web site is collapsed into one or two 
> entries in the result set, typically with an associated "more documents from 
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before 
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version (1.2)
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling corrections are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.