[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495334 ]

Ryan McKinley commented on SOLR-236:

This looks good. Someone with better Lucene chops should look at the IndexSearcher getDocListAndSet part...

A few comments/questions about the interface:

If you apply all the example docs and hit:
http://localhost:8983/solr/select/?q=*:*&collapse=true
you get a 500. We should use:
params.required().get( "collapse.field" )
to have a nicer error:

With:
http://localhost:8983/solr/select/?q=*:*&collapse=true&collapse.field=manu&collapse.max=1
the collapse info at the bottom says: 3 5 9
What does that mean? How would you use it? How does it relate to the

> Field collapsing
>
>
> Key: SOLR-236
> URL: https://issues.apache.org/jira/browse/SOLR-236
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.2
> Reporter: Emmanuel Keller
> Attachments: collapse_field.patch, collapse_field.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given
> field to a single entry in the result set. Site collapsing is a special case
> of this, where all results for a given web site is collapsed into one or two
> entries in the result set, typically with an associated "more documents from
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse" set to true to enable collapsing.
> "collapse.field" to choose the field used to group results
> "collapse.max" to select how many continuous results are allowed before
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
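The suggestion to use params.required().get( "collapse.field" ) amounts to failing early with a descriptive client error when a mandatory parameter is absent. The sketch below illustrates that idea only; it does not use Solr classes. The names RequiredParamDemo, BadRequestException and requiredGet, the plain Map standing in for the request parameters, and the 400 status code are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a request handler's parameter check; not Solr code. */
class RequiredParamDemo {

    /** Carries an HTTP-style status so a handler could answer 400 instead of 500. */
    static class BadRequestException extends RuntimeException {
        final int code;

        BadRequestException(String message) {
            super(message);
            this.code = 400; // assumed status for a client-side mistake
        }
    }

    /** Returns the value for name, or fails with a descriptive error if it is absent. */
    static String requiredGet(Map<String, String> params, String name) {
        String value = params.get(name);
        if (value == null) {
            throw new BadRequestException("Missing required parameter: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // Mirrors the failing request: collapse=true without collapse.field.
        Map<String, String> params = new HashMap<>();
        params.put("q", "*:*");
        params.put("collapse", "true");
        try {
            requiredGet(params, "collapse.field");
        } catch (BadRequestException e) {
            System.out.println(e.code + ": " + e.getMessage());
        }
    }
}
```

The point of the pattern is that the caller learns which parameter is missing, rather than seeing a generic server error.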
[jira] Commented: (SOLR-237) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495332 ]

Ryan McKinley commented on SOLR-237:

Looks like *I* missed something... yes, SOLR-236 applies to trunk fine. I didn't notice it because I was looking at this issue.

Since any further development/integration should happen on SOLR-236, I think we should close this issue and mark it as a duplicate.

I'll put my substantive comments on SOLR-236...

> Field collapsing
>
>
> Key: SOLR-237
> URL: https://issues.apache.org/jira/browse/SOLR-237
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.1.0
> Reporter: Emmanuel Keller
> Attachments: field_collapsing-1.1.patch, field_collapsing_1.1.0.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given
> field to a single entry in the result set. Site collapsing is a special case
> of this, where all results for a given web site is collapsed into one or two
> entries in the result set, typically with an associated "more documents from
> this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse" set to true to enable collapsing.
> "collapse.field" to choose the field used to group results
> "collapse.max" to select how many continuous results are allowed before
> collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
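The "collapse.max" description quoted above can be read as a cap on runs of adjacent results that share the same value in the collapse field. The sketch below shows that reading on a plain ranked list. It is not taken from the patch, which may work differently; the class CollapseSketch, the Hit type and the method collapseAdjacent are hypothetical names, and treating hits with no field value as never collapsible is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of collapsing adjacent results; not the patch's code. */
class CollapseSketch {

    /** A search hit reduced to its id and the value of the collapse field. */
    static class Hit {
        final int id;
        final String fieldValue;

        Hit(int id, String fieldValue) {
            this.id = id;
            this.fieldValue = fieldValue;
        }
    }

    /**
     * Keeps at most max consecutive hits that share the same field value and
     * drops the remainder of each run. Hits with a null value never extend a run.
     * Assumes max >= 1 and that ranked is already in result order.
     */
    static List<Hit> collapseAdjacent(List<Hit> ranked, int max) {
        List<Hit> kept = new ArrayList<>();
        String previous = null;
        int runLength = 0;
        for (Hit hit : ranked) {
            if (hit.fieldValue != null && hit.fieldValue.equals(previous)) {
                runLength++;   // same value as the hit just before it
            } else {
                runLength = 1; // a new run starts here
            }
            previous = hit.fieldValue;
            if (runLength <= max) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> ranked = Arrays.asList(
                new Hit(1, "A"), new Hit(2, "A"), new Hit(3, "A"),
                new Hit(4, "B"), new Hit(5, "A"));
        for (Hit hit : collapseAdjacent(ranked, 1)) {
            System.out.println(hit.id + " " + hit.fieldValue);
        }
    }
}
```

Under this reading, a value that reappears later in the ranking (hit 5 in the example) starts a new run and is kept, because only adjacent results are compared.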
Re: [jira] Commented: (SOLR-234) TrimFilter should update the start and end offsets
: > Incidentally, PatternTokenizerFactory seems to have the annoying limitation
: > of assuming there is a token prior to each match -- even if the match
: > explicitly matches on the start of the string (so it creates a 0 width
: > token) ... that seems like a bug right?

: how would you change it? I don't know regex well enough to see the
: limitation. My only criteria was that the output is the same as if you
: send it to string.split( pattern );

Ahhh yes i see ... if you are trying to mimic String.split (or Pattern.split) then the current behavior is correct.

my thinking was that if you were trying to use this to tokenize on whitespace (or something like that) and your input was " aaa bbb ccc " ... this would create 4 tokens: a zero width token, followed by tokens for aaa, bbb, and ccc ... but that first token seemed like a mistake to me (or if it's not a mistake, then it seemed like there should also be a zero width token at the end after the last space too) ... but that's the way string splitting works too.

-Hoss
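The asymmetry described above (an empty token at the start but none at the end) can be reproduced with java.util.regex.Pattern.split directly. The sketch below is a minimal check using the input from the message; the class and method names are made up for illustration, and the whitespace pattern \s+ is an assumption about what "tokenize on whitespace" would look like.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

/** Shows how Pattern.split treats leading and trailing delimiters. */
class SplitEdgeCases {

    /** Splits on runs of whitespace, exactly as Pattern.split does by default. */
    static String[] splitOnWhitespace(String input) {
        return Pattern.compile("\\s+").split(input);
    }

    public static void main(String[] args) {
        String[] tokens = splitOnWhitespace(" aaa bbb ccc ");
        // The leading delimiter yields an empty first token; trailing empty
        // strings are discarded by the default (limit 0) split.
        System.out.println(tokens.length + " tokens: " + Arrays.toString(tokens));
    }
}
```

With the default limit, the delimiter at the start of the string produces an empty leading token, while trailing empty strings are removed, so the input yields four tokens rather than three or five.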
[jira] Commented: (SOLR-237) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495292 ]

Emmanuel Keller commented on SOLR-237:

I provided two patches. The first was made off current trunk (see SOLR-236), and this one (SOLR-237) is for release 1.1.
Is that correct? Or did I miss something?
Re: [jira] Commented: (SOLR-234) TrimFilter should update the start and end offsets
Chris Hostetter wrote:
> : After 1/2 hour of regex hacking... I think I'll stick with a two step
> : process: split then trim ;)
>
> But regex hacking is FUN!! I'm 99% certain this does what you want...

yup! that does it. thanks

> ..if it doesn't send me an example string that it fails on and tell me
> what the desired output is.
>
> Incidentally, PatternTokenizerFactory seems to have the annoying limitation
> of assuming there is a token prior to each match -- even if the match
> explicitly matches on the start of the string (so it creates a 0 width
> token) ... that seems like a bug right?

how would you change it? I don't know regex well enough to see the limitation. My only criteria was that the output is the same as if you send it to string.split( pattern );

thanks again
ryan
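For reference, the two-step "split then trim" process mentioned at the top of this message could look like the sketch below. The delimiter is not stated in the thread, so the comma used in the example is an assumption, as are the class and method names. The single-regex alternative offered in the thread is not reproduced here.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

/** Hypothetical two-step tokenizing: split on a literal delimiter, then trim. */
class SplitThenTrim {

    /** Splits input on the literal delimiter and trims whitespace from each piece. */
    static String[] splitThenTrim(String input, String delimiter) {
        // Pattern.quote keeps the delimiter from being interpreted as a regex.
        String[] parts = input.split(Pattern.quote(delimiter));
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitThenTrim("a , b,c ", ",")));
    }
}
```

Note that trimming after the split changes where each token starts and ends relative to the original input, which is the concern behind the SOLR-234 title about updating start and end offsets.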
[jira] Commented: (SOLR-237) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495285 ]

Ryan McKinley commented on SOLR-237:

can you make a patch off:
http://svn.apache.org/repos/asf/lucene/solr/trunk/

thanks
[jira] Updated: (SOLR-237) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-237:

Attachment: field_collapsing-1.1.patch

Patch from http://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.1
[jira] Commented: (SOLR-237) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495244 ]

Emmanuel Keller commented on SOLR-237:

Ryan,

I used the following svn path:
http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.1.0
Last changed revision: 489774

Should I use this one?
http://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.1
Last changed revision: 488066

Thanks for your reply.

Emmanuel.
Re: [jira] Commented: (SOLR-234) TrimFilter should update the start and end offsets
: After 1/2 hour of regex hacking... I think I'll stick with a two step
: process: split then trim ;)

But regex hacking is FUN!! I'm 99% certain this does what you want...
[jira] Commented: (SOLR-237) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495234 ]

Ryan McKinley commented on SOLR-237:

Thanks for posting this Emmanuel!

I'm having trouble applying the patch... I get an error that says something like "this patch seems outdated!" Did you build it with a recent svn checkout?

thanks
ryan