[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226817#comment-16226817 ]

Gus Heck commented on SOLR-10132:
---------------------------------

I had been meaning to get back to this, but it got lost in the shuffle, thanks for getting it in.

> Support facet.matches to cull facets returned with a regex
> ----------------------------------------------------------
>
> Key: SOLR-10132
> URL: https://issues.apache.org/jira/browse/SOLR-10132
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level. Issues are Public)
> Components: faceting
> Affects Versions: 6.4.1
> Reporter: Gus Heck
> Assignee: Christine Poerschke
> Attachments: SOLR-10132.patch, SOLR-10132.patch, SOLR-10132.patch,
> SOLR-10132.patch, SOLR-10132.patch
>
>
> I recently ran into a case where I really wanted to only return the next
> level of a hierarchical facet, and while I was able to do that with a
> coordinated set of dynamic fields, it occurred to me that this would have
> been much much easier if I could have simply used PathHierarchyTokenizer and
> written
> ="/my/current/prefix/[^/]+$"
> thereby limiting the returned facets to the next level down and not return
> the additional N levels I didn't (yet) want to display (numbering in the
> thousands near the top of the tree). I suspect there are other good use
> cases, and the patch seemed relatively tractable.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
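The use case in the quoted issue description can be illustrated outside Solr with plain java.util.regex. The following is only a sketch under stated assumptions: the class name, the method name, and the sample terms are hypothetical, the terms stand in for the kind of path prefixes a hierarchical field might hold, and full-string matching is assumed because that is what the RegexBytesRefFilter sketch in this thread uses. It is not the committed Solr implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Solr.
final class NextLevelFacetSketch {

  /** Keeps only the terms that the regular expression matches in full. */
  static List<String> cull(List<String> terms, String regex) {
    Pattern compiled = Pattern.compile(regex);
    List<String> kept = new ArrayList<>();
    for (String term : terms) {
      // matches() requires the whole term to match, so deeper levels
      // (which contain a further '/') are rejected by [^/]+
      if (compiled.matcher(term).matches()) {
        kept.add(term);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Assumed sample terms: one entry per level of a path hierarchy
    List<String> terms = Arrays.asList(
        "/my",
        "/my/current",
        "/my/current/prefix",
        "/my/current/prefix/a",
        "/my/current/prefix/a/deep",
        "/my/current/prefix/b");
    System.out.println(cull(terms, "/my/current/prefix/[^/]+$"));
  }
}
```

Only the terms exactly one level below the prefix survive, which is the "next level down" behaviour the description asks for.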
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226767#comment-16226767 ]

ASF subversion and git services commented on SOLR-10132:
--------------------------------------------------------

Commit e6ec82249f909b25f33c2fd8bff75326b87bf115 in lucene-solr's branch refs/heads/branch_7x from [~cpoerschke]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e6ec822 ]

SOLR-10132: A new optional facet.matches parameter to return facet buckets only for terms that match a regular expression. (Gus Heck, Christine Poerschke)
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225481#comment-16225481 ]

ASF subversion and git services commented on SOLR-10132:
--------------------------------------------------------

Commit b8bcaf92465eed8477baf9932bea624b6b7830f8 in lucene-solr's branch refs/heads/master from [~cpoerschke]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b8bcaf9 ]

SOLR-10132: A new optional facet.matches parameter to return facet buckets only for terms that match a regular expression. (Gus Heck, Christine Poerschke)
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192888#comment-16192888 ]

Gus Heck commented on SOLR-10132:
---------------------------------

The only remaining question is whether newExcludeBytesRefFilter was intentionally given protected access, or whether that's just a result of the mood the IDE was in that particular day... If it's to be left in as a protected method, the current patch is good to go (perhaps delete my comment). If there's any reasonable scenario where it might have already been relied upon, of course it stays regardless (for back compatibility), but it's pretty new and pretty deep, so maybe not?
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185771#comment-16185771 ]

Gus Heck commented on SOLR-10132:
---------------------------------

Any thoughts on this?
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169878#comment-16169878 ]

Christine Poerschke commented on SOLR-10132:
--------------------------------------------

Hi Gus, thanks for returning to this!

bq. ... the MATCH_ALL_TERMS idea has missed the boat. ... so I think now returning null as before is the only path forward. ...

I agree.

bq. ... Also the new ascii doc stuff has come in since the last patch here so I probably should add some documentation for this feature too, now that that is something I can do myself :-). ...

Yes please.

bq. ... Should I do the patch vs trunk since it seems I just barely missed the boat for 7?

Yes please. Almost always patches would be against trunk/master and then from there any back porting would be done via cherry-pick to the branches, branch_7x at present.
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168923#comment-16168923 ]

Gus Heck commented on SOLR-10132:
---------------------------------

Finally getting back to this. From the looks of things, the MATCH_ALL_TERMS idea has missed the boat. There are several tests now that break when I try to implement that, and the above statement about "yet part of an official release" has changed too, so I think now returning null as before is the only path forward.

Working on a new patch (that returns null here). Tests are working, including the broken-out test, and I also added javadoc. Also the new ascii doc stuff has come in since the last patch here so I probably should add some documentation for this feature too, now that that is something I can do myself :).

Should I do the patch vs trunk since it seems I just barely missed the boat for 7?
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886319#comment-15886319 ]

Christine Poerschke commented on SOLR-10132:
--------------------------------------------

re: {{null}} vs. {{SimpleFacets.MATCH_ALL_TERMS}} - SOLR-9914 refactored the "contains" check which was a null or non-null String into a null or non-null SubstringBytesRefFilter object _and_ that change isn't yet part of an official release. On that basis, yes, I think MATCH_ALL_TERMS instead of null would make sense.

And for clarity, could I suggest making that change separately from the {{facet.matches}} feature addition here - what do you think?
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878097#comment-15878097 ]

Gus Heck commented on SOLR-10132:
---------------------------------

Hmm, would it be possible to test like this:
{code}
if (termFilter != SimpleFacets.MATCH_ALL) {
  final BytesRef term = si.lookupOrd(startTermIndex+i);
  if (!termFilter.test(term)) {
    continue;
  }
}
{code}
I'm generally not fond of null except when representing an unknown primitive value (boxed of course), which is why I tried to eliminate it. It's not very self-documenting, and retaining it requires either the list + switch you have added or a string of if/else checks as in the original code, among a lot of other reasons. http://www.yegor256.com/2014/05/13/why-null-is-bad.html is more eloquent than I on this...

Perhaps it should be named {{MATCH_ALL_TERMS}} however, to avoid sounding like it has something to do with {{MatchAllDocsQuery()}}
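The sentinel idea floated in the comment above can be sketched in isolation. This is a minimal illustration, assuming plain strings and java.util.function.Predicate in place of Lucene's BytesRef and Solr's filter classes; the class name, the filterTerms helper, and this MATCH_ALL_TERMS constant are hypothetical. Later comments in this thread record that the approach was not adopted and that returning null remained in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; not the Solr code discussed in the thread.
final class TermFilterSentinelSketch {

  /** Hypothetical sentinel meaning "no filtering requested", used instead of null. */
  static final Predicate<String> MATCH_ALL_TERMS = term -> true;

  static List<String> filterTerms(List<String> terms, Predicate<String> termFilter) {
    List<String> kept = new ArrayList<>();
    for (String term : terms) {
      // Identity comparison against the sentinel skips the per-term test
      // entirely when no filter was requested, with no null check needed
      if (termFilter != MATCH_ALL_TERMS && !termFilter.test(term)) {
        continue;
      }
      kept.add(term);
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> terms = Arrays.asList("apple", "banana", "avocado");
    System.out.println(filterTerms(terms, MATCH_ALL_TERMS));
    System.out.println(filterTerms(terms, t -> t.startsWith("a")));
  }
}
```

The trade-off is that every caller must pass the sentinel rather than null, which is why changing it after other code already relied on null is harder.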
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872078#comment-15872078 ]

Gus Heck commented on SOLR-10132:
---------------------------------

Ok I'll use the suggested, somewhat hackish workaround... opened https://issues.apache.org/jira/browse/SOLR-10155 for the review of this check
[jira] [Commented] (SOLR-10132) Support facet.matches to cull facets returned with a regex
[ https://issues.apache.org/jira/browse/SOLR-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870204#comment-15870204 ]

Christine Poerschke commented on SOLR-10132:
--------------------------------------------

bq. ... still need to figure out what to do with the check for numeric facets ...

Something like this might be hacky but should work:
{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexBytesRefFilter extends SubstringBytesRefFilter {
  final private String regex;
  final private Pattern compiled;

  public RegexBytesRefFilter(String regex) {
    super(regex, false);
    this.regex = regex;
    // compile once so each term test reuses the same Pattern
    this.compiled = Pattern.compile(regex);
  }

  @Override
  protected boolean includeString(String term) {
    Matcher m = compiled.matcher(term);
    // matches() requires the whole term to match the regex
    return m.matches();
  }
}
{code}
In SOLR-9914 we were puzzled by the check also, perhaps its removal could be considered (outside the scope of this ticket) i.e. just disallow/throw if any _contains_ (or _matches_ or other string-matching) is used with numeric facets?
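The includeString sketch above calls Matcher.matches(), which in java.util.regex succeeds only when the entire input matches, whereas Matcher.find() succeeds on any matching substring. The small example below shows the difference using only the JDK; the class and method names are hypothetical, and whether the committed Solr filter behaves this way is not stated in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of java.util.regex semantics only.
final class MatchesVersusFindSketch {

  /** True only if the regex matches the whole term. */
  static boolean fullMatch(String regex, String term) {
    return Pattern.compile(regex).matcher(term).matches();
  }

  /** True if the regex matches anywhere inside the term. */
  static boolean partialMatch(String regex, String term) {
    return Pattern.compile(regex).matcher(term).find();
  }

  public static void main(String[] args) {
    String term = "/my/current/prefix/a";
    // An unanchored fragment is found inside the term but does not match it in full
    System.out.println(partialMatch("prefix/[^/]+", term));
    System.out.println(fullMatch("prefix/[^/]+", term));
    // Spelling out the whole path makes the full match succeed
    System.out.println(fullMatch("/my/current/prefix/[^/]+", term));
  }
}
```

With full-match semantics a user-supplied pattern has to describe the complete term, so a trailing `$` as in the issue description is harmless but not required.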