[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993233#comment-13993233 ]
Hoss Man commented on SOLR-2894:
--------------------------------

I still haven't had a chance to really dig into the implementation details of the patch, but i wanted to spend some time testing things out from a user perspective...

----

One of the first things i noticed, is that the refinement requests seem to be extra verbose. For example, given this user request (using the example data, with a 2 shard cloud setup):

http://localhost:8983/solr/select?q=*:*&sort=id+desc&rows=2&facet=true&facet.pivot=cat,inStock&facet.limit=3&facet.pivot=manu_id_s,inStock

This is what the refinement requests in the logs of each shard looked like...

{noformat}
3434041 [qtp1282186295-19] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/select params={manu_id_s,inStock_8__terms=samsung&facet=true&sort=id+desc&facet.limit=3&manu_id_s,inStock_9__terms=viewsonic&distrib=false&cat,inStock_1__terms=search&wt=javabin&version=2&rows=0&manu_id_s,inStock_6__terms=maxtor&manu_id_s,inStock_7__terms=nor&NOW=1399584452682&shard.url=http://127.0.1.1:8983/solr/collection1/&df=text&cat,inStock_2__terms=software&q=*:*&manu_id_s,inStock_3__terms=canon&manu_id_s,inStock_4__terms=ati&facet.pivot.mincount=-1&isShard=true&cat,inStock_0__terms=hard+drive&facet.pivot={!terms%3D$cat,inStock_0__terms}cat,inStock&facet.pivot={!terms%3D$cat,inStock_1__terms}cat,inStock&facet.pivot={!terms%3D$cat,inStock_2__terms}cat,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_3__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_4__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_5__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_6__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_7__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_8__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_9__terms}manu_id_s,inStock&manu_id_s,inStock_5__terms=eu} hits=14 status=0 QTime=3
3424918 [qtp1282186295-16] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/select params={cat,inStock_10__terms=memory&manu_id_s,inStock_15__terms=dell&facet=true&manu_id_s,inStock_12__terms=apple&sort=id+desc&facet.limit=3&manu_id_s,inStock_13__terms=asus&manu_id_s,inStock_16__terms=uk&distrib=false&wt=javabin&manu_id_s,inStock_14__terms=boa&version=2&rows=0&NOW=1399584452682&shard.url=http://127.0.1.1:7574/solr/collection1/&df=text&manu_id_s,inStock_11__terms=corsair&q=*:*&facet.pivot.mincount=-1&isShard=true&facet.pivot={!terms%3D$cat,inStock_10__terms}cat,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_11__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_12__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_13__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_14__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_15__terms}manu_id_s,inStock&facet.pivot={!terms%3D$manu_id_s,inStock_16__terms}manu_id_s,inStock} hits=18 status=0 QTime=2
{noformat}

Or if we prune that down to just the interesting params (as far as pivot faceting goes)...
{noformat:title=shard1}
facet.pivot.mincount=-1
cat,inStock_0__terms=hard+drive
cat,inStock_1__terms=search
cat,inStock_2__terms=software
manu_id_s,inStock_3__terms=canon
manu_id_s,inStock_4__terms=ati
manu_id_s,inStock_5__terms=eu
manu_id_s,inStock_6__terms=maxtor
manu_id_s,inStock_7__terms=nor
manu_id_s,inStock_8__terms=samsung
manu_id_s,inStock_9__terms=viewsonic
facet.pivot={!terms=$cat,inStock_0__terms}cat,inStock
facet.pivot={!terms=$cat,inStock_1__terms}cat,inStock
facet.pivot={!terms=$cat,inStock_2__terms}cat,inStock
facet.pivot={!terms=$manu_id_s,inStock_3__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_4__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_5__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_6__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_7__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_8__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_9__terms}manu_id_s,inStock
{noformat}

{noformat:title=shard2}
facet.pivot.mincount=-1
cat,inStock_10__terms=memory
manu_id_s,inStock_11__terms=corsair
manu_id_s,inStock_12__terms=apple
manu_id_s,inStock_13__terms=asus
manu_id_s,inStock_14__terms=boa
manu_id_s,inStock_15__terms=dell
manu_id_s,inStock_16__terms=uk
facet.pivot={!terms=$cat,inStock_10__terms}cat,inStock
facet.pivot={!terms=$manu_id_s,inStock_11__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_12__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_13__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_14__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_15__terms}manu_id_s,inStock
facet.pivot={!terms=$manu_id_s,inStock_16__terms}manu_id_s,inStock
{noformat}

I believe that what's going on here is basically:

* top level params are being used for the individual terms that need refined (which is smart, helps eliminate risk of terms needing special escaping with local params)
* the top level param names for these terms that need refined use a per-(user)request "global" counter to ensure that they are unique (+1)
* the top level term param names also include the facet.pivot spec they are needed for -- this seems redundant since the counter is clearly global (even across multiple "facet.pivot" specs)
* these top level term param names are then added only to the shard requests where refinement is actually needed for those terms (+1) and are referenced as variables in facet.pivot commands using the "terms" local param (which the shards evidently look for to know when this is a refinement request)
* because many terms may need refinement, that means each user specified {{facet.pivot=X,Y}} param results in *many* shard params of {{facet.pivot=\{!terms=$N\}X,Y}}

I realize that local params don't play nice with multi-valued params at all, let alone make it easy to use a single variable to refer to a multi-valued param -- But wouldn't it be simpler (and less verbose over the wire) to just ignore Solr's built in param variable dereferencing and instead generate 1 unique param name to use for *all* the terms we care about (for each unique pivot spec), and then refer to that name once in a local param for a _single_ facet.pivot param (which the pivot facet code would then go and explicitly fetch from the top level SolrParams as a multi-value)?

The result being, that instead of the refinement requests shown above, the refinement requests for each shard could be something much simpler like...
{noformat:title=shard1_proposed}
facet.pivot.mincount=-1
_fpt_1=hard+drive
_fpt_1=search
_fpt_1=software
_fpt_2=canon
_fpt_2=ati
_fpt_2=eu
_fpt_2=maxtor
_fpt_2=nor
_fpt_2=samsung
_fpt_2=viewsonic
facet.pivot={!fpt=1}cat,inStock
facet.pivot={!fpt=2}manu_id_s,inStock
{noformat}

{noformat:title=shard2_proposed}
facet.pivot.mincount=-1
_fpt_1=memory
_fpt_2=corsair
_fpt_2=apple
_fpt_2=asus
_fpt_2=boa
_fpt_2=dell
_fpt_2=uk
facet.pivot={!fpt=1}cat,inStock
facet.pivot={!fpt=2}manu_id_s,inStock
{noformat}

(where {{\_fpt\_}} is just a short prefix for "facet pivot terms" that i pulled out of my ass)

----

Another thing I noticed is that with my 2 shard exampledocs setup, the following URL seems to send the pivot faceting into an infinite loop of refinement requests (note the typo: there's a space embedded in a field name {{manu_id_s != manu_+id_s}} )...

http://localhost:8983/solr/select?q=*:*&sort=id+desc&rows=2&facet=true&facet.pivot=cat,manu_+id_s,inStock&facet.limit=3

...not clear what's going on there, but definitely something that needs fixed before committing ("garbage in -> garbage out" is one thing, "garbage in -> crash your cluster" is another)

----

the multi-level refinement is sooooooo sweet.

> Implement distributed pivot faceting
> ------------------------------------
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
> Issue Type: Improvement
> Reporter: Erik Hatcher
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> dateToObject.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports undistributed mode.
> Distributed pivot faceting needs to be implemented.
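
To make the param-naming comparison in the comment above concrete, here is a rough Python sketch of the two schemes. It is not taken from the patch and uses no Solr API: the function names are hypothetical, the "current" builder only mimics what the shard logs appear to show (with the counter restarting at 0, so it matches the shard1 request only), and the "proposed" builder follows the {{_fpt_N}} / {{fpt}} naming suggested in the comment. The last function shows how a shard might fetch the multi-valued terms itself, assuming the request params are available as a list of (name, value) pairs.

```python
import re

def current_refinement_params(pivots):
    """Mimic the scheme seen in the logs: one top-level param and one
    facet.pivot param per term, named with a counter that is global
    across pivot specs. (Hypothetical helper, not code from the patch.)"""
    params = []
    counter = 0
    for spec, terms in pivots.items():
        for term in terms:
            name = f"{spec}_{counter}__terms"
            params.append((name, term))
            params.append(("facet.pivot", f"{{!terms=${name}}}{spec}"))
            counter += 1
    return params

def proposed_refinement_params(pivots):
    """Proposed scheme: one multi-valued _fpt_N param per pivot spec,
    referenced once from a single facet.pivot param."""
    params = []
    for idx, (spec, terms) in enumerate(pivots.items(), start=1):
        for term in terms:
            params.append((f"_fpt_{idx}", term))
        params.append(("facet.pivot", f"{{!fpt={idx}}}{spec}"))
    return params

def terms_for_pivot(facet_pivot_value, params):
    """Shard side: read the fpt local param and collect every value of
    the matching top-level _fpt_N param, skipping variable dereferencing."""
    match = re.fullmatch(r"\{!fpt=(\d+)\}(.*)", facet_pivot_value)
    if match is None:
        return facet_pivot_value, []  # not a refinement request
    idx, spec = match.groups()
    return spec, [value for name, value in params if name == f"_fpt_{idx}"]

# Terms needing refinement on shard1, copied from the pruned params above.
shard1 = {
    "cat,inStock": ["hard drive", "search", "software"],
    "manu_id_s,inStock": ["canon", "ati", "eu", "maxtor", "nor",
                          "samsung", "viewsonic"],
}

current = current_refinement_params(shard1)
proposed = proposed_refinement_params(shard1)
print(len(current), len(proposed))  # 20 12
print(terms_for_pivot("{!fpt=1}cat,inStock", proposed))
```

For the shard1 case this drops the pivot-related param count from 20 to 12, and the saving grows with the number of terms per pivot spec, since the proposed scheme adds one facet.pivot param per spec rather than one per term.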