gsmiller commented on a change in pull request #509:
URL: https://github.com/apache/lucene/pull/509#discussion_r774242416



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/FacetsConfig.java
##########
@@ -471,19 +471,24 @@ private void indexDrillDownTerms(
 
   private void processSSDVFacetFields(
       Map<String, List<SortedSetDocValuesFacetField>> byField, Document doc) {
+    Set<String> addedPaths = new HashSet<>();
+
     for (Map.Entry<String, List<SortedSetDocValuesFacetField>> ent : 
byField.entrySet()) {
 
       String indexFieldName = ent.getKey();
 
       for (SortedSetDocValuesFacetField facetField : ent.getValue()) {
-        FacetLabel facetLabel = new FacetLabel(facetField.dim, 
facetField.label);
-        String fullPath = pathToString(facetLabel.components, 
facetLabel.length);
-
-        // For facet counts:
-        doc.add(new SortedSetDocValuesField(indexFieldName, new 
BytesRef(fullPath)));
-
-        // For drill-down:
-        indexDrillDownTerms(doc, indexFieldName, getDimConfig(facetField.dim), 
facetLabel);
+        FacetLabel facetLabel = new FacetLabel(facetField.dim, 
facetField.path);
+        for (int i = 0; i < facetLabel.length; i++) {

Review comment:
       Ooops, I actually meant to go back and remove this comment before 
publishing my review. So let me elaborate.
   
   Seems like the only hard requirement for this indexing is to handle 
multi-valued counting properly, but I can also see that it makes building up 
the child/sibling state much simpler to know that every unique path will have a 
corresponding ordinal in the SSDV field. So I'm supportive of always doing this 
for now (could maybe optimize later).
   
   It does seem that this will "regress" the non-hierarchical and single-valued 
cases though in that it will start indexing all the dims even though it's not 
strictly necessary. It would be nice to not do this indexing where it's not 
needed. At the same time, we don't actually know if current users expect multi- 
vs. single-valued since dim config isn't made available, so I could be 
convinced that we should just do this and assume everything could be 
multi-valued. But maybe there's a clean way to make this configurable like taxo 
facets? Let's think about it a bit more.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to