Thanks Michael

On Tue, Nov 28, 2023 at 11:45 PM Michael Froh <msf...@gmail.com> wrote:
> Oh -- of course if you're using IntPoint / LongPoint for your numeric
> fields, they won't be indexed as terms, so loading terms for them won't
> work.
>
> It's not the prettiest solution, but I think the following should let you
> collect the set of distinct point values for an IntPoint field:
>
>
> final Set<Integer> collectedValues = new TreeSet<>();
> for (LeafReaderContext lrc : reader.leaves()) {
>     LeafReader lr = lrc.reader();
>     PointValues.IntersectVisitor collectingVisitor = new PointValues.IntersectVisitor() {
>         @Override
>         public void visit(int docID) throws IOException {
>
>         }
>
>         @Override
>         public void visit(int docID, byte[] packedValue) {
>             collectedValues.add(IntPoint.decodeDimension(packedValue, 0));
>         }
>
>         @Override
>         public PointValues.Relation compare(byte[] minPackedValue, byte[] maxPackedValue) {
>             return PointValues.Relation.CELL_CROSSES_QUERY;
>         }
>     };
>
>     lr.getPointValues(fieldname).intersect(collectingVisitor);
> }
>
>
>
> On Tue, Nov 28, 2023 at 1:42 PM Michael Froh <msf...@gmail.com> wrote:
>
> > Hello!
> >
> > Instead of MultiFields.getFields(), you can use
> > MultiTerms.getTerms(reader, fieldname) to get the Terms instance.
> >
> > To decode your long / int values, you should be able to use
> > LongPoint/IntPoint.unpack to write the values into an array:
> >
> > long[] val = new long[1]; // Assuming 1-D values
> > LongPoint.unpack(value, 0, val);
> > values.add(val[0]);
> >
> > Hope that helps,
> > Froh
> >
> >
> > On Wed, Nov 22, 2023 at 11:09 AM <vvse...@gmail.com> wrote:
> >
> >> Hello,
> >>
> >> In Lucene 6 I was doing this to get all values for a given field
> >> knowing its type:
> >>
> >> public List<Object> getDistinctValues(IndexReader reader, String fieldname,
> >>         Class<? extends Object> type) throws IOException {
> >>
> >>     List<Object> values = new ArrayList<Object>();
> >>     Fields fields = MultiFields.getFields(reader);
> >>     if (fields == null) return values;
> >>
> >>     Terms terms = fields.terms(fieldname);
> >>     if (terms == null) return values;
> >>
> >>     TermsEnum iterator = terms.iterator();
> >>
> >>     BytesRef value = iterator.next();
> >>
> >>     while (value != null) {
> >>         if (type == Long.class) {
> >>             values.add(LegacyNumericUtils.prefixCodedToLong(value));
> >>         } else if (type == Integer.class) {
> >>             values.add(LegacyNumericUtils.prefixCodedToInt(value));
> >>         } else if (type == Boolean.class) {
> >>             values.add(LegacyNumericUtils.prefixCodedToInt(value) == 1 ? TRUE : FALSE);
> >>         } else if (type == Date.class) {
> >>             values.add(new Date(LegacyNumericUtils.prefixCodedToLong(value)));
> >>         } else if (type == String.class) {
> >>             values.add(value.utf8ToString());
> >>         } else {
> >>             // ...
> >>         }
> >>
> >>         value = iterator.next();
> >>     }
> >>
> >>     return values;
> >> }
> >>
> >> I am trying to upgrade to lucene 9.
> >> there were 2 changes over time:
> >> - LegacyNumericUtils has been removed in favor of PointBase
> >> - MultiFields.getFields() has been dropped, and I read we were encouraged
> >> to avoid fields in general
> >>
> >> what is proper way to implement getting distinct values for a specific
> >> field in a reader?
> >>
> >> thanks for your help,
> >>
> >> vs