Depending on what you use the field for, you can use BinaryDocValuesField, which stores an arbitrary byte[] per document, so you can encode the array however you want and keep the original element order. But how are you using these fields later at search time?
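To make that concrete, here is a minimal sketch of one possible encoding for the numeric case: a fixed-width layout with 4 bytes per element, written in the original order. The class and method names (OrderedArrayCodec, encode, decode, get) are made up for illustration and are not part of Lucene, and the sketch uses only java.nio so it runs on its own.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper (not a Lucene class): packs an int[] into a byte[]
// in its original order, using 4 big-endian bytes per element.
public class OrderedArrayCodec {

    /** Encodes the values in their original order, duplicates included. */
    public static byte[] encode(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES * values.length);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    /** Decodes the whole array from bytes[offset, offset + length). */
    public static int[] decode(byte[] bytes, int offset, int length) {
        ByteBuffer buf = ByteBuffer.wrap(bytes, offset, length);
        int[] values = new int[length / Integer.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }

    /** Reads a single element by subscript without decoding the rest. */
    public static int get(byte[] bytes, int offset, int length, int index) {
        int count = length / Integer.BYTES;
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + count);
        }
        // Absolute read: the position is computed from the fixed element width.
        return ByteBuffer.wrap(bytes).getInt(offset + Integer.BYTES * index);
    }

    public static void main(String[] args) {
        byte[] encoded = encode(new int[] {3, 1, 1, 2});
        System.out.println(Arrays.toString(decode(encoded, 0, encoded.length))); // [3, 1, 1, 2]
        System.out.println(get(encoded, 0, encoded.length, 0)); // 3
    }
}
```

At index time the encoded bytes would be wrapped in a BytesRef and added as new BinaryDocValuesField("field1", new BytesRef(encoded)). At search time, the BytesRef returned by BinaryDocValues.binaryValue() exposes bytes, offset and length, which is why the decode and get methods above take those three arguments. A string array would need a different layout (for example, length-prefixed UTF-8), which is not shown here.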
On Tue, Jun 28, 2022 at 3:46 PM linfeng lu <linfeng...@hotmail.com> wrote:
> Hi~
>
> We are trying to build an OLAP database based on lucene, and we heavily
> use lucene's *DocValues* (as our column store).
>
> *We try to use DocValues to store the array type field.* For example, if
> we want to store the *field1* and *field2* in this json document into
> *DocValues* respectively, SORTED_NUMERIC and SORTED_SET seem to be our
> only option.
>
> {
>   "field1": [ 3, 1, 1, 2 ],
>   "field2": [ "c", "a", "a", "b" ]
> }
>
> When we store *field1* in SORTED_NUMERIC and *field2* in SORTED_SET, we
> will get this result:
>
> field1:
>
> - origin: [3, 1, 1, 2]
> - in SORTED_NUMERIC: [1, 1, 2, 3]
>
> field2:
>
> - origin: ["c", "a", "a", "b"]
> - in SORTED_SET: ords [0, 1, 2] terms ["a", "b", "c"]
>
> The original ordering relationship of the elements in the array is lost.
>
> We're guessing that lucene's DocValues are designed primarily for sorting
> and aggregation, so the original order of elements may not matter.
>
> But in our usage scene, it is important to keep the original order of the
> elements in the array (we allow user to access the elements in the array
> using the subscript operator).
>
> We wonder if lucene has plans to add new types of DocValues that can store
> arrays and keep the original order of elements in the array?
>
> Thanks!