Hello,
I am trying to retrieve values stored with the StoredField type from inside a
Collector, at the point where its setNextReader(AtomicReaderContext) method is
called.
I used the following FieldCache method, but it does not return any values:
FieldCache.DEFAULT.getTerms(indexReader, field, false);
Retrieving the values from the document itself during each call to
Collector.collect(int) works fine, but it is much slower than fetching all
terms at once with the method above.
My question:
Is there a way to read binary content with performance similar to the approach
described above, i.e. retrieving the field terms once when the reader is set
in a Collector?
Incidentally, the approach works fine for any stored field that is also
indexed, as in the following code snippet:
final FieldType fieldType = new FieldType();
{
fieldType.setStored(true);
// need to index, otherwise no fast retrieval of terms in collector is possible
fieldType.setIndexed(true);
fieldType.setIndexOptions(IndexOptions.DOCS_ONLY);
fieldType.setTokenized(false);
fieldType.setOmitNorms(true);
fieldType.freeze();
}
// fieldValue is of type String
Field field = new Field(fieldName, fieldValue, fieldType);
But this does not allow me to store binary content (i.e. byte[] values), which
StoredField does support.
The constructor expects field content of type String.
I have tried converting the content into base64-encoded strings, but decoding
those strings back into byte arrays is quite expensive for large indexes.
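To make the base64 detour concrete, here is a minimal sketch of the round trip I mean. It assumes Java 8 or later for java.util.Base64, and the class and method names are hypothetical, chosen only for illustration; it is not Lucene code and not my exact implementation.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not part of Lucene: it only illustrates the base64 detour.
public class Base64RoundTrip {

    // Index time: turn the binary value into a String so it fits a String-valued Field.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Search time: each value read back has to be decoded again into a new byte[].
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] raw = {0, 1, 2, (byte) 0xFF};
        String encoded = encode(raw);
        byte[] restored = decode(encoded);
        System.out.println(encoded + " -> round trip ok: " + Arrays.equals(raw, restored));
    }
}
```

The decode step runs once per value read in the collector, which is where the extra cost comes from on a large index.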
Thanks for your advice.
Best regards,
Josef