[
https://issues.apache.org/jira/browse/LUCENE-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kranthi updated LUCENE-8558:
----------------------------
Description:
The indexing time for my ~2M documents went up significantly after I started
adding fields of type NumericDocValuesField.
Upon debugging, I found the bottleneck to be in the
PerFieldMergeState#FilterFieldInfos constructor. The contains check in the
code snippet below was the culprit.
{code:java}
this.filteredNames = new HashSet<>(filterFields);
this.filtered = new ArrayList<>(filterFields.size());
for (FieldInfo fi : src) {
  if (filterFields.contains(fi.name)) {
{code}
Changing the check to use the filteredNames HashSet, as below, seems to have fixed my issue:
{code:java}
this.filteredNames = new HashSet<>(filterFields);
this.filtered = new ArrayList<>(filterFields.size());
for (FieldInfo fi : src) {
  if (this.filteredNames.contains(fi.name)) {
{code}
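For illustration, the effect can be reproduced outside Lucene with a small standalone sketch. It assumes filterFields arrives as a List (the report does not say which Collection type is passed), in which case every contains call is a linear scan. FilterSketch and its method names are hypothetical and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the pattern in the report; this is not Lucene code.
final class FilterSketch {

    // Original pattern: contains() is called on whatever collection the caller
    // passed in. If that collection is a List, each lookup scans the whole list.
    static List<String> filterWithCollection(List<String> src, Collection<String> filterFields) {
        List<String> filtered = new ArrayList<>(filterFields.size());
        for (String name : src) {
            if (filterFields.contains(name)) {
                filtered.add(name);
            }
        }
        return filtered;
    }

    // Proposed pattern: copy into a HashSet once, then look up against the copy.
    static List<String> filterWithSet(List<String> src, Collection<String> filterFields) {
        Set<String> filteredNames = new HashSet<>(filterFields);
        List<String> filtered = new ArrayList<>(filterFields.size());
        for (String name : src) {
            if (filteredNames.contains(name)) {
                filtered.add(name);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Build an assumed workload: 20,000 names, half of them wanted.
        List<String> src = new ArrayList<>();
        List<String> wanted = new ArrayList<>();
        for (int i = 0; i < 20_000; i++) {
            src.add("field" + i);
            if (i % 2 == 0) {
                wanted.add("field" + i);
            }
        }
        long t0 = System.nanoTime();
        List<String> slow = filterWithCollection(src, wanted);
        long t1 = System.nanoTime();
        List<String> fast = filterWithSet(src, wanted);
        long t2 = System.nanoTime();
        System.out.println("same result: " + slow.equals(fast));
        System.out.println("list lookup ms: " + (t1 - t0) / 1_000_000);
        System.out.println("set lookup ms: " + (t2 - t1) / 1_000_000);
    }
}
```

Both methods return the same elements in the same order; only the lookup cost differs. Timings depend on the machine and JVM, so the sketch prints them without claiming specific numbers.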
was:
The indexing time for my ~2M documents went up significantly after I started
adding fields of type NumericDocValuesField.
Upon debugging, I found the bottleneck to be in the
PerFieldMergeState#FilterFieldInfos constructor. The contains check in the
code snippet below was the culprit.
{code:java}
this.filteredNames = new HashSet<>(filterFields);
this.filtered = new ArrayList<>(filterFields.size());
for (FieldInfo fi : src) {
  if (filterFields.contains(fi.name)) {
{code}
A simple change to the following seems to have fixed my issue
{code:java}
this.filteredNames = new HashSet<>(filterFields);
this.filtered = new ArrayList<>(filterFields.size());
for (FieldInfo fi : src) {
  if (this.filteredNames.contains(fi.name)) {
{code}
> Adding NumericDocValuesFields is slowing down the indexing process
> significantly
> --------------------------------------------------------------------------------
>
> Key: LUCENE-8558
> URL: https://issues.apache.org/jira/browse/LUCENE-8558
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 7.4, 7.5
> Reporter: Kranthi
> Priority: Major
> Labels: patch, performance
> Fix For: 7.4, 7.5