Hi folks,

I'm fixing a defect that I noticed in my code.  I expected the index to
get smaller once the fix was in, but instead I see it growing.

Here is a stripped-down version of the code that shows the issue:

Buggy code #1:

  for (String field : fieldsList)
  {
    // Notice how I'm adding the same value over and over
    doc.addField(SolrField_ID_LIST, "1");
    doc.addField(SolrField_ALL_FIELDS_DATA, stringData);
  }

  docsToAdd.add(doc);

Fixed code #2:

  for (String field : fieldsList)
  {
    doc.addField(SolrField_ALL_FIELDS_DATA, stringData);
  }

  // Notice how I'm now adding this value only once
  doc.addField(SolrField_ID_LIST, "1");

  docsToAdd.add(doc);

I index the exact same data in both cases; all that changed is the logic of
the code per the above.
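
To make the difference concrete, here is a minimal self-contained sketch
that counts how many values each variant adds per field.  It does not use
SolrJ at all: a plain Map stands in for the Solr document, and the class
and helper names (FieldValueCounts, build, add, count) are hypothetical,
made up only for this illustration.  Variant 3 is the "no ID_LIST at all"
side test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldValueCounts {
    // Stand-in for a Solr document: field name -> values added to it.
    static Map<String, List<String>> build(int variant, List<String> fieldsList,
                                           String stringData) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        for (String field : fieldsList) {
            if (variant == 1) {
                add(doc, "ID_LIST", "1");   // variant 1: same value on every pass
            }
            add(doc, "ALL_FIELDS_DATA", stringData);
        }
        if (variant == 2) {
            add(doc, "ID_LIST", "1");       // variant 2: value added only once
        }
        return doc;                          // variant 3: ID_LIST never added
    }

    // Mimics the multiValued addField behaviour: append, never overwrite.
    static void add(Map<String, List<String>> doc, String name, String value) {
        doc.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    static int count(Map<String, List<String>> doc, String name) {
        return doc.getOrDefault(name, List.of()).size();
    }

    public static void main(String[] args) {
        List<String> fieldsList = List.of("a", "b", "c");
        for (int v = 1; v <= 3; v++) {
            Map<String, List<String>> doc = build(v, fieldsList, "some text");
            System.out.println("#" + v + ": ID_LIST=" + count(doc, "ID_LIST")
                    + ", ALL_FIELDS_DATA=" + count(doc, "ALL_FIELDS_DATA"));
        }
    }
}
```

With three entries in fieldsList, ID_LIST receives 3, 1 and 0 values
respectively, while ALL_FIELDS_DATA receives 3 values in every variant.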

On my test index of 1000 records, when I look at Solr's admin page (same is
true looking at the physical disk in the "index" folder) the index size for
#1 is 834.77 KB, but for #2 it is 1.56 MB.
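
For anyone who wants to reproduce the on-disk number without the admin
page, a rough sketch like the following sums the sizes of all regular
files under a directory.  The class name IndexDirSize and the default
path "index" are assumptions for illustration; point it at wherever your
core's index folder actually lives.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class IndexDirSize {
    // Total size in bytes of every regular file below dir (recursive).
    static long totalBytes(Path dir) {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(IndexDirSize::sizeOf)
                        .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static long sizeOf(Path p) {
        try {
            return Files.size(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical default; pass the real index directory as an argument.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "index");
        if (!Files.isDirectory(indexDir)) {
            System.out.println("Not a directory: " + indexDir);
            return;
        }
        System.out.println(totalBytes(indexDir) + " bytes");
    }
}
```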

As a side test, I changed the code to the following:

Test code #3:

  for (String field : fieldsList)
  {
    doc.addField(SolrField_ALL_FIELDS_DATA, stringData);
  }

  // I no longer include this field:
  // doc.addField(SolrField_ID_LIST, "1");

  docsToAdd.add(doc);

And now the index size is 2.27 MB !!!

Yes, each time I run the test, I start with a fresh, empty index (num
docs: 0, index size: 0).

Here are my field definitions:

  <field name="ALL_FIELDS_DATA" type="text" multiValued="true"
         indexed="true" required="false" stored="false"/>
  <field name="ID_LIST" type="string" multiValued="true"
         indexed="true" required="false" stored="false"/>

My question is: why is my index size going up?  I was expecting it to go
down, because I'm now indexing less data into each Solr document.

Thanks

Steve
