Hi folks,
I'm fixing a defect in my indexing code. I expected the index to get
smaller once the fix was in, but instead I see it growing.
Here is a stripped-down version of the code that shows the issue:
Buggy code #1:
for (String field : fieldsList)
{
    // <== Notice how I'm adding the same value over and over
    doc.addField(SolrField_ID_LIST, "1");
    doc.addField(SolrField_ALL_FIELDS_DATA, stringData);
}
docsToAdd.add(doc);
Fixed code #2:
for (String field : fieldsList)
{
    doc.addField(SolrField_ALL_FIELDS_DATA, stringData);
}
// <== Notice how I'm now adding this value only once
doc.addField(SolrField_ID_LIST, "1");
docsToAdd.add(doc);
I index the exact same data in both cases; all that changed is the logic of
the code per the above.
On my test index of 1000 records, when I look at Solr's admin page (same is
true looking at the physical disk in the "index" folder) the index size for
#1 is 834.77 KB, but for #2 it is 1.56 MB.
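For anyone who wants to reproduce the on-disk numbers, here is a minimal
sketch of one way to total the size of the "index" folder in plain Java.
The class name IndexSizeCheck, the method directorySizeBytes and the
default path data/index are made up for illustration; they are not part of
Solr or SolrJ, and the real path depends on your installation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class IndexSizeCheck {

    // Sums the sizes (in bytes) of all regular files under the given directory.
    static long directorySizeBytes(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> {
                            try {
                                return Files.size(p);
                            } catch (IOException e) {
                                throw new UncheckedIOException(e);
                            }
                        })
                        .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default; pass the real "index" folder as the first argument.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "data/index");
        if (!Files.isDirectory(indexDir)) {
            System.out.println("Not a directory: " + indexDir);
            return;
        }
        long bytes = directorySizeBytes(indexDir);
        System.out.printf("%s: %d bytes (%.2f KB)%n", indexDir, bytes, bytes / 1024.0);
    }
}
```

Running this against the same folder after each of the test runs gives
byte counts that can be compared directly, without the rounding on the
admin page.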
As a side test, I changed the code to the following:
Test code #3:
for (String field : fieldsList)
{
    doc.addField(SolrField_ALL_FIELDS_DATA, stringData);
}
// <== I no longer include this field
// doc.addField(SolrField_ID_LIST, "1");
docsToAdd.add(doc);
And now the index size is 2.27 MB!
Yes, each time I run the test, I start with a fresh, empty index (num docs:
0, index size: 0).
Here are my field definitions:
<field name="ALL_FIELDS_DATA" type="text" multiValued="true"
indexed="true" required="false" stored="false"/>
<field name="ID_LIST" type="string" multiValued="true" indexed="true"
required="false" stored="false"/>
My question is: why is my index growing? I was expecting it to shrink,
because I'm now indexing less data into each Solr document.
Thanks
Steve