On 9/11/2018 7:07 PM, John Smith wrote:
> header: 223,580
> child1: 124,978
> child2: 254,045
> child3: 127,917
> child4: 1,009,030
> child5: 225,311
> child6: 381,561
> child7: 438,315
> child8: 18,850
>
> Trying to index that into solr with a flatfile schema, blows up into
> 5,475,316,072 rows. Yes, 5.5 billion rows. I calculated that by running a
I think you're not getting what I'm suggesting. Or maybe there's an
aspect of your data that I'm not understanding.
If we add up all those numbers for the child docs, there are 2.5 million
of them. So you would have 2.5 million docs in Solr. I have created
Solr indexes far larger than this, and I do not consider my work to be
"big data". Solr can handle 2.5 million docs easily, as long as the
hardware resources are sufficient.
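For reference, here's a quick sanity check of that total, summing the child counts quoted above:

```python
# Child document counts quoted from the original message.
child_counts = [
    124_978,    # child1
    254_045,    # child2
    127_917,    # child3
    1_009_030,  # child4
    225_311,    # child5
    381_561,    # child6
    438_315,    # child7
    18_850,     # child8
]

total = sum(child_counts)
print(f"{total:,}")  # 2,580,007 -- roughly 2.5 million child docs
```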
The data duplication comes in as additional fields on those 2.5 million
docs. Each one will contain some (or maybe all) of the data that WOULD
have been in the parent document. The amount of data balloons, but the
number of documents (rows) doesn't.
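A minimal sketch of that denormalization, assuming hypothetical field names (your actual schema will differ):

```python
# Denormalization sketch: copy the parent (header) fields onto each
# child record, producing one flat Solr document per child. The field
# names below are illustrative assumptions, not the poster's schema.

def flatten(parent: dict, children: list[dict]) -> list[dict]:
    """Return one flat doc per child, with the parent's fields repeated."""
    docs = []
    for child in children:
        doc = dict(parent)   # duplicate the parent data...
        doc.update(child)    # ...then merge in the child's own fields
        docs.append(doc)
    return docs

parent = {"order_id": "A-1001", "customer": "ACME", "order_date": "2018-09-11"}
children = [
    {"id": "A-1001-1", "sku": "widget", "qty": 2},
    {"id": "A-1001-2", "sku": "gadget", "qty": 5},
]

flat_docs = flatten(parent, children)
# The number of Solr docs equals the number of children (2 here),
# even though the parent's data is now stored once per child.
print(len(flat_docs))
```

Each flat doc is then what you'd send to Solr, so the row count stays at the child count while the parent's fields are repeated in every one.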
That kind of arrangement is usually enough to accomplish whatever is
needed. I cannot assume that it will work for your use case, but it
does work for most.
Thanks,
Shawn