This behaviour occurs only with delta import. One document gets the field values of all documents. The affected fields come from child entities that map columns to multi-valued fields.

<entity name="user_building" query="IMPORT_QUERY" deltaQuery="DELTA_QUERY"
        pk="buildingUserId" deletedPkQuery="DELETE_QUERY" onError="continue">
  <entity name="buildingUsagePointsAndAccountNumber" query="SELECT_QUERY"
          transformer="RegexTransformer" cacheImpl="SortedMapBackedCache"
          cacheKey="bldId" cacheLookup="user_building.plainBuildingId"
          onError="continue">
    <field name="txt_usage_points" column="usage_points" splitBy="," />
    <field name="txt_account_numbers" column="account_numbers" splitBy="," />
    <field name="sdp_service_to_date" column="sdp_service_to_date" dateTimeFormat="yyyy-MM-dd" />
  </entity>
</entity>

On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:

> Could you give a bit more details. Do you mean one document gets the
> content of multiple documents? And only on delta?
>
> Regards,
>    Alex
>
> On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" <sujay.bawas...@firstfuel.com>
> wrote:
>
> Hi,
>
> We are using DIH with cache(SortedMapBackedCache) with solr 5.3.1. We have
> around 2.8 million documents in solr and total index size is 4 GB. DIH
> delta import is dumping all values of mapped columns to their respective
> multi valued fields. This is causing size of one solr document upto 2 GB.
> Is this a known issue with solr 5.3.1?
>
> Thanks,
> Sujay
>
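
To make the expected behaviour concrete, here is a rough Python sketch of what a keyed child-entity lookup with comma splitting should produce for one parent row. It is only an illustration, not DIH's actual implementation: the function names and the sample rows are hypothetical, and only the column and field names are taken from the config above.

```python
from collections import defaultdict


def build_cache(rows, cache_key):
    """Group child rows by the cache key column (hypothetical stand-in for a keyed cache)."""
    cache = defaultdict(list)
    for row in rows:
        cache[row[cache_key]].append(row)
    return cache


def split_by(value, sep=","):
    """Split a delimited column value into a list, skipping empty parts."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def child_fields(cache, parent_key):
    """Build the multi-valued fields for one parent from its own cached rows only."""
    usage_points, account_numbers = [], []
    for row in cache.get(parent_key, []):
        usage_points.extend(split_by(row["usage_points"]))
        account_numbers.extend(split_by(row["account_numbers"]))
    return {
        "txt_usage_points": usage_points,
        "txt_account_numbers": account_numbers,
    }


# Hypothetical sample rows; real rows would come from SELECT_QUERY.
rows = [
    {"bldId": 1, "usage_points": "UP1,UP2", "account_numbers": "A1"},
    {"bldId": 2, "usage_points": "UP3", "account_numbers": "A2,A3"},
]
cache = build_cache(rows, "bldId")
doc = child_fields(cache, 1)
```

In this sketch the document for building 1 receives only the values from rows whose bldId is 1. The symptom described in this thread looks as if every cached row were attached to a single document during delta import, regardless of the lookup key.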