Thanks Alex. I will test it with 5.4 and 6.4 and let you know.

On Thu, Mar 16, 2017 at 7:40 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> You have nested entities and accumulate the content of the inner
> entities in the outer one with caching on an inner one. Your
> description sounds like the inner cache is not reset on the next
> iteration of the outer loop.
>
> This may be connected to
> https://issues.apache.org/jira/browse/SOLR-7843 (Fixed in 5.4)
>
> Or it may be a different bug. I would make the simplest test case (based
> on the DIH-db example) and then try it on 5.3.1 and 5.4. And then 6.4 if
> the problem is still there. If it is still there in 6.4, then we may
> have a new bug.
>
> Regards,
>    Alex.
> ----
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 16 March 2017 at 09:17, Sujay Bawaskar <sujay.bawas...@firstfuel.com>
> wrote:
> > This behaviour is for delta import only. One document gets the field
> > values of all documents. These fields are child entities which map
> > columns to multi-valued fields.
> >
> > <entity name="user_building"
> >         query="IMPORT_QUERY"
> >         deltaQuery="DELTA_QUERY"
> >         pk="buildingUserId"
> >         deletedPkQuery="DELETE_QUERY"
> >         onError="continue">
> >
> >   <entity name="buildingUsagePointsAndAccountNumber"
> >           query="SELECT_QUERY"
> >           transformer="RegexTransformer" cacheImpl="SortedMapBackedCache"
> >           cacheKey="bldId" cacheLookup="user_building.plainBuildingId"
> >           onError="continue">
> >     <field name="txt_usage_points" column="usage_points" splitBy="," />
> >     <field name="txt_account_numbers" column="account_numbers"
> >            splitBy="," />
> >     <field name="sdp_service_to_date" column="sdp_service_to_date"
> >            dateTimeFormat="yyyy-MM-dd" />
> >   </entity>
> > </entity>
> >
> > On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> > wrote:
> >
> >> Could you give a bit more detail? Do you mean one document gets the
> >> content of multiple documents? And only on delta?
> >>
> >> Regards,
> >>     Alex
> >>
> >> On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" <sujay.bawas...@firstfuel.com>
> >> wrote:
> >>
> >> Hi,
> >>
> >> We are using DIH with a cache (SortedMapBackedCache) with Solr 5.3.1. We
> >> have around 2.8 million documents in Solr and the total index size is
> >> 4 GB. DIH delta import is dumping all values of the mapped columns into
> >> their respective multi-valued fields. This is causing the size of one
> >> Solr document to grow up to 2 GB.
> >> Is this a known issue with Solr 5.3.1?
> >>
> >> Thanks,
> >> Sujay
> >>
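
For anyone trying to reason about the symptom before building the DIH test case, here is a toy Python model of the hypothesis raised in this thread: child values accumulating across parent rows because per-parent state is not reset. This is only a sketch and is not Solr or DIH code. The function names and the `reset_between_parents` flag are made up for illustration; only the column and field names (`bldId`, `plainBuildingId`, `buildingUserId`, `usage_points`, `txt_usage_points`) are taken from the config quoted above.

```python
from collections import defaultdict


def build_cache(child_rows, cache_key):
    """Group child rows by their cache key (a stand-in for a keyed entity cache)."""
    cache = defaultdict(list)
    for row in child_rows:
        cache[row[cache_key]].append(row)
    return cache


def import_docs(parents, child_rows, reset_between_parents=True):
    """Build one document per parent row, joining child rows via the cache.

    With reset_between_parents=False the accumulated child values are never
    cleared, which mimics the suspected bug: later documents also carry the
    values of earlier ones.
    """
    cache = build_cache(child_rows, "bldId")
    docs = []
    usage_points = []
    for parent in parents:
        if reset_between_parents:
            usage_points = []  # fresh state for every parent row
        for row in cache.get(parent["plainBuildingId"], []):
            # Same effect as splitBy="," on the usage_points column.
            usage_points.extend(row["usage_points"].split(","))
        docs.append({
            "id": parent["buildingUserId"],
            "txt_usage_points": list(usage_points),
        })
    return docs


parents = [
    {"buildingUserId": 1, "plainBuildingId": "A"},
    {"buildingUserId": 2, "plainBuildingId": "B"},
]
child_rows = [
    {"bldId": "A", "usage_points": "u1,u2"},
    {"bldId": "B", "usage_points": "u3"},
]

expected = import_docs(parents, child_rows, reset_between_parents=True)
leaky = import_docs(parents, child_rows, reset_between_parents=False)
```

In the leaky run the second document ends up with the first document's values as well as its own, which is the shape of the problem reported here, only at a much smaller scale. Whether DIH 5.3.1 actually behaves this way on delta import is exactly what the proposed 5.3.1 / 5.4 / 6.4 comparison should show.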