Thanks Alex. I will test it with 5.4 and 6.4 and let you know.

On Thu, Mar 16, 2017 at 7:40 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> You have nested entities and accumulate the content of the inner
> entities in the outer one with caching on an inner one. Your
> description sounds like the inner cache is not reset on the next
> iteration of the outer loop.
>
> This may be connected to
> https://issues.apache.org/jira/browse/SOLR-7843 (Fixed in 5.4)
>
> Or it may be a different bug. I would make the simplest possible test case
> (based on the DIH-db example) and then try it on 5.3.1 and 5.4. And then 6.4 if
> the problem is still there. If it is still there in 6.4, then we may
> have a new bug.
>
> Regards,
>    Alex.
> ----
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 16 March 2017 at 09:17, Sujay Bawaskar <sujay.bawas...@firstfuel.com>
> wrote:
> > This behaviour is for delta import only. One document gets the field values
> > of all documents. These fields come from child entities which map columns to
> > multi-valued fields.
> >
> > <entity name="user_building"
> >         query="IMPORT_QUERY"
> >         deltaQuery="DELTA_QUERY"
> >         pk="buildingUserId"
> >         deletedPkQuery="DELETE_QUERY"
> >         onError="continue">
> >
> >   <entity name="buildingUsagePointsAndAccountNumber"
> >           query="SELECT_QUERY"
> >           transformer="RegexTransformer" cacheImpl="SortedMapBackedCache"
> >           cacheKey="bldId" cacheLookup="user_building.plainBuildingId"
> >           onError="continue">
> >     <field name="txt_usage_points" column="usage_points" splitBy="," />
> >     <field name="txt_account_numbers" column="account_numbers"
> >            splitBy="," />
> >     <field name="sdp_service_to_date" column="sdp_service_to_date"
> >            dateTimeFormat="yyyy-MM-dd" />
> >   </entity>
> > </entity>
> >
> > On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> > wrote:
> >
> >> Could you give a bit more detail? Do you mean one document gets the
> >> content of multiple documents? And only on delta?
> >>
> >> Regards,
> >>     Alex
> >>
> >> On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" <sujay.bawas...@firstfuel.com>
> >> wrote:
> >>
> >> Hi,
> >>
> >> We are using DIH with a cache (SortedMapBackedCache) on Solr 5.3.1. We have
> >> around 2.8 million documents in Solr and the total index size is 4 GB. The
> >> DIH delta import is dumping all values of the mapped columns into their
> >> respective multi-valued fields. This is causing the size of one Solr
> >> document to grow up to 2 GB. Is this a known issue with Solr 5.3.1?
> >>
> >> Thanks,
> >> Sujay
> >>
>
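
For anyone following the thread, the symptom and the hypothesis quoted above (child values from a cached inner entity accumulating across iterations of the outer entity) can be pictured with a small toy model. This is only a sketch in Python, not Solr or DIH code: the row data and the function names `build_cache`, `import_expected`, and `import_leaky` are made up for illustration, and the "leaky" variant is an assumption about what the bug might look like, not a confirmed diagnosis.

```python
from collections import defaultdict

# Hypothetical child rows: (bldId, comma-separated usage_points).
CHILD_ROWS = [(1, "up-a,up-b"), (2, "up-c"), (3, "up-d,up-e")]


def build_cache(rows):
    """Group child values by key, loosely mirroring cacheKey="bldId"."""
    cache = defaultdict(list)
    for bld_id, usage_points in rows:
        # Split the column the way splitBy="," would.
        cache[bld_id].extend(usage_points.split(","))
    return cache


def import_expected(parent_ids, rows):
    """Expected behaviour: each parent gets only the values for its own key."""
    cache = build_cache(rows)
    return {pid: list(cache.get(pid, [])) for pid in parent_ids}


def import_leaky(parent_ids, rows):
    """Toy model of the symptom: state is not reset between parents."""
    cache = build_cache(rows)
    accumulated = []  # assumed bug: never cleared per outer iteration
    docs = {}
    for pid in parent_ids:
        accumulated.extend(cache.get(pid, []))
        docs[pid] = list(accumulated)
    return docs
```

In this model the last parent processed by `import_leaky` ends up holding every child value, which is the kind of growth described in the original report. A minimal test case against the DIH-db example could compare the two shapes of output in the same way.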
