You have nested entities, with caching on the inner entity and its
content accumulated into the outer one. From your description, it
sounds like the inner cache is not being reset on the next iteration
of the outer loop.

This may be connected to
https://issues.apache.org/jira/browse/SOLR-7843 (Fixed in 5.4)

Or it may be a different bug. I would make the simplest possible test
case (based on the DIH-db example) and try it on 5.3.1 and 5.4, and
then on 6.4 if the problem is still there. If it still reproduces in
6.4, then we may have a new bug.
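
A minimal sketch of what such a test case might look like, assuming the
item and feature tables (with their ID, ITEM_ID and DESCRIPTION columns)
from the DIH-db example. The delta queries and the "features" field
mapping are illustrative assumptions, not taken from your config:

```xml
<dataConfig>
  <!-- dataSource omitted: reuse the one from the example's db-data-config.xml -->
  <document>
    <!-- Outer entity: one Solr document per row of item -->
    <entity name="item" pk="ID"
            query="select * from item"
            deltaQuery="select ID from item where last_modified > '${dataimporter.last_index_time}'"
            deltaImportQuery="select * from item where ID='${dih.delta.ID}'">
      <!-- Inner entity: rows are cached and looked up by item.ID for each outer row -->
      <entity name="feature"
              query="select ITEM_ID, DESCRIPTION from feature"
              cacheImpl="SortedMapBackedCache"
              cacheKey="ITEM_ID" cacheLookup="item.ID">
        <field name="features" column="DESCRIPTION" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

Run a full-import, touch last_modified on one item row, then run a
delta-import and check whether that document's features field holds
only its own values or the values of every item.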

Regards,
   Alex.
----
http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 09:17, Sujay Bawaskar <sujay.bawas...@firstfuel.com> wrote:
> This behaviour occurs for delta import only. One document gets the field
> values of all documents. These fields come from child entities which map
> columns to multi-valued fields.
>
> <entity name="user_building"
>         query="IMPORT_QUERY"
>         deltaQuery="DELTA_QUERY"
>         pk="buildingUserId"
>         deletedPkQuery="DELETE_QUERY"
>         onError="continue">
>
>   <entity name="buildingUsagePointsAndAccountNumber"
>           query="SELECT_QUERY"
>           transformer="RegexTransformer" cacheImpl="SortedMapBackedCache"
>           cacheKey="bldId" cacheLookup="user_building.plainBuildingId"
>           onError="continue">
>     <field name="txt_usage_points" column="usage_points" splitBy="," />
>     <field name="txt_account_numbers" column="account_numbers" splitBy="," />
>     <field name="sdp_service_to_date" column="sdp_service_to_date"
>            dateTimeFormat="yyyy-MM-dd" />
>   </entity>
> </entity>
>
> On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
>> Could you give a few more details? Do you mean one document gets the
>> content of multiple documents? And only on delta?
>>
>> Regards,
>>     Alex
>>
>> On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" <sujay.bawas...@firstfuel.com>
>> wrote:
>>
>> Hi,
>>
>> We are using DIH with a cache (SortedMapBackedCache) on Solr 5.3.1. We have
>> around 2.8 million documents in Solr and the total index size is 4 GB. The
>> DIH delta import is dumping all values of the mapped columns into their
>> respective multi-valued fields. This causes a single Solr document to grow
>> up to 2 GB. Is this a known issue with Solr 5.3.1?
>>
>> Thanks,
>> Sujay
>>
