Hi,

I did not encounter this issue with Solr 6.x. However, delta import with cache
executes the nested query for every element returned by the parent query. Since
this select has no where clause (because we are using the cache), it takes a
long time, so delta import with cache is very slow. My observation is that the
behaviour of delta import with caching differs from that of full
import with caching. If the delta query selects 10 elements, it is like
executing a select-all query on the database for each of the ten records.
Any comments on this behaviour of delta import?
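
To make the cost difference concrete, here is a toy model of the two behaviours described above. It is not Solr or DIH code: FakeDatabase, build_cache and both import functions are hypothetical names, and the only thing it shows is how many times the unfiltered child SELECT would run. Only the key names bldId and plainBuildingId are taken from the config quoted below.

```python
# Toy model only: this is NOT Solr/DIH code. All class and function names
# are hypothetical; only the key names come from the quoted config.

class FakeDatabase:
    """Stands in for the child table and counts unfiltered SELECTs."""

    def __init__(self, child_rows):
        self.child_rows = child_rows
        self.full_selects = 0

    def select_all_children(self):
        # Models the cached child query, which has no WHERE clause.
        self.full_selects += 1
        return list(self.child_rows)


def build_cache(db, key):
    """Run the unfiltered child query once and index its rows by `key`."""
    cache = {}
    for row in db.select_all_children():
        cache.setdefault(row[key], []).append(row)
    return cache


def import_cache_built_once(db, parent_rows):
    """Observed full-import behaviour: one child SELECT, then lookups."""
    cache = build_cache(db, "bldId")
    return {
        p["plainBuildingId"]: cache.get(p["plainBuildingId"], [])
        for p in parent_rows
    }


def import_cache_rebuilt_per_parent(db, parent_rows):
    """Observed delta-import behaviour: child SELECT runs per parent row."""
    result = {}
    for p in parent_rows:
        cache = build_cache(db, "bldId")  # select-all repeated every time
        result[p["plainBuildingId"]] = cache.get(p["plainBuildingId"], [])
    return result
```

Both functions return the same child rows for each parent. The second one simply issues the select-all once per parent row, so a delta of 10 parents costs ten full scans of the child table instead of one.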

On Thu, Mar 16, 2017 at 7:47 PM, Sujay Bawaskar <
sujay.bawas...@firstfuel.com> wrote:

> Thanks Alex. I will test it with 5.4 and 6.4 and let you know.
>
> On Thu, Mar 16, 2017 at 7:40 PM, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
>> You have nested entities and accumulate the content of the inner
>> entities in the outer one with caching on an inner one. Your
>> description sounds like the inner cache is not reset on the next
>> iteration of the outer loop.
>>
>> This may be connected to
>> https://issues.apache.org/jira/browse/SOLR-7843 (Fixed in 5.4)
>>
>> Or it may be a different bug. I would make the simplest test case (based
>> on DIH-db example) and then try it on 5.3.1 and 5.4. And then 6.4 if
>> the problem is still there. If it is still there in 6.4, then we may
>> have a new bug.
>>
>> Regards,
>>    Alex.
>> ----
>> http://www.solr-start.com/ - Resources for Solr users, new and
>> experienced
>>
>>
>> On 16 March 2017 at 09:17, Sujay Bawaskar <sujay.bawas...@firstfuel.com>
>> wrote:
>> > This behaviour is for delta import only. One document gets the field
>> > values of all documents. These fields are child entities which map
>> > columns to multi-valued fields.
>> >
>> > <entity name="user_building"
>> > query="IMPORT_QUERY"
>> > deltaQuery="DELTA_QUERY"
>> > pk="buildingUserId"
>> > deletedPkQuery="DELETE_QUERY"
>> > onError="continue">
>> >
>> >                      <entity name="buildingUsagePointsAndAccountNumber"
>> > query="SELECT_QUERY"
>> > transformer="RegexTransformer" cacheImpl="SortedMapBackedCache"
>> > cacheKey="bldId" cacheLookup="user_building.plainBuildingId"
>> > onError="continue">
>> > <field name="txt_usage_points" column="usage_points" splitBy="," />
>> > <field name="txt_account_numbers" column="account_numbers"
>> > splitBy="," />
>> > <field name="sdp_service_to_date" column="sdp_service_to_date"
>> > dateTimeFormat="yyyy-MM-dd" />
>> > </entity>
>> > </entity>
>> >
>> > On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch <arafa...@gmail.com>
>> > wrote:
>> >
>> >> Could you give a bit more detail? Do you mean one document gets the
>> >> content of multiple documents? And only on delta?
>> >>
>> >> Regards,
>> >>     Alex
>> >>
>> >> On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" <sujay.bawas...@firstfuel.com>
>> >> wrote:
>> >>
>> >> Hi,
>> >>
>> >> We are using DIH with cache (SortedMapBackedCache) with Solr 5.3.1. We
>> >> have around 2.8 million documents in Solr and the total index size is
>> >> 4 GB. DIH delta import is dumping all values of mapped columns into
>> >> their respective multi-valued fields. This causes a single Solr
>> >> document to grow up to 2 GB.
>> >> Is this a known issue with Solr 5.3.1?
>> >>
>> >> Thanks,
>> >> Sujay
>> >>
>>
>
>
