i've benchmarked the import already with 500k records, one time without the 
artists subquery, and one time without the join in the main query:


Without subquery: 500k in 3 min 30 sec

Without join and without subquery: 500k in 2 min 30.

With subquery and with left join:   320k in 6 Min 30


so the joins / subqueries are definitely a bottleneck. 

How exactly did you implement the custom data import? 

In our case, we need to de-normalize the relations of the sql data for the 
index, 
so i fear i can't really get rid of the join / subquery.


-robert





On Dec 15, 2010, at 15:43 , Tim Heckman wrote:

> 2010/12/15 Robert Gründler <rob...@dubture.com>:
>> The data-config.xml looks like this (only 1 entity):
>> 
>>      <entity name="track" query="select t.id as id, t.title as title, 
>> l.title as label from track t left join label l on (l.id = t.label_id) where 
>> t.deleted = 0" transformer="TemplateTransformer">
>>        <field column="title" name="title_t" />
>>        <field column="label" name="label_t" />
>>        <field column="id" name="sf_meta_id" />
>>        <field column="metaclass" template="Track" name="sf_meta_class"/>
>>        <field column="metaid" template="${track.id}" name="sf_meta_id"/>
>>        <field column="uniqueid" template="Track_${track.id}" 
>> name="sf_unique_id"/>
>> 
>>        <entity name="artists" query="select a.name as artist from artist a 
>> left join track_artist ta on (ta.artist_id = a.id) where 
>> ta.track_id=${track.id}">
>>          <field column="artist" name="artists_t" />
>>        </entity>
>> 
>>      </entity>
> 
> So there's one track entity with an artist sub-entity. My (admittedly
> rather limited) experience has been that sub-entities, where you have
> to run a separate query for every row in the parent entity, really
> slow down data import. For my own purposes, I wrote a custom data
> import using SolrJ to improve the performance (from 3 hours to 10
> minutes).
> 
> Just as a test, how long does it take if you comment out the artists entity?

Reply via email to