Just to get the paranoid option out of the way: is 'id' actually the column that holds unique ids in your database? If you run "select distinct id from imdb.director", how many rows do you get?
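A rough sketch of that check, assuming 'id' is also the uniqueKey in your Solr schema (the thread does not show the schema, so that part is a guess). Solr overwrites documents that share a uniqueKey value, so if the distinct count comes out well below the total, duplicate ids could account for the gap. The aliases total_rows, distinct_ids and n are only illustrative names.

```sql
-- Compare the total row count with the number of distinct id values.
-- Assumption: "id" is the uniqueKey in the Solr schema, so rows that
-- share an id would overwrite each other at index time.
SELECT COUNT(*)           AS total_rows,
       COUNT(DISTINCT id) AS distinct_ids
FROM imdb.director;

-- List a few of the most frequently repeated id values, if any exist.
SELECT id, COUNT(*) AS n
FROM imdb.director
GROUP BY id
HAVING COUNT(*) > 1
ORDER BY n DESC
LIMIT 10;
```

If distinct_ids matches total_rows, duplicate ids are not the cause and the problem lies elsewhere.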
Regards,
   Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/

On 7 November 2015 at 18:21, Yangrui Guo <guoyang...@gmail.com> wrote:
> Hello
>
> I'm being troubled by solr's data import handler. My solr version is 5.3.1
> and mysql is 5.5. I tried to index imdb data but found solr only partially
> indexed. I ran "SELECT DISTINCT COUNT(*) FROM imdb.director" and the query
> result was 1636549. However DIH only fetched and indexed 287041 rows. I
> didn't see any error in the log. Why was this happening?
>
> Here's my data-config.xml
>
> <dataConfig>
>   <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
>               url="jdbc:mysql://localhost:3306/imdb" user="root" password="password" />
>   <document>
>     <entity name="director" transformer="RegexTransformer"
>             query="SELECT DISTINCT * FROM imdb.director">
>       <field name="id" column="id" />
>       <field name="content_type" column="content_type" />
>     </entity>
>   </document>
> </dataConfig>
>
> Yangrui Guo