On 4/25/2013 8:17 AM, Gustav wrote:
> Are these two methods functionally different? Is there a performance
> difference?
> 
> Another thought would be that, if using join tables in MySQL, using the SQL
> query method with multiple joins could cause multiple documents to be
> indexed instead of one.

They may be equivalent in terms of results, but they work differently
and probably will NOT have the same performance.

When using nested entities in DIH, the main entity results in one SQL
query, but each inner entity results in a separate SQL query for
every single row returned by the main query.  If you have exactly 1
million rows in your main table and you're using a nested config with
two entities, you will be executing 1,000,001 queries: one for the main
entity, plus one inner query per row.  DIH will spend much of its time
doing nothing but waiting on the round-trip latency of a million
individual queries via JDBC.  It probably also results in extra work
for the database server.
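
To make the query count concrete, here is a rough sketch of the nested
access pattern.  It is not DIH itself: it uses Python's sqlite3 module
with an in-memory database as a stand-in for MySQL over JDBC, and the
table names (item, feature) and function names are made up for
illustration.

```python
import sqlite3


def build_db(n_rows):
    """Create a hypothetical two-table schema with one feature per item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE feature (item_id INTEGER, description TEXT)")
    conn.executemany(
        "INSERT INTO item VALUES (?, ?)",
        [(i, f"item{i}") for i in range(n_rows)],
    )
    conn.executemany(
        "INSERT INTO feature VALUES (?, ?)",
        [(i, f"feature of item{i}") for i in range(n_rows)],
    )
    return conn


def nested_fetch(conn):
    """Mimic nested entities: one outer query, then one inner query per row."""
    queries = 0
    docs = []
    outer = conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()
    queries += 1  # the main entity's query
    for item_id, name in outer:
        inner = conn.execute(
            "SELECT description FROM feature WHERE item_id = ?", (item_id,)
        ).fetchall()
        queries += 1  # one extra round trip for every outer row
        docs.append(
            {"id": item_id, "name": name, "features": [r[0] for r in inner]}
        )
    return docs, queries


docs, queries = nested_fetch(build_db(1000))
print(queries)  # 1001: one outer query plus 1000 inner queries
```

With an in-memory database the per-query cost is tiny, so this only
shows the count.  Over a real network connection each of those inner
queries pays its own latency.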

With a server-side join, you're down to one query via JDBC, and the
database server is doing the work of combining your tables, normally
something it can do very efficiently.
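
As a rough illustration of the single-query approach, the sketch below
fetches the same kind of data with one join.  Again, this is not DIH:
sqlite3 in memory stands in for MySQL over JDBC, and the schema and
function names are hypothetical.

```python
import sqlite3


def build_join_db(n_rows):
    """Create a hypothetical two-table schema with one feature per item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE feature (item_id INTEGER, description TEXT)")
    conn.executemany(
        "INSERT INTO item VALUES (?, ?)",
        [(i, f"item{i}") for i in range(n_rows)],
    )
    conn.executemany(
        "INSERT INTO feature VALUES (?, ?)",
        [(i, f"feature of item{i}") for i in range(n_rows)],
    )
    return conn


def joined_fetch(conn):
    """Let the database combine the tables: a single query, a single round trip."""
    rows = conn.execute(
        "SELECT i.id, i.name, f.description "
        "FROM item i LEFT JOIN feature f ON f.item_id = i.id "
        "ORDER BY i.id"
    ).fetchall()
    queries = 1  # only one statement was sent to the database
    docs = [
        {"id": item_id, "name": name, "feature": description}
        for item_id, name, description in rows
    ]
    return docs, queries


docs, queries = joined_fetch(build_join_db(1000))
print(queries)    # 1
print(len(docs))  # 1000
```

This example keeps exactly one feature row per item, so the join
returns one row per item.  A one-to-many join would return several rows
per item, which is the situation the question above raises.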

Thanks,
Shawn
