Re: [nhibernate-development] A new scheme for eager loading

Fabio Maulo Thu, 17 Feb 2011 04:32:51 -0800

The matter is not how do the query but how hydrate the model.
Some years ago, I and Oren have talked about the introduction of a new fetch
mode for relations and collections: fetch="batchquery"...
btw, perhaps, we can put some more effort in "subselect" fetching at least
for those RDBMS supporting a decent paging syntax.


On Thu, Feb 17, 2011 at 8:51 AM, Niels Kühnel <[email protected]>wrote:

> All right. I underestimated the penalty from the increased size of the
> result set.
> When not paging subselects are faster than my idea unless you have some
> really complicated where clauses
> When paging batching is faster unless you have big pages.
> This means that it is not worth the effort.
>
> Thanks for your input.
>
> Now, subselects would be faster in all cases if the where clause could be
> reused.
>
> DECLARE @IDs TABLE (ID bigint primary key);
> INSERT @IDs select TOP 100 ID FROM [Person] WHERE {tricky stuff}
>
> SELECT {all fields} FROM [Person] WHERE ID IN (SELECT ID FROM @IDs)
> SELECT {all fields} FROM [Tag] tags0_ WHERE tags0_.Person_id in (select ID
> FROM @IDs)
> SELECT {all fields} FROM [Phonenumber] phonenumbe0_ WHERE
> phonenumbe0_.Person_id in (select ID FROM @IDs)
>
> What do you think is the best way to make queries like this?
>
> Thanks,
> Niels
>
>
>
> On 16 February 2011 21:01, Niels Kühnel <[email protected]> wrote:
>
>> I'm not sure why i used INNER JOIN .. 1=1. Cross join is more clear,
>> thanks.
>>
>> Anyway, I like the power in making general expressions as:
>> "I want to show these companies in my list, and for all of them I need
>> these relations for what I'm going to show. Please give me them as fast as
>> possible, but I don't really care how"
>>
>> _session.QueryOver<Company>().Skip(20).Take(20).OrderBy(x=>x.Name).Asc)
>>                 .EagerFetchMany(x => x.PushReportInfos)
>>                 .EagerFetch(x => x.AgreementStatus)
>>                 .EagerFetchMany(x => x.SecurityTokens)
>>                 .EagerFetchMany(x => x.NaceEntries)
>>                 .EagerFetchMany(x => x.NaceEntries.First().NaceEntry)
>>                 .Where "some really complicated stuff with EXISTS and
>> stuff"
>>
>> You can get that with subselects too, but in that case "some really
>> complicated stuff" will be repeated for each sub select.
>> In my approach filtering is only done once, and the database engine simply
>> joins the tables on the resulting root entities
>>
>> If you follow an approach whith a query for only the root entities and
>> then get the rest by using some kind of " WHERE IN (root ids)" approach
>> you're essentially doing the same as me, except that your need more code
>> and you are sending a lot of ids to and fro.
>> You are especially likely to get into trouble if you need nested relations
>> or if you for some reason need to consider thousands of entities.
>>
>> I still need to do some performance comparisons though, as everything else
>> is just hand waving :)
>>
>> On 16 February 2011 20:36, nadav s <[email protected]> wrote:
>>
>>> fetch=subselect does a subselect for the original query that isn't really
>>> neccessary because you have the ids in the session, and that subselect might
>>> be a big one sometimes.
>>> I think batch size is the best approach.
>>> The thing is that currently you have to set the value in the mappings...
>>> there is a patch i've uploaded to jira, i know fabio doesn't like it,
>>> still don't really understand why exactly, but it enables you to override
>>> the batch size that was set in the mappings per session, and i think it
>>> could be useful in plenty of use cases.
>>>
>>>
>>> On Wed, Feb 16, 2011 at 7:41 PM, Fabio Maulo <[email protected]>wrote:
>>>
>>>> You can try even fetch="subselect" to upload each collection in just one
>>>> roundtrip
>>>> For instance given a Company with a collection of Contacts when you get
>>>> the first page of Companies NH will fetch all contacts of the first page in
>>>> one roundtrip.
>>>> fetch="subselect" may fail in some circumstances under MSSQL thank to
>>>> its nice way to get pagination (perhaps in 2012 the will realize that
>>>> result-sets are paged in web and that developers are humans).
>>>>
>>>>
>>>> On Wed, Feb 16, 2011 at 2:21 PM, Diego Mijelshon <
>>>> [email protected]> wrote:
>>>>
>>>>> Niels,
>>>>>
>>>>> Your concepts are interesting.
>>>>> A simple question: why are you using INNER JOIN (...) ON 1=1 instead of
>>>>> CROSS JOIN?
>>>>>
>>>>> Anyway: while the whole idea might work, it would be interesting to
>>>>> measure it against existing approaches using real use cases. I suspect
>>>>> (read: guess) the gains do not offset the loses.
>>>>> Also, my general approach is to NEVER fetch collections. I prefer to
>>>>> use load batching, or a couple roundtrips.
>>>>> As applied to paging, this will get even easier when we introduce HQL
>>>>> syntax for paging.
>>>>>
>>>>> But as said: all research is useful, even if we don't end up using the
>>>>> exact original idea.
>>>>>
>>>>>     Diego
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 16, 2011 at 10:15, Niels Kühnel <[email protected]>wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm new on this list, so I'm very sorry if I break all your rules...
>>>>>>
>>>>>> I have devised a new way for eager loading collections, collections'
>>>>>> collections etc. at once without making the Cartesian product. I've
>>>>>> done a proof of concept implementation but I think it's solid enough
>>>>>> for further investigation
>>>>>>
>>>>>> The basic idea is to add a "fork" to the query to add the missing
>>>>>> structure when two unrelated tables are joined as:
>>>>>>
>>>>>> SELECT A.ID AS A, Forks.Fork, B.ID AS B, C.ID AS C FROM A
>>>>>>    INNER JOIN (SELECT 1 AS Fork UNION ALL SELECT 2) AS Forks ON 1=1
>>>>>>    LEFT JOIN B ON B.A = A.ID AND Forks.Fork = 1
>>>>>>    LEFT JOIN C ON C.A = A.ID AND Forks.Fork = 2
>>>>>>
>>>>>> While spending some time with eager loading I've also improved paging
>>>>>> in Sql server 2005+/CE 4.0 so that you can get root entities, say, 10
>>>>>> to 20 even though the query has eagerly fetched collections. I haven't
>>>>>> yet looked into if the same method can be used with other databases
>>>>>>
>>>>>> If you have the time, you can read more about my ideas at
>>>>>>
>>>>>> http://kuhnel.wordpress.com/2010/10/15/the-eager-fork-%E2%80%93-taming-cartesian-explosions-in-databases/
>>>>>> and
>>>>>>
>>>>>> http://kuhnel.wordpress.com/2011/02/13/paging-eager-fetched-nhibernate-collections-efficiently/
>>>>>>
>>>>>> and some proof of concept code at
>>>>>> https://bitbucket.org/nielskuhnel/eager-nh-extensions
>>>>>>
>>>>>> I still have to do a proper performance comparison against lazy
>>>>>> loading and futures but my ad hoc results are encouraging. I've also
>>>>>> studied execution plans in SQL Server and the "fork" tables doesn't
>>>>>> seem to upset it in any way.
>>>>>> It's implemented as a custom dialect so no changes to the core are
>>>>>> needed per se.
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>> Niels Kühnel
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Fabio Maulo
>>>>
>>>>
>>>
>>
>


-- 
Fabio Maulo

Re: [nhibernate-development] A new scheme for eager loading

Reply via email to