implementing subselect as a fetch mode is something i wanted to do a long time ago. instead i was advised to make batch size enabled to be overridden in a session context. If it helps you, vote for it:
http://216.121.112.228/browse/NH-2316 <http://216.121.112.228/browse/NH-2316> On Sat, Feb 19, 2011 at 3:39 AM, Niels Kühnel <[email protected]>wrote: > That could be really cool. Although not beeing decent before the "Denali" > release, SQL server supports paging by means of sub queries. > > In general eager loading performance in NHibernate could be improved if, > offhand, something like this could be implemented in dialects: > > bool SupportsReusableSubSelectFilter > SqlString GetResuableSubSelectFilter(idFields, query, alias) > SqlString GetSubSelectFilterReference(alias) > > and the FetchMode enum had: > SubSelect > SubSelectPreFilter (expert option to use when pages are small or where > conditions complex) > > Of course, this would require quite some effort but just including > SubSelect in FetchMode and implementing paging using the existing > GetLimitString from dialects would be an awesome start :) > > Cheers, > Niels > > > On 17 February 2011 13:32, Fabio Maulo <[email protected]> wrote: > >> The matter is not how do the query but how hydrate the model. >> Some years ago, I and Oren have talked about the introduction of a new >> fetch mode for relations and collections: fetch="batchquery"... >> btw, perhaps, we can put some more effort in "subselect" fetching at least >> for those RDBMS supporting a decent paging syntax. >> >> On Thu, Feb 17, 2011 at 8:51 AM, Niels Kühnel <[email protected]>wrote: >> >>> All right. I underestimated the penalty from the increased size of the >>> result set. >>> When not paging subselects are faster than my idea unless you have some >>> really complicated where clauses >>> When paging batching is faster unless you have big pages. >>> This means that it is not worth the effort. >>> >>> Thanks for your input. >>> >>> Now, subselects would be faster in all cases if the where clause could be >>> reused. >>> >>> DECLARE @IDs TABLE (ID bigint primary key); >>> INSERT @IDs select TOP 100 ID FROM [Person] WHERE {tricky stuff} >>> >>> SELECT {all fields} FROM [Person] WHERE ID IN (SELECT ID FROM @IDs) >>> SELECT {all fields} FROM [Tag] tags0_ WHERE tags0_.Person_id in (select >>> ID FROM @IDs) >>> SELECT {all fields} FROM [Phonenumber] phonenumbe0_ WHERE >>> phonenumbe0_.Person_id in (select ID FROM @IDs) >>> >>> What do you think is the best way to make queries like this? >>> >>> Thanks, >>> Niels >>> >>> >>> >>> On 16 February 2011 21:01, Niels Kühnel <[email protected]> wrote: >>> >>>> I'm not sure why i used INNER JOIN .. 1=1. Cross join is more clear, >>>> thanks. >>>> >>>> Anyway, I like the power in making general expressions as: >>>> "I want to show these companies in my list, and for all of them I need >>>> these relations for what I'm going to show. Please give me them as fast as >>>> possible, but I don't really care how" >>>> >>>> _session.QueryOver<Company>().Skip(20).Take(20).OrderBy(x=>x.Name).Asc) >>>> .EagerFetchMany(x => x.PushReportInfos) >>>> .EagerFetch(x => x.AgreementStatus) >>>> .EagerFetchMany(x => x.SecurityTokens) >>>> .EagerFetchMany(x => x.NaceEntries) >>>> .EagerFetchMany(x => x.NaceEntries.First().NaceEntry) >>>> .Where "some really complicated stuff with EXISTS and >>>> stuff" >>>> >>>> You can get that with subselects too, but in that case "some really >>>> complicated stuff" will be repeated for each sub select. >>>> In my approach filtering is only done once, and the database engine >>>> simply joins the tables on the resulting root entities >>>> >>>> If you follow an approach whith a query for only the root entities and >>>> then get the rest by using some kind of " WHERE IN (root ids)" approach >>>> you're essentially doing the same as me, except that your need more code >>>> and you are sending a lot of ids to and fro. >>>> You are especially likely to get into trouble if you need nested >>>> relations or if you for some reason need to consider thousands of entities. >>>> >>>> I still need to do some performance comparisons though, as everything >>>> else is just hand waving :) >>>> >>>> On 16 February 2011 20:36, nadav s <[email protected]> wrote: >>>> >>>>> fetch=subselect does a subselect for the original query that isn't >>>>> really neccessary because you have the ids in the session, and that >>>>> subselect might be a big one sometimes. >>>>> I think batch size is the best approach. >>>>> The thing is that currently you have to set the value in the >>>>> mappings... >>>>> there is a patch i've uploaded to jira, i know fabio doesn't like it, >>>>> still don't really understand why exactly, but it enables you to override >>>>> the batch size that was set in the mappings per session, and i think it >>>>> could be useful in plenty of use cases. >>>>> >>>>> >>>>> On Wed, Feb 16, 2011 at 7:41 PM, Fabio Maulo <[email protected]>wrote: >>>>> >>>>>> You can try even fetch="subselect" to upload each collection in just >>>>>> one roundtrip >>>>>> For instance given a Company with a collection of Contacts when you >>>>>> get the first page of Companies NH will fetch all contacts of the first >>>>>> page >>>>>> in one roundtrip. >>>>>> fetch="subselect" may fail in some circumstances under MSSQL thank to >>>>>> its nice way to get pagination (perhaps in 2012 the will realize that >>>>>> result-sets are paged in web and that developers are humans). >>>>>> >>>>>> >>>>>> On Wed, Feb 16, 2011 at 2:21 PM, Diego Mijelshon < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Niels, >>>>>>> >>>>>>> Your concepts are interesting. >>>>>>> A simple question: why are you using INNER JOIN (...) ON 1=1 instead >>>>>>> of CROSS JOIN? >>>>>>> >>>>>>> Anyway: while the whole idea might work, it would be interesting to >>>>>>> measure it against existing approaches using real use cases. I suspect >>>>>>> (read: guess) the gains do not offset the loses. >>>>>>> Also, my general approach is to NEVER fetch collections. I prefer to >>>>>>> use load batching, or a couple roundtrips. >>>>>>> As applied to paging, this will get even easier when we introduce HQL >>>>>>> syntax for paging. >>>>>>> >>>>>>> But as said: all research is useful, even if we don't end up using >>>>>>> the exact original idea. >>>>>>> >>>>>>> Diego >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Wed, Feb 16, 2011 at 10:15, Niels Kühnel >>>>>>> <[email protected]>wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I'm new on this list, so I'm very sorry if I break all your rules... >>>>>>>> >>>>>>>> I have devised a new way for eager loading collections, collections' >>>>>>>> collections etc. at once without making the Cartesian product. I've >>>>>>>> done a proof of concept implementation but I think it's solid enough >>>>>>>> for further investigation >>>>>>>> >>>>>>>> The basic idea is to add a "fork" to the query to add the missing >>>>>>>> structure when two unrelated tables are joined as: >>>>>>>> >>>>>>>> SELECT A.ID AS A, Forks.Fork, B.ID AS B, C.ID AS C FROM A >>>>>>>> INNER JOIN (SELECT 1 AS Fork UNION ALL SELECT 2) AS Forks ON 1=1 >>>>>>>> LEFT JOIN B ON B.A = A.ID AND Forks.Fork = 1 >>>>>>>> LEFT JOIN C ON C.A = A.ID AND Forks.Fork = 2 >>>>>>>> >>>>>>>> While spending some time with eager loading I've also improved >>>>>>>> paging >>>>>>>> in Sql server 2005+/CE 4.0 so that you can get root entities, say, >>>>>>>> 10 >>>>>>>> to 20 even though the query has eagerly fetched collections. I >>>>>>>> haven't >>>>>>>> yet looked into if the same method can be used with other databases >>>>>>>> >>>>>>>> If you have the time, you can read more about my ideas at >>>>>>>> >>>>>>>> http://kuhnel.wordpress.com/2010/10/15/the-eager-fork-%E2%80%93-taming-cartesian-explosions-in-databases/ >>>>>>>> and >>>>>>>> >>>>>>>> http://kuhnel.wordpress.com/2011/02/13/paging-eager-fetched-nhibernate-collections-efficiently/ >>>>>>>> >>>>>>>> and some proof of concept code at >>>>>>>> https://bitbucket.org/nielskuhnel/eager-nh-extensions >>>>>>>> >>>>>>>> I still have to do a proper performance comparison against lazy >>>>>>>> loading and futures but my ad hoc results are encouraging. I've also >>>>>>>> studied execution plans in SQL Server and the "fork" tables doesn't >>>>>>>> seem to upset it in any way. >>>>>>>> It's implemented as a custom dialect so no changes to the core are >>>>>>>> needed per se. >>>>>>>> >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Niels Kühnel >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Fabio Maulo >>>>>> >>>>>> >>>>> >>>> >>> >> >> >> -- >> Fabio Maulo >> >> >
