Re: [nhibernate-development] implementing FetchMode.SubSelect per query, and improving it

nadav s Wed, 01 Sep 2010 11:22:28 -0700

about the subselect breaking changes: yeah i got that, and it's
unresolvable. i think it's better this way because i think it had a lot of
performance issues previously, but with all due respect to myself, i can't
be the one to make this decision :)


about the per session batching, seems much more reasonable and much easier
than per query

On Wed, Sep 1, 2010 at 9:19 PM, Diego Mijelshon <[email protected]> wrote:

> If you do implement a customizable batch-size, it should be per-session...
> Something like the existing SetBatchSize and EnableFilter...
>
> A possible API:
>
> session.EnableBatching(typeof(Foo), 100); //on Foo entity
> session.EnableBatching(typeof(Foo), "Bars", 50); //on the Foo.Bars collection
> session.EnableBatching("Foo.Bars", 50); //alternative
> session.DisableBatching(...); //with the same options
>
>     Diego
>
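A minimal sketch of how the per-session overrides proposed above could be stored and resolved. Everything here is hypothetical: `SessionBatchingOverrides`, its `Resolve` method, and the "Foo.Bars" key format are assumptions made for illustration, and none of them exist in NHibernate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder for per-session batch-size overrides (not an NHibernate type).
public sealed class SessionBatchingOverrides
{
    // Key is an entity name ("Foo") or a collection role ("Foo.Bars"); this format is an assumption.
    private readonly Dictionary<string, int> sizes = new Dictionary<string, int>();

    public void EnableBatching(Type entity, int batchSize)
    {
        Set(entity.Name, batchSize);
    }

    public void EnableBatching(Type entity, string collection, int batchSize)
    {
        Set(entity.Name + "." + collection, batchSize);
    }

    public void EnableBatching(string role, int batchSize)
    {
        Set(role, batchSize);
    }

    public void DisableBatching(string role)
    {
        sizes.Remove(role);
    }

    // Returns the session-level override if one is set, otherwise the mapped default.
    public int Resolve(string role, int mappedDefault)
    {
        int size;
        return sizes.TryGetValue(role, out size) ? size : mappedDefault;
    }

    private void Set(string role, int batchSize)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
        sizes[role] = batchSize;
    }
}
```

With this shape, a loader would ask `Resolve("Foo.Bars", mappedDefault)` whenever it is about to initialize a collection, so the mapping stays the fallback and the session only overrides what it was told to.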
>
>
> On Wed, Sep 1, 2010 at 14:38, nadav s <[email protected]> wrote:
>
>> of course i'll be doing the same work as batch size. i'm not setting out to
>> implement batch size all over again, but trying to allow it to be query
>> specific, meaning being able to look for the owners from a specific query, and
>> not all owners that are in the session (owners from a different query might be
>> there), and allowing it to be overridable
>>
>>
>> On Wed, Sep 1, 2010 at 8:09 PM, Fabio Maulo <[email protected]> wrote:
>>
>>> ah...
>>> take care with "using IDs is more efficient" because "subselect" does not
>>> suffer the problem of max-parameter (IIRC 2100 in msSQL)
>>>
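To illustrate the parameter-limit concern: an ID-based loader has to split the owner IDs into several queries once there are more IDs than the database allows parameters, while a subselect needs only one statement. The sketch below is an assumption-laden illustration; `IdChunker` is a hypothetical helper, and the limit is passed in by the caller rather than hard-coded, since the 2100 figure is quoted from memory in the thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of NHibernate.
public static class IdChunker
{
    // Splits ids into consecutive chunks of at most maxPerQuery elements,
    // one chunk per "where x in (...)" query.
    public static List<List<T>> Chunk<T>(IReadOnlyList<T> ids, int maxPerQuery)
    {
        if (ids == null) throw new ArgumentNullException(nameof(ids));
        if (maxPerQuery < 1) throw new ArgumentOutOfRangeException(nameof(maxPerQuery));

        var chunks = new List<List<T>>();
        for (int start = 0; start < ids.Count; start += maxPerQuery)
        {
            int length = Math.Min(maxPerQuery, ids.Count - start);
            var chunk = new List<T>(length);
            for (int i = 0; i < length; i++)
            {
                chunk.Add(ids[start + i]);
            }
            chunks.Add(chunk);
        }
        return chunks;
    }
}
```

For example, 5000 owner IDs with a cap of 2000 per query give three chunks of 2000, 2000 and 1000 IDs, which means three round trips instead of one.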
>>>
>>> On Wed, Sep 1, 2010 at 2:03 PM, nadav s <[email protected]> wrote:
>>>
>>>> great. thanks
>>>>
>>>>
>>>> On Wed, Sep 1, 2010 at 7:56 PM, nadav s <[email protected]> wrote:
>>>>
>>>>> you know the internals of nhibernate much much much better than me, and
>>>>> i won't get into an implementation argument with you, but it is possible to
>>>>> implement.
>>>>>
>>>>> with subselect (again i'm talking about subselect because i didn't do
>>>>> any research on the batch size, but i guess the idea is similar because it
>>>>> works the same, only batch size issues a good query and subselect issues
>>>>> an evil one), as i've noticed, there is a special one-to-many collection
>>>>> persister that knows, once the collection is accessed, to use a subselect
>>>>> batcher that loads the collections of all the owners that were returned by
>>>>> the initial query.
>>>>>
>>>>> if the persister could be set, or modified, for a specific
>>>>> instance of a collection, it would be possible: you could set
>>>>> the batch size\subselect for a specific query, which in turn would set
>>>>> a different persister for the collections whose persisters need
>>>>> modification, and then when a collection is accessed, the
>>>>> persister would do its thing.
>>>>>
>>>>> of course, i'm not sure that's the proper way of implementing it, but as
>>>>> an idea (tell the specific collections that are created for the entities of
>>>>> a specific query to do something other than the default), it is possible
>>>>>
>>>>>
>>>>> On Wed, Sep 1, 2010 at 7:49 PM, John Davidson <[email protected]> wrote:
>>>>>
>>>>>> I think nadav is saying that subselect from NHibernate is an issue,
>>>>>> but the implementation he is proposing will fix that problem
>>>>>>
>>>>>> John Davidson
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 1, 2010 at 12:46 PM, Fabio Maulo <[email protected]> wrote:
>>>>>>
>>>>>>> LOL!!
>>>>>>> Your first assertion : "btw, i don't really get what is the problem
>>>>>>> with subselect"
>>>>>>> Your second assertion : "the sub select is always inefficient"
>>>>>>>
>>>>>>> On Wed, Sep 1, 2010 at 1:42 PM, nadav s <[email protected]> wrote:
>>>>>>>
>>>>>>>> the sub select is always inefficient, especially when there is an
>>>>>>>> initial complex query (with sub queries in it), and it's a killer when
>>>>>>>> it's a two level tree (when fetching the grandchildren). fixing it was really
>>>>>>>> really easy, and i can't see any downside to it.
>>>>>>>>
>>>>>>>> different use cases in a web app:
>>>>>>>>
>>>>>>>> use case 1: sub select\batch size is NOT desired
>>>>>>>>
>>>>>>>>    the user searches for car companies by some criteria. the user
>>>>>>>>    will then choose (double click on a grid's row or something) one of the
>>>>>>>>    companies to see it in full detail. each company has one-to-many
>>>>>>>>    car types (mazda -> mazda 3, mazda 5, mazda 6...) and each
>>>>>>>>    car type will be displayed in its own tab, where at first the
>>>>>>>>    newest car type or the most expensive one (doesn't matter which) is selected.
>>>>>>>>    each car type has its models: mazda3 2008 isn't the same as 2010
>>>>>>>>    (i don't know that much about cars and i'm not sure the years are correct,
>>>>>>>>    but there are differences between the models).
>>>>>>>>
>>>>>>>>    the result: if carType.Models is mapped with some batch size, say
>>>>>>>>    10, the models of 10 of the car types are now fetched, although
>>>>>>>>    the user only views the models of one of the car types. if
>>>>>>>>    there can be lots of models for each car type, this slows the first tab
>>>>>>>>    and makes the other tabs faster, because their models are now
>>>>>>>>    loaded, but that's not what is desired, because the user is expected to
>>>>>>>>    click on only one of the other tabs or so.
>>>>>>>>
>>>>>>>>  use case 2: desired:
>>>>>>>>
>>>>>>>>     the user wants to see some custom developed report (ui that can be
>>>>>>>>     implemented with MRS/Cognos or any other reporting framework,
>>>>>>>>     and we have all kinds of reports that live up to this
>>>>>>>>     definition, and for some good reasons also). for the report the user
>>>>>>>>     searches for car companies by some criteria (some search form) and then
>>>>>>>>     expects to see the returned companies, paged of course, but with all
>>>>>>>>     of their car types, and for each car type, all of its
>>>>>>>>     models. here, a sub select or batch fetching is a must, or else we'll
>>>>>>>>     get a CP (cartesian product) with join fetching, or N^2 + 1 if we do
>>>>>>>>     regular lazy loading (like we wanted to do in the first situation).
>>>>>>>>
>>>>>>>> of course we can work around that, and that's exactly what we do,
>>>>>>>> using a generic mechanism that, for reports, eager fetches the association
>>>>>>>> it was asked to fetch with sub selects and not joins. for the regular
>>>>>>>> queries, it just uses the default, which is regular lazy.
>>>>>>>>
>>>>>>>> it would have been really really nice, if i could have set, for the
>>>>>>>> report query, query.SetFetchMode("CarTypes", FetchMode.SubSelect)
>>>>>>>> or if you will, query.SetBatchSize("CarTypes", 20)
>>>>>>>> and same for models
>>>>>>>> query.SetFetchMode("CarTypes.Models", FetchMode.SubSelect) or
>>>>>>>> query.SetBatchSize("CarTypes.Models", int.MaxValue).
>>>>>>>>
>>>>>>>> it must be max value because i want all the models, and can't
>>>>>>>> possibly know how many car types are going to be there. of course it won't
>>>>>>>> be a lot, because the "query" is going to use paging, but i don't really know
>>>>>>>> if it's 20, 40, or something else.
>>>>>>>>
>>>>>>>> batch size currently makes me choose between the use cases, slowing
>>>>>>>> down one of them, or makes me query and connect the associations myself.
>>>>>>>> same goes for sub select, which also issues an inefficient query for
>>>>>>>> CarTypes and a killer query for the Models.
>>>>>>>> before my fix it would have been:
>>>>>>>> select ...
>>>>>>>> from Models m
>>>>>>>> where m.CarTypeId in
>>>>>>>>    (select c.Id
>>>>>>>>     from CarTypes c
>>>>>>>>     where c.CompanyId in
>>>>>>>>             (select company.Id
>>>>>>>>              from Companies company
>>>>>>>>              where <could be some crazy criteria - this is the same
>>>>>>>>                     where clause of the very original query>))
>>>>>>>>
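For contrast with the nested query above, here is a sketch of what an ID-based alternative might look like, going by the description elsewhere in the thread of reusing the batch-size technique with already loaded IDs. The message describing the actual fix is cut off, so this is only one reading of it. `LoadedIdQuery` and the `@p0, @p1, ...` parameter naming are assumptions; the table and column names come from the example query, and the elided select list is kept elided.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of NHibernate.
public static class LoadedIdQuery
{
    // Builds the Models query with one parameter per already loaded CarType id,
    // instead of re-running the original where clause as a nested subselect.
    public static string ForModels(int loadedCarTypeCount)
    {
        if (loadedCarTypeCount < 1) throw new ArgumentOutOfRangeException(nameof(loadedCarTypeCount));

        string parameters = string.Join(
            ", ",
            Enumerable.Range(0, loadedCarTypeCount).Select(i => "@p" + i));

        return "select ... from Models m where m.CarTypeId in (" + parameters + ")";
    }
}
```

The trade-off is that the database no longer evaluates the original criteria a second and third time, but the statement now carries one parameter per loaded owner.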
>>>>>>>>
>>>>>>>>
>>>>>>>> (i was able to make itthe inefficiency of the query
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Sep 1, 2010 at 6:58 PM, Fabio Maulo <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I don't know what the problem is... you said that there is a
>>>>>>>>> problem and you want to change it using the same technique used by batch-size
>>>>>>>>> (using loaded ids) because subselect seems inefficient in some cases.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Sep 1, 2010 at 12:48 PM, nadav s <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> btw, i don't really get what is the problem with subselect, as it
>>>>>>>>>> lets you efficiently fetch a whole object graph for the N fathers that were
>>>>>>>>>> fetched in some query, in the most efficient way possible
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 1, 2010 at 6:46 PM, nadav s <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> i don't think it's that low priority, because it is actually a
>>>>>>>>>>> thing people expect to happen when they set a fetch mode to Eager. at least
>>>>>>>>>>> i've seen a lot of situations where people really thought that that's what's
>>>>>>>>>>> going to happen (later finding out it killed their query with a CP)
>>>>>>>>>>>
>>>>>>>>>>> about when it is helpful - exactly in the situations diego
>>>>>>>>>>> described. two use cases:
>>>>>>>>>>> in one of them you query the fathers and are gonna need only one of
>>>>>>>>>>> the fathers' collections, and in the other
>>>>>>>>>>> you're gonna need all of their collections.
>>>>>>>>>>> it gets more complicated when there are grandchildren involved,
>>>>>>>>>>> and in one of the situations you want the grandchildren of one of the
>>>>>>>>>>> children, and in the other situation, because you load an object graph,
>>>>>>>>>>> you're gonna need all of them.
>>>>>>>>>>>
>>>>>>>>>>> now, either you implement (similar to what diego said) the
>>>>>>>>>>> loading of the collections yourself, or you're gonna have to live with the
>>>>>>>>>>> batch size slowing down the first situation, where you would have preferred
>>>>>>>>>>> lazy loading without batching
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Sep 1, 2010 at 5:22 PM, Diego Mijelshon <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I have entities where batch loading helps in some use cases but
>>>>>>>>>>>> it loads lots of unneeded entities/collections in other complex use cases,
>>>>>>>>>>>> where I have many proxies but only use a few.
>>>>>>>>>>>> My current workaround is doing "manual batch loading" (i.e.
>>>>>>>>>>>> dummy query) in the cases where I need it.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be definitely a low-priority but nice-to-have feature.
>>>>>>>>>>>>
>>>>>>>>>>>>     Diego
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 1, 2010 at 10:12, Fabio Maulo <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It is possible for the batcher (INSERT, UPDATE, DELETE).
>>>>>>>>>>>>> I don't understand where it is useful for collection/relation
>>>>>>>>>>>>> batch-size.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Sep 1, 2010 at 9:37 AM, Diego Mijelshon <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Being able to override batch-size would be useful.
>>>>>>>>>>>>>> Implementing it requires messing with more than one part of the
>>>>>>>>>>>>>> infrastructure, though.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     Diego
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Fabio Maulo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Fabio Maulo
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Fabio Maulo
>>>
>>>
>>
>
