> Yes you can, but I think you are introducing a breaking change if you are
> going for:
> where c.FooId in (p1, p2, p3, p4..... pn)
> Subselect does not suffer the 2100 parameters limit, with your patch it
> will have this new limitation.
Yep, paging is a problem. We have the same feature, called
'parameterized prefetch paths', which works roughly like this: you have a
threshold T, which is, say, 100 by default. If there are < T parent IDs, it
switches to a where x.FkField in (p1, p2, ...) query for the children, and
if >= T, it uses a subquery.
Paging, however, doesn't work with a subquery; you have to use the
parameterized variant with eager loading. This isn't that hard to achieve,
though: just keep the page size < T.
In NH land, the 'T' is the batch size, but it's not configurable at
runtime, correct? So you can't set it per query. The right value depends on
whether a subquery is faster than the parameterized variant. Speed tests on
SQL Server suggest that once you hit 100 or more parameters a subquery is
quicker, but complex subqueries can shift that number, so a configurable
batch size is highly useful.
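
A rough sketch of that switch in C#, only to illustrate the idea. Everything
in it is hypothetical: the ChildFetchPlanner class, BuildChildQuery, the
Child/FkField names and the SQL text are made up for this mail and are
neither NHibernate's API nor our actual implementation. It also ignores any
parameters the parent subquery itself may carry.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: picks the child-fetch strategy from the number of parent IDs.
public static class ChildFetchPlanner
{
    // Assumed default for the threshold T (100, as in the description above).
    public const int DefaultThreshold = 100;

    // Returns the SQL text plus the ID values to bind as parameters.
    // parentSubquery is the already built "select Id from ... where ..." text
    // of the original parent query.
    public static (string Sql, IReadOnlyList<object> Parameters) BuildChildQuery(
        IReadOnlyList<int> parentIds, string parentSubquery, int threshold = DefaultThreshold)
    {
        if (parentIds == null) throw new ArgumentNullException(nameof(parentIds));

        if (parentIds.Count > 0 && parentIds.Count < threshold)
        {
            // Fewer than T parents: where x.FkField in (@p0, @p1, ...)
            var names = Enumerable.Range(0, parentIds.Count).Select(i => "@p" + i);
            var sql = "select * from Child x where x.FkField in ("
                      + string.Join(", ", names) + ")";
            return (sql, parentIds.Cast<object>().ToList());
        }

        // T or more parents (or no IDs at hand): reuse the parent query as a
        // subquery, so no ID parameters are needed at all.
        return ("select * from Child x where x.FkField in (" + parentSubquery + ")",
                Array.Empty<object>());
    }
}
```

With three parent IDs and the default threshold this yields
"select * from Child x where x.FkField in (@p0, @p1, @p2)"; with 100 or more
IDs it falls back to the subquery form. A paged parent query whose page size
is below T therefore always takes the parameterized branch, and as long as T
stays well under 2100 the parameter limit is never hit.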
FB
>
>
> On Wed, Sep 1, 2010 at 2:56 PM, nadav s <[email protected]> wrote:
>
>
> before starting to work on the batch size thing, i want to apply a
> patch for making subselect more sane, because that work is already done.
> the problem is, in the old way, if the first query used paging, the
> subselect ignored the paging, fetching all the children of the fathers the
> first query would have returned if it hadn't used paging.
>
> there is a test called SubselectFetchWithLimit which creates 3
> parents, fetches only 2 parents, initializes their collections, then
> fetches the third parent and expects its collection to already be
> initialized (although it wasn't returned by the paged query).
>
> my improvement breaks this test, because paging is now taken into
> consideration when fetching by ids, which i think is the much more correct
> way to go.
>
> so the assert with the comment
> // The test for True is the test of H3.2
> now breaks. can i change the test intentionally so that subselect will
> consider paging?
>
>
> On Wed, Sep 1, 2010 at 8:38 PM, nadav s <[email protected]> wrote:
>
>
> of course i'll be doing the same work as batch size, i'm not
> set to implement batch size all over again, but trying to allow it to be
> query specific, meaning, being able to look for owners of a specific
> query, and not all owners that are in the session (owners from different
> queries might be there), and allowing it to be overridable
>
>
> On Wed, Sep 1, 2010 at 8:09 PM, Fabio Maulo
> <[email protected]> wrote:
>
>
> ah...
> take care with "using IDs is more efficient" because
> "subselect" does not suffer the problem of max-parameter (IIRC 2100 in
> msSQL)
>
>
> On Wed, Sep 1, 2010 at 2:03 PM, nadav s
> <[email protected]> wrote:
>
>
> great. thanks
>
>
> On Wed, Sep 1, 2010 at 7:56 PM, nadav s
> <[email protected]> wrote:
>
>
> you know the internals of nhibernate much much much better than me, and
> i won't get into an implementation argument with you, but it is possible
> to implement.
>
> with subselect (again i'm talking about subselect because i didn't do
> any research on the batch size, but i guess the idea is similar because
> it works the same, only batch size issues a good query and subselect
> issues an evil one), as i've noticed, there is a special one-to-many
> collection persister that knows, once the collection is accessed, to use
> a subselect batcher that loads the collections of all the owners that
> were returned by the initial query.
>
> if the persister could be set, or modified, for a specific instance of
> a collection, it would be possible - you could set the batch
> size\subselect for a specific query, which in turn would set a different
> persister for the collections whose persisters need modification, and
> then when a collection is accessed, the persister would do its thing.
>
> of course, i'm not sure that's the proper way of implementing it, but
> as an idea - tell the specific collections that are created for the
> entities of a specific query to do something other than the default - it
> is possible
>
>
> On Wed, Sep 1, 2010 at 7:49 PM, John
> Davidson <[email protected]> wrote:
>
>
> I think nadav is saying that
> subselect from NHibernate is an issue, but the implementation he is
> proposing will fix that problem
>
> John Davidson
>
>
> On Wed, Sep 1, 2010 at 12:46 PM, Fabio Maulo <[email protected]> wrote:
>
>
> LOL!!
> Your first assertion: "btw, i don't really get what is the problem with
> subselect"
> Your second assertion: "the sub select is always inefficient"
>
> On Wed, Sep 1, 2010 at 1:42 PM, nadav s <[email protected]> wrote:
>
>
> the sub select is always inefficient, especially when there is an
> initial complex query (with sub queries in it), and it's a killer when
> it's a two level tree (when fetching the grandchildren). fixing it was
> really really easy, and i can't see any downside to it.
>
> different use cases in a web app:
>
> use case 1: sub select\batch size is NOT desired
>
> the user searches for car companies by some criteria. the user will
> then choose (double click on a grid's row or something) one of the
> companies to see it in full detail. each company has one-to-many car
> types (mazda -> mazda 3, mazda 5, mazda 6...) and each car type will be
> displayed in its own tab, where at first the newest car type or the most
> expensive one, doesn't matter, is selected. each car type has its
> models, mazda3 2008 isn't the same as 2010 (i don't know that much about
> cars and not sure the years are correct, but there are differences
> between the models).
>
> the result: if carType.Models is mapped with some batch size, say 10,
> the models of 10 of the car types are now fetched, although the user
> only looks at the models of one of the car types. if there could be lots
> of models for each car type, it slows the first tab and makes the other
> tabs faster, because their car types are now loaded, but it's not what
> is desired, because the user is expected to click on only one of the
> other tabs or something.
>
> use case 2: desired:
>
> the user wants to see some custom developed report (ui that can be
> implemented with MRS/Cognus or any other reporting framework, and we
> have all kinds of reports that live up to this definition, and for some
> good reasons also). for the report the user searches for car companies
> by some criteria (some search form) and then expects to see the returned
> companies, paged of course, but with all of their car types, and for
> each of the car types - all of its models. here, a sub select or batch
> fetching is a must, or else we'll get a cartesian product (CP) with join
> fetching, or N^2 + 1 if we do regular lazy loading (like we wanted to do
> in the first situation).
>
> of course we can work around that, and that's exactly what we do, using
> a generic mechanism that, for reports, eager fetches the associations it
> was asked to fetch with sub selects and not joins. for the regular
> queries, it just uses the default, which is regular lazy.
>
> it would have been really really nice, if i could have set, for the
> report query,
> query.SetFetchMode("CarTypes", FetchMode.SubSelect)
> or if you will,
> query.SetBatchSize("CarTypes", 20)
> and same for models
> query.SetFetchMode("CarTypes.Models", FetchMode.SubSelect) or
> query.SetBatchSize("CarTypes.Models", int.MaxValue).
>
> it must be max value because i want all the models, and can't possibly
> know how many car types are going to be there. of course it won't be a
> lot, because the "query" is going to use paging, but i don't really know
> if it's 20, 40, or something else.
>
>
> batch size currently makes me choose between the use cases, slowing
> down one of them, or makes me query and connect the associations myself.
> same goes for sub select, which also issues an inefficient query for
> CarTypes and a killer query for the Models.
> before my fix it would have been:
> select ...
> from Models m
> where m.CarTypeId in
>   (select c.Id
>    from CarTypes c
>    where c.CompanyId in
>      (select company.Id
>       from Companies company
>       where <could be some crazy criteria - this is the same where clause
>              of the very original query>))
>
>
>
> (i was able to make itthe inefficiency of the query
>
>
>
>
> On Wed, Sep 1, 2010 at 6:58 PM, Fabio Maulo <[email protected]> wrote:
>
>
> I don't know which is the problem... you said that there is a problem
> and you want to change it using the same tech used by batch-size (using
> uploaded ids) because subselect seems inefficient in some cases.
>
>
> On Wed, Sep 1, 2010 at 12:48 PM, nadav s <[email protected]> wrote:
>
>
>
> btw, i don't really get what is the problem with subselect, as it lets
> you efficiently fetch a whole object graph for the N fathers that were
> fetched in some query, in the most efficient way possible
>
>
>
> On Wed, Sep 1, 2010 at 6:46 PM, nadav s <[email protected]> wrote:
>
>
>
> i don't think it's that low priority, because it is actually a thing
> people expect to happen when they set a fetch mode to Eager. at least
> i've seen a lot of situations where people really thought that that's
> what's going to happen (later finding out it killed their query with CP)
>
>
> about when it is helpful - exactly in the situations diego described.
> two use cases:
>
> in one of them you query the fathers and you're gonna need only one of
> the fathers' collections, and in the other you're gonna need all of
> their collections.
>
> it gets more complicated when there are grandchildren involved, and
> in one of the situations you want the grandchildren of one of the
> children, and in the other situation, because you load an object graph,
> you're gonna need all of them.
>
>
> now, either you implement (similar to what diego said) the loading of
> the collections yourself, or you're gonna have to live with the batch
> size slowing down the first situation, where you would have preferred
> lazy loading without batching
>
>
>
> On Wed, Sep 1, 2010 at 5:22 PM, Diego Mijelshon
> <[email protected]> wrote:
>
>
>
> I have entities where batch loading helps in some use cases but it
> loads lots of unneeded entities/collections in other complex use cases,
> where I have many proxies but only use a few.
>
> My current workaround is doing "manual batch loading" (i.e. dummy
> query) in the cases where I need it.
>
>
> It would be definitely a low-priority but nice-to-have feature.
>
>
>
> Diego
>
>
>
>
> On Wed, Sep 1, 2010 at 10:12, Fabio Maulo <[email protected]>
> wrote:
>
>
>
>
> It is possible for batcher (INSERT, UPDATE, DELETE).
>
> I don't understand where it is useful for collection/relations
> batch-size.
>
>
>
> On Wed, Sep 1, 2010 at 9:37 AM, Diego Mijelshon
> <[email protected]> wrote:
>
>
>
>
> Being able to override batch-size would be useful.
> Implementing it requires messing with more than one part of the
> infrastructure, though.
>
>
>
> Diego
>
>
>
>
>
>
>
> --
> Fabio Maulo