Re: How should we handle the JPQL DISTINCT keyword

Michael Dick Thu, 29 Oct 2009 10:39:23 -0700

Hi Pinaki,

Thanks for reading through my verbose problem description and for your
comments!


This particular issue turned up when working with a WebSphere customer and
they have been reluctant to use vendor specific options like FetchPlans in
the past. I can propose FetchPlans as a solution again, but I want to have a
vendor neutral answer ready if they push back.

As a developer I appreciate the flexibility of FetchPlans, but I've found
that using them in this manner results in a additional SQL statements. With
the simple example I posted above FetchPlans generate 4 SQL statements and
take slightly longer than the JOIN FETCH's 2 statements. One developer's
laptop does not make a valid benchmark though :-)

-mike

On Wed, Oct 28, 2009 at 9:11 PM, Pinaki Poddar <[email protected]> wrote:

>
> Hi Mike,
>  Very good description of the problem.
>  JOIN FETCH is one place where the assumption that a object-oriented query
> is same as a row-based query shows its strain. DISTINCT in JPQL referred to
> b, while DISTINCT in SQL referred to the row it selects -- and b is not a
> row but the root of an object graph!
>
>  SetBackedList is indeed a good idea.
>
>  Other option is to drop the JOIN FETCH clauses altogether from part of the
> query. But to add them to the FetchPlan. Then query
> JPQL: "SELECT b from Book b JOIN FETCH b.subjects WHERE b.title = 'Gone
> Sailing'"
> is effectively
>  "SELECT b from Book b WHERE b.title = 'Gone Sailing'"
>  fetch.add(Book.class, "subjects");
>
>  Do not how much trouble to tweak OpenJPA that way though!
>
>
>
>
> Michael Dick wrote:
> >
> > Hi all,
> >
> > Currently we handle the JPQL DISTINCT keyword as a 1:1 mapping to the SQL
> > DISTINCT keyword. So the following
> > JPQL:  "SELECT DISTINCT b FROM Book b WHERE b.title = 'Gone Sailing'"
> > may result in the following SQL
> > SQL : "SELECT DISTINCT t0.id, t0.dueDate, t0.title, . . . FROM Book t0
> > WHERE
> > t0.title = 'Gone Sailing'"
> >
> > This works fine for most queries, but when I was looking into OPENJPA-894
> > I
> > noticed a problem with some relationships and the JOIN FETCH clause. A
> > JOIN
> > FETCH looking like this :
> > JPQL: "SELECT b from Book b JOIN FETCH b.subjects WHERE b.title = 'Gone
> > Sailing'"
> > Has two effects :
> >    1. It eagerly loads b.subjects
> >    2. It returns (potentially) multiple references to the same book. One
> > reference to Book(id=1) for every subject associated with Book(id=1).
> >
> > The resulting SQL may look like this (in this case Book is MxM with
> > Subject)
> > :
> > SQL "SELECT t0.oid, t0.dueDate, t0.title, t1.BOOKS_OID, t2.oid, t2.name
> > FROM
> > LIBBOOK t0 INNER JOIN LIBSUBJECT_LIBBOOK t1 ON t0.oid = t1.BOOKS_OID
> INNER
> > JOIN LIBSUBJECT t2 ON t1.SUBJECTS_OID = t2.oid WHERE t0.title = 'Gone
> > Fishing'"
> >
> > Lets say that the Book with title = 'Gone Sailing' has two subjects :
> > Outdoors and Sportsman. In that case the SQL will return two rows that
> > look
> > something like this :
> >
> > OID    DUEDATE    TITLE          BOOKS_OID  OID  NAME
> > ----   ---------- ---------      ---------- ---- -----------
> > 3      2009-11-11 Gone Fishing   3          12   Outdoors
> > 3      2009-11-11 Gone Fishing   3          13   Sportsman
> >
> > The fix for OPENJPA-894 generates a result list with two references to
> the
> > same Book(id=3).
> >
> > If you only wanted eager fetching of b.subjects, and didn't want
> > duplicates
> > a good first guess would be to add the DISTINCT keyword (I'm finally
> > getting
> > back to the subject)
> >
> > JPQL : "SELECT DISTINCT b from Book b JOIN FETCH b.subjects WHERE b.title
> > =
> > 'Gone Sailing'"
> > SQL :  "SELECT DISTINCT t0.oid, t0.dueDate, t0.title, t1.BOOKS_OID,
> > t2.oid,
> > t2.name FROM LIBBOOK t0 INNER JOIN LIBSUBJECT_LIBBOOK t1 ON t0.oid =
> > t1.BOOKS_OID INNER JOIN LIBSUBJECT t2 ON t1.SUBJECTS_OID = t2.oid WHERE
> > t0.title = 'Gone Fishing'"
> >
> > We'll still get the same two rows because the SQL DISTINCT keyword is
> > applied to all permutations of the columns - not just the oid (and if it
> > did
> > only apply to the oid we wouldn't be eagerly loading b.subjects anyway).
> > As
> > a result I think we'll have to use a SetBackedList (or otherwise enforce
> > distinct results after getting rows from SQL) as our ResultList.
> >
> > I have a patch that does this, and resolves the use case I described
> above
> > (it's in the TestLibService unit test), but I'm open to any other ideas
> > for
> > ways to resolve the problem.
> >
> > Thanks,
> > -mike
> >
> >
>
>
> -----
> Pinaki
> --
> View this message in context:
> http://n2.nabble.com/How-should-we-handle-the-JPQL-DISTINCT-keyword-tp3908400p3909427.html
> Sent from the OpenJPA Developers mailing list archive at Nabble.com.
>

Re: How should we handle the JPQL DISTINCT keyword

Reply via email to