Re: Derby SQL ideas needed

Alexey Serba Sun, 19 Sep 2010 09:05:56 -0700

> The other thing is that we cannot afford to use the same "table"
> twice, as it is actually an extremely expensive query in its own
> right, with multiple joins, select distinct's, etc. under the covers.
Even if you create indexes on bucket and activitycount columns? It
might be that the query plans for these two queries (with "distinct
on" hack and subquery max/subquery order limit/join) would be the
same.


> I'd be happy to post it but it may shock you. ;-)
The way I indent SQL queries should say that I'm not afraid of
multipage queries :)

>
> Karl
>
>
>
>
>
> On Sun, Sep 19, 2010 at 11:32 AM, Alexey Serba <ase...@gmail.com> wrote:
>>> SELECT DISTINCT ON (idbucket) t3.bucket AS idbucket, t3.activitycount AS 
>>> activitycount, t3.windowstart AS starttime, t3.windowend AS endtime FROM 
>>> (...) t3
>> Do you have primary key in your t3 table?
>>
>>> In Postgresql, what this does is to return the FIRST entire row matching 
>>> each distinct idbucket result.
>> FIRST based on which sort?
>>
>> Lets say you want to return FIRST row based on t3.windowstart column
>> and you have primary key in t3 table. Then I believe your query can be
>> rewritten in the following ways:
>>
>> 1. Using subqueries
>> SELECT
>>    bucket, primary_key, windowstart, etc
>> FROM
>>    table AS t1
>> WHERE
>>    windowstart=( SELECT max(windowstart) FROM table AS t2 WHERE
>> bucket = t1.bucket )
>>
>> 2. Using joins instead of subqueries ( in case Derby doesn't support
>> subqueries - not sure about that )
>> SELECT
>>    t1.bucket, t1.primary_key, windowstart, etc
>> FROM
>>    table AS t1
>>    LEFT OUTER JOIN table AS t2 ON ( t1.bucket = t2.bucket AND
>> t2.windowstart > t1.windowstart )
>> WHERE
>>    t2.primary_key IS NULL
>>
>> HTH,
>> Alex
>>
>> On Sat, Sep 18, 2010 at 2:28 PM, Karl Wright <daddy...@gmail.com> wrote:
>>> Hi Folks,
>>>
>>> For two of the report queries, ACF uses the following Postgresql
>>> construct, which sadly seems to have no Derby equivalent:
>>>
>>> SELECT DISTINCT ON (idbucket) t3.bucket AS idbucket, t3.activitycount
>>> AS activitycount, t3.windowstart AS starttime, t3.windowend AS endtime
>>> FROM (...) t3
>>>
>>> In Postgresql, what this does is to return the FIRST entire row
>>> matching each distinct idbucket result.  If Derby had a "FIRST()"
>>> aggregate function, it would be the equivalent of:
>>>
>>> SELECT t3.bucket AS idbucket, FIRST(t3.activitycount) AS
>>> activitycount, FIRST(t3.windowstart) AS starttime, FIRST(t3.windowend)
>>> AS endtime FROM (...) t3 GROUP BY t3.bucket
>>>
>>> Unfortunately, Derby has no such aggregate function.  Furthermore, it
>>> would not be ideal if I were to do the work myself in ACF, because
>>> this is a resultset that needs to be paged through with offset and
>>> length, for presentation to the user and sorting, so it gets wrapped
>>> in another SELECT ... FROM (...) ORDER BY ... OFFSET ... LIMIT ...
>>> that does that part.
>>>
>>> Does anyone have any ideas and/or Derby contacts?  I'd really like the
>>> quick-start example to have a functional set of reports.
>>>
>>> Karl
>>>
>>
>

Re: Derby SQL ideas needed

Reply via email to