Re: Derby SQL ideas needed

2010-09-19 Thread Alexey Serba
SELECT DISTINCT ON (idbucket) t3.bucket AS idbucket, t3.activitycount AS activitycount, t3.windowstart AS starttime, t3.windowend AS endtime FROM (...) t3 Do you have primary key in your t3 table? In Postgresql, what this does is to return the FIRST entire row matching each distinct

Re: Derby SQL ideas needed

2010-09-19 Thread Karl Wright
FIRST based on which sort?? First based on the existing sort, which is crucial, because the sort is by bucket ASC, activitycount DESC. I'm looking for the row with the highest activitycount, per bucket. The other thing is that we cannot afford to use the same table twice, as it is actually an

Re: Derby SQL ideas needed

2010-09-19 Thread Alexey Serba
The other thing is that we cannot afford to use the same table twice, as it is actually an extremely expensive query in its own right, with multiple joins, select distinct's, etc. under the covers. Even if you create indexes on bucket and activitycount columns? It might be that the query plans

Re: Derby SQL ideas needed

2010-09-19 Thread Karl Wright
Here you go: // The query we will generate here looks like this: // SELECT * // FROM // (SELECT DISTINCT ON (idbucket) t3.bucket AS idbucket, t3.bytecount AS bytecount, // t3.windowstart AS starttime, t3.windowend AS endtime //

Re: Derby SQL ideas needed

2010-09-19 Thread Karl Wright
Looking at your proposal: SELECT bucket, primary_key, windowstart, etc FROM table AS t1 WHERE windowstart=( SELECT max(windowstart) FROM table AS t2 WHERE bucket = t1.bucket ) ... we'd be looking actually for something more like this: SELECT t1.bucket, t1.bytecount, t1.windowstart,

Re: Derby SQL ideas needed

2010-09-19 Thread Alexey Serba
You can also try ORDER BY bytecount DESC LIMIT 1 instead of aggregate function max, i.e. SELECT t1.bucket, t1.bytecount, t1.windowstart, t1.windowend FROM (xxx) t1 WHERE t1.bytecount=( SELECT t2.bytecount FROM (xxx) t2 WHERE t2.bucket = t1.bucket ORDER BY t2.bytecount DESC LIMIT 1 )

Re: Derby SQL ideas needed

2010-09-19 Thread Alexey Serba
And all of this is only with single table repohistory, right? Is this some kind of complex analytics/stats? On Sun, Sep 19, 2010 at 8:48 PM, Karl Wright daddy...@gmail.com wrote: Here you go:    // The query we will generate here looks like this:    // SELECT *    //   FROM    //    

Re: Derby SQL ideas needed

2010-09-19 Thread Karl Wright
Yes. This is for the Max Activity and Max Bandwidth reports. Karl On Sun, Sep 19, 2010 at 2:13 PM, Alexey Serba ase...@gmail.com wrote: And all of this is only with single table repohistory, right? Is this some kind of complex analytics/stats? On Sun, Sep 19, 2010 at 8:48 PM, Karl Wright

Re: Derby SQL ideas needed

2010-09-18 Thread Karl Wright
For what it's worth, defining a Derby function seems like the only way to do it. These seem to call arbitrary java that can accept a query as an argument and return a resultset as the result. But in order to write such a thing I will need the ability to call Derby at a java level, I think,

Re: Derby SQL ideas needed

2010-09-18 Thread Karl Wright
The Derby table-result function syntax requires all output columns to be declared as part of the function definition, and more importantly it does not seem to allow calls into Derby itself to get results. So this would not seem to be a viable option for that reason. Back to square 1, I guess.