Hello Bryan, query: select object_id from object_master where object_id in (? ..1000 times.. ?) time taken to prepare statement = 328 time taken execute statement = 31 time taken enumerate result set = 31813 (isn't this unusual? enumeration of results taking so much time?)
To answer your question regarding result set count for the previous query: 1069 Best regards, Arindam On Tue, Apr 7, 2009 at 7:54 PM, Bryan Pendleton <[email protected]>wrote: > I would like my query below to return within 100 millisecs. Please help me, >> > > Here's what Derby is doing with your query: > > 1) It scans the object_master_index looking for the object_id values in > your IN list. You provided 1,000 values. Derby sorts your list, finds the > lowest > value, and the highest value, and scans that section of the index. During > this > processing, Derby looks at 104,896 entries in the index, and discards > 103,896 > of them, resulting in 1,000 index entries. This requires touching 566 index > pages, > which I think are read somewhat randomly, and takes 30 seconds to process: > > > Rows seen = 104896 > > Rows filtered = 103896 > > next time (milliseconds) = 31769 > > > > Source result set: > > Index Scan ResultSet for OBJECT_MASTER using > > Rows seen = 104896 > > Number of pages visited=566 > > Honestly, I'm not quite sure why this step takes 30 seconds. This is a > puzzle to me. > > 2) Derby then takes those 1,000 rows, and joins them against > object_category_mapping > using the object_category_mapping index. This results in 19,000 rows in the > result, > which I guess means that each object belongs to 19 categories? This > requires > performing 1,000 probes into the index, which each touch about 4 pages on > average: > > > Index Scan ResultSet for OBJECT_CATEGORY_MAPPING > > using index OBJECT_CATEGORY_MAPPING_INDEX at read committed isolation > level > > Number of opens = 1000 > > Rows seen = 19000 > > Number of pages visited=4119 > > 3) The 19,000 rows resulting from the join are then sorted and grouped by > category_name, and the category_count is computed. This requires an > external sort, and results in 1084 category_name groups: > > > Grouped Aggregate ResultSet: > > Rows input = 19000 > > Sort information: > > Number of merge runs=1 > > Number of rows input=19000 > > Number of rows output=1084 > > Size of merge runs=[18220] > > Sort type=external > > 4) lastly, the result is sorted in order to return the results ordered by > count. > This is an small internal sort and is very fast. > > > Sort ResultSet: > > Sort information: > > Number of rows input=1068 > > Number of rows output=1068 > > Sort type=internal > > I'm a little confused about why the grouping appears to produce 1,084 rows, > but the order by only sees 1068. How many rows are in your final result? > > This query plan, overall, seems pretty good to me. It doesn't look like > Derby > did anything frightenly stupid. > > However, I am confused as to why the initial scan of the OBJECT_MASTER for > the > object_id values in the IN clause took 30 seconds. > > What happens if you shrink your query down, and just do: > > select object_id from object_master where object_id in (?,...,?) > > and pass it the exact same 1,000 values for the IN clause. Does that query > by itself also take 30 seconds? > > thanks, > > bryan > > >
