I've opened an bug report here:
https://issues.apache.org/jira/browse/DERBY-3603
Regards
David
On Mon, 2008-04-07 at 10:27 -0700, Bryan Pendleton wrote:
> David Butterworth wrote:
> > For anyone interested, the following select statements demonstrate the
> > inconsistencies
>
> Hi David,
>
> I was able to reproduce the odd results that you experience,
> using your database, and Derby 10.4.1.1 (still in beta testing).
>
> I simplified your queries slightly, to:
> select count(*) FROM spike.accounts account, spike.admin_units admin_unit,
> spike.bookings booking
> WHERE booking.child_id = 2 AND
> admin_unit.admin_unit_id IN (1,21) AND
> booking.booking_date_time_out >= 20080331000000 AND
> booking.booking_date_time_in <= 20080406235900 AND
> account.account_id = booking.account_id AND
> admin_unit.admin_unit_id = account.admin_unit_id
>
> versus the same thing, but with IN clause changed to:
>
> account.admin_unit_id IN (1,21) AND
>
> Interestingly, although the actual results do NOT contain any values
> for admin_unit_id = 21, if I change the query to:
>
> admin_unit.admin_unit_id IN (1)
> or
> account.admin_unit_id IN (1)
>
> then the problem disappears -- I get 3 rows for both queries.
>
> I also ran query plans for both the queries (in the IN (1,21) case)
> and have pasted the (simplified) query plans at the end of this message.
>
> I notice that in the case where the query gives 2 rows, which is
> when we specify admin_unit.admin_unit_id in (1,21), the admin_unit_id
> index scan output in the query plan contains:
>
> qualifiers:
> Column[0][0] Id: 0
> Operator: =
> Ordered nulls: false
> Unknown return value: false
> Negate comparison result: false
>
> However, in the case where the query gives 3 rows, which is
> when we specify account.admin_unit_id in (1,21), the admin_unit_id
> index scan output in the query plan contains:
>
> qualifiers:
> None
>
> I think it is the presence/absence of this qualifier on the query
> scan which is causing the different results in the query, as in
> the first case we see:
>
> Number of rows qualified=2
> Number of rows visited=3
>
> but in the second case we see:
>
> Number of rows qualified=3
> Number of rows visited=3
>
> I definitely don't have any explanation for why you are getting
> this odd behavior; it certainly seems like a bug to me.
>
> I think you should open an issue in JIRA, and provide as much
> information as possible about the queries and how to reproduce the
> problem. Ideally, if you can attach your sample database that would
> be great, or if you can provide a standalone script or
> test program that reproduces the problem from scratch that would
> be great too.
>
> Hopefully some other folks will have a chance to look at this, too,
> and offer their opinions on what is happening here.
>
> thanks,
>
> bryan
>
> ******* Query plan for the query that returns 2 rows:
>
> 2008-04-07 16:44:14.401 GMT Thread[main,5,main] (XID = 3585), (SESSIONID =
> 0), select count(*) FROM spike.accounts account, spike.admin_units
> admin_unit, spike.bookings booking WHERE booking.child_id
> = 2 AND admin_unit.admin_unit_id IN (1,21) AND booking.booking_date_time_out
> >= 20080331000000 AND booking.booking_date_time_in <= 20080406235900 AND
> account.account_id = booking.account_id AND
> admin_unit.admin_unit_id = account.admin_unit_id ******* Project-Restrict
> ResultSet (11):
> Number of opens = 1
> Rows seen = 1
> Rows filtered = 0
> restriction = false
> projection = true
> optimizer estimated row count: 1.00
> optimizer estimated cost: 42.65
>
> Source result set:
> Scalar Aggregate ResultSet:
> Number of opens = 1
> Rows input = 2
> optimizer estimated row count: 0.33
> optimizer estimated cost: 42.65
>
> Index Key Optimization = false
> Source result set:
> Project-Restrict ResultSet (10):
> Number of opens = 1
> Rows seen = 2
> Rows filtered = 0
> restriction = false
> projection = true
> optimizer estimated row count: 0.33
> optimizer estimated cost: 42.65
>
> Source result set:
> Nested Loop Exists Join ResultSet:
> Number of opens = 1
> Rows seen from the left = 3
> Rows seen from the right = 2
> Rows filtered = 0
> Rows returned = 2
> optimizer estimated row count: 0.33
> optimizer estimated cost: 42.65
>
> Left result set:
> Nested Loop Exists Join ResultSet:
> Number of opens = 1
> Rows seen from the left = 3
> Rows seen from the right = 3
> Rows filtered = 0
> Rows returned = 3
> optimizer estimated row count: 0.33
> optimizer estimated cost: 42.14
>
> Left result set:
> Project-Restrict ResultSet (6):
> Number of opens = 1
> Rows seen = 3
> Rows filtered = 0
> restriction = true
> projection = true
> optimizer estimated row count: 0.33
> optimizer estimated cost: 40.57
>
> Source result set:
> Index Row to Base Row ResultSet for BOOKINGS:
> Number of opens = 1
> Rows seen = 3
> Columns accessed from heap = {1, 2, 5, 6}
> optimizer estimated row count: 0.33
> optimizer estimated cost: 40.57
>
> Index Scan ResultSet for BOOKINGS using constraint
> bookings-children_fk at read committed isolation level using instantaneous
> share row locking chosen by the optimizer
> Number of opens = 1
> Rows seen = 3
> Rows filtered = 0
> Fetch Size = 16
>
> scan information:
> Bit set of columns fetched=All
> Number of columns fetched=2
> Number of deleted rows visited=1
> Number of pages visited=1
> Number of rows qualified=3
> Number of rows visited=4
> Scan type=btree
> Tree height=-1
> start position:
> >= on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> stop position:
> > on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> qualifiers:
> None
> optimizer estimated row count: 0.33
> optimizer estimated cost: 40.57
>
>
> Right result set:
> Index Row to Base Row ResultSet for ACCOUNTS:
> Number of opens = 3
> Rows seen = 3
> Columns accessed from heap = {11}
> optimizer estimated row count: 0.33
> optimizer estimated cost: 1.57
>
> Index Scan ResultSet for ACCOUNTS using constraint ACCOUNT_ID_PK
> at read committed isolation level using share row locking chosen by the
> optimizer
> Number of opens = 3
> Rows seen = 3
> Rows filtered = 0
> Fetch Size = 1
>
> scan information:
> Bit set of columns fetched=All
> Number of columns fetched=2
> Number of deleted rows visited=0
> Number of pages visited=3
> Number of rows qualified=3
> Number of rows visited=3
> Scan type=btree
> Tree height=1
> start position:
> >= on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> stop position:
> > on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> qualifiers:
> None
> optimizer estimated row count: 0.33
> optimizer estimated cost: 1.57
>
>
>
> Right result set:
> Index Scan ResultSet for ADMIN_UNITS using constraint
> ADMIN_UNIT_ID_PK at read committed isolation level using share row locking
> chosen by the optimizer
> Number of opens = 3
> Rows seen = 2
> Rows filtered = 0
> Fetch Size = 1
>
> scan information:
> Bit set of columns fetched={0}
> Number of columns fetched=1
> Number of deleted rows visited=0
> Number of pages visited=3
> Number of rows qualified=2
> Number of rows visited=3
> Scan type=btree
> Tree height=1
> start position:
> >= on first 1 column(s).
> Ordered null semantics on the following columns:
>
> stop position:
> > on first 1 column(s).
> Ordered null semantics on the following columns:
>
> qualifiers:
> Column[0][0] Id: 0
> Operator: =
> Ordered nulls: false
> Unknown return value: false
> Negate comparison result: false
>
> optimizer estimated row count: 0.33
> optimizer estimated cost: 0.51
>
>
>
> ******* Query plan for the query format that returns 3 rows:
>
> 2008-04-07 16:44:18.838 GMT Thread[main,5,main] (XID = 3587), (SESSIONID =
> 0), select count(*) FROM spike.accounts account, spike.admin_units
> admin_unit, spike.bookings booking WHERE booking.child_id
> = 2 AND account.admin_unit_id IN (1,21) AND booking.booking_date_time_out >=
> 20080331000000 AND booking.booking_date_time_in <= 20080406235900 AND
> account.account_id = booking.account_id AND
> admin_unit.admin_unit_id = account.admin_unit_id ******* Project-Restrict
> ResultSet (12):
> Number of opens = 1
> Rows seen = 1
> Rows filtered = 0
> restriction = false
> projection = true
> optimizer estimated row count: 1.00
> optimizer estimated cost: 42.65
>
> Source result set:
> Scalar Aggregate ResultSet:
> Number of opens = 1
> Rows input = 3
> optimizer estimated row count: 0.33
> optimizer estimated cost: 42.65
>
> Index Key Optimization = false
> Source result set:
> Project-Restrict ResultSet (11):
> Number of opens = 1
> Rows seen = 3
> Rows filtered = 0
> restriction = false
> projection = true
> optimizer estimated row count: 0.33
> optimizer estimated cost: 42.65
>
> Source result set:
> Nested Loop Exists Join ResultSet:
> Number of opens = 1
> Rows seen from the left = 3
> Rows seen from the right = 3
> Rows filtered = 0
> Rows returned = 3
> optimizer estimated row count: 0.33
> optimizer estimated cost: 42.65
>
> Left result set:
> Nested Loop Exists Join ResultSet:
> Number of opens = 1
> Rows seen from the left = 3
> Rows seen from the right = 3
> Rows filtered = 0
> Rows returned = 3
> optimizer estimated row count: 0.33
> optimizer estimated cost: 42.14
>
> Left result set:
> Project-Restrict ResultSet (6):
> Number of opens = 1
> Rows seen = 3
> Rows filtered = 0
> restriction = true
> projection = true
> optimizer estimated row count: 0.33
> optimizer estimated cost: 40.57
>
> Source result set:
> Index Row to Base Row ResultSet for BOOKINGS:
> Number of opens = 1
> Rows seen = 3
> Columns accessed from heap = {1, 2, 5, 6}
> optimizer estimated row count: 0.33
> optimizer estimated cost: 40.57
>
> Index Scan ResultSet for BOOKINGS using constraint
> bookings-children_fk at read committed isolation level using instantaneous
> share row locking chosen by the optimizer
> Number of opens = 1
> Rows seen = 3
> Rows filtered = 0
> Fetch Size = 16
>
> scan information:
> Bit set of columns fetched=All
> Number of columns fetched=2
> Number of deleted rows visited=1
> Number of pages visited=1
> Number of rows qualified=3
> Number of rows visited=4
> Scan type=btree
> Tree height=-1
> start position:
> >= on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> stop position:
> > on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> qualifiers:
> None
> optimizer estimated row count: 0.33
> optimizer estimated cost: 40.57
>
>
> Right result set:
> Project-Restrict ResultSet (9):
> Number of opens = 3
> Rows seen = 3
> Rows filtered = 0
> restriction = true
> projection = true
> optimizer estimated row count: 0.33
> optimizer estimated cost: 1.57
>
> Source result set:
> Index Row to Base Row ResultSet for ACCOUNTS:
> Number of opens = 3
> Rows seen = 3
> Columns accessed from heap = {11}
> optimizer estimated row count: 0.33
> optimizer estimated cost: 1.57
>
> Index Scan ResultSet for ACCOUNTS using constraint
> ACCOUNT_ID_PK at read committed isolation level using share row locking
> chosen by the optimizer
> Number of opens = 3
> Rows seen = 3
> Rows filtered = 0
> Fetch Size = 1
>
> scan information:
> Bit set of columns fetched=All
> Number of columns fetched=2
> Number of deleted rows visited=0
> Number of pages visited=3
> Number of rows qualified=3
> Number of rows visited=3
> Scan type=btree
> Tree height=1
> start position:
> >= on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> stop position:
> > on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> qualifiers:
> None
> optimizer estimated row count: 0.33
> optimizer estimated cost: 1.57
>
>
>
> Right result set:
> Index Scan ResultSet for ADMIN_UNITS using constraint
> ADMIN_UNIT_ID_PK at read committed isolation level using share row locking
> chosen by the optimizer
> Number of opens = 3
> Rows seen = 3
> Rows filtered = 0
> Fetch Size = 1
>
> scan information:
> Bit set of columns fetched={0}
> Number of columns fetched=1
> Number of deleted rows visited=0
> Number of pages visited=3
> Number of rows qualified=3
> Number of rows visited=3
> Scan type=btree
> Tree height=1
> start position:
> >= on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> stop position:
> > on first 1 column(s).
> Ordered null semantics on the following columns:
> 0
> qualifiers:
> None
> optimizer estimated row count: 0.33
> optimizer estimated cost: 0.51
>
>
>
>
>
>
>
>