Knut Anders Hatlen (JIRA) wrote:
[ https://issues.apache.org/jira/browse/DERBY-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575373#action_12575373 ]
Knut Anders Hatlen commented on DERBY-3479:
-------------------------------------------
I noticed that other tests that depend on stable query plans (wisconsin and
StalePlansTest) set derby.storage.checkpointInterval=100000. I tried to add
this property to predicatePushdown_derby.properties, and that seemed to
stabilize the test.
StalePlansTest contains some magic to flush the row count for the tables. I don't
understand exactly what it does; in particular, I don't understand why "select
count(c1) from flusher" is needed to make the row count for tables other than
flusher visible. Perhaps changing the concurrency/timing in the buffer manager somehow
changes when the row count is flushed, and thereby makes the optimizer choose a
different plan?
Changed/unexpected query plan when running test 'lang/predicatePushdown.sql'
As I remember it, this is how row count estimates work. Deltas are
maintained per page resident in the cache rather than in one central
counter, so that every insert/delete in a table does not hit a single
point of contention. An update of the single centralized counter is
triggered in a number of ways:
1) language forces the update, usually as a result of doing a full scan
and thus knowing the exact row count.
2) The incremental changes in a single page are "significant", I think
more than 10% of the current row count.
3) When the page is written to disk (see CachedPage.writePage()).
Writes happen either when a page is forced from the cache as part of
replacement or when a checkpoint happens.
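
To make the three triggers above concrete, here is a toy model in Java. It is
only a sketch of the behavior as described in this mail, not Derby's actual
code: the class name, the method names, and the exact 10% threshold are all
assumptions made for illustration, and so is the choice to drop pending deltas
after a full scan.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a row count estimate fed by per-page deltas (hypothetical, not Derby code). */
class RowCountEstimate {
    // Assumption: a per-page delta is "significant" above 10% of the current estimate.
    private static final double SIGNIFICANT_FRACTION = 0.10;

    // The single centralized counter that the optimizer would read.
    private long estimate;

    // Deltas not yet applied to the counter, keyed by page number.
    private final Map<Integer, Long> pageDeltas = new HashMap<>();

    RowCountEstimate(long initialEstimate) {
        this.estimate = initialEstimate;
    }

    /** Record rows inserted (positive) or deleted (negative) on one cached page. */
    void rowsChanged(int pageNumber, long delta) {
        long pending = pageDeltas.merge(pageNumber, delta, Long::sum);
        // Trigger 2: the accumulated change on this single page is significant.
        if (Math.abs(pending) > SIGNIFICANT_FRACTION * estimate) {
            flushPage(pageNumber);
        }
    }

    /** Trigger 3: one page is written to disk, e.g. forced out of the cache. */
    void pageWritten(int pageNumber) {
        flushPage(pageNumber);
    }

    /** Trigger 3 as well: a checkpoint writes every dirty page. */
    void checkpoint() {
        for (long delta : pageDeltas.values()) {
            estimate += delta;
        }
        pageDeltas.clear();
    }

    /** Trigger 1: a full scan has seen the exact row count. */
    void fullScanSaw(long exactRowCount) {
        estimate = exactRowCount;
        pageDeltas.clear(); // assumption: pending deltas are superseded by the exact count
    }

    /** The value an optimizer would see right now. */
    long estimate() {
        return estimate;
    }

    private void flushPage(int pageNumber) {
        Long pending = pageDeltas.remove(pageNumber);
        if (pending != null) {
            estimate += pending;
        }
    }
}
```

In this model, two small inserts on different pages leave estimate() unchanged
until checkpoint() runs, so a query compiled before the checkpoint and one
compiled after it see different row counts for the same table.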
So I think variable query plans in the tests come from:
1) variable row counts because checkpoints take different amounts of
time on different machines. Often the tests do some sort of table setup
at the beginning, generating a lot of log and then a checkpoint, and
the checkpoint may still be happening in the background when the test has
gotten to the actual query testing phase. I assume the replacement
policy is not variable. So solutions include manipulating checkpoints so
that they do not happen during the test, or calling offline compress table
on all tables before the queries, which will update all statistics and
remove the dependency on any outstanding per-page row count updates
(offline compress creates new containers and recreates
indexes/statistics).
2) variable amount of time spent in the optimizer if the query is
complicated enough that it stops before costing all plans. A workaround
might be to just set the optimizer to not time out.
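
As a rough illustration of the offline compress suggestion in 1), a test setup
could run something like the following over JDBC before the queries under test.
SYSCS_UTIL.SYSCS_COMPRESS_TABLE is Derby's system procedure for offline
compress; the class name, method name, and the schema and table names passed in
are hypothetical and only there for the sketch.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical test helper: compress tables so row counts are exact before queries run. */
class PlanStabilizer {
    // Arguments: schema name, table name, sequential flag (SMALLINT).
    static final String COMPRESS_CALL = "CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)";

    /** Run offline compress on each named table in the given schema. */
    static void compressAll(Connection conn, String schema, List<String> tables)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(COMPRESS_CALL)) {
            for (String table : tables) {
                cs.setString(1, schema);
                cs.setString(2, table);
                cs.setShort(3, (short) 1); // non-zero selects sequential mode
                cs.execute();
            }
        }
    }
}
```

A test would call compressAll(conn, "APP", tables) once after loading its data,
where conn is an open connection to the test database and tables lists the
tables the queries touch.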
With the .sql tests, in my experience the problem is often that we don't
have a reasonable set of test data in the tables, so row estimates off
by 1 actually affect the plan. Usually both plans are reasonable; the
usual case is that which table ends up on the left or right does not
actually matter. As suggested, in this case the test probably does not
care about most of the plan, just that the predicate pushdown happened.
But it is sometimes interesting to understand why a new plan is being
consistently chosen as part of a new change.
----------------------------------------------------------------------------
Key: DERBY-3479
URL: https://issues.apache.org/jira/browse/DERBY-3479
Project: Derby
Issue Type: Bug
Components: Regression Test Failure
Affects Versions: 10.4.0.0
Environment: OS: Solaris 10 6/06 s10x_u2wos_09a X86 64bits - SunOS 5.10
Generic_118855-14
JVM: Sun Microsystems Inc., java version "1.6.0_04", Java(TM) SE Runtime
Environment (build 1.6.0_04-b12), Java HotSpot(TM) Client VM (build 10.0-b19, mixed mode)
Reporter: Ole Solberg
Seen in tinderbox since r631930.
See e.g.
http://dbtg.thresher.com/derby/test/tinderbox_trunk16/jvm1.6/testing/testlog/SunOS-5.10_i86pc-i386/631932-derbyall_diff.txt
:
*** Start: predicatePushdown jdk1.6.0_04 derbyall:derbylang 2008-02-28 14:02:49
***
9285 del
< Rows seen from the left = 20
9285a9285
> Rows seen from the left = 10
9297 del
< Rows seen from the right = 20
9297a9297
> Rows seen from the right = 10
9299 del
< Rows returned = 20
9299a9299
> Rows returned = 10
.
.
.