On 02.04.12 14:28, Bergquist, Brett wrote:
I did debug this yesterday and see what happens but don't know how to fix it yet. When the problem occurs
and "updateIndexStatsMinion" is invoked, there are 3 entries in the
"statisticsDescriptorList" property (two are valid and one is the invalid one) and 3 entries in
"conglomerateDescriptorList" of the TableDescriptor, but one of those is for the table conglomerate
itself and skipped in processing. So the statistics are rewritten for the 2 index conglomerates, but the
invalid one is missed.
So if the code were to detect this and remove the invalid one, the problem clears up. I did hack
in forcefully removing all statistics for the table regardless of the index and this in fact did
clear up the problem, but it needs to be made more elegant that this. I believe that the
"updateIndexStatsMinion" is only called with the conglomerates that needs statistics
updating, so removing all of the statistics at the start is probably not good. Maybe comparing the
entries in the "statisticsDescriptorList" with the index conglomerates of the table and
removing the statistics for any that don't have a corresponding index conglomerate is a valid
solution.
Hi Brett,
Just attached a prototype patch to DERBY-5680. It may need some more
work. The approach is what you suggest above. I'm only dealing with the
statistics for the base table being processed - a broader fix could
maybe be added to the update phase if someone feels that is appropriate.
NOTE: I haven't properly tested this patch yet (I also did some
last-minute changes before posting...).
--
Kristian
Brett
________________________________________
From: Kristian Waagan [[email protected]]
Sent: Monday, April 02, 2012 3:52 AM
To: [email protected]
Subject: Re: Question on why indexStat deamon is being triggered
On 31.03.12 22:28, Mike Matrigali wrote:
Bergquist, Brett wrote:
I hitched up the debugger and caught the indexStat return "49" for the
number of rows. I walked the stack back and found this was for
conglomeratename = "f3ec4922-011d-491f-3d8c-0000376d74d3"
So I did this query:
select
c.TABLEID,
c.CONGLOMERATENUMBER,
c.CONGLOMERATENAME,
c.ISINDEX,
c.ISCONSTRAINT,
c.CONGLOMERATEID,
t.TABLEID,
t.TABLENAME,
t.TABLETYPE,
s.STATID,
s.REFERENCEID,
s.TABLEID,
s.CREATIONTIMESTAMP,
s.TYPE,
s.VALID,
s.COLCOUNT,
CAST(STATISTICS AS VARCHAR(40)) as STATISTICS
from sys.SYSCONGLOMERATES c join sys.SYSTABLES t on c.TABLEID =
t.TABLEID join sys.SYSSTATISTICS s on s.TABLEID = t.TABLEID where
c.CONGLOMERATENAME = 'f3ec4922-011d-491f-3d8c-0000376d74d3'
Which is for one of the tables that I am having problems with. This
returned:
TABLEID CONGLOMERATENUMBER CONGLOMERATENAME ISINDEX ISCONSTRAINT
CONGLOMERATEID TABLEID TABLENAME TABLETYPE STATID REFERENCEID TABLEID
CREATIONTIMESTAMP TYPE VALID COLCOUNT STATISTICS
f3ec4922-011d-491f-3d8c-0000376d74d3 30768
f3ec4922-011d-491f-3d8c-0000376d74d3 false false
0e34c923-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 BC01_CONFIGURATION_SET T
645c405f-0136-6999-c1b4-000065089f97
2c5f8294-012b-3c38-b55c-000043ea6398
f3ec4922-011d-491f-3d8c-0000376d74d3 2012-03-31 12:50:05.348 I true 1
numunique= 20334 numrows= 20334
f3ec4922-011d-491f-3d8c-0000376d74d3 30768
f3ec4922-011d-491f-3d8c-0000376d74d3 false false
0e34c923-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 BC01_CONFIGURATION_SET T
49fc4ea2-0129-489a-0a26-00000732b350
592fcc26-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 2010-06-17 23:16:40.018 I true 1
numunique= 49 numrows= 49
f3ec4922-011d-491f-3d8c-0000376d74d3 30768
f3ec4922-011d-491f-3d8c-0000376d74d3 false false
0e34c923-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 BC01_CONFIGURATION_SET T
2c44c05e-0136-6999-c1b4-000065089f97
e9a40921-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 2012-03-31 12:50:03.427 I true 1
numunique= 20334 numrows= 20334
So there seems to be 3 statistics rows for this table. And low and
behold there is the "numrows = 49" which I see the indexStat when I
turn on tracing.
First thought is I didn't think we needed any statistics at all
for a single column primary key.
That's probably DERBY-3790 ([1]).
I'll see if I can write a quick patch proposal and then we can take it
from there.
There should be entries in this catalog for every index on a table, so
it is not necessarily expected to be one per table. I think for each
index there is an entry for each column in the index (actually the stat
applies to multiple columns so in a 3 column index I think there is an
entry for (col1), (col1,col2), and (col1, col2, col3). But looking at
your ddl I would only expect 1 entry, as there is one
index on the table, and it has one column. I am not sure what the
expectation is for the foreign key that matches the primary key, maybe
it is 2 rows?
I do wonder if the foreign key is causing some extra issues. I think
in the long past we would end up creating 2 indexes in this case and
they would be exactly the same. So functionality was added to just
"logically" create the index and it would rely on the underlying
physical index sort of "shared" by the primary key on id and the foreign
constraint on id. We have had bugs in this area in the past and wonder
if it left an "orphan" row in the statistics.
As mentioned in another post, I think there's an easy way to avoid
scanning the same index twice, but I need to make sure the check is
valid first.
I'm thinking of simply checking if we've seen the index conglomerate
number already, and if so, ignore that index. The question is, can two
truly different indexes be represented by the same conglomerate number?
Interesting that the second column it looks like it was created in 2010.
seems like there might be multiple problems here. It would be great to
figure out how the multiple rows got in there in the first place. I
would suggest seeing what the system does for ddl in a brand new
database, turn off index stat thread, run your ddl, add some rows to the
table, and hand run update statistics procedure.
Just for information, the istat daemon and the system procedure share
almost all of the core code at the moment. The addition of automatic
updates of index cardinality statistics consisted mainly of adding the
"daemon framework" and triggering logic.
I would not be surprised if existing code does not expect the number of
rows you are seeing. If this is really a buggy orphan row, probably best
zero admin fix is to change the update statistics code to look for
other rows and delete them.
you can follow the update statistic code starting at:
java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction/updateStatistics
There may be a index stat daemon bug here too, but I think it might just
be getting confused by the bad data in the catalog. Maybe it just keeps
reading all the rows and queueing work, but calling update statistics
only update 2 of the 3 rows.
Yes, the triggering code is driven by the existing statistics for the
table, whereas the update code is driven by the list of indexes on the
table.
--
Kristian
[1] https://issues.apache.org/jira/browse/DERBY-3790
< snip>