On 31.03.12 22:28, Mike Matrigali wrote:
Bergquist, Brett wrote:
I hitched up the debugger and caught the indexStat return "49" for the
number of rows. I walked the stack back and found this was for
conglomeratename = "f3ec4922-011d-491f-3d8c-0000376d74d3"

So I did this query:

select
c.TABLEID,
c.CONGLOMERATENUMBER,
c.CONGLOMERATENAME,
c.ISINDEX,
c.ISCONSTRAINT,
c.CONGLOMERATEID,
t.TABLEID,
t.TABLENAME,
t.TABLETYPE,
s.STATID,
s.REFERENCEID,
s.TABLEID,
s.CREATIONTIMESTAMP,
s.TYPE,
s.VALID,
s.COLCOUNT,
CAST(STATISTICS AS VARCHAR(40)) as STATISTICS
from sys.SYSCONGLOMERATES c join sys.SYSTABLES t on c.TABLEID =
t.TABLEID join sys.SYSSTATISTICS s on s.TABLEID = t.TABLEID where
c.CONGLOMERATENAME = 'f3ec4922-011d-491f-3d8c-0000376d74d3'

Which is for one of the tables that I am having problems with. This
returned:

TABLEID CONGLOMERATENUMBER CONGLOMERATENAME ISINDEX ISCONSTRAINT
CONGLOMERATEID TABLEID TABLENAME TABLETYPE STATID REFERENCEID TABLEID
CREATIONTIMESTAMP TYPE VALID COLCOUNT STATISTICS
f3ec4922-011d-491f-3d8c-0000376d74d3 30768
f3ec4922-011d-491f-3d8c-0000376d74d3 false false
0e34c923-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 BC01_CONFIGURATION_SET T
645c405f-0136-6999-c1b4-000065089f97
2c5f8294-012b-3c38-b55c-000043ea6398
f3ec4922-011d-491f-3d8c-0000376d74d3 2012-03-31 12:50:05.348 I true 1
numunique= 20334 numrows= 20334
f3ec4922-011d-491f-3d8c-0000376d74d3 30768
f3ec4922-011d-491f-3d8c-0000376d74d3 false false
0e34c923-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 BC01_CONFIGURATION_SET T
49fc4ea2-0129-489a-0a26-00000732b350
592fcc26-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 2010-06-17 23:16:40.018 I true 1
numunique= 49 numrows= 49
f3ec4922-011d-491f-3d8c-0000376d74d3 30768
f3ec4922-011d-491f-3d8c-0000376d74d3 false false
0e34c923-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 BC01_CONFIGURATION_SET T
2c44c05e-0136-6999-c1b4-000065089f97
e9a40921-011d-491f-3d8c-0000376d74d3
f3ec4922-011d-491f-3d8c-0000376d74d3 2012-03-31 12:50:03.427 I true 1
numunique= 20334 numrows= 20334

So there seems to be 3 statistics rows for this table. And low and
behold there is the "numrows = 49" which I see the indexStat when I
turn on tracing.

First thought is I didn't think we needed any statistics at all
for a single column primary key.

That's probably DERBY-3790 ([1]).
I'll see if I can write a quick patch proposal and then we can take it from there.


There should be entries in this catalog for every index on a table, so
it is not necessarily expected to be one per table. I think for each
index there is an entry for each column in the index (actually the stat
applies to multiple columns so in a 3 column index I think there is an
entry for (col1), (col1,col2), and (col1, col2, col3). But looking at
your ddl I would only expect 1 entry, as there is one
index on the table, and it has one column. I am not sure what the
expectation is for the foreign key that matches the primary key, maybe
it is 2 rows?

I do wonder if the foreign key is causing some extra issues. I think
in the long past we would end up creating 2 indexes in this case and
they would be exactly the same. So functionality was added to just
"logically" create the index and it would rely on the underlying
physical index sort of "shared" by the primary key on id and the foreign
constraint on id. We have had bugs in this area in the past and wonder
if it left an "orphan" row in the statistics.

As mentioned in another post, I think there's an easy way to avoid scanning the same index twice, but I need to make sure the check is valid first. I'm thinking of simply checking if we've seen the index conglomerate number already, and if so, ignore that index. The question is, can two truly different indexes be represented by the same conglomerate number?


Interesting that the second column it looks like it was created in 2010.

seems like there might be multiple problems here. It would be great to
figure out how the multiple rows got in there in the first place. I
would suggest seeing what the system does for ddl in a brand new
database, turn off index stat thread, run your ddl, add some rows to the
table, and hand run update statistics procedure.

Just for information, the istat daemon and the system procedure share almost all of the core code at the moment. The addition of automatic updates of index cardinality statistics consisted mainly of adding the "daemon framework" and triggering logic.


I would not be surprised if existing code does not expect the number of
rows you are seeing. If this is really a buggy orphan row, probably best
zero admin fix is to change the update statistics code to look for
other rows and delete them.
you can follow the update statistic code starting at:
java/engine/org/apache/derby/impl/sql/execute/AlterTableConstantAction/updateStatistics


There may be a index stat daemon bug here too, but I think it might just
be getting confused by the bad data in the catalog. Maybe it just keeps
reading all the rows and queueing work, but calling update statistics
only update 2 of the 3 rows.

Yes, the triggering code is driven by the existing statistics for the table, whereas the update code is driven by the list of indexes on the table.


--
Kristian

[1] https://issues.apache.org/jira/browse/DERBY-3790

< snip >

Reply via email to