[ https://issues.apache.org/jira/browse/IGNITE-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mikhail Petrov updated IGNITE-17157:
------------------------------------
    Fix Version/s: 2.16

> Documentation of the Ignite index reader
> ----------------------------------------
>
>                 Key: IGNITE-17157
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17157
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Denis Chudov
>            Assignee: Julia Bakulina
>            Priority: Major
>              Labels: documentation, ise
>             Fix For: 2.16
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> It would be nice to have documentation for the Ignite index reader utility
> that was added in IGNITE-14529.
> {panel:title=Draft}
> // Here should also be an overview with the description of the purposes of
> the utility
> To run this utility, use the index-reader.sh/index-reader.bat script from the
> Ignite *bin* directory.
> *Command line parameters:*
> *--dir*: partition directory, where index.bin and (optionally) partition
> files are located.
> *--part-cnt*: total number of partitions in the cache group. Default value: 0
> *--page-size*: page size. Default value: 4096
> *--page-store-ver*: page store version. Default value: 2
> *--indexes*: comma-separated list (without spaces) of index tree names to
> process; all other index trees are skipped. Default value: null. Index tree
> names are not the same as index names; they have the format
> _cacheId_typeId_indexName##H2Tree%segmentNumber_, e.g.
> {{2652_885397586_T0_F0_F1_IDX##H2Tree%0}}. You can find them in the utility
> output, in the traversal information sections (RECURSIVE and HORIZONTAL).
> *--dest-file*: file to print the report to (by default the report is printed
> to the console). Default value: null
> *--check-parts*: check the cache data tree in partition files and its
> consistency with indexes. Default value: false
> The utility analyzes index.bin and, if *--part-cnt* is greater than 0 and
> partition files are present, also the partitions: it reads CacheDataTree and
> looks into data pages to check their availability.
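The index tree name format described for *--indexes* can be taken apart with plain shell parameter expansion, which helps when picking names out of the utility output. This is only a minimal sketch, assuming bash and assuming every name follows the _cacheId_typeId_indexName##H2Tree%segmentNumber_ layout exactly; the function name parse_tree_name is made up for illustration and is not part of the utility.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of index-reader): split an index tree name of
# the form cacheId_typeId_indexName##H2Tree%segmentNumber into its parts.
parse_tree_name() {
    local tree_name=$1
    # Everything before the "##H2Tree%" separator: cacheId_typeId_indexName.
    local prefix=${tree_name%%'##H2Tree%'*}
    # Everything after the last "%": the segment number.
    local segment=${tree_name##*'%'}
    # The first two underscore-separated fields are cacheId and typeId;
    # the remainder (which may itself contain underscores) is the index name.
    local cache_id=${prefix%%_*}
    local rest=${prefix#*_}
    local type_id=${rest%%_*}
    local index_name=${rest#*_}
    printf 'cacheId=%s typeId=%s indexName=%s segment=%s\n' \
        "$cache_id" "$type_id" "$index_name" "$segment"
}

parse_tree_name '2652_885397586_T0_F0_F1_IDX##H2Tree%0'
# prints: cacheId=2652 typeId=885397586 indexName=T0_F0_F1_IDX segment=0
```

Because only the first two underscores are treated as separators, a name such as {{2654_-1177891018__key_PK##H2Tree%0}} yields the index name {{_key_PK}} with its leading underscore intact.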
> It reads all index trees from index.bin and traverses them in two ways:
> - recursive traversal from root to leaves
> - traversal level by level, since all pages on one level are connected
> through forward ids.
> It also reads page reuse lists. Finally, it scans all pages in the file,
> trying to detect orphan pages (those that are not referenced from index
> trees or reuse lists).
> So, the output of the IgniteIndexReader consists of 4 main sections:
> - recursive traversal info (with prefix <RECURSIVE>)
> - horizontal traversal info (with prefix <HORIZONTAL>)
> - page reuse lists info (with prefix <PAGE_LIST>)
> - sequential scan of all pages.
> Optionally, with the *--check-parts* parameter, it also reports how
> CacheDataTree matches SQL indexes. If there are no errors, only a message
> like this is printed:
> {noformat}
> Partitions check detected no errors.
> Partition check finished, total errors: 0, total problem partitions: 0
> {noformat}
> Otherwise, there is a "Partitions check:" section with a list of errors. For
> example, this is what the message looks like for an entry that was found in
> CacheDataTree but not in SQL indexes:
> {noformat}
> <ERROR> Errors detected in partition, partId=1023
> <ERROR> Entry is missing in index: I [idxName=2652_885397586_T0_F0_F1_IDX##H2Tree%0, pageId=0002ffff0000000d], cacheId=2652, partId=1023, pageIndex=8, itemId=0, link=285868728254472
> <ERROR> Entry is missing in index: I [idxName=2652_885397586_T0_F2_IDX##H2Tree%0, pageId=0002ffff0000000b], cacheId=2652, partId=1023, pageIndex=8, itemId=0, link=285868728254472
> {noformat}
> All errors in the output have the prefix <ERROR>.
> h3. Command line examples
> Analyze the files in /gridgain/corrupted_idxs. The cache group has 1024
> partitions (some partition files may be missing if the node they were
> copied from did not own those partitions). Use page size 4096 and page
> store version 2, and write the report to report.txt:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --part-cnt 1024 --page-size 4096 --page-store-ver 2 --dest-file "report.txt"
> {noformat}
> Read only SQL indexes:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --dest-file "report.txt"
> {noformat}
> Read SQL indexes and check the cache data tree in partitions:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --part-cnt 1024 --check-parts --dest-file "rep
> {noformat}
> h3. Output samples
> The <RECURSIVE> and <HORIZONTAL> output sections contain information about
> index trees: tree name, root page id, page type statistics, and count of
> items. The format is the same for both traversals.
> {noformat}
> <RECURSIVE> Index tree: I [idxName=2654_-1177891018__key_PK##H2Tree%0, pageId=0202ffff00000066]
> <RECURSIVE> -- Page stat:
> <RECURSIVE> H2ExtrasLeafIO: 2
> <RECURSIVE> H2ExtrasInnerIO: 1
> <RECURSIVE> BPlusMetaIO: 1
> <RECURSIVE> -- Count of items found in leaf pages: 200
> <RECURSIVE> No errors occurred while traversing.
> ...
> <RECURSIVE> Total trees: 19
> <RECURSIVE> Total pages found in trees: 49
> <RECURSIVE> Total errors during trees traversal: 2
> {noformat}
> The page lists section contains reuse list bucket data: list meta, bucket
> number, and start pages of the lists found in the bucket. It also contains
> page type statistics:
> {noformat}
> <PAGE_LIST> Page lists info.
> <PAGE_LIST> ---Printing buckets data:
> <PAGE_LIST> List meta id=844420635164675, bucket number=0, lists=[844420635164687]
> <PAGE_LIST> -- Page stat:
> <PAGE_LIST> H2ExtrasLeafIO: 32
> <PAGE_LIST> H2ExtrasInnerIO: 1
> <PAGE_LIST> BPlusMetaIO: 1
> <PAGE_LIST> ---No errors.
> {noformat}
> So does the sequential scan info:
> {noformat}
> ---These pages types were encountered during sequential scan:
> H2ExtrasLeafIO: 165
> H2ExtrasInnerIO: 19
> PagesListNodeIO: 1
> PagesListMetaIO: 1
> MetaStoreLeafIO: 5
> BPlusMetaIO: 20
> PageMetaIO: 1
> MetaStoreInnerIO: 1
> TrackingPageIO: 1
> ---
> Total pages encountered during sequential scan: 214
> Total errors occurred during sequential scan: 0
> {noformat}
> The index reader itself compares the results of both traversals and the
> sizes of indexes belonging to the same cache, so you only need to watch for
> errors in the output. For example, an error message about index size
> inconsistency looks like this:
> {noformat}
> <ERROR> Index size inconsistency: cacheId=2652, typeId=885397586
> <ERROR> Index name: I [idxName=2652_885397586_T0_F0_F1_IDX##H2Tree%0, pageId=0002ffff0000000d], size=1700
> <ERROR> Index name: I [idxName=2652_885397586__key_PK##H2Tree%0, pageId=0002ffff00000005], size=0
> <ERROR> Index name: I [idxName=2652_885397586_T0_F1_IDX##H2Tree%0, pageId=0002ffff00000009], size=1700
> <ERROR> Index name: I [idxName=2652_885397586_T0_F0_IDX##H2Tree%0, pageId=0002ffff00000007], size=1700
> <ERROR> Index name: I [idxName=2652_885397586_T0_F2_IDX##H2Tree%0, pageId=0002ffff0000000b]
> {noformat}
> {panel}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
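Since every error line in the report carries the <ERROR> prefix, a report written with *--dest-file* can be triaged with standard text tools. The following is a rough sketch only, assuming bash, grep, sed and sort are available, and assuming the report contains one message per line in the shape shown in the samples above; the function name summarize_errors and the file name report.txt are placeholders, not something the utility provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of index-reader): summarize <ERROR> lines in a
# report produced with --dest-file.
# Usage: summarize_errors report.txt
summarize_errors() {
    local report=$1
    local total
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps callers using "set -e" alive.
    total=$(grep -c '<ERROR>' "$report" || true)
    echo "error lines: $total"
    # List the distinct partitions named in
    # "Errors detected in partition, partId=N" lines, if there are any.
    grep '<ERROR> Errors detected in partition' "$report" \
        | sed 's/.*partId=\([0-9][0-9]*\).*/\1/' \
        | sort -un \
        | sed 's/^/problem partition: /'
}
```

The count is the number of lines with the prefix, not the number of distinct problems: a single index size inconsistency, for instance, spans several <ERROR> lines.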