[ 
https://issues.apache.org/jira/browse/IGNITE-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julia Bakulina updated IGNITE-17157:
------------------------------------
    Labels: documentation ise  (was: documentation)

> Documentation of the Ignite index reader
> ----------------------------------------
>
>                 Key: IGNITE-17157
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17157
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Denis Chudov
>            Priority: Major
>              Labels: documentation, ise
>
> It would be nice to have documentation for the Ignite index reader utility 
> that was added in IGNITE-14529.
> {panel:title=Draft}
> // Here should also be an overview with the description of the purposes of 
> the utility
> To run this utility, use the index-reader.sh or index-reader.bat script from 
> the Ignite *bin* directory.
> *Command line parameters:*
> *--dir*: partition directory, where index.bin and (optionally) partition 
> files are located.
> *--part-cnt*: full partitions count in cache group. Default value: 0
> *--page-size*: page size. Default value: 4096
> *--page-store-ver*: page store version. Default value: 2
> *--indexes*: comma-separated list (no spaces) of index tree names to 
> process; all other index trees are skipped. Default value: null. Index tree 
> names are not the same as index names: they have the format 
> _cacheId_typeId_indexName##H2Tree%segmentNumber_, e.g. 
> {{2652_885397586_T0_F0_F1_IDX##H2Tree%0}}. They appear in the utility 
> output, in the traversal information sections (RECURSIVE and HORIZONTAL).
> *--dest-file*: file to print the report to (by default report is printed to 
> console). Default value: null
> *--check-parts*: check the cache data tree in partition files and its 
> consistency with indexes. Default value: false
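The tree name format described for --indexes can be taken apart with plain shell parameter expansion. This is a minimal sketch, assuming the cache id and type id never contain an underscore (the index name may contain several). parse_tree_name is a hypothetical helper and is not part of Ignite.

```shell
# Split an index tree name of the form
#   cacheId_typeId_indexName##H2Tree%segmentNumber
# into its four parts. Hypothetical helper, not shipped with Ignite.
parse_tree_name() (
  name=$1
  segment=${name##*'%'}       # text after the last '%'
  head=${name%%'##'*}         # text before the first '##'
  cache_id=${head%%_*}        # up to the first '_'
  rest=${head#*_}             # typeId_indexName
  type_id=${rest%%_*}         # up to the next '_' (may start with '-')
  index_name=${rest#*_}       # remainder, may itself contain '_'
  printf 'cacheId=%s typeId=%s indexName=%s segment=%s\n' \
    "$cache_id" "$type_id" "$index_name" "$segment"
)
```

For example, parse_tree_name '2652_885397586_T0_F0_F1_IDX##H2Tree%0' prints cacheId=2652 typeId=885397586 indexName=T0_F0_F1_IDX segment=0.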
> The utility analyzes index.bin and, optionally, partition files: if 
> *--part-cnt* is greater than 0 and partition files are present, it reads 
> CacheDataTree and looks into data pages to check their availability. It 
> reads all index trees from index.bin and traverses them in two ways:
> - recursive traversal from root to leaves
> - traversal by each level, as all pages on one level are connected through 
> forward ids.
> It also reads page reuse lists. Finally, it scans all pages in the file to 
> detect orphan pages (those that are not referenced from any index tree or 
> reuse list).
> The output of IgniteIndexReader therefore consists of 4 main sections:
> - recursive traversal info (with prefix <RECURSIVE>)
> - horizontal traversal info (with prefix <HORIZONTAL>)
> - page reuse lists info (with prefix <PAGE_LIST>)
> - sequential scan of all pages.
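Since three of the four sections are tagged with a line prefix, a saved report can be summarized per section with awk. The sketch below assumes the report was written with --dest-file and that each report line starts with its prefix. count_section_lines is a hypothetical helper name.

```shell
# Count report lines per tagged section; the report is read from stdin.
# Lines without one of the three prefixes (e.g. the sequential scan) are ignored.
count_section_lines() {
  awk '
    match($0, /^<(RECURSIVE|HORIZONTAL|PAGE_LIST)>/) {
      tag = substr($0, 2, RLENGTH - 2)   # strip the angle brackets
      count[tag]++
    }
    END {
      printf "RECURSIVE=%d HORIZONTAL=%d PAGE_LIST=%d\n",
        count["RECURSIVE"], count["HORIZONTAL"], count["PAGE_LIST"]
    }'
}
```

It can be called as count_section_lines < report.txt, where report.txt is the file passed to --dest-file.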
> With the *--check-parts* parameter, the output also reports how 
> CacheDataTree matches SQL indexes. If there are no errors, it contains only 
> a message like this:
> {noformat}
> Partitions check detected no errors.
> Partition check finished, total errors: 0, total problem partitions: 0
> {noformat}
> Otherwise, there is a “Partitions check:” section with a list of errors. For 
> example, this is the message for an entry that was found in CacheDataTree 
> but not in SQL indexes:
> {noformat}
> <ERROR> Errors detected in partition, partId=1023
> <ERROR> Entry is missing in index: I 
> [idxName=2652_885397586_T0_F0_F1_IDX##H2Tree%0, pageId=0002ffff0000000d], 
> cacheId=2652, partId=1023, pageIndex=8, itemId=0, link=285868728254472
> <ERROR> Entry is missing in index: I 
> [idxName=2652_885397586_T0_F2_IDX##H2Tree%0, pageId=0002ffff0000000b], 
> cacheId=2652, partId=1023, pageIndex=8, itemId=0, link=285868728254472
> {noformat}
> All errors in the output have the prefix <ERROR>.
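Because every error line carries the <ERROR> prefix, missing entries can be counted per index tree with standard text tools. This is a sketch only: it assumes each "Entry is missing in index" message occupies a single line in the real report (the wrapping in the sample comes from the email), and missing_per_index is a hypothetical helper name.

```shell
# Print "treeName=count" for every index tree that has missing entries.
# The report is read from stdin.
missing_per_index() {
  grep -F '<ERROR> Entry is missing in index:' |
    sed -E 's/.*idxName=([^,]+),.*/\1/' |   # keep only the tree name
    LC_ALL=C sort | uniq -c |
    awk '{ print $2 "=" $1 }'
}
```

Called as missing_per_index < report.txt, it prints one line per affected index tree. Empty output means no such messages were found.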
> h3. Command line examples
> Analyze files from /gridgain/corrupted_idxs for a cache group with 1024 
> partitions (some partition files can be missing if the node they were taken 
> from did not own those partitions), using page size 4096 and page store 
> version 2, and write the report to report.txt:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --part-cnt 1024 
> --page-size 4096 --page-store-ver 2  --dest-file "report.txt"
> {noformat}
> Read only SQL indexes:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --dest-file "report.txt"
> {noformat}
> Read SQL indexes and check cache data tree in partitions:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --part-cnt 1024 
> --check-parts --dest-file "rep
> {noformat}
> h3. Output samples
> <RECURSIVE> and <HORIZONTAL> output sections contain information about index 
> trees: tree name, root page id, page type statistics, count of items. The 
> format for both traversals is the same.
> {noformat}
> <RECURSIVE> Index tree: I [idxName=2654_-1177891018__key_PK##H2Tree%0, 
> pageId=0202ffff00000066]
> <RECURSIVE> -- Page stat:
> <RECURSIVE> H2ExtrasLeafIO: 2
> <RECURSIVE> H2ExtrasInnerIO: 1
> <RECURSIVE> BPlusMetaIO: 1
> <RECURSIVE> -- Count of items found in leaf pages: 200
> <RECURSIVE> No errors occurred while traversing.
> ...
> <RECURSIVE> Total trees: 19
> <RECURSIVE> Total pages found in trees: 49
> <RECURSIVE> Total errors during trees traversal: 2
> {noformat}
> The page lists section contains reuse list bucket data: list meta id, bucket 
> number, and start pages of the lists found in the bucket. It also contains 
> page type statistics:
> {noformat}
> <PAGE_LIST> Page lists info.
> <PAGE_LIST> ---Printing buckets data:
> <PAGE_LIST> List meta id=844420635164675, bucket number=0, 
> lists=[844420635164687]
> <PAGE_LIST> -- Page stat:
> <PAGE_LIST> H2ExtrasLeafIO: 32
> <PAGE_LIST> H2ExtrasInnerIO: 1
> <PAGE_LIST> BPlusMetaIO: 1
> <PAGE_LIST> ---No errors.
> {noformat}
> The sequential scan section contains page type statistics as well:
> {noformat}
> ---These pages types were encountered during sequential scan:
> H2ExtrasLeafIO: 165
> H2ExtrasInnerIO: 19
> PagesListNodeIO: 1
> PagesListMetaIO: 1
> MetaStoreLeafIO: 5
> BPlusMetaIO: 20
> PageMetaIO: 1
> MetaStoreInnerIO: 1
> TrackingPageIO: 1
> ---
> Total pages encountered during sequential scan: 214
> Total errors occurred during sequential scan: 0
> {noformat}
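In the sample above, the per-type counts add up to the reported total (214). Assuming that relationship holds for every report, a quick sanity check on a saved report could look like the sketch below. check_scan_total is a hypothetical helper, and the line patterns are taken from the sample only.

```shell
# Sum the per-type page counts of the sequential scan section and compare
# the sum with the reported total. The report is read from stdin.
check_scan_total() {
  awk '
    /^---These pages types were encountered/ { in_stat = 1; next }
    /^---$/                                  { in_stat = 0; next }
    in_stat && /^[A-Za-z0-9]+: [0-9]+$/      { sum += $2 }
    /^Total pages encountered during sequential scan:/ { total = $NF }
    END {
      if (sum + 0 == total + 0) { print "OK " sum + 0; exit 0 }
      print "MISMATCH sum=" sum + 0 " total=" total + 0
      exit 1
    }'
}
```

Fed the sample section above, it prints OK 214 and returns exit status 0. A mismatch is reported on stdout and returns a non-zero status.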
> The index reader compares the results of both traversals and the sizes of 
> indexes that belong to the same cache, so it is enough to look for errors in 
> the output. For example, an error message about index size inconsistency 
> looks like this:
> {noformat}
> <ERROR> Index size inconsistency: cacheId=2652, typeId=885397586
> <ERROR>      Index name: I [idxName=2652_885397586_T0_F0_F1_IDX##H2Tree%0, 
> pageId=0002ffff0000000d], size=1700
> <ERROR>      Index name: I [idxName=2652_885397586__key_PK##H2Tree%0, 
> pageId=0002ffff00000005], size=0
> <ERROR>      Index name: I [idxName=2652_885397586_T0_F1_IDX##H2Tree%0, 
> pageId=0002ffff00000009], size=1700
> <ERROR>      Index name: I [idxName=2652_885397586_T0_F0_IDX##H2Tree%0, 
> pageId=0002ffff00000007], size=1700
> <ERROR>      Index name: I [idxName=2652_885397586_T0_F2_IDX##H2Tree%0, 
> pageId=0002ffff0000000b]
> {noformat}
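To spot the outlier in a long inconsistency listing, the sizes can be pulled out and sorted. This is a sketch under the assumption that each "Index name" entry is a single line in the real report. index_sizes is a hypothetical helper name, and entries without a size field (like the last one in the sample) are skipped.

```shell
# Print "size treeName" for each listed index, smallest size first.
# The report is read from stdin.
index_sizes() {
  sed -n -E 's/^<ERROR> +Index name: I \[idxName=([^,]+), pageId=[0-9a-f]+\], size=([0-9]+)$/\2 \1/p' |
    LC_ALL=C sort -n
}
```

For the sample above, the first output line is 0 2652_885397586__key_PK##H2Tree%0, followed by the three trees of size 1700.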
> {panel}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
