[ 
https://issues.apache.org/jira/browse/IGNITE-11998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541980#comment-17541980
 ] 

Maxim Muzafarov commented on IGNITE-11998:
------------------------------------------

h4. The initial proposal

Currently, a full scan of a cache group partition (SqlQuery or ScanQuery) 
reads all the data through the partition B-Tree, which leads to _n(log n)_ 
complexity. For such queries it may be better to read the data pages 
sequentially, directly from the partition file: this has _n_ complexity, and 
sequential file reads also have advantages over random-access file reads.

h4. The main issue

According to the [Ignite Multi-Tier Storage - under the 
hood|https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Multi-Tier+Storage+-+under+the+hood#IgniteMultiTierStorageunderthehood-Longobjects],
long objects are split across several pages. Pages that contain an entry tail 
have no dedicated page attribute or page header flag to identify them; 
however, such pages have a link to another fragment or to an entry head. These 
pages may only be accessed from the page that contains the entry head.

h4. Current solution and benchmarks

_A double loop over all partition pages._

During the first loop we read all the pages and collect references to other 
pages (entries are read from head to tail and written from tail to head). 
During the second loop we build the list of pages that are not referenced by 
any other page; these are the pages that contain the entry heads to be read.
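
The two loops described above can be sketched roughly as follows. This is only an illustration, not the actual Ignite code: the {{Page}} class, the {{NO_LINK}} constant and the {{findHeadPages}} method are hypothetical names, and the model assumes one entry fragment per page with links pointing from the head towards the tail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HeadPageScanSketch {
    /** Hypothetical, simplified page: an id plus an optional link to the next fragment. */
    static final class Page {
        /** Hypothetical marker meaning "this page links to no other fragment". */
        static final long NO_LINK = -1L;

        final long id;
        final long nextFragmentId;

        Page(long id, long nextFragmentId) {
            this.id = id;
            this.nextFragmentId = nextFragmentId;
        }
    }

    /** Returns the ids of pages that no other page links to, in file order. */
    static List<Long> findHeadPages(List<Page> partition) {
        // First loop: collect every page id that is the target of a fragment link.
        Set<Long> referenced = new HashSet<>();
        for (Page p : partition) {
            if (p.nextFragmentId != Page.NO_LINK)
                referenced.add(p.nextFragmentId);
        }

        // Second loop: pages that are never referenced are assumed to hold entry heads.
        List<Long> heads = new ArrayList<>();
        for (Page p : partition) {
            if (!referenced.contains(p.id))
                heads.add(p.id);
        }
        return heads;
    }

    public static void main(String[] args) {
        // Entry A is fragmented as 3 -> 1 -> 0 (written tail first, so the head comes last).
        // Entry B fits entirely into page 2.
        List<Page> partition = Arrays.asList(
            new Page(0, Page.NO_LINK),
            new Page(1, 0),
            new Page(2, Page.NO_LINK),
            new Page(3, 1));

        System.out.println(findHeadPages(partition)); // prints [2, 3]
    }
}
```

The set of referenced page ids has to be complete before any page can be classified, which is why this approach needs two passes over the partition.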

||Data Page Scan||true||false||
|IgniteDataPageScanBenchmark|148848|179228|
|IgniteDataPageScanBenchmark|186917|166980|
|IgniteDataPageScanBenchmark|197114|175667|

h4. Possible solutions

Additional analysis and investigation are required to perform the full 
partition scan in a single loop. We need to identify the fragmented pages that 
contain entry tails:
- for such pages we could write, e.g., a {{-1}} value to {{freeSpace}}, 
{{directCounter}}, {{indirectCounter}} (currently it is zero); here we need to 
check PDS compatibility.
- almost the same issue with identifying fragmented pages is described in 
IGNITE-12510
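
If tail pages carried such a marker, the scan could be reduced to one loop, roughly as in the sketch below. This is an assumption-heavy illustration rather than a proposal for the real page format: {{PageHeader}}, {{TAIL_MARKER}} and {{scanHeadPages}} are hypothetical names, and {{directCounter}} is modelled as a plain {{int}} only to keep the example short. Whether the real header fields can hold such a value, and how existing persistent files would be read, is the compatibility question mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TailMarkerScanSketch {
    /** Hypothetical sentinel written into a header counter of pages that hold only an entry tail. */
    static final int TAIL_MARKER = -1;

    /** Hypothetical, simplified page header: only the fields this check needs. */
    static final class PageHeader {
        final long id;
        final int directCounter;

        PageHeader(long id, int directCounter) {
            this.id = id;
            this.directCounter = directCounter;
        }

        boolean holdsOnlyTailFragment() {
            return directCounter == TAIL_MARKER;
        }
    }

    /** Single pass: skip marked tail pages, keep every other page id in file order. */
    static List<Long> scanHeadPages(List<PageHeader> partition) {
        List<Long> heads = new ArrayList<>();
        for (PageHeader h : partition) {
            if (h.holdsOnlyTailFragment())
                continue; // Reached later by following the link from the entry head.

            heads.add(h.id);
        }
        return heads;
    }

    public static void main(String[] args) {
        List<PageHeader> partition = Arrays.asList(
            new PageHeader(0, 2),
            new PageHeader(1, TAIL_MARKER),
            new PageHeader(2, 1));

        System.out.println(scanHeadPages(partition)); // prints [0, 2]
    }
}
```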


> Fix DataPageScan for fragmented pages.
> --------------------------------------
>
>                 Key: IGNITE-11998
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11998
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Bessonov
>            Assignee: Maxim Muzafarov
>            Priority: Critical
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Fragmented pages crash JVM when accessed by DataPageScan scanner/query 
> optimized scanner. It happens when scanner accesses data in later chunk in 
> fragmented entry but treats it like the first one, expecting length of the 
> payload, which is absent and replaced with raw entry data.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
