class FlushRegion:
def __init__(self, *entries, left_write = None, right_write = None):
for left, right in zip(entries[:-1], entries[1:]):
assert left.end <= right.start
if left_write is not None and len(entries):
assert left_write.end <= entries[0].start
if right_write
Flush:
- an updatable tree containing writes and indexes, designed to be
useful for syncing with a storage medium. contains a list of
subindexes
SubIndex
- a list of bounded regions (flush entries) of previous flushes,
each held as a useful tree. contains links to its left and write to
new
current plan is to consider flushes as lists of regions of subtrees,
terminated by either new writes or end-of-stream. then concepts around
how to best consolidate those can develop around member functions of
the lists.
i guess it would be good to retain is_full() in parallel with height
<= max_depth since the concept is partial . these are just metrics
that can be applied to portions of the graph, in order to judge what
to collect where. the metrics are basically the same.
so basically an important property is not whether the tree is full or
not, it is rather
- whether it is refrained from exceeding the max depth
- where parts of it are pulled out in order to produce that limit
the parts of it that are pulled out should go near future writes, so
as to combine with t
there might be some merit to building the index on the fly rather than
incrementally.
basically, you want the biggest full subtrees between the writes,
where these full subtrees are taken from the content of the last
flush.
the reason to have a full subtree is to limit the depth and sibling
node c
# all subtrees are kept as full trees.
def add(new_write):
# assume there is a sorted list of writes
# merge this in. retain what the lengths of non-write data are to
the left and right
# if the writes trim non-write entries, then:
# on the left, walk left-to-right, and on the rig
def add(new_write):
# hmm ... thinking of: the nearest write to the left and right of new_write
# seems we'll want a sorted list of write indices, like in version #1
Entry structure and Flush structure derive from Chunk. Entry wraps
either one, providing a view as if bounds are limited.
so i'm thinking the big remaining bit to make an algorithm here that i like
is to implement moving parts into balanced/full trees, away from recent writes.
maybe i'll just accumulate little bits around that.
can use chunk structure:
class Chunk:
start, end, data, height, leaf_count, age
10 matches
Mail list logo