On Tue, Jun 14, 2016 at 08:11:59PM +0200, Goffredo Baroncelli wrote:
> On 2016-06-12 20:53, Hans van Kranenburg wrote:
> > Hi!
> > 
> > On 06/12/2016 08:41 PM, Goffredo Baroncelli wrote:
> >> Hi All,
> >> 
> >> On 2016-06-10 22:47, Hans van Kranenburg wrote:
> >>>> +        if (sk->min_objectid < sk->max_objectid)
> >>>> +                sk->min_objectid += 1;
> >>> 
> >>> ...and now it's (289406977 168 19193856), which means you're 
> >>> continuing your search *after* the block group item!
> >>> 
> >>> (289406976 168 19193856) is actually (289406976 << 72) + (168 << 
> >>> 64) + 19193856, which is 1366685806470112827871857008640
> >>> 
> >>> The search is continued at 1366685811192479310741502222336,
> >>> which skips 4722366482869645213696 possible places where an
> >>> object could live in the tree.
> >> 
> >> I am not sure I follow you. The extent tree (the tree involved in
> >> the search) contains only two kinds of object:
> >> 
> >> - BLOCK_GROUP_ITEM, where the key means (logical address, 0xc0,
> >>   size in bytes)
> >> - EXTENT_ITEM, where the key means (logical address, 0xa8,
> >>   size in bytes)
> >> 
> >> So it seems that for each (possible) "logical address", only two
> >> items might exist; the two items are completely identified by
> >> (objectid, type). It should not be possible (for the extent tree)
> >> to have two items with the same objectid and type but different
> >> offsets. So, for the extent tree, it is safe to advance only the
> >> objectid field.
> >> 
> >> Am I wrong?
> > 
> > When calling the search ioctl, the caller has to provide a memory
> > buffer that the kernel is going to fill with results. For
> > BTRFS_IOC_TREE_SEARCH used here, this buffer has a fixed size of 4096
> > bytes. Without some headers etc, this leaves a bit less than 4000
> > bytes of space for the kernel to write search result objects to.
> > 
> > If I do a search that will result in far more objects to be returned
> > than possible to fit in those <4096 bytes, the kernel will just put a
> > few in there until the next one does not fit any more.
> > 
> > It's the responsibility of the caller to change the start of the
> > search to point just after the last received object and do the search
> > again, in order to retrieve a few extra results.
> 
> You are right. If the last item in the buffer is an EXTENT_ITEM, and the
> next item on the disk is a BLOCK_GROUP_ITEM with the same objectid,
> the latter would be skipped.
> 
> I have always found BTRFS_IOC_TREE_SEARCH terrible; if the min_*
> fields were separate from the key, this ioctl would be a lot simpler
> to use. Moreover, in most cases (like this one), it would reduce the
> number of context switches, because the ioctl would return only
> valid data.

   There's an argument for implementing it. However, given the way the
indexing works (concatenation of the key elements, resulting in
lexical ordering of keys), you'd still have to do exactly the same
work, only in the kernel instead. The only thing you'd really gain is
a reduction in the number of context switches.
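
   To make the ordering concrete, here is a minimal sketch in C of how
keys compare and what "just after the last received object" means. The
struct and the names key_cmp() and key_next() are made up for
illustration (they are not the kernel's or btrfs-progs' definitions),
and the block group size in the note below is a hypothetical value:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the (objectid, type, offset) triple of a key. */
struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/*
 * Compare field by field, most significant first. This gives the same
 * order as treating the key as one number:
 * (objectid << 72) + (type << 64) + offset.
 */
static int key_cmp(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/*
 * Advance *k to the smallest key strictly greater than it, carrying
 * from offset into type and from type into objectid. Returns false,
 * leaving *k unchanged, if *k is already the largest possible key.
 */
static bool key_next(struct key *k)
{
	if (k->offset != UINT64_MAX) {
		k->offset++;
		return true;
	}
	if (k->type != UINT8_MAX) {
		k->type++;
		k->offset = 0;
		return true;
	}
	if (k->objectid != UINT64_MAX) {
		k->objectid++;
		k->type = 0;
		k->offset = 0;
		return true;
	}
	return false;
}
```

   With the key from earlier in the thread, key_next() on
(289406976 168 19193856) gives (289406976 168 19193857), which still
sorts before a block group item such as (289406976 192 1073741824).
Bumping only the objectid gives (289406977 168 19193856), which sorts
after it, so that item would be missed.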

   It would really have to be a new ioctl, too. You can't change the
behaviour of the existing one.

   Hugo.

> > 
> > So, the important line here was: "...when the extent_item just
> > manages to squeeze in as last result into the current result buffer
> > from the ioctl..."
> > 
> 
> 

-- 
Hugo Mills             | "What are we going to do tonight?"
hugo@... carfax.org.uk | "The same thing we do every night, Pinky. Try to
http://carfax.org.uk/  | take over the world!"
PGP: E2AB1DE4          |
