On Thu, Aug 29, 2013 at 7:08 PM, Zach Brown <z...@redhat.com> wrote:
> On Thu, Aug 29, 2013 at 02:59:26PM +0100, Filipe David Borba Manana wrote:
>> When the binary search returns 0 (exact match), the target key
>> will necessarily be at slot 0 of all nodes below the current one,
>> so in this case the binary search is not needed because it will
>> always return 0, and we waste time doing it, holding node locks
>> for longer than necessary, etc.
>>
>> Below follow histograms with the times spent on the current approach of
>> doing a binary search when the previous binary search returned 0, and
>> times for the new approach, which directly picks the first item/child
>> node in the leaf/node.
>>
>> Count: 5013
>> Range: 25.000 - 497.000; Mean: 82.767; Median: 64.000; Stddev: 49.972
>> Percentiles: 90th: 141.000; 95th: 182.000; 99th: 287.000
>
>> Count: 5013
>> Range: 10.000 - 8303.000; Mean: 28.505; Median: 18.000; Stddev: 119.147
>> Percentiles: 90th: 49.000; 95th: 74.000; 99th: 115.000
>
> Where'd the giant increase in the range max come from?  Just jittery
> measurement?  Maybe get a lot more data points to smooth that out?
Correct, just jittery.

>
>> +static int key_search(struct extent_buffer *b, struct btrfs_key *key,
>> +		       int level, int *prev_cmp, int *slot)
>> +{
>> +	char *kaddr = NULL;
>> +	unsigned long map_start = 0;
>> +	unsigned long map_len = 0;
>> +	unsigned long offset;
>> +	struct btrfs_disk_key *k = NULL;
>> +	struct btrfs_disk_key unaligned;
>> +	int err;
>> +
>> +	if (*prev_cmp != 0) {
>> +		*prev_cmp = bin_search(b, key, level, slot);
>> +		return *prev_cmp;
>> +	}
>
>
>> +	*slot = 0;
>> +
>> +	return 0;
>
> So this is the actual work done by the function.

Correct. That and the very first if statement in the function.

>
>> +
>> +	if (level == 0)
>> +		offset = offsetof(struct btrfs_leaf, items);
>> +	else
>> +		offset = offsetof(struct btrfs_node, ptrs);
>
> (+10 fragility points for assuming that the key starts each struct
> instead of using [0].key)

Ok. I just copied that from ctree.c:bin_search(). I guess that gives
another +10 fragility points. Thanks for pointing it out.

>
>> +
>> +	err = map_private_extent_buffer(b, offset, sizeof(struct btrfs_disk_key),
>> +				       &kaddr, &map_start, &map_len);
>> +	if (!err) {
>> +		k = (struct btrfs_disk_key *)(kaddr + offset - map_start);
>> +	} else {
>> +		read_extent_buffer(b, &unaligned, offset, sizeof(unaligned));
>> +		k = &unaligned;
>> +	}
>> +
>> +	ASSERT(comp_keys(k, key) == 0);
>
> All of the rest of the function, including most of the local variables,
> is overhead for that assertion.  We don't actually care about the
> relative sorted key position of the two keys so we don't need smart
> field-aware comparisons.  We can use a dumb memcmp.
>
> We can replace all that stuff with two easy memcmp_extent_buffers()
> which vanish if ASSERT is a nop.
>
> 	if (level)
> 		ASSERT(!memcmp_extent_buffer(b, key,
> 			offsetof(struct btrfs_node, ptrs[0].key),
> 			sizeof(*key)));
> 	else
> 		ASSERT(!memcmp_extent_buffer(b, key,
> 			offsetof(struct btrfs_leaf, items[0].key),
> 			sizeof(*key)));
>
> Right?
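[Editor's aside: the prev_cmp short-circuit in the quoted key_search() can be sketched in isolation. This is a minimal userspace illustration of the idea, not the btrfs code itself: node_t, bin_search() and key_search() here are invented stand-ins, and keys are plain ints rather than btrfs_keys. Once a binary search at some level returns an exact match (0), the target key must sit at slot 0 of every node below, so the search is skipped on the rest of the descent.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory node: a sorted key array, as a stand-in for
 * a btrfs extent_buffer at some level of the tree. */
typedef struct node {
	const int *keys;
	int nr;
} node_t;

static int searches_done;	/* counts real binary searches, for the demo */

/* Plain binary search: sets *slot to the first index with
 * keys[slot] >= key; returns 0 on exact match, 1 otherwise,
 * mimicking the return convention of btrfs's bin_search(). */
static int bin_search(const node_t *n, int key, int *slot)
{
	int lo = 0, hi = n->nr;

	searches_done++;
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		if (n->keys[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	*slot = lo;
	return (lo < n->nr && n->keys[lo] == key) ? 0 : 1;
}

/* key_search() analogue: reuse *prev_cmp so that once a level has
 * matched exactly, lower levels skip the binary search entirely. */
static int key_search(const node_t *n, int key, int *prev_cmp, int *slot)
{
	if (*prev_cmp != 0) {
		*prev_cmp = bin_search(n, key, slot);
		return *prev_cmp;
	}
	/* Exact match at the level above: key is necessarily at slot 0. */
	*slot = 0;
	return 0;
}
```

Descending two levels with the same key then costs one binary search instead of two; only the first level that matches exactly pays for the search, which is where the histogram improvement above comes from.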
No, and as Josef just pointed out, like that we would compare a btrfs_key
with a btrfs_disk_key, which is wrong due to endianness differences.
So I'll go for Josef's suggestion in the following mail about wrapping
this with a CONFIG_BTRFS_ASSERT #ifdef.

>
> - z

-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
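[Editor's aside: the endianness objection above is the crux — btrfs_disk_key stores its fields little-endian on disk, while btrfs_key holds them in CPU order, so a raw memcmp between the two representations is only accidentally correct on little-endian hosts. Below is a self-contained sketch of a field-aware comparison in the spirit of comp_keys(): decode each little-endian field to CPU order first. The struct layouts and helpers (get_le64/put_le64) are hand-rolled stand-ins for the kernel's key structs and cpu_to_le64()/le64_to_cpu(), not the real definitions.]

```c
#include <assert.h>
#include <stdint.h>

/* CPU-order key, mirroring btrfs_key in shape only. */
struct cpu_key  { uint64_t objectid; uint8_t type; uint64_t offset; };
/* On-disk key: every multi-byte field is stored little-endian,
 * kept here as raw bytes to make the representation explicit. */
struct disk_key { uint8_t objectid[8]; uint8_t type; uint8_t offset[8]; };

/* Decode 8 little-endian bytes into a CPU-order value
 * (byte 0 is least significant). */
static uint64_t get_le64(const uint8_t b[8])
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/* Encode a CPU-order value as 8 little-endian bytes. */
static void put_le64(uint8_t b[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		b[i] = (uint8_t)(v >> (8 * i));
}

/* Field-aware comparison in the spirit of comp_keys(): each on-disk
 * field is converted to CPU order before comparing, so the result is
 * correct on big-endian and little-endian hosts alike. */
static int comp_keys_sketch(const struct disk_key *d, const struct cpu_key *c)
{
	uint64_t obj = get_le64(d->objectid);
	uint64_t off;

	if (obj != c->objectid)
		return obj < c->objectid ? -1 : 1;
	if (d->type != c->type)
		return d->type < c->type ? -1 : 1;
	off = get_le64(d->offset);
	if (off != c->offset)
		return off < c->offset ? -1 : 1;
	return 0;
}
```

A memcmp-based assertion like the one proposed above would only be valid after serializing the CPU key into this little-endian form first, which is essentially what the conversion step here does.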