Jim Apple has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
......................................................................


Patch Set 5:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/4715/4//COMMIT_MSG
Commit Message:

PS4, Line 13:  buddy allocation
> The main use case is to allocate power-of-two buffers, so currently this is
Though the number of elements in HTs is a power of two, the size of the 
allocations could be different if the sizeof a slot changes, yes? For instance, 
if sizeof(slot) is 24, then most allocations will waste a good bit of space?


http://gerrit.cloudera.org:8080/#/c/4715/5/be/src/bufferpool/suballocator-test.cc
File be/src/bufferpool/suballocator-test.cc:

Line 72:   /// Verify that the memory for all of the allocations is writable 
and disjoint by
Maybe replace "Verify" with "ASSERT", for clarity?


PS5, Line 72: allocations
"Suballocations", here and below


PS5, Line 77: list
vector


PS5, Line 77: clear
call std::vector::clear() after the for loop to prevent UB if a caller uses 
(*allocs)[i] for some i.


Line 85:   ObjectPool obj_pool_;
Each of these member variables could use an explanation about why it is here.


Line 101:   ASSERT_OK(pool.RegisterClient("client", &global_reservation_, 
profile(), &client));
The ASSERTs in this file could generally benefit from more info in the "<<" 
section to help diagnose failures from the logs. I have recently seen a backend 
test error that I was able to reproduce, but only on a machine image I had 
limited access to, so I had to ssh in between other people using it, and more 
backend test logs would have saved me some time.


Line 111:     ASSERT_OK(allocator.Allocate(ALLOC_SIZE, &allocs.back()));
I generally think of ASSERT as being for failures in which the test should not 
continue, but  here I think some further parts of the test are interesting. For 
instance, the smaller allocations below are interesting even if this one fails.

I can understand breaking from the loop if this fails, though.

My feeling is that several other ASSERTs should be EXPECTs in this file.


Line 116:   // Attempts to allocate more memory should fail gracefully.
Can we allocate 0 bytes here?


Line 127:   for (unique_ptr<Suballocation>& alloc : allocs) {
call FreeAllocations?


PS5, Line 147: odd
It might also be good to test edge cases: (1 << i) - 1


Line 159:     memset(alloc->data(), 0, alloc->len()); // Check memory is 
writable.
Add note that ASAN performs this check.


Line 180:   const int num_allocs = 16;
either "NUM_ALLOCS" or "total_mem", above


PS5, Line 184: int64_t curr_alloc_size = TEST_BUFFER_LEN / 8;
             :   while (curr_alloc_size * num_allocs < TOTAL_MEM) {
Could be more readable as a for loop


Line 188:     for (int i = 0; i < num_allocs; ++i) {
Can this be range-based, like the loop on 205?


Line 197:     // allocations should require an extra page.
Can you add a comment explaining why should they never require more than an 
extra page? That way, if this breaks, the test code will help the fixer 
understand why it was originally true and why it is part of the buddy system 
contract.


Line 206:     allocator.Free(move(alloc));
call FreeAllocations()


PS5, Line 215:   const int64_t TOTAL_MEM = TEST_BUFFER_LEN * 100;
             :   global_reservation_.InitRootTracker(nullptr, TOTAL_MEM);
             :   BufferPool pool(TEST_BUFFER_LEN, TOTAL_MEM);
             : 
             :   ReservationTracker reservation;
             :   reservation.InitChildTracker(nullptr, &global_reservation_, 
nullptr, TOTAL_MEM);
             :   BufferPool::Client client;
             :   ASSERT_OK(pool.RegisterClient("client", &reservation, 
profile(), &client));
             : 
             :   Suballocator allocator(&pool, &client, TEST_BUFFER_LEN, 
ExpansionStrategy::NEVER);
This section seems like a good candidate for a member function, as it appears 
very similarly elsewhere


Line 263: /// Do some randomised testing of the allocator. Simulate some 
interesting patterns with
This tests only correctness, not anything about fragmentation, right?


Line 289:       int64_t remaining_mem_per_alloc = (TOTAL_MEM - allocated_mem) / 
num_allocs;
const


PS5, Line 290: 1.0
The first parameter below is 2 - that's the mean, right, so the average is 0.2?


PS5, Line 334: 8
I didn't see this in the header file. I'd suggest that this expectation should 
be in the contract.


Line 336:     for (int64_t offset = 0; offset < alloc->len(); offset += 8) {
std::iota


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.cc
File be/src/bufferpool/suballocator.cc:

Line 144:     DCHECK_EQ(free_node->len_, 1LL << (i + LOG_MIN_ALLOCATION_BYTES));
> Done
Does not look done. Maybe some changes were left out of PS5 by mistake?


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.h
File be/src/bufferpool/suballocator.h:

Line 37: };
> It can but we end up with Suballocator::ExpansionStrategy everywhere, which
Maybe typedef liberally? impala::ExpansionStrategy strikes me as potentially 
more confusing


Line 50: ///
> Done
Looks missing


PS4, Line 127: llpt
> Done
Looks missing?


Line 171:   int64_t len() const { return len_; }
> unique_ptr to me means unique ownership, which doesn't preclude a raw point
std::unique_ptr to me means that all other references are statically scoped 
within the unique_ptr's lifetime, like

    unique_ptr<T> foo;
    Bar(foo.get());

I think of std::weak_ptr partially as having the effect of breaking shared_ptr 
cycles. As a code reader, I would find aw pointer less confusing than one 
unique_ptr and one raw pointer. Another pattern I was able to get working that 
I find more understandable is making parent pointers shared pointers and buddy 
pointers implicit in an array ordering, as follows:

    struct Duo {
      struct Block {
        shared_ptr<Duo> duo;
        bool idx;
        shared_ptr<Duo> Split() {
          auto result = make_shared<Duo>();
          result->parent = *this;
          result->size = duo->size / 2;
          return result;
        }
      };
      Block parent;
      bool in_use[2];
      int64_t size;
      using T = char*;
      T payload[2];
      static void Release(Block& b) {
        b.duo->in_use[b.idx] = false;
        if (!b.duo->in_use[1 - b.idx] && b.duo->parent.duo != nullptr) {
          Release(b.duo->parent);
        }
        b.duo.reset();
      }
    };


http://gerrit.cloudera.org:8080/#/c/4715/5/be/src/bufferpool/suballocator.h
File be/src/bufferpool/suballocator.h:

PS5, Line 94: and the number of different
            :   /// power-of-two allocation sizes
not anymore


Line 106:   int ComputeListIndex(int64_t bytes) const;
Deserves a comment


PS5, Line 202:   /// If this is in a free list, the next element in the list. 
nullptr if this is the last
             :   /// element in the free list.
             :   std::unique_ptr<Suballocation> next_free_;
             : 
             :   /// If this is in a free list, the previous element in the 
list. nullptr if this is the
             :   /// first element.
             :   Suballocation* prev_free_;
Since only unused Suballocations need these, they can be "intrusive" - you can 
reuse the first few bytes of the buffer_ for these pointers.


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to