Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19475/4//COMMIT_MSG@24
PS4, Line 24: Testing:
> > What circumstances show better performance?
I'm not convinced that this change will improve performance noticeably. When 
the thread writes to the data cache, the OS should buffer the write in memory, 
and the writer thread should not need to wait for the actual write to the 
underlying IO device. That may change when we start to use Direct IO for the 
data cache (IMPALA-11905).

I'm looking for some circumstance where data_cache_num_write_threads=N performs 
better than data_cache_num_write_threads=0. IMPALA-11886 mentions that 
performance deteriorates when using the data cache. I would like to see 
concrete numbers about how much this code change helps.

In theory, that would apply to the first run of a query when the cache is 
empty. It could also apply to workloads where the data cache is too small and 
the cache is being thrashed (e.g. a data cache of 1GB when there is a 5GB 
working set).


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@271
PS4, Line 271:     uint8_t* buffer_;
> > I think it would be cleaner for this to be a unique_ptr.
Yes, this should be unique_ptr<uint8_t[]>. It will have a similar amount of 
code, but it does automatic cleanup of the buffer when the StoreTask is 
deleted. For example, if the buffer_ is nullptr, unique_ptr knows not to try to 
call free on it.

Impala's code style prefers to use unique_ptr (or other smart pointers) 
wherever possible to avoid dealing with manual allocations / frees.

https://google.github.io/styleguide/cppguide.html#Ownership_and_Smart_Pointers


http://gerrit.cloudera.org:8080/#/c/19475/4/be/src/runtime/io/data-cache.h@203
PS4, Line 203:
             :   /// The key used for look up in the cache.
             :   struct CacheKey {
             :    public:
             :     explicit CacheKey(const string& filename, int64_t mtime, 
int64_t offset)
             :       : key_(filename.size() + sizeof(mtime) + sizeof(offset)) {
             :       DCHECK_GE(mtime, 0);
             :       DCHECK_GE(offset, 0);
             :       key_.append(&mtime, sizeof(mtime));
             :       key_.append(&offset, sizeof(offset));
             :       key_.append(filename);
             :     }
             :
             :     int64_t Hash() const {
             :       return HashUtil::FastHash64(key_.data(), key_.size(), 0);
             :     }
             :
             :     Slice filename() const {
             :       return Slice(key_.data() + OFFSETOF_FILENAME, key_.size() 
- OFFSETOF_FILENAME);
             :     }
             :
             :     int64_t mtime() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_MTIME);
             :     }
             :
             :     int64_t offset() const {
             :       return UNALIGNED_LOAD64(key_.data() + OFFSETOF_OFFSET);
             :     }
             :
             :     Slice ToSlice() const {
             :       return key_;
             :     }
             :
             :    private:
             :     // Key encoding stored in key_:
             :     //
             :     //  int64_t mtime;
             :     //  int64_t offset;
             :     //  <variable length bytes> filename;
             :     static constexpr int OFFSETOF_MTIME = 0;
             :     static constexpr int OFFSETOF_OFFSET = OFFSETOF_MTIME + 
sizeof(int64_t);
             :     static constexpr int OFFSETOF_FILENAME = OFFSETOF_OFFSET + 
sizeof(int64_t);
             :     kudu::faststring key_;
             :   };
             :
             :   /// The class to abstruct store behavior, including copying 
the buffer and holding it
             :   /// until store complete.
             :   class StoreTask {
             :    public:
             :     /// Creating a store task requires the filename, mtime, 
offset that constitutes the
             :     /// cache key, and the buffer and length of the cached data 
is required too. We
             :     /// allocate a new buffer in the constructor and copy the 
cache data and update
             :     /// total_size which keeps track of the total buffer size 
allocate by all store tasks.
             :     explicit StoreTask(const std::string& filename, int64_t 
mtime, int64_t offset,
             :         const uint8_t* buffer, int64_t buffer_len, AtomicInt64& 
total_size);
             :
             :     /// When the store task is destroyed, the allocated buffer 
is freed and total_size is
             :     /// updated.
             :     ~StoreTask();
             :
             :     const CacheKey& key() const { return key_; }
             :     const uint8_t* buffer() const { return buffer_; }
             :     int64_t buffer_len() const { return buffer_len_; }
             :
             :    private:
             :     DISALLOW_COPY_AND_ASSIGN(StoreTask);
             :
             :     CacheKey key_;
             :     uint8_t* buffer_;
             :     int64_t buffer_len_;
             :     AtomicInt64& total_size_;
             :   };
> > Small thing: It would be nice to keep these structures defined in
The reason the compiler gives an error about the incomplete type is because the 
constructor and destructor need to know the size of the type. Moving them to 
the .cc file eliminates the need to know the size in the .h file, so it can use 
a forward declaration.

The change I suggested allows the code to compile correctly.


http://gerrit.cloudera.org:8080/#/c/19475/5/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/5/be/src/runtime/io/data-cache.cc@117
PS5, Line 117: DEFINE_int32(data_cache_num_async_write_threads, 0,
             :     "Number of data cache async write threads. Write threads 
will write the cache "
             :     "asynchronously after IO thread read data, so IO thread will 
return more quickly. "
             :     "Write threads need to bound the extra memory consumption 
for holding the temporary "
             :     "buffer though. If this's 0, then write will be 
synchronous.");
             : DEFINE_string(data_cache_async_write_buffer_limit, "1GB",
             :     "Limit of the total buffer size used by asynchronous store 
tasks.");
Let's describe these options as experimental by adding an "(Experimental)" 
prefix to the description.

Impala is not officially supporting these startup options yet, and we need 
flexibility to remove them later if there is a different design for async IO 
that we prefer.



--
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Feb 2023 21:42:17 +0000
Gerrit-HasComments: Yes

Reply via email to