Re: How to flush `block_cache_capacity_mb` easily?
Hi, Todd.

I've temporarily pushed this patch to my repository:

https://github.com/jason-heo/kudu/commit/aff1fe181541671d2dc192ad9cb4ed2172a51826

Could you please check whether I'm on the right track? It will take more time before I push it to Cloudera's Gerrit, because I have yet to test whether my modification works well and I'm not familiar with the contributing process <https://kudu.apache.org/docs/contributing.html>.

Thanks,
Jason

2017-04-11 12:55 GMT+09:00 Todd Lipcon:

> Sure. Here's a high-level overview of the approach:
>
> - In src/kudu/util/cache.h, you'll need to add a new method like 'ClearCache'. In cache.cc and nvm_cache.cc you'll need to implement the method. You could implement it for the NVM cache to just return Status::NotSupported() if your main concern is the default (DRAM) cache.
> - In tserver_service.proto, add a new RPC method called 'ClearCache'.
> - In tserver.proto, define its request/response protobufs. They can probably be empty.
> - In tablet_service.h and tablet_service.cc, implement the new method. It can call through to BlockCache::GetInstance()->ClearCache() and then RespondSuccess.
> - In tablet_server-test.cc, add a test case which exercises this path.
>
> Hope that helps
>
> -Todd
Re: How to flush `block_cache_capacity_mb` easily?
Sure. Here's a high-level overview of the approach:

- In src/kudu/util/cache.h, you'll need to add a new method like 'ClearCache'. In cache.cc and nvm_cache.cc you'll need to implement the method. You could implement it for the NVM cache to just return Status::NotSupported() if your main concern is the default (DRAM) cache.
- In tserver_service.proto, add a new RPC method called 'ClearCache'.
- In tserver.proto, define its request/response protobufs. They can probably be empty.
- In tablet_service.h and tablet_service.cc, implement the new method. It can call through to BlockCache::GetInstance()->ClearCache() and then RespondSuccess.
- In tablet_server-test.cc, add a test case which exercises this path.

Hope that helps

-Todd

On Mon, Apr 10, 2017 at 6:14 PM, Jason Heo wrote:

> Great. I would appreciate it if you could guide me on how to contribute it. Then I'll try in my spare time.

--
Todd Lipcon
Software Engineer, Cloudera
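The first step of Todd's outline (a `ClearCache` method on the cache interface, implemented by the DRAM cache and rejected by the NVM cache) might look roughly like the sketch below. This is not Kudu's code: `Status` here is a tiny stand-in for Kudu's status type, and `Cache`, `DramLruCache`, and `NvmCache` are hypothetical simplified classes that only illustrate the shape of the change, assuming a plain LRU cache of strings.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a status type; only what this sketch needs.
struct Status {
  bool ok;
  std::string message;
  static Status OK() { return Status{true, ""}; }
  static Status NotSupported(std::string msg) {
    return Status{false, std::move(msg)};
  }
};

// Simplified cache interface with the new ClearCache() method.
class Cache {
 public:
  virtual ~Cache() = default;
  virtual void Insert(const std::string& key, std::string value) = 0;
  // Returns true and fills *value on a hit; returns false on a miss.
  virtual bool Lookup(const std::string& key, std::string* value) = 0;
  // Drops every cached entry, or reports that this is unsupported.
  virtual Status ClearCache() = 0;
  virtual std::size_t size() const = 0;
};

// In-memory LRU cache: the front of the list is the most recently used.
class DramLruCache : public Cache {
 public:
  explicit DramLruCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  void Insert(const std::string& key, std::string value) override {
    auto it = index_.find(key);
    if (it != index_.end()) {  // Replace an existing entry.
      entries_.erase(it->second);
      index_.erase(it);
    }
    entries_.emplace_front(key, std::move(value));
    index_[key] = entries_.begin();
    if (entries_.size() > capacity_) {  // Evict the least recently used.
      index_.erase(entries_.back().first);
      entries_.pop_back();
    }
  }

  bool Lookup(const std::string& key, std::string* value) override {
    auto it = index_.find(key);
    if (it == index_.end()) return false;
    // Move the entry to the front; list iterators stay valid after splice.
    entries_.splice(entries_.begin(), entries_, it->second);
    *value = it->second->second;
    return true;
  }

  Status ClearCache() override {
    index_.clear();
    entries_.clear();
    return Status::OK();
  }

  std::size_t size() const override { return entries_.size(); }

 private:
  using Entry = std::pair<std::string, std::string>;
  std::size_t capacity_;
  std::list<Entry> entries_;
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};

// Placeholder for a cache backend that opts out of clearing.
class NvmCache : public Cache {
 public:
  void Insert(const std::string&, std::string) override {}
  bool Lookup(const std::string&, std::string*) override { return false; }
  Status ClearCache() override {
    return Status::NotSupported("ClearCache is not implemented for NVM");
  }
  std::size_t size() const override { return 0; }
};
```

An RPC handler along the lines Todd describes would then call `ClearCache()` on the block cache instance and turn the returned status into the RPC response; that wiring depends on Kudu's service code and is left out here.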
Re: How to flush `block_cache_capacity_mb` easily?
Great. I would appreciate it if you could guide me on how to contribute it. Then I'll try in my spare time.

2017-04-11 7:46 GMT+09:00 Todd Lipcon:

> Makes sense. I don't think it would be particularly difficult to add such an API. Any interest in contributing a patch? I'm happy to point you in the right direction, if so.
>
> -Todd
Re: How to flush `block_cache_capacity_mb` easily?
On Sun, Apr 9, 2017 at 6:38 PM, Jason Heo wrote:

> Hi Todd.
>
> I hope you had a good weekend.
>
> Exactly, I'm testing the latency of cold-cache reads from SATA disks, and the performance of different schema designs as well.
>
> We're currently using Elasticsearch for an analytic service. ES has a "clear cache API" feature, which makes it easy for me to test.

Makes sense. I don't think it would be particularly difficult to add such an API. Any interest in contributing a patch? I'm happy to point you in the right direction, if so.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
Re: How to flush `block_cache_capacity_mb` easily?
Hi Todd.

I hope you had a good weekend.

Exactly, I'm testing the latency of cold-cache reads from SATA disks, and the performance of different schema designs as well.

We're currently using Elasticsearch for an analytic service. ES has a "clear cache API" feature, which makes it easy for me to test.

Thanks.

Jason.

2017-04-08 5:05 GMT+09:00 Todd Lipcon:

> Hey Jason,
>
> Can I ask what the purpose of the testing is?
>
> One thing to note is that we're currently leaving a fair bit of performance on the table for cold-cache reads from spinning disks. So, if you find that the performance is not satisfactory, it's worth being aware that we will likely make some significant improvements in this area in the future.
>
> https://issues.apache.org/jira/browse/KUDU-1289 has some details.
>
> -Todd
Re: How to flush `block_cache_capacity_mb` easily?
Hey Jason,

Can I ask what the purpose of the testing is?

One thing to note is that we're currently leaving a fair bit of performance on the table for cold-cache reads from spinning disks. So, if you find that the performance is not satisfactory, it's worth being aware that we will likely make some significant improvements in this area in the future.

https://issues.apache.org/jira/browse/KUDU-1289 has some details.

-Todd

On Fri, Apr 7, 2017 at 8:44 AM, Dan Burkert wrote:

> Hi Jason,
>
> There is no command to have Kudu evict its block cache, but restarting the tablet server process will have that effect. Ideally all written data will be flushed before the restart; otherwise startup/bootstrap will take a bit longer. Flushing typically happens within 60s of the last write. Waiting for flush and compaction is also a best practice for read-only benchmarks.
>
> I'm not sure if someone else on the list has an easier way of determining when a flush happens, but I typically look at the 'MemRowSet' memory usage for the tablet on the /mem-trackers HTTP endpoint; it should show something minimal like 256B if it's fully flushed and empty. You can also see details about how much memory is in the block cache on that page, if that interests you.
>
> - Dan

--
Todd Lipcon
Software Engineer, Cloudera
Re: How to flush `block_cache_capacity_mb` easily?
Hi Jason,

There is no command to have Kudu evict its block cache, but restarting the tablet server process will have that effect. Ideally all written data will be flushed before the restart; otherwise startup/bootstrap will take a bit longer. Flushing typically happens within 60s of the last write. Waiting for flush and compaction is also a best practice for read-only benchmarks.

I'm not sure if someone else on the list has an easier way of determining when a flush happens, but I typically look at the 'MemRowSet' memory usage for the tablet on the /mem-trackers HTTP endpoint; it should show something minimal like 256B if it's fully flushed and empty. You can also see details about how much memory is in the block cache on that page, if that interests you.

- Dan

On Thu, Apr 6, 2017 at 11:23 PM, Jason Heo wrote:

> Hi.
>
> I'm using Apache Kudu 1.2 on CDH 5.10.
>
> Currently, I'm doing a performance test of Kudu.
>
> Flushing the OS page cache is easy, but I don't know how to flush `block_cache_capacity_mb` easily.
>
> I currently execute a SELECT statement over an unrelated table to evict the cached blocks of the table under test.
>
> It is cumbersome, so I'd like to know whether there is a command for flushing block caches (or any other Kudu caches that I don't know about yet).
>
> Thanks.
>
> Regards,
> Jason
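Dan's manual check (watch the MemRowSet usage until it drops to something minimal before restarting) could be automated with a small polling loop. The sketch below is only an illustration under stated assumptions: `WaitForFlush` and its parameters are hypothetical names, and the caller has to supply `read_memrowset_bytes`, a function that obtains the tablet's current MemRowSet consumption in bytes (for example by fetching and parsing the /mem-trackers page, which is not shown here). The 256-byte threshold in the usage note comes from Dan's message and may differ between versions.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Polls the supplied reader until it reports at most threshold_bytes, or
// until max_polls readings have been taken. Returns true if the threshold
// was reached, false if polling gave up first.
bool WaitForFlush(const std::function<std::int64_t()>& read_memrowset_bytes,
                  std::int64_t threshold_bytes,
                  int max_polls,
                  std::chrono::milliseconds poll_interval) {
  assert(max_polls > 0);
  for (int i = 0; i < max_polls; ++i) {
    if (read_memrowset_bytes() <= threshold_bytes) {
      return true;  // MemRowSet looks flushed and empty.
    }
    if (i + 1 < max_polls) {
      std::this_thread::sleep_for(poll_interval);  // Wait before retrying.
    }
  }
  return false;
}
```

A benchmark script might call `WaitForFlush(reader, 256, 30, std::chrono::milliseconds(5000))` after the last write and restart the tablet server only once it returns true.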
How to flush `block_cache_capacity_mb` easily?
Hi.

I'm using Apache Kudu 1.2 on CDH 5.10.

Currently, I'm doing a performance test of Kudu.

Flushing the OS page cache is easy, but I don't know how to flush `block_cache_capacity_mb` easily.

I currently execute a SELECT statement over an unrelated table to evict the cached blocks of the table under test.

It is cumbersome, so I'd like to know whether there is a command for flushing block caches (or any other Kudu caches that I don't know about yet).

Thanks.

Regards,
Jason