On Fri, Apr 24, 2015 at 5:43 AM, Ravikumar Govindarajan <[email protected]> wrote:
> > Ravi,
> >
> > Sorry not getting back to you sooner.
>
> No problems at all.
>
> > There were many times when I used it when it would completely fill the
> > configured block cache. This would in turn cause things that were
> > actually being used to be pushed out of the BC and performance would
> > suffer.
>
> Yes, quite true. Some sort of read storm in the network could cause
> problems. I saw a ThrottledIndexInput in old code. Guess it was meant for
> this...
>
> > Now the auto save and load warmup feature would be a completely
> > different piece that I think would be of great benefit for restarts /
> > failures.
>
> Yup. This must be a good feature for single-machine failures/restarts.
>
> But I just realized that the worst case for this auto-save/load will be
> cluster-wide restarts. Do you think ThrottledIndexInput can serve us well
> here also, or do we need some higher-order construct?

Not sure. Perhaps a scheduler would be better, so that user queries take
priority over warming the cache back up.

Just so you are aware, we are working to allow Blur to move blocks to
datanodes where the shards are being served. That way short-circuit reads
can be used. This will likely help with any cache misses, and if we create
the cache reload feature it will likely speed up the reload process.

Aaron

> --
> Ravi
>
> On Thu, Apr 23, 2015 at 5:20 PM, Aaron McCurry <[email protected]> wrote:
>
> > Ravi,
> >
> > Sorry not getting back to you sooner.
> >
> > I agree that auto saving and loading of the blocks that are actually
> > used would be a benefit. However, the previous warmup process in
> > practice pulled in too much data that was never accessed. There were
> > many times when I used it when it would completely fill the configured
> > block cache. This would in turn cause things that were actually being
> > used to be pushed out of the BC and performance would suffer.
> > So we ran Blur for some time with the warmup process disabled and, for
> > the most part, the entire system worked better.
> >
> > Now the auto save and load warmup feature would be a completely
> > different piece that I think would be of great benefit for restarts /
> > failures.
> >
> > Aaron
> >
> > On Thu, Apr 23, 2015 at 7:29 AM, Ravikumar Govindarajan <[email protected]> wrote:
> >
> > > Oh, I am sorry…
> > >
> > > I find that the Blur warmup code has been completely removed from the
> > > repository now. Are there reasons for doing so?
> > >
> > > I thought auto-saving/loading of the block cache could benefit from
> > > this nicely.
> > >
> > > --
> > > Ravi
> > >
> > > On Tue, Apr 21, 2015 at 6:06 PM, Ravikumar Govindarajan <[email protected]> wrote:
> > >
> > > > I was just looking at the Blur warmup logic. I could classify it in
> > > > 2 stages...
> > > >
> > > > Stage I
> > > > It looks like openShard [DistributedIndexServer] submits the warmup
> > > > request on a separate warmupExecutor. This is exactly what is
> > > > needed for loading the auto-saved block cache from HDFS...
> > > >
> > > > Stage II
> > > > But when I prodded a little bit deeper, it got complex.
> > > > TraceableDirectory, IndexTracer with thread-local stuff, etc… I
> > > > could not follow the code...
> > > >
> > > > I decided on an impl as follows...
> > > >
> > > > public class BlockCacheWarmup extends BlurIndexWarmup {
> > > >
> > > >   @Override
> > > >   public void warmBlurIndex(final TableDescriptor table, final String shard,
> > > >       IndexReader reader, AtomicBoolean isClosed, ReleaseReader releaseReader,
> > > >       AtomicLong pauseWarmup) throws IOException {
> > > >     for (each segment) {
> > > >       for (each file) {
> > > >         // Read cache-metadata from HDFS...
> > > >         // Directly open CacheIndexInput and populate the block cache
> > > >       }
> > > >     }
> > > >   }
> > > > }
> > > >
> > > > My question is…
> > > >
> > > > If I explicitly bypass Stage II {TraceableDirectory and friends}
> > > > and just populate the block cache alone, will this work fine? Am I
> > > > missing something obvious?
> > > >
> > > > Any help is much appreciated…
> > > >
> > > > --
> > > > Ravi
> > > >
> > > > On Fri, Feb 6, 2015 at 2:58 PM, Aaron McCurry <[email protected]> wrote:
> > > >
> > > > > Yes, exactly. That way we could provide a set of blocks to be
> > > > > cached with priority, so the most important bits get cached
> > > > > first.
> > > > >
> > > > > Aaron
> > > > >
> > > > > On Fri, Feb 6, 2015 at 12:43 AM, Ravikumar Govindarajan <[email protected]> wrote:
> > > > >
> > > > > > That's a great idea...
> > > > > >
> > > > > > You mean that instead of saving the blocks themselves, we can
> > > > > > store metadata {block-ids} for each file/shard in HDFS that is
> > > > > > written to the block cache...
> > > > > >
> > > > > > Opening a shard can then use this metadata to re-populate the
> > > > > > hot parts of the files...
> > > > > >
> > > > > > We also need to handle evictions & file deletes...
> > > > > >
> > > > > > Is this what you are hinting at?
> > > > > >
> > > > > > --
> > > > > > Ravi
> > > > > >
> > > > > > On Thu, Feb 5, 2015 at 7:03 PM, Aaron McCurry <[email protected]> wrote:
> > > > > >
> > > > > > > On Thu, Feb 5, 2015 at 6:30 AM, Ravikumar Govindarajan <[email protected]> wrote:
> > > > > > >
> > > > > > > > I noticed in the BigTable impl of Cassandra that they store
> > > > > > > > the "Memtable" info periodically onto disk to avoid cold
> > > > > > > > start-ups...
> > > > > > > >
> > > > > > > > Is it possible to do something like that for Blur's block
> > > > > > > > cache, preferably in HDFS itself, so that both cold
> > > > > > > > start-ups and shard take-overs don't affect end-user
> > > > > > > > latencies...
> > > > > > > > In Cassandra's case, the size of the Memtable will
> > > > > > > > typically be 2 GB-4 GB. But in the case of Blur, it could
> > > > > > > > even be 100 GB. So I don't know if attempting such stuff is
> > > > > > > > a good idea.
> > > > > > > >
> > > > > > > > Any help is appreciated much...
> > > > > > >
> > > > > > > Yeah, I agree that the caches could be very large and storing
> > > > > > > them in HDFS could be counterproductive. Also, the block
> > > > > > > cache represents what is on the single node and it's not
> > > > > > > really broken up by shard or table. So if a node was
> > > > > > > restarted without a full cluster restart, there's no
> > > > > > > guarantee that the shard server will get the same shards back
> > > > > > > that it was serving before.
> > > > > > >
> > > > > > > I like the idea, though. Perhaps we can write out what parts
> > > > > > > of what files the cache was storing, in LRU order. Then any
> > > > > > > server that is opening the shard can know what parts of what
> > > > > > > files were hot the last time it was open. Then it could
> > > > > > > choose to populate the cache upon shard opening.
> > > > > > >
> > > > > > > Thoughts?
> > > > > > >
> > > > > > > Aaron
> > > > > > >
> > > > > > > > --
> > > > > > > > Ravi
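[The metadata idea discussed in this thread, persisting which parts of which files were hot in LRU order rather than the cached blocks themselves, might be sketched roughly as follows. This is only an illustration, not Blur code: all names are invented, a LinkedHashMap in access order stands in for the block cache's LRU bookkeeping, and a local file stands in for the HDFS metadata file.]

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;

// Hypothetical sketch: persist only (file, blockId) keys in LRU order
// instead of the cached blocks, then replay the keys on shard open so the
// hottest blocks are loaded first.
public class BlockCacheManifest {

  // accessOrder=true: iteration runs from least- to most-recently used.
  private final LinkedHashMap<String, Long> lru =
      new LinkedHashMap<String, Long>(16, 0.75f, true);

  // Called whenever the cache serves a block; the put refreshes LRU order.
  public void recordAccess(String fileName, long blockId) {
    lru.put(fileName + "/" + blockId, blockId);
  }

  // Write keys hottest-first so a reload can stop early under memory pressure.
  public void save(Path manifest) throws IOException {
    List<String> keys = new ArrayList<String>(lru.keySet());
    Collections.reverse(keys);
    Files.write(manifest, keys);
  }

  // On shard open, these keys would drive which blocks to pull back into cache.
  public static List<String> load(Path manifest) throws IOException {
    return Files.readAllLines(manifest);
  }
}
```

[Evictions and file deletes, which the thread calls out, would additionally require validating manifest entries against the live index files at load time.]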

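[Likewise, the suggestion earlier in the thread that user queries take priority over warming the cache back up could be prototyped with a priority-ordered executor queue. Again a sketch under assumptions: WarmupScheduler, PrioritizedTask, and the submit methods are invented for illustration and are not part of Blur.]

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a single executor whose queue orders cache-warmup
// reads behind user-query reads, so warming never starves live queries.
public class WarmupScheduler {

  // Lower ordinal sorts first in the priority queue, so QUERY wins.
  enum Priority { QUERY, WARMUP }

  static class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
    final Priority priority;
    final Runnable delegate;

    PrioritizedTask(Priority priority, Runnable delegate) {
      this.priority = priority;
      this.delegate = delegate;
    }

    @Override
    public void run() {
      delegate.run();
    }

    @Override
    public int compareTo(PrioritizedTask other) {
      return priority.compareTo(other.priority);
    }
  }

  // Single worker plus a priority queue: queued queries always jump ahead
  // of queued warmup work, though an in-flight warmup task is not preempted.
  private final ThreadPoolExecutor executor = new ThreadPoolExecutor(
      1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<Runnable>());

  public void submitQuery(Runnable task) {
    executor.execute(new PrioritizedTask(Priority.QUERY, task));
  }

  public void submitWarmup(Runnable task) {
    executor.execute(new PrioritizedTask(Priority.WARMUP, task));
  }

  public void shutdown() throws InterruptedException {
    executor.shutdown();
    executor.awaitTermination(10, TimeUnit.SECONDS);
  }
}
```

[Note that execute() is used rather than submit(): submit() would wrap each task in a FutureTask, which is not Comparable and would break the priority queue.]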