Re: With latest versions, can I have just one cache to store small objects (several tens of bytes) and large objects (~20MiB) together?

2017-04-26 Thread dormando
>
> No, I'm not talking about performance trouble with 63 slab classes (and I 
> think I mistook that -- it should be 62 per PR 97). I'm talking about how, in
> old versions, we could have two hundred classes, which I believed could 
> result in memory inefficiency.
>
> (and in old versions if I wanted to use a combined cache, I might choose to 
> have 200 classes to reduce intra-slab wastes).

Seems to have a negligible impact on memory overhead to have so many
classes. Most people leave it at the default (~43), and regardless of the
growth factor there're more gaps on the higher end. The benefits of the
better LRU algorithm for hit ratio seem to outweigh the costs there.
Though I wish it were less of a problem to steal two bits from elsewhere
so I wouldn't have to do that :)

On the other side, reducing the max item chunk size makes the
60ish slab classes more efficient. I'd also like 1.5.0 to actually use all
available classes by default...

>
> Hmm. I didn't know that. Must've missed it in the Release Notes. Still, nice 
> to learn that!

https://github.com/memcached/memcached/wiki/ReleaseNotes1436 :)
  
> Yep, I get you. I think I can spot any anomaly through those metrics you 
> describe.
>
> So, I guess I'm good to go. I'm actually confident the results will be 
> positive. Please also post the script of the revised algorithm -- definitely
> interested in testing it for you against our workloads.
>
> Thanks (again) for your time and response & cheers,
> - Mnjul

Good luck! I'll post to the mailing list, so keep an eye out.

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"memcached" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to memcached+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: With latest versions, can I have just one cache to store small objects (several tens of bytes) and large objects (~20MiB) together?

2017-04-26 Thread dormando
responses inline.

On Wed, 26 Apr 2017, Min-Zhong "John" Lu wrote:

> Hi again,
>
> When I was on my previous topic, I came to realize this question might be 
> asked too.
>
> Background story:
> I have employed memcached since around its 1.4.x-ish versions, and I've 
> always had a need to store objects larger than 1MB --- our cached objects 
> can be as small as just a handful of integers, and as large as ~20MiB. After 
> some experimentation with the old 1.4.14-ish version and our access pattern, 
> I came to realize I probably needed two separate memcached instances, one 
> with the default max item size (1MB), and the other with -I 20m -n 1048576. 
> The two had different growth factors too.
>
> The rationale behind this was that, at that 1.4.14-ish moment, an -I 20m, 
> combined with the exponential growth-rate mechanism, meant I had lots of 
> wasted memory in the higher slab classes. For example, some 14MiB object 
> needed to be stored in some ~18MiB slab class, which created a ~4MiB waste 
> immediately. A smaller growth factor would, indeed, alleviate the problem, 
> but at that time I feared the larger number of slab classes would impose other
> performance bottlenecks (this was a guess, and wasn't formally profiled, 
> though). So, I decided to employ two separate memcached instances.
>
> Of course, this came with added complexity: there would be times when the 
> small cache still had a decent amount of available memory while the large
> cache was already evicting. And it's harder to maintain too: I needed to 
> design an adapter that pre-calculates item sizes to decide which instance I
> should send my item to (I don't want to "always feed the small cache first, 
> and if I get a toobig for that, feed the large cache." That's slow.), and
> gets/deletions are more complex too. (The size pre-calculation would be 
> what's there in my previous topic.)

Not easy to route by key prefix or something?
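A minimal sketch of that kind of key-prefix routing, in case it helps -- the prefixes, ports, and the idea that all large values live under known namespaces are my assumptions for illustration, not anything from your setup:

```python
# Hypothetical key-prefix router for a two-instance setup: keys are
# namespaced so the client can pick an instance without sizing the value.
# Prefixes and ports below are made up for illustration.

LARGE_PREFIXES = ("report:", "blob:")    # namespaces known to hold big values

INSTANCES = {
    "small": ("127.0.0.1", 11211),  # default -I 1m instance
    "large": ("127.0.0.1", 11212),  # -I 20m instance
}

def route(key: str):
    """Return the (host, port) of the instance that should own this key."""
    if key.startswith(LARGE_PREFIXES):
        return INSTANCES["large"]
    # Everything else defaults to the small pool.
    return INSTANCES["small"]
```

The upside over size pre-calculation is that routing is decided from the key alone, so gets and deletes need no adapter logic at all.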

> The actual question:
> But all these were quite a few years ago. With the latest large item-related 
> improvements, I wonder if my original concerns still stand. Can I now just
> employ one memcached instance, in which I can store both small and huge 
> items, without worrying about performance? (And hey --- if I remove those
> adapter codes, my colleagues will be happy too.)
>
> For example, with -o modern,slab_chunk_max=524288, I won't have too many slab 
> classes (well, we can have at most 63 now regardless of slab_chunk_max, I
> know), which I had feared would impede memcached performance. Also, if I 
> store a large object, the most memory I can waste is 512KiB. For me, this
> constitutes a good trade-off compared with all the dirty work I had to do 
> for separate caches.

63 slab classes would impede performance how? Memory efficiency or are you
talking about throughput/etc?

524288 should be the default in -o modern already. Also, as of the latest
release the waste for large items is < 512k. It should be close to exact,
as it's able to use a chunk from a different slab class for the final
piece.
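To make the chunking arithmetic concrete, here's a simplified model (it ignores per-item header overhead and the rounding of the final chunk into its slab class, so it's an approximation of the layout, not the exact allocator math):

```python
# With slab_chunk_max = 512KiB, a large value is split into full 512KiB
# chunks plus one final chunk drawn from whatever smaller slab class fits
# the remainder -- which is why the waste is well under 512KiB.

CHUNK_MAX = 512 * 1024  # 524288 bytes

def chunk_layout(value_len: int):
    """Return (number of full 512KiB chunks, size of the final chunk)."""
    return divmod(value_len, CHUNK_MAX)

# A ~14.2MiB value:
full, rem = chunk_layout(14_900_000)
print(full, rem)  # 28 full 512KiB chunks + a ~215KiB final chunk
```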

> If combining my separate caches back into one cache sounds worth trying, what 
> should I watch out for? Are there any recommended parameters? Or stats that
> I should pay extra attention to, to know if memcached is underperforming, 
> under this scenario?

Yeah that is generally the intent here. Chunked items were done for two
main reasons:

1) Even large-only pools had bad memory efficiency due to how far apart
the slab classes are.
2) Enable better mixed-usage cases.

As much as possible, try `-o modern`'s suggestions, since I'm trying to
tune -o modern to become the new defaults in 1.5.0. For chunked items,
that makes the largest slab class 512k, and larger items tend to be a
multiple of that + a final chunk from a different slab class.
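Since -o modern already implies slab_chunk_max=524288, a combined-cache invocation along these lines should be enough (the -m memory limit here is a placeholder; size it for your deployment):

```shell
# Single combined instance: 20MiB max item, modern defaults.
# -o modern already sets slab_chunk_max=524288.
memcached -m 4096 -I 20m -o modern
```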

You can still run into problems based on memory pressure, and the fact
that the slab rebalancer algorithm isn't great yet. If you wouldn't mind,
I will be posting a script soon to test the algorithm I hope to use for
1.5.0.

What I mean by memory pressure is: if your large items *can* dominate the
cache, slab classes for smaller items may starve. You could end up with one
or the other not living for very long in cache.

The easiest way to monitor that is to graph some of the stuff out of
`stats items` at least; you can see eviction pressure per slab class, as
well as evicted_time, cold_age, etc. On a well-balanced system the
"idleness" of objects in the LRU tails should be similar.

You can also watch for evicted_unfetched rising, or get_hits dropping for
particular slab classes over time.
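If it helps, a rough sketch of scraping those per-class counters over the plain text protocol -- the host/port are assumptions, and which metrics you graph is up to you:

```python
# Pull per-slab-class metrics out of `stats items` for graphing.
# Output lines look like "STAT items:5:evicted_unfetched 12".
import socket

def parse_stats_items(text):
    """Parse `stats items` output into {class_id: {stat_name: value}}."""
    per_class = {}
    for line in text.splitlines():
        if not line.startswith("STAT items:"):
            continue
        key, value = line[len("STAT items:"):].split(" ")
        class_id, stat = key.split(":", 1)
        per_class.setdefault(int(class_id), {})[stat] = int(value)
    return per_class

def stats_items(host="127.0.0.1", port=11211):
    """Fetch and parse `stats items` from a running memcached."""
    with socket.create_connection((host, port)) as s:
        s.sendall(b"stats items\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            data += s.recv(4096)
    return parse_stats_items(data.decode())

# Graph e.g. evicted, evicted_unfetched, and get_hits per class over time
# and watch for one class's LRU-tail idleness diverging from the rest.
```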

There can also be bugs. The code's had a few rounds of work and gets a lot
of bench testing but it's still relatively new. I do try very hard to not
have bugs.

As it matures, I might lower the default max class to 256k or even 128k,
but I don't recommend trying that right now. Benefit would be getting the
slab classes even closer together, improving some memory efficiency.

have fun,

With latest versions, can I have just one cache to store small objects (several tens of bytes) and large objects (~20MiB) together?

2017-04-26 Thread Min-Zhong "John" Lu
Hi again,

When I was on my previous topic, I came to realize this question might be 
asked too.

*Background story:*
I have employed memcached since around its 1.4.x-ish versions, and I've 
always had a need to store objects larger than 1MB --- our cached objects 
can be as small as just a handful of integers, and as large as ~20MiB. After 
some experimentation with the old 1.4.14-ish version and our access pattern, 
I came to realize I probably needed two separate memcached instances, one 
with the default max item size (1MB), and the other with -I 20m -n 1048576. 
The two had different growth factors too.

The rationale behind this was that, at that 1.4.14-ish moment, an -I 20m, 
combined with the exponential growth-rate mechanism, meant I had lots of 
wasted memory in the higher slab classes. For example, some 14MiB object 
needed to be stored in some ~18MiB slab class, which created a ~4MiB waste 
immediately. A smaller growth factor would, indeed, alleviate the problem, 
but at that time I feared the larger number of slab classes would impose 
other performance bottlenecks (this was a guess, and wasn't formally 
profiled, though). So, I decided to employ two separate memcached instances.
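A toy model of that geometric slab sizing makes the waste concrete (the 1.5 growth factor and 1MiB base are illustrative assumptions, not the actual settings that were in use):

```python
# Old-style sizing problem: slab class sizes grow geometrically, and an
# item pays for the whole smallest class that fits it.

def slab_classes(base=1_048_576, factor=1.5, max_size=20 * 1024 * 1024):
    """Generate slab class sizes from `base` up to `max_size`."""
    sizes, size = [], base
    while size < max_size:
        sizes.append(size)
        size = int(size * factor)
    sizes.append(max_size)
    return sizes

def waste_for(item_len, classes):
    """Bytes wasted when the item lands in the smallest class that fits."""
    fit = min(c for c in classes if c >= item_len)
    return fit - item_len

classes = slab_classes()
print(waste_for(14 * 1024 * 1024, classes))  # 3235840 bytes, ~3.1MiB wasted
```

With these toy parameters a 14MiB item lands in a ~17.1MiB class and wastes ~3.1MiB, roughly the situation described above.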

Of course, this came with added complexity: there would be times when the 
small cache still had a decent amount of available memory while the large 
cache was already evicting. And it's harder to maintain too: I needed to 
design an adapter that pre-calculates item sizes to decide which instance I 
should send my item to (I don't want to "always feed the small cache first, 
and if I get a toobig for that, feed the large cache." That's slow.), 
and gets/deletions are more complex too. (The size pre-calculation would be 
what's there in my previous topic.)

*The actual question:*
But all these were quite a few years ago. With the latest large 
item-related improvements, I wonder if my original concerns still stand. 
Can I now just employ one memcached instance, in which I can store both 
small and huge items, without worrying about performance? (And hey --- if I 
remove those adapter codes, my colleagues will be happy too.)

For example, with -o modern,slab_chunk_max=524288, I won't have too many 
slab classes (well, we can have at most 63 now regardless of 
slab_chunk_max, I know), which I had feared would impede memcached 
performance. Also, if I store a large object, the most memory I can waste 
is 512KiB. For me, this constitutes a good trade-off compared with all the 
dirty work I had to do for separate caches.

If combining my separate caches back into one cache sounds worth trying, 
what should I watch out for? Are there any recommended parameters? Or stats 
that I should pay extra attention to, to know if memcached is 
underperforming, under this scenario?

Thanks!
- Mnjul
