On Oct 4, 2007, at 9:11, Steven Grimm wrote:
Tobias Lütke wrote:
This also means that the number of tags in the system will be quite
large. There will be one or more tags for each row in the articles
table. I expect the number of tags to be vastly larger than the
number of keys in future memcached servers.
Which is why I'm kind of skeptical about the whole tags thing,
honestly. It seems like an optimization for the rare case
(invalidation) at the expense of the vastly more common case
(getting values by ID) by virtue of reducing the amount of memory
available for keys and values. Fewer items in the cache equals
lower hit rate.
There are only a large number of tags if you create a large number
of tags.
Obviously different applications have different usage. I can tell
you that in our application, gets outnumber deletes by at least two
orders of magnitude across the board, and many of our objects are
so small that any tag would likely eat more memory than the value
being cached. (Not, perhaps, than the object header, but certainly
more than the value.)
I would hope that it'd generally be the case that deletes aren't
common. I'm hoping that tags aren't going to encourage people to
delete *more*, but to delete more accurately.
Also, invalidating a tag means broadcasting a "delete by tag"
request to all the memcached servers since you have no way of
knowing which servers have objects with which tags. For large sites
with lots of memcached servers, or even medium-sized sites using
the "run a memcached instance on each web host" approach, that
means a ton of outgoing requests, almost all of which are likely to
not invalidate anything at all if the tags are relatively sparse.
It's a lot of requests, but only rarely. Broadcast isn't particularly
expensive in my client, but I certainly can see how it is for others.
It comes down to measurements, I suppose. If tags help, then it'll
be useful.
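To make the broadcast concern concrete, here is a minimal sketch of what a client-side "delete by tag" might look like. Everything in it is hypothetical: memcached has no such command in this discussion's timeframe, and FakeServer, delete_by_tag and broadcast_delete_by_tag are illustrative stand-ins, not any real client's API. It only counts how many requests go out versus how many servers actually held a tagged item.

```python
class FakeServer:
    """In-memory stand-in for one memcached instance (hypothetical)."""

    def __init__(self):
        self.items = {}  # key -> (value, frozenset of tags)

    def set(self, key, value, tags=()):
        self.items[key] = (value, frozenset(tags))

    def delete_by_tag(self, tag):
        # Remove every item carrying the tag; return how many were removed.
        doomed = [k for k, (_, tags) in self.items.items() if tag in tags]
        for k in doomed:
            del self.items[k]
        return len(doomed)


def broadcast_delete_by_tag(servers, tag):
    """Send the delete to every server, since the client cannot know
    which ones hold items with this tag.

    Returns (requests_sent, servers_that_deleted_something).
    """
    requests_sent = 0
    useful = 0
    for server in servers:
        requests_sent += 1
        if server.delete_by_tag(tag) > 0:
            useful += 1
    return requests_sent, useful
```

With five servers and a tag present on only one of them, the call sends five requests and only one does any work, which is the sparse-tag case described above. Whether that cost matters is exactly the thing that would need measuring.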
Not saying the feature isn't worth adding; there are doubtless
valid use cases for it. But whatever implementation finally
arrives, IMO, shouldn't impose any per-object memory overhead on
objects that have no tags at all. Or if it does, it should be
surrounded by #ifdef so that sites that don't need it don't see
their available cache memory drop substantially when they upgrade.
I was imagining the overhead being something like 8 bytes per item
on a 32-bit system as well as the tag hash table.
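For a sense of scale, the 8-bytes-per-item figure can be turned into a rough estimate. This is only back-of-the-envelope arithmetic: the function names, item counts and cache size are assumptions chosen for illustration, and the tag hash table itself is not included.

```python
def tag_overhead_bytes(item_count, per_item_bytes=8):
    """Total fixed overhead if every item carries per_item_bytes extra bytes."""
    return item_count * per_item_bytes


def overhead_fraction(item_count, cache_bytes, per_item_bytes=8):
    """Share of the cache consumed by the per-item overhead alone."""
    return tag_overhead_bytes(item_count, per_item_bytes) / cache_bytes
```

For example, tag_overhead_bytes(10_000_000) is 80,000,000 bytes, so a server holding ten million items would give up roughly 76 MiB to the per-item field whether or not any item is tagged. The relative cost grows as the cached values get smaller, which is the concern raised above for very small objects.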
--
Dustin Sallings