[Feel free to remove hypertable-dev if you feel it's more of a -user thread...]
On Thu, Apr 9, 2009 at 9:25 AM, Doug Judd <[email protected]> wrote:
> I think the variance you observed here must have just been a coincidence.
> This property is no longer used by the KFS broker.

Hehe, fair enough. :-) It was most likely a coincidence, but I thought I'd ask for kicks: is there any possibility that the value of RangeServer.AccessGroup.CellCache.PageSize could have anything to do with how quickly the DFS broker's memory would grow?

> In other words, you should be able to stop Hypertable by just killing all
> the binaries. Have you observed that this does not work?

Well, before changing the value of PageSize, I observed that when I left the RangeServers alone they would grow enough to start swapping, and I'd have to restart Hypertable to try to get it back up. On a few occasions I wasn't able to get the Hypertable instance to come back up fully, because one RangeServer or another would attempt a recovery and die some ten or twenty seconds after I restarted it. It complained, I think, about a corrupt CellStore file, and a line near the error commonly said it was able to read "0 of 56 bytes of data". I'll post some real context from a log file if I see the problem again, but it hasn't happened since I've been running with the new config (knock on wood).

Sort of still on-topic is the general question of which config values in particular we should try to tweak in order to tune various types of performance behavior. The ConfigProperties page and --help-config give some description of what there is, but I haven't seen much in the way of practical examples that map a performance problem to a config suggestion.

Right now we're interested in smoothing out occasional drops in raw select performance. For example, I can do select * from our "events" table and most of the time I reliably get around 16MB/s of throughput.
However, sometimes the throughput drops to something like 200KB/s for 20-30 seconds, then goes back up to normal. Watching the RangeServers during the slowdowns, I can usually find one of them really busy with something like major log compaction.

I'm trying to figure out both which config options to try and what the best general path is for identifying bottlenecks throughout the stack. I'm wondering if there is an easy way of asking the RangeServers whether they are seeing poor performance from the DFS/network or are being slow for some other reason, beyond what I can get from parsing their logs (which can certainly be quite useful).

Thanks a bunch for your help! :-)

Josh
