I thought messages persisted in the bookies even after someone consumed them. I had one test topic with one publisher and one subscriber, and I published about 5 messages to it. I then subscribed and consumed the messages from my listener, which just prints each message along with its sequence number. When I remove this listener and start another one, the new listener gets all the previous messages from the topic. How is this possible unless the messages are piling up somewhere (the bookies)? Does the hub keep all the messages? I am somewhat confused about how consuming messages gets rid of old ones. My understanding was that they persist in the bookies. Correct me if I am wrong.
Also, I would like to contribute by adding a delete method (if that is possible), topic eviction, etc. However, I feel I need to study the system first, and I am not finding much information at http://zookeeper.apache.org/bookkeeper/docs/trunk/hedwigDesign.html. Is there any other design documentation with more details? Where is the best place to learn how hedwig is built without digging through all of the code?

Regards,

On Thu, Dec 22, 2011 at 9:56 AM, Ivan Kelly wrote:

> On Thu, Dec 22, 2011 at 09:25:57AM -0600, Daniel S. Kim wrote:
> > When I say "Delete", I mean that I want all the stuff about that topic to
> > be gone. The reason is I need topic management to see if they are being
> > used or not. If they are not being used for a while, I expire the topic and
> > kill it. This is what I should do to save resources. Imagine a large number
> > of hedwig users that start new topics, send messages, etc. All these data
> > build up eventually (and I believe there is no eviction mechanism and
> > policy yet). Even though hedwig lets users keep messages persistently, I
> > don't think it should persist when the user wants it gone.
> The only reason data should build up like this is if there is a user
> subscribed to a topic, and it hasn't consumed all messages
> published to the topic. Otherwise it should be safe to periodically
> delete/garbage collect topics that have no subscribers, but I don't
> think we do this at the moment. It would be great if you could
> contribute this ;)
>
> Where exactly are you seeing the problem? Is the zookeeper data
> getting too big, or is the problem in bookkeeper, etc?
>
> > Since you said it would possibly break some of the guarantees, I would have
> > to look more into it. If my memory is correct, Ben Reed said adding an
> > administrative hedwig function to delete a topic should not be too
> > complicated. If it is indeed complicated to achieve the functionality
> > without breaking the guarantees, I will have to wait or build something
> > around it. I need to know a little bit more about the hedwig hub redistribution
> > and how it works, if it is configurable, etc. Where should I start (i.e.,
> > which java package or classes deal with this)?
> hedwig-server/src/main/java/org/apache/hedwig/server/topic
> &
> hedwig-server/src/main/java/org/apache/hedwig/server/subscriptions
> should cover most of what you're interested in.
>
> -Ivan
>

--
Daniel S. Kim
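
To make the retention rule in Ivan's reply concrete, here is a toy in-memory sketch in Java. It is not Hedwig code: the class `TopicRetentionSketch` and all of its methods are hypothetical. It assumes that a message is kept until every current subscriber has consumed it, that a subscriber's position is remembered under its subscriber id, and that a new subscriber starts after the last published message. Under those assumptions, a listener that receives messages without marking them consumed would see them again after a restart, and a topic with no subscribers holds nothing and could be collected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names exist in Hedwig or BookKeeper.
final class TopicRetentionSketch {
    // Retained messages, keyed by sequence number.
    private final TreeMap<Long, String> messages = new TreeMap<>();
    // Highest sequence number each subscriber has marked as consumed.
    private final Map<String, Long> consumedUpTo = new HashMap<>();
    private long nextSeq = 1;

    long publish(String body) {
        long seq = nextSeq++;
        messages.put(seq, body);
        trim(); // with no subscribers, nothing needs to be retained
        return seq;
    }

    // Assumption: a new subscriber starts after the last published message;
    // an already known subscriber id keeps its old position.
    void subscribe(String subscriberId) {
        consumedUpTo.putIfAbsent(subscriberId, nextSeq - 1);
    }

    void unsubscribe(String subscriberId) {
        consumedUpTo.remove(subscriberId);
        trim();
    }

    // Mark everything up to and including seq as consumed by this subscriber.
    void consume(String subscriberId, long seq) {
        long current = position(subscriberId);
        long capped = Math.min(seq, nextSeq - 1);
        if (capped > current) {
            consumedUpTo.put(subscriberId, capped);
        }
        trim();
    }

    // Messages this subscriber would still be delivered.
    List<String> pending(String subscriberId) {
        return new ArrayList<>(messages.tailMap(position(subscriberId), false).values());
    }

    int retainedCount() {
        return messages.size();
    }

    // A topic with no subscribers is a candidate for garbage collection.
    boolean isCollectable() {
        return consumedUpTo.isEmpty();
    }

    private long position(String subscriberId) {
        Long current = consumedUpTo.get(subscriberId);
        if (current == null) {
            throw new IllegalArgumentException("not subscribed: " + subscriberId);
        }
        return current;
    }

    // Drop every message that all current subscribers have consumed.
    private void trim() {
        if (consumedUpTo.isEmpty()) {
            messages.clear();
            return;
        }
        long min = Collections.min(consumedUpTo.values());
        messages.headMap(min, true).clear();
    }

    public static void main(String[] args) {
        TopicRetentionSketch topic = new TopicRetentionSketch();
        topic.subscribe("listener");
        for (int i = 1; i <= 5; i++) {
            topic.publish("message " + i);
        }
        // A restarted listener that never called consume() sees everything again.
        System.out.println("pending before consume: " + topic.pending("listener").size());
        topic.consume("listener", 5);
        System.out.println("retained after consume: " + topic.retainedCount());
        topic.unsubscribe("listener");
        System.out.println("collectable: " + topic.isCollectable());
    }
}
```

In this sketch the cleanup runs eagerly on every call. A real eviction policy would more likely run periodically and would also have to remove the topic's metadata, which the sketch does not model.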
