Hi Ben,

To follow up on this question, which seems to be primarily about Hedwig (and I guess the answer is: it's not in production yet, anywhere), I have one more about BookKeeper: is BookKeeper used in production anywhere as a WAL (or for any other purpose)? If so, for what uses?
Any info (even anecdotal) would be great!

  -jake

On Thu, Oct 7, 2010 at 9:15 AM, Benjamin Reed <br...@yahoo-inc.com> wrote:

> hi amit,
>
> sorry for the late response. this week has been crunch time for a lot of
> different things.
>
> here are your answers:
>
> production
>
> 1. it is still in prototype phase. we are evaluating different aspects, but
> there is still some work to do to make it production ready. we also need to
> get an engineering team to sign up to stand behind it.
>
> 2. it's a generic pub/sub message bus. in some sense it is really a
> datacenter solution with extensions for multi-data center operation, so it
> is perfectly reasonable to use it in a single datacenter setting.
>
> 3. yeah, we have removed the hw.bash script. it had some hardcoded
> assumptions and was a swiss army knife on steroids. we have been breaking it
> up into simpler scripts.
>
> 4. session expiry really represents a fundamental connectivity problem, so
> both bk and hedwig restart the component that gets the expired session
> error.
>
> data
>
> 1. yes.
>
> 2. once all subscribers have consumed a message there is a background
> process that cleans it up.
>
> 3. yes, there is a replication factor and we ensure replication on writes,
> and there is a recovery tool to recover bookies that fail. we don't have to
> worry about conflicts because there is only a single writer for a given
> ledger. because of this we do not need to do quorum reads.
>
> documentation
>
> yes, this is something we need to work on. i'll see if i can push out some
> of our hello world applications. we'd also like to put a JMS API on top so
> that the API is more familiar (and documented :). i don't want to delay the
> answers to your other questions, so let me answer that HedwigSubscriber is
> the class for clients. the other classes are internal. (for cross data
> center operation, hubs use a special kind of subscription to do cross data
> center updates.)
>
> ben
>
> On 10/05/2010 10:32 PM, amit jaiswal wrote:
>
>> Hi,
>>
>> In the Hedwig talk (http://vimeo.com/13282102), it was mentioned that the
>> primary use case for Hedwig comes from the distributed key-value store
>> PNUTS in Yahoo!, but also said that the work is new.
>>
>> Could you please answer the following:
>>
>> Production readiness / Deployment
>> 1. What is the production readiness of Hedwig / BookKeeper? Is it being
>> used anywhere (like in PNUTS)?
>> 2. Is Hedwig designed to be used as a generic message bus or only for
>> multi-datacenter operations?
>> 3. Hedwig installation and deployment is done through a script, hw.bash,
>> but that is difficult to use, especially in a production environment. Are
>> there any other packages available that can simplify the deployment of
>> hedwig?
>> 4. How does BK/Hedwig handle zookeeper session expiry?
>>
>> Data Deletion, Handling data loss, Quorum
>> 1. Does BookKeeper support deletion of old log entries which have been
>> consumed?
>> 2. How does Hedwig handle the case when all subscribers have consumed all
>> the messages? In the talk, it was said that a subscriber can come back
>> after hours, days or weeks. Is there any data retention / expiration
>> policy for the data that is published?
>> 3. How does Hedwig handle data loss? There is a replication factor, and a
>> write operation must be accepted by a majority of the bookies, but how are
>> data conflicts handled? Is there any possibility of data conflict at all?
>> Is the replication only for recovery? When the hub is reading data from
>> bookies, does it read from all the bookies to satisfy a quorum read?
>>
>> Code
>> What is the difference between PubSubServer, HedwigSubscriber, and
>> HedwigHubSubscriber? Is there any HelloWorld program that simply
>> illustrates how to instantiate a hedwig client and publish/consume
>> messages? (The HedwigBenchmark class is helpful, but I was looking for
>> something like API documentation.)
>>
>> -regards
>> Amit