Thanks Sijie

On Mon, Dec 5, 2016 at 22:40, Sijie Guo <[email protected]> wrote:
On Mon, Dec 5, 2016 at 1:06 PM, Enrico Olivelli <[email protected]> wrote:

Thank you Sijie for your explanations. My "new" use case is more similar to JV's usage of BookKeeper at SF: I'm trying to store data on BookKeeper because it can provide low latency for both reads and writes. I think I will need the improvements from JV to force the advance of the LAC. I will dig into the code to understand clearly how the LAC is written into entries and how many entries I should write to ensure that readers can see the advance of the LAC. If you have time to give me some pointers to the code I would appreciate it, but in any case I think I will be able to find it by myself.

This is the code that packs the LAC into the entry to write:
https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/DigestManager.java#L83
This is the code that reads the last add confirmed:
https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java#L62

Thank you. I took a deeper look into the code, ran some tests, and understood how the LAC works. I will wait for JV's patch about forced LAC as well. Another question: is there any real limit on the maximum size of an entry?

There is no real limit, to be honest. However, because BookKeeper currently doesn't provide a 'streaming' interface, large entries are not good for journal flushes and might cause GC pressure.

I would like to store entries up to 100 MB and I would prefer not to split the data into chunks (I'm actually going to store BLOBs).

Is it possible to store the BLOBs as separate ledgers? That might be easier to manage.

My system in production will spawn something like 100,000,000 blobs per day with an average size of 100 KB. If I use one ledger per blob I will surely need those real 63-bit ledger ids that we discussed in BP1. The creation of a new ledger is a rather heavy process as it requires coordination; an entry is more lightweight. Maybe I will need to split larger blobs into chunks; I will do some benchmarks for this case too. These blobs will usually be written once, read within a few milliseconds from another machine, and then kept for 2 or 3 days. We are going to coordinate the deletion of blobs by dropping ledgers which no longer contain any useful entries. As soon as I have a working proof of concept I will share my code on GitHub; I think it will be another nice example of BookKeeper usage.

Enrico

On 05/12/2016 21:18, Sijie Guo wrote:

On Fri, Dec 2, 2016 at 8:19 AM, Enrico Olivelli <[email protected]> wrote:

Hi, I'm doing some benchmarks in order to use BookKeeper as low-latency data storage, but I'm missing some piece of the LAC protocol.

1) From client1 I create a ledger and then perform an addEntry and wait for the ACK. My LedgerHandle is still open.
2) Client1 obtains the entryId and passes it to client2.
3) Client2 opens the ledger with 'norecovery' and tries to read the entry.
4) The read fails, and on client2 the LAC (readLastConfirmed) is still -1.

I know that the close or the openWithRecovery operations will clean up the ledger metadata and the LAC will be the number I expect. I see that the LAC sometimes "advances" even if the ledger is not closed or recovered, but I cannot find any way to force this advance.

The LAC is packed into entries and written with them, so when you write the next entries, it will advance the LAC.
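To make that concrete, here is a minimal sketch of the client1/client2 scenario against the synchronous LedgerHandle API. It assumes a reachable ZooKeeper at localhost:2181 and a single local bookie (ensemble and quorum of 1); the class name and payloads are made up. It illustrates that the reader's readLastConfirmed() stays at -1 after the first add and only moves to 0 once a second entry, carrying the piggybacked LAC, has been written.

import java.util.Enumeration;

import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.BookKeeper.DigestType;
import org.apache.bookkeeper.client.LedgerEntry;
import org.apache.bookkeeper.client.LedgerHandle;

public class LacAdvanceSketch {
    public static void main(String[] args) throws Exception {
        byte[] password = "secret".getBytes("UTF-8");

        // Two independent clients, as in the client1/client2 scenario above.
        // "localhost:2181" is a placeholder ZooKeeper connect string.
        BookKeeper writerClient = new BookKeeper("localhost:2181");
        BookKeeper readerClient = new BookKeeper("localhost:2181");

        // client1: create the ledger and keep it open (ensemble/quorum of 1
        // only so the sketch runs against a single local bookie).
        LedgerHandle writer = writerClient.createLedger(1, 1, DigestType.CRC32, password);
        writer.addEntry("entry-0".getBytes("UTF-8"));

        // client2: open the same ledger without recovery.
        LedgerHandle reader = readerClient.openLedgerNoRecovery(
                writer.getId(), DigestType.CRC32, password);

        // Entry 0 carried LAC = -1, so the reader still cannot see anything.
        System.out.println("LAC after first add: " + reader.readLastConfirmed());

        // The LAC is piggybacked on the *next* add: writing entry 1 tells the
        // bookies that entry 0 is confirmed, and the reader can now see it.
        writer.addEntry("entry-1".getBytes("UTF-8"));
        long lac = reader.readLastConfirmed();
        System.out.println("LAC after second add: " + lac); // expected: 0

        Enumeration<LedgerEntry> entries = reader.readEntries(0, lac);
        while (entries.hasMoreElements()) {
            LedgerEntry e = entries.nextElement();
            System.out.println("read entry " + e.getEntryId()
                    + ": " + new String(e.getEntry(), "UTF-8"));
        }

        reader.close();
        writer.close();
        readerClient.close();
        writerClient.close();
    }
}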
In DistributedLog, we write a *control record* to advance the LAC based on the flush policy (immediate, time-based, size-based).

I'm using these bookie-side options:

conf.setFlushInterval(1000);
conf.setJournalFlushWhenQueueEmpty(true);
conf.setGcWaitTime(10);

My need is low-latency "storage", so readers must be able to access stored data as soon as the writer receives the ack for the write. I think that the piggyback of the LAC in BK 4.5 will help my case.

Yes, that would help here.

My questions:

1) How is the LAC propagated from writer -> bookies -> readers?

The LAC is packed and written along with entries. You can call #readLastAddConfirmed to get the new last add confirmed in the readers. Once you have the new last add confirmed, you can read the new entries. With the long poll changes, you will be able to call readLastConfirmedAndEntry within one RPC call. But since your use case is more of a 'streaming' use case, you can use the distributedlog-core library to achieve this. As it already handles the complexity of reading the LAC and the entries in a streaming way, it would simplify your use case.

2) Is there any way to force the flush of the LAC?

Right now, you can write entries to force the LAC to be flushed. There is a change from JV that adds explicit write/read LAC RPCs. With that change, the client can configure a flush policy to explicitly flush the LAC.

Enrico

--
Enrico Olivelli
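For reference, a minimal tailing-reader sketch built on the pre-4.5 synchronous API discussed above: it polls LedgerHandle#readLastConfirmed() (the method that fetches the last add confirmed) and reads whatever new entries have become visible, which is exactly the round trip that the long-poll readLastConfirmedAndEntry call and the distributedlog-core reader are meant to avoid. The ledgerId, password and maxEntries parameters are placeholders for whatever the writer side hands over.

import java.util.Enumeration;

import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.BookKeeper.DigestType;
import org.apache.bookkeeper.client.LedgerEntry;
import org.apache.bookkeeper.client.LedgerHandle;

public class TailingReaderSketch {

    // Polls the LAC and prints every newly visible entry.
    // Stops once maxEntries entries have been read, just to keep the sketch finite.
    public static void tail(BookKeeper bk, long ledgerId, byte[] password, long maxEntries)
            throws Exception {
        LedgerHandle reader = bk.openLedgerNoRecovery(ledgerId, DigestType.CRC32, password);
        long nextEntry = 0;
        try {
            while (nextEntry < maxEntries) {
                // One RPC to the bookies: returns the highest LAC they have stored,
                // i.e. the value piggybacked on the entries written so far.
                long lac = reader.readLastConfirmed();
                if (lac < nextEntry) {
                    Thread.sleep(10); // plain polling; long poll / explicit LAC RPCs remove this wait
                    continue;
                }
                Enumeration<LedgerEntry> entries = reader.readEntries(nextEntry, lac);
                while (entries.hasMoreElements()) {
                    LedgerEntry e = entries.nextElement();
                    System.out.println("entry " + e.getEntryId() + ": "
                            + new String(e.getEntry(), "UTF-8"));
                }
                nextEntry = lac + 1;
            }
        } finally {
            reader.close();
        }
    }
}

Until the explicit LAC RPCs from JV's change are available, the only way for the writer to make its last entry visible sooner is the one Sijie describes: write one more entry (for example an application-level marker record), which carries the updated LAC to the bookies.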
