Re: Question regarding Synchronization in InterleavedLedgerStorage
Charan, Thank you for sharing. I just took a quick glance. I have a few high-level questions: - Can you benchmark this change when you have time? It is interesting to see: (1) current logic vs new logic with 1 slot (2) current logic vs new logic with N slot repeat the above tests for - both interleaved ledger storage and sorted ledger storage. - increasing number of ledgers from 1, 10, 1000, 1 to 10. - Your approach might be helpful for using interleaved ledger storage directly, but it doesn't really help sorted ledger storage or any ledger storage having a sorted buffer over interleaved ledger storage. because at any given time, there is only one flush thread writing to entrylogger, especially if you have small number of ledgers, when flushing the memtable, it is effectively only one entry log file is written at that given time. - If this is done using a composite ledger store approach, it will benefit the users of sorted ledger storage because you will have multiple flush threads that write to different entry log files concurrently. Because you already have most of parts addressed in the multi-entrylogger-files approach, I would like to see how's the performance in this approach. Other comments inline: On Wed, Jul 19, 2017 at 7:31 AM, Charan Reddy Gwrote: > *"I am wondering if your multiple-entrylogs approach is making things > complicated. I have been thinking there can be a simpler approach achieving > the same goal: for example, having a ledger storage comprised of N > interleaved/sorted ledger storages, which they share same LedgerCache, but > having different memtables (for sortedledgerstore) and different entry log > files. "* > > > First of all I'm not convinced with the statement that dealing > MultipleEntryLogs at EntryLogger is more complicated. As far as I > understood, dealing with composition (containing N) at LedgerStorage level > is even more complicated than dealing at EntryLogger > > - The crux of the complication in the approach described in > https://issues.apache.org/jira/browse/BOOKKEEPER-1041 is maintenance of > state information (mapping of ledgerid to slotid) and in the event of > ledgerdir becoming full updating that mapping to available slot. The same > amount of complication will be applicable even in the case of composition > at LedgerStorage level. When we get addEntry at ComposedLedgerStorage there > should be logic to which LedgerStorage it should goto. So with this fact I > can say that composing at LedgerStorage level it will be atleast as > complicated as composing at EntryLogger level. > > - with the ComposedLedgerStorage approach (Composite Pattern), > ComposedLedgerStorage managing N > InterleavedLedgerStorage/SortedLedgerStorages, > there is going to be considerable change in the way the resources and state > information are handled. Each Interleaved/SortedLedgerStorage will have > its own EntryLogger (to serve our MultipleEntryLogs feature), but rest all > needs to be shared, mainly 'LedgerCache', 'gcThread' and 'activeLedgers'. > So there is going to be major churn in the code for moving the operations > dealing with shared resources and state to ComposedLedgerStorage and leave > the rest in InterleavedLedgerStorage. Amount of changes required in > testcode to deal with these changes is even more. We have to do this while > providing backward compatibility (Single EntryLog). Now this should make > one question where should the composition happen. > A very dump implementation to start - you might be just implementing a ledger storage with N ledger storages, each ledger storage is a subdirectory in the ledger directories. So at least you can compare this approach with your multi-entry-loggers approach to see which one can gain better performance. Because I remember the goal for multi-entry-loggers is for latency and performance. We need to compare to show the evidence which approach is better. > As far as I can say, it is supposed to happen at the lowest possible level > rather than at higher level, which needs extra band-aid efforts to deal > separately for the common resources/state and multiplexable resources. > I think this is arguable. if you stripe the entries across multiple entry log files, doing this at entry log file level will give you higher bandwidth at writes. but your data locality will be poor. if you don't stripe the entries (that means entries for a ledger going to one entry log files), doing this at entry log file level doesn't help ledger storages that have high level buffer and caches. > > - with composition at EntryLogger level, currently in my implementation > getEntry/readEntry path is mostly unchanged. but with composition at > LedgerStorage even for readEntry/getEntry the multiplexing/redirection > needs to happen. > > - and mainly LedgerStorage is not the only consumer of EntryLogger, but > also GarbageCollectorThread. It calls quite a few methods of EntryLogger - >
Re: Question regarding Synchronization in InterleavedLedgerStorage
*"I am wondering if your multiple-entrylogs approach is making things complicated. I have been thinking there can be a simpler approach achieving the same goal: for example, having a ledger storage comprised of N interleaved/sorted ledger storages, which they share same LedgerCache, but having different memtables (for sortedledgerstore) and different entry log files. "* First of all I'm not convinced with the statement that dealing MultipleEntryLogs at EntryLogger is more complicated. As far as I understood, dealing with composition (containing N) at LedgerStorage level is even more complicated than dealing at EntryLogger - The crux of the complication in the approach described in https://issues.apache.org/jira/browse/BOOKKEEPER-1041 is maintenance of state information (mapping of ledgerid to slotid) and in the event of ledgerdir becoming full updating that mapping to available slot. The same amount of complication will be applicable even in the case of composition at LedgerStorage level. When we get addEntry at ComposedLedgerStorage there should be logic to which LedgerStorage it should goto. So with this fact I can say that composing at LedgerStorage level it will be atleast as complicated as composing at EntryLogger level. - with the ComposedLedgerStorage approach (Composite Pattern), ComposedLedgerStorage managing N InterleavedLedgerStorage/SortedLedgerStorages, there is going to be considerable change in the way the resources and state information are handled. Each Interleaved/SortedLedgerStorage will have its own EntryLogger (to serve our MultipleEntryLogs feature), but rest all needs to be shared, mainly 'LedgerCache', 'gcThread' and 'activeLedgers'. So there is going to be major churn in the code for moving the operations dealing with shared resources and state to ComposedLedgerStorage and leave the rest in InterleavedLedgerStorage. Amount of changes required in testcode to deal with these changes is even more. We have to do this while providing backward compatibility (Single EntryLog). Now this should make one question where should the composition happen. As far as I can say, it is supposed to happen at the lowest possible level rather than at higher level, which needs extra band-aid efforts to deal separately for the common resources/state and multiplexable resources. - with composition at EntryLogger level, currently in my implementation getEntry/readEntry path is mostly unchanged. but with composition at LedgerStorage even for readEntry/getEntry the multiplexing/redirection needs to happen. - and mainly LedgerStorage is not the only consumer of EntryLogger, but also GarbageCollectorThread. It calls quite a few methods of EntryLogger - (entryLogger.addEntry, entryLogger.flush, entryLogger.removeEntryLog, entryLogger.scanEntryLog, entryLogger.getLeastUnflushedLogId, entryLogger.logExists, entryLogger.getEntryLogMetadata). It is going to be an issue with ComposedLedgerStorage, because with that EntryLogger is going to be responsible for just one EntryLog. For GarbageCollectorThread to work correctly with multiple EntryLogs, composition is required here as well. Which is double whammy when it comes down to implementation. - I remember Sijie mentioning that there is WI to improve GarbageCollectorThread compaction logic, to segregate the entries of ledgers based on lifespan while compacting entrylogs. With the correct implementation of MultipleEntryLogs, the logic/implementation of it shouldn't be affected, but with ComposedLedgerStorage it will have the same issues as I mentioned in the above point. Being said that, one difference what ComposedLedgerStorage could make is each SortedLedgerStorage will have its own EntryMemtable. With the current implementation of MultipleEntryLogs, it is going to be just one MemTable. Now the question is how is going to be beneficial to have multiple EntryMemTables? Andrey mentioned that current EntrymemTable's snapshot flush processes entries sequentially, but this can be parallelized by having ThreadPoolExecutor for processing the entries and also Sijie confirmed that synchrnoization is not needed for InterleavedLedgerStorage.processEntry and InterleavedLedgerStorage.addEntry. Size of the EntryMemTable is already configurable. And mainly with MultipleEntryLogs there is not much need of SortedLedgerStorage, because entries of different ledgers are inherently segregated and there wont be much Interleaving even if we use InterleavedLedgerStorage. Also, InterleavedLedgerStorage helps in overcoming unpredictable tail latency caused by SortedLedgerStorage' EntryMemTable flush. @sijie Code is almost ready for pull request, I just need to do some final touches, in the meantime you may check the code at - https://github.com/reddycharan/bookkeeper/tree/multipleentrylogs . It is rebased to current code. It should be good. Thanks, Charan On Mon, Jul 17, 2017 at 2:25 PM, Venkateswara Rao Jujjuriwrote: > > > On Fri, Jul 14, 2017 at 6:00 PM,
Re: Question regarding Synchronization in InterleavedLedgerStorage
On Fri, Jul 14, 2017 at 6:00 PM, Sijie Guowrote: > > > On Sat, Jul 15, 2017 at 8:06 AM, Charan Reddy G > wrote: > >> Hey, >> >> In InterleavedLedgerStorage, since the initial version of it ( >> https://github.com/apache/bookkeeper/commit/4a94ce1d8184f5f >> 38def015d80777a8113b96690 and https://github.com/apache/book >> keeper/commit/d175ada58dcaf78f0a70b0ebebf489255ae67b5f), addEntry and >> processEntry methods are synchronized. If it is synchronized then I dont >> get what is the point in having 'writeThreadPool' in >> BookieRequestProcessor, if anyhow they are going to be executed >> sequentially because of synchronized addEntry method in >> InterleavedLedgerStorage. >> > > When InterleavedLedgerStore is used in the context of SortedLedgerStore, > the addEntry and processEntry are only called when flushing happened. The > flushing happens in background thread, which is effectively running > sequentially. But adding to the memtable happens concurrently. > > The reason of having 'writeThreadPool' is more on separating writes and > reads into different thread pools. so writes will not be affected by reads. > In the context of SortedLedgerStore, the 'writeThreadPool' adds the > concurrency. > > >> >> If we look at the implementation of addEntry and processEntry method, >> 'somethingWritten' can be made threadsafe by using AtomicBoolean, >> ledgerCache.updateLastAddConfirmed and entryLogger.addEntry methods are >> inherently threadsafe. >> >> I'm not sure about semantics of ledgerCache.putEntryOffset method here. >> I'm not confident enough to say if LedgerCacheImpl and IndexInMemPageMgr >> (and probably IndexPersistenceMgr) are thread-safe classes. >> > > LedgerCacheImpl and IndexInMemPageMgr are thread-safe classes. You can > confirm this from SortedLedgerStorage. > > >> >> As far as I understood, if ledgerCache.putEntryOffset is thread safe, >> then I dont see the need of synchronization for those methods. In any case, >> if they are not thread-safe can you please say why it is not thread-safe >> and how we can do more granular synchronization at LedgerCacheImpl level, >> so that we can remove the need of synchrnoization at >> InterleavedLedgerStorage level. >> > > I don't see any reason why we can't remove the synchronization. > > >> >> I'm currently working on Multiple Entrylogs - >> https://issues.apache.org/jira/browse/BOOKKEEPER-1041. >> > > I am wondering if your multiple-entrylogs approach is making things > complicated. I have been thinking there can be a simpler approach achieving > the same goal: for example, having a ledger storage comprised of N > interleaved/sorted ledger storages, which they share same LedgerCache, but > having different memtables (for sortedledgerstore) and different entry log > files. > This is more cleaner approach. @charan can you comment? JV > > >> To reap the benefits of multipleentrylogs feature from performance >> perspective, this synchrnoization should be taken care or atleast bring it >> down to more granular synchronization (if possible). >> >> @Override >> synchronized public long addEntry(ByteBuffer entry) throws >> IOException { >> long ledgerId = entry.getLong(); >> long entryId = entry.getLong(); >> long lac = entry.getLong(); >> entry.rewind(); >> processEntry(ledgerId, entryId, entry); >> ledgerCache.updateLastAddConfirmed(ledgerId, lac); >> return entryId; >> } >> >> synchronized protected void processEntry(long ledgerId, long entryId, >> ByteBuffer entry, boolean rollLog) >> throws IOException { >> somethingWritten = true; >> long pos = entryLogger.addEntry(ledgerId, entry, rollLog); >> ledgerCache.putEntryOffset(ledgerId, entryId, pos); >> } >> >> Thanks, >> Charan >> > > -- Jvrao --- First they ignore you, then they laugh at you, then they fight you, then you win. - Mahatma Gandhi
Re: Question regarding Synchronization in InterleavedLedgerStorage
On Sat, Jul 15, 2017 at 8:06 AM, Charan Reddy Gwrote: > Hey, > > In InterleavedLedgerStorage, since the initial version of it ( > https://github.com/apache/bookkeeper/commit/4a94ce1d8184f5f38def015d80777a > 8113b96690 and https://github.com/apache/bookkeeper/commit/ > d175ada58dcaf78f0a70b0ebebf489255ae67b5f), addEntry and processEntry > methods are synchronized. If it is synchronized then I dont get what is the > point in having 'writeThreadPool' in BookieRequestProcessor, if anyhow they > are going to be executed sequentially because of synchronized addEntry > method in InterleavedLedgerStorage. > When InterleavedLedgerStore is used in the context of SortedLedgerStore, the addEntry and processEntry are only called when flushing happened. The flushing happens in background thread, which is effectively running sequentially. But adding to the memtable happens concurrently. The reason of having 'writeThreadPool' is more on separating writes and reads into different thread pools. so writes will not be affected by reads. In the context of SortedLedgerStore, the 'writeThreadPool' adds the concurrency. > > If we look at the implementation of addEntry and processEntry method, > 'somethingWritten' can be made threadsafe by using AtomicBoolean, > ledgerCache.updateLastAddConfirmed and entryLogger.addEntry methods are > inherently threadsafe. > > I'm not sure about semantics of ledgerCache.putEntryOffset method here. > I'm not confident enough to say if LedgerCacheImpl and IndexInMemPageMgr > (and probably IndexPersistenceMgr) are thread-safe classes. > LedgerCacheImpl and IndexInMemPageMgr are thread-safe classes. You can confirm this from SortedLedgerStorage. > > As far as I understood, if ledgerCache.putEntryOffset is thread safe, then > I dont see the need of synchronization for those methods. In any case, if > they are not thread-safe can you please say why it is not thread-safe and > how we can do more granular synchronization at LedgerCacheImpl level, so > that we can remove the need of synchrnoization at InterleavedLedgerStorage > level. > I don't see any reason why we can't remove the synchronization. > > I'm currently working on Multiple Entrylogs - https://issues.apache.org/ > jira/browse/BOOKKEEPER-1041. > I am wondering if your multiple-entrylogs approach is making things complicated. I have been thinking there can be a simpler approach achieving the same goal: for example, having a ledger storage comprised of N interleaved/sorted ledger storages, which they share same LedgerCache, but having different memtables (for sortedledgerstore) and different entry log files. > To reap the benefits of multipleentrylogs feature from performance > perspective, this synchrnoization should be taken care or atleast bring it > down to more granular synchronization (if possible). > > @Override > synchronized public long addEntry(ByteBuffer entry) throws IOException > { > long ledgerId = entry.getLong(); > long entryId = entry.getLong(); > long lac = entry.getLong(); > entry.rewind(); > processEntry(ledgerId, entryId, entry); > ledgerCache.updateLastAddConfirmed(ledgerId, lac); > return entryId; > } > > synchronized protected void processEntry(long ledgerId, long entryId, > ByteBuffer entry, boolean rollLog) > throws IOException { > somethingWritten = true; > long pos = entryLogger.addEntry(ledgerId, entry, rollLog); > ledgerCache.putEntryOffset(ledgerId, entryId, pos); > } > > Thanks, > Charan >
Question regarding Synchronization in InterleavedLedgerStorage
Hey, In InterleavedLedgerStorage, since the initial version of it ( https://github.com/apache/bookkeeper/commit/4a94ce1d8184f5f38def015d80777a8113b96690 and https://github.com/apache/bookkeeper/commit/d175ada58dcaf78f0a70b0ebebf489255ae67b5f), addEntry and processEntry methods are synchronized. If it is synchronized then I dont get what is the point in having 'writeThreadPool' in BookieRequestProcessor, if anyhow they are going to be executed sequentially because of synchronized addEntry method in InterleavedLedgerStorage. If we look at the implementation of addEntry and processEntry method, 'somethingWritten' can be made threadsafe by using AtomicBoolean, ledgerCache.updateLastAddConfirmed and entryLogger.addEntry methods are inherently threadsafe. I'm not sure about semantics of ledgerCache.putEntryOffset method here. I'm not confident enough to say if LedgerCacheImpl and IndexInMemPageMgr (and probably IndexPersistenceMgr) are thread-safe classes. As far as I understood, if ledgerCache.putEntryOffset is thread safe, then I dont see the need of synchronization for those methods. In any case, if they are not thread-safe can you please say why it is not thread-safe and how we can do more granular synchronization at LedgerCacheImpl level, so that we can remove the need of synchrnoization at InterleavedLedgerStorage level. I'm currently working on Multiple Entrylogs - https://issues.apache.org/jira/browse/BOOKKEEPER-1041. To reap the benefits of multipleentrylogs feature from performance perspective, this synchrnoization should be taken care or atleast bring it down to more granular synchronization (if possible). @Override synchronized public long addEntry(ByteBuffer entry) throws IOException { long ledgerId = entry.getLong(); long entryId = entry.getLong(); long lac = entry.getLong(); entry.rewind(); processEntry(ledgerId, entryId, entry); ledgerCache.updateLastAddConfirmed(ledgerId, lac); return entryId; } synchronized protected void processEntry(long ledgerId, long entryId, ByteBuffer entry, boolean rollLog) throws IOException { somethingWritten = true; long pos = entryLogger.addEntry(ledgerId, entry, rollLog); ledgerCache.putEntryOffset(ledgerId, entryId, pos); } Thanks, Charan