Re: Question regarding Synchronization in InterleavedLedgerStorage

2017-07-18 Thread Sijie Guo
Charan,

Thank you for sharing. I just took a quick glance. I have a few high-level
questions:

- Can you benchmark this change when you have time? It is interesting to
see:

(1) current logic vs new logic with 1 slot
(2) current logic vs new logic with N slot

repeat the above tests for
- both interleaved ledger storage and sorted ledger storage.
- increasing number of ledgers from 1, 10, 1000, 1 to 10.

- Your approach might be helpful for using interleaved ledger storage
directly, but it doesn't really help sorted ledger storage or any ledger
storage having a sorted buffer over interleaved ledger storage. because at
any given time, there is only one flush thread writing to entrylogger,
especially if you have small number of ledgers, when flushing the memtable,
it is effectively only one entry log file is written at that given time.

- If this is done using a composite ledger store approach, it will benefit
the users of sorted ledger storage because you will have multiple flush
threads that write to different entry log files concurrently.

Because you already have most of parts addressed in the
multi-entrylogger-files approach, I would like to see how's the performance
in this approach.


Other comments inline:

On Wed, Jul 19, 2017 at 7:31 AM, Charan Reddy G 
wrote:

> *"I am wondering if your multiple-entrylogs approach is making things
> complicated. I have been thinking there can be a simpler approach achieving
> the same goal: for example, having a ledger storage comprised of N
> interleaved/sorted ledger storages, which they share same LedgerCache, but
> having different memtables (for sortedledgerstore) and different entry log
> files. "*
>
>
> First of all I'm not convinced with the statement that dealing
> MultipleEntryLogs at EntryLogger is more complicated. As far as I
> understood, dealing with composition (containing N) at LedgerStorage level
> is even more complicated than dealing at EntryLogger
>
> - The crux of the complication in the approach described in
> https://issues.apache.org/jira/browse/BOOKKEEPER-1041 is maintenance of
> state information (mapping of ledgerid to slotid) and in the event of
> ledgerdir becoming full updating that mapping to available slot. The same
> amount of complication will be applicable even in the case of composition
> at LedgerStorage level. When we get addEntry at ComposedLedgerStorage there
> should be logic to which LedgerStorage it should goto. So with this fact I
> can say that composing at LedgerStorage level it will be atleast as
> complicated as composing at EntryLogger level.
>
> - with the ComposedLedgerStorage approach (Composite Pattern),
> ComposedLedgerStorage managing N 
> InterleavedLedgerStorage/SortedLedgerStorages,
> there is going to be considerable change in the way the resources and state
> information are handled. Each Interleaved/SortedLedgerStorage will have
> its own EntryLogger (to serve our MultipleEntryLogs feature), but rest all
> needs to be shared, mainly 'LedgerCache', 'gcThread' and 'activeLedgers'.
>
So there is going to be major churn in the code for moving the operations
> dealing with shared resources and state to ComposedLedgerStorage and leave
> the rest in InterleavedLedgerStorage. Amount of changes required in
> testcode to deal with these changes is even more. We have to do this while
> providing backward compatibility (Single EntryLog). Now this should make
> one question where should the composition happen.
>


A very dump implementation to start - you might be just implementing a
ledger storage with N ledger storages, each ledger storage is a
subdirectory in the ledger directories. So at least you can compare this
approach with your multi-entry-loggers approach to see which one can gain
better performance. Because I remember the goal for multi-entry-loggers is
for latency and performance. We need to compare to show the evidence which
approach is better.




> As far as I can say, it is supposed to happen at the lowest possible level
> rather than at higher level, which needs extra band-aid efforts to deal
> separately for the common resources/state and multiplexable resources.
>

I think this is arguable.

if you stripe the entries across multiple entry log files, doing this at
entry log file level will give you higher bandwidth at writes. but your
data locality will be poor.

if you don't stripe the entries (that means entries for a ledger going to
one entry log files), doing this at entry log file level doesn't help
ledger storages that have high level buffer and caches.


>
> - with composition at EntryLogger level, currently in my implementation
> getEntry/readEntry path is mostly unchanged. but with composition at
> LedgerStorage even for readEntry/getEntry the multiplexing/redirection
> needs to happen.
>
> - and mainly LedgerStorage is not the only consumer of EntryLogger, but
> also GarbageCollectorThread. It calls quite a few methods of EntryLogger -
> 

Re: Question regarding Synchronization in InterleavedLedgerStorage

2017-07-18 Thread Charan Reddy G
*"I am wondering if your multiple-entrylogs approach is making things
complicated. I have been thinking there can be a simpler approach achieving
the same goal: for example, having a ledger storage comprised of N
interleaved/sorted ledger storages, which they share same LedgerCache, but
having different memtables (for sortedledgerstore) and different entry log
files. "*


First of all I'm not convinced with the statement that dealing
MultipleEntryLogs at EntryLogger is more complicated. As far as I
understood, dealing with composition (containing N) at LedgerStorage level
is even more complicated than dealing at EntryLogger

- The crux of the complication in the approach described in
https://issues.apache.org/jira/browse/BOOKKEEPER-1041 is maintenance of
state information (mapping of ledgerid to slotid) and in the event of
ledgerdir becoming full updating that mapping to available slot. The same
amount of complication will be applicable even in the case of composition
at LedgerStorage level. When we get addEntry at ComposedLedgerStorage there
should be logic to which LedgerStorage it should goto. So with this fact I
can say that composing at LedgerStorage level it will be atleast as
complicated as composing at EntryLogger level.

- with the ComposedLedgerStorage approach (Composite Pattern),
ComposedLedgerStorage managing N
InterleavedLedgerStorage/SortedLedgerStorages, there is going to be
considerable change in the way the resources and state information are
handled. Each Interleaved/SortedLedgerStorage will have its own EntryLogger
(to serve our MultipleEntryLogs feature), but rest all needs to be shared,
mainly 'LedgerCache', 'gcThread' and 'activeLedgers'. So there is going to
be major churn in the code for moving the operations dealing with shared
resources and state to ComposedLedgerStorage and leave the rest in
InterleavedLedgerStorage. Amount of changes required in testcode to deal
with these changes is even more. We have to do this while providing
backward compatibility (Single EntryLog). Now this should make one question
where should the composition happen. As far as I can say, it is supposed to
happen at the lowest possible level rather than at higher level, which
needs extra band-aid efforts to deal separately for the common
resources/state and multiplexable resources.

- with composition at EntryLogger level, currently in my implementation
getEntry/readEntry path is mostly unchanged. but with composition at
LedgerStorage even for readEntry/getEntry the multiplexing/redirection
needs to happen.

- and mainly LedgerStorage is not the only consumer of EntryLogger, but
also GarbageCollectorThread. It calls quite a few methods of EntryLogger -
(entryLogger.addEntry, entryLogger.flush, entryLogger.removeEntryLog,
entryLogger.scanEntryLog, entryLogger.getLeastUnflushedLogId,
entryLogger.logExists, entryLogger.getEntryLogMetadata). It is going to be
an issue with ComposedLedgerStorage, because with that EntryLogger is going
to be responsible for just one EntryLog. For GarbageCollectorThread to work
correctly with multiple EntryLogs, composition is required here as well.
Which is double whammy when it comes down to implementation.

- I remember Sijie mentioning that there is WI to improve
GarbageCollectorThread compaction logic, to segregate the entries of
ledgers based on lifespan while compacting entrylogs. With the correct
implementation of MultipleEntryLogs, the logic/implementation of it
shouldn't be affected, but with ComposedLedgerStorage it will have the same
issues as I mentioned in the above point.


Being said that, one difference what ComposedLedgerStorage could make is
each SortedLedgerStorage will have its own EntryMemtable. With the current
implementation of MultipleEntryLogs, it is going to be just one MemTable.
Now the question is how is going to be beneficial to have multiple
EntryMemTables? Andrey mentioned that current EntrymemTable's snapshot
flush processes entries sequentially, but this can be parallelized by
having ThreadPoolExecutor for processing the entries and also Sijie
confirmed that synchrnoization is not needed for
InterleavedLedgerStorage.processEntry and
InterleavedLedgerStorage.addEntry. Size of the EntryMemTable is already
configurable. And mainly with MultipleEntryLogs there is not much need of
SortedLedgerStorage, because entries of different ledgers are inherently
segregated and there wont be much Interleaving even if we use
InterleavedLedgerStorage. Also, InterleavedLedgerStorage helps in
overcoming unpredictable tail latency caused by SortedLedgerStorage'
EntryMemTable flush.

@sijie
Code is almost ready for pull request, I just need to do some final
touches, in the meantime you may check the code at -
https://github.com/reddycharan/bookkeeper/tree/multipleentrylogs . It is
rebased to current code. It should be good.

Thanks,
Charan

On Mon, Jul 17, 2017 at 2:25 PM, Venkateswara Rao Jujjuri  wrote:

>
>
> On Fri, Jul 14, 2017 at 6:00 PM, 

Re: Question regarding Synchronization in InterleavedLedgerStorage

2017-07-17 Thread Venkateswara Rao Jujjuri
On Fri, Jul 14, 2017 at 6:00 PM, Sijie Guo  wrote:

>
>
> On Sat, Jul 15, 2017 at 8:06 AM, Charan Reddy G 
> wrote:
>
>> Hey,
>>
>> In InterleavedLedgerStorage, since the initial version of it (
>> https://github.com/apache/bookkeeper/commit/4a94ce1d8184f5f
>> 38def015d80777a8113b96690 and https://github.com/apache/book
>> keeper/commit/d175ada58dcaf78f0a70b0ebebf489255ae67b5f), addEntry and
>> processEntry methods are synchronized. If it is synchronized then I dont
>> get what is the point in having 'writeThreadPool' in
>> BookieRequestProcessor, if anyhow they are going to be executed
>> sequentially because of synchronized addEntry method in
>> InterleavedLedgerStorage.
>>
>
> When InterleavedLedgerStore is used in the context of SortedLedgerStore,
> the addEntry and processEntry are only called when flushing happened. The
> flushing happens in background thread, which is effectively running
> sequentially. But adding to the memtable happens concurrently.
>
> The reason of having 'writeThreadPool' is more on separating writes and
> reads into different thread pools. so writes will not be affected by reads.
> In the context of SortedLedgerStore, the 'writeThreadPool' adds the
> concurrency.
>
>
>>
>> If we look at the implementation of addEntry and processEntry method,
>> 'somethingWritten' can be made threadsafe by using AtomicBoolean,
>> ledgerCache.updateLastAddConfirmed and entryLogger.addEntry methods are
>> inherently threadsafe.
>>
>> I'm not sure about semantics of ledgerCache.putEntryOffset method here.
>> I'm not confident enough to say if LedgerCacheImpl and IndexInMemPageMgr
>> (and probably IndexPersistenceMgr) are thread-safe classes.
>>
>
> LedgerCacheImpl and IndexInMemPageMgr are thread-safe classes. You can
> confirm this from SortedLedgerStorage.
>
>
>>
>> As far as I understood, if ledgerCache.putEntryOffset is thread safe,
>> then I dont see the need of synchronization for those methods. In any case,
>> if they are not thread-safe can you please say why it is not thread-safe
>> and how we can do more granular synchronization at LedgerCacheImpl level,
>> so that we can remove the need of synchrnoization at
>> InterleavedLedgerStorage level.
>>
>
> I don't see any reason why we can't remove the synchronization.
>
>
>>
>> I'm currently working on Multiple Entrylogs -
>> https://issues.apache.org/jira/browse/BOOKKEEPER-1041.
>>
>
> I am wondering if your multiple-entrylogs approach is making things
> complicated. I have been thinking there can be a simpler approach achieving
> the same goal: for example, having a ledger storage comprised of N
> interleaved/sorted ledger storages, which they share same LedgerCache, but
> having different memtables (for sortedledgerstore) and different entry log
> files.
>

This is more cleaner approach. @charan can you comment?

JV


>
>
>> To reap the benefits of multipleentrylogs feature from performance
>> perspective, this synchrnoization should be taken care or atleast bring it
>> down to more granular synchronization (if possible).
>>
>> @Override
>> synchronized public long addEntry(ByteBuffer entry) throws
>> IOException {
>> long ledgerId = entry.getLong();
>> long entryId = entry.getLong();
>> long lac = entry.getLong();
>> entry.rewind();
>> processEntry(ledgerId, entryId, entry);
>> ledgerCache.updateLastAddConfirmed(ledgerId, lac);
>> return entryId;
>> }
>>
>> synchronized protected void processEntry(long ledgerId, long entryId,
>> ByteBuffer entry, boolean rollLog)
>> throws IOException {
>> somethingWritten = true;
>> long pos = entryLogger.addEntry(ledgerId, entry, rollLog);
>> ledgerCache.putEntryOffset(ledgerId, entryId, pos);
>> }
>>
>> Thanks,
>> Charan
>>
>
>


-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi


Re: Question regarding Synchronization in InterleavedLedgerStorage

2017-07-14 Thread Sijie Guo
On Sat, Jul 15, 2017 at 8:06 AM, Charan Reddy G 
wrote:

> Hey,
>
> In InterleavedLedgerStorage, since the initial version of it (
> https://github.com/apache/bookkeeper/commit/4a94ce1d8184f5f38def015d80777a
> 8113b96690 and https://github.com/apache/bookkeeper/commit/
> d175ada58dcaf78f0a70b0ebebf489255ae67b5f), addEntry and processEntry
> methods are synchronized. If it is synchronized then I dont get what is the
> point in having 'writeThreadPool' in BookieRequestProcessor, if anyhow they
> are going to be executed sequentially because of synchronized addEntry
> method in InterleavedLedgerStorage.
>

When InterleavedLedgerStore is used in the context of SortedLedgerStore,
the addEntry and processEntry are only called when flushing happened. The
flushing happens in background thread, which is effectively running
sequentially. But adding to the memtable happens concurrently.

The reason of having 'writeThreadPool' is more on separating writes and
reads into different thread pools. so writes will not be affected by reads.
In the context of SortedLedgerStore, the 'writeThreadPool' adds the
concurrency.


>
> If we look at the implementation of addEntry and processEntry method,
> 'somethingWritten' can be made threadsafe by using AtomicBoolean,
> ledgerCache.updateLastAddConfirmed and entryLogger.addEntry methods are
> inherently threadsafe.
>
> I'm not sure about semantics of ledgerCache.putEntryOffset method here.
> I'm not confident enough to say if LedgerCacheImpl and IndexInMemPageMgr
> (and probably IndexPersistenceMgr) are thread-safe classes.
>

LedgerCacheImpl and IndexInMemPageMgr are thread-safe classes. You can
confirm this from SortedLedgerStorage.


>
> As far as I understood, if ledgerCache.putEntryOffset is thread safe, then
> I dont see the need of synchronization for those methods. In any case, if
> they are not thread-safe can you please say why it is not thread-safe and
> how we can do more granular synchronization at LedgerCacheImpl level, so
> that we can remove the need of synchrnoization at InterleavedLedgerStorage
> level.
>

I don't see any reason why we can't remove the synchronization.


>
> I'm currently working on Multiple Entrylogs - https://issues.apache.org/
> jira/browse/BOOKKEEPER-1041.
>

I am wondering if your multiple-entrylogs approach is making things
complicated. I have been thinking there can be a simpler approach achieving
the same goal: for example, having a ledger storage comprised of N
interleaved/sorted ledger storages, which they share same LedgerCache, but
having different memtables (for sortedledgerstore) and different entry log
files.


> To reap the benefits of multipleentrylogs feature from performance
> perspective, this synchrnoization should be taken care or atleast bring it
> down to more granular synchronization (if possible).
>
> @Override
> synchronized public long addEntry(ByteBuffer entry) throws IOException
> {
> long ledgerId = entry.getLong();
> long entryId = entry.getLong();
> long lac = entry.getLong();
> entry.rewind();
> processEntry(ledgerId, entryId, entry);
> ledgerCache.updateLastAddConfirmed(ledgerId, lac);
> return entryId;
> }
>
> synchronized protected void processEntry(long ledgerId, long entryId,
> ByteBuffer entry, boolean rollLog)
> throws IOException {
> somethingWritten = true;
> long pos = entryLogger.addEntry(ledgerId, entry, rollLog);
> ledgerCache.putEntryOffset(ledgerId, entryId, pos);
> }
>
> Thanks,
> Charan
>


Question regarding Synchronization in InterleavedLedgerStorage

2017-07-14 Thread Charan Reddy G
Hey,

In InterleavedLedgerStorage, since the initial version of it (
https://github.com/apache/bookkeeper/commit/4a94ce1d8184f5f38def015d80777a8113b96690
and
https://github.com/apache/bookkeeper/commit/d175ada58dcaf78f0a70b0ebebf489255ae67b5f),
addEntry and processEntry methods are synchronized. If it is synchronized
then I dont get what is the point in having 'writeThreadPool' in
BookieRequestProcessor, if anyhow they are going to be executed
sequentially because of synchronized addEntry method in
InterleavedLedgerStorage.

If we look at the implementation of addEntry and processEntry method,
'somethingWritten' can be made threadsafe by using AtomicBoolean,
ledgerCache.updateLastAddConfirmed and entryLogger.addEntry methods are
inherently threadsafe.

I'm not sure about semantics of ledgerCache.putEntryOffset method here. I'm
not confident enough to say if LedgerCacheImpl and IndexInMemPageMgr (and
probably IndexPersistenceMgr) are thread-safe classes.

As far as I understood, if ledgerCache.putEntryOffset is thread safe, then
I dont see the need of synchronization for those methods. In any case, if
they are not thread-safe can you please say why it is not thread-safe and
how we can do more granular synchronization at LedgerCacheImpl level, so
that we can remove the need of synchrnoization at InterleavedLedgerStorage
level.

I'm currently working on Multiple Entrylogs -
https://issues.apache.org/jira/browse/BOOKKEEPER-1041. To reap the benefits
of multipleentrylogs feature from performance perspective, this
synchrnoization should be taken care or atleast bring it down to more
granular synchronization (if possible).

@Override
synchronized public long addEntry(ByteBuffer entry) throws IOException {
long ledgerId = entry.getLong();
long entryId = entry.getLong();
long lac = entry.getLong();
entry.rewind();
processEntry(ledgerId, entryId, entry);
ledgerCache.updateLastAddConfirmed(ledgerId, lac);
return entryId;
}

synchronized protected void processEntry(long ledgerId, long entryId,
ByteBuffer entry, boolean rollLog)
throws IOException {
somethingWritten = true;
long pos = entryLogger.addEntry(ledgerId, entry, rollLog);
ledgerCache.putEntryOffset(ledgerId, entryId, pos);
}

Thanks,
Charan