[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177035776 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/SortedLedgerStorage.java ## @@ -78,6 +79,7 @@ public void initialize(ServerConfiguration conf, .setNameFormat("SortedLedgerStorage-%d") .setPriority((Thread.NORM_PRIORITY + Thread.MAX_PRIORITY) / 2).build()); this.stateManager = stateManager; +this.isTransactionalCompactionEnabled = conf.getUseTransactionalCompaction(); Review comment: Separate change, so should not be in this patch. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177033723 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java ## @@ -802,88 +876,217 @@ private long readLastLogId(File f) { } } +interface EntryLogManager { Review comment: Move into its own java file. EntryLogger is already huge. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177034732 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java ## @@ -292,12 +304,37 @@ public EntryLogger(ServerConfiguration conf, logId = lastLogId; } } -this.leastUnflushedLogId = logId + 1; +this.recentlyCreatedEntryLogsStatus = new RecentEntryLogsStatus(logId + 1); Review comment: Should be part of the entryLogManager. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177035373 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java ## @@ -264,6 +274,8 @@ public EntryLogger(ServerConfiguration conf, // but the protocol varies so an exact value is difficult to determine this.maxSaneEntrySize = conf.getNettyMaxFrameSizeBytes() - 500; this.ledgerDirsManager = ledgerDirsManager; +this.conf = conf; +entryLogPerLedgerEnabled = conf.isEntryLogPerLedgerEnabled(); Review comment: The entryLogManager should be constructed outside of the EntryLogger and passed in as an argument. The entryLogger shouldn't need to know which implementation of the entryLogManager it is using, which is the whole point of having an interface. Also injecting the implementation like this, will make for easier testing, and SortedLedgerStorage would not need to be touched at all (i.e. SortedLedgerStorage can be changed to always create the SingleLog variant). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177033940 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java ## @@ -802,88 +876,217 @@ private long readLastLogId(File f) { } } +interface EntryLogManager { +/* + * acquire lock for this ledger. + */ +void acquireLock(Long ledgerId); + +/* + * acquire lock for this ledger if it is not already available for this + * ledger then it will create a new one and then acquire lock. + */ +void acquireLockByCreatingIfRequired(Long ledgerId); + +/* + * release lock for this ledger + */ +void releaseLock(Long ledgerId); + +/* + * sets the logChannel for the given ledgerId. The previous one will be + * removed from replicaOfCurrentLogChannels. Previous logChannel will be + * added to rotatedLogChannels. + */ +void setCurrentLogForLedger(Long ledgerId, BufferedLogChannel logChannel); + +/* + * gets the logChannel for the given ledgerId. + */ +BufferedLogChannel getCurrentLogForLedger(Long ledgerId); + +/* + * gets the copy of rotatedLogChannels + */ +Set getCopyOfRotatedLogChannels(); + +/* + * gets the copy of replicaOfCurrentLogChannels + */ +Set getCopyOfCurrentLogs(); + +/* + * gets the active logChannel with the given entryLogId. null if it is + * not existing. + */ +BufferedLogChannel getCurrentLogIfPresent(long entryLogId); + +/* + * removes the logChannel from rotatedLogChannels collection + */ +void removeFromRotatedLogChannels(BufferedLogChannel rotatedLogChannelToRemove); + +/* + * Returns eligible writable ledger dir for the creation next entrylog + */ +File getDirForNextEntryLog(List writableLedgerDirs); + +/* + * Do the operations required for checkpoint. + */ +void checkpoint() throws IOException; + +/* + * roll entryLogs. + */ +void rollLogs() throws IOException; +} + +class EntryLogManagerForSingleEntryLog implements EntryLogManager { Review comment: Move into it's own file. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177382401 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java ## @@ -264,6 +274,8 @@ public EntryLogger(ServerConfiguration conf, // but the protocol varies so an exact value is difficult to determine this.maxSaneEntrySize = conf.getNettyMaxFrameSizeBytes() - 500; this.ledgerDirsManager = ledgerDirsManager; +this.conf = conf; +entryLogPerLedgerEnabled = conf.isEntryLogPerLedgerEnabled(); Review comment: The current class hierarchy here looks like this: ![current](https://user-images.githubusercontent.com/54955/37962613-fbc5bb1a-31bb-11e8-9900-d0b68ecd6c64.jpg) There's no abstraction here. Everything is a cycle. It's not hard to break these up though. The coupling is fairly weak, and breaking it now will stop it from becoming strong in the future. Even just making EntryLoggerAllocator and the EntryLogManager static will take it a long way. With them static the code looks like https://github.com/ivankelly/bookkeeper/commit/62a333c768d7b7c0fa16e209470c82fed77eb594 And the dependencies look like: ![static](https://user-images.githubusercontent.com/54955/37962791-7fb5dec8-31bc-11e8-9f83-9c3846fc177b.jpg) Still not ideal because there's a cycle between the allocator and the manager. This only exists so that createNewLog can select a directory. But almost all of the time, createNewLog is called from the manager (except for a compaction log, but that should go through manager also). So since that is the case, the manager can select the directory before the call and pass it in, breaking the cycle. See the code at https://github.com/ivankelly/bookkeeper/commit/2e29373570da59ff0c90f8983ee991c2826bf729 The dependencies look like: ![nocycle](https://user-images.githubusercontent.com/54955/37962944-114aeb1c-31bd-11e8-9c61-0ef0d1e294c2.jpg) It's now possible to unit test each of these classes individually. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177388920 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/SortedLedgerStorage.java ## @@ -78,6 +79,7 @@ public void initialize(ServerConfiguration conf, .setNameFormat("SortedLedgerStorage-%d") .setPriority((Thread.NORM_PRIORITY + Thread.MAX_PRIORITY) / 2).build()); this.stateManager = stateManager; +this.isTransactionalCompactionEnabled = conf.getUseTransactionalCompaction(); Review comment: I don't see what the point of this flag even is. The external behaviour of SortedLedgerStorage should not change with this patch, so why the flag? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177391042 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/SortedLedgerStorage.java ## @@ -209,16 +202,12 @@ public void onSizeLimitReached(final Checkpoint cp) throws IOException { public void run() { try { LOG.info("Started flushing mem table."); -long logIdBeforeFlush = entryLogger.getCurrentLogId(); memTable.flush(SortedLedgerStorage.this); -long logIdAfterFlush = entryLogger.getCurrentLogId(); // in any case that an entry log reaches the limit, we roll the log and start checkpointing. // if a memory table is flushed spanning over two entry log files, we also roll log. this is // for performance consideration: since we don't wanna checkpoint a new log file that ledger // storage is writing to. -if (entryLogger.reachEntryLogLimit(0) || logIdAfterFlush != logIdBeforeFlush) { -LOG.info("Rolling entry logger since it reached size limitation"); -entryLogger.rollLog(); +if (entryLogger.rollLogsIfEntryLogLimitReached()) { Review comment: @sijie if ```rollLogsIfEntryLogLimitReached``` is false, it will fall through to the else if statement, which covers the flush over two logs. This is weird though, because it is really tied to a single log, but it acts like it's working with many. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177383683 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/EntryLogger.java ## @@ -292,12 +304,37 @@ public EntryLogger(ServerConfiguration conf, logId = lastLogId; } } -this.leastUnflushedLogId = logId + 1; +this.recentlyCreatedEntryLogsStatus = new RecentEntryLogsStatus(logId + 1); Review comment: EntryLogManager shouldn't call out to EntryLogger (see comment above). Both new log creation and flushing act almost entirely on the entrylogmanager. New log creation needs the listeners, but this is the only place the listeners are used, so they should be part of the entrylogmanager. EntryLogManager can have a reference to the RecentEntryLogStatus to update on flush. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177987104 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/SortedLedgerStorage.java ## @@ -209,16 +202,12 @@ public void onSizeLimitReached(final Checkpoint cp) throws IOException { public void run() { try { LOG.info("Started flushing mem table."); -long logIdBeforeFlush = entryLogger.getCurrentLogId(); memTable.flush(SortedLedgerStorage.this); -long logIdAfterFlush = entryLogger.getCurrentLogId(); // in any case that an entry log reaches the limit, we roll the log and start checkpointing. // if a memory table is flushed spanning over two entry log files, we also roll log. this is // for performance consideration: since we don't wanna checkpoint a new log file that ledger // storage is writing to. -if (entryLogger.reachEntryLogLimit(0) || logIdAfterFlush != logIdBeforeFlush) { -LOG.info("Rolling entry logger since it reached size limitation"); -entryLogger.rollLog(); +if (entryLogger.rollLogsIfEntryLogLimitReached()) { Review comment: > @ivankelly that's what I commented before. I don't think the methods in EntryLogManager provide the right abstraction. I agree about the methods providing the right abstraction. I was under the impression that #getPreviousAllocatedEntryLogId() would always return an incremented value if the current log has changed, but having read the code, this is not the case. It seems to me that SortedLedgerStorage is reaching very far into the internals of the entrylogger to do what it wants to do. Since this is the case, we shouldn't try to shoehorn this into the abstraction. SortedLedgerStorage assumes that it uses a single log, so it should only ever be constructed with a SingleLogEntryLogmanager, and SingleLogEntryLogManager should expose the stuff that SortedLedgerStorage has required until now. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager.
ivankelly commented on a change in pull request #1281: Issue #570: Introducing EntryLogManager. URL: https://github.com/apache/bookkeeper/pull/1281#discussion_r177986877 ## File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/SortedLedgerStorage.java ## @@ -78,6 +79,7 @@ public void initialize(ServerConfiguration conf, .setNameFormat("SortedLedgerStorage-%d") .setPriority((Thread.NORM_PRIORITY + Thread.MAX_PRIORITY) / 2).build()); this.stateManager = stateManager; +this.isTransactionalCompactionEnabled = conf.getUseTransactionalCompaction(); Review comment: If you have to supply a flag, then there's something external to be observed. Let me rephrase. As I understand it, we do not want the behaviour of SortedLedgerStorage flushing to change because of the change in this PR. So why are we adding a new configuration flag? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services