Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
rpuch merged PR #6229: URL: https://github.com/apache/ignite-3/pull/6229 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
rpuch commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197847533
##
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/engine/AbstractPersistentStorageEngineTest.java:
##
@@ -70,6 +70,9 @@
* because it doesn't limit the usage of the engine with a single table.
*/
public abstract class AbstractPersistentStorageEngineTest extends
AbstractStorageEngineTest {
+/** Makes sure that table destruction is persisted durably. */
+protected abstract void persistTableDestructionIfNeeded();
Review Comment:
Technically, it can, but
1. It's not needed (we'll ensure durability of destruction on the higher
level; this PR is about building this higher-level mechanism)
2. It's not free (it will require a flush, which we don't need per item 1)
3. It will bring additional complexity to the code (necessity to call flush,
some methods' return values will be `CompletableFuture` vs void)
So it seems more rational to not introduce such guarantees.
The fact that pagemem-based storages give these guarantees is a lucky
coincidence (we use file/dir removal there, which is fast). Also, pagemem-based
storages provide these guarantees only while we only think about Ignite
restart/crash; if a machine or OS crashes, those file-based guarantees vanish
as it seems that we don't do dir fsync when deleting files, and not all FSs use
journals.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
sashapolo commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197853167
##
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/engine/AbstractPersistentStorageEngineTest.java:
##
@@ -70,6 +70,9 @@
* because it doesn't limit the usage of the engine with a single table.
*/
public abstract class AbstractPersistentStorageEngineTest extends
AbstractStorageEngineTest {
+/** Makes sure that table destruction is persisted durably. */
+protected abstract void persistTableDestructionIfNeeded();
Review Comment:
Sorry, I don't get it: PageMemory has durable destroy, while Rocksdb has
not. Why is this ok? We even need some hacks for tests to mitigate this
situation.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
rpuch commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197857090
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/RocksDbStorageEngine.java:
##
@@ -270,4 +273,24 @@ public void destroyMvTable(int tableId) {
rocksDbStorage.rocksDbInstance.destroyTable(tableId);
}
}
+
+@Override
+public Set tableIdsOnDisk() {
+return storageByProfileName.values().stream()
+.flatMap(storage ->
storage.rocksDbInstance.tableIdsInRocksDb().stream())
Review Comment:
Renamed it to tableIdsOnDisk
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/index/AbstractRocksDbIndexStorage.java:
##
@@ -257,14 +257,27 @@ String createStorageInfo() {
}
/**
- * Deletes the data associated with the index, using passed write batch
for the operation.
+ * Deletes the data associated with the index to prepare the storage for
subsequent use, using passed write batch for the operation.
+ *
+ * @throws RocksDBException If failed to delete data.
+ */
+public final void eraseData(WriteBatch writeBatch) throws RocksDBException
{
Review Comment:
Oops. Made another rename
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
sashapolo commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197857331
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/instance/SharedRocksDbInstance.java:
##
@@ -443,4 +448,44 @@ private void destroyColumnFamily(ColumnFamily
columnFamily) {
);
}
}
+
+/**
+ * Returns IDs of all tables for which there are storages in the
underlying RocksDB.
+ */
+public Set tableIdsInRocksDb() {
+Set tableIds = new HashSet<>();
+
+try (
+var upperBound = new
Slice(incrementPrefix(PARTITION_META_PREFIX));
+var readOptions = new
ReadOptions().setIterateUpperBound(upperBound);
+RocksIterator it = meta.columnFamily().newIterator(readOptions)
+) {
+it.seek(PARTITION_META_PREFIX);
+
+while (it.isValid()) {
+byte[] key = it.key();
+int tableId = ByteUtils.bytesToInt(key,
PARTITION_META_PREFIX.length);
+tableIds.add(tableId);
+
+it.next();
+}
+
+// Doing this to make an exception thrown if the iteration was
stopped due to an error and not due to exhausting
+// the iteration space.
+it.status();
+} catch (RocksDBException e) {
+throw new IgniteInternalException(INTERNAL_ERR, "Cannot get table
IDs", e);
+}
+
+return Set.copyOf(tableIds);
Review Comment:
ok, whatever, it just looks odd, that's all
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
rpuch commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197826927
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/instance/SharedRocksDbInstance.java:
##
@@ -443,4 +448,44 @@ private void destroyColumnFamily(ColumnFamily
columnFamily) {
);
}
}
+
+/**
+ * Returns IDs of all tables for which there are storages in the
underlying RocksDB.
+ */
+public Set tableIdsInRocksDb() {
+Set tableIds = new HashSet<>();
+
+try (
+var upperBound = new
Slice(incrementPrefix(PARTITION_META_PREFIX));
+var readOptions = new
ReadOptions().setIterateUpperBound(upperBound);
+RocksIterator it = meta.columnFamily().newIterator(readOptions)
+) {
+it.seek(PARTITION_META_PREFIX);
+
+while (it.isValid()) {
+byte[] key = it.key();
+int tableId = ByteUtils.bytesToInt(key,
PARTITION_META_PREFIX.length);
+tableIds.add(tableId);
+
+it.next();
+}
+
+// Doing this to make an exception thrown if the iteration was
stopped due to an error and not due to exhausting
+// the iteration space.
+it.status();
+} catch (RocksDBException e) {
+throw new IgniteInternalException(INTERNAL_ERR, "Cannot get table
IDs", e);
+}
+
+return Set.copyOf(tableIds);
Review Comment:
It's not about protecting internal state, that's true; but the collection is
not intended for being modified by the caller, so immutability seems to be a
good thing here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
sashapolo commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197664216
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/index/AbstractRocksDbIndexStorage.java:
##
@@ -257,14 +257,27 @@ String createStorageInfo() {
}
/**
- * Deletes the data associated with the index, using passed write batch
for the operation.
+ * Deletes the data associated with the index to prepare the storage for
subsequent use, using passed write batch for the operation.
+ *
+ * @throws RocksDBException If failed to delete data.
+ */
+public final void eraseData(WriteBatch writeBatch) throws RocksDBException
{
Review Comment:
You renamed it to `cleanData`, was that on purpose?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
sashapolo commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197660458
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/RocksDbStorageEngine.java:
##
@@ -270,4 +273,24 @@ public void destroyMvTable(int tableId) {
rocksDbStorage.rocksDbInstance.destroyTable(tableId);
}
}
+
+@Override
+public Set tableIdsOnDisk() {
+return storageByProfileName.values().stream()
+.flatMap(storage ->
storage.rocksDbInstance.tableIdsInRocksDb().stream())
Review Comment:
can we call it `persistedTableIds`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
sashapolo commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197659312
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/instance/SharedRocksDbInstance.java:
##
@@ -443,4 +448,44 @@ private void destroyColumnFamily(ColumnFamily
columnFamily) {
);
}
}
+
+/**
+ * Returns IDs of all tables for which there are storages in the
underlying RocksDB.
+ */
+public Set tableIdsInRocksDb() {
+Set tableIds = new HashSet<>();
+
+try (
+var upperBound = new
Slice(incrementPrefix(PARTITION_META_PREFIX));
+var readOptions = new
ReadOptions().setIterateUpperBound(upperBound);
+RocksIterator it = meta.columnFamily().newIterator(readOptions)
+) {
+it.seek(PARTITION_META_PREFIX);
+
+while (it.isValid()) {
+byte[] key = it.key();
+int tableId = ByteUtils.bytesToInt(key,
PARTITION_META_PREFIX.length);
+tableIds.add(tableId);
+
+it.next();
+}
+
+// Doing this to make an exception thrown if the iteration was
stopped due to an error and not due to exhausting
+// the iteration space.
+it.status();
+} catch (RocksDBException e) {
+throw new IgniteInternalException(INTERNAL_ERR, "Cannot get table
IDs", e);
+}
+
+return Set.copyOf(tableIds);
Review Comment:
yes, it's just strange, because you fill an intermediate variable already
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
sashapolo commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197658396
##
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/engine/AbstractPersistentStorageEngineTest.java:
##
@@ -70,6 +70,9 @@
* because it doesn't limit the usage of the engine with a single table.
*/
public abstract class AbstractPersistentStorageEngineTest extends
AbstractStorageEngineTest {
+/** Makes sure that table destruction is persisted durably. */
+protected abstract void persistTableDestructionIfNeeded();
Review Comment:
Why can't Rocksdb's `destroy` method have the same guarantees as Page
Memory's?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
rpuch commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197655210
##
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/engine/AbstractPersistentStorageEngineTest.java:
##
@@ -70,6 +70,9 @@
* because it doesn't limit the usage of the engine with a single table.
*/
public abstract class AbstractPersistentStorageEngineTest extends
AbstractStorageEngineTest {
+/** Makes sure that table destruction is persisted durably. */
+protected abstract void persistTableDestructionIfNeeded();
Review Comment:
Yes, it's because for RocksDB we need to additionally call flush (which I
did not put on the `StorageEngine` interface as it's only needed for tests).
Making them identical would probably mean having the method on the interface
which is weird (as it's only for tests).
Are you thinking about a different solution?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
rpuch commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197651451
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/instance/SharedRocksDbInstance.java:
##
@@ -443,4 +448,44 @@ private void destroyColumnFamily(ColumnFamily
columnFamily) {
);
}
}
+
+/**
+ * Returns IDs of all tables for which there are storages in the
underlying RocksDB.
+ */
+public Set tableIdsInRocksDb() {
+Set tableIds = new HashSet<>();
+
+try (
+var upperBound = new
Slice(incrementPrefix(PARTITION_META_PREFIX));
+var readOptions = new
ReadOptions().setIterateUpperBound(upperBound);
+RocksIterator it = meta.columnFamily().newIterator(readOptions)
+) {
+it.seek(PARTITION_META_PREFIX);
+
+while (it.isValid()) {
+byte[] key = it.key();
+int tableId = ByteUtils.bytesToInt(key,
PARTITION_META_PREFIX.length);
+tableIds.add(tableId);
+
+it.next();
+}
+
+// Doing this to make an exception thrown if the iteration was
stopped due to an error and not due to exhausting
+// the iteration space.
+it.status();
+} catch (RocksDBException e) {
+throw new IgniteInternalException(INTERNAL_ERR, "Cannot get table
IDs", e);
+}
+
+return Set.copyOf(tableIds);
Review Comment:
Are you asking about the defensive copy? It's to make it easier to reason
about the code, and this method will not be called often, so allocations are
not a problem
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
rpuch commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197626520
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/RocksDbStorageEngine.java:
##
@@ -270,4 +273,24 @@ public void destroyMvTable(int tableId) {
rocksDbStorage.rocksDbInstance.destroyTable(tableId);
}
}
+
+@Override
+public Set tableIdsOnDisk() {
+return storageByProfileName.values().stream()
+.flatMap(storage ->
storage.rocksDbInstance.tableIdsInRocksDb().stream())
Review Comment:
With `onDisk` there is a slight semantic problem: it is possible that what
this method returns is not on disk yet; it's *potentially* on disk. `InRocksDb`
describes the situation more precisely, that's why I chose this name.
I can rename it to use `onDisk` as well, it doesn't seem to be a big deal.
Or we could invent a better name for `tableIdsOnDisk`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
sashapolo commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197529302
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/RocksDbStorageEngine.java:
##
@@ -270,4 +273,24 @@ public void destroyMvTable(int tableId) {
rocksDbStorage.rocksDbInstance.destroyTable(tableId);
}
}
+
+@Override
+public Set tableIdsOnDisk() {
+return storageByProfileName.values().stream()
+.flatMap(storage ->
storage.rocksDbInstance.tableIdsInRocksDb().stream())
+.collect(toUnmodifiableSet());
+}
+
+/**
+ * Flushes all changes made to the underlying RocksDB instances to disk.
+ *
+ * @return Future that gets completed when flush is complete.
+ */
+public CompletableFuture flush() {
+@SuppressWarnings("rawtypes")
+CompletableFuture[] futures = storageByProfileName.values().stream()
Review Comment:
you can write `CompletableFuture[] futures` and the raw types warning
will go away
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] IGNITE-25854 Implement method to get non-destroyed RocksDB tables IDs [ignite-3]
sashapolo commented on code in PR #6229:
URL: https://github.com/apache/ignite-3/pull/6229#discussion_r2197516582
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/index/AbstractRocksDbIndexStorage.java:
##
@@ -257,14 +257,27 @@ String createStorageInfo() {
}
/**
- * Deletes the data associated with the index, using passed write batch
for the operation.
+ * Deletes the data associated with the index to prepare the storage for
subsequent use, using passed write batch for the operation.
+ *
+ * @throws RocksDBException If failed to delete data.
+ */
+public final void eraseData(WriteBatch writeBatch) throws RocksDBException
{
Review Comment:
We usually use `clearData` in this context
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/RocksDbStorageEngine.java:
##
@@ -270,4 +273,24 @@ public void destroyMvTable(int tableId) {
rocksDbStorage.rocksDbInstance.destroyTable(tableId);
}
}
+
+@Override
+public Set tableIdsOnDisk() {
+return storageByProfileName.values().stream()
+.flatMap(storage ->
storage.rocksDbInstance.tableIdsInRocksDb().stream())
Review Comment:
`tableIdsInRocksDb` doesn't look like a good name here, it's obvious that
it's from RocksDB. Why is this method not also called `tableIdsOnDisk`?
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/instance/SharedRocksDbInstance.java:
##
@@ -443,4 +448,44 @@ private void destroyColumnFamily(ColumnFamily
columnFamily) {
);
}
}
+
+/**
+ * Returns IDs of all tables for which there are storages in the
underlying RocksDB.
+ */
+public Set tableIdsInRocksDb() {
+Set tableIds = new HashSet<>();
+
+try (
+var upperBound = new
Slice(incrementPrefix(PARTITION_META_PREFIX));
+var readOptions = new
ReadOptions().setIterateUpperBound(upperBound);
+RocksIterator it = meta.columnFamily().newIterator(readOptions)
+) {
+it.seek(PARTITION_META_PREFIX);
+
+while (it.isValid()) {
+byte[] key = it.key();
+int tableId = ByteUtils.bytesToInt(key,
PARTITION_META_PREFIX.length);
+tableIds.add(tableId);
+
+it.next();
+}
+
+// Doing this to make an exception thrown if the iteration was
stopped due to an error and not due to exhausting
+// the iteration space.
+it.status();
+} catch (RocksDBException e) {
+throw new IgniteInternalException(INTERNAL_ERR, "Cannot get table
IDs", e);
+}
+
+return Set.copyOf(tableIds);
Review Comment:
What's this for?
##
modules/storage-rocksdb/src/main/java/org/apache/ignite/internal/storage/rocksdb/index/AbstractRocksDbIndexStorage.java:
##
@@ -257,14 +257,27 @@ String createStorageInfo() {
}
/**
- * Deletes the data associated with the index, using passed write batch
for the operation.
+ * Deletes the data associated with the index to prepare the storage for
subsequent use, using passed write batch for the operation.
+ *
+ * @throws RocksDBException If failed to delete data.
+ */
+public final void eraseData(WriteBatch writeBatch) throws RocksDBException
{
+cleanup(writeBatch, false);
+}
+
+/**
+ * Deletes the data associated with the index (the storage will not be
used anymore), using passed write batch for the operation.
*
* @throws RocksDBException If failed to delete data.
*/
public final void destroyData(WriteBatch writeBatch) throws
RocksDBException {
+cleanup(writeBatch, true);
+}
+
+private void cleanup(WriteBatch writeBatch, boolean finalDestruction)
throws RocksDBException {
clearIndex(writeBatch);
-if (descriptor.mustBeBuilt()) {
+if (descriptor.mustBeBuilt() && !finalDestruction) {
Review Comment:
maybe two separate (partially copy-pasted) methods are a cleaner solution
here...
##
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/engine/AbstractPersistentStorageEngineTest.java:
##
@@ -70,6 +70,9 @@
* because it doesn't limit the usage of the engine with a single table.
*/
public abstract class AbstractPersistentStorageEngineTest extends
AbstractStorageEngineTest {
+/** Makes sure that table destruction is persisted durably. */
+protected abstract void persistTableDestructionIfNeeded();
Review Comment:
Why is this method needed? Is it because PageMemory and RocksDb have
different `destroy` durability effects? Can we just make them identical?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL
