Young-Seok Kim has posted comments on this change. Change subject: Introducing Data Replication To AsterixDB ......................................................................
Patch Set 6: (36 comments) I left comments. Please address them. I haven't finished the review yet, so I will continue. https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-app/src/main/java/org/apache/asterix/hyracks/bootstrap/NCApplicationEntryPoint.java File asterix-app/src/main/java/org/apache/asterix/hyracks/bootstrap/NCApplicationEntryPoint.java: Line 162: //Note: this is a hack since each node right now maintains its own copy of the cluster configuration. Who provides the original configuration? When is the copy created? and Where is the copy maintained? Line 187: recoveryMgr.checkpoint(true, NON_SHARP_CHECKPOINT_TARGET_LSN); It will be simpler if we can do checkpoint right after LifeCycleComponentManager().stopAll() method. https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-common/src/main/java/org/apache/asterix/common/config/AsterixReplicationProperties.java File asterix-common/src/main/java/org/apache/asterix/common/config/AsterixReplicationProperties.java: Line 14: */ This license should be changed. Line 173: public Set<String> getNodeReplicas(String nodeId) { It's better to change the method name as something like getNodeReplicaIds() to distinguish from the getRemoteReplicas() which actually returns a set of replicas. Line 197: for (int i = nodeIndex - 1; i >= 0; i--) { Why is the direction of iteration different from the for clause in getRemoteReplicas() method? It seems that there is no reason to be different. Line 208: for (int i = cluster.getNode().size() - 1; i >= 0; i--) { same comment for this "Why is the direction of iteration different from the for clause in getRemoteReplicas() method? It seems that there is no reason to be different. " Line 221: public Set<String> getNodeDataReplicas(String nodeId) { Please help me understand the difference between this method and the method, getNodeReplicas() above. They seem to return nodeIds each of which replicate the same data. What am I missing here? Also, regardless of whether the two methods are play the same role or not, the name better be more like a getXXXXIds() instead of getXXXReplicas(). https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-common/src/main/java/org/apache/asterix/common/replication/AsterixReplicationJob.java File asterix-common/src/main/java/org/apache/asterix/common/replication/AsterixReplicationJob.java: Line 22: Please explain the difference between the LSMIndexReplicationJob and AsterixReplicationJob. Which kind of replications should belong to AsterixReplicationJob? https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-common/src/main/java/org/apache/asterix/common/replication/IReplicationLifecycleListener.java File asterix-common/src/main/java/org/apache/asterix/common/replication/IReplicationLifecycleListener.java: Line 20: What's the purpose of having a separate Interface without having any added public method? https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-common/src/main/java/org/apache/asterix/common/replication/Replica.java File asterix-common/src/main/java/org/apache/asterix/common/replication/Replica.java: Line 14: */ license should be changed. Line 59: String replicaIPAddreess = node.getClusterIp(); typo: replicaIPAddreess --> replicaIPAddress Line 73: output.writeUTF(node.getId()); why UTF? https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-common/src/main/java/org/apache/asterix/common/replication/ReplicaEvent.java File asterix-common/src/main/java/org/apache/asterix/common/replication/ReplicaEvent.java: Line 24: public enum ReplicaEventType { Let's explain the meansing of each event and when each event can happen. https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-common/src/main/java/org/apache/asterix/common/transactions/LogRecord.java File asterix-common/src/main/java/org/apache/asterix/common/transactions/LogRecord.java: Line 40: * NodeIdLength(4) NodeIdLength is first written and then NodeId is written. So, they should be switched. Line 133: buffer.put(nodeId.getBytes(java.nio.charset.StandardCharsets.UTF_8)); why UTF? Line 169: public void writeLogRecord(ByteBuffer buffer, long appendLSN) { In order to avoid duplicated code, we should create an private method which can be used for replicated mode's writeLogRecord, non-replicated mode's writeLogRecord and serialize() method, too. The same comment goes to readLogRecord() and deserialize() method. Line 204: >From here, please reflect the newly added fields to the LogRecordFormat at the >top of this class. Also, update the size of field(s) accordingly in the >comment. Otherwise, that will confuse the readers of this code. Line 347: public void formJobTerminateLogRecord(ITransactionContext txnCtx, boolean isCommit) { formXXXLogRecord() methods can be refactored to avoid duplicated code. Line 437: logSize += nodeIdLength; nodeIdLength can be included in the existing constants such as XXX_SIZE instead of adding here. Line 463: public int serialize(ByteBuffer buffer) { Again, this method should be refactored. Line 498: public void deserialize(ByteBuffer buffer) { Again, this should be refactored. Line 539: tupleBuffer = ByteBuffer.allocate(newValueSize); You don't want to allocate buffer for every update log record. It's better to allocate the buffer beforehand and reuse it. Also, if you allocate buffer every time, then, this tupleBuffer doesn't have to be a member variable. https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-replication/src/main/java/org/apache/asterix/replication/functions/AsterixReplicationProtocol.java File asterix-replication/src/main/java/org/apache/asterix/replication/functions/AsterixReplicationProtocol.java: Line 42: public final static int REPLICATION_FUNCATION_SIZE = INTEGER_SIZE; typo: FUNCATION --> FUNCTION Line 45: public enum ReplicationFunctions { It's better to change the enum class name into something like ReplicationFunctionType or ReplicationRequestType. Also, the name had better be consistent with other method names such as readRequest() and constructXXXRequest(). If the enum is ReplicationFunciton, then those methods better be XXXFunciton() instead of Request(). Lastly, each of the types should have description such as the role of each type and when it is used. Line 67: dataBuffer = ByteBuffer.allocate(requestSize); This doesn't work, right? The newly allocated buffer is not visible to the caller of this method. Either the allocated buffer should be returned or buffer should be guaranteed to be large enough to accommodate the object to be read. Line 74: public static void constructLSMComponentPropertiesRequest(LSMComponentProperties lsmCompProp, ByteBuffer buffer) All constuctXXXRequest() methods write a given object to a destination buffer. It will be more understandable if the method name is modified to something like a writeXXXRequest(object, destBuffer). Line 77: DataOutputStream oos = new DataOutputStream(outputStream); Better be avoiding creating these two object in every call unless this call is one time call per component. LSMComponentProperties is a fixed size. Also, is there a way to avoid two memory writes, one for serialize and one for put the serialized one to the buffer? Line 83: buffer = ByteBuffer.allocate(requestSize); This doesn't work, right? The newly allocated buffer is not visible to the caller of this method. Either the allocated buffer should be returned or buffer should be guaranteed to be large enough to accommodate the object to be written. Line 93: public static ReplicationFunctions getRequestFunction(SocketChannel socketChannel, ByteBuffer byteBuffer) better be getRequestType() or getReplicationRequestType() Line 109: ByteBuffer bb = ByteBuffer.allocate(REPLICATION_FUNCATION_SIZE); Why this bb and the bb in getOKBuffer() methods are allocated every time? GOODBYE and OK is sent only once? If it's not, better be allocated beforehand and reuse it. Or, this goodbye bytebuffer can be created once and reused again and again. Same comment for getOKBuffer(). Line 139: requestBuffer.putInt(serializedLog.length); serializedLog.length means the allocated byte array size, not the limit position of the bytebuffer. Unless serializedLog byte array is allocated exactly the same size of the serialized log, the serializedLog.length will be bigger than the actual serializedLog size. I can see in ReplicationManager's code: remoteLogsQ.offer(logRecord.getSerializedLog().array().clone()); I think the bytearray size should be given as well, to avoid this problem. ------------------- ByteBuffer b = ByteBuffer.allocate(10); b.put((byte) 1); b.flip(); System.out.println(b.array().clone().length); ---------------------- The above code prints 10 not 1. Line 151: same comments as constructLSMComponentPropertiesRequest(). https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-replication/src/main/java/org/apache/asterix/replication/storage/LSMComponentProperties.java File asterix-replication/src/main/java/org/apache/asterix/replication/storage/LSMComponentProperties.java: Line 57: } This can be removed. https://asterix-gerrit.ics.uci.edu/#/c/338/6/asterix-replication/src/main/java/org/apache/asterix/replication/storage/ReplicaResourcesManager.java File asterix-replication/src/main/java/org/apache/asterix/replication/storage/ReplicaResourcesManager.java: Line 153: What will happen if the system crashes right after the file is deleted, but before the lsnMap is updated? Shouldn't deleting the file and updating lsnMap be atomic? Line 278: public Set<String> getLaggingReplicaIndexes(String replicaId, long targetLSN) throws IOException { Can we name it as getLaggingReplicaIndexPaths()? Line 297: public HashMap<Long, String> getLaggingReplicaIndexes2(String replicaId, long targetLSN) throws IOException { I understand the naming is hard. :) Can we name it as getLaggingReplicaIndexesId2PathMap()? -- To view, visit https://asterix-gerrit.ics.uci.edu/338 To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-MessageType: comment Gerrit-Change-Id: I729fdd1144dbc9ff039b4bc414494860d7553810 Gerrit-PatchSet: 6 Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Owner: Murtadha Hubail <[email protected]> Gerrit-Reviewer: Ian Maxon <[email protected]> Gerrit-Reviewer: Jenkins <[email protected]> Gerrit-Reviewer: Yingyi Bu <[email protected]> Gerrit-Reviewer: Young-Seok Kim <[email protected]> Gerrit-Reviewer: abdullah alamoudi <[email protected]> Gerrit-HasComments: Yes
