[ https://issues.apache.org/jira/browse/HDDS-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769933#comment-16769933 ]
Anu Engineer edited comment on HDDS-1085 at 2/16/19 1:12 AM:
-------------------------------------------------------------

[~avijayan] It is a very good patch, well written and very easy to understand. I have some very minor comments.

# *DBCheckPointSnapShot#getCheckpointLocation* - Should this return a path?
# *OMDbSnapshotServlet.java#doGet* - I understand that doing this inline is perhaps simpler than anything else. But we seem to be doing two things before we start the transfer: first checkpointing, then tarring. For DB sizes in the GBs this might be OK, but in the long run I am worried that we might start seeing client timeouts.
## To understand what is happening, it might be interesting to have 3 counters, or a map of counters:
### How much time we are taking for each checkpoint.
### How much time we are taking for each tar operation, along with sizes.
### How much time we are taking for the transfer.
## You don't have to do this in this patch; feel free to add it in a different patch. In the long run, if we have issues like client timeouts, these numbers will help us tune the client params. Also, at some point, we will have to do this in a background thread and just return when we are ready, rather than synchronously like this. But this is a great start, so let us go ahead and see what we can get out of this.
# *OMDbSnapshotServlet.java#doGet* - Since we are using the TransferFsImage class, are we going to carry the hadoop-hdfs jar too? Should we even consider moving this to hadoop-common? [~xyao], [~elek], [~bharatviswa]
# OmUtils.java - Check if we have this tar file code already in Ozone. I think we have something like this already, [~elek]?
# *OmUtils.java#addFilesToArchive* - In the recursive call we seem to pass _cFile.getAbsolutePath_. Is that expected, or should the archive contain relative paths?
# *RDBCheckpointManager#createCheckpointSnapshot* - I see we are reading the temp directory from the JVM environment. But doesn't a RocksDB checkpoint need to be on the same disk as the DB, or at least isn't it only fast there, since it is then able to hard link the SST and WAL files? Just wanted to make sure that my understanding is not busted.

> Create an OM API to serve snapshots to Recon server
> ---------------------------------------------------
>
>                 Key: HDDS-1085
>                 URL: https://issues.apache.org/jira/browse/HDDS-1085
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Siddharth Wagle
>            Assignee: Aravindan Vijayan
>            Priority: Major
>         Attachments: HDDS-1085-000.patch, HDDS-1085-001.patch, HDDS-1085-002.patch
>
>
> We need to add an API to OM so that we can serve snapshots from the OM server.
> - The snapshot should be streamed to fsck server with the ability to throttle network utilization (like TransferFsImage)
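
The per-phase counters suggested in the review (checkpoint, tar, transfer) could be sketched roughly as follows. This is only an illustration in plain Java: the class name {{SnapshotPhaseTimers}}, the phase labels, and the {{time}} helper are hypothetical and not part of the patch or of Ozone. A real implementation would more likely plug into the existing metrics system, and the "along with sizes" part is left out here.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper: accumulates elapsed time and call counts per named phase. */
public class SnapshotPhaseTimers {
  private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();
  private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();

  /** Runs the task and charges its wall-clock time to the phase, even if the task throws. */
  public <T> T time(String phase, Callable<T> task) throws Exception {
    long start = System.nanoTime();
    try {
      return task.call();
    } finally {
      totalNanos.computeIfAbsent(phase, k -> new LongAdder())
          .add(System.nanoTime() - start);
      calls.computeIfAbsent(phase, k -> new LongAdder()).increment();
    }
  }

  /** Total nanoseconds recorded for the phase, or 0 if it never ran. */
  public long totalNanos(String phase) {
    LongAdder adder = totalNanos.get(phase);
    return adder == null ? 0L : adder.sum();
  }

  /** Number of times the phase was timed, or 0 if it never ran. */
  public long calls(String phase) {
    LongAdder adder = calls.get(phase);
    return adder == null ? 0L : adder.sum();
  }

  public static void main(String[] args) throws Exception {
    SnapshotPhaseTimers timers = new SnapshotPhaseTimers();
    // Stand-ins for the real work: each lambda would wrap the actual
    // checkpoint, tar, and transfer calls made by the servlet.
    timers.time("checkpoint", () -> { Thread.sleep(5); return null; });
    timers.time("tar", () -> null);
    timers.time("transfer", () -> null);
    for (String phase : new String[] {"checkpoint", "tar", "transfer"}) {
      System.out.printf("%s: %d call(s), %d ns%n",
          phase, timers.calls(phase), timers.totalNanos(phase));
    }
  }
}
```

Recording in a {{finally}} block means a failed or timed-out phase still shows up in the numbers, which is the case the review is most worried about.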