[jira] [Comment Edited] (HDDS-1085) Create an OM API to serve snapshots to Recon server

Anu Engineer (JIRA) Fri, 15 Feb 2019 17:13:29 -0800


    [ 
https://issues.apache.org/jira/browse/HDDS-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769933#comment-16769933
 ]


Anu Engineer edited comment on HDDS-1085 at 2/16/19 1:12 AM:
-------------------------------------------------------------

[~avijayan] It is a very good patch, well written and very easy to understand. 
I have some very minor comments.
 # *DBCheckPointSnapShot#getCheckpointLocation* - should this return a Path?
 # *OMDbSnapshotServlet.java#doGet* - I understand that doing this inline is 
perhaps simpler than anything else. But we seem to be doing two things before 
we start the transfer: first checkpointing, then tarring. For DB sizes in the 
GBs it might be OK, but in the long run I am worried that we might start seeing 
client timeouts.
 ## To understand what is happening, it might be interesting to have 3 counters 
- or a map of counters.
 ### How much time are we taking for each checkpoint?
 ### How much time are we taking for each tar operation, along with sizes?
 ### How much time are we taking for the transfer?
 ## You don't have to do this in this patch; feel free to add it in a 
different patch. In the long run, if we have issues like client timeouts, these 
numbers will help us tune the client params. Also, at some point, we will have 
to do this in a background thread and return when the snapshot is ready, 
instead of doing it synchronously like this. But this is a great start. So let 
us go ahead and see what we can get out of this.
 # *OMDbSnapshotServlet.java#doGet* - Since we are using the TransferFsImage 
class, are we going to carry the hadoop-hdfs jar too? Should we even consider 
moving this to hadoop-common? [~xyao], [~elek], [~bharatviswa]
 # *OmUtils.java* - check whether we already have this tar file code in Ozone. 
I think we have something like this already, [~elek]?
 # *OmUtils.java#addFilesToArchive* - In the recursive call we seem to pass 
_cFile.getAbsolutePath_. Is that expected, or should the archive contain 
relative paths?
 # *RDBCheckpointManager#createCheckpointSnapshot* - I see we are reading the 
temp directory from the JVM environment, but doesn't a RocksDB checkpoint need 
to be on the same disk as the DB, or at least isn't it faster there, since it 
can then hard link the SST and WAL files? Just wanted to make sure that my 
understanding is not busted.
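
A minimal sketch of the per-phase counters suggested for doGet above, assuming 
a plain map of accumulators is enough. PhaseTimers and the phase names 
("checkpoint", "tar", "transfer") are hypothetical and are not part of the 
patch or of Ozone; a real version would probably hook into the existing 
metrics system and also record sizes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical helper: accumulates elapsed time and call counts per phase. */
final class PhaseTimers {
  private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();
  private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

  /** Runs the action, records its duration under the phase, returns its result. */
  <T> T time(String phase, Supplier<T> action) {
    long start = System.nanoTime();
    try {
      return action.get();
    } finally {
      // Recorded even when the action throws, so failed phases are visible too.
      long elapsed = System.nanoTime() - start;
      totalNanos.computeIfAbsent(phase, k -> new LongAdder()).add(elapsed);
      counts.computeIfAbsent(phase, k -> new LongAdder()).increment();
    }
  }

  /** Total nanoseconds spent in the phase so far (0 if never run). */
  long totalNanos(String phase) {
    LongAdder adder = totalNanos.get(phase);
    return adder == null ? 0L : adder.sum();
  }

  /** Number of times the phase has run so far (0 if never run). */
  long count(String phase) {
    LongAdder adder = counts.get(phase);
    return adder == null ? 0L : adder.sum();
  }
}
```

In the servlet, each step would be wrapped in a call such as 
timers.time("checkpoint", ...), and the totals divided by the counts give the 
average time per phase when tuning client timeouts.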

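On the addFilesToArchive question above, one way to keep archive entries 
relative is to relativize each file against the checkpoint root before naming 
the entry. This is only a sketch using java.nio; ArchiveNames and entryName 
are hypothetical names, not code from the patch, and the actual tar writing is 
left out.

```java
import java.io.File;
import java.nio.file.Path;

/** Hypothetical helper: derives archive entry names relative to a root dir. */
final class ArchiveNames {
  private ArchiveNames() { }

  /**
   * Returns the name to use for the file inside the archive: its path
   * relative to root, with '/' separators regardless of platform.
   */
  static String entryName(Path root, Path file) {
    Path r = root.toAbsolutePath().normalize();
    Path f = file.toAbsolutePath().normalize();
    if (!f.startsWith(r)) {
      // Refuse files outside the root instead of emitting "../" entries.
      throw new IllegalArgumentException(file + " is not under " + root);
    }
    return r.relativize(f).toString().replace(File.separatorChar, '/');
  }
}
```

With this, a file at sub/000012.sst under the checkpoint directory is stored as 
sub/000012.sst no matter where the checkpoint directory lives on the OM host, 
so the receiving side can unpack it anywhere.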
 

 


> Create an OM API to serve snapshots to Recon server
> ---------------------------------------------------
>
>                 Key: HDDS-1085
>                 URL: https://issues.apache.org/jira/browse/HDDS-1085
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Siddharth Wagle
>            Assignee: Aravindan Vijayan
>            Priority: Major
>         Attachments: HDDS-1085-000.patch, HDDS-1085-001.patch, 
> HDDS-1085-002.patch
>
>
> We need to add an API to OM so that we can serve snapshots from the OM server.
>  - The snapshot should be streamed to fsck server with the ability to 
> throttle network utilization (like TransferFsImage)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
