On Mon, Nov 26, 2018 at 4:00 PM Ranjit DSouza <ranjit.dso...@veritas.com>
wrote:

> Nir
>
>
>
> I think you were spot on with the content-range not getting sent to RHV
> server. Good catch!
>
>
>
> Ok so that problem was in our http client code where we were not setting
> this header in libcurl. Now that we have moved forward, we are seeing that
> the restored disk actual_size is 4k over the provisioned size.
>

actual size is the allocated size on storage - basically:

    st_blocks * 512

We know that a fully allocated disk sometimes reports an actual size of
st_size + 4k. I don't know why this happens, but it does not change anything
for the guest or for oVirt.
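
If you want to compare the two numbers yourself on a file-based image, a
minimal Python sketch could look like this. It assumes a POSIX system, where
st_blocks is counted in 512-byte units; the function name is only for
illustration.

```python
import os

def sizes(path):
    """Return (apparent_size, allocated_size) in bytes for path."""
    st = os.stat(path)
    # st_size is the logical length of the file. st_blocks counts 512-byte
    # units actually allocated on storage, whatever the filesystem block size.
    return st.st_size, st.st_blocks * 512
```

For a sparse image the second value is smaller than the first; for a fully
allocated one it can be slightly bigger.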

The important check is that the uploaded disk contains exactly st_size bytes,
the same as the source file, and that both contain the same content.
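
One way to do this check, as a rough sketch: download the restored disk again
and compare length and checksum with the source file. The helper names below
are made up for the example, and sha256 is just one reasonable choice of
checksum.

```python
import hashlib
import io

def digest_and_size(stream, buf_size=1024 * 1024):
    """Return (sha256 hex digest, number of bytes read) for a binary stream."""
    h = hashlib.sha256()
    size = 0
    while True:
        buf = stream.read(buf_size)
        if not buf:
            break
        h.update(buf)
        size += len(buf)
    return h.hexdigest(), size

def same_image(src, dst):
    """True if both streams have the same length and the same content."""
    return digest_and_size(src) == digest_and_size(dst)

# In-memory streams stand in for the two image files here.
a = io.BytesIO(b"\0" * 4096)
b = io.BytesIO(b"\0" * 4096)
print(same_image(a, b))
```

With real images you would pass two files opened with mode "rb" instead of
the in-memory streams.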

Nir


>
>
> {
>   "actual_size" : "3221229568", //this is 3GB + 4k
>   "alias" : "vmRestoreDisk",
>   "content_type" : "data",
>   "format" : "raw",
>   "image_id" : "b69363da-620e-4f55-a3c7-1481e85c4164",
>   "propagate_errors" : "false",
>   "provisioned_size" : "3221225472",
>   "shareable" : "false",
>   "sparse" : "true",
>   "status" : "ok",
>   "storage_type" : "image",
>   "total_size" : "0",
>   "wipe_after_delete" : "false",
>   "disk_profile" : {
>     "href" : "/ovirt-engine/api/diskprofiles/555ef5b2-807e-4f21-9a32-0494686515e4",
>     "id" : "555ef5b2-807e-4f21-9a32-0494686515e4"
>   },
>
>
>
> I was expecting it to be 1 GB as was the original disk. But I am able to
> boot the vm and log in and look at the directories, (earlier I was getting
> an error when I opened the console that it was not a bootable disk)
>
>
>
> {
>   "actual_size" : "1389109248",
>   "alias" : "3gbdisk",
>   "content_type" : "data",
>   "format" : "raw",
>   "image_id" : "8fbac55e-0c86-4c0b-911b-f5b0a6722834",
>   "propagate_errors" : "false",
>   "provisioned_size" : "3221225472",
>   "shareable" : "false",
>   "sparse" : "true",
>   "status" : "ok",
>
>
>
>
>
> I am going through the /var/log/ovirt-imageio-daemon logs to check for any
> clues. In the meantime, do let us know your thoughts on why this may have
> happened.
>
> (we are taking your performance related comments seriously and will work
> on it once we are done with this)
>
>
>
> Thanks
>
> Ranjit
>
>
>
> *From:* Nir Soffer [mailto:nsof...@redhat.com]
> *Sent:* Saturday, November 24, 2018 12:17 AM
> *To:* Ranjit DSouza <ranjit.dso...@veritas.com>
> *Cc:* devel <devel@ovirt.org>; Pavan Chavva <pcha...@redhat.com>;
> Suchitra Herwadkar <suchitra.herwad...@veritas.com>; Abhay Marode <
> abhay.mar...@veritas.com>
> *Subject:* [EXTERNAL] Re: mismatch in disk size while uploading a disk in
> chunks using Image Transfer
>
>
>
> On Fri, Nov 23, 2018 at 2:49 PM Ranjit DSouza <ranjit.dso...@veritas.com>
> wrote:
>
> ...
>
> I am trying to upload a snapshot disk in chunks. Everything seems to work
> fine, but I observed that the actual_size after upload is much smaller than
> the actual_size of the original disk.
>
>
>
> Here are the steps:
>
> 1.       Take a snapshot of a vm disk and download it (using Image
> Transfer mechanism). Save it on the file system somewhere.  This disk name
> is *3gbdisk*. It is Raw + sparse. Resides on nfs storage. The size of
> this downloaded file is 3 GB.
>
>
>
>   "actual_size" : "*1389109248*", //1 GB
>
>
>
> This is the allocated size (what du -sh filename will show).
>
>
>
> But in 4.2 we do not yet support detection of zero or unallocated areas in
> the image, so you always download the complete image. Zero or unallocated
> areas are downloaded as zeros.
>
>
>
> ...
>
>  2.       Now create a new floating disk, (raw + sparse), with
> provisioned_size = 3221225472, or 3 GB. This disk name is vmRestoreDisk
>
> 3.       Upload to this disk using Image Transfer API, using libCurl  in
> chunks of 128 MB. This is done in a while loop,  sequentially reading
> portions of the file downloaded in step 1 and uploading these chunks via
> libcurl.  I Use the Transfer URL, not proxy URL.
>
>
>
> Here is the trace of the first chunk. Note the Content-Range and
> Content-Length headers. Start offset = 0, end offset = 134217727 (or 128 MB)
>
>
>
> upload request for chunk, start offset: 0, end offset: 134217727
>
> Upload Started
>
> Header:Content-Range: bytes 0-134217727/3221225472
>
>
>
> The Content-Range header looks correct...
>
>
>
> Header:Content-Length: 3221225472
>
> *   Trying 10.210.46.215...
>
> * TCP_NODELAY set
>
> * Connected to pnm86hpch30bl15.pne.ven.veritas.com (10.210.46.215) port
> 54322 (#0)
>
> * ALPN, offering http/1.1
>
> * Cipher selection:
> ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
>
> * SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
>
> * ALPN, server did not agree to a protocol
>
> * Server certificate:
>
> *  subject: O=pne.ven.veritas.com; CN=pnm86hpch30bl15.pne.ven.veritas.com
>
> *  start date: Oct  7 08:55:24 2018 GMT
>
> *  expire date: Oct  7 08:55:24 2023 GMT
>
> *  issuer: C=US; O=pne.ven.veritas.com;
> CN=pravauto20.pne.ven.veritas.com.59289
>
> *  SSL certificate verify result: unable to get local issuer certificate
> (20), continuing anyway.
>
> > PUT /images/8ebc9fa8-d322-423e-8a14-5e46ca10ed4e HTTP/1.1
>
> Host: pnm86hpch30bl15.pne.ven.veritas.com:54322
>
> Accept: */*
>
> Content-Length: 134217728
>
> Expect: 100-continue
>
>
>
> But you did not send the Content-Range header for this request...
>
>
>
>
>
> * Done waiting for 100-continue
>
> * We are completely uploaded and fine
>
> * HTTP 1.0, assume close after body
>
> < HTTP/1.0 200 OK
>
>
>
> The request was successful, writing the first 128 MiB...
>
>
>
> < Date: Fri, 23 Nov 2018 11:52:53 GMT
>
> < Server: WSGIServer/0.1 Python/2.7.5
>
> < Content-Type: application/json; charset=UTF-8
>
> < Content-Length: 0
>
> <
>
> * Closing connection 0
>
> http response code from curl 200
>
> Upload Finished. Return Value: 0
>
>
>
> Looking in the attached trace, you never sent the Content-Range, so imageio
> happily wrote all chunks to the start of the image...
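
To make the expected headers concrete, here is a small Python sketch that
computes the Content-Range value for every chunk, in the same form as the one
printed in the trace above. The function name is hypothetical; the end offset
is inclusive, as in the trace.

```python
def content_ranges(total_size, chunk_size):
    """Yield (start, end, header_value) per chunk; end is inclusive."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size > 0")
    for start in range(0, total_size, chunk_size):
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size, total_size) - 1
        yield start, end, "bytes %d-%d/%d" % (start, end, total_size)

# A 3 GiB image sent in 128 MiB chunks needs 24 requests.
for start, end, value in content_ranges(3221225472, 134217728):
    print("Content-Range:", value)
```

Every PUT request must carry its own value, and the Content-Length of each
request is end - start + 1, not the size of the whole image.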
>
>
>
> 4.       Finalize the Image Transfer after all chunks are uploaded.
> Observed that the disk status goes from ‘uploading via API’ to finalizing
> to OK.
>
> 5.       Do a GET call on the disk (vmRestoreDisk).
>
>   "actual_size" : "*134217728*", //128MB
>
>
>
> Which explains why the file size is smaller than expected.
>
>
>
>   "alias" : "vmRestoreDisk",
>
>   "content_type" : "data",
>
>   "format" : "*raw*",
>
>   "image_id" : "3eda3df2-514a-4e78-b999-1729216b25db",
>
>   "propagate_errors" : "false",
>
>   "provisioned_size" : "3221225472",
>
>   "shareable" : "false",
>
>   "*sparse*" : "*true*",
>
>   "status" : "ok",
>
>   "storage_type" : "image",
>
>   "total_size" : "0",
>
>   "wipe_after_delete" : "false",
>
>
>
> As you can see, the actual size is just 128 MB, not 1 GB.  I have attached
> the logs of the upload operation. I think I may be missing something, let
> me know in case you need further information.
>
>
>
> Please always include the relevant part from
>
> /var/log/ovirt-imageio-daemon/daemon.log
>
>
>
> If you check this log you will find that all requests for this upload have:
>
>
>
> WRITE offset=0 size=134217728 ...
>
>
>
> Other issues I see in the attached trace:
>
>
>
> - You close the connection after every request - this is not needed and
>   reduces throughput. Use the same connection for the entire upload.
>
>
>
> - libcurl sends the "Expect: 100-continue" header, but imageio does not
>   handle this yet in 4.2. This may cause a 1 second delay for every request,
>   while libcurl waits for the "100 Continue" response before sending the
>   payload. This feature should be available in 4.3[4]. Until it is
>   supported, it would be a good idea to disable the 100-continue header in
>   libcurl[5]. If you cannot disable the option, you can change the
>   timeout[6] to avoid the delay.
>
>
>
> - You don't check the server capabilities using an OPTIONS[0] request.
>   Every upload should start by checking the server capabilities so you can
>   optimize the upload using zero and flush operations.
>
>
>
> - You don't use the ?flush=no query string - this is recommended for
>   improving performance. If you use flush=no, you should send a
>   PATCH/flush[1] request at the end of the transfer.
>
>
>
> - It would be more efficient to send bigger chunks. The right chunk size
>   depends on the amount of data you are willing to resend if a request
>   fails.
>
>
>
> - You can speed up the upload if you detect zero areas in the image and
>   send them using a PATCH/zero[2] request.
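
As a rough sketch of the zero detection idea, the Python below reads an image
chunk by chunk and decides, for each chunk, whether it could be sent as a zero
request or has to be written as data. The function names and the "zero" and
"write" labels are made up for the example, and the JSON field names in
zero_request_body are an assumption that should be checked against the
imageio random I/O documentation before use.

```python
import io
import json

def plan_upload(stream, chunk_size):
    """Yield (kind, offset, size) for each chunk read from stream.

    kind is "zero" for a chunk containing only zero bytes, else "write".
    """
    offset = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        # bytes(n) is n zero bytes, so this is an all-zero test.
        kind = "zero" if data == bytes(len(data)) else "write"
        yield kind, offset, len(data)
        offset += len(data)

def zero_request_body(offset, size):
    # Assumed body of a PATCH/zero request; verify the field names
    # against the imageio random I/O documentation.
    return json.dumps({"op": "zero", "offset": offset, "size": size})

# An in-memory stream stands in for the image file here.
image = io.BytesIO(b"\0" * 8 + b"data" + b"\0" * 4)
for kind, offset, size in plan_upload(image, 4):
    print(kind, offset, size)
```

A real client would also merge adjacent zero chunks into one request, and
send the "write" chunks with PUT and the matching Content-Range header.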
>
>
>
> For an example using all these features, see the imageio python client[3].
> If you can use the client, you will get all this for free. Otherwise you
> can use it as example code for implementing the upload in another language.
>
>
>
> [0] http://ovirt.github.io/ovirt-imageio/random-io.html#options
>
> [1] http://ovirt.github.io/ovirt-imageio/random-io.html#flush-operation
>
> [2] http://ovirt.github.io/ovirt-imageio/random-io.html#zero-operation
>
> [3]
> https://github.com/oVirt/ovirt-imageio/blob/master/common/ovirt_imageio_common/client.py
>
> [4] https://bugzilla.redhat.com/1512324
>
> [5] https://curl.haxx.se/mail/lib-2017-07/0013.html
>
> [6] https://curl.haxx.se/libcurl/c/CURLOPT_EXPECT_100_TIMEOUT_MS.html
>
>
>
> Nir
>
>
>
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/TQYWZAY3U2LU7BA4A2SDOSI552O2M4C7/
