didier chavaroche wrote:
> Ok, I recognize trying to clone 23 disks a once can be a little bit
> hard for my system.
Yes.  To say the least!  That could be a very large amount of data,
all divided by the total available bus bandwidth, which will be
around 100MB/s.

> It is giving a hard time to the disk cache.

It will be giving a hard time to everything.

> But why can't I do it like in 2 or 3 step. cloning 8 drives at once,
> then another 8 then another 8.
>
> I tried this way and cloning the first 8 works fine I have a
> transfert rate about 100MB/s.
>
> But then for the second part, cloning the 8 drives leads to a
> transfert rate about 20MB/s.

I can't understand what you are saying here.  You can clone the
first 8 drives okay, but then trying to repeat that a second time is
not as fast?  Is that what you are saying?

Previously you wrote:
> Technicaly I don't see any failure. It is just that my tranfert rate
> is around 100MB/s for the first half then it slows down to around
> 30MB/s for the second half which gravely affect the productivity of
> my system. If it comes from the system there should be some command
> that I could use to restore 100MB/s on the second half of the
> cloning process like flushing some cache or buffer.

That reads to me as if the system was filling the ram buffer cache
for the first part, and then once that was full the kernel needed to
block until space was freed up for more data by writing the cached
data out to disk.  Your raw system bus I/O speed seems to be around
30MB/s and it was only faster at first due to the ability to cache
to ram.  Once that ran out it was limited to the physical bandwidth
available.  That is all pretty normal.

> I don't understand why the cache is stil so trashed after cloning
> disks.

How big an image are you reading and writing?  How big is your
system RAM?  How much of it is available for file system buffer
cache?

Let me construct an example.  Let's assume a 500M image, a 1G ram
system, 500M used for userland, and 500M available for file system
buffer cache.  That is a fairly happy system for general purpose
use.
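On Linux you can read the figures in that example straight from
/proc/meminfo.  A small sketch (the 1G / 500M numbers above are only
the example's assumption; your totals will differ):

```shell
# Show how much RAM the kernel currently uses as file system cache.
# Linux-specific; /proc/meminfo reports values in kB.
awk '/^(MemTotal|MemFree|Buffers|Cached):/ {printf "%-10s %8d kB\n", $1, $2}' /proc/meminfo
```

Watching the Cached figure while a clone runs shows the buffer cache
filling up, which is exactly the effect described above.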
But if you try to copy the 500M image, it is almost small enough to
be cached -- but it won't quite fit.  Therefore accessing that image
repeatedly will cause the final bits of it to force the kernel to
flush out earlier bits in order to make room.  Then when copying
again those flushed-out bits will need to be read from disk again.
That will push out some later bits.  When reading those later bits
they will need to be read from disk again too.

Trying to run many asynchronous processes in parallel makes the
problem worse.  They won't be in sync with each other.  This will
cause a more severe thrashing of the cache.

A guideline I use is to have at least twice as much buffer cache
available as any image that you wish to copy.  This may be excessive
in some cases.  There are many combinations possible when system
tuning.  And you haven't given any details yet.

Also the system will have a limited amount of I/O bus bandwidth
available.  Today's typical systems will be around 100MB/s.  The sum
of all of the write bandwidth will share that total data pipeline
bandwidth.  In other words one copy process may be able to achieve
100MB/s.  Running five in parallel would divide down to 20MB/s each.
However system resources are needed for the kernel to do DMA, and
there are other additional resource limitations.  If the system
doesn't have enough ram I would expect performance to fall off after
a certain number of parallel processes.

> Is there a way to restore or clean it so I have a Transfert rate
> about 100MB/s on all my cloning steps?

I would ensure that you have sufficient ram in the system.  Lack of
ram is the most typical problem causing performance loss.  Monitor
the amount of file system cache available during the task run.
Personally I like the 'htop' tool's mem bargraph display best for
this.  I like 'iotop' for monitoring I/O.

I would keep the source image in a tmpfs ramdisk to avoid any
storage I/O used for reading the image.
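Put together, the tmpfs staging and the parallel clone might look
like the sketch below.  The image path, tmpfs size, and target
device names are all assumptions -- substitute your own.  It needs
root and GNU dd (for oflag=nocache, which hints the kernel not to
keep the written data in the buffer cache):

```shell
#!/bin/sh
# Sketch only: /path/to/disk.img and /dev/sd{b,c,d} are placeholders.

# Stage the source image in a tmpfs ramdisk so every clone reads
# from ram instead of competing for storage read bandwidth.
mkdir -p /mnt/imgsrc
mount -t tmpfs -o size=1g tmpfs /mnt/imgsrc   # size > image size
cp /path/to/disk.img /mnt/imgsrc/disk.img

# Clone to the target disks in parallel.  oflag=nocache asks the
# kernel not to cache the written data.
for dev in /dev/sdb /dev/sdc /dev/sdd; do
    dd if=/mnt/imgsrc/disk.img of="$dev" bs=4M oflag=nocache &
done
wait   # block until every background dd has finished
```

The `wait` at the end matters: each dd runs in the background, and
the script should not tear down the tmpfs until all of them finish.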
Keeping the source image in ram will improve the parallel copy
performance by removing the read I/O for the source image.

I would use dd oflag=nocache as a hint to the kernel not to cache
the data in the file system buffer cache.  If your combination of
kernel and file system supports this hint through the entire stack
then the kernel will avoid caching that I/O in the file system
buffer cache and will write through to disk.  Since your task is
writing to disk I don't see any performance downside, as you will
need to wait for the I/O to complete regardless.  The positive
benefit is that it won't thrash the cache.  This requires the kernel
and file system to support this hint.

Note that dd iflag=nocache is also available.  I don't think that it
is needed when using a tmpfs.  I think that the kernel tmpfs driver
has a high order understanding of storage and ram and won't cache
the ramdisk.  Obviously ram caching a ram file system would be
redundant.

In order to scale total I/O bandwidth above the amount available to
a single system I would use multiple systems in parallel.  For
example if each system had eight SATA ports and you ran three
systems in parallel then you could clone 24 disks on three systems
in the same time as 8 disks on one system.

Bob