Re: Problem using dd within bash script

Sebastian Rückerl Mon, 28 Apr 2014 06:26:30 -0700

On Sat, Apr 26, 2014 at 11:41:45AM -0600, Bob Proulx wrote:
> Pádraig Brady wrote:
> > Sebastian Rückerl wrote:
> > > Everything seems to work fine as long as I am not trying to
> > > interrupt this process (using CTRL-C).
> > > Only in this case everything just freezes for some time (for about
> > > 1 minute the terminal is blocked) even if I try to kill it from
> > > another terminal using its pid. This includes that all further
> > > CTRL-C are just printed to the terminal and nothing happens.
> > >
> > > My workaround now is to do all the signal handling on my
> > > own. This works if my only goal is to keep the terminal
> > > responsive, but it does not kill the dd (which can still be
> > > seen using ps ax | grep dd). Anyway, dd is somehow interrupted
> > > (but not killed completely), as it no longer responds to
> > > kill -USR1.
> >
> > dd does catch ^C, but it processes pending interrupts before each
> > read() and write(), and those read() and write() calls should return
> > with EINTR.
> > Therefore it seems like the kernel is blocking these signals?
> 
> I was thinking the same thing.  I was thinking that the dd process
> must be in the kernel blocked queue waiting for I/O.  That is why it
> doesn't exit quickly after being sent kill signals.  Processes blocked
> waiting for I/O aren't actually "running" at that moment and therefore
> can't handle the signal and can't die in the normal sense until the
> I/O completes and they get back into the run queue and run.
> 
> > > Can those (still somehow running) dd instances do any harm? Or are
> > > they just waiting for something before they can be destroyed?
> 
> Those processes that are still there after you try to kill them have
> performed a read(2) or write(2) operation and are waiting on the
> kernel to return control to them.  There isn't anything that can be
> done outside of the kernel to change their state.  The programs are
> not actually "running" as control has been passed to the kernel to
> perform the I/O.  As soon as the kernel finishes the write(2) and
> returns to the program then the program will handle the signal and
> exit.
> 
> > Note I've seen the Linux kernel behave badly here, especially with
> > slow devices, where it caches too much before it starts writing out
> > a lot of data which then blocks other stuff. It caches based on free
> > RAM rather than on attributes of the sink device. This is a bug in
> > the kernel IMHO.
> 
> Yes.  And people tuning systems often do the opposite of what I think
> should be done.  It is really hard to do good optimization without
> objective benchmarks to provide data.
>


First of all, thank you for this explanation. If I got it right, this is
related to unfinished I/O operations, which would explain this behaviour.
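
A minimal sketch of how one could check whether a lingering dd is in that
blocked state, assuming a Linux system with /proc mounted. The helper name
proc_state is hypothetical and not something from this thread:

```shell
#!/bin/bash
# Illustrative sketch; proc_state is a hypothetical helper name.
# Prints the one-letter process state of a PID as reported by Linux in
# /proc/<pid>/status:
#   D = uninterruptible sleep (usually waiting for I/O)
#   S = interruptible sleep, R = running or runnable
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}
```

Called as proc_state "$dd_pid", a result of D would be consistent with the
explanation above: the process is inside the kernel waiting for I/O and can
only act on a signal once that call returns.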

> > You could avoid large parts of the Linux VM by using O_DIRECT:
> >
> >   dd if=$1 of=$2 bs=4M oflag=direct
> 
> Good suggestion!
> 

This really helps to keep the delay between pressing CTRL-C and the exit of
my script low. It feels as if it slows down the whole process, but if I got
it right that is only an impression: without it I would have to wait at the
next sync instead (or at the unmount, which should sync, too). So the total
time needed for the whole thing (copy + sync) should still be the same.
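
To make the comparison concrete, here is a rough sketch of the two variants.
The function names copy_buffered and copy_direct are made up for this mail;
only the dd options come from the suggestion above. Whether the totals really
match depends on the device and the kernel, so this is something to measure
rather than a claim:

```shell
#!/bin/bash
# Illustrative sketch; copy_buffered and copy_direct are hypothetical names.

# Buffered copy: dd can return once the data sits in the page cache, so the
# explicit sync is where the wait for the device shows up.
copy_buffered() {
    dd if="$1" of="$2" bs=4M && sync
}

# Direct copy: oflag=direct asks dd to open the output with O_DIRECT, which
# avoids most of the page cache, so the waiting happens during the writes.
# The trailing sync keeps both variants comparable. Note that some
# filesystems reject O_DIRECT, in which case dd fails with an error.
copy_direct() {
    dd if="$1" of="$2" bs=4M oflag=direct && sync
}
```

With bash, each variant can be timed using the time keyword, for example
"time copy_buffered SOURCE TARGET" and "time copy_direct SOURCE TARGET",
where SOURCE and TARGET are placeholders for the image and the device.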

I think I might try this at some point in the future. But in this case it's
not about speed but about having a working script that does the job, even if
it is a little bit slower than it could be in the ideal case.

> > > control_c (){
> > >         echo "Aborting due to interrupt. Disk will most likely be useless."
> > >         kill -9 $dd_pid
> > >         exit 1
> > > }
> > > 
> > > # trap keyboard interrupt (control-c)
> > > trap control_c SIGINT
> 
> In your trap handler don't 'exit 1'.  That washes off the correct exit
> code.  Instead reset your trap handler to the default handler and then
> send the kill signal back to the current process.  That will cause it
> to exit with the correct, kill on signal, information.
> 
>   control_c (){
>           echo "Aborting due to interrupt. Disk will most likely be useless." 1>&2
>           kill -9 $dd_pid
>           trap - INT
>           kill -s INT $$
>   }
>   
>   # trap keyboard interrupt (control-c)
>   trap control_c SIGINT
> 

Is this really a big deal when the process calling this script is just the
bash I started it from on the command line? Or is this something that is
generally important for scripting? I must admit, I have never thought about
it.
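
For reference, a small sketch of what a caller sees in each case. The
function names are hypothetical, and the child shells only signal themselves
instead of running dd. The second variant resets the handler to the default
with "trap - SIGNAL" before re-sending the signal, which is my understanding
of what "reset your trap handler to the default handler" means:

```shell
#!/bin/bash
# Illustrative sketch; both function names are hypothetical.
# Each function starts a child bash that sends itself a signal (default INT)
# and then prints the exit status the parent observes.

# Variant 1: the handler calls 'exit 1', so the parent only sees status 1
# and cannot tell that a signal was involved.
status_with_exit1() {
    bash -c 'trap "exit 1" "$1"; kill -s "$1" $$' _ "${1:-INT}"
    echo $?
}

# Variant 2: the handler restores the default action and re-sends the
# signal, so the child really dies from it and the parent sees 128 + the
# signal number.
status_with_reraise() {
    bash -c 'trap "trap - $1; kill -s $1 $$" "$1"; kill -s "$1" $$' _ "${1:-INT}"
    echo $?
}
```

A calling script can use the second kind of status to notice that the child
was interrupted and stop as well, instead of carrying on with its next step.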
