Re: CVS commit: src/sys

2016-11-07 Thread J. Hannken-Illjes

> On 07 Nov 2016, at 14:04, Christos Zoulas  wrote:
> 
> In article 
> ,
> JaromĆ­r DoleÄ ek   wrote:
>> 
>> Anyway, IMO eventual change to allow fsck to DTRT after crash in
>> ffs_truncate() should go separately from the crash fix.
> 
> While fsck could be fixed, IMO, it is best for the kernel to try to leave
> things as consistent as possible avoid scary noise from fsck and to keep
> recovery simpler.

It is impossible to fix fsck here:  fsck gets a file system where some
inodes have the same block allocated.  It is impossible to determine
which allocation is wrong as its inode was in truncation.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: CVS commit: src/sys

2016-11-07 Thread Kamil Rytarowski
On 07.11.2016 22:39, J. Hannken-Illjes wrote:
> My patch contains corruption issues only and it passes ATF and
> it passes my stress test which is a bit more than just some fsx.
> 
> As -current currently corrupts file systems we should either fix it very
> soon or revert your changes completely.
> 

Could we please revert it and recommit it once will be fixed. Some
people need to run NetBSD-current that is literally from yesterday and
cannot track wapbl-related patches.

Also the first line of defense should be the automatic testing
framework, not a computer of a user/developer doing the daily job on NetBSD.

Perhaps a rhetorical question, but is it viable to add more ATF tests
for filesystem code, especially before altering anything functional in it?



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/sys

2016-11-07 Thread J. Hannken-Illjes

> On 07 Nov 2016, at 21:53, Jaromír Doleček  wrote:
> 
> 2016-11-07 12:11 GMT+01:00 J. Hannken-Illjes :
>> It gets used when a block of block pointers has been deallocated and
>> brelsed, that is NOT written to disk.  If we allow the deallocation of
>> this block to fail and ufs_truncate_retry() runs ffs_truncate again
>> it will read the block from disk and free already freed blocks.
> 
> Good point. The same is however true for ALL the calls in the chain,
> so then we should have to really use FORCE everywhere.

No.  If one of the other calls (clearing one pointer in a block) fails
we already bwrite() (if (error != 0 ...).  In the wapbl case writing
all blocks would result in transitions too big for the log.  I tried it ...

> One solution is to always bwrite() or bdwrite() back even fully zeroed
> block for wapbl case instead of brelse(BC_INVAL). I think that for
> fsck to reliably recover from crash within truncate, this might
> actually be needed also for !wapbl case.

No.  The !wapbl case is safe.  The block is either written after
the copy (before we change it) or from ffs_truncate() when it
sets "sync = 1".  For wapbl case see above.

> Another way how to solve this would be to try to register the
> deallocation and on error bail out, before the diving. It would
> require cancelling the registration if the diving call returns EAGAIN
> however.
> 
>> You mean this:
>> 
>>error = ffs_indirtrunc(ip, nlbn, FFS_FSBTODB(fs, nb),
>>(daddr_t)-1, level - 1, countp);
>> -   if (error)
>> -   goto out;
>> +   if (error == EAGAIN)
>> +   goto out;
>> +   else if (error && !allerror)
>> +   allerror = error;
> 
> No, I mean the copy logic and big blocks with condition on
> ip->i_ump->um_mountp->mnt_wapbl.

The copy logic is here to prevent corruption beyond fsck capabilities
and therefore we need in the !wapbl case.  There IS a reason it
was here since the beginning of time.

Not sure if it would make sense to split it into ffs_indir_trunc()
and ffs_indir_trunc_wapbl().  Possibly easier to understand but
prone to errors fixed in only one of them.

>> I don't understand ... do you want to split into two diffs?
> 
> I prefer to split commits into incremental changes if reasonably
> possible, so it's easier to review and revise. That's all. That's why
> I prefer the fix for immediate corruption to go separately.

My patch contains corruption issues only and it passes ATF and
it passes my stress test which is a bit more than just some fsx.

As -current currently corrupts file systems we should either fix it very
soon or revert your changes completely.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: CVS commit: src/sys

2016-11-07 Thread Jaromír Doleček
2016-11-07 12:11 GMT+01:00 J. Hannken-Illjes :
> It gets used when a block of block pointers has been deallocated and
> brelsed, that is NOT written to disk.  If we allow the deallocation of
> this block to fail and ufs_truncate_retry() runs ffs_truncate again
> it will read the block from disk and free already freed blocks.

Good point. The same is however true for ALL the calls in the chain,
so then we should have to really use FORCE everywhere.

One solution is to always bwrite() or bdwrite() back even fully zeroed
block for wapbl case instead of brelse(BC_INVAL). I think that for
fsck to reliably recover from crash within truncate, this might
actually be needed also for !wapbl case.

Another way how to solve this would be to try to register the
deallocation and on error bail out, before the diving. It would
require cancelling the registration if the diving call returns EAGAIN
however.

> You mean this:
>
> error = ffs_indirtrunc(ip, nlbn, FFS_FSBTODB(fs, nb),
> (daddr_t)-1, level - 1, countp);
> -   if (error)
> -   goto out;
> +   if (error == EAGAIN)
> +   goto out;
> +   else if (error && !allerror)
> +   allerror = error;

No, I mean the copy logic and big blocks with condition on
ip->i_ump->um_mountp->mnt_wapbl.

> I don't understand ... do you want to split into two diffs?

I prefer to split commits into incremental changes if reasonably
possible, so it's easier to review and revise. That's all. That's why
I prefer the fix for immediate corruption to go separately.

Jaromir


Re: CVS commit: src

2016-11-07 Thread Alexander Nasonov
Joerg Sonnenberger wrote:
> At least for my evbppc64 build this fails to link because it uses
> mktemp and -Wl,--fatal-errors. Does it really need a random file name
> and can't just use an easier to predict for debugging name in the
> current directory?

New change is committed.

Alex


Re: CVS commit: src/sys

2016-11-07 Thread Christos Zoulas
In article ,
Jaromír DoleÄ ek   wrote:
>
>Anyway, IMO eventual change to allow fsck to DTRT after crash in
>ffs_truncate() should go separately from the crash fix.

While fsck could be fixed, IMO, it is best for the kernel to try to leave
things as consistent as possible avoid scary noise from fsck and to keep
recovery simpler.

christos



Re: CVS commit: src

2016-11-07 Thread Joerg Sonnenberger
On Sun, Nov 06, 2016 at 10:54:42AM +, Alexander Nasonov wrote:
> Module Name:  src
> Committed By: alnsn
> Date: Sun Nov  6 10:54:42 UTC 2016
> 
> Modified Files:
>   src/distrib/sets/lists/tests: mi
>   src/tests/dev/cgd: Makefile
> Added Files:
>   src/tests/dev/cgd: t_cgd_aes.c
> 
> Log Message:
> Add tests for not-yet-committed cgd algorithm AES-XTS.
> 
> The tests are marked as expected failures.

At least for my evbppc64 build this fails to link because it uses
mktemp and -Wl,--fatal-errors. Does it really need a random file name
and can't just use an easier to predict for debugging name in the
current directory?

Joerg


Re: CVS commit: src/sys

2016-11-07 Thread J. Hannken-Illjes

> On 07 Nov 2016, at 11:54, Jaromír Doleček  wrote:
> 
> 2016-11-07 10:25 GMT+01:00 J. Hannken-Illjes :
>> The first part should not be necessary.  After the loop we should have
>> "i == last" -- from a quick look "i < last" is impossible.
> 
> Yes, I know - it's just to make the diff vs 1.117 smaller and hence
> easier to review. I want to commit this separately.
> 
>> The second part is right and responsible for most of the panics.
> 
> Right.
> 
>> Your patch still doesn't address my second observation, if the
>> machine crashs inside ffs_truncate we leave the file system in a
>> state fsck can't handle.  The blocks we freed get attached to other
>> nodes before they get cleared from our on-disk copy leading to
>> duplicate blocks.
> 
> That patch is not really very ideal.
> 
> One is UFS_WAPBL_REGISTER_DEALLOCATION_FORCE(), which should only be
> used as desperate measure.

It gets used when a block of block pointers has been deallocated and
brelsed, that is NOT written to disk.  If we allow the deallocation of
this block to fail and ufs_truncate_retry() runs ffs_truncate again
it will read the block from disk and free already freed blocks.

> Second is that the patch adds IMO too much difference for wapbl and
> !wapbl case, to already quite complicated function. Also it's slightly
> confusing that the !wapbl change now doesn't support failure in the
> middle any more - it will just leave the block inconsistent. I need to
> think a little about this, maybe there is some better way.

The wapbl and !wapbl cases ARE different.

You mean this:

error = ffs_indirtrunc(ip, nlbn, FFS_FSBTODB(fs, nb),
(daddr_t)-1, level - 1, countp);
-   if (error)
-   goto out;
+   if (error == EAGAIN)
+   goto out;
+   else if (error && !allerror)
+   allerror = error;

> Anyway, IMO eventual change to allow fsck to DTRT after crash in
> ffs_truncate() should go separately from the crash fix.

I don't understand ... do you want to split into two diffs?

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: CVS commit: src/sys

2016-11-07 Thread Jaromír Doleček
2016-11-07 10:25 GMT+01:00 J. Hannken-Illjes :
> The first part should not be necessary.  After the loop we should have
> "i == last" -- from a quick look "i < last" is impossible.

Yes, I know - it's just to make the diff vs 1.117 smaller and hence
easier to review. I want to commit this separately.

> The second part is right and responsible for most of the panics.

Right.

> Your patch still doesn't address my second observation, if the
> machine crashs inside ffs_truncate we leave the file system in a
> state fsck can't handle.  The blocks we freed get attached to other
> nodes before they get cleared from our on-disk copy leading to
> duplicate blocks.

That patch is not really very ideal.

One is UFS_WAPBL_REGISTER_DEALLOCATION_FORCE(), which should only be
used as desperate measure.

Second is that the patch adds IMO too much difference for wapbl and
!wapbl case, to already quite complicated function. Also it's slightly
confusing that the !wapbl change now doesn't support failure in the
middle any more - it will just leave the block inconsistent. I need to
think a little about this, maybe there is some better way.

Anyway, IMO eventual change to allow fsck to DTRT after crash in
ffs_truncate() should go separately from the crash fix.

Jaromir


Re: CVS commit: src/external/gpl3/gdb/dist/gdb

2016-11-07 Thread Martin Husemann
On Mon, Nov 07, 2016 at 06:38:54PM +1100, matthew green wrote:
> can we put this function into a netbsd-common file that all
> ports can reference, rather than repeating it?  ie, this is
> identical to the arm version, and probably others.

Yes please, IIUC Nick and Rin are working on that.

Martin


Re: CVS commit: src/sys

2016-11-07 Thread J. Hannken-Illjes

> On 07 Nov 2016, at 00:11, Jaromír Doleček  wrote:
> 
>> On Fri, Nov 04, 2016 at 04:44:10PM +0100, J. Hannken-Illjes wrote:
>>> - This change results in "panic: ffs_blkfree_common: freeing free block"
>>>  if I put a file system under stress (*1).
>>> 
>>> - I suppose not zeroing the blocks to be freed before freeing them
>>>  makes the life of fsck harder.
>>> 
>>> - Running "brelse(bp, BC_INVAL)" doesn't look OK.
> 
> The brelse(bp, BC_INVAL) call was there before as well, but the
> condition changed and is wrong.
> 
> I can repeat the problem with your script and the packaged fsx (thanks
> Thomas). Whipped up a patch to what looked wrong there, and it no
> longer panics for me. Patch is attached. I'll test further and commit
> it tomorrow.

Index: ffs_inode.c
===
RCS file: /cvsroot/src/sys/ufs/ffs/ffs_inode.c,v
retrieving revision 1.118
diff -u -p -r1.118 ffs_inode.c
--- ffs_inode.c 28 Oct 2016 20:38:12 -  1.118
+++ ffs_inode.c 6 Nov 2016 23:09:15 -
@@ -675,18 +675,18 @@ ffs_indirtrunc(struct inode *ip, daddr_t
 * Recursively free blocks on the now last partial indirect block.
 */
if (level > SINGLE && lastbn >= 0) {
-   nb = RBAP(ip, last);
+   last = lastbn % factor;
+   nb = RBAP(ip, i);
if (nb != 0) {
error = ffs_indirtrunc(ip, nlbn, FFS_FSBTODB(fs, nb),
-  lastbn % factor, level - 1,
-  countp);
+  last, level - 1, countp);
if (error)
goto out;
}
}
 
 out:
-   if (RBAP(ip, 0) == 0) {
+   if (lastbn < 0 && error == 0) {
/* all freed, release without writing back */
brelse(bp, BC_INVAL);
} else {

The first part should not be necessary.  After the loop we should have
"i == last" -- from a quick look "i < last" is impossible.

The second part is right and responsible for most of the panics.

Your patch still doesn't address my second observation, if the
machine crashs inside ffs_truncate we leave the file system in a 
state fsck can't handle.  The blocks we freed get attached to other
nodes before they get cleared from our on-disk copy leading to
duplicate blocks.

The attached diff:

- Brings back the non-wapbl behaviour to first clear the allocations
  in the on-disk node and then deallocating them.

- Fixes the condition to brelse() on exit like your diff does.

- Makes sure a block that is completely deallocated and NOT written
  to disk gets deallocated even if this deallocation would fail.

Please review.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


ffs_inode.c.diff
Description: Binary data