Re: What is wrong?

2014-03-04 Thread Andrew Ruder
On Tue, Mar 04, 2014 at 10:54:04AM +0200, Leon Pollak wrote:
> I will recheck everything and try. Meanwhile, the news is not good: our 
> guys say that it appears that the additional sync DOES NOT SOLVE the 
> issue.

Gonna be honest, I have a tough time explaining this. :(  Unfortunately
I don't have a board here with a hardware write protect which would make
things easier to verify.

- Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: What is wrong?

2014-03-04 Thread Leon Pollak
Hello, all.

I am really sorry for the silence - I was on a business trip and 
returned today.

I will recheck everything and try. Meanwhile, the news is not good: our 
guys say that it appears that the additional sync DOES NOT SOLVE the 
issue.
I apologize: as I did not know the exact procedure, I was mistaken and 
probably used an already garbage-collected unit for the tests.

Sorry, again.

BR

On Tuesday 04 March 2014 00:33:25 Brian Norris wrote:
> On Mon, Mar 03, 2014 at 03:13:36PM -0600, Andrew Ruder wrote:
> > On Thu, Feb 27, 2014 at 01:22:08PM -0800, Brian Norris wrote:
> > > Perhaps Richard or Andrew can comment on whether this patch should
> > > help you. But I think JFFS2 on NAND uses write-buffered support
> > > which can be affected by this bug.
> > 
> > Definitely sounds like the same issue and I'm kind of glad to see it
> > crop up in another filesystem.
> 
> We haven't confirmed that the *patch* actually affects Leon's problem;
> just that if he runs an additional 'sync' it solves his problem.
> Leon, did you get to try the patch?
> 
> Anyway, should commit 807612db2f9940b9fa6deaef054eb16d51bd3e00 be
> marked for -stable?
> 
> Brian

-- 
Leon


Re: What is wrong?

2014-03-04 Thread Brian Norris
On Mon, Mar 03, 2014 at 03:13:36PM -0600, Andrew Ruder wrote:
> On Thu, Feb 27, 2014 at 01:22:08PM -0800, Brian Norris wrote:
> > Perhaps Richard or Andrew can comment on whether this patch should help
> > you. But I think JFFS2 on NAND uses write-buffered support which can be
> > affected by this bug.
> 
> Definitely sounds like the same issue and I'm kind of glad to see it
> crop up in another filesystem.

We haven't confirmed that the *patch* actually affects Leon's problem;
just that if he runs an additional 'sync' it solves his problem. Leon,
did you get to try the patch?

Anyway, should commit 807612db2f9940b9fa6deaef054eb16d51bd3e00 be marked
for -stable? 

Brian


Re: What is wrong?

2014-03-03 Thread Andrew Ruder
On Thu, Feb 27, 2014 at 01:22:08PM -0800, Brian Norris wrote:
> Perhaps Richard or Andrew can comment on whether this patch should help
> you. But I think JFFS2 on NAND uses write-buffered support which can be
> affected by this bug.

Definitely sounds like the same issue and I'm kind of glad to see it
crop up in another filesystem.  Also glad you Cc'd me with the URL
because I had the painful task of recreating this issue on another
filesystem on my TODO list as I didn't think it had ever been committed.

Cheers,
Andy


Re: What is wrong?

2014-02-27 Thread Brian Norris
+ others

Hi Leon,

Can you please keep the CC list intact? And please try to reply below
the quotes and trim context, rather than top-posting. Thanks!

On Thu, Feb 27, 2014 at 02:00:25PM +0200, Leon Pollak wrote:
> I am VERY(!) thankful to you for the answer.
> First, I am relieved that there is no error on my side and the 
> system remains clean despite these messages.
> Second, yes, the workaround worked.

That's nice to hear, but that is (as you note) a workaround. You should
not need an extra sync after remounting read-only. Do you think you can
try the linked patch?

commit 807612db2f9940b9fa6deaef054eb16d51bd3e00
Author: Andrew Ruder <andrew.ru...@elecsyscorp.com>
Date:   Thu Jan 30 09:26:54 2014 -0600

    fs/super.c: sync ro remount after blocking writers

Perhaps Richard or Andrew can comment on whether this patch should help
you. But I think JFFS2 on NAND uses write-buffered support which can be
affected by this bug.

> Many thanks to you for your help!!!

You're welcome.

I have a few other questions: are you using NOR or NAND (it looks like
maybe NAND)?

Leaving most context intact for others, below.

> On Wednesday 26 February 2014 17:11:48 you wrote:
> > On Wed, Feb 26, 2014 at 04:07:21PM +0200, Leon Pollak wrote:
> > > The NAND is write protected by HW and the partition is mounted as
> > > RO.
> > > At some moment I need to update a small file.
> > > So I do:
> > > - HW write protect off,
> > > - remount RW,
> > > - update file,
> > > - sync,
> > > - remount RO,
> > > - write protect on.
> > > 
> > > Looking at linux console I see a lot of messages like:
> > > Erase at 0x0040 failed immediately: errno -5
> > > Erase at 0x003e failed immediately: errno -5
> > > ..
> > > Erase at 0x0034 failed immediately: errno -5
> > > jffs2_flush_wbuf(): Write failed with -5
> > > Write of 2016 bytes at 0x002578a0 failed. returned -5, retlen 0
> > > > Not marking the space at 0x002578a0 as dirty because the flash driver returned retlen zero.
> > > 
> > > 
> > > > This is repeated for a long time, but everything seems to work OK.
> > > > Subsequent starts and even file updates are also OK, without
> > > > error messages.
> > > 
> > > What do I do wrong? Thanks a lot.
> > 
> > It's possible you're seeing symptoms of this:
> > 
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=807612db2f9940b9fa6deaef054eb16d51bd3e00
> > 
> > It seems like maybe JFFS2 is still doing some GC and/or write flushing
> > after the remount.
> > 
> > > Could you try this?
> > 
> >  - HW write protect off,
> >  - remount RW,
> >  - update file,
> >  - sync,
> >  - remount RO,
> >  - sync, <-- add this, to see if you're experiencing any
> >  writeback after remount
> >  - write protect on.
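
For anyone who wants to script the diagnostic sequence quoted above, here is a minimal Python sketch. It only builds and (optionally) runs the same seven steps, including the extra sync after the read-only remount. The function names, the mount point, and the write-protect commands are hypothetical placeholders: how the hardware write protect is toggled is board-specific and is not described in this thread. The extra sync is a diagnostic and a workaround at most, not a fix.

```python
import subprocess


def remount_update_steps(mount_point, wp_off_cmd, wp_on_cmd, update_cmd):
    """Build the command sequence from the thread as a list of argv lists.

    wp_off_cmd / wp_on_cmd are placeholders for whatever board-specific
    command toggles the hardware write protect (an assumption, not from
    the thread). update_cmd is the command that updates the small file.
    """
    return [
        wp_off_cmd,                                   # HW write protect off
        ["mount", "-o", "remount,rw", mount_point],   # remount RW
        update_cmd,                                   # update file
        ["sync"],                                     # sync
        ["mount", "-o", "remount,ro", mount_point],   # remount RO
        ["sync"],                                     # extra sync, to catch writeback after remount
        wp_on_cmd,                                    # write protect on
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in steps:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing step,
            # so write protect is not re-enabled over a failed remount.
            subprocess.run(cmd, check=True)
```

A dry run such as `run_steps(remount_update_steps("/mnt/cfg", ["wp-off"], ["wp-on"], ["cp", "new.conf", "/mnt/cfg/app.conf"]))` just prints the commands in order, which makes it easy to compare the sequence against what the board actually executes before running it for real.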

Regards,
Brian

