Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-03 Thread Dan Williams
On Tue, May 3, 2016 at 8:18 PM, Dave Chinner wrote: > On Tue, May 03, 2016 at 10:28:15AM -0700, Dan Williams wrote: >> On Mon, May 2, 2016 at 6:51 PM, Dave Chinner wrote: >> > On Mon, May 02, 2016 at 04:25:51PM -0700, Dan Williams wrote: >> [..] >> > Yes, I know, and it doesn't answer any of the

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-03 Thread Dave Chinner
On Tue, May 03, 2016 at 10:28:15AM -0700, Dan Williams wrote: > On Mon, May 2, 2016 at 6:51 PM, Dave Chinner wrote: > > On Mon, May 02, 2016 at 04:25:51PM -0700, Dan Williams wrote: > [..] > > Yes, I know, and it doesn't answer any of the questions I just > > asked. What you just told me is that

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-03 Thread Dave Chinner
On Tue, May 03, 2016 at 06:30:04PM +, Rudoff, Andy wrote: > > > >And when the filesystem says no because the fs devs don't want to > >have to deal with broken apps because app devs learn that "this is a > >go fast knob" and data integrity be damned? It's "fsync is slow so I > >won't use it"

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-03 Thread Rudoff, Andy
> >And when the filesystem says no because the fs devs don't want to >have to deal with broken apps because app devs learn that "this is a >go fast knob" and data integrity be damned? It's "fsync is slow so I >won't use it" all over again... ... > >And, please keep in mind: many application

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-03 Thread Dan Williams
On Mon, May 2, 2016 at 6:51 PM, Dave Chinner wrote: > On Mon, May 02, 2016 at 04:25:51PM -0700, Dan Williams wrote: [..] > Yes, I know, and it doesn't answer any of the questions I just > asked. What you just told me is that there is something that is kept > three levels of abstraction away from

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Dave Chinner
On Tue, May 03, 2016 at 01:26:46AM +, Rudoff, Andy wrote: > > >> The takeaway is that msync() is 9-10x slower than userspace cache > >> management. > > > >An alternative viewpoint: that flushing clean cachelines is > >extremely expensive on Intel CPUs. ;) > > > >i.e. Same numbers, different

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Dave Chinner
On Mon, May 02, 2016 at 04:25:51PM -0700, Dan Williams wrote: > On Mon, May 2, 2016 at 4:04 PM, Dave Chinner wrote: > > On Mon, May 02, 2016 at 11:18:36AM -0400, Jeff Moyer wrote: > >> Dave Chinner writes: > >> > >> > On Mon, Apr 25, 2016 at 11:53:13PM +, Verma, Vishal L wrote: > >> >> On

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Rudoff, Andy
>> The takeaway is that msync() is 9-10x slower than userspace cache management. > >An alternative viewpoint: that flushing clean cachelines is >extremely expensive on Intel CPUs. ;) > >i.e. Same numbers, different analysis from a different PoV, and >that gives a *completely different

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Dave Chinner
On Mon, May 02, 2016 at 10:53:25AM -0700, Dan Williams wrote: > On Mon, May 2, 2016 at 8:18 AM, Jeff Moyer wrote: > > Dave Chinner writes: > [..] > >> We need some form of redundancy and correction in the PMEM stack to > >> prevent single sector errors from taking down services until an > >>

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Dan Williams
On Mon, May 2, 2016 at 4:04 PM, Dave Chinner wrote: > On Mon, May 02, 2016 at 11:18:36AM -0400, Jeff Moyer wrote: >> Dave Chinner writes: >> >> > On Mon, Apr 25, 2016 at 11:53:13PM +, Verma, Vishal L wrote: >> >> On Tue, 2016-04-26 at 09:25 +1000, Dave Chinner wrote: >> > You're assuming

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Verma, Vishal L
On Tue, 2016-05-03 at 09:04 +1000, Dave Chinner wrote: > On Mon, May 02, 2016 at 11:18:36AM -0400, Jeff Moyer wrote: > > > > Dave Chinner writes: > > > > > > > > On Mon, Apr 25, 2016 at 11:53:13PM +, Verma, Vishal L wrote: > > > > > > > > On Tue, 2016-04-26 at 09:25 +1000, Dave Chinner

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Dave Chinner
On Mon, May 02, 2016 at 11:18:36AM -0400, Jeff Moyer wrote: > Dave Chinner writes: > > > On Mon, Apr 25, 2016 at 11:53:13PM +, Verma, Vishal L wrote: > >> On Tue, 2016-04-26 at 09:25 +1000, Dave Chinner wrote: > > You're assuming that only the DAX aware application accesses it's > > files.

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Dan Williams
On Mon, May 2, 2016 at 8:18 AM, Jeff Moyer wrote: > Dave Chinner writes: [..] >> We need some form of redundancy and correction in the PMEM stack to >> prevent single sector errors from taking down services until an >> administrator can correct the problem. I'm trying to understand >> where this

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-05-02 Thread Jeff Moyer
Dave Chinner writes: > On Mon, Apr 25, 2016 at 11:53:13PM +, Verma, Vishal L wrote: >> On Tue, 2016-04-26 at 09:25 +1000, Dave Chinner wrote: > You're assuming that only the DAX aware application accesses it's > files. users, backup programs, data replicators, filesystem > re-organisers

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-26 Thread Dan Williams
On Tue, Apr 26, 2016 at 8:31 AM, Jan Kara wrote: > On Tue 26-04-16 07:59:10, Dan Williams wrote: >> On Tue, Apr 26, 2016 at 1:27 AM, Dave Chinner wrote: >> > On Mon, Apr 25, 2016 at 09:18:42PM -0700, Dan Williams wrote: >> [..] >> > It seems to me you are focussing on code/technologies that

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-26 Thread Jan Kara
On Tue 26-04-16 07:59:10, Dan Williams wrote: > On Tue, Apr 26, 2016 at 1:27 AM, Dave Chinner wrote: > > On Mon, Apr 25, 2016 at 09:18:42PM -0700, Dan Williams wrote: > [..] > > It seems to me you are focussing on code/technologies that exist > > today instead of trying to define an architecture

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-26 Thread Dan Williams
On Tue, Apr 26, 2016 at 1:27 AM, Dave Chinner wrote: > On Mon, Apr 25, 2016 at 09:18:42PM -0700, Dan Williams wrote: [..] > It seems to me you are focussing on code/technologies that exist > today instead of trying to define an architecture that is more > optimal for pmem storage systems. Yes,

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-26 Thread Vishal Verma
On Tue, 2016-04-26 at 01:33 -0700, h...@infradead.org wrote: > On Mon, Apr 25, 2016 at 05:14:36PM +, Verma, Vishal L wrote: > > > > - Application hits EIO doing dax_IO or load/store io > > > > - It checks badblocks and discovers it's files have lost data > > > > - It write()s those sectors

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-26 Thread Vishal Verma
On Tue, 2016-04-26 at 10:41 +1000, Dave Chinner wrote: > <> > > The application doesn't have to scan the entire filesystem, but > > presumably it knows what files it 'owns', and does a fiemap for > > those. > You're assuming that only the DAX aware application accesses it's > files.  users,

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-26 Thread h...@infradead.org
On Mon, Apr 25, 2016 at 05:14:36PM +, Verma, Vishal L wrote: > - Application hits EIO doing dax_IO or load/store io > > - It checks badblocks and discovers it's files have lost data > > - It write()s those sectors (possibly converted to file offsets using > fiemap) > * This triggers

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-26 Thread h...@infradead.org
On Mon, Apr 25, 2016 at 11:32:08AM -0400, Jeff Moyer wrote: > > EINVAL is a concern here. Not due to the right error reported, but > > because it means your current scheme is fundamentally broken - we > > need to support I/O at any alignment for DAX I/O, and not fail due to > > alignment

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-26 Thread Dave Chinner
On Mon, Apr 25, 2016 at 09:18:42PM -0700, Dan Williams wrote: > On Mon, Apr 25, 2016 at 7:56 PM, Dave Chinner wrote: > > On Mon, Apr 25, 2016 at 06:45:08PM -0700, Dan Williams wrote: > >> > I haven't seen any design/documentation for infrastructure at the > >> > application layer to handle

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Dan Williams
On Mon, Apr 25, 2016 at 7:56 PM, Dave Chinner wrote: > On Mon, Apr 25, 2016 at 06:45:08PM -0700, Dan Williams wrote: [..] >> Otherwise, if an application wants to use DAX then it might >> need to be prepared to handle media errors itself same as the >> un-RAIDed disk case. Yes, at an

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Dave Chinner
On Mon, Apr 25, 2016 at 06:45:08PM -0700, Dan Williams wrote: > On Mon, Apr 25, 2016 at 5:11 PM, Dave Chinner wrote: > > On Mon, Apr 25, 2016 at 04:43:14PM -0700, Dan Williams wrote: > [..] > >> Maybe I missed something, but all these assumptions are already > >> present for typical block

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Dan Williams
On Mon, Apr 25, 2016 at 5:11 PM, Dave Chinner wrote: > On Mon, Apr 25, 2016 at 04:43:14PM -0700, Dan Williams wrote: [..] >> Maybe I missed something, but all these assumptions are already >> present for typical block devices, i.e. sectors may go bad and a write >> may make the sector usable

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Dave Chinner
On Mon, Apr 25, 2016 at 11:53:13PM +, Verma, Vishal L wrote: > On Tue, 2016-04-26 at 09:25 +1000, Dave Chinner wrote: > >  > <> > > > > > > > - It checks badblocks and discovers it's files have lost data > > Lots of hand-waving here. How does the application map a bad > > "sector" to a file

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Dave Chinner
On Mon, Apr 25, 2016 at 04:43:14PM -0700, Dan Williams wrote: > On Mon, Apr 25, 2016 at 4:25 PM, Dave Chinner wrote: > > On Mon, Apr 25, 2016 at 05:14:36PM +, Verma, Vishal L wrote: > >> On Mon, 2016-04-25 at 01:31 -0700, h...@infradead.org wrote: > >> > On Sat, Apr 23, 2016 at 06:08:37PM

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Verma, Vishal L
On Tue, 2016-04-26 at 09:25 +1000, Dave Chinner wrote: >  <> > > > > - It checks badblocks and discovers it's files have lost data > Lots of hand-waving here. How does the application map a bad > "sector" to a file without scanning the entire filesystem to find > the owner of the bad sector?

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Dan Williams
On Mon, Apr 25, 2016 at 4:25 PM, Dave Chinner wrote: > On Mon, Apr 25, 2016 at 05:14:36PM +, Verma, Vishal L wrote: >> On Mon, 2016-04-25 at 01:31 -0700, h...@infradead.org wrote: >> > On Sat, Apr 23, 2016 at 06:08:37PM +, Verma, Vishal L wrote: >> > > >> > > direct_IO might fail with

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Darrick J. Wong
On Tue, Apr 26, 2016 at 09:25:52AM +1000, Dave Chinner wrote: > On Mon, Apr 25, 2016 at 05:14:36PM +, Verma, Vishal L wrote: > > On Mon, 2016-04-25 at 01:31 -0700, h...@infradead.org wrote: > > > On Sat, Apr 23, 2016 at 06:08:37PM +, Verma, Vishal L wrote: > > > > > > > > direct_IO might

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Dave Chinner
On Mon, Apr 25, 2016 at 05:14:36PM +, Verma, Vishal L wrote: > On Mon, 2016-04-25 at 01:31 -0700, h...@infradead.org wrote: > > On Sat, Apr 23, 2016 at 06:08:37PM +, Verma, Vishal L wrote: > > > > > > direct_IO might fail with -EINVAL due to misalignment, or -ENOMEM > > > due > > > to

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Dan Williams
On Mon, Apr 25, 2016 at 10:14 AM, Verma, Vishal L wrote: > On Mon, 2016-04-25 at 01:31 -0700, h...@infradead.org wrote: >> On Sat, Apr 23, 2016 at 06:08:37PM +, Verma, Vishal L wrote: >> > >> > direct_IO might fail with -EINVAL due to misalignment, or -ENOMEM >> > due >> > to some allocation

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Verma, Vishal L
On Mon, 2016-04-25 at 01:31 -0700, h...@infradead.org wrote: > On Sat, Apr 23, 2016 at 06:08:37PM +, Verma, Vishal L wrote: > > > > direct_IO might fail with -EINVAL due to misalignment, or -ENOMEM > > due > > to some allocation failing, and I thought we should return the > > original > >

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread Jeff Moyer
"h...@infradead.org" writes: > On Sat, Apr 23, 2016 at 06:08:37PM +, Verma, Vishal L wrote: >> direct_IO might fail with -EINVAL due to misalignment, or -ENOMEM due >> to some allocation failing, and I thought we should return the original >> -EIO in such cases so that the application

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-25 Thread h...@infradead.org
On Sat, Apr 23, 2016 at 06:08:37PM +, Verma, Vishal L wrote: > direct_IO might fail with -EINVAL due to misalignment, or -ENOMEM due > to some allocation failing, and I thought we should return the original > -EIO in such cases so that the application doesn't lose the information > that the

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-23 Thread Verma, Vishal L
On Wed, 2016-04-20 at 13:59 -0700, Christoph Hellwig wrote: > On Fri, Apr 15, 2016 at 12:11:36PM -0400, Jeff Moyer wrote: > > > > > > > > + if (IS_DAX(inode)) { > > > + ret = dax_do_io(iocb, inode, iter, offset, > > > blkdev_get_block, > > >   NULL,

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-20 Thread Christoph Hellwig
On Fri, Apr 15, 2016 at 12:11:36PM -0400, Jeff Moyer wrote: > > + if (IS_DAX(inode)) { > > + ret = dax_do_io(iocb, inode, iter, offset, blkdev_get_block, > > NULL, DIO_SKIP_DIO_COUNT); > > + if (ret == -EIO && (iov_iter_rw(iter) == WRITE)) > > +

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Toshi Kani
On Fri, 2016-04-15 at 13:01 -0600, Toshi Kani wrote: > On Fri, 2016-04-15 at 11:17 -0700, Dan Williams wrote: > > > > On Fri, Apr 15, 2016 at 11:06 AM, Jeff Moyer wrote: > > > > > > Dan Williams writes: > > >   > > > > > > There's a lot of special casing here, so you might consider > > > > > >

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Jeff Moyer
Dan Williams writes: > On Fri, Apr 15, 2016 at 11:24 AM, Jeff Moyer wrote: >>> Moreover, we're going to do the full badblocks lookup anyway when we >>> call ->direct_access(). If we had that information earlier we can >>> avoid this fallback dance. >> >> None of the proposed approaches looks

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Toshi Kani
On Fri, 2016-04-15 at 11:17 -0700, Dan Williams wrote: > On Fri, Apr 15, 2016 at 11:06 AM, Jeff Moyer wrote: > > > > Dan Williams writes: > >  > > > > > There's a lot of special casing here, so you might consider > > > > > adding comments. > > > > Correct - maybe we should reconsider

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Dan Williams
On Fri, Apr 15, 2016 at 11:24 AM, Jeff Moyer wrote: >> Moreover, we're going to do the full badblocks lookup anyway when we >> call ->direct_access(). If we had that information earlier we can >> avoid this fallback dance. > > None of the proposed approaches looks clean to me. I'll go along

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Jeff Moyer
Dan Williams writes: > On Fri, Apr 15, 2016 at 11:06 AM, Jeff Moyer wrote: >> Dan Williams writes: >> > There's a lot of special casing here, so you might consider adding > comments. Correct - maybe we should reconsider wrapper-izing this? :) >>> >>> Another option is just to

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Dan Williams
On Fri, Apr 15, 2016 at 11:06 AM, Jeff Moyer wrote: > Dan Williams writes: > There's a lot of special casing here, so you might consider adding comments. >>> >>> Correct - maybe we should reconsider wrapper-izing this? :) >> >> Another option is just to skip dax_do_io() and this

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Jeff Moyer
Dan Williams writes: >>> There's a lot of special casing here, so you might consider adding >>> comments. >> >> Correct - maybe we should reconsider wrapper-izing this? :) > > Another option is just to skip dax_do_io() and this special casing > fallback entirely if errors are present. I.e. only

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Dan Williams
On Fri, Apr 15, 2016 at 10:37 AM, Verma, Vishal L wrote: > On Fri, 2016-04-15 at 13:11 -0400, Jeff Moyer wrote: [..] >> > >> > But, how does _EIOCBQUEUED work? Maybe we need an exception for it? >> For async direct I/O, only the setup phase of the I/O is performed >> and >> then we return to the

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Verma, Vishal L
On Fri, 2016-04-15 at 13:11 -0400, Jeff Moyer wrote: > "Verma, Vishal L" writes: > > > > > On Fri, 2016-04-15 at 12:11 -0400, Jeff Moyer wrote: > > > > > > Vishal Verma writes: > > > > > > > > + if (IS_DAX(inode)) { > > > > + ret = dax_do_io(iocb, inode, iter, offset, > >

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Jeff Moyer
"Verma, Vishal L" writes: > On Fri, 2016-04-15 at 12:11 -0400, Jeff Moyer wrote: >> Vishal Verma writes: >> > + if (IS_DAX(inode)) { >> > + ret = dax_do_io(iocb, inode, iter, offset, >> > blkdev_get_block, >> >   NULL, DIO_SKIP_DIO_COUNT); >> > - return

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Verma, Vishal L
On Fri, 2016-04-15 at 12:11 -0400, Jeff Moyer wrote: > Vishal Verma writes: > > > > > dax_do_io (called for read() or write() for a dax file system) may > > fail > > in the presence of bad blocks or media errors. Since we expect that > > a > > write should clear media errors on nvdimms, make

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-04-15 Thread Jeff Moyer
Vishal Verma writes: > dax_do_io (called for read() or write() for a dax file system) may fail > in the presence of bad blocks or media errors. Since we expect that a > write should clear media errors on nvdimms, make dax_do_io fall back to > the direct_IO path, which will send down a bio to the

Re: [PATCH v2 5/5] dax: handle media errors in dax_do_io

2016-03-30 Thread Christoph Hellwig
On Wed, Mar 30, 2016 at 12:54:37AM -0600, Vishal Verma wrote: > On Tue, 2016-03-29 at 23:34 -0700, Christoph Hellwig wrote: > > Hi Vishal, > > > > still NAK to calling the direct I/O code directly from the dax code. > > Hm, I thought this was what you meant -- do the fallback/retry attempts > at
