Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-18 Thread Namjae Jeon
2013/2/18 OGAWA Hirofumi :
> Namjae Jeon  writes:
>
>>> Hm. My concerns are compatibility and reliability. Although We can
>>> change on-disk format if need, but I don't think it can be compatible
>>> and reliable. If so, who wants to use it? I feel there is no reason to
>>> use FAT if there is no compatible.
>>>
>>> Well, anyway, possible solution would be, we can pre-allocate physical
>>> blocks via fallocate(2) or something, but discard pre-allocated blocks
>>> at ->release() (or before unmount at least). This way would have
>>> compatibility (no on-disk change over unmount) and possible breakage
>>> would be same with normal extend write patterns on kernel crash
>>> (i.e. Windows or fsck will truncate after i_size).
>> Hi OGAWA.
>> We don't need to consider device unplugging case ?
>> If yes, I can rework fat fallocate patch as your suggestion.
>
> In my suggestion, I think, kernel crash or something like unplugging
> cases handles has no change from current way.
>
> Any pre-allocated blocks are truncated by fsck as inconsistency state,
> like crash before updating i_size for normal extend write.  I.e. across
> unmount, nobody care whether pre-allocated or not. IOW, if there is
> inconsistent between i_size and cluster chain (includes via
> fallocate(2)) across unmount, it should be handled as broken state.
>
> In short, the lifetime of pre-allocated blocks are from fallocate(2) to
> ->release() only.
Okay, I will post an updated fat fallocate patch after looking into this more.
Thanks.
>
> Thanks.
> --
> OGAWA Hirofumi 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-18 Thread OGAWA Hirofumi
Namjae Jeon  writes:

>> Hm. My concerns are compatibility and reliability. Although We can
>> change on-disk format if need, but I don't think it can be compatible
>> and reliable. If so, who wants to use it? I feel there is no reason to
>> use FAT if there is no compatible.
>>
>> Well, anyway, possible solution would be, we can pre-allocate physical
>> blocks via fallocate(2) or something, but discard pre-allocated blocks
>> at ->release() (or before unmount at least). This way would have
>> compatibility (no on-disk change over unmount) and possible breakage
>> would be same with normal extend write patterns on kernel crash
>> (i.e. Windows or fsck will truncate after i_size).
> Hi OGAWA.
> We don't need to consider device unplugging case ?
> If yes, I can rework fat fallocate patch as your suggestion.

With my suggestion, I think the handling of a kernel crash, or of
something like unplugging, does not change from the current way.

Any pre-allocated blocks are truncated by fsck as an inconsistent state,
like a crash before updating i_size for a normal extending write.  I.e.
across unmount, nobody cares whether blocks were pre-allocated or not.
IOW, if there is an inconsistency between i_size and the cluster chain
(including one caused by fallocate(2)) across unmount, it should be
handled as a broken state.

In short, the lifetime of pre-allocated blocks is from fallocate(2) to
->release() only.
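
A minimal sketch of that lifetime rule, written as a toy in-memory model in Python rather than as kernel code. The names ToyFile, toy_fallocate and toy_release and the 4096-byte cluster size are assumptions made for illustration; none of them come from the patch under discussion.

```python
from dataclasses import dataclass, field

CLUSTER_SIZE = 4096  # assumed cluster size in bytes, for illustration only


@dataclass
class ToyFile:
    """Toy stand-in for a FAT inode: a size plus a cluster chain."""
    i_size: int = 0
    chain: list = field(default_factory=list)  # cluster numbers, in order


def clusters_for(size):
    """Number of clusters needed to hold `size` bytes."""
    return (size + CLUSTER_SIZE - 1) // CLUSTER_SIZE


def toy_fallocate(f, length, free):
    """Extend the chain to cover `length` bytes without touching i_size."""
    while len(f.chain) < clusters_for(length):
        f.chain.append(free.pop(0))


def toy_release(f, free):
    """Discard every cluster beyond i_size, as proposed for ->release()."""
    keep = clusters_for(f.i_size)
    free.extend(f.chain[keep:])
    del f.chain[keep:]


free_clusters = list(range(2, 12))
f = ToyFile()
toy_fallocate(f, 5 * CLUSTER_SIZE, free_clusters)  # reserve 5 clusters
f.i_size = 2 * CLUSTER_SIZE                        # only 2 were written
toy_release(f, free_clusters)                      # last close of the file
print(len(f.chain), f.i_size // CLUSTER_SIZE)
```

In this model, a crash between toy_fallocate() and toy_release() leaves a chain longer than i_size needs, which is the same inconsistency an interrupted extending write leaves behind, so fsck would trim it in the same way.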

Thanks.
-- 
OGAWA Hirofumi 


Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-18 Thread Namjae Jeon
2013/2/18 OGAWA Hirofumi :
> Andrew Bartlett  writes:
>
>>> >> First, Thanks for your interest !
>>> >> A mismatch between inode size and reserved blocks can be either due to
>>> >> pre-allocation (after our changes) or due to corruption (sudden unplug
>>> >> of media etc).
>>> >> We don’t think it is right to include only read only support (i.e.
>>> >> without fallocate support) for such files because if such files are
>>> >> encountered it only means that the file is corrupted, as there is no
>>> >> current method to check if the issue is due to pre-allocation.
>>> >> If it is to be included in the kernel, then the whole patch has to go
>>> >> in.
>>> >
>>> > I don't see why that is the case.
>>> If we consider that there is no FALLOCATE support, then the condition
>>> of file size and blocks not matching can be only possible in case of
>>> corruption, right?
>>
>> Sure.  I was just suggesting we transparently recover from that, by
>> using the blocks.  Think of it more as an online fsck not about
>> fallocate.
>>
>> Anyway, if you don't think it's reasonable to use those blocks, or to
>> 'just fix it', then we just have to continue to do as we currently do.
>> That is on first sign of FS corruption just stop doing writes, and await
>> an FSCK.
>
> I'm not sure what is suggesting actually though. We have to consider
> about synchronous runtime fsck makes normal path enough slower.
>
> E.g. probably, in this case, all first open(2) of the inode will have to
> walk cluster chain until end of cluster mark, to verify cluster chain.
>
>>> >> But then again, since the FAT specifications do not accommodate
>>> >> for pre-allocation, then it is up to OGAWA to decide if this is
>>> >> acceptable.
>>> >> In any case, the patch will definitely break backward compatibility
>>> >> (on an older fat driver without fallocate support) and also in case
>>> >> for the two variants for the same kernel versions and only one has
>>> >> FALLOCATE enabled, in such cases also, the behavior will assume
>>> >> corruption in one case.
>>> >
>>> > I agree that the sudden unplug is a concern, but why not make the
>>> > filesystem more robust against that inevitable occurrence?  If the
>>> > blocks appear to be allocated to the file, why not use them?
>>> We also agree that there should be pre-allocation feature on FAT, and
>>> we had shared the scenarios where this could be required for a TV/
>>> recorder.
>>> But there are certain drawbacks which were raised by OGAWA with
>>> respect to compatibility and we also tend to agree on them.
>>> There could possibly be an issue where we are unable to distinguish
>>> between pre-allocation and corruption. Perhaps we could set a status
>>> bit on the file to indicate whether the file has pre-allocated blocks.
>>> This will make it clear whether the allocation is genuine through the
>>> FAT Fallocate request or is a result of corruption. Depending on the
>>> status of the flag - the decision can be made regard to reading
>>> blocks.
>>> But, the main issue in this will be storing this bit in on-disk
>>> directory entry for that file. From the feature and usability point of
>>> view, we should have fallocate on FAT too.
>>>
>>> But it needs initial ACK from OGAWA to continue to work on this so
>>> that we can figure out the proper solution to move forward.
>>
>> OK.  Given the need to find other approaches, I wanted to suggest some
>> ideas - some of which you may have already considered:
>>
>> What about having a shadow FAT in a file, say called 'allocated space',
>> that would contain inode -> cluster list pairs, and where that file
>> would itself contain the free space the 'belongs' to other files?
>>
>> As new clusters become needed in a file, they would simply be removed
>> from the 'allocated space' file, and assigned to the file they really
>> belong to.  That way, another OS just sees a large file, nothing more.
>>
>> Or, if we cannot make any changes to the on-disk format, what about
>> keeping such a database in memory, allocating some of the existing free
>> list to files that have had fallocate() called on them?  (Naturally,
>> this makes it non-persistent, and instead more of a 'hint', but could at
>> least solve our mutual performance issues).
>
> [...]
>
> Hm. My concerns are compatibility and reliability. Although We can
> change on-disk format if need, but I don't think it can be compatible
> and reliable. If so, who wants to use it? I feel there is no reason to
> use FAT if there is no compatible.
>
> Well, anyway, possible solution would be, we can pre-allocate physical
> blocks via fallocate(2) or something, but discard pre-allocated blocks
> at ->release() (or before unmount at least). This way would have
> compatibility (no on-disk change over unmount) and possible breakage
> would be same with normal extend write patterns on kernel crash
> (i.e. Windows or fsck will truncate after i_size).
Hi OGAWA.
So we don't need to consider the device unplugging case?
If so, I can rework the fat fallocate patch as you suggest.
Thanks.

Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-18 Thread Andrew Bartlett
On Mon, 2013-02-18 at 20:36 +0900, OGAWA Hirofumi wrote:
> Andrew Bartlett  writes:

> > Or, if we cannot make any changes to the on-disk format, what about
> > keeping such a database in memory, allocating some of the existing free
> > list to files that have had fallocate() called on them?  (Naturally,
> > this makes it non-persistent, and instead more of a 'hint', but could at
> > least solve our mutual performance issues).
> 
> [...]
> 
> Hm. My concerns are compatibility and reliability. Although We can
> change on-disk format if need, but I don't think it can be compatible
> and reliable. If so, who wants to use it? I feel there is no reason to
> use FAT if there is no compatible.
> 
> Well, anyway, possible solution would be, we can pre-allocate physical
> blocks via fallocate(2) or something, but discard pre-allocated blocks
> at ->release() (or before unmount at least). This way would have
> compatibility (no on-disk change over unmount) and possible breakage
> would be same with normal extend write patterns on kernel crash
> (i.e. Windows or fsck will truncate after i_size).

That would certainly give me what the Samba NAS with USB FAT disk use
case needs.

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                        http://samba.org/~abartlet/
Authentication Developer, Samba Team   http://samba.org




Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-18 Thread OGAWA Hirofumi
Andrew Bartlett  writes:

>> >> First, Thanks for your interest !
>> >> A mismatch between inode size and reserved blocks can be either due to
>> >> pre-allocation (after our changes) or due to corruption (sudden unplug
>> >> of media etc).
>> >> We don’t think it is right to include only read only support (i.e.
>> >> without fallocate support) for such files because if such files are
>> >> encountered it only means that the file is corrupted, as there is no
>> >> current method to check if the issue is due to pre-allocation.
>> >> If it is to be included in the kernel, then the whole patch has to go
>> >> in.
>> >
>> > I don't see why that is the case.
>> If we consider that there is no FALLOCATE support, then the condition
>> of file size and blocks not matching can be only possible in case of
>> corruption, right?
>
> Sure.  I was just suggesting we transparently recover from that, by
> using the blocks.  Think of it more as an online fsck not about
> fallocate. 
>
> Anyway, if you don't think it's reasonable to use those blocks, or to
> 'just fix it', then we just have to continue to do as we currently do.
> That is on first sign of FS corruption just stop doing writes, and await
> an FSCK.  

I'm not sure what is actually being suggested, though. We have to
consider that a synchronous runtime fsck makes the normal path
considerably slower.

E.g. in this case, probably every first open(2) of an inode would have
to walk the cluster chain up to the end-of-cluster mark, to verify the
cluster chain.
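
A rough sketch of what such a check involves, as a toy Python model rather than the kernel's FAT code. The dict-based FAT, the EOC value and the function names are assumptions made for illustration. The point is that the walk visits every cluster of the file, so its cost grows with file size.

```python
EOC = -1              # stand-in for the end-of-cluster mark
CLUSTER_SIZE = 4096   # assumed cluster size in bytes


def walk_chain(fat, start):
    """Follow next-cluster links from `start`; return the clusters visited."""
    visited = set()
    chain = []
    cluster = start
    while cluster != EOC:
        if cluster in visited:         # a loop means the chain is corrupt
            raise ValueError("cluster chain loops")
        visited.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


def excess_clusters(fat, start, i_size):
    """Clusters on the chain beyond what i_size needs (0 if consistent)."""
    needed = (i_size + CLUSTER_SIZE - 1) // CLUSTER_SIZE
    return max(0, len(walk_chain(fat, start)) - needed)


# toy FAT: cluster -> next cluster; the chain is 2 -> 3 -> 7 -> 8 -> end
fat = {2: 3, 3: 7, 7: 8, 8: EOC}
print(excess_clusters(fat, 2, 2 * CLUSTER_SIZE))
```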

>> >> But then again, since the FAT specifications do not accommodate
>> >> for pre-allocation, then it is up to OGAWA to decide if this is
>> >> acceptable.
>> >> In any case, the patch will definitely break backward compatibility
>> >> (on an older fat driver without fallocate support) and also in case
>> >> for the two variants for the same kernel versions and only one has
>> >> FALLOCATE enabled, in such cases also, the behavior will assume
>> >> corruption in one case.
>> >
>> > I agree that the sudden unplug is a concern, but why not make the
>> > filesystem more robust against that inevitable occurrence?  If the
>> > blocks appear to be allocated to the file, why not use them?
>> We also agree that there should be pre-allocation feature on FAT, and
>> we had shared the scenarios where this could be required for a TV/
>> recorder.
>> But there are certain drawbacks which were raised by OGAWA with
>> respect to compatibility and we also tend to agree on them.
>> There could possibly be an issue where we are unable to distinguish
>> between pre-allocation and corruption. Perhaps we could set a status
>> bit on the file to indicate whether the file has pre-allocated blocks.
>> This will make it clear whether the allocation is genuine through the
>> FAT Fallocate request or is a result of corruption. Depending on the
>> status of the flag - the decision can be made regard to reading
>> blocks.
>> But, the main issue in this will be storing this bit in on-disk
>> directory entry for that file. From the feature and usability point of
>> view, we should have fallocate on FAT too.
>> 
>> But it needs initial ACK from OGAWA to continue to work on this so
>> that we can figure out the proper solution to move forward.
>
> OK.  Given the need to find other approaches, I wanted to suggest some
> ideas - some of which you may have already considered:
>
> What about having a shadow FAT in a file, say called 'allocated space',
> that would contain inode -> cluster list pairs, and where that file
> would itself contain the free space the 'belongs' to other files?
>
> As new clusters become needed in a file, they would simply be removed
> from the 'allocated space' file, and assigned to the file they really
> belong to.  That way, another OS just sees a large file, nothing more. 
>
> Or, if we cannot make any changes to the on-disk format, what about
> keeping such a database in memory, allocating some of the existing free
> list to files that have had fallocate() called on them?  (Naturally,
> this makes it non-persistent, and instead more of a 'hint', but could at
> least solve our mutual performance issues).

[...]

Hm. My concerns are compatibility and reliability. Although we can
change the on-disk format if needed, I don't think the result can be
compatible and reliable. If so, who wants to use it? I feel there is no
reason to use FAT if there is no compatibility.

Well, anyway, a possible solution would be to pre-allocate physical
blocks via fallocate(2) or something, but discard the pre-allocated
blocks at ->release() (or before unmount at least). This way would keep
compatibility (no on-disk change across unmount), and the possible
breakage on a kernel crash would be the same as with normal extending
write patterns (i.e. Windows or fsck will truncate after i_size).

Thanks.
-- 
OGAWA Hirofumi 

Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-14 Thread Andrew Bartlett
On Thu, 2013-02-14 at 18:52 +0900, Namjae Jeon wrote:
> [snip]
> >> >
> >> > Thanks,
> >> Hi Andrew.
> >>
> >> First, Thanks for your interest !
> >> A mismatch between inode size and reserved blocks can be either due to
> >> pre-allocation (after our changes) or due to corruption (sudden unplug
> >> of media etc).
> >> We don’t think it is right to include only read only support (i.e.
> >> without fallocate support) for such files because if such files are
> >> encountered it only means that the file is corrupted, as there is no
> >> current method to check if the issue is due to pre-allocation.
> >> If it is to be included in the kernel, then the whole patch has to go
> >> in.
> >
> > I don't see why that is the case.
> If we consider that there is no FALLOCATE support, then the condition
> of file size and blocks not matching can be only possible in case of
> corruption, right?

Sure.  I was just suggesting we transparently recover from that, by
using the blocks.  Think of it more as an online fsck than as something
about fallocate.

Anyway, if you don't think it's reasonable to use those blocks, or to
'just fix it', then we just have to continue to do as we currently do.
That is, on the first sign of FS corruption, just stop doing writes and
await an fsck.

> >> But then again, since the FAT specifications do not accommodate
> >> for pre-allocation, then it is up to OGAWA to decide if this is
> >> acceptable.
> >> In any case, the patch will definitely break backward compatibility
> >> (on an older fat driver without fallocate support) and also in case
> >> for the two variants for the same kernel versions and only one has
> >> FALLOCATE enabled, in such cases also, the behavior will assume
> >> corruption in one case.
> >
> > I agree that the sudden unplug is a concern, but why not make the
> > filesystem more robust against that inevitable occurrence?  If the
> > blocks appear to be allocated to the file, why not use them?
> We also agree that there should be pre-allocation feature on FAT, and
> we had shared the scenarios where this could be required for a TV/
> recorder.
> But there are certain drawbacks which were raised by OGAWA with
> respect to compatibility and we also tend to agree on them.
> There could possibly be an issue where we are unable to distinguish
> between pre-allocation and corruption. Perhaps we could set a status
> bit on the file to indicate whether the file has pre-allocated blocks.
> This will make it clear whether the allocation is genuine through the
> FAT Fallocate request or is a result of corruption. Depending on the
> status of the flag - the decision can be made regard to reading
> blocks.
> But, the main issue in this will be storing this bit in on-disk
> directory entry for that file. From the feature and usability point of
> view, we should have fallocate on FAT too.
> 
> But it needs initial ACK from OGAWA to continue to work on this so
> that we can figure out the proper solution to move forward.

OK.  Given the need to find other approaches, I wanted to suggest some
ideas - some of which you may have already considered:

What about having a shadow FAT in a file, say called 'allocated space',
that would contain inode -> cluster list pairs, and where that file
would itself contain the free space that 'belongs' to other files?

As new clusters become needed in a file, they would simply be removed
from the 'allocated space' file, and assigned to the file they really
belong to.  That way, another OS just sees a large file, nothing more. 
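
To make the idea concrete, here is a toy Python model of the bookkeeping. It is a sketch only: the names shadow_chain, shadow_map and take_from_shadow are hypothetical, and nothing here describes an on-disk layout.

```python
def take_from_shadow(shadow_chain, shadow_map, target_chain, inode):
    """Move one cluster reserved for `inode` out of the shadow file."""
    reserved = shadow_map.get(inode, [])
    if not reserved:
        return None                    # nothing pre-allocated for this inode
    cluster = reserved.pop(0)
    shadow_chain.remove(cluster)       # the shadow file shrinks...
    target_chain.append(cluster)       # ...and the real file grows
    return cluster


shadow_chain = [20, 21, 22]            # clusters owned by 'allocated space'
shadow_map = {7: [20, 21], 9: [22]}    # inode -> clusters reserved for it
file7_chain = [5, 6]
take_from_shadow(shadow_chain, shadow_map, file7_chain, 7)
print(file7_chain, shadow_chain)
```

Every cluster stays on exactly one chain at all times, which is why an OS that knows nothing about the scheme would still see a consistent filesystem.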

Or, if we cannot make any changes to the on-disk format, what about
keeping such a database in memory, allocating some of the existing free
list to files that have had fallocate() called on them?  (Naturally,
this makes it non-persistent, and instead more of a 'hint', but could at
least solve our mutual performance issues). 

Or, could we leave allocated but unused clusters in the free cluster
list, but maintain a file with a hint that a particular file should use
a particular free cluster next, if available?  That list of 'allocated
free' clusters could be honoured by fallocate-aware OSs to reduce df and
increase du, but be ignored by other OSs, ensuring you could not run out
of space expanding a file in another OS.  

If a cluster was observed no longer to be in the real free list, it
would be ignored in the 'allocated free' list, to avoid corruption. 
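
A small Python sketch of that rule, under the assumption that the hint list is purely advisory and the real free list is the only source of truth. The function name, the hints mapping and the file name are all hypothetical.

```python
def pick_next_cluster(name, hints, free):
    """Pick the next cluster for file `name`, preferring still-valid hints.

    `hints` maps a file name to clusters reserved for it (advisory only);
    `free` is the real free-cluster set and is the only source of truth.
    """
    reserved = hints.get(name, [])
    while reserved:
        cluster = reserved.pop(0)
        if cluster in free:            # hint still valid: use it
            free.remove(cluster)
            return cluster
        # otherwise another OS took the cluster: drop the stale hint
    if not free:
        raise OSError("no free clusters")
    # No usable hint: fall back to an ordinary allocation. A fuller model
    # would also avoid clusters that are hinted to other files.
    cluster = min(free)
    free.remove(cluster)
    return cluster


free = {10, 12, 13}                    # cluster 11 was taken by another OS
hints = {"rec.ts": [11, 12]}
print(pick_next_cluster("rec.ts", hints, free))
```

Because a stale hint is only ever dropped and never trusted, the worst case in this model is losing the performance benefit, not corrupting a file.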

In short, I see the restriction on not breaking existing implementations
as a difficult, but certainly not impossible problem.

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                        http://samba.org/~abartlet/
Authentication Developer, Samba Team   http://samba.org




Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-14 Thread Namjae Jeon
[snip]
>> >
>> > Thanks,
>> Hi Andrew.
>>
>> First, Thanks for your interest !
>> A mismatch between inode size and reserved blocks can be either due to
>> pre-allocation (after our changes) or due to corruption (sudden unplug
>> of media etc).
>> We don’t think it is right to include only read only support (i.e.
>> without fallocate support) for such files because if such files are
>> encountered it only means that the file is corrupted, as there is no
>> current method to check if the issue is due to pre-allocation.
>> If it is to be included in the kernel, then the whole patch has to go
>> in.
>
> I don't see why that is the case.
If we assume there is no FALLOCATE support, then a mismatch between
file size and allocated blocks is only possible in case of corruption,
right?
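
To make the mismatch concrete, here is a minimal Python sketch of the check being discussed: compare the length of a file's cluster chain with the number of clusters its recorded size needs. The toy FAT layout, the `EOC` marker and all function names are assumptions for illustration, not the driver's code.

```python
EOC = -1  # end-of-chain marker for this toy FAT; real FAT uses reserved high values


def chain_length(fat, first_cluster):
    """Count clusters on the chain starting at first_cluster (EOC = empty file)."""
    count, cluster = 0, first_cluster
    while cluster != EOC:
        count += 1
        if count > len(fat):
            # More links than clusters exist: the chain must loop.
            raise ValueError("cluster chain loops")
        cluster = fat[cluster]
    return count


def clusters_needed(size, cluster_size):
    """Clusters required to hold `size` bytes (ceiling division)."""
    return (size + cluster_size - 1) // cluster_size


def excess_clusters(fat, first_cluster, size, cluster_size):
    """Clusters on the chain beyond what the recorded file size accounts for."""
    return chain_length(fat, first_cluster) - clusters_needed(size, cluster_size)
```

A positive result is exactly the ambiguous state: without further information it could be preallocation or an interrupted extending write.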

>
>> But then again, since the FAT specifications do not accommodate
>> for pre-allocation, then it is up to OGAWA to decide if this is
>> acceptable.
>> In any case, the patch will definitely break backward compatibility
>> (on an older fat driver without fallocate support) and also in case
>> for the two variants for the same kernel versions and only one has
>> FALLOCATE enabled, in such cases also, the behavior will assume
>> corruption in one case.
>
> I agree that the sudden unplug is a concern, but why not make the
> filesystem more robust against that inevitable occurrence?  If the
> blocks appear to be allocated to the file, why not use them?
We also agree that there should be a pre-allocation feature on FAT, and
we have shared the scenarios where it could be required for a
TV/recorder.
But OGAWA raised certain drawbacks with respect to compatibility, and
we tend to agree with them.
One issue is that we cannot distinguish between pre-allocation and
corruption. Perhaps we could set a status bit on the file to indicate
whether it has pre-allocated blocks. That would make it clear whether
the allocation came from a genuine FAT fallocate request or is the
result of corruption, and the decision about reading the blocks could
then depend on the flag.
The main problem with this is storing the bit in the on-disk directory
entry for the file. Still, from a feature and usability point of view,
we should have fallocate on FAT too.
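
The status-bit idea could look roughly like the following Python sketch. The flag value `ATTR_PREALLOC` is purely hypothetical and chosen arbitrarily here; no such bit is defined for FAT directory entries, which is precisely the storage problem raised above.

```python
from enum import Enum


class Verdict(Enum):
    CONSISTENT = "consistent"
    PREALLOCATED = "preallocated"
    CORRUPT = "corrupt"


# Hypothetical flag bit, invented for this sketch only.
ATTR_PREALLOC = 0x40


def classify(attr, chain_clusters, size, cluster_size):
    """Decide how to treat a file given its flags, chain length and size."""
    needed = (size + cluster_size - 1) // cluster_size
    if chain_clusters == needed:
        return Verdict.CONSISTENT
    if chain_clusters > needed and attr & ATTR_PREALLOC:
        # Extra clusters plus the flag: a deliberate fallocate request.
        return Verdict.PREALLOCATED
    # Extra clusters without the flag, or too few clusters: broken state.
    return Verdict.CORRUPT
```

A driver that does not know the flag would still fall through to the last branch, so the compatibility concern is unchanged by this sketch.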

But we need an initial ACK from OGAWA to continue working on this, so
that we can figure out the proper way forward.
>
> That is, while it is hard to predict the many different ways a
> filesystem can be corrupted, what would go wrong if we did use these
> clusters?  Do you fear that they might also be allocated to someone
> else?
>
> That would, if I understand correctly just mean that that more broken,
> not quite valid USB thumb drives and other FAT filesystems work equally
> well on Windows and Linux, without administrative privileges.  (Given
> that running fsck requires root, and isn't trivially available to normal
> users in Linux, and I presume is similarly privileged in windows).
>
> What I'm doing is suggesting re-purposing your patch, from preallocation
> to robustness.  In this light, do you think this worth pushing forward?
The patch's main aim was to reserve space. If the work you propose
only aims to enable reads of corrupt files, using size mismatch as the
criterion, then we think it would not be a good idea.

Thanks :-)
>
> We can later address if there is any safe way to preallocate files on
> FAT as a different question, hoping that this means it will 'just work'
> on a broader range of other Linux hosts, just as it is claimed to 'just
> work' on Windows.
>
> Thanks,
>
> Andrew Bartlett
>
> --
> Andrew Bartlett                        http://samba.org/~abartlet/
> Authentication Developer, Samba Team   http://samba.org
>
>
>


Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-13 Thread Andrew Bartlett
On Thu, 2013-02-14 at 15:44 +0900, Namjae Jeon wrote:
> 2013/2/14, Andrew Bartlett :
> > (apologies for the duplicate mail, I typo-ed the maintainers address)
> >
> > G'day,
> >
> > I've been looking into the patch "[v2] fat: editions to support
> > fat_fallocate()" and I wonder if there is a way we can split this issue
> > in two, so that we get at least some of the patch into the kernel.
> >
> > https://lkml.org/lkml/2012/10/13/75
> > https://patchwork.kernel.org/patch/1589161/
> >
> > What I'm wanting to discuss (and perhaps implement, with you if
> > possible) is splitting this patch into writing to existing pre-allocated
> > files, and creating a new pre-allocation.
> >
> > If Windows does, as you claim, simply read preallocations as zero, and
> > writes to them normally and without error, then Linux should do the
> > same.  Here of course I'm assuming that Windows is not preallocating,
> > but instead simply trying to recover gracefully and safely from a simple
> > 'file system corruption', where the sectors are allocated but not used.
> >
> > The bulk of this patch is implementing this transparent recovery, and it
> > seem relatively harmless to include this into the kernel.
> >
> > Then vendors doing TV streaming, or in my case copies of large files
> > onto Samba-mounted USB FAT devices, can add only the smaller patch to
> > implement fallocate, at their own risk and fully knowing that it will be
> > regarded as corrupt on Linux.
> >
> > If accepted read support will, over a period of years, trickle down to
> > other Linux users, broadening the base that can still read these
> > 'corrupt' drives, no matter the cause.
> >
> > I hope you agree that this is a practical way forward, and I look
> > forward to working with you on this.
> >
> > Thanks,
> Hi Andrew.
> 
> First, Thanks for your interest !
> A mismatch between inode size and reserved blocks can be either due to
> pre-allocation (after our changes) or due to corruption (sudden unplug
> of media etc).
> We don’t think it is right to include only read only support (i.e.
> without fallocate support) for such files because if such files are
> encountered it only means that the file is corrupted, as there is no
> current method to check if the issue is due to pre-allocation.
> If it is to be included in the kernel, then the whole patch has to go
> in. 

I don't see why that is the case. 

> But then again, since the FAT specifications do not accommodate
> for pre-allocation, then it is up to OGAWA to decide if this is
> acceptable.
> In any case, the patch will definitely break backward compatibility
> (on an older fat driver without fallocate support) and also in case
> for the two variants for the same kernel versions and only one has
> FALLOCATE enabled, in such cases also, the behavior will assume
> corruption in one case.

I agree that the sudden unplug is a concern, but why not make the
filesystem more robust against that inevitable occurrence?  If the
blocks appear to be allocated to the file, why not use them?

That is, while it is hard to predict the many different ways a
filesystem can be corrupted, what would go wrong if we did use these
clusters?  Do you fear that they might also be allocated to someone
else? 
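
The risk asked about here, clusters that belong to more than one file, can be checked for with a small Python sketch. The input format (a dict of per-file cluster lists) and the name `cross_linked` are assumptions for illustration only.

```python
from collections import Counter


def cross_linked(chains):
    """Return the set of clusters that appear in more than one file's chain.

    chains: dict mapping a file name to the list of cluster numbers on
            that file's chain (already walked from the FAT).
    """
    # set(chain) so a cluster repeated within one chain is counted once.
    counts = Counter(c for chain in chains.values() for c in set(chain))
    return {cluster for cluster, n in counts.items() if n > 1}
```

If the result is empty, using the extra clusters of one file cannot overwrite another file's data, at least as far as the FAT itself can tell.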

That would, if I understand correctly, just mean that more broken,
not-quite-valid USB thumb drives and other FAT filesystems work equally
well on Windows and Linux, without administrative privileges.  (Given
that running fsck requires root, and isn't trivially available to normal
users in Linux, and I presume is similarly privileged in Windows). 

What I'm suggesting is re-purposing your patch, from preallocation
to robustness.  In this light, do you think this is worth pushing forward?

We can later address if there is any safe way to preallocate files on
FAT as a different question, hoping that this means it will 'just work'
on a broader range of other Linux hosts, just as it is claimed to 'just
work' on Windows.

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                        http://samba.org/~abartlet/
Authentication Developer, Samba Team   http://samba.org




Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-13 Thread Namjae Jeon
2013/2/14, Andrew Bartlett :
> (apologies for the duplicate mail, I typo-ed the maintainers address)
>
> G'day,
>
> I've been looking into the patch "[v2] fat: editions to support
> fat_fallocate()" and I wonder if there is a way we can split this issue
> in two, so that we get at least some of the patch into the kernel.
>
> https://lkml.org/lkml/2012/10/13/75
> https://patchwork.kernel.org/patch/1589161/
>
> What I'm wanting to discuss (and perhaps implement, with you if
> possible) is splitting this patch into writing to existing pre-allocated
> files, and creating a new pre-allocation.
>
> If Windows does, as you claim, simply read preallocations as zero, and
> writes to them normally and without error, then Linux should do the
> same.  Here of course I'm assuming that Windows is not preallocating,
> but instead simply trying to recover gracefully and safely from a simple
> 'file system corruption', where the sectors are allocated but not used.
>
> The bulk of this patch is implementing this transparent recovery, and it
> seem relatively harmless to include this into the kernel.
>
> Then vendors doing TV streaming, or in my case copies of large files
> onto Samba-mounted USB FAT devices, can add only the smaller patch to
> implement fallocate, at their own risk and fully knowing that it will be
> regarded as corrupt on Linux.
>
> If accepted read support will, over a period of years, trickle down to
> other Linux users, broadening the base that can still read these
> 'corrupt' drives, no matter the cause.
>
> I hope you agree that this is a practical way forward, and I look
> forward to working with you on this.
>
> Thanks,
Hi Andrew.

First, thanks for your interest!
A mismatch between inode size and reserved blocks can be due either to
pre-allocation (after our changes) or to corruption (sudden unplug of
media, etc.).
We don't think it is right to include only read-only support (i.e.
without fallocate support) for such files: without fallocate, such a
file can only mean corruption, and there is currently no way to check
whether the mismatch comes from pre-allocation.
If it is to be included in the kernel, then the whole patch has to go
in. But then again, since the FAT specification does not accommodate
pre-allocation, it is up to OGAWA to decide whether this is
acceptable.
In any case, the patch will definitely break backward compatibility
with an older FAT driver that lacks fallocate support. The same holds
for two builds of the same kernel version where only one has FALLOCATE
enabled: the build without it will treat the file as corrupt.

Thanks.

>
> Andrew Bartlett
> --
> Andrew Bartlett                        http://samba.org/~abartlet/
> Authentication Developer, Samba Team   http://samba.org
>
>
>
>


Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-13 Thread Andrew Bartlett
(apologies for the duplicate mail, I typo-ed the maintainer's address)

G'day,

I've been looking into the patch "[v2] fat: editions to support
fat_fallocate()" and I wonder if there is a way we can split this issue
in two, so that we get at least some of the patch into the kernel.

https://lkml.org/lkml/2012/10/13/75
https://patchwork.kernel.org/patch/1589161/

What I'm wanting to discuss (and perhaps implement, with you if
possible) is splitting this patch into writing to existing pre-allocated
files, and creating a new pre-allocation.

If Windows does, as you claim, simply read preallocations as zero, and
writes to them normally and without error, then Linux should do the
same.  Here of course I'm assuming that Windows is not preallocating,
but instead simply trying to recover gracefully and safely from a simple
'file system corruption', where the sectors are allocated but not used. 
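
As a rough illustration of the "writes to them normally" half of this, the Python sketch below models an extending write that uses clusters already on the file's chain before taking new ones from the free list. It is a toy under stated assumptions (list-based chain and free list, hypothetical function name), not the code in the patch.

```python
import errno


def clusters_for_extend(chain, free_clusters, new_size, cluster_size):
    """Return (new_chain, remaining_free) for a file grown to new_size bytes.

    chain         : clusters already linked to the file, including any that
                    lie beyond the recorded size (preallocated or leftover)
    free_clusters : free cluster numbers available for allocation
    """
    needed = (new_size + cluster_size - 1) // cluster_size
    if needed <= len(chain):
        # The chain already covers the new size: nothing to allocate.
        return list(chain), list(free_clusters)

    shortfall = needed - len(chain)
    if shortfall > len(free_clusters):
        raise OSError(errno.ENOSPC, "No space left on device")

    new_chain = list(chain) + list(free_clusters[:shortfall])
    return new_chain, list(free_clusters[shortfall:])
```

In this model a write into space that is already chained to the file can never fail for lack of space, which is the behaviour preallocation is meant to guarantee.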

The bulk of this patch implements this transparent recovery, and it
seems relatively harmless to include it in the kernel.

Then vendors doing TV streaming, or in my case copies of large files
onto Samba-mounted USB FAT devices, can add only the smaller patch to
implement fallocate, at their own risk and fully knowing that it will be
regarded as corrupt on Linux. 

If accepted, read support will, over a period of years, trickle down to
other Linux users, broadening the base that can still read these
'corrupt' drives, no matter the cause. 

I hope you agree that this is a practical way forward, and I look
forward to working with you on this.

Thanks,

Andrew Bartlett
-- 
Andrew Bartlett                        http://samba.org/~abartlet/
Authentication Developer, Samba Team   http://samba.org





Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-13 Thread Andrew Bartlett
On Thu, 2013-02-14 at 15:44 +0900, Namjae Jeon wrote:
> 2013/2/14, Andrew Bartlett abart...@samba.org:
>> (apologies for the duplicate mail, I typo-ed the maintainer's address)
>>
>> G'day,
>>
>> I've been looking into the patch "[v2] fat: editions to support
>> fat_fallocate()" and I wonder if there is a way we can split this issue
>> in two, so that we get at least some of the patch into the kernel.
>>
>> https://lkml.org/lkml/2012/10/13/75
>> https://patchwork.kernel.org/patch/1589161/
>>
>> What I'm wanting to discuss (and perhaps implement, with you if
>> possible) is splitting this patch into writing to existing pre-allocated
>> files, and creating a new pre-allocation.
>>
>> If Windows does, as you claim, simply read preallocations as zero, and
>> writes to them normally and without error, then Linux should do the
>> same.  Here of course I'm assuming that Windows is not preallocating,
>> but instead simply trying to recover gracefully and safely from a simple
>> 'file system corruption', where the sectors are allocated but not used.
>>
>> The bulk of this patch is implementing this transparent recovery, and it
>> seems relatively harmless to include this in the kernel.
>>
>> Then vendors doing TV streaming, or in my case copies of large files
>> onto Samba-mounted USB FAT devices, can add only the smaller patch to
>> implement fallocate, at their own risk and fully knowing that it will be
>> regarded as corrupt on Linux.
>>
>> If accepted, read support will, over a period of years, trickle down to
>> other Linux users, broadening the base that can still read these
>> 'corrupt' drives, no matter the cause.
>>
>> I hope you agree that this is a practical way forward, and I look
>> forward to working with you on this.
>>
>> Thanks,
> Hi Andrew.
>
> First, thanks for your interest!
> A mismatch between inode size and reserved blocks can be either due to
> pre-allocation (after our changes) or due to corruption (sudden unplug
> of media etc).
> We don’t think it is right to include only read-only support (i.e.
> without fallocate support) for such files, because if such files are
> encountered it only means that the file is corrupted, as there is no
> current method to check if the issue is due to pre-allocation.
> If it is to be included in the kernel, then the whole patch has to go
> in.

I don't see why that is the case. 

> But then again, since the FAT specifications do not accommodate
> pre-allocation, it is up to OGAWA to decide if this is
> acceptable.
> In any case, the patch will definitely break backward compatibility:
> an older FAT driver without fallocate support will treat such files as
> corrupted. The same applies to two builds of the same kernel version
> where only one has FALLOCATE enabled; the build without it will assume
> corruption.

I agree that the sudden unplug is a concern, but why not make the
filesystem more robust against that inevitable occurrence?  If the
blocks appear to be allocated to the file, why not use them?
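
To make the condition under discussion concrete, here is a minimal
sketch in C of the size/chain comparison. It is not taken from the
patch; the names clusters_for_size(), classify_chain() and the
chain_state values are hypothetical and exist only for illustration.
It assumes the caller has already walked the FAT and counted the
clusters in the file's chain.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes, for illustration only. */
enum chain_state {
    CHAIN_MATCHES,   /* chain length agrees with the recorded file size */
    CHAIN_HAS_EXTRA, /* clusters are allocated beyond the recorded size */
    CHAIN_TOO_SHORT  /* fewer clusters than the recorded size requires  */
};

/* Number of clusters needed to hold `size` bytes, rounding up. */
static uint32_t clusters_for_size(uint64_t size, uint32_t cluster_size)
{
    assert(cluster_size != 0);
    return (uint32_t)((size + cluster_size - 1) / cluster_size);
}

/*
 * Compare the recorded file size with the number of clusters actually
 * found in the chain. CHAIN_HAS_EXTRA is the ambiguous case: nothing
 * on disk says whether the extra clusters came from a preallocation
 * or from a write that was interrupted before the size was updated.
 */
static enum chain_state classify_chain(uint64_t size,
                                       uint32_t cluster_size,
                                       uint32_t chain_len)
{
    uint32_t needed = clusters_for_size(size, cluster_size);

    if (chain_len == needed)
        return CHAIN_MATCHES;
    return chain_len > needed ? CHAIN_HAS_EXTRA : CHAIN_TOO_SHORT;
}
```

With 4096-byte clusters, a 4096-byte file whose chain holds three
clusters lands in CHAIN_HAS_EXTRA; the question in this thread is
whether a driver should use those extra clusters or treat them as
corruption.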

That is, while it is hard to predict the many different ways a
filesystem can be corrupted, what would go wrong if we did use these
clusters?  Do you fear that they might also be allocated to someone
else? 

That would, if I understand correctly, just mean that more broken,
not-quite-valid USB thumb drives and other FAT filesystems work equally
well on Windows and Linux, without administrative privileges.  (Running
fsck requires root and isn't trivially available to normal users on
Linux, and I presume it is similarly privileged on Windows.)

What I'm doing is suggesting re-purposing your patch, from preallocation
to robustness.  In this light, do you think this is worth pushing forward?

We can later address if there is any safe way to preallocate files on
FAT as a different question, hoping that this means it will 'just work'
on a broader range of other Linux hosts, just as it is claimed to 'just
work' on Windows.

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org

