Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-30 Thread davelist



> On Mar 23, 2017, at 8:01 PM, davel...@mac.com wrote:
> 
> 
>> On Mar 23, 2017, at 12:24 PM, David Duncan  wrote:
>> 
>> I just want to remind everyone I’m *not* a file systems engineer – I’m just 
>> trying to help Dave (and anyone else caught in this) make sure their app can 
>> find their files.
>> 
>>> On Mar 23, 2017, at 1:53 AM, Alastair Houghton 
>>>  wrote:
>>> 
>>> On 22 Mar 2017, at 18:00, David Duncan  wrote:
>>>>
>>>> So there was another explanation posted on the bug that I’m not certain
>>>> you got, but which I think may explain.
>>>>
>>>> Basically the concept is that since APFS doesn’t normalize file names, if
>>>> you store file names in some other storage (say in your preferences) then
>>>> what could happen is this:
>>>>
>>>> 10.2: File is saved with a file name handed to the file system in NFC
>>>> form. File system converts the file name to NFD. You store it as NFC.
>>>> 10.3: File system is converted to APFS, and the file name is NFD. You try
>>>> to look up the file as NFC, and it fails.
>>> 
>>> This is going to cause problems, though, when things migrate from HFS+ to 
>>> APFS, because the HFS normalisation *isn’t* a standard one.  In particular, 
>>> it certainly *isn’t* NFD for the current version of Unicode.
>> 
>> Yes, that is the crux of Dave’s issue – HFS+ => APFS only translated the 
>> file names (from UTF-16 to UTF-8), it did not re-normalize them.
>> 
>>> The only obvious solution for that would be to have the HFS+ to APFS 
>>> migration tool *re-normalise* the filenames (maybe it does?), but that’s 
>>> bound to break things in the (presumably quite common) case where the 
>>> filename stored in e.g. a plist was originally obtained from the filesystem.
>> 
>> Arguably there is no way for the file system converter to know how it should 
>> renormalize file names. This is akin to case sensitive vs case insensitive 
>> file systems. If you ran a converter from a case insensitive file system to 
>> a case sensitive one, you could preserve the capitalization during the 
>> conversion, but file lookups that used the wrong case would fail after the 
>> conversion. But the converter can’t know you want to look up “foo” via “FOO” 
>> or “Foo” to do any kind of normalization. The difference here is that for 
>> the most part unicode normalization is invisible to the developer.
>> 
>>> 
>>> Kind regards,
>>> 
>>> Alastair.
>>> 
>>> --
>>> http://alastairs-place.net
>>> 
>> 
>> --
>> David Duncan
> 
> 
> I appreciate the help you (and everyone else) have given. I should be able to 
> add an option to rescan what files are there. And I'll make time this weekend 
> to submit a DTS incident and see what answer I can get and share it here. I 
> do suspect I won't be the only one bit by this.
> 
> Thanks,
> Dave Reed


I received a reply from DTS with lots of information and references. The 
suggestion for me was to store both the name the user enters and a filename 
that won't have the issue in my plist file with the list of "courses" (I assume 
they mean use an ASCII name such as a GUID). The other suggestion was to iterate 
over the files and apply the same decomposition to the filename and the user 
entered string to find the match - that's what I did over the weekend and the 
update is now available in the App Store.

Dave Reed
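
The second DTS suggestion above (apply the same decomposition to the on-disk name and to the user-entered string, then compare) can be sketched in Swift roughly as follows. This is only an illustration, not the code that shipped: the helper name `firstMatch(for:among:)` is hypothetical, the candidate names are assumed to come from something like `FileManager.contentsOfDirectory(atPath:)`, and it assumes Foundation's standard canonical decomposition is close enough for the names involved (as noted elsewhere in this thread, the HFS+ decomposition is not exactly NFD).

```swift
import Foundation

/// Hypothetical helper: returns the first on-disk name that is canonically
/// equivalent to `wanted`, or nil if none matches.
func firstMatch(for wanted: String, among onDiskNames: [String]) -> String? {
    // Decompose the user-entered string once.
    let target = Array(wanted.decomposedStringWithCanonicalMapping.unicodeScalars)
    return onDiskNames.first { name in
        // Decompose each candidate the same way and compare scalar by scalar.
        Array(name.decomposedStringWithCanonicalMapping.unicodeScalars) == target
    }
}

// The stored name is precomposed; the name on disk is decomposed.
let onDisk = ["notes.txt", "cafe\u{301}.txt"]
let found = firstMatch(for: "caf\u{E9}.txt", among: onDisk)
```

The value returned is the name exactly as it appears on disk, so it can be handed back to the file APIs unchanged.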


___

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-29 Thread David Duncan

> On Mar 22, 2017, at 2:25 PM, davel...@mac.com wrote:
> 
>> 
>> On Mar 22, 2017, at 2:00 PM, David Duncan  wrote:
>> 
>>> 
>>> On Mar 22, 2017, at 4:15 AM, davel...@mac.com wrote:
>>> 
>>>>
>>>> On Mar 22, 2017, at 5:05 AM, Alastair Houghton wrote:
>>>>
>>>> On 21 Mar 2017, at 20:49, Quincey Morris wrote:
>>>>>
>>>>> On Mar 20, 2017, at 14:23 , davel...@mac.com wrote:
>>>>>>
>>>>>> "iOS HFS Normalized UNICODE names , APFS now treats all file[ name]s as
>>>>>> a bag of bytes on iOS . We are requesting that Applications developers
>>>>>> call the correct Normalization routines to make sure the file name
>>>>>> contains the correct representation."
>>>>>
>>>>> I’ve been letting this simmer for a couple of days now, and I’ve come to
>>>>> the conclusion that it’s — sincere apologies to the unnamed Apple
>>>>> engineer who wrote it — as dumb as dirt.
>>>>>
>>>>> — It’s not a "bag of bytes”, because bags of stuff are generally
>>>>> understood as unordered sets, and I doubt that’s what’s intended. It has
>>>>> to be a sequence of bytes.
>>>>
>>>> In the context of filesystems (and specifically filenames), the phrases
>>>> “bag of bytes” and “bunch of bytes” have a fairly specific meaning.  The
>>>> point is that the filesystem doesn’t inspect the bytes it’s given, and
>>>> doesn’t care what they represent (about the only exception is that it
>>>> probably doesn’t support embedded NULs).  It isn’t suggesting that the
>>>> names are treated as an unordered set of bytes (that’d just be silly).
>>>> It’s just expressing the fact that the filesystem doesn’t care what they
>>>> are - it may compare them, and if it does so, it will use binary ordering
>>>> (not some other collation sequence) and won’t worry about things like case
>>>> or encoding at all.
>>>>
>>>>> — It’s not just a string, it has to be a string in a known encoding.
>>>>> Otherwise, how could you ever mount an external drive on a different
>>>>> computer? The encoding has to be pre-specified for APFS, or it has to be
>>>>> stored in metadata on each volume.
>>>>
>>>> Agreed, that’s where the “bunch of bytes” approach falls down.
>>>>
>>>>> — It’s not just going to be a string of known encoding, it’s going to be
>>>>> Unicode. That’s going to be true even if the fact is specified in volume
>>>>> metadata and it’s theoretically possible to create APFS volumes with
>>>>> non-Unicode file names. Anything other than Unicode would, at this point,
>>>>> be a crime against humanity.
>>>>
>>>> If I’d designed APFS, it probably would use Unicode names (and it’d store
>>>> the version of Unicode it used in the filesystem header, to avoid having
>>>> to hard-code it).
>>>>
>>>> But I didn’t design it - Dominic Giampaolo and his team did - and we still
>>>> don’t have that much information about how APFS works.  I’m sure they had
>>>> their reasons for whatever decision they’ve made here.
>>>>
>>>>> Is *that* the bottom line? I doubt it. I don’t believe the above quoted
>>>>> statement can be correct. I could believe that normalization is being
>>>>> moved out of the file system code, but it would have to be moved to
>>>>> (e.g.) the Cocoa frameworks, still “downstream” of the file-handling
>>>>> APIs. It can’t go upstream of the public APIs without breaking an API
>>>>> contract that has existed for the 16+ years since OS X 10.0.
>>>>
>>>> This is a tricky area.  The problem with what we have at the moment
>>>> (-fileSystemRepresentation) is that it *assumes* HFS+ semantics.  That
>>>> isn’t always going to be correct for existing non-HFS+ filesystems, let
>>>> alone in the future.  Of course, if you’re using the NSURL or NSString
>>>> methods, rather than calling the BSD or C library APIs yourself, this is
>>>> all hidden from you anyway (you certainly shouldn’t, IMO, be required to
>>>> do anything unusual at Cocoa level - the Foundation framework should just
>>>> make this all work, rather in the same way it presently does for numerous
>>>> other things).
>>>>
>>>> It’s also complicated by the fact that, unlike on DOS or Windows,
>>>> UNIX-like systems use a unified filesystem - that is, other filesystems
>>>> are joined on at mount points.  Thus you could have a name like
>>>>
>>>> /Volumes/Foo/Bar/Baz/Blam
>>>>
>>>> where (say) both Foo and Baz are mount points, and the rules about
>>>> filenames could differ markedly, at least in principle; that is,
>>>> /Volumes/Foo would have to conform to HFS+ (or APFS) rules, Bar/Baz to
>>>> whatever rules govern the filesystem mounted at Foo, and Blam to whatever
>>>> rules govern the filesystem mounted at Baz.  And remember, not every
>>>> filesystem will be using a well known encoding - macOS already has code to
>>>> add and remove percent escapes (I kid you not) for this very reason.
>>>>
>>>> I’d like to hear what Dominic has to say (at least

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-29 Thread David Duncan

> On Mar 22, 2017, at 4:15 AM, davel...@mac.com wrote:
> 
>> 
>> On Mar 22, 2017, at 5:05 AM, Alastair Houghton 
>>  wrote:
>> 
>> On 21 Mar 2017, at 20:49, Quincey Morris 
>>  wrote:
>>> 
>>> On Mar 20, 2017, at 14:23 , davel...@mac.com wrote:
>>>>
>>>> "iOS HFS Normalized UNICODE names , APFS now treats all file[ name]s as a
>>>> bag of bytes on iOS . We are requesting that Applications developers call
>>>> the correct Normalization routines to make sure the file name contains the
>>>> correct representation."
>>> 
>>> I’ve been letting this simmer for a couple of days now, and I’ve come to 
>>> the conclusion that it’s — sincere apologies to the unnamed Apple engineer 
>>> who wrote it — as dumb as dirt.
>>> 
>>> — It’s not a "bag of bytes”, because bags of stuff are generally understood 
>>> as unordered sets, and I doubt that’s what’s intended. It has to be a 
>>> sequence of bytes.
>> 
>> In the context of filesystems (and specifically filenames), the phrases “bag 
>> of bytes” and “bunch of bytes” have a fairly specific meaning.  The point is 
>> that the filesystem doesn’t inspect the bytes it’s given, and doesn’t care 
>> what they represent (about the only exception is that it probably doesn’t 
>> support embedded NULs).  It isn’t suggesting that the names are treated as 
>> an unordered set of bytes (that’d just be silly).  It’s just expressing the 
>> fact that the filesystem doesn’t care what they are - it may compare them, 
>> and if it does so, it will use binary ordering (not some other collation 
>> sequence) and won’t worry about things like case or encoding at all.
>> 
>>> — It’s not just a string, it has to be a string in a known encoding. 
>>> Otherwise, how could you ever mount an external drive on a different 
>>> computer? The encoding has to be pre-specified for APFS, or it has to be 
>>> stored in metadata on each volume.
>> 
>> Agreed, that’s where the “bunch of bytes” approach falls down.
>> 
>>> — It’s not just going to be a string of known encoding, it’s going to be 
>>> Unicode. That’s going to be true even if the fact is specified in volume 
>>> metadata and it’s theoretically possible to create APFS volumes with 
>>> non-Unicode file names. Anything other than Unicode would, at this point, 
>>> be a crime against humanity.
>> 
>> If I’d designed APFS, it probably would use Unicode names (and it’d store 
>> the version of Unicode it used in the filesystem header, to avoid having to 
>> hard-code it).
>> 
>> But I didn’t design it - Dominic Giampaolo and his team did - and we still 
>> don’t have that much information about how APFS works.  I’m sure they had 
>> their reasons for whatever decision they’ve made here.
>> 
>>> Is *that* the bottom line? I doubt it. I don’t believe the above quoted 
>>> statement can be correct. I could believe that normalization is being moved 
>>> out of the file system code, but it would have to be moved to (e.g.) the 
>>> Cocoa frameworks, still “downstream” of the file-handling APIs. It can’t go 
>>> upstream of the public APIs without breaking an API contract that has 
>>> existed for the 16+ years since OS X 10.0.
>> 
>> This is a tricky area.  The problem with what we have at the moment 
>> (-fileSystemRepresentation) is that it *assumes* HFS+ semantics.  That isn’t 
>> always going to be correct for existing non-HFS+ filesystems, let alone in 
>> the future.  Of course, if you’re using the NSURL or NSString methods, 
>> rather than calling the BSD or C library APIs yourself, this is all hidden 
>> from you anyway (you certainly shouldn’t, IMO, be required to do anything 
>> unusual at Cocoa level - the Foundation framework should just make this all 
>> work, rather in the same way it presently does for numerous other things).
>> 
>> It’s also complicated by the fact that, unlike on DOS or Windows, UNIX-like 
>> systems use a unified filesystem - that is, other filesystems are joined on 
>> at mount points.  Thus you could have a name like
>> 
>> /Volumes/Foo/Bar/Baz/Blam
>> 
>> where (say) both Foo and Baz are mount points, and the rules about filenames 
>> could differ markedly, at least in principle; that is, /Volumes/Foo would 
>> have to conform to HFS+ (or APFS) rules, Bar/Baz to whatever rules govern 
>> the filesystem mounted at Foo, and Blam to whatever rules govern the 
>> filesystem mounted at Baz.  And remember, not every filesystem will be using 
>> a well known encoding - macOS already has code to add and remove percent 
>> escapes (I kid you not) for this very reason.
>> 
>> I’d like to hear what Dominic has to say (at least what he *can* say) about 
>> this, since he’s likely in a position to shed some light on it - or at least 
>> to take on board that we’re worrying about it.  At the very least it’d be 
>> nice to see some more detail about APFS published somewhere *soon*...
>> 
>> Kind regards,
>> 
>> Alastair.
>> 
>> --
>> 

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread davelist

> On Mar 23, 2017, at 12:24 PM, David Duncan  wrote:
> 
> I just want to remind everyone I’m *not* a file systems engineer – I’m just 
> trying to help Dave (and anyone else caught in this) make sure their app can 
> find their files.
> 
>> On Mar 23, 2017, at 1:53 AM, Alastair Houghton 
>>  wrote:
>> 
>> On 22 Mar 2017, at 18:00, David Duncan  wrote:
>>> 
>>> So there was another explanation posted on the bug that I’m not certain you 
>>> got, but which I think may explain.
>>> 
>>> Basically the concept is that since APFS doesn’t normalize file names, if 
>>> you store file names in some other storage (say in your preferences) then 
>>> what could happen is this:
>>> 
>>> 10.2: File is saved with a file name handed to the file system in NFC form. 
>>> File system converts the file name to NFD. You store it as NFC.
>>> 10.3: File system is converted to APFS, and the file name is NFD. You try 
>>> to look up the file as NFC, and it fails.
>> 
>> This is going to cause problems, though, when things migrate from HFS+ to 
>> APFS, because the HFS normalisation *isn’t* a standard one.  In particular, 
>> it certainly *isn’t* NFD for the current version of Unicode.
> 
> Yes, that is the crux of Dave’s issue – HFS+ => APFS only translated the file 
> names (from UTF-16 to UTF-8), it did not re-normalize them.
> 
>> The only obvious solution for that would be to have the HFS+ to APFS 
>> migration tool *re-normalise* the filenames (maybe it does?), but that’s 
>> bound to break things in the (presumably quite common) case where the 
>> filename stored in e.g. a plist was originally obtained from the filesystem.
> 
> Arguably there is no way for the file system converter to know how it should 
> renormalize file names. This is akin to case sensitive vs case insensitive 
> file systems. If you ran a converter from a case insensitive file system to a 
> case sensitive one, you could preserve the capitalization during the 
> conversion, but file lookups that used the wrong case would fail after the 
> conversion. But the converter can’t know you want to look up “foo” via “FOO” 
> or “Foo” to do any kind of normalization. The difference here is that for the 
> most part unicode normalization is invisible to the developer.
> 
>> 
>> Kind regards,
>> 
>> Alastair.
>> 
>> --
>> http://alastairs-place.net
>> 
> 
> --
> David Duncan


I appreciate the help you (and everyone else) have given. I should be able to 
add an option to rescan what files are there. And I'll make time this weekend 
to submit a DTS incident and see what answer I can get and share it here. I do 
suspect I won't be the only one bit by this.

Thanks,
Dave Reed






Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread Alastair Houghton
On 23 Mar 2017, at 17:57, Ed Wynne  wrote:
> 
>> Shouldn’t the VFS layer actually be doing this? It is part of its whole 
>> raison d’être, no? Just have -[NSURL fileSystemRepresentation] normalize 
>> things according to the correct Unicode rules, and let the VFS layer 
>> translate that to HFS+’s normalization style when dealing with HFS+.
> 
> Yes, this.
> 
> Having the conversion only available up in the Cocoa layer is an incredibly 
> poor choice. It effectively means nothing at the BSD layer will be able to 
> properly normalize file names. Having it at the VFS layer is the most sane 
> option, even with the problems that causes.

It can’t really take place at the VFS layer, because the appropriate 
normalisation is filesystem specific - some filesystems don’t normalise, others 
do, and the exact rules differ.

It *could* take place in the filesystem driver, as happens currently for HFS+.  
The problem with that is that while your software will work fine on HFS+, it 
might break if given a different filesystem to run on, which is kind of what 
this thread is all about, no?  (And we already had similar problems with 
case-sensitive HFS+ too, which usually breaks certain big brand-name 
applications software.)  I have to say I’m generally in favour of APFS 
normalising Unicode names, but I can understand that there are reasons the APFS 
team might have decided not to (it’s really up to them to elucidate what those 
reasons were).

This is a rather horrible area of filesystem work, made worse by the fact that 
many historic filesystems don’t even bother storing what character encoding was 
used.  Indeed, on such systems it’s even possible that users will use different 
encodings in different directories (:-()

Clearly, encoding detailed knowledge of appropriate normalisation on a 
per-filesystem basis in end-user applications is not a sensible approach here.  
Apple suggesting that we normalise filenames before passing them to the BSD 
layer wouldn’t be the end of the world, but it might result in some 
applications not being able to cope with some otherwise valid filenames because 
the name on disk differs from the chosen normalisation.

Another option might be to add some flags to the BSD open() API (for instance, 
O_UNICODE and O_CASEFOLD) that cause it to use a Unicode-aware comparison 
routine inside the filesystem implementation, the idea being that it will open 
a file with the exact name passed if it exists, or, if that file doesn’t exist, 
it will enumerate the containing directory looking for one that matches.  
Sadly, this enumeration would need to be recursive (since the directory name 
might have the same problem).  The Foundation framework could then use the new 
flags to obtain reasonable behaviour.
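
The flags above are only a proposal, and nothing like O_UNICODE exists in open() today. The same "exact name first, then scan the directory" behaviour can be approximated in user space, one path component at a time, which is a rough Swift sketch of the idea rather than a real API. The helper name `resolve(_:under:)` is hypothetical. It leans on the fact that Swift's `String` equality compares by canonical equivalence, so precomposed and decomposed spellings of a name compare equal.

```swift
import Foundation

/// Hypothetical helper: resolves `components` beneath `root` one level at a
/// time. Each level tries the exact name first, then falls back to scanning
/// the directory for a canonically equivalent entry.
func resolve(_ components: [String], under root: URL) -> URL? {
    let fm = FileManager.default
    var current = root
    for wanted in components {
        let exact = current.appendingPathComponent(wanted)
        if fm.fileExists(atPath: exact.path) {
            current = exact
            continue
        }
        // Exact lookup failed: enumerate the directory and compare names.
        // Swift's == on String treats canonically equivalent text as equal.
        guard let names = try? fm.contentsOfDirectory(atPath: current.path),
              let match = names.first(where: { $0 == wanted }) else {
            return nil
        }
        // Continue with the name as it is actually spelled on disk.
        current = current.appendingPathComponent(match)
    }
    return current
}
```

Because every component is checked separately, a directory whose name has the same problem is handled the same way as the file itself, at the cost of a directory scan for each level that misses.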

Kind regards,

Alastair.

--
http://alastairs-place.net



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread Jens Alfke

> On Mar 22, 2017, at 2:25 PM, davel...@mac.com wrote:
> 
> "Engineering has the following feedback for you:
> 
> iOS HFS Normalized UNICODE names , APFS now treats all files as a bag of 
> bytes on iOS . We are requesting that Applications developers call the 
> correct Normalization routines to make sure the file name contains the 
> correct representation.
> 
> We are now closing this bug report.

If Apple really is making developers responsible for Unicode normalization of 
filenames, that’s a big compatibility issue and they would need to educate 
developers, give them sample code, etc. In other words, something that would 
have been a big deal at last year’s WWDC when APFS was announced. I’m pretty 
sure that very few developers understand Unicode normalization (I don’t beyond 
a surface level), so Apple can’t expect them to take it on as an “oh, by the 
way” sort of thing.

Apple takes I18N pretty seriously, and I find it hard to believe that they’d 
change the filesystem in a way that could potentially cause huge problems 
accessing files with non-Roman names, without making sure that developers can 
handle the transition.

The above makes me doubt that this is really what’s going on. I’ve worked at 
Apple, and I know it’s entirely possible that the above quote is the result of 
a game of ‘Telephone’ in which the actual meaning’s gotten messed up when 
passed from engineering to tech support.

—Jens

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread Ed Wynne

> On Mar 23, 2017, at 1:40 PM, Charles Srstka  wrote:
> 
>> On Mar 23, 2017, at 3:50 AM, Alastair Houghton 
>>  wrote:
>> 
>> On 22 Mar 2017, at 19:13, Chris Ridd wrote:
>>> 
>>>> On 22 Mar 2017, at 09:05, Alastair Houghton wrote:
>>>>
>>>> In the context of filesystems (and specifically filenames), the phrases
>>>> “bag of bytes” and “bunch of bytes” have a fairly specific meaning.  The
>>>> point is that the filesystem doesn’t inspect the bytes it’s given, and
>>>> doesn’t care what they represent (about the only exception is that it
>>>> probably doesn’t support embedded NULs).  It isn’t suggesting that the
>>>> names are treated as an unordered set of bytes (that’d just be silly).
>>>> It’s just expressing the fact that the filesystem doesn’t care what they
>>>> are - it may compare them, and if it does so, it will use binary ordering
>>>> (not some other collation sequence) and won’t worry about things like case
>>>> or encoding at all.
>>> 
>>> That doesn’t sound sensible at all. It means you can create a filename with 
>>> a byte sequence that isn’t valid UTF-8 and which likely then cannot be 
>>> accessed by MacOS/iOS processes.
>> 
>> That isn’t possible on macOS - there’s a percent escaping mechanism built in 
>> to the kernel to prevent this problem.
>> 
>>> It means that you could create multiple files with the “same" name, and 
>>> that doesn’t sound like a win either. e.g. Aandi’s examples of LATIN SMALL 
>>> LETTER E (U+0065)
>>> COMBINING ACUTE ACCENT (U+0301) and LATIN SMALL LETTER E WITH ACUTE (U+00E9)
>> 
>> Yes, it does.
>> 
>>> How can a “next gen” filesystem avoid using Unicode rules when handling 
>>> filenames?
>> 
>> Well, if I had designed it, it wouldn’t.  But I didn’t.
>> 
>> To be fair, I can see arguments in favour of the bunch of bytes approach; 
>> the existing approach has created a problem in HFS+, in that the 
>> normalisation is essentially fixed for all time, and doesn’t correspond to 
>> the current version of Unicode.  It’s actually worse than it might be, 
>> because (IIRC) they fixed the normalisation *before* Unicode adopted a 
>> stability policy for normalisation...
>> 
>> But if the filesystem (or kernel) isn’t doing it, then IMO the Cocoa 
>> frameworks certainly should.
> 
> Shouldn’t the VFS layer actually be doing this? It is part of its whole 
> raison d’être, no? Just have -[NSURL fileSystemRepresentation] normalize 
> things according to the correct Unicode rules, and let the VFS layer 
> translate that to HFS+’s normalization style when dealing with HFS+.


Yes, this.

Having the conversion only available up in the Cocoa layer is an incredibly 
poor choice. It effectively means nothing at the BSD layer will be able to 
properly normalize file names. Having it at the VFS layer is the most sane 
option, even with the problems that causes.

-Ed



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread Charles Srstka
> On Mar 23, 2017, at 3:50 AM, Alastair Houghton  
> wrote:
> 
> On 22 Mar 2017, at 19:13, Chris Ridd wrote:
>> 
>>> On 22 Mar 2017, at 09:05, Alastair Houghton wrote:
>>> 
>>> In the context of filesystems (and specifically filenames), the phrases 
>>> “bag of bytes” and “bunch of bytes” have a fairly specific meaning.  The 
>>> point is that the filesystem doesn’t inspect the bytes it’s given, and 
>>> doesn’t care what they represent (about the only exception is that it 
>>> probably doesn’t support embedded NULs).  It isn’t suggesting that the 
>>> names are treated as an unordered set of bytes (that’d just be silly).  
>>> It’s just expressing the fact that the filesystem doesn’t care what they 
>>> are - it may compare them, and if it does so, it will use binary ordering 
>>> (not some other collation sequence) and won’t worry about things like case 
>>> or encoding at all.
>> 
>> That doesn’t sound sensible at all. It means you can create a filename with 
>> a byte sequence that isn’t valid UTF-8 and which likely then cannot be 
>> accessed by MacOS/iOS processes.
> 
> That isn’t possible on macOS - there’s a percent escaping mechanism built in 
> to the kernel to prevent this problem.
> 
>> It means that you could create multiple files with the “same" name, and that 
>> doesn’t sound like a win either. e.g. Aandi’s examples of LATIN SMALL LETTER 
>> E (U+0065)
>> COMBINING ACUTE ACCENT (U+0301) and LATIN SMALL LETTER E WITH ACUTE (U+00E9)
> 
> Yes, it does.
> 
>> How can a “next gen” filesystem avoid using Unicode rules when handling 
>> filenames?
> 
> Well, if I had designed it, it wouldn’t.  But I didn’t.
> 
> To be fair, I can see arguments in favour of the bunch of bytes approach; the 
> existing approach has created a problem in HFS+, in that the normalisation is 
> essentially fixed for all time, and doesn’t correspond to the current version 
> of Unicode.  It’s actually worse than it might be, because (IIRC) they fixed 
> the normalisation *before* Unicode adopted a stability policy for 
> normalisation...
> 
> But if the filesystem (or kernel) isn’t doing it, then IMO the Cocoa 
> frameworks certainly should.

Shouldn’t the VFS layer actually be doing this? It is part of its whole raison 
d’être, no? Just have -[NSURL fileSystemRepresentation] normalize things 
according to the correct Unicode rules, and let the VFS layer translate that to 
HFS+’s normalization style when dealing with HFS+.

Charles


Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread David Duncan
I just want to remind everyone I’m *not* a file systems engineer – I’m just 
trying to help Dave (and anyone else caught in this) make sure their app can 
find their files.

> On Mar 23, 2017, at 1:53 AM, Alastair Houghton  
> wrote:
> 
> On 22 Mar 2017, at 18:00, David Duncan  wrote:
>> 
>> So there was another explanation posted on the bug that I’m not certain you 
>> got, but which I think may explain.
>> 
>> Basically the concept is that since APFS doesn’t normalize file names, if 
>> you store file names in some other storage (say in your preferences) then 
>> what could happen is this:
>> 
>> 10.2: File is saved with a file name handed to the file system in NFC form. 
>> File system converts the file name to NFD. You store it as NFC.
>> 10.3: File system is converted to APFS, and the file name is NFD. You try to 
>> look up the file as NFC, and it fails.
> 
> This is going to cause problems, though, when things migrate from HFS+ to 
> APFS, because the HFS normalisation *isn’t* a standard one.  In particular, 
> it certainly *isn’t* NFD for the current version of Unicode.

Yes, that is the crux of Dave’s issue – HFS+ => APFS only translated the file 
names (from UTF-16 to UTF-8), it did not re-normalize them.
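
To illustrate why a pure UTF-16 to UTF-8 translation leaves the problem in place, here is a small Swift sketch. The file names are made up for the example, and it models only the transcoding step, not the actual converter: the decomposed scalar sequence survives the change of encoding, so its bytes still differ from those of a precomposed name an app might have stored.

```swift
// Name as HFS+ would have stored it: decomposed (e + COMBINING ACUTE ACCENT).
let onDiskName = "re\u{301}sume\u{301}.txt"

// Model the conversion as a plain UTF-16 -> UTF-8 transcode.
let utf16Units = Array(onDiskName.utf16)
let converted = String(decoding: utf16Units, as: UTF16.self)
let convertedBytes = Array(converted.utf8)

// Name as the app stored it in its plist: precomposed (U+00E9).
let storedName = "r\u{E9}sum\u{E9}.txt"
let storedBytes = Array(storedName.utf8)

// The transcode did not change the scalar sequence...
let stillDecomposed = Array(converted.unicodeScalars) == Array(onDiskName.unicodeScalars)  // true
// ...so a byte-for-byte lookup with the stored name cannot match.
let bytesMatch = convertedBytes == storedBytes  // false
```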

> The only obvious solution for that would be to have the HFS+ to APFS 
> migration tool *re-normalise* the filenames (maybe it does?), but that’s 
> bound to break things in the (presumably quite common) case where the 
> filename stored in e.g. a plist was originally obtained from the filesystem.

Arguably there is no way for the file system converter to know how it should 
renormalize file names. This is akin to case sensitive vs case insensitive file 
systems. If you ran a converter from a case insensitive file system to a case 
sensitive one, you could preserve the capitalization during the conversion, but 
file lookups that used the wrong case would fail after the conversion. But the 
converter can’t know you want to look up “foo” via “FOO” or “Foo” to do any 
kind of normalization. The difference here is that for the most part unicode 
normalization is invisible to the developer.

> 
> Kind regards,
> 
> Alastair.
> 
> --
> http://alastairs-place.net
> 

--
David Duncan



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread Alastair Houghton
On 22 Mar 2017, at 18:00, David Duncan  wrote:
> 
> So there was another explanation posted on the bug that I’m not certain you 
> got, but which I think may explain.
> 
> Basically the concept is that since APFS doesn’t normalize file names, if you 
> store file names in some other storage (say in your preferences) then what 
> could happen is this:
> 
> 10.2: File is saved with a file name handed to the file system in NFC form. 
> File system converts the file name to NFD. You store it as NFC.
> 10.3: File system is converted to APFS, and the file name is NFD. You try to 
> look up the file as NFC, and it fails.

This is going to cause problems, though, when things migrate from HFS+ to APFS, 
because the HFS normalisation *isn’t* a standard one.  In particular, it 
certainly *isn’t* NFD for the current version of Unicode.

The only obvious solution for that would be to have the HFS+ to APFS migration 
tool *re-normalise* the filenames (maybe it does?), but that’s bound to break 
things in the (presumably quite common) case where the filename stored in e.g. 
a plist was originally obtained from the filesystem.

Kind regards,

Alastair.

--
http://alastairs-place.net



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread Alastair Houghton
On 22 Mar 2017, at 19:13, Chris Ridd  wrote:
> 
>> On 22 Mar 2017, at 09:05, Alastair Houghton  
>> wrote:
>> 
>> In the context of filesystems (and specifically filenames), the phrases “bag 
>> of bytes” and “bunch of bytes” have a fairly specific meaning.  The point is 
>> that the filesystem doesn’t inspect the bytes it’s given, and doesn’t care 
>> what they represent (about the only exception is that it probably doesn’t 
>> support embedded NULs).  It isn’t suggesting that the names are treated as 
>> an unordered set of bytes (that’d just be silly).  It’s just expressing the 
>> fact that the filesystem doesn’t care what they are - it may compare them, 
>> and if it does so, it will use binary ordering (not some other collation 
>> sequence) and won’t worry about things like case or encoding at all.
> 
> That doesn’t sound sensible at all. It means you can create a filename with a 
> byte sequence that isn’t valid UTF-8 and which likely then cannot be accessed 
> by macOS/iOS processes.

That isn’t possible on macOS - there’s a percent escaping mechanism built in to 
the kernel to prevent this problem.

> It means that you could create multiple files with the “same” name, and that 
> doesn’t sound like a win either, e.g. Aandi’s examples of LATIN SMALL LETTER 
> E (U+0065) COMBINING ACUTE ACCENT (U+0301) and LATIN SMALL LETTER E WITH 
> ACUTE (U+00E9).

Yes, it does.

> How can a “next gen” filesystem avoid using Unicode rules when handling 
> filenames?

Well, if I had designed it, it wouldn’t.  But I didn’t.

To be fair, I can see arguments in favour of the bunch of bytes approach; the 
existing approach has created a problem in HFS+, in that the normalisation is 
essentially fixed for all time, and doesn’t correspond to the current version 
of Unicode.  It’s actually worse than it might be, because (IIRC) they fixed 
the normalisation *before* Unicode adopted a stability policy for 
normalisation...

But if the filesystem (or kernel) isn’t doing it, then IMO the Cocoa frameworks 
certainly should.

Kind regards,

Alastair.

--
http://alastairs-place.net



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-23 Thread Quincey Morris
On Mar 22, 2017, at 14:25 , davel...@mac.com wrote:
> 
> On Mar 22, 2017, at 2:00 PM, David Duncan wrote:
>> 
>> So there was another explanation posted on the bug that I’m not certain you 
>> got, but which I think may explain.
>> 
>> Basically the concept is that since APFS doesn’t normalize file names, if 
>> you store file names in some other storage (say in your preferences) then 
>> what could happen is this:

The “why” doesn’t matter at all. As I said before, the problem is that breaking 
the existing Cocoa file system API contract will break existing apps. It’s as 
simple as that.

Apple can certainly move the normalization out of the file system code into 
Cocoa frameworks code, but they can’t simply drop it. Imagine what would happen 
if they did. Within a few weeks, users would start reporting that they had lost 
access to some of their files. Not only existing files, but potentially files 
created after the conversion to APFS. 

Not only that, but the Cocoa frameworks would break, too. Any time a file name 
was programmatically manipulated in the frameworks, such as having a suffix 
appended that was taken from a string constant or a resource string, the 
resulting name would be vulnerable to reversion to non-NFD, and … blammo.

This isn’t viable.

> Do I use the following?
> 
> NSURL *url = [[self courseDirectory] 
> URLByAppendingPathComponent:name.decomposedStringWithCanonicalMapping];
> 
> Where [self courseDirectory] is a URL in English (assuming the sandboxed 
> Documents directory is in English for other locales) and name is the 
> NSString the user enters. 

That isn’t going to be enough. Your “courseDirectory” subdirectory of the 
Documents directory might have an English name, but:

1. There’s no guarantee that English names have only a single Unicode form. 
(You could make this assumption for ASCII-character names, though.)

2. It’s unlikely that the path to the Documents directory is in English, and 
it’s not under your control.

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-22 Thread davelist

> On Mar 22, 2017, at 2:00 PM, David Duncan  wrote:
> 
>> 
>> On Mar 22, 2017, at 4:15 AM, davel...@mac.com wrote:
>> 
>>> 
>>> On Mar 22, 2017, at 5:05 AM, Alastair Houghton 
>>>  wrote:
>>> 
>>> On 21 Mar 2017, at 20:49, Quincey Morris 
>>>  wrote:
 
>>>> On Mar 20, 2017, at 14:23 , davel...@mac.com wrote:
>>>>> 
>>>>> "iOS HFS Normalized UNICODE names , APFS now treats all file[ name]s as a 
>>>>> bag of bytes on iOS . We are requesting that Applications developers call 
>>>>> the correct Normalization routines to make sure the file name contains 
>>>>> the correct representation."
>>>> 
>>>> I’ve been letting this simmer for a couple of days now, and I’ve come to 
>>>> the conclusion that it’s — sincere apologies to the unnamed Apple engineer 
>>>> who wrote it — as dumb as dirt.
>>>> 
>>>> — It’s not a "bag of bytes”, because bags of stuff are generally 
>>>> understood as unordered sets, and I doubt that’s what’s intended. It has 
>>>> to be a sequence of bytes.
>>> 
>>> In the context of filesystems (and specifically filenames), the phrases 
>>> “bag of bytes” and “bunch of bytes” have a fairly specific meaning.  The 
>>> point is that the filesystem doesn’t inspect the bytes it’s given, and 
>>> doesn’t care what they represent (about the only exception is that it 
>>> probably doesn’t support embedded NULs).  It isn’t suggesting that the 
>>> names are treated as an unordered set of bytes (that’d just be silly).  
>>> It’s just expressing the fact that the filesystem doesn’t care what they 
>>> are - it may compare them, and if it does so, it will use binary ordering 
>>> (not some other collation sequence) and won’t worry about things like case 
>>> or encoding at all.
>>> 
>>>> — It’s not just a string, it has to be a string in a known encoding. 
>>>> Otherwise, how could you ever mount an external drive on a different 
>>>> computer? The encoding has to be pre-specified for APFS, or it has to be 
>>>> stored in metadata on each volume.
>>> 
>>> Agreed, that’s where the “bunch of bytes” approach falls down.
>>> 
>>>> — It’s not just going to be a string of known encoding, it’s going to be 
>>>> Unicode. That’s going to be true even if the fact is specified in volume 
>>>> metadata and it’s theoretically possible to create APFS volumes with 
>>>> non-Unicode file names. Anything other than Unicode would, at this point, 
>>>> be a crime against humanity.
>>> 
>>> If I’d designed APFS, it probably would use Unicode names (and it’d store 
>>> the version of Unicode it used in the filesystem header, to avoid having to 
>>> hard-code it).
>>> 
>>> But I didn’t design it - Dominic Giampaolo and his team did - and we still 
>>> don’t have that much information about how APFS works.  I’m sure they had 
>>> their reasons for whatever decision they’ve made here.
>>> 
>>>> Is *that* the bottom line? I doubt it. I don’t believe the above quoted 
>>>> statement can be correct. I could believe that normalization is being 
>>>> moved out of the file system code, but it would have to be moved to (e.g.) 
>>>> the Cocoa frameworks, still “downstream” of the file-handling APIs. It 
>>>> can’t go upstream of the public APIs without breaking an API contract that 
>>>> has existed for the 16+ years since OS X 10.0.
>>> 
>>> This is a tricky area.  The problem with what we have at the moment 
>>> (-fileSystemRepresentation) is that it *assumes* HFS+ semantics.  That 
>>> isn’t always going to be correct for existing non-HFS+ filesystems, let 
>>> alone in the future.  Of course, if you’re using the NSURL or NSString 
>>> methods, rather than calling the BSD or C library APIs yourself, this is 
>>> all hidden from you anyway (you certainly shouldn’t, IMO, be required to do 
>>> anything unusual at Cocoa level - the Foundation framework should just make 
>>> this all work, rather in the same way it presently does for numerous other 
>>> things).
>>> 
>>> It’s also complicated by the fact that, unlike on DOS or Windows, UNIX-like 
>>> systems use a unified filesystem - that is, other filesystems are joined on 
>>> at mount points.  Thus you could have a name like
>>> 
>>> /Volumes/Foo/Bar/Baz/Blam
>>> 
>>> where (say) both Foo and Baz are mount points, and the rules about 
>>> filenames could differ markedly, at least in principle; that is, 
>>> /Volumes/Foo would have to conform to HFS+ (or APFS) rules, Bar/Baz to 
>>> whatever rules govern the filesystem mounted at Foo, and Blam to whatever 
>>> rules govern the filesystem mounted at Baz.  And remember, not every 
>>> filesystem will be using a well known encoding - macOS already has code to 
>>> add and remove percent escapes (I kid you not) for this very reason.
>>> 
>>> I’d like to hear what Dominic has to say (at least what he *can* say) about 
>>> this, since he’s likely in a position to shed some light on it - or at 
>>> least to take on board that we’re worrying about 

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-22 Thread Chris Ridd

> On 22 Mar 2017, at 09:05, Alastair Houghton  
> wrote:
> 
> In the context of filesystems (and specifically filenames), the phrases “bag 
> of bytes” and “bunch of bytes” have a fairly specific meaning.  The point is 
> that the filesystem doesn’t inspect the bytes it’s given, and doesn’t care 
> what they represent (about the only exception is that it probably doesn’t 
> support embedded NULs).  It isn’t suggesting that the names are treated as an 
> unordered set of bytes (that’d just be silly).  It’s just expressing the fact 
> that the filesystem doesn’t care what they are - it may compare them, and if 
> it does so, it will use binary ordering (not some other collation sequence) 
> and won’t worry about things like case or encoding at all.

That doesn’t sound sensible at all. It means you can create a filename with a 
byte sequence that isn’t valid UTF-8 and which likely then cannot be accessed 
by macOS/iOS processes.

It means that you could create multiple files with the “same” name, and that 
doesn’t sound like a win either, e.g. Aandi’s examples of LATIN SMALL LETTER E 
(U+0065) COMBINING ACUTE ACCENT (U+0301) and LATIN SMALL LETTER E WITH ACUTE 
(U+00E9).

How can a “next gen” filesystem avoid using Unicode rules when handling 
filenames?

Chris

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-22 Thread davelist

> On Mar 22, 2017, at 5:05 AM, Alastair Houghton  
> wrote:
> 
> On 21 Mar 2017, at 20:49, Quincey Morris 
>  wrote:
>> 
>> On Mar 20, 2017, at 14:23 , davel...@mac.com wrote:
>>> 
>>> "iOS HFS Normalized UNICODE names , APFS now treats all file[ name]s as a 
>>> bag of bytes on iOS . We are requesting that Applications developers call 
>>> the correct Normalization routines to make sure the file name contains the 
>>> correct representation."
>> 
>> I’ve been letting this simmer for a couple of days now, and I’ve come to the 
>> conclusion that it’s — sincere apologies to the unnamed Apple engineer who 
>> wrote it — as dumb as dirt.
>> 
>> — It’s not a "bag of bytes”, because bags of stuff are generally understood 
>> as unordered sets, and I doubt that’s what’s intended. It has to be a 
>> sequence of bytes.
> 
> In the context of filesystems (and specifically filenames), the phrases “bag 
> of bytes” and “bunch of bytes” have a fairly specific meaning.  The point is 
> that the filesystem doesn’t inspect the bytes it’s given, and doesn’t care 
> what they represent (about the only exception is that it probably doesn’t 
> support embedded NULs).  It isn’t suggesting that the names are treated as an 
> unordered set of bytes (that’d just be silly).  It’s just expressing the fact 
> that the filesystem doesn’t care what they are - it may compare them, and if 
> it does so, it will use binary ordering (not some other collation sequence) 
> and won’t worry about things like case or encoding at all.
> 
>> — It’s not just a string, it has to be a string in a known encoding. 
>> Otherwise, how could you ever mount an external drive on a different 
>> computer? The encoding has to be pre-specified for APFS, or it has to be 
>> stored in metadata on each volume.
> 
> Agreed, that’s where the “bunch of bytes” approach falls down.
> 
>> — It’s not just going to be a string of known encoding, it’s going to be 
>> Unicode. That’s going to be true even if the fact is specified in volume 
>> metadata and it’s theoretically possible to create APFS volumes with 
>> non-Unicode file names. Anything other than Unicode would, at this point, be 
>> a crime against humanity.
> 
> If I’d designed APFS, it probably would use Unicode names (and it’d store the 
> version of Unicode it used in the filesystem header, to avoid having to 
> hard-code it).
> 
> But I didn’t design it - Dominic Giampaolo and his team did - and we still 
> don’t have that much information about how APFS works.  I’m sure they had 
> their reasons for whatever decision they’ve made here.
> 
>> Is *that* the bottom line? I doubt it. I don’t believe the above quoted 
>> statement can be correct. I could believe that normalization is being moved 
>> out of the file system code, but it would have to be moved to (e.g.) the 
>> Cocoa frameworks, still “downstream” of the file-handling APIs. It can’t go 
>> upstream of the public APIs without breaking an API contract that has 
>> existed for the 16+ years since OS X 10.0.
> 
> This is a tricky area.  The problem with what we have at the moment 
> (-fileSystemRepresentation) is that it *assumes* HFS+ semantics.  That isn’t 
> always going to be correct for existing non-HFS+ filesystems, let alone in 
> the future.  Of course, if you’re using the NSURL or NSString methods, rather 
> than calling the BSD or C library APIs yourself, this is all hidden from you 
> anyway (you certainly shouldn’t, IMO, be required to do anything unusual at 
> Cocoa level - the Foundation framework should just make this all work, rather 
> in the same way it presently does for numerous other things).
> 
> It’s also complicated by the fact that, unlike on DOS or Windows, UNIX-like 
> systems use a unified filesystem - that is, other filesystems are joined on 
> at mount points.  Thus you could have a name like
> 
>  /Volumes/Foo/Bar/Baz/Blam
> 
> where (say) both Foo and Baz are mount points, and the rules about filenames 
> could differ markedly, at least in principle; that is, /Volumes/Foo would 
> have to conform to HFS+ (or APFS) rules, Bar/Baz to whatever rules govern the 
> filesystem mounted at Foo, and Blam to whatever rules govern the filesystem 
> mounted at Baz.  And remember, not every filesystem will be using a well 
> known encoding - macOS already has code to add and remove percent escapes (I 
> kid you not) for this very reason.
> 
> I’d like to hear what Dominic has to say (at least what he *can* say) about 
> this, since he’s likely in a position to shed some light on it - or at least 
> to take on board that we’re worrying about it.  At the very least it’d be 
> nice to see some more detail about APFS published somewhere *soon*...
> 
> Kind regards,
> 
> Alastair.
> 
> --
> http://alastairs-place.net


I think it should be taken care of by NSURL so developers don’t need to worry 
about it but that doesn’t appear to be the case, but, at this point I 

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-22 Thread Alastair Houghton
On 21 Mar 2017, at 20:49, Quincey Morris  
wrote:
> 
> On Mar 20, 2017, at 14:23 , davel...@mac.com wrote:
>> 
>> "iOS HFS Normalized UNICODE names , APFS now treats all file[ name]s as a 
>> bag of bytes on iOS . We are requesting that Applications developers call 
>> the correct Normalization routines to make sure the file name contains the 
>> correct representation."
> 
> I’ve been letting this simmer for a couple of days now, and I’ve come to the 
> conclusion that it’s — sincere apologies to the unnamed Apple engineer who 
> wrote it — as dumb as dirt.
> 
> — It’s not a "bag of bytes”, because bags of stuff are generally understood 
> as unordered sets, and I doubt that’s what’s intended. It has to be a 
> sequence of bytes.

In the context of filesystems (and specifically filenames), the phrases “bag of 
bytes” and “bunch of bytes” have a fairly specific meaning.  The point is that 
the filesystem doesn’t inspect the bytes it’s given, and doesn’t care what they 
represent (about the only exception is that it probably doesn’t support 
embedded NULs).  It isn’t suggesting that the names are treated as an unordered 
set of bytes (that’d just be silly).  It’s just expressing the fact that the 
filesystem doesn’t care what they are - it may compare them, and if it does so, 
it will use binary ordering (not some other collation sequence) and won’t worry 
about things like case or encoding at all.
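
As a rough sketch of what "binary ordering" means here, in plain C (which also 
compiles as Objective-C): names are compared as unsigned bytes and nothing 
else. compare_names_bytewise is a hypothetical name for illustration, and the 
sketch makes no claim about how APFS itself is implemented.

```c
#include <assert.h>
#include <string.h>

/* Compares two names the way a "bag of bytes" file system would: as unsigned
 * bytes, with no knowledge of case, encoding, or normalization.
 * Returns <0, 0, or >0, like strcmp. */
int compare_names_bytewise(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    /* strcmp compares the bytes as unsigned char values. */
    return strcmp(a, b);
}
```

Under this comparison "File" and "file" are different names, "Z" sorts before 
"a", and the decomposed spelling of é (first byte 0x65) sorts before the 
precomposed one (first byte 0xC3) and is never equal to it.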

> — It’s not just a string, it has to be a string in a known encoding. 
> Otherwise, how could you ever mount an external drive on a different 
> computer? The encoding has to be pre-specified for APFS, or it has to be 
> stored in metadata on each volume.

Agreed, that’s where the “bunch of bytes” approach falls down.

> — It’s not just going to be a string of known encoding, it’s going to be 
> Unicode. That’s going to be true even if the fact is specified in volume 
> metadata and it’s theoretically possible to create APFS volumes with 
> non-Unicode file names. Anything other than Unicode would, at this point, be 
> a crime against humanity.

If I’d designed APFS, it probably would use Unicode names (and it’d store the 
version of Unicode it used in the filesystem header, to avoid having to 
hard-code it).

But I didn’t design it - Dominic Giampaolo and his team did - and we still 
don’t have that much information about how APFS works.  I’m sure they had their 
reasons for whatever decision they’ve made here.

> Is *that* the bottom line? I doubt it. I don’t believe the above quoted 
> statement can be correct. I could believe that normalization is being moved 
> out of the file system code, but it would have to be moved to (e.g.) the 
> Cocoa frameworks, still “downstream” of the file-handling APIs. It can’t go 
> upstream of the public APIs without breaking an API contract that has existed 
> for the 16+ years since OS X 10.0.

This is a tricky area.  The problem with what we have at the moment 
(-fileSystemRepresentation) is that it *assumes* HFS+ semantics.  That isn’t 
always going to be correct for existing non-HFS+ filesystems, let alone in the 
future.  Of course, if you’re using the NSURL or NSString methods, rather than 
calling the BSD or C library APIs yourself, this is all hidden from you anyway 
(you certainly shouldn’t, IMO, be required to do anything unusual at Cocoa 
level - the Foundation framework should just make this all work, rather in the 
same way it presently does for numerous other things).

It’s also complicated by the fact that, unlike on DOS or Windows, UNIX-like 
systems use a unified filesystem - that is, other filesystems are joined on at 
mount points.  Thus you could have a name like

  /Volumes/Foo/Bar/Baz/Blam

where (say) both Foo and Baz are mount points, and the rules about filenames 
could differ markedly, at least in principle; that is, /Volumes/Foo would have 
to conform to HFS+ (or APFS) rules, Bar/Baz to whatever rules govern the 
filesystem mounted at Foo, and Blam to whatever rules govern the filesystem 
mounted at Baz.  And remember, not every filesystem will be using a well known 
encoding - macOS already has code to add and remove percent escapes (I kid you 
not) for this very reason.

I’d like to hear what Dominic has to say (at least what he *can* say) about 
this, since he’s likely in a position to shed some light on it - or at least to 
take on board that we’re worrying about it.  At the very least it’d be nice to 
see some more detail about APFS published somewhere *soon*...

Kind regards,

Alastair.

--
http://alastairs-place.net



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-21 Thread Quincey Morris
On Mar 20, 2017, at 14:23 , davel...@mac.com wrote:
> 
> "iOS HFS Normalized UNICODE names , APFS now treats all file[ name]s as a bag 
> of bytes on iOS . We are requesting that Applications developers call the 
> correct Normalization routines to make sure the file name contains the 
> correct representation."

I’ve been letting this simmer for a couple of days now, and I’ve come to the 
conclusion that it’s — sincere apologies to the unnamed Apple engineer who 
wrote it — as dumb as dirt.

— It’s not a "bag of bytes”, because bags of stuff are generally understood as 
unordered sets, and I doubt that’s what’s intended. It has to be a sequence of 
bytes.

— It’s not a sequence of bytes, because *everything* is a sequence of bytes, 
except perhaps things that are just a sequence of bits. It’s a sequence of 
bytes that represents a human-readable name. We have a word for that already: 
string.

— It’s not just a string, it has to be a string in a known encoding. Otherwise, 
how could you ever mount an external drive on a different computer? The 
encoding has to be pre-specified for APFS, or it has to be stored in metadata 
on each volume.

— It’s not just going to be a string of known encoding, it’s going to be 
Unicode. That’s going to be true even if the fact is specified in volume 
metadata and it’s theoretically possible to create APFS volumes with 
non-Unicode file names. Anything other than Unicode would, at this point, be a 
crime against humanity.

— It’s not just going to be Unicode, it’s going to be UTF-8 or UTF-16 or 
UTF-32. Again, it might be one of these code-point sizes by definition, or any 
of them according to volume or file metadata. If the code point size isn’t 
determinable in one of these ways, the names cannot be interpreted.

— Ditto endianness (for UTF-16 or UTF-32).

— What we’re left with is that, apparently, APFS is a normalization-sensitive 
file system (by analogy with the case-sensitive file system that iOS already 
has): it’s capable of giving different files the same names, that differ only 
in capitalization and normalization. Except, no. There’s no “correct” or 
“incorrect” for iOS file name capitalization — you can create and use files 
following your own private rules for capitalization — but according to the 
above quote it is “correct” to normalize APFS file names, and presumably 
incorrect to leave them unnormalized.

— So, if unnormalized names are not “correct”, then it’s not a 
normalization-sensitive file system either.

— What are we left with? Well, the same file naming system as iOS HFS+, with 
the actual normalization left out. Great. No developer is ever going to forget 
to handle that, right?

Is that the bottom line? Actually, no. Here’s the bottom line:

— If Apple did this, *every* existing app would instantly break. All of them. 
(The only exceptions perhaps being apps that don’t ever construct a path string 
or URL.) That’s apparently what happened to Dave. His perfectly correct app 
*broke*.

Is *that* the bottom line? I doubt it. I don’t believe the above quoted 
statement can be correct. I could believe that normalization is being moved out 
of the file system code, but it would have to be moved to (e.g.) the Cocoa 
frameworks, still “downstream” of the file-handling APIs. It can’t go upstream 
of the public APIs without breaking an API contract that has existed for the 
16+ years since OS X 10.0.

Is there anything wrong with my reasoning here?


Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-21 Thread Alastair Houghton
On 21 Mar 2017, at 12:33, Jean-Daniel  wrote:
> 
> This is what the reply suggests, but that makes no sense to me. If you are 
> accessing the file using a URL, it is the framework’s job to convert the URL 
> into the right file system representation.

Agreed, IMO the framework should be responsible for Unicode normalisation.  
It’s only at the BSD layer where you *might* be responsible for it yourself, 
but this isn’t about making BSD API calls, right?  It’s the Cocoa API we’re 
dealing with here.

This *should* be abstracted.  If it isn’t, it’s a bug.

(Having vaguely followed this thread for a while, if I had to guess what 
happened here, I’d guess there’s some kind of bug in the HFS+ to APFS upgrade 
routine that’s causing the Unicode filenames from the HFS+ system to get messed 
up somehow.)

Kind regards,

Alastair.

--
http://alastairs-place.net



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-21 Thread Charles Srstka
> On Mar 21, 2017, at 9:13 AM, davel...@mac.com wrote:
> 
>> 
>> On Mar 21, 2017, at 8:33 AM, Jean-Daniel wrote:
>> 
>> 
>>> Le 21 mars 2017 à 12:03, davel...@mac.com a écrit :
>>> 
 
>>>> On Mar 21, 2017, at 1:06 AM, Jens Alfke  wrote:
>>>> 
>>>> 
>>>>> On Mar 20, 2017, at 2:23 PM, davel...@mac.com wrote:
>>>>> 
>>>>> NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];
>>>> 
>>>> There’s nothing wrong with that call; it’s the canonical way to add a path 
>>>> component to a URL, filesystem or not.
>>>> 
>>>>> NSURL *url = [NSURL fileURLWithFileSystemRepresentation:[name 
>>>>> fileSystemRepresentation] isDirectory:YES relativeToURL:[self 
>>>>> courseDirectory]];
>>>> 
>>>> This call doesn’t make sense. You’re converting to filesystem 
>>>> representation and then back again, for no reason.
>>>> 
>>>> What Apple suggested is to Unicode-normalize the filename before adding it 
>>>> to the URL. Did you try doing that?
>>>> 
>>>> —Jens
>>> 
>>> Jens,
>>> 
>>> I’m trying to find out what that means. Someone suggested off-list to me 
>>> that I should be calling this:
>>> 
>>> https://developer.apple.com/reference/foundation/nsstring/1409474-decomposedstringwithcanonicalmap?language=objc
>>> 
>>> Is that correct?
>>> 
>>> So based on that, I think it means I should do:
>>> 
>>> NSURL *url = [[self courseDirectory] 
>>> URLByAppendingPathComponent:name.decomposedStringWithCanonicalMapping];
>>> 
>>> Thanks,
>>> Dave Reed
>>> 
>> 
>> This is what the reply suggests, but that makes no sense to me. If you are 
>> accessing the file using a URL, it is the framework’s job to convert the 
>> URL into the right file system representation.
>> 
>> That said, if using name.decomposedStringWithCanonicalMapping fixes your 
>> problem, go with it.
> 
> Unfortunately, I can't tell. One user told me he couldn't open his files 
> (with Arabic names) after upgrading to iOS 10.3 public beta. He then changed 
> the names to English names and was able to open the files. He sent me some 
> sample files that I can open (but they were zipped by my app and then 
> unzipped by me) so I honestly can't tell if any of these changes have 
> different effects. With all the options I've tried, I can open the files on 
> my 10.2 and my 10.3 developer device.
> 
> If he creates files with Arabic names on 10.3 he can open them so something 
> happened in the 10.2 to 10.3 (which is why I suspect APFS) upgrade that 
> prevents files created with 10.2 from being opened with 10.3.
> 
> My plan to test it is to take a file created with 10.2 using an Arabic name 
> and see what happens when I upgrade my regular phone to 10.3 but obviously 
> that's time consuming and I only plan to do that upgrade once since it's a 
> pain to downgrade (and not certain it would be possible with the APFS change).
> 
> That's why I keep emailing about this - I can't test it and can't find 
> specific documentation from Apple that says exactly how to create a NSURL 
> from a NSString that may contain Unicode characters.
> 
> Thanks,
> Dave Reed

Maybe you could make a debug build and send it to him (can you do that on iOS?) 
with some code that logs the raw Unicode characters of the filenames in the 
directory, without the name getting changed/decomposed/recomposed/etc by the 
zip/unzip process, and have him send you the output. Something like:

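#import <Foundation/Foundation.h>
#include <dirent.h>

// Logs the raw bytes of every filename in the directory at `url`.
void LogFilenames(NSURL *url) {
    DIR *dir = opendir(url.fileSystemRepresentation);
    if (dir == NULL) {
        NSLog(@"Could not open %@", url.path);
        return;
    }

    struct dirent *file;
    while ((file = readdir(dir)) != NULL) {
        // Wrap the name bytes exactly as readdir returned them, so NSLog
        // prints them as hex without any string conversion.
        NSData *nameData = [[NSData alloc] initWithBytes:file->d_name
                                                  length:file->d_namlen];
        NSLog(@"%@", nameData);
    }

    closedir(dir);
}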

This’ll be in UTF-8 instead of the underlying UTF-16, of course, but it should 
still be useful in figuring out what’s going on.
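
To compare the logged names against an expected byte sequence, a small helper 
along these lines could render a name as spaced hex. It is a sketch in plain C 
(usable from Objective-C); format_name_hex is a hypothetical name, not part of 
any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the bytes of `name` into `out` as upper-case hex
 * pairs separated by spaces, e.g. "65 CC 81". Returns 0 on success, or -1 if
 * `out` is too small (in which case `out` may hold a truncated prefix). */
int format_name_hex(const unsigned char *name, size_t len,
                    char *out, size_t out_size) {
    assert(name != NULL && out != NULL);
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    size_t used = 0;
    for (size_t i = 0; i < len; i++) {
        /* Two hex digits per byte, plus a separating space after the first. */
        size_t need = (i > 0) ? 3 : 2;
        if (used + need + 1 > out_size) {
            return -1;
        }
        if (i > 0) {
            out[used++] = ' ';
        }
        snprintf(out + used, out_size - used, "%02X", (unsigned)name[i]);
        used += 2;
    }
    return 0;
}
```

Passing file->d_name and file->d_namlen from the loop above would give one 
line per file that can be compared directly with the byte values discussed in 
this thread.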

Charles


Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-21 Thread Giacomo Tufano
If Apple Support says (as it did) "iOS HFS Normalized UNICODE names , APFS now 
treats all files as a bag of bytes on iOS . We are requesting that Applications 
developers call the correct Normalization routines to make sure the file name 
contains the correct representation." then I think the solution is to decompose 
the name so that the “bag of bytes” and the “normalized name” are the same. 
From the name of the API I *suppose* that -[NSString 
decomposedStringWithCanonicalMapping] will do, but it needs to be tested, 
because the point is that on APFS you need to apply the same (de)composition 
that iOS HFS does (so that you get the same bytes when asking for the file 
name).
Btw: I think this is a significant difference in filename management that will 
bite many developers. It should be fixed at the file system level, IMHO, but 
who knows what the other implications are.

My 2 €cents,
Gt

> On 21 Mar 2017, at 15:18, Aandi Inston wrote:
> 
> Is the question, what is canonical mapping? I'm going to assume it is, so I
> can share what I found when I hit much the same issue. This is mostly from
> memory so let's hope it's right.
> 
> Take the word Café. How many Unicode characters is this and what are they?
> Turns out there are two answers. The last character as seen on screen is a
> lower case e with an acute accent.
> Let's ignore C,a,f as they are the same in all answers.  First answer: é is
> 'LATIN SMALL LETTER E WITH ACUTE' (U+00E9). We'll call this "composed". In
> UTF-8 that's two bytes, 0xC3 0xA9. (This is the answer you'd often get, but
> it's not the only answer, and not the one Apple filesystems like.)
> 
> Second answer uses an accent character. These are designed to appear in the
> same space as another character. So combine "e" and an acute accent (like a
> floating, slanted apostrophe) and we have "é". This means you could get the
> same result from the two Unicode characters LATIN SMALL LETTER E (U+0065)
> COMBINING ACUTE ACCENT (U+0301). We'll call this "decomposed". In UTF-8
> that would be 0x65 0xCC 0x81: three bytes, two characters, combine to a
> single character. (This is the one Apple filesystems like).
> 
> When you're typing in a word processor, or showing an alert, it hardly
> matters how you create the e acute. Both look the same. But searching may
> be a problem (not discussed) as may showing items in alphabetical order
> (also not discussed).
> 
> Let's imagine now we have a filename Café. This could be represented in
> UTF-8 bytes as 0x43 0x61 0x66 0xC3 0xA9 (composed), or as 0x43 0x61 0x66
> 0x65 0xCC 0x81 (decomposed). But ultimately there needs to be a set of bits
> on disk, in a directory, saying the name of the file. When searching for a
> file we could have three choices (a) these two composed/decomposed are
> separate file names for two distinct files - whose name will look the same
> (b) these are the same file, which means all file access by name, and
> searching has to compose or decompose for comparison purposes (c) only one
> is allowed and the other is rejected or invalid.
> 
> Where are we? A bit of (b) and a bit of (c). Finder and file dialogs always
> decompose what is typed, and this is stored as the string of bits giving
> the file name. It seems that some APIs will automatically decompose their
> input, and others won't, and we may be in transition [to judge from the bug
> response]. So for safety, use a method that decomposes. (Unicode defines at
> least two other types of de/composition, not discussed).
> 
> Apple calls decomposed "canonical". This is fine, except that Unicode
> refers to both "canonical decomposition" (what Apple filenames need) and
> "canonical composition" (the opposite). So if handling names via an Apple
> API made for filenames we are fine to talk of canonical file names. But if
> handling names with a general Unicode API, we need to understand that this
> means "canonical decomposition" rather than "canonical composition".
> 
> On 21 March 2017 at 11:03,  wrote:
> 
>> 
>>> What Apple suggested is to Unicode-normalize the filename before adding
>> it to the URL. Did you try doing that?
>> 
>> I’m trying to find out what that means.



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-21 Thread davelist

> On Mar 21, 2017, at 8:33 AM, Jean-Daniel  wrote:
> 
> 
>> On 21 Mar 2017, at 12:03, davel...@mac.com wrote:
>> 
>>> 
>>> On Mar 21, 2017, at 1:06 AM, Jens Alfke  wrote:
>>> 
>>> 
>>>> On Mar 20, 2017, at 2:23 PM, davel...@mac.com wrote:
>>>> 
>>>> NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];
>>> 
>>> There’s nothing wrong with that call; it’s the canonical way to add a path 
>>> component to a URL, filesystem or not.
>>> 
>>>> NSURL *url = [NSURL fileURLWithFileSystemRepresentation:[name 
>>>> fileSystemRepresentation] isDirectory:YES relativeToURL:[self 
>>>> courseDirectory]];
>>> 
>>> This call doesn’t make sense. You’re converting to filesystem 
>>> representation and then back again, for no reason.
>>> 
>>> What Apple suggested is to Unicode-normalize the filename before adding it 
>>> to the URL. Did you try doing that?
>>> 
>>> —Jens
>> 
>> Jens,
>> 
>> I’m trying to find out what that means. Someone suggested off-list to me 
>> that I should be calling this:
>> 
>> https://developer.apple.com/reference/foundation/nsstring/1409474-decomposedstringwithcanonicalmap?language=objc
>> 
>> Is that correct?
>> 
>> So based on that, I think it means I should do:
>> 
>> NSURL *url = [[self courseDirectory] 
>> URLByAppendingPathComponent:name.decomposedStringWithCanonicalMapping];
>> 
>> Thanks,
>> Dave Reed
>> 
> 
> This is what the reply suggests, but it makes no sense to me. If you are 
> accessing the file using a URL, it is the framework's job to convert the 
> URL into the right file system representation.
> 
> That said, if using name.decomposedStringWithCanonicalMapping fixes your 
> problem, go with it.

Unfortunately, I can't tell. One user told me he couldn't open his files (with 
Arabic names) after upgrading to iOS 10.3 public beta. He then changed the 
names to English names and was able to open the files. He sent me some sample 
files that I can open (but they were zipped by my app and then unzipped by me) 
so I honestly can't tell if any of these changes have different effects. With 
all the options I've tried, I can open the files on my 10.2 and my 10.3 
developer device.

If he creates files with Arabic names on 10.3, he can open them, so something 
happened in the 10.2 to 10.3 upgrade (which is why I suspect APFS) that 
prevents files created with 10.2 from being opened with 10.3.

My plan to test it is to take a file created with 10.2 using an Arabic name and 
see what happens when I upgrade my regular phone to 10.3, but obviously that's 
time consuming and I only plan to do that upgrade once since it's a pain to 
downgrade (and I'm not certain it would be possible with the APFS change).

That's why I keep emailing about this - I can't test it and can't find specific 
documentation from Apple that says exactly how to create a NSURL from a 
NSString that may contain Unicode characters.

Thanks,
Dave Reed








Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-21 Thread Aandi Inston
Is the question, what is canonical mapping? I'm going to assume it is, so I
can share what I found when I hit much the same issue. This is mostly from
memory so let's hope it's right.

Take the word Café. How many Unicode characters is this and what are they?
Turns out there are two answers. The last character as seen on screen is a
lower case e with an acute accent.
Let's ignore C,a,f as they are the same in all answers.  First answer: é is
'LATIN SMALL LETTER E WITH ACUTE' (U+00E9). We'll call this "composed". In
UTF-8 that's two bytes, 0xC3 0xA9. (This is the answer you'd often get, but
it's not the only answer, and not the one Apple filesystems like.)

Second answer uses an accent character. These are designed to appear in the
same space as another character. So combine "e" and an acute accent (like a
floating, slanted apostrophe) and we have "é". This means you could get the
same result from the two Unicode characters LATIN SMALL LETTER E (U+0065)
COMBINING ACUTE ACCENT (U+0301). We'll call this "decomposed". In UTF-8
that would be 0x65 0xCC 0x81: three bytes, two characters, combine to a
single character. (This is the one Apple filesystems like).

When you're typing in a word processor, or showing an alert, it hardly
matters how you create the e acute. Both look the same. But searching may
be a problem (not discussed) as may showing items in alphabetical order
(also not discussed).

Let's imagine now we have a filename Café. This could be represented in
UTF-8 bytes as 0x43 0x61 0x66 0xC3 0xA9 (composed), or as 0x43 0x61 0x66
0x65 0xCC 0x81 (decomposed). But ultimately there needs to be a set of bits
on disk, in a directory, saying the name of the file. When searching for a
file we could have three choices (a) these two composed/decomposed are
separate file names for two distinct files - whose name will look the same
(b) these are the same file, which means all file access by name, and
searching has to compose or decompose for comparison purposes (c) only one
is allowed and the other is rejected or invalid.

Where are we? A bit of (b) and a bit of (c). Finder and file dialogs always
decompose what is typed, and this is stored as the string of bits giving
the file name. It seems that some APIs will automatically decompose their
input, and others won't, and we may be in transition [to judge from the bug
response]. So for safety, use a method that decomposes. (Unicode defines at
least two other types of de/composition, not discussed).

Apple calls decomposed "canonical". This is fine, except that Unicode
refers to both "canonical decomposition" (what Apple filenames need) and
"canonical composition" (the opposite). So if handling names via an Apple
API made for filenames we are fine to talk of canonical file names. But if
handling names with a general Unicode API, we need to understand that this
means "canonical decomposition" rather than "canonical composition".
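
A minimal C sketch of the composed/decomposed difference described above,
assuming nothing beyond the byte values given in this message. The names
kCafeComposed, kCafeDecomposed, names_match_bytewise and dump_name_bytes are
made up for illustration; they are not part of any Apple API. The sketch does
no normalization. It only shows that the two spellings of "Café" are different
byte strings, which is all a file system that does not normalize will see.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* "Café" with é as the single code point U+00E9 (UTF-8: C3 A9). */
static const char kCafeComposed[] = "Caf\xC3\xA9";
/* "Café" with é as U+0065 followed by U+0301 (UTF-8: 65 CC 81). */
static const char kCafeDecomposed[] = "Cafe\xCC\x81";

/* Byte-for-byte comparison with no normalization, which is how a
   "bag of bytes" file system would compare two names. */
bool names_match_bytewise(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}

/* Print a name's bytes in hex so two look-alike names can be compared by eye. */
void dump_name_bytes(const char *label, const char *name) {
    printf("%s:", label);
    for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; p++) {
        printf(" 0x%02X", *p);
    }
    printf("\n");
}
```

Calling dump_name_bytes("composed", kCafeComposed) prints
"composed: 0x43 0x61 0x66 0xC3 0xA9", and
names_match_bytewise(kCafeComposed, kCafeDecomposed) returns false even though
both names look identical on screen.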

On 21 March 2017 at 11:03,  wrote:

>
> > What Apple suggested is to Unicode-normalize the filename before adding
> it to the URL. Did you try doing that?
>
> I’m trying to find out what that means.

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-21 Thread Jean-Daniel

> On 21 Mar 2017, at 12:03, davel...@mac.com wrote:
> 
>> 
>> On Mar 21, 2017, at 1:06 AM, Jens Alfke  wrote:
>> 
>> 
>>> On Mar 20, 2017, at 2:23 PM, davel...@mac.com wrote:
>>> 
>>> NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];
>> 
>> There’s nothing wrong with that call; it’s the canonical way to add a path 
>> component to a URL, filesystem or not.
>> 
>>> NSURL *url = [NSURL fileURLWithFileSystemRepresentation:[name 
>>> fileSystemRepresentation] isDirectory:YES relativeToURL:[self 
>>> courseDirectory]];
>> 
>> This call doesn’t make sense. You’re converting to filesystem representation 
>> and then back again, for no reason.
>> 
>> What Apple suggested is to Unicode-normalize the filename before adding it 
>> to the URL. Did you try doing that?
>> 
>> —Jens
> 
> Jens,
> 
> I’m trying to find out what that means. Someone suggested off-list to me that 
> I should be calling this:
> 
> https://developer.apple.com/reference/foundation/nsstring/1409474-decomposedstringwithcanonicalmap?language=objc
>  
> 
> 
> Is that correct?
> 
> So based on that, I think it means I should do:
> 
> NSURL *url = [[self courseDirectory] 
> URLByAppendingPathComponent:name.decomposedStringWithCanonicalMapping];
> 
> Thanks,
> Dave Reed
> 


This is what the reply suggests, but it makes no sense to me. If you are 
accessing the file using a URL, it is the framework's job to convert the URL 
into the right file system representation.

That said, if using name.decomposedStringWithCanonicalMapping fixes your 
problem, go with it.



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-21 Thread davelist

> On Mar 21, 2017, at 1:06 AM, Jens Alfke  wrote:
> 
> 
>> On Mar 20, 2017, at 2:23 PM, davel...@mac.com wrote:
>> 
>> NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];
> 
> There’s nothing wrong with that call; it’s the canonical way to add a path 
> component to a URL, filesystem or not.
> 
>> NSURL *url = [NSURL fileURLWithFileSystemRepresentation:[name 
>> fileSystemRepresentation] isDirectory:YES relativeToURL:[self 
>> courseDirectory]];
> 
> This call doesn’t make sense. You’re converting to filesystem representation 
> and then back again, for no reason.
> 
> What Apple suggested is to Unicode-normalize the filename before adding it to 
> the URL. Did you try doing that?
> 
> —Jens

Jens,

I’m trying to find out what that means. Someone suggested off-list to me that I 
should be calling this:

https://developer.apple.com/reference/foundation/nsstring/1409474-decomposedstringwithcanonicalmap?language=objc

Is that correct?

So based on that, I think it means I should do:

NSURL *url = [[self courseDirectory] 
URLByAppendingPathComponent:name.decomposedStringWithCanonicalMapping];

Thanks,
Dave Reed



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-20 Thread Jens Alfke

> On Mar 20, 2017, at 2:23 PM, davel...@mac.com wrote:
> 
> NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];

There’s nothing wrong with that call; it’s the canonical way to add a path 
component to a URL, filesystem or not.

> NSURL *url = [NSURL fileURLWithFileSystemRepresentation:[name 
> fileSystemRepresentation] isDirectory:YES relativeToURL:[self 
> courseDirectory]];

This call doesn’t make sense. You’re converting to filesystem representation 
and then back again, for no reason.

What Apple suggested is to Unicode-normalize the filename before adding it to 
the URL. Did you try doing that?

—Jens

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-20 Thread Jean-Daniel

> On 7 Mar 2017, at 21:03, davel...@mac.com wrote:
> 
>> 
>> On Mar 7, 2017, at 1:19 PM, Alastair Houghton  
>> wrote:
>> 
>> On 7 Mar 2017, at 12:47, Jean-Daniel  wrote:
>>> 
>>> Did you try to use NSString -fileSystemRepresentation instead of UTF-8, or 
>>> even better, use URL. While using UTF-8 for path worked well on HFS+, It 
>>> was never guaranteed to work on all FS.
>> 
>> FWIW, the macOS kernel does use UTF-8 at the VFS interface (and therefore 
>> the BSD syscalls that take path arguments expect UTF-8).  This is different 
>> to most other UNIXen (which tend to treat paths as a bunch of bytes, at 
>> least at syscall level and often at filesystem level too).  It’s definitely 
>> the case that for the built-in FAT, NTFS and HFS+ implementations, UTF-8 
>> will work.  Other filesystem implementations really *should* be treating 
>> what they get as UTF-8 too, but obviously that’s not guaranteed.
>> 
>> AFAIK all -fileSystemRepresentation does is it processes the Unicode string 
>> according to the rules in TN1150 and then convert to UTF-8; but you don’t 
>> actually *need* to do the HFS+ mapping (TN1150) before calling the BSD API 
>> (and it doesn’t even make any sense to do so unless the filesystem is HFS+, 
>> which -fileSystemRepresentation has no way of knowing).  The main benefit is 
>> that the result will compare bytewise equal with a filename read from the 
>> filesystem (assuming HFS+).  On other filesystems, well, things are 
>> different.  VFAT and later variants store UTF-16, as does NTFS, but the 
>> rules in both cases differ.  ExtFS, UFS et al. tend to regard filenames as a 
>> bunch of bytes and don’t even try to record what encoding was used.  I don’t 
>> know what ZFS, XFS or JFS do; using Unicode at filesystem level on a 
>> UNIX-like system is not unproblematic (because it may very well *not* be the 
>> same encoding being used at the user’s terminal), but equally the bunch of 
>> bytes approach creates all kinds of fun (you may *see* a file with a 
>> particular name, but you can’t necessarily name it yourself from the 
>> keyboard...)
>> 
>> Not that I’d recommend *not* using -fileSystemRepresentation; Apple says we 
>> should, so we should.  I’m just observing that it isn’t a particularly good 
>> API and in future it’ll either be deprecated or do the exact same thing as 
>> -UTF8String because there’s really no other good option I can see.
>> 
>> Kind regards,
>> 
>> Alastair.
> 
> I saw the other posts about fileSystemRepresentation and tried it, but the 
> code I posted earlier in the thread didn’t have any effect.
> 
> My app has the option to zip up the directories UIManagedDocument creates and 
> email it (so users can back up their data or share it with others). The 
> person sent it to me. Below is what I did in the Terminal so you can see what 
> happens when I try to unzip it. If this doesn’t come through on the email 
> list with the characters looking correct, I can screenshot it.
> 
> This is one of the data files that was created on iOS 10.2 and then won’t 
> open now on an iOS 10.3 device. It appears the directory name and zip file 
> name do not match and it won’t unzip correctly. It does create a directory 
> but the directory is empty instead of containing the StoreContent and 
> persistentStore files. The zip file is 34KB so it may or may not actually 
> have the data in it.
> 
> $ ls
> إعلام.zip
> 
> $ unzip *.zip
> Archive:  إعلام.zip
> checkdir error:  cannot create Ϻ+?Ϧ+?Ϻ+?/StoreContent
> No such file or directory
> unable to process Ϻ+?Ϧ+?Ϻ+?/StoreContent/persistentStore.
> checkdir error:  cannot create Ϻ+?Ϧ+?Ϻ+?/StoreContent
> No such file or directory
> unable to process Ϻ+?Ϧ+?Ϻ+?/StoreContent/persistentStore-shm.
> checkdir error:  cannot create Ϻ+?Ϧ+?Ϻ+?/StoreContent
> No such file or directory
> unable to process Ϻ+?Ϧ+?Ϻ+?/StoreContent/persistentStore-wal.
> 
> $ ls
> ?+%F2Ϧ+%E4?+%E0/  إعلام.zip
> 
> When you unzip it, it should create a directory with the exact same name as 
> the .zip file (just without the .zip extension). 
> 
> This may be enough information that it’s worth filing a bug report now. Does 
> anyone have any other suggestions? Again, creating an Arabic file on iOS 10.3 
> works fine, but these ones that were created on 10.2 do not open on 10.3.

If new files work but old ones are broken, it is probably a FS migration 
issue. It looks like something went wrong while converting HFS+ to APFS, and 
I’m not sure you can do anything about it (other than file a bug report with 
Apple).



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-20 Thread davelist

I received a polite reply from my bug stating:

"iOS HFS Normalized UNICODE names , APFS now treats all files as a bag of bytes 
on iOS . We are requesting that Applications developers call the correct 
Normalization routines to make sure the file name contains the correct  
representation."

I'm having trouble finding the canonical documentation from Apple stating what 
to do, but from looking through the NSURL documentation, I think the correct 
replacement for:

NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];

where [self courseDirectory] is a URL of a directory (with an English name 
created by the app) in the Documents folder. The variable "name" is a NSString 
that is from the user (with just basic sanitizing to replace "/" with "-"). 
Note: this is iOS.

is:

NSURL *url = [NSURL fileURLWithFileSystemRepresentation:[name 
fileSystemRepresentation] isDirectory:YES relativeToURL:[self courseDirectory]];

Can anyone confirm this is correct?

Thanks,
Dave


> On Mar 12, 2017, at 5:00 PM, David Reed <dave...@mac.com> wrote:
> 
> 
> Hi Uli,
> 
> The code to create the URL was using:
> 
>    NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];
> 
> where [self courseDirectory] is a URL of a directory (with an English name 
> created by the app) in the Documents folder. The variable "name" is a 
> NSString that is from the user (with just basic sanitizing to replace "/" 
> with "-"). Note: this is iOS.
> 
> So I wasn't using UTF8String or fileSystemRepresentation.
> 
> Someone claimed I should be using fileSystemRepresentation and someone else 
> said it shouldn't matter. If anyone has the definitive answer as to what I 
> should change that to, I'm happy to use it (although it may be too late now).
> 
> Thanks,
> Dave Reed
> 
> 
>> On Mar 12, 2017, at 8:25 AM, Uli Kusterer <witness.of.teacht...@gmx.net> 
>> wrote:
>> 
>> I can't find the start of this thread, but this sounds a lot like you were 
>> using -UTF8String instead of -fileSystemRepresentation to save out your file 
>> names. That's the main difference between those two calls: 
>> -fileSystemRepresentation decomposes UTF8 the way HFS+ does, so should never 
>> adopt newer decompositions, and will instead guarantee the same string will 
>> decompose the same way — as long as you don't forget to use it somewhere.
>> 
>> Of course, if you are using command line tools, they might not be properly 
>> normalizing the file names.
>> 
>> Apologies if this was already covered in the lost beginning of this thread.
>> 
>> Cheers,
>> -- Uli Kusterer
>> "The Witnesses of TeachText are everywhere..."
>> http://www.zathras.de
>> 
>>> On 8 Mar 2017, at 22:35, Peter Edberg <pedb...@apple.com> wrote:
>>> 
>>> 
>>>> On Mar 8, 2017, at 12:00 PM, cocoa-dev-requ...@lists.apple.com wrote:
>>>> 
>>>> Message: 1
>>>> Date: Tue, 07 Mar 2017 15:03:41 -0500
>>>> From: davel...@mac.com
>>>> To: Alastair Houghton <alast...@alastairs-place.net>, David Duncan
>>>> <david.dun...@apple.com>
>>>> Cc: cocoa-dev list <cocoa-dev@lists.apple.com>
>>>> Subject: Re: Unicode filenames with Apple File System and
>>>> UIManagedDocument
>>>> 
>>>> 
>>>> 
>>>> My app has the option to zip up the directories UIManagedDocument creates 
>>>> and email it (so users can back up their data or share it with others). 
>>>> The person sent it to me. Below is what I did in the Terminal so you can 
>>>> see what happens when I try to unzip it. If this doesn’t come through on 
>>>> the email list with the characters looking correct, I can screenshot it.
>>>> 
>>>> This is one of the data files that was created on iOS 10.2 and then won’t 
>>>> open now on an iOS 10.3 device. It appears the directory name and zip file 
>>>> name do not match and it won’t unzip correctly. It does create a directory 
>>>> but the directory is empty instead of containing the StoreContent and 
>>>> persistentStore files. The zip file is 34KB so it may or may not actually 
>>>> have the data in it.
>>>> 
>>>> $ ls
>>>> إعلام.zip
>>> 
>>> 
>>> It is probably worth noting that the first Arabic character in the above 
>>> filename (i.e. the one that appears on the right, adjacent to the period) 
>>> has a canonical decomposition, as per this line from UnicodeData.txt 
>

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-12 Thread davelist
(resent from address that is subscribed to the list)

Hi Uli, 

The code to create the URL was using:

   NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];

where [self courseDirectory] is a URL of a directory (with an English name 
created by the app) in the Documents folder. The variable "name" is a NSString 
that is from the user (with just basic sanitizing to replace "/" with "-"). 
Note: this is iOS.

So I wasn't using UTF8String or fileSystemRepresentation.

Someone claimed I should be using fileSystemRepresentation and someone else 
said it shouldn't matter. If anyone has the definitive answer as to what I 
should change that to, I'm happy to use it (although it may be too late now).

Thanks,
Dave Reed

> On Mar 12, 2017, at 8:25 AM, Uli Kusterer <witness.of.teacht...@gmx.net> 
> wrote:
> 
> I can't find the start of this thread, but this sounds a lot like you were 
> using -UTF8String instead of -fileSystemRepresentation to save out your file 
> names. That's the main difference between those two calls: 
> -fileSystemRepresentation decomposes UTF8 the way HFS+ does, so should never 
> adopt newer decompositions, and will instead guarantee the same string will 
> decompose the same way — as long as you don't forget to use it somewhere.
> 
> Of course, if you are using command line tools, they might not be properly 
> normalizing the file names.
> 
> Apologies if this was already covered in the lost beginning of this thread.
> 
> Cheers,
> -- Uli Kusterer
> "The Witnesses of TeachText are everywhere..."
> http://www.zathras.de
> 
>> On 8 Mar 2017, at 22:35, Peter Edberg <pedb...@apple.com> wrote:
>> 
>> 
>>> On Mar 8, 2017, at 12:00 PM, cocoa-dev-requ...@lists.apple.com wrote:
>>> 
>>> Message: 1
>>> Date: Tue, 07 Mar 2017 15:03:41 -0500
>>> From: davel...@mac.com
>>> To: Alastair Houghton <alast...@alastairs-place.net>,   David Duncan
>>> <david.dun...@apple.com>
>>> Cc: cocoa-dev list <cocoa-dev@lists.apple.com>
>>> Subject: Re: Unicode filenames with Apple File System and
>>> UIManagedDocument
>>> 
>>> 
>>> 
>>> My app has the option to zip up the directories UIManagedDocument creates 
>>> and email it (so users can back up their data or share it with others). The 
>>> person sent it to me. Below is what I did in the Terminal so you can see 
>>> what happens when I try to unzip it. If this doesn’t come through on the 
>>> email list with the characters looking correct, I can screenshot it.
>>> 
>>> This is one of the data files that was created on iOS 10.2 and then won’t 
>>> open now on an iOS 10.3 device. It appears the directory name and zip file 
>>> name do not match and it won’t unzip correctly. It does create a directory 
>>> but the directory is empty instead of containing the StoreContent and 
>>> persistentStore files. The zip file is 34KB so it may or may not actually 
>>> have the data in it.
>>> 
>>> $ ls
>>> إعلام.zip
>> 
>> 
>> It is probably worth noting that the first Arabic character in the above 
>> filename (i.e. the one that appears on the right, adjacent to the period) 
>> has a canonical decomposition, as per this line from UnicodeData.txt 
>> (http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt 
>> <http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt>):
>> 0625;ARABIC LETTER ALEF WITH HAMZA BELOW;Lo;0;AL;0627 0655;...
>> 
>> That is, in some cases this character 0625 (UTF8: D8 A5)  will be converted 
>> to the sequence 0627 0655 (UTF8: D8 A7 D9 95).
>> 
>> This decomposition was introduced in Unicode 3.0. If there are processes 
>> that use decomposition according to Unicode 9 versus Unicode 2.x, or 
>> processes that don't decompose versus ones that do, then the filename bytes 
>> will be different.
>> 
>> - Peter E
>> 
>> 
>> 

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-12 Thread Uli Kusterer
I can't find the start of this thread, but this sounds a lot like you were 
using -UTF8String instead of -fileSystemRepresentation to save out your file 
names. That's the main difference between those two calls: 
-fileSystemRepresentation decomposes UTF8 the way HFS+ does, so should never 
adopt newer decompositions, and will instead guarantee the same string will 
decompose the same way — as long as you don't forget to use it somewhere.
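
As a concrete check on why "the same string" can end up as different bytes,
here is a small C sketch of a UTF-8 encoder applied to the ARABIC LETTER ALEF
WITH HAMZA BELOW case from Peter Edberg's message quoted below. The function
names utf8_encode and utf8_encode_sequence are hypothetical helpers, not a
system API, and the sketch performs no normalization itself. It only shows
that U+0625 and its canonical decomposition U+0627 U+0655 encode to different
byte sequences.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
   Returns the number of bytes written, or 0 for an invalid code point. */
size_t utf8_encode(uint32_t cp, unsigned char *out) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return 0; /* surrogates are not valid scalar values */
        }
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Encode count code points back to back; out must hold 4 * count bytes.
   Returns the total number of bytes written. */
size_t utf8_encode_sequence(const uint32_t *cps, size_t count, unsigned char *out) {
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += utf8_encode(cps[i], out + total);
    }
    return total;
}
```

Encoding U+0625 gives the two bytes D8 A5, while encoding the sequence
U+0627 U+0655 gives the four bytes D8 A7 D9 95, so a byte-wise lookup of one
form will not find a file stored under the other.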

Of course, if you are using command line tools, they might not be properly 
normalizing the file names.

Apologies if this was already covered in the lost beginning of this thread.

Cheers,
-- Uli Kusterer
"The Witnesses of TeachText are everywhere..."
http://www.zathras.de

> On 8 Mar 2017, at 22:35, Peter Edberg <pedb...@apple.com> wrote:
> 
> 
>> On Mar 8, 2017, at 12:00 PM, cocoa-dev-requ...@lists.apple.com wrote:
>> 
>> Message: 1
>> Date: Tue, 07 Mar 2017 15:03:41 -0500
>> From: davel...@mac.com
>> To: Alastair Houghton <alast...@alastairs-place.net>,David Duncan
>>  <david.dun...@apple.com>
>> Cc: cocoa-dev list <cocoa-dev@lists.apple.com>
>> Subject: Re: Unicode filenames with Apple File System and
>>  UIManagedDocument
>> 
>> 
>> 
>> My app has the option to zip up the directories UIManagedDocument creates 
>> and email it (so users can back up their data or share it with others). The 
>> person sent it to me. Below is what I did in the Terminal so you can see 
>> what happens when I try to unzip it. If this doesn’t come through on the 
>> email list with the characters looking correct, I can screenshot it.
>> 
>> This is one of the data files that was created on iOS 10.2 and then won’t 
>> open now on an iOS 10.3 device. It appears the directory name and zip file 
>> name do not match and it won’t unzip correctly. It does create a directory 
>> but the directory is empty instead of containing the StoreContent and 
>> persistentStore files. The zip file is 34KB so it may or may not actually 
>> have the data in it.
>> 
>> $ ls
>> إعلام.zip
> 
> 
> It is probably worth noting that the first Arabic character in the above 
> filename (i.e. the one that appears on the right, adjacent to the period) has 
> a canonical decomposition, as per this line from UnicodeData.txt 
> (http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt 
> <http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt>):
> 0625;ARABIC LETTER ALEF WITH HAMZA BELOW;Lo;0;AL;0627 0655;...
> 
> That is, in some cases this character 0625 (UTF8: D8 A5)  will be converted 
> to the sequence 0627 0655 (UTF8: D8 A7 D9 95).
> 
> This decomposition was introduced in Unicode 3.0. If there are processes that 
> use decomposition according to Unicode 9 versus Unicode 2.x, or processes 
> that don't decompose versus ones that do, then the filename bytes will be 
> different.
> 
> - Peter E


___

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-11 Thread davelist

> On Mar 9, 2017, at 7:31 AM, Chris Ridd  wrote:
> 
> 
>> On 8 Mar 2017, at 22:56, Peter Edberg  wrote:
>> 
>>> 
>>> On Mar 8, 2017, at 1:44 PM, David Reed  wrote:
>>> 
>>> Thanks Peter.
>>> 
>>> I am going to try to find time in the next few days to file a bug report. 
>>> I'll obviously include this information. Is there anything else you think I 
>>> should include?
>> 
>> Nothing else leaps out at me other than the usuals (system version, exact 
>> steps to repro, etc.).
>> - Peter E
> 
> The only other thing I’d do (if possible) is to use NSFileManager and 
> enumerate all the filenames in the directory containing the “bad” document. 
> Possibly the process of zipping stuff up will mangle the bytes of the 
> filename, so the more “raw” info you can get from the OS the better.
> 
> Chris

Thanks, I reported the bug: rdar://30993389

I did do a directory enumeration and the "persistentStore" files that 
UIManagedDocument creates were missing. The directory name was there as was 
the StoreContent directory inside it, but the StoreContent directory was empty.

Thanks,
Dave Reed



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-09 Thread Chris Ridd

> On 8 Mar 2017, at 22:56, Peter Edberg  wrote:
> 
>> 
>> On Mar 8, 2017, at 1:44 PM, David Reed  wrote:
>> 
>> Thanks Peter.
>> 
>> I am going to try to find time in the next few days to file a bug report. 
>> I'll obviously include this information. Is there anything else you think I 
>> should include?
> 
> Nothing else leaps out at me other than the usuals (system version, exact 
> steps to repro, etc.).
> - Peter E

The only other thing I’d do (if possible) is to use NSFileManager and enumerate 
all the filenames in the directory containing the “bad” document. Possibly the 
process of zipping stuff up will mangle the bytes of the filename, so the more 
“raw” info you can get from the OS the better.
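
A minimal sketch of such an enumeration follows. It assumes that NSFileManager hands back each name as the file system stores it, which is worth verifying on the device in question. DumpFilenames is a hypothetical helper name, not something from the app, and it lists only one directory level.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: logs every entry directly inside `directory`
// together with the UTF-16 code units of its name, so that composed and
// decomposed spellings can be told apart even when they render identically.
static void DumpFilenames(NSURL *directory) {
    NSFileManager *fm = [NSFileManager defaultManager];
    NSError *error = nil;
    NSArray<NSURL *> *entries = [fm contentsOfDirectoryAtURL:directory
                                  includingPropertiesForKeys:nil
                                                     options:0
                                                       error:&error];
    if (entries == nil) {
        NSLog(@"Could not list %@: %@", directory.path, error);
        return;
    }
    for (NSURL *entry in entries) {
        NSString *name = entry.lastPathComponent;
        NSMutableString *codeUnits = [NSMutableString string];
        for (NSUInteger i = 0; i < name.length; i++) {
            // Cast so the unichar matches the %X conversion.
            [codeUnits appendFormat:@"%04X ", (unsigned)[name characterAtIndex:i]];
        }
        NSLog(@"%@ -> %@", name, codeUnits);
    }
}
```

Calling it once on the documents directory and once on the "bad" document package would show both the package name and what is inside it.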

Chris



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-08 Thread Peter Edberg

> On Mar 8, 2017, at 1:44 PM, David Reed <dave...@mac.com> wrote:
> 
> 
>> On Mar 8, 2017, at 4:35 PM, Peter Edberg <pedb...@apple.com> wrote:
>> 
>> 
>>> On Mar 8, 2017, at 12:00 PM, cocoa-dev-requ...@lists.apple.com wrote:
>>> 
>>> Message: 1
>>> Date: Tue, 07 Mar 2017 15:03:41 -0500
>>> From: davel...@mac.com
>>> To: Alastair Houghton <alast...@alastairs-place.net>,   David Duncan
>>> <david.dun...@apple.com>
>>> Cc: cocoa-dev list <cocoa-dev@lists.apple.com>
>>> Subject: Re: Unicode filenames with Apple File System and
>>> UIManagedDocument
>>> 
>>> 
>>> 
>>> My app has the option to zip up the directories UIManagedDocument creates 
>>> and email it (so users can back up their data or share it with others). The 
>>> person sent it to me. Below is what I did in the Terminal so you can see 
>>> what happens when I try to unzip it. If this doesn’t come through on the 
>>> email list with the characters looking correct, I can screenshot it.
>>> 
>>> This is one of the data files that was created on iOS 10.2 and then won’t 
>>> open now on an iOS 10.3 device. It appears the directory name and zip file 
>>> name do not match and it won’t unzip correctly. It does create a directory 
>>> but the directory is empty instead of containing the StoreContent and 
>>> persistentStore files. The zip file is 34KB so it may or may not actually 
>>> have the data in it.
>>> 
>>> $ ls
>>> إعلام.zip
>> 
>> 
>> It is probably worth noting that the first Arabic character in the above 
>> filename (i.e. the one that appears on the right, adjacent to the period) 
>> has a canonical decomposition, as per this line from UnicodeData.txt 
>> (http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt 
>> <http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt>):
>> 0625;ARABIC LETTER ALEF WITH HAMZA BELOW;Lo;0;AL;0627 0655;...
>> 
>> That is, in some cases this character 0625 (UTF8: D8 A5)  will be converted 
>> to the sequence 0627 0655 (UTF8: D8 A7 D9 95).
>> 
>> This decomposition was introduced in Unicode 3.0. If there are processes 
>> that use decomposition according to Unicode 9 versus Unicode 2.x, or 
>> processes that don't decompose versus ones that do, then the filename bytes 
>> will be different.
>> 
>> - Peter E
> 
> Thanks Peter.
> 
> I am going to try to find time in the next few days to file a bug report. 
> I'll obviously include this information. Is there anything else you think I 
> should include?

Nothing else leaps out at me other than the usuals (system version, exact steps 
to repro, etc.).
- Peter E

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-08 Thread davelist

> On Mar 8, 2017, at 4:35 PM, Peter Edberg  wrote:
> 
> 
>> On Mar 8, 2017, at 12:00 PM, cocoa-dev-requ...@lists.apple.com wrote:
>> 
>> 
>> My app has the option to zip up the directories UIManagedDocument creates 
>> and email it (so users can back up their data or share it with others). The 
>> person sent it to me. Below is what I did in the Terminal so you can see 
>> what happens when I try to unzip it. If this doesn’t come through on the 
>> email list with the characters looking correct, I can screenshot it.
>> 
>> This is one of the data files that was created on iOS 10.2 and then won’t 
>> open now on an iOS 10.3 device. It appears the directory name and zip file 
>> name do not match and it won’t unzip correctly. It does create a directory 
>> but the directory is empty instead of containing the StoreContent and 
>> persistentStore files. The zip file is 34KB so it may or may not actually 
>> have the data in it.
>> 
>> $ ls
>> إعلام.zip
> 
> 
> It is probably worth noting that the first Arabic character in the above 
> filename (i.e. the one that appears on the right, adjacent to the period) has 
> a canonical decomposition, as per this line from UnicodeData.txt 
> (http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt 
> ):
> 0625;ARABIC LETTER ALEF WITH HAMZA BELOW;Lo;0;AL;0627 0655;...
> 
> That is, in some cases this character 0625 (UTF8: D8 A5)  will be converted 
> to the sequence 0627 0655 (UTF8: D8 A7 D9 95).
> 
> This decomposition was introduced in Unicode 3.0. If there are processes that 
> use decomposition according to Unicode 9 versus Unicode 2.x, or processes 
> that don't decompose versus ones that do, then the filename bytes will be 
> different.
> 
> - Peter E

Thanks Peter.

I am going to try to find time in the next few days to file a bug report. I'll 
obviously include this information. Is there anything else you think I should 
include?

Thanks,
Dave Reed

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-08 Thread Peter Edberg

> On Mar 8, 2017, at 12:00 PM, cocoa-dev-requ...@lists.apple.com wrote:
> 
> Message: 1
> Date: Tue, 07 Mar 2017 15:03:41 -0500
> From: davel...@mac.com
> To: Alastair Houghton <alast...@alastairs-place.net>, David Duncan
>   <david.dun...@apple.com>
> Cc: cocoa-dev list <cocoa-dev@lists.apple.com>
> Subject: Re: Unicode filenames with Apple File System and
>   UIManagedDocument
> 
> 
> 
> My app has the option to zip up the directories UIManagedDocument creates and 
> email it (so users can back up their data or share it with others). The 
> person sent it to me. Below is what I did in the Terminal so you can see what 
> happens when I try to unzip it. If this doesn’t come through on the email 
> list with the characters looking correct, I can screenshot it.
> 
> This is one of the data files that was created on iOS 10.2 and then won’t 
> open now on an iOS 10.3 device. It appears the directory name and zip file 
> name do not match and it won’t unzip correctly. It does create a directory 
> but the directory is empty instead of containing the StoreContent and 
> persistentStore files. The zip file is 34KB so it may or may not actually 
> have the data in it.
> 
> $ ls
> إعلام.zip


It is probably worth noting that the first Arabic character in the above 
filename (i.e. the one that appears on the right, adjacent to the period) has a 
canonical decomposition, as per this line from UnicodeData.txt 
(http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt):
0625;ARABIC LETTER ALEF WITH HAMZA BELOW;Lo;0;AL;0627 0655;...

That is, in some cases the character U+0625 (UTF-8: D8 A5) will be converted to 
the sequence U+0627 U+0655 (UTF-8: D8 A7 D9 95).

This decomposition was introduced in Unicode 3.0. If there are processes that 
use decomposition according to Unicode 9 versus Unicode 2.x, or processes that 
don't decompose versus ones that do, then the filename bytes will be different.
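
The difference can be made visible with a short sketch. It assumes Foundation's canonical mappings follow a Unicode version that includes this decomposition; HexUTF8 and LogNormalizationForms are hypothetical helper names used only for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the UTF-8 bytes of `s` as space-separated hex.
static NSString *HexUTF8(NSString *s) {
    NSData *data = [s dataUsingEncoding:NSUTF8StringEncoding];
    const unsigned char *bytes = data.bytes;
    NSMutableArray<NSString *> *parts = [NSMutableArray array];
    for (NSUInteger i = 0; i < data.length; i++) {
        [parts addObject:[NSString stringWithFormat:@"%02X", (unsigned)bytes[i]]];
    }
    return [parts componentsJoinedByString:@" "];
}

// Hypothetical helper: logs the bytes of a name as given, composed (NFC)
// and decomposed (NFD), so the three spellings can be compared.
static void LogNormalizationForms(NSString *name) {
    NSLog(@"as given: %@", HexUTF8(name));
    NSLog(@"NFC:      %@", HexUTF8(name.precomposedStringWithCanonicalMapping));
    NSLog(@"NFD:      %@", HexUTF8(name.decomposedStringWithCanonicalMapping));
}
```

Passing @"\u0625" to LogNormalizationForms should, going by the UnicodeData.txt line above, show D8 A5 for the composed form and D8 A7 D9 95 for the decomposed one.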

- Peter E




Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-07 Thread davelist

> On Mar 7, 2017, at 1:19 PM, Alastair Houghton wrote:
> 
> On 7 Mar 2017, at 12:47, Jean-Daniel  wrote:
>> 
>> Did you try to use NSString -fileSystemRepresentation instead of UTF-8, or 
>> even better, use URL. While using UTF-8 for path worked well on HFS+, It was 
>> never guaranteed to work on all FS.
> 
> FWIW, the macOS kernel does use UTF-8 at the VFS interface (and therefore the 
> BSD syscalls that take path arguments expect UTF-8).  This is different to 
> most other UNIXen (which tend to treat paths as a bunch of bytes, at least at 
> syscall level and often at filesystem level too).  It’s definitely the case 
> that for the built-in FAT, NTFS and HFS+ implementations, UTF-8 will work.  
> Other filesystem implementations really *should* be treating what they get as 
> UTF-8 too, but obviously that’s not guaranteed.
> 
> AFAIK all -fileSystemRepresentation does is process the Unicode string 
> according to the rules in TN1150 and then convert it to UTF-8; but you don’t 
> actually *need* to do the HFS+ mapping (TN1150) before calling the BSD API 
> (and it doesn’t even make any sense to do so unless the filesystem is HFS+, 
> which -fileSystemRepresentation has no way of knowing).  The main benefit is 
> that the result will compare bytewise equal with a filename read from the 
> filesystem (assuming HFS+).  On other filesystems, well, things are 
> different.  VFAT and later variants store UTF-16, as does NTFS, but the rules 
> in both cases differ.  ExtFS, UFS et al. tend to regard filenames as a bunch 
> of bytes and don’t even try to record what encoding was used.  I don’t know 
> what ZFS, XFS or JFS do; using Unicode at filesystem level on a UNIX-like 
> system is not unproblematic (because it may very well *not* be the same 
> encoding being used at the user’s terminal), but equally the bunch of bytes 
> approach creates all kinds of fun (you may *see* a file with a particular 
> name, but you can’t necessarily name it yourself from the keyboard...)
> 
> Not that I’d recommend *not* using -fileSystemRepresentation; Apple says we 
> should, so we should.  I’m just observing that it isn’t a particularly good 
> API and in future it’ll either be deprecated or do the exact same thing as 
> -UTF8String because there’s really no other good option I can see.
> 
> Kind regards,
> 
> Alastair.

I saw the other posts about fileSystemRepresentation and tried the code I 
posted earlier in the thread, but it didn’t have any effect.

My app has the option to zip up the directories UIManagedDocument creates and 
email it (so users can back up their data or share it with others). The person 
sent it to me. Below is what I did in the Terminal so you can see what happens 
when I try to unzip it. If this doesn’t come through on the email list with the 
characters looking correct, I can screenshot it.

This is one of the data files that was created on iOS 10.2 and then won’t open 
now on an iOS 10.3 device. It appears the directory name and zip file name do 
not match and it won’t unzip correctly. It does create a directory but the 
directory is empty instead of containing the StoreContent and persistentStore 
files. The zip file is 34KB so it may or may not actually have the data in it.

$ ls
إعلام.zip

$ unzip *.zip
Archive:  إعلام.zip
checkdir error:  cannot create Ϻ+?Ϧ+?Ϻ+?/StoreContent
 No such file or directory
 unable to process Ϻ+?Ϧ+?Ϻ+?/StoreContent/persistentStore.
checkdir error:  cannot create Ϻ+?Ϧ+?Ϻ+?/StoreContent
 No such file or directory
 unable to process Ϻ+?Ϧ+?Ϻ+?/StoreContent/persistentStore-shm.
checkdir error:  cannot create Ϻ+?Ϧ+?Ϻ+?/StoreContent
 No such file or directory
 unable to process Ϻ+?Ϧ+?Ϻ+?/StoreContent/persistentStore-wal.

 $ ls
?+%F2Ϧ+%E4?+%E0/إعلام.zip

When you unzip it, it should create a directory with the exact same name as the 
.zip file (just without the .zip extension). 

This may be enough information that it’s worth filing a bug report now. Does 
anyone have any other suggestions? Again, creating an Arabic file on iOS 10.3 
works fine, but these ones that were created on 10.2 do not open on 10.3.
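
If the cause is a mismatch between the normalization of the stored name and the name the app asks for, one possible workaround is to look the file up by canonical equivalence instead of by exact bytes. The sketch below is only that, under the stated assumption; FindEntryIgnoringNormalization is a hypothetical helper, and it compares names after composing both sides (NFC).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the URL of the entry in `directory` whose
// name is canonically equivalent to `wantedName`, or nil if none matches.
static NSURL *FindEntryIgnoringNormalization(NSURL *directory, NSString *wantedName) {
    NSString *wanted = wantedName.precomposedStringWithCanonicalMapping; // NFC
    NSArray<NSURL *> *entries =
        [[NSFileManager defaultManager] contentsOfDirectoryAtURL:directory
                                      includingPropertiesForKeys:nil
                                                         options:0
                                                           error:NULL];
    // `entries` is nil when the listing fails; the loop then does nothing.
    for (NSURL *entry in entries) {
        NSString *candidate = entry.lastPathComponent.precomposedStringWithCanonicalMapping;
        if ([candidate isEqualToString:wanted]) {
            return entry; // this URL carries the spelling reported for the entry
        }
    }
    return nil;
}
```

The returned URL, rather than one rebuilt from the user's string, would then be handed to UIManagedDocument.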

Thanks,
Dave Reed






Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-07 Thread Alastair Houghton
On 7 Mar 2017, at 12:47, Jean-Daniel  wrote:
> 
> Did you try to use NSString -fileSystemRepresentation instead of UTF-8, or 
> even better, use URL. While using UTF-8 for path worked well on HFS+, It was 
> never guaranteed to work on all FS.

FWIW, the macOS kernel does use UTF-8 at the VFS interface (and therefore the 
BSD syscalls that take path arguments expect UTF-8).  This is different to most 
other UNIXen (which tend to treat paths as a bunch of bytes, at least at 
syscall level and often at filesystem level too).  It’s definitely the case 
that for the built-in FAT, NTFS and HFS+ implementations, UTF-8 will work.  
Other filesystem implementations really *should* be treating what they get as 
UTF-8 too, but obviously that’s not guaranteed.

AFAIK all -fileSystemRepresentation does is process the Unicode string 
according to the rules in TN1150 and then convert it to UTF-8; but you don’t 
actually *need* to do the HFS+ mapping (TN1150) before calling the BSD API (and 
it doesn’t even make any sense to do so unless the filesystem is HFS+, which 
-fileSystemRepresentation has no way of knowing).  The main benefit is that the 
result will compare bytewise equal with a filename read from the filesystem 
(assuming HFS+).  On other filesystems, well, things are different.  VFAT and 
later variants store UTF-16, as does NTFS, but the rules in both cases differ.  
ExtFS, UFS et al. tend to regard filenames as a bunch of bytes and don’t even 
try to record what encoding was used.  I don’t know what ZFS, XFS or JFS do; 
using Unicode at filesystem level on a UNIX-like system is not unproblematic 
(because it may very well *not* be the same encoding being used at the user’s 
terminal), but equally the bunch of bytes approach creates all kinds of fun 
(you may *see* a file with a particular name, but you can’t necessarily name it 
yourself from the keyboard...)

Not that I’d recommend *not* using -fileSystemRepresentation; Apple says we 
should, so we should.  I’m just observing that it isn’t a particularly good API 
and in future it’ll either be deprecated or do the exact same thing as 
-UTF8String because there’s really no other good option I can see.

Kind regards,

Alastair.

--
http://alastairs-place.net



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-07 Thread Alastair Houghton
On 7 Mar 2017, at 18:00, Jens Alfke  wrote:
> 
> On Mar 7, 2017, at 8:13 AM, davel...@mac.com wrote:
>> 
>>   NSFileManager *fm = [[NSFileManager alloc] init];
>>   const char *data = [name fileSystemRepresentation];
>>   NSString *filename = [fm stringWithFileSystemRepresentation:data 
>> length:strlen(data)];
> 
> This is a no-op, since you’re calling two methods that perform inverse 
> operations — `filename` will end up being identical to `name`.

AFAIK it isn’t a no-op (at least at present).  Right now, I rather suspect 
it’ll lead to decomposition and the replacement of Hangul characters in the 
range U+AC00 through U+D7A3, according to the rules in TN1150.

Unfortunately, because -fileSystemRepresentation doesn’t really know the 
underlying filesystem, this may or may not be appropriate, and I’d expect *in 
future* at some point the above really will be a no-op as a result.

Kind regards,

Alastair.

--
http://alastairs-place.net



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-07 Thread Jens Alfke

> On Mar 7, 2017, at 8:13 AM, davel...@mac.com wrote:
> 
>NSFileManager *fm = [[NSFileManager alloc] init];
>const char *data = [name fileSystemRepresentation];
>NSString *filename = [fm stringWithFileSystemRepresentation:data 
> length:strlen(data)];

This is a no-op, since you’re calling two methods that perform inverse 
operations — `filename` will end up being identical to `name`.

The only reason to use the fileSystemRepresentation methods is if you need to 
convert a filename/path to or from a C string, for example if you’re calling a 
lower-level API like fopen, or you received a filename from a C string (like a 
command line argument.)
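
As an illustration of those two directions, here is a small sketch. OpenForReading and PathFromCString are hypothetical helper names; the Foundation and stdio calls are the standard ones.

```objc
#import <Foundation/Foundation.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: opens the file at `path` through the C stdio API.
// Returns NULL if fopen fails.
static FILE *OpenForReading(NSString *path) {
    // The C string is owned by Foundation; use it right away rather than
    // storing the pointer.
    const char *cPath = path.fileSystemRepresentation;
    return fopen(cPath, "rb");
}

// Hypothetical helper: converts a C path (for example argv[1]) back into
// an NSString for use with Cocoa APIs.
static NSString *PathFromCString(const char *cPath) {
    return [[NSFileManager defaultManager] stringWithFileSystemRepresentation:cPath
                                                                       length:strlen(cPath)];
}
```

Code that stays within NSURL and NSFileManager, as in the URLByAppendingPathComponent: call, needs neither helper.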

> Before what I was doing is:
>NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];

That should work fine. If it doesn’t, it’s Apple’s bug not yours.

—Jens

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-07 Thread davelist

> On Mar 7, 2017, at 9:55 AM, davel...@mac.com wrote:
> 
> 
>> On Mar 7, 2017, at 7:47 AM, Jean-Daniel  wrote:
>> 
>> 
>>> Le 6 mars 2017 à 14:28, davel...@mac.com a écrit :
>>> 
>>> I have an iOS app (Attendance2) written in Objective-C. One of my users 
>>> upgraded to the public 10.3 iOS beta and reported he could no longer open 
>>> his documents (I have a subclass of UIManagedDocument so they are Core Data 
>>> files stored in the package/directory format that UIManagedDocument uses). 
>>> I didn’t notice any issues with my test device using the developer beta of 
>>> 10.3. He changed the file names from Arabic to Roman and then he said he 
>>> could open them.
>>> 
>>> Everything I do with NSString is via UTF8 (and it worked fine with Arabic 
>>> letters for this person before updating to the 10.3 beta) so I don’t think 
>>> I’m doing anything wrong.
>>> 
>> 
>> Did you try to use NSString -fileSystemRepresentation instead of UTF-8, or 
>> even better, use URL. While using UTF-8 for path worked well on HFS+, It was 
>> never guaranteed to work on all FS.
>> 
> 
> I’ll take a look at this today. I suspect I’m not using 
> fileSystemRepresentation. I’ll have to see when that is appropriate to use as 
> I believe I’m creating a URL from the string the user types in and then using 
> that as part of the URL for the UIManagedDocument.
> 
> The person did create a new Arabic file under 10.3 and it opens fine, but the 
> ones that we created under 10.2 won’t open under 10.3 unless the person 
> changes the name so it uses Roman/English characters.
> 
> If using fileSystemRepresentation doesn’t fix it, I’ll file a bug.
> 
> Thanks,
> Dave Reed

Is this the correct way to do it (assuming the variable name is the NSString 
with the name of the file and [self courseDirectory] is the directory the file 
is in)?

// Round-trip the name through its file-system representation (a C string).
NSFileManager *fm = [[NSFileManager alloc] init];
const char *data = [name fileSystemRepresentation];
NSString *filename = [fm stringWithFileSystemRepresentation:data
                                                     length:strlen(data)];
// Build the document URL from the round-tripped name.
NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:filename];

Before what I was doing is:
NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];

With the above changes, I can still open my files so it doesn't appear to break 
anything. I'll have to wait until I get home to see if that now lets me open 
the pre-created Arabic files (the person sent me a sample file that won't open 
in 10.3 and I can confirm it opens in 10.2 but not with my 10.3 test device) as 
I don't have Xcode 8.3 beta on my laptop and that seems to be required to load 
the data onto the 10.3 beta device.

Thanks,
Dave Reed




Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-07 Thread davelist

> On Mar 7, 2017, at 7:47 AM, Jean-Daniel  wrote:
> 
> 
>> Le 6 mars 2017 à 14:28, davel...@mac.com a écrit :
>> 
>> I have an iOS app (Attendance2) written in Objective-C. One of my users 
>> upgraded to the public 10.3 iOS beta and reported he could no longer open 
>> his documents (I have a subclass of UIManagedDocument so they are Core Data 
>> files stored in the package/directory format that UIManagedDocument uses). I 
>> didn’t notice any issues with my test device using the developer beta of 
>> 10.3. He changed the file names from Arabic to Roman and then he said he 
>> could open them.
>> 
>> Everything I do with NSString is via UTF8 (and it worked fine with Arabic 
>> letters for this person before updating to the 10.3 beta) so I don’t think 
>> I’m doing anything wrong.
>> 
> 
> Did you try to use NSString -fileSystemRepresentation instead of UTF-8, or 
> even better, use URL. While using UTF-8 for path worked well on HFS+, It was 
> never guaranteed to work on all FS.
> 

I’ll take a look at this today. I suspect I’m not using 
fileSystemRepresentation. I’ll have to see when that is appropriate to use as I 
believe I’m creating a URL from the string the user types in and then using 
that as part of the URL for the UIManagedDocument.

The person did create a new Arabic file under 10.3 and it opens fine, but the 
ones that were created under 10.2 won’t open under 10.3 unless the person 
changes the name so it uses Roman/English characters.

If using fileSystemRepresentation doesn’t fix it, I’ll file a bug.

Thanks,
Dave Reed





Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-07 Thread Jean-Daniel

> Le 6 mars 2017 à 14:28, davel...@mac.com a écrit :
> 
> I have an iOS app (Attendance2) written in Objective-C. One of my users 
> upgraded to the public 10.3 iOS beta and reported he could no longer open his 
> documents (I have a subclass of UIManagedDocument so they are Core Data files 
> stored in the package/directory format that UIManagedDocument uses). I didn’t 
> notice any issues with my test device using the developer beta of 10.3. He 
> changed the file names from Arabic to Roman and then he said he could open 
> them.
> 
> Everything I do with NSString is via UTF8 (and it worked fine with Arabic 
> letters for this person before updating to the 10.3 beta) so I don’t think 
> I’m doing anything wrong.
> 

Did you try using NSString -fileSystemRepresentation instead of UTF-8 or, even 
better, using URLs? While using UTF-8 for paths worked well on HFS+, it was 
never guaranteed to work on all file systems.




Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-06 Thread Jens Alfke

> On Mar 6, 2017, at 3:45 PM, davel...@mac.com wrote:
> 
> If I had the time and could easily do that, I would, but paring my app down 
> to just the minimal parts would be time consuming.

Please at least file the bug report, even if you can’t include a reproducible 
case. It’s very important that Apple know about any potential problems with 
APFS since it’s such a major change.

—Jens



Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-06 Thread davelist

> On Mar 6, 2017, at 5:10 PM, David Duncan  wrote:
> 
>> 
>> On Mar 6, 2017, at 2:05 PM, davel...@mac.com wrote:
>> 
>>> 
>>> On Mar 6, 2017, at 12:37 PM, Chris Ridd  wrote:
>>> 
>>> 
>>>> On 6 Mar 2017, at 13:28, davel...@mac.com wrote:
>>>> 
>>>> I have an iOS app (Attendance2) written in Objective-C. One of my users 
>>>> upgraded to the public 10.3 iOS beta and reported he could no longer open 
>>>> his documents (I have a subclass of UIManagedDocument so they are Core 
>>>> Data files stored in the package/directory format that UIManagedDocument 
>>>> uses). I didn’t notice any issues with my test device using the developer 
>>>> beta of 10.3. He changed the file names from Arabic to Roman and then he 
>>>> said he could open them.
>>>> 
>>>> Everything I do with NSString is via UTF8 (and it worked fine with Arabic 
>>>> letters for this person before updating to the 10.3 beta) so I don’t think 
>>>> I’m doing anything wrong.
>>>> 
>>>> Any suggestions?
>>> 
>>> If that iOS beta has upgraded the user’s filesystem to APFS, then it may be 
>>> an iOS bug that you need to report.
>>> 
>>> Chris
>> 
>> I'm assuming the public beta upgraded to APFS (as I believe I read the 
>> developer betas upgraded to APFS). I'm trying to figure out if this is an 
>> Apple bug (i.e., either APFS isn't handling his Arabic filenames correctly 
>> or perhaps something went wrong in the upgrade from HFS+ to APFS) or if 
>> perhaps it is a bug in my app (I doubt it, since all I'm doing is taking the 
>> NSString they enter and using it as the filename).
>> 
>> Is there anything else we could try to see which one of those it likely is? 
>> I'm going to ask him to create a new file and use an Arabic name and see if 
>> that works (i.e., was it just an issue with existing files in Arabic).
> 
> I would highly recommend you file a bug that includes enough of your code to 
> reproduce the issue now. You can update it later if you determine it is your 
> issue.
> 
> --
> David Duncan

If I had the time and could easily do that, I would, but paring my app down to 
just the minimal parts would be time consuming. It would probably be quicker to 
create a brand new app that creates a subclass of UIDocument and see if it 
happens with that. And then I need to figure out how to get my phone into an 
Arabic locale (which should be easy), but more challenging is to get back out 
since I can't read Arabic and navigating the UI settings might be challenging.

In the meantime, I've asked the person who ran into the problem to create a new 
document that uses Arabic as the filename and see if it happens with the new 
file. 

Thanks,
Dave Reed

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-06 Thread David Duncan

> On Mar 6, 2017, at 2:05 PM, davel...@mac.com wrote:
> 
>> 
>> On Mar 6, 2017, at 12:37 PM, Chris Ridd  wrote:
>> 
>> 
>>> On 6 Mar 2017, at 13:28, davel...@mac.com wrote:
>>> 
>>> I have an iOS app (Attendance2) written in Objective-C. One of my users 
>>> upgraded to the public 10.3 iOS beta and reported he could no longer open 
>>> his documents (I have a subclass of UIManagedDocument so they are Core Data 
>>> files stored in the package/directory format that UIManagedDocument uses). 
>>> I didn’t notice any issues with my test device using the developer beta of 
>>> 10.3. He changed the file names from Arabic to Roman and then he said he 
>>> could open them.
>>> 
>>> Everything I do with NSString is via UTF8 (and it worked fine with Arabic 
>>> letters for this person before updating to the 10.3 beta) so I don’t think 
>>> I’m doing anything wrong.
>>> 
>>> Any suggestions?
>> 
>> If that iOS beta has upgraded the user’s filesystem to APFS, then it may be 
>> an iOS bug that you need to report.
>> 
>> Chris
> 
> I'm assuming the public beta upgraded to APFS (as I believe I read the 
> developer betas upgraded to APFS). I'm trying to figure out if this is an 
> Apple bug (i.e., either APFS isn't handling his Arabic filenames correctly or 
> perhaps something went wrong in the upgrade from HFS+ to APFS) or if perhaps 
> it is a bug in my app (which I doubt, since all I'm doing is taking the 
> NSString they enter and using it as the filename).
> 
> Is there anything else we could try to see which one of those it likely is? 
> I'm going to ask him to create a new file and use an Arabic name and see if 
> that works (i.e., was it just an issue with existing files in Arabic).

I would highly recommend you file a bug that includes enough of your code to 
reproduce the issue now. You can update it later if you determine it is your 
issue.

> 
> Thanks,
> Dave Reed

--
David Duncan

Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-06 Thread davelist

> On Mar 6, 2017, at 12:37 PM, Chris Ridd  wrote:
> 
> 
>> On 6 Mar 2017, at 13:28, davel...@mac.com wrote:
>> 
>> I have an iOS app (Attendance2) written in Objective-C. One of my users 
>> upgraded to the public 10.3 iOS beta and reported he could no longer open 
>> his documents (I have a subclass of UIManagedDocument so they are Core Data 
>> files stored in the package/directory format that UIManagedDocument uses). I 
>> didn’t notice any issues with my test device using the developer beta of 
>> 10.3. He changed the file names from Arabic to Roman and then he said he 
>> could open them.
>> 
>> Everything I do with NSString is via UTF8 (and it worked fine with Arabic 
>> letters for this person before updating to the 10.3 beta) so I don’t think 
>> I’m doing anything wrong.
>> 
>> Any suggestions?
> 
> If that iOS beta has upgraded the user’s filesystem to APFS, then it may be 
> an iOS bug that you need to report.
> 
> Chris

I'm assuming the public beta upgraded to APFS (as I believe I read the 
developer betas upgraded to APFS). I'm trying to figure out if this is an Apple 
bug (i.e., either APFS isn't handling his Arabic filenames correctly or perhaps 
something went wrong in the upgrade from HFS+ to APFS) or if perhaps it is a 
bug in my app (which I doubt, since all I'm doing is taking the NSString they 
enter and using it as the filename).

Is there anything else we could try to see which one of those it likely is? I'm 
going to ask him to create a new file and use an Arabic name and see if that 
works (i.e., was it just an issue with existing files in Arabic).
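
One possible check, sketched here in Python rather than the app's Objective-C: 
see whether the stored name and the name on disk are the same text in 
different Unicode normalization forms. That normalization is the cause is only 
an assumption at this point. The sample name, the helper names, and the use of 
standard NFD as a stand-in for whatever HFS+ actually stored are all 
hypothetical.

```python
import unicodedata


def same_text(a, b):
    """True if a and b are canonically equivalent, whatever form each is in."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def find_on_disk(stored_name, directory_listing):
    """Return the directory entry equivalent to stored_name, or None.

    directory_listing stands in for the names the file system reports;
    a byte-for-byte lookup can miss an entry that this comparison finds.
    """
    for entry in directory_listing:
        if same_text(stored_name, entry):
            return entry
    return None


# Hypothetical Arabic document name in composed (NFC) form. Its first letter,
# U+0623 ALEF WITH HAMZA ABOVE, decomposes to U+0627 + U+0654 under NFD.
name_nfc = "\u0623\u062d\u0645\u062f"
name_nfd = unicodedata.normalize("NFD", name_nfc)

# The two strings display identically but are different code point sequences.
exact_match = name_nfc == name_nfd
equivalent = same_text(name_nfc, name_nfd)
found = find_on_disk(name_nfc, [name_nfd])
```

If an exact comparison fails where the normalized one succeeds, that would 
point at normalization rather than at Arabic text as such. A newly created 
Arabic-named file opening fine would fit the same picture.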

Thanks,
Dave Reed


Re: Unicode filenames with Apple File System and UIManagedDocument

2017-03-06 Thread Chris Ridd

> On 6 Mar 2017, at 13:28, davel...@mac.com wrote:
> 
> I have an iOS app (Attendance2) written in Objective-C. One of my users 
> upgraded to the public 10.3 iOS beta and reported he could no longer open his 
> documents (I have a subclass of UIManagedDocument so they are Core Data files 
> stored in the package/directory format that UIManagedDocument uses). I didn’t 
> notice any issues with my test device using the developer beta of 10.3. He 
> changed the file names from Arabic to Roman and then he said he could open 
> them.
> 
> Everything I do with NSString is via UTF8 (and it worked fine with Arabic 
> letters for this person before updating to the 10.3 beta) so I don’t think 
> I’m doing anything wrong.
> 
> Any suggestions?

If that iOS beta has upgraded the user’s filesystem to APFS, then it may be an 
iOS bug that you need to report.

Chris