Martynas Brijunas <mbri...@gmail.com> added the comment:

Hi Steve, Eryk,

thank you very much for looking into this. I was looking into "st_ino"
as a potential substitute for the full path of a file when it comes to
uniquely identifying that file in a database.

> ReFS uses a 128-bit file ID, which I gather consists of a 64-bit directory ID 
> and a 64-bit relative ID. (Take this with a grain of salt. AFAIK, Microsoft 
> hasn't published a spec for ReFS.) The latter is 0 for the directory itself 
> and increments by 1 for each file created in the directory, with no reuse of 
> previous values if a file is deleted or moved. If that's correct, and if 
> "test.jpg" was created in "\test", then the directory ID of "\test" is 
> 0x29d5, and the relative file ID is 0x4ae.

This assumption seems to be correct. All files within the same
directory have an identical first half of their ID, as reported by
"fsutil".

U:\test>fsutil file queryfileid test.jpg
File ID is 0x00000000000029d500000000000004ae

U:\test>fsutil file queryfileid test.nef
File ID is 0x00000000000029d50000000000000483

U:\test>fsutil file queryfileid test.ARW
File ID is 0x00000000000029d50000000000000484

U:\test>fsutil file queryfileid test.db
File ID is 0x00000000000029d50000000000000495
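
To make the two halves easier to compare, here is a minimal Python
sketch that splits each 128-bit ID reported above into its upper and
lower 64 bits. The helper name "split_file_id" is made up for this
illustration, and calling the halves "directory ID" and "relative ID"
follows Eryk's description; it is an assumption, not a documented ReFS
layout.

```python
# File IDs copied verbatim from the fsutil output above.
FSUTIL_IDS = {
    "test.jpg": 0x00000000000029d500000000000004ae,
    "test.nef": 0x00000000000029d50000000000000483,
    "test.ARW": 0x00000000000029d50000000000000484,
    "test.db": 0x00000000000029d50000000000000495,
}

MASK64 = (1 << 64) - 1


def split_file_id(file_id):
    """Split a 128-bit file ID into (upper 64 bits, lower 64 bits).

    ASSUMPTION: the upper half is the directory ID and the lower half
    is the relative file ID, as suggested in the quoted comment.
    """
    return (file_id >> 64) & MASK64, file_id & MASK64


for name, file_id in FSUTIL_IDS.items():
    directory_id, relative_id = split_file_id(file_id)
    print(name, hex(directory_id), hex(relative_id))
```

For these four files the upper half is the same value every time, and
only the lower half differs.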

>
> > >>> from pathlib import Path
> > >>> hex(Path('U:/test/test.jpg').stat().st_ino)
> > '0x4000000004ae29d5'
>
> os.stat calls WINAPI GetFileInformationByHandle, which returns a 64-bit file 
> ID. It appears that ReFS generates this ID by concatenating the relative ID 
> and directory ID in a way that is "not guaranteed to be unique" according to 
> the BY_HANDLE_FILE_INFORMATION [1] docs.

The output of "st_ino" appears to be fully consistent with "fsutil".
The only real differences (apart from the swapped order of the two
halves and the missing leading zeros in each half) is the inclusion of
a hex "4" at the very beginning of the hex sequence. But even that is
consistent, as the "4" is present in all cases.

>>> hex(Path('U:/test/test.jpg').stat().st_ino)
'0x4000000004ae29d5'
>>> hex(Path('U:/test/test.nef').stat().st_ino)
'0x40000000048329d5'
>>> hex(Path('U:/test/test.arw').stat().st_ino)
'0x40000000048429d5'
>>> hex(Path('U:/test/test.db').stat().st_ino)
'0x40000000049529d5'
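
As a rough cross-check, the sketch below rebuilds each "st_ino" value
from the corresponding 128-bit "fsutil" ID. The bit layout in
"guess_st_ino" (a leading hex 4, the relative ID shifted left by 16
bits, the directory ID in the low 16 bits) is purely an assumption
fitted to these four samples. It is not a documented ReFS or Windows
algorithm and may well not hold for other files or volumes.

```python
# (128-bit fsutil ID, st_ino) pairs copied from the output in this message.
SAMPLES = {
    "test.jpg": (0x00000000000029d500000000000004ae, 0x4000000004ae29d5),
    "test.nef": (0x00000000000029d50000000000000483, 0x40000000048329d5),
    "test.arw": (0x00000000000029d50000000000000484, 0x40000000048429d5),
    "test.db": (0x00000000000029d50000000000000495, 0x40000000049529d5),
}

MASK64 = (1 << 64) - 1


def guess_st_ino(file_id_128):
    """Guess the 64-bit st_ino from a 128-bit ReFS file ID.

    ASSUMPTION: layout inferred from four observed values only.
    """
    directory_id = (file_id_128 >> 64) & MASK64
    relative_id = file_id_128 & MASK64
    # Hypothetical packing: top nibble 4, then relative ID, then directory ID.
    return (0x4 << 60) | (relative_id << 16) | directory_id


for name, (file_id, st_ino) in SAMPLES.items():
    guess = guess_st_ino(file_id)
    print(name, hex(guess), guess == st_ino)
```

If the packing really works this way, a large enough directory ID or
relative ID would overflow its field, which would fit the "not
guaranteed to be unique" remark quoted above.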

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40095>
_______________________________________