Re: [Dorset] Paths relative to a current working directory whose path contains symbolic links

2020-11-09 Thread Hamish McIntyre-Bhatty
On 09/11/2020 17:45, Patrick Wigmore wrote:
> Thank you so much for that informative response, Ralph.
>
> So much interesting history!
>
> On Mon, 09 Nov 2020 15:00:55 +, Ralph Corderoy wrote:
>> The program can't know which.  It shouldn't try and
>> guess but instead just pass the argument to open(2), etc.
> That was pretty much my conclusion. It was interesting to look for a 
> way to do it, but I didn't really like the idea of writing a long-
> winded workaround that bypassed well-trodden standard functions.
>
> If programs can't resolve relative paths in the same way that the 
> shell does, then I might have expected the shell to turn the relative 
> paths into absolute ones before passing them as arguments to programs.
>
> But that opens up a can of worms too. The shell (probably) can't know 
> whether I intend a string that looks like a relative path to be 
> treated as a reference to a file, or as a literal string that just 
> happens to look like a path, or as a path relative to something else 
> that the program is going to know about!
>
>> If that bites the user, e.g. by trampling the wrong file, then the 
>> user will have learnt symlinks are a bad idea and alternatives
>> should be sought where possible.  :-)
> I have symbolic links to give the effect of selectively placing some 
> of the subdirectories of my home directory on an HDD. The rest of the 
> home directory is on an SSD. It has worked "well enough" for quite a 
> while now, but I suppose bind mounts might be a better idea.
>
>
> Patrick

Interesting to know about all this.

I do a similar thing to you, Patrick, but for file syncing with Nextcloud
over my home network. Unfortunately I'm not sure bind mounting would
work for me in this case, but if you have a go and it works, please let
me know :)

Hamish



-- 
  Next meeting: Online, Jitsi, Tuesday, 2020-12-01 20:00
  Check to whom you are replying
  Meetings, mailing list, IRC, ...  http://dorset.lug.org.uk
  New thread, don't hijack:  mailto:dorset@mailman.lug.org.uk


Re: [Dorset] Paths relative to a current working directory whose path contains symbolic links

2020-11-09 Thread Patrick Wigmore
Thank you so much for that informative response, Ralph.

So much interesting history!

On Mon, 09 Nov 2020 15:00:55 +, Ralph Corderoy wrote:
> The program can't know which.  It shouldn't try and
> guess but instead just pass the argument to open(2), etc.

That was pretty much my conclusion. It was interesting to look for a 
way to do it, but I didn't really like the idea of writing a long-
winded workaround that bypassed well-trodden standard functions.

If programs can't resolve relative paths in the same way that the 
shell does, then I might have expected the shell to turn the relative 
paths into absolute ones before passing them as arguments to programs.

But that opens up a can of worms too. The shell (probably) can't know 
whether I intend a string that looks like a relative path to be 
treated as a reference to a file, or as a literal string that just 
happens to look like a path, or as a path relative to something else 
that the program is going to know about!

> If that bites the user, e.g. by trampling the wrong file, then the 
> user will have learnt symlinks are a bad idea and alternatives
> should be sought where possible.  :-)

I have symbolic links to give the effect of selectively placing some 
of the subdirectories of my home directory on an HDD. The rest of the 
home directory is on an SSD. It has worked "well enough" for quite a 
while now, but I suppose bind mounts might be a better idea.


Patrick



Re: [Dorset] Paths relative to a current working directory whose path contains symbolic links

2020-11-09 Thread Ralph Corderoy
Hi Patrick,

Keith's already answered, but here's some background.

> Further suppose that, in that path, 'a' is a symbolic link:

Symbolic links were an easy hack by Berkeley but they aren't orthogonal
to the system they modified and messed up many existing things.  They're
still messed up decades later,
e.g. https://en.wikipedia.org/wiki/Symlink_race.

One thing to consider is how /bin/pwd traditionally worked.

- stat(2) the current directory, ‘.’, to learn its inode number.
- chdir(2) to the parent directory.
- readdir(2) the new current directory and look for an entry with the
  inode number of the directory we've just ascended from.

Keep repeating those stages as you work up the directory tree.  Stop
when you reach the root directory which is where ‘.’ and ‘..’ are the
same inode number.

$ ls -1di / /. /..
2 /
2 /.
2 /..
$

If in the directory /usr/share/doc/bash then this would give

bash doc share usr

There's only one possible answer this way as it ignores how you arrived
at the starting directory and instead learns of its sole canonical place
in the filesystem.  /bin/pwd still exists today and although it now
works more efficiently thanks to an extra kernel interface, it still
gives this filesystem answer, as does a POSIX ‘pwd -P’, whether built
into the shell for efficiency or not.
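
The traditional stages above can be sketched in Python.  This is only an
illustration, not how any real pwd is implemented: the name physical_cwd
is made up here, it climbs by appending ".." to a path string rather than
calling chdir(2), and it compares device numbers as well as inode numbers
so that mount points don't confuse it.

```python
import os


def _name_in_parent(parent, target):
    """Find the entry in `parent` whose (device, inode) pair is `target`."""
    with os.scandir(parent) as entries:
        for entry in entries:
            try:
                # lstat the entry itself; a symlink must not match its target.
                found = entry.stat(follow_symlinks=False)
            except OSError:
                continue
            if (found.st_dev, found.st_ino) == target:
                return entry.name
    raise FileNotFoundError("no matching entry found in " + parent)


def physical_cwd(start="."):
    """Rebuild the filesystem path of `start` by repeatedly ascending '..'."""
    names = []
    path = start
    while True:
        here = os.stat(path)
        parent = os.path.join(path, "..")
        above = os.stat(parent)
        # At the root directory '.' and '..' are the same inode.
        if (here.st_dev, here.st_ino) == (above.st_dev, above.st_ino):
            break
        names.append(_name_in_parent(parent, (here.st_dev, here.st_ino)))
        path = parent
    # Names were collected leaf first, so reverse them to print root first.
    return "/" + "/".join(reversed(names))
```

On a POSIX system, print(physical_cwd()) should name the same directory
as /bin/pwd, however the shell arrived there.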

/bin/pwd was quite expensive on old systems which used the above
traditional method, especially if all the directories from here to ‘/’
weren't in RAM thus causing disk-head seeks.  A shell's built-in pwd
could keep track of the CWD's path in the filesystem and just print the
string when asked.  That shell has to decide how to handle the ambiguity
created by symlinks: does changing into a subdirectory which is a
symbolic link tack the symlink's name onto the CWD string or is
readlink(2) used, possibly repeatedly, and the true filesystem location
maintained?
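
A rough Python sketch of that choice, assuming a hypothetical TrackedCwd
class that models only the shell's string bookkeeping (it never calls
chdir itself).  With physical=False the symlink's name is tacked on and
".." is cancelled textually; with physical=True every symlink is
followed, here via os.path.realpath rather than hand-rolled readlink(2)
calls.

```python
import os
import posixpath


class TrackedCwd:
    """Hypothetical model of a shell remembering its CWD as a string."""

    def __init__(self, path, physical=False):
        self.physical = physical
        self.path = self._clean(path)

    def _clean(self, path):
        if self.physical:
            # Follow every symlink to the true filesystem location.
            return os.path.realpath(path)
        # Purely textual: drop '.' and cancel 'name/..' pairs.
        return posixpath.normpath(path)

    def cd(self, arg):
        """Update the remembered path as if the user had typed `cd arg`."""
        self.path = self._clean(posixpath.join(self.path, arg))
        return self.path
```

After cd("link") followed by cd(".."), the textual tracker is back where
it started, while the physical tracker is in the parent of wherever the
link pointed.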

> In Python, it seems as though the way to eliminate the discrepancy 
> while maintaining cross-platform compatibility is to add a check to 
> see whether the program is running under a shell that sets $PWD (see 
> bash(1)), or on a system that has pwd(1), and, if it is, then resolve 
> any relative paths relative to $PWD or the output of pwd, rather than 
> relative to os.getcwd().

If I cd into /a/symlink/b and pass ../../c to a program then I may mean
the textual /a/c, or the filesystem /x/c because I know symlink has led
me to /x/y.  The program can't know which.  It shouldn't try and
guess but instead just pass the argument to open(2), etc.  If that bites
the user, e.g. by trampling the wrong file, then the user will have
learnt symlinks are a bad idea and alternatives should be sought where
possible.  :-)
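
As a sketch of "just pass the argument", the read_argument function below
hands the path to open() untouched.  The build_demo_tree helper and its
layout are invented for the demonstration: a/symlink points at x/y, and
files named c sit in both a/ and x/, so that climbing two levels from
a/symlink/b gives different answers textually and in the filesystem.

```python
import os
import tempfile


def read_argument(arg):
    """Hand the path to open() as given; the kernel resolves any '..'."""
    with open(arg) as handle:
        return handle.read()


def build_demo_tree():
    """Create the hypothetical layout under a fresh temporary directory."""
    root = os.path.realpath(tempfile.mkdtemp())
    os.makedirs(os.path.join(root, "x", "y", "b"))
    os.makedirs(os.path.join(root, "a"))
    os.symlink(os.path.join(root, "x", "y"), os.path.join(root, "a", "symlink"))
    # Label each candidate file with the reading that would select it.
    for parent, text in (("x", "filesystem"), ("a", "textual")):
        with open(os.path.join(root, parent, "c"), "w") as handle:
            handle.write(text)
    return root
```

After os.chdir() into a/symlink/b under that root, read_argument("../../c")
opens the file under x/, because the kernel starts from the directory the
process is really in and knows nothing of the route taken to get there.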

There's a good paper by Rob Pike presented at Usenix in 2000 which
accepts symlinks are here to stay and suggests a solution based on the
work in Plan 9.

    A deeper question is whether the shell should even be trying to make
    pwd and cd do a better job.  If it does, then the getwd library call
    and every program that uses it will behave differently from the
    shell, a situation that is sure to confuse.

― https://9p.io/sys/doc/lexnames.html

-- 
Cheers, Ralph.



Re: [Dorset] Paths relative to a current working directory whose path contains symbolic links

2020-11-09 Thread Patrick Wigmore
On Mon, 09 Nov 2020 13:46:25 +, Keith Edmunds wrote:
> One way (which may or may not be convenient) is to run this from the
> shell:
> 
> $ cd $(realpath .)

Thanks Keith. A useful tip.

In my case, it was necessary to add some quotes, since the path 
returned by realpath contained some spaces:

$ cd "$(realpath .)"



Re: [Dorset] Paths relative to a current working directory whose path contains symbolic links

2020-11-09 Thread Keith Edmunds
One way (which may or may not be convenient) is to run this from the shell:

$ cd $(realpath .)

That will set your current directory path to the path with all symlinks
resolved.

More info here:
https://www.tiger-computing.co.uk/linux-tips-finding-real-path/
-- 
Linux Tips: https://www.tiger-computing.co.uk/category/techtips/



[Dorset] Paths relative to a current working directory whose path contains symbolic links

2020-11-09 Thread Patrick Wigmore
Hi all,

I was writing a script that took file paths as arguments, and stumbled 
upon a phenomenon that I wasn't previously aware of, but which I'm 
guessing is probably quite well known.

There is a difference between the way the current working directory is 
understood by the bash shell and the way it is understood by Python.

Suppose, when I run the script, my shell says I am in:

/some/path/to/a/directory

Further suppose that, in that path, 'a' is a symbolic link:

/some/path/to/a -> /some/other/path/to/a

If I give my Python script the following relative path as an argument 
on the command line:

../../../that/points/to/a/file

I would expect it to point to

/some/path/that/points/to/a/file

And, indeed, my shell would make the same assumption and I could use 
tab-completion to fill in the path based on that assumption.


But if my Python script runs that relative path through 
os.path.abspath(), the result is actually:

/some/other/path/that/points/to/a/file

Which may or may not exist, but is certainly not the path I intended!

This is because Python's understanding of the current working 
directory has all symbolic links followed, so as far as Python is 
concerned, I am not, and can never be, in:

/some/path/to/a/directory

Because it contains a symbolic link. Instead, I must be in:

/some/other/path/to/a/directory
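
The discrepancy can be reproduced with a small sketch.  The function
names two_readings and build_tree are made up for illustration, the tree
is recreated under a temporary root rather than at /some, and the
shell's idea of the current directory is passed in explicitly because
os.chdir() does not update $PWD.

```python
import os
import posixpath
import tempfile


def two_readings(arg, shell_pwd):
    """Return (what os.path.abspath computes, what the shell's view implies)."""
    # abspath joins onto os.getcwd(), which already has symlinks followed.
    physical = os.path.abspath(arg)
    # The shell's reading: cancel '..' textually against its own CWD string.
    logical = posixpath.normpath(posixpath.join(shell_pwd, arg))
    return physical, logical


def build_tree():
    """Recreate the example layout, with 'a' as a symbolic link."""
    root = os.path.realpath(tempfile.mkdtemp())
    target = os.path.join(root, "some", "other", "path", "to", "a")
    os.makedirs(os.path.join(target, "directory"))
    os.makedirs(os.path.join(root, "some", "path", "to"))
    os.symlink(target, os.path.join(root, "some", "path", "to", "a"))
    return root
```

After os.chdir() into some/path/to/a/directory under that root, calling
two_readings("../../../that/points/to/a/file", ...) with the same
directory as shell_pwd gives one path under some/other/path and one
under some/path.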

As far as I can tell, this is by design and not entirely unique to 
Python. (e.g. the program 'optipng' seems to have the same issue).

In Python, it seems as though the way to eliminate the discrepancy 
while maintaining cross-platform compatibility is to add a check to 
see whether the program is running under a shell that sets $PWD (see 
bash(1)), or on a system that has pwd(1), and, if it is, then resolve 
any relative paths relative to $PWD or the output of pwd, rather than 
relative to os.getcwd().
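
A minimal sketch of the $PWD part of that idea, leaving out the pwd(1)
fallback.  The names logical_cwd and resolve_like_shell are invented
here, and the check that $PWD still names the real current directory is
my own precaution against a stale or inherited value, not something I
found documented for Python.

```python
import os


def logical_cwd():
    """Return $PWD if it still names the current directory, else os.getcwd()."""
    pwd = os.environ.get("PWD")
    if pwd and os.path.isabs(pwd):
        try:
            # Trust $PWD only while it refers to the directory we are really in.
            if os.path.samefile(pwd, "."):
                return pwd
        except OSError:
            pass
    return os.getcwd()


def resolve_like_shell(arg):
    """Make `arg` absolute by cancelling '..' textually against the logical CWD."""
    return os.path.normpath(os.path.join(logical_cwd(), arg))
```

Absolute arguments pass through unchanged, since os.path.join discards
the CWD when its second argument is already absolute.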

Or, if adding all that platform-specific complexity would double the 
length of your simple script, document it as a limitation and tell 
your users they're holding the script wrong when they run into it.

Patrick
