Re: path normalization

2005-01-19 Thread Gary V. Vaughan
Hallo Ralf!

Ralf Wildenhues wrote:
> One step toward integrating Linux multilib support, but a Libtool
> requirement independent of that goal, is comparison of normalized
> paths.  In a nutshell, I'd like to be able to decide that
>   ../foo/../lib
>   ../lib
> are equal.
>
> [[snip]]
>
> This is my first try at a shell function that implements this with
> sed (and little overhead in most trivial cases).  I'm posting it
> because it's not trivial, and I'd like to know about bugs in it or
> general comments on this problem (before integrating into Libtool)
> or the choices I had to make about normalization, or possible
> simplification.  The size of the script is partly due to the fact
> that we cannot use alternation `\|'.

The implementation looks fine to me, and works on  Good work!

Tested with: ash-0.3.8, bash-2.05b, pdksh-5.2.14 and zsh-4.1.1,
all using GNU sed 4.0.9.

Here are some more ideas:

1. We could avoid the normalization step on .la files that are new
   enough by checking the version (or for speed adding a new
   'pathsnormalized=yes' declaration) to decide whether it was already
   done at creation time.
2. Shipping a script to optionally trawl the filesystem and normalize
   installed .la files (and add a pathsnormalized decl) at libtool 'make
   install' time would save time for subsequent libtool calls.
3. I wonder how much of the normalization we could do with M4 as the
   libtool script is generated?
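
A minimal sketch of the check in idea 1, assuming the proposed
'pathsnormalized=yes' line were written verbatim into new .la files.
The declaration does not exist in any released libtool, and the helper
name sketch_la_is_normalized is made up for illustration:

```shell
# Hypothetical helper: succeed if the .la file named by $1 carries the
# proposed (not yet existing) 'pathsnormalized=yes' declaration.
sketch_la_is_normalized ()
{
  # .la files are plain text with one shell-style assignment per line,
  # so an anchored grep is enough; no need to source the file.
  grep '^pathsnormalized=yes$' "$1" >/dev/null 2>&1
}
```

A caller could then run the sed-based normalization only when
sketch_la_is_normalized fails for the file at hand.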

Cheers,
Gary.
-- 
Gary V. Vaughan  ())_.  [EMAIL PROTECTED],gnu.org}
Research Scientist   ( '/   http://tkd.kicks-ass.net
GNU Hacker   / )=   http://www.gnu.org/software/libtool
Technical Author   `(_~)_   http://sources.redhat.com/autobook




path normalization

2005-01-18 Thread Ralf Wildenhues
One step toward integrating Linux multilib support, but a Libtool
requirement independent of that goal, is comparison of normalized
paths.  In a nutshell, I'd like to be able to decide that
  ../foo/../lib
  ../lib
are equal.
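
To make the goal concrete, here is a deliberately simplified sketch of
such a comparison.  It is not the sed function from this mail: it
assumes a POSIX shell (it uses ${var%pattern}), handles only `/' as a
separator, ignores DOS drive letters, and drops trailing and doubled
leading slashes.  All sketch_* names are made up for illustration.

```shell
# Hypothetical, simplified normalizer: collapses '.', 'name/..' and
# repeated slashes.  The result is left in $sketch_result.
sketch_normalize ()
{
  sketch_result=
  # Remember whether the path is absolute.
  case $1 in /*) sketch_root=/ ;; *) sketch_root= ;; esac
  sketch_save_IFS=$IFS
  IFS=/
  set -f                      # no globbing while splitting on '/'
  for sketch_part in $1; do
    case $sketch_part in
      '' | .) ;;              # empty (from '//') or '.': skip
      ..)
        case $sketch_result in
          # Nothing to pop: keep '..' only for relative paths.
          '') test -n "$sketch_root" || sketch_result=.. ;;
          # Already climbing above the start: climb further.
          .. | */..) sketch_result=$sketch_result/.. ;;
          # Pop the last component.
          */*) sketch_result=${sketch_result%/*} ;;
          *) sketch_result= ;;
        esac ;;
      *) sketch_result=${sketch_result:+$sketch_result/}$sketch_part ;;
    esac
  done
  set +f
  IFS=$sketch_save_IFS
  sketch_result=$sketch_root$sketch_result
  test -n "$sketch_result" || sketch_result=.
}

# Succeed if both arguments normalize to the same string.
sketch_paths_equal ()
{
  sketch_normalize "$1"; sketch_first=$sketch_result
  sketch_normalize "$2"
  test "X$sketch_first" = "X$sketch_result"
}
```

With these definitions, `sketch_paths_equal ../foo/../lib ../lib'
succeeds without either path having to exist.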

Unfortunately, libtool so far has neither required its input to be
normalized nor implemented that normalization itself.  Thus, we will
have to deal with installed .la files containing unnormalized paths
forever, so adding such a requirement as an afterthought is hopeless.

Now I figured we may need to compare all sorts of paths,
- relative or absolute (but maybe not one group against the other)
- existing or not existing.
Thus `pwd -L' or portable approximations thereof won't work in all
cases.  If we ever have to compare relative to absolute paths, I
think we can rely on them being present.
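
For illustration, a minimal sketch of the file-system-based approach
and why it falls short for nonexistent paths.  The helper name
sketch_canon_dir is made up; it simply asks the shell for the working
directory after changing into the argument:

```shell
# Hypothetical helper, not part of Libtool: canonicalize a directory by
# changing into it in a subshell and printing the working directory.
# Leaves the answer in $sketch_canon_result and returns non-zero (with
# an empty result) if the directory cannot be entered.
sketch_canon_dir ()
{
  # CDPATH is cleared so that cd neither searches elsewhere nor prints.
  sketch_canon_result=`CDPATH=; cd "$1" 2>/dev/null && pwd`
}
```

This gives an answer only for directories that are present and
accessible, which is why it cannot serve as the general comparison.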

This is my first try at a shell function that implements this with
sed (and little overhead in most trivial cases).  I'm posting it
because it's not trivial, and I'd like to know about bugs in it or 
general comments on this problem (before integrating into Libtool)
or the choices I had to make about normalization, or possible
simplification.  The size of the script is partly due to the fact
that we cannot use alternation `\|'.

You'd also do me a favor if you tried this on your system and reported
back any output (it does not output anything if all goes ok) or non-
zero exit status.  Your sed may require removing all comments from
the sed script -- filtering the whole thingy through
  sed '/^[  ]\{1,\}#/d'
should do the trick (space and tab within [ ]).  Please try different
shells as well.
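
A sketch of that filter wrapped as a function, in case typing a literal
tab is awkward.  The name sketch_strip_sed_comments is made up, and
unlike the command above this variant also deletes comment lines that
start in column one (it uses `*' instead of `\{1,\}'):

```shell
# Hypothetical helper: delete whole-line comments from a sed script
# read on stdin, for seds that do not accept comments.
sketch_strip_sed_comments ()
{
  # Build a literal tab portably instead of typing one.
  sketch_tab=`printf '\t'`
  # Drop lines made of optional blanks followed by '#'.  Commands such
  # as s#a#b# start with a letter and are therefore left alone.
  sed "/^[ $sketch_tab]*#/d"
}
```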

This function keeps trailing slashes on purpose -- removing them is,
IMHO, an independent task.  I added the necessary shell-sanitizing blob
for ease of testing.

Thanks for reading this far,
Ralf

--- cut here ---
#! /bin/sh

if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
  emulate sh
  NULLCMD=:
  # Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which
  # is contrary to our usage.  Disable this feature.
  alias -g '${1+"$@"}'='"$@"'
  setopt NO_GLOB_SUBST
elif test -n "${BASH_VERSION+set}${KSH_VERSION+set}" && (set -o posix) >/dev/null 2>&1; then
  set -o posix
fi
BIN_SH=xpg4; export BIN_SH # for Tru64
DUALCASE=1; export DUALCASE # for MKS sh

set -e

for ECHO in ${ECHO-echo} 'print -r' 'printf %s\n' false
do
  if test "`($ECHO '\t') 2>/dev/null || :`" = '\t'; then break; fi
done

: ${SED=sed}
: ${Xsed="$SED -e s,^X,,"}
: ${VERBOSE=false}

# func_path_normalize pathname
# Remove /./ and /../ parts from PATHNAME.
# Do not fork in most of the trivial cases,
# respect the number of consecutive slashes at the beginning,
# DOS drive letters,
# trailing slash,
# absolute and relative paths.
# We do not honor newlines in PATHNAME,
# backslashes are always treated as separators,
# DOS paths may not contain a colon (except for the drive letter).
func_path_normalize ()
{
case $1 in
  *[\\/]..* | *[\\/].[\\/]* | *[\\/]. )
  func_path_normalize_result=`$ECHO "X$1" | $Xsed -e '
# remove multiple slashes except at beginning
s#\([^\\/]\)\([\\/]\)\{1,\}#\1\2#g

# /./

:one
# common case
s#\([^\\/][\\/]\)\.[\\/]#\1#
# at beginning
s#^\([\\/]\{1,\}\)\.[\\/]#\1#
# /. at end
s#\([\\/]\)\.$#\1#
t one

# /../

# three cases for path elements:
#   foo
#   .foo
#   ..foo
# where foo is nonempty and may not start with a dot.

# DOS drive letters
/^[A-Za-z]:/ {
  :dos
  s#^\(..[\\/]\)[^\\/.][^\\/]*[\\/]\.\.#\1#
  s#^\(..[\\/]\)\.[^\\/.][^\\/]*[\\/]\.\.#\1#
  s#^\(..[\\/]\)\.\.[^\\/]\{1,\}[\\/]\.\.#\1#
  t dos

  # common case
  s#\([^\\/:]\)[\\/][^\\/.][^\\/]*[\\/]\.\.#\1#
  s#\([^\\/:]\)[\\/]\.[^\\/.][^\\/]*[\\/]\.\.#\1#
  s#\([^\\/:]\)[\\/]\.\.[^\\/]\{1,\}[\\/]\.\.#\1#
  t dos

  s#^\(..\)[^\\/]\{1,\}[\\/]\.\.[\\/]*$#\1#

  # we may have picked up multiple slashes again
  s#\([^\\/]\)\([\\/]\)\{1,\}#\1\2#g

  b end
}

# common case
:common
s#\([^\\/]\)[\\/][^\\/.][^\\/]*[\\/]\.\.#\1#
s#\([^\\/]\)[\\/]\.[^\\/.][^\\/]*[\\/]\.\.#\1#
s#\([^\\/]\)[\\/]\.\.[^\\/]\{1,\}[\\/]\.\.#\1#
t common

# do not add slashes to the root
:root
s#^\([\\/]\{1,\}\)[^\\/.][^\\/]*[\\/]\.\.[\\/]*$#\1#
s#^\([\\/]\{1,\}\)\.[^\\/.][^\\/]*[\\/]\.\.[\\/]*$#\1#
s#^\([\\/]\{1,\}\)\.\.[^\\/]\{1,\}[\\/]\.\.[\\/]*$#\1#
t root

# root special cases
s#^[^\\/]\{1,\}[\\/]\.\.$#.#
:end
s#^\([\\/]\{1,\}\)\.\{1,2\}$#\1#
'`;;
  *) func_path_normalize_result=$1;;
esac
}

#  input  output (desired)
tests='
/ /
/./
/..   /
..