Re: Localization based problem with sort

2006-01-13 Thread Dirk Stoecker
Hello,

> Dirk Stoecker [EMAIL PROTECTED] writes:
>
> > when using the sort utility in German language the two options
> >
> > -d, --dictionary-order
> > -f, --ignore-case
> >
> > are activated by default. It is impossible to have other sorting methods
> > then.
> >
> > Would you please add negative forms of these options, so they can be
> > deactivated in localized environments?
>
> Sorry, I don't see how that could be implemented portably, in the
> ordinary POSIX locale environment anyway.  Perhaps you can work around
> the problem by setting LC_COLLATE=C in your environment, before
> invoking 'sort'.

What would be the problem when there is an option --no-dictionary-order 
and --no-ignore-case? The GNU tools also have other options, which do 
not exist in other tools.

The problem with the workaround is that it does not work in all places
where sort can be (and is) used.
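
A minimal sketch of the suggested workaround, assuming a POSIX shell and a POSIX sort. It sets the locale for a single invocation only. LC_ALL is used here instead of LC_COLLATE because a value of LC_ALL already present in the environment would override LC_COLLATE. The sample input is made up for illustration.

```shell
# Sort four sample lines by byte value, regardless of the session locale.
sorted=$(printf '%s\n' b A a B | LC_ALL=C sort)
printf '%s\n' "$sorted"

# Case folding is still available on request in the C locale.
folded=$(printf '%s\n' b A a B | LC_ALL=C sort -f)
printf '%s\n' "$folded"
```

Setting the variable on the command line keeps the rest of the session localized, which may help in scripts, though it does not cover callers that invoke sort on your behalf.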

Ciao
-- 
http://www.dstoecker.de/ (PGP key available)


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: Localization based problem with sort

2006-01-13 Thread Paul Eggert
Dirk Stoecker [EMAIL PROTECTED] writes:

> What would be the problem when there is an option --no-dictionary-order
> and --no-ignore-case?

The problem is implementing those options, not specifying them.  I
don't know how to implement them.  If you could supply a patch to
implement them, that would help.




RFC: How du counts size of hardlinked files

2006-01-13 Thread Johannes Niess
Hi list,

du (with default options) seems to count files with multiple hard links in the 
first directory it traverses.

The -l option changes that.

But there are other valid viewpoints.

Somehow the byte count of a file with multiple hardlinks partially belongs to
all of them, even when they are not part of the traversed directories. In this
mode a file with 10 bytes and 3 hardlinks would be counted as 3 files with 3
bytes (and only one hardlink) each. The integer rounding error is acceptable
in this 'approximate' mode. Programmatically this should be very similar to
the -l mode. Use case: different physical owners of the hardlinks, and doing
fair accounting for them. (Of course the inode has only one common logical
owner for all directory entries.)
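
The 'approximate' mode described above can be sketched in shell. This is only an illustration, not an existing du option: it assumes GNU find (for -printf, where %n is the link count and %s the size in bytes), counts apparent byte sizes of regular files only, and uses a hypothetical helper name, approx_size. The sample tree reproduces the 10-byte, 3-link case.

```shell
# approx_size DIR: for every regular-file entry under DIR, add size/nlink
# (integer division), so each hard link carries an equal share of the file.
approx_size() {
  find "$1" -type f -printf '%n %s\n' |
    awk '{ total += int($2 / $1) } END { print total + 0 }'
}

# Sample tree: one 10-byte file with three links, two in a/ and one in b/.
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"
printf '0123456789' > "$tmp/a/f1"
ln "$tmp/a/f1" "$tmp/a/f2"
ln "$tmp/a/f1" "$tmp/b/f3"

size_a=$(approx_size "$tmp/a")
size_b=$(approx_size "$tmp/b")
echo "a: $size_a  b: $size_b"
rm -r "$tmp"
```

Each of the three entries contributes 3 bytes, so one byte of the file is lost to rounding.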

Not counting files that have multiple AND out-of-tree hardlinks is also
useful. It tells us how much space we really gain when deleting that tree.
'rm-size' could be a name for this mode. Programmatically this is similar to
default mode: in Perl I'd use hash keys for the test in default mode. In
'rm-size' mode I'd increment the hash values of visited inodes, and finally
compare the # of visited directory entries to the # of links.
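
The counting scheme for 'rm-size' mode can be sketched with find and awk instead of Perl. As before this is an assumption-laden illustration: it relies on GNU find's -printf (%D device number, %i inode number, %n link count, %s size in bytes), looks at regular files only, and rm_size is a hypothetical name.

```shell
# rm_size DIR: sum the sizes of inodes whose links ALL live under DIR,
# i.e. the bytes that deleting DIR would actually release.
rm_size() {
  find "$1" -type f -printf '%D:%i %n %s\n' |
    awk '{ seen[$1]++; links[$1] = $2; size[$1] = $3 }
         END {
           for (k in seen)
             if (seen[k] == links[k]) total += size[k]
           print total + 0
         }'
}

# Sample tree: a 10-byte file linked from a/ (twice) and b/ (once),
# plus a 5-byte file that exists only in a/.
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b"
printf '0123456789' > "$tmp/a/f1"
ln "$tmp/a/f1" "$tmp/a/f2"
ln "$tmp/a/f1" "$tmp/b/f3"
printf '12345' > "$tmp/a/only"

gain_a=$(rm_size "$tmp/a")
gain_all=$(rm_size "$tmp")
echo "deleting a/ frees $gain_a bytes; deleting everything frees $gain_all bytes"
rm -r "$tmp"
```

The shared file is skipped for a/ because one of its three links lies outside that tree, and it is counted once when the whole tree is examined.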

du seems to be the natural home for this functionality. Or is it feature 
bloat?

Background: Backups via 'cp -l' need (almost) no space for files unchanged in 
several cycles. But these shadow forests of hardlinks are difficult to 
account for. Especially when combined with finding and linking identical  
files across several physical owners.

Johannes Niess

P.S: I'm not volunteering to implement this. I did not even feel enough need 
to do the perl scripts.




Re: Join separator field

2006-01-13 Thread Eric Blake

The fact that you mailed the obsolete bug-textutils instead of the current
bug-coreutils makes me think that your installation is out of date.  The
current stable version of coreutils is 5.93.  You should consider
upgrading your installation (since you mentioned cygwin, you may also want
to consider asking the cygwin AT cygwin DOT com mailing list for help in
your upgrade process).

According to Samuel GRANJEAUD on 1/11/2006 9:09 AM:
> I am proud to use join for doing bioinformatics up to now. But when I
> wanted to separate fields by the tab character, I failed. Maybe it's a bug...
>
> bash-3.00$ join -j 1 -t \011 aa.txt bb.txt > cc.txt

You didn't quote the backslash, so join was behaving as if it were called
with join -j 1 -t 011 aa.txt bb.txt.  Had you really wanted join to
see \011, you should have used -t '\011'.  But even then, older versions
of join just silently used the first character (giving the effect of -t
0), and 5.93 now complains when multiple characters are present:
join: multi-character tab `011'.

What you really wanted to do is use a literal tab on the command line
(since join 5.93 only understands \0; it does not understand \t or \011),
so that join sees only a single character.  This can be done using the key
combination [ctrl-v][tab] inside quotes in an interactive bash shell, or by
using a quoted literal tab in a script file, or by using another program to
generate the tab for you, like so:

$ join -j 1 -t "`printf '\t'`" aa.txt bb.txt > cc.txt

Meanwhile, a patch that allows join to parse the same escape sequences as
printf would probably be welcomed.
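
A self-contained sketch of the approach above, assuming a POSIX shell and GNU join; the file names and contents are invented for illustration. The tab is stored in a variable and quoted, so the shell passes it to join as a single-character argument instead of discarding it during word splitting.

```shell
# Two tab-separated sample files, already sorted on the join field.
tmp=$(mktemp -d)
printf 'k1\talpha one\nk2\tbeta\n' > "$tmp/aa.txt"
printf 'k1\tx\nk2\ty\n' > "$tmp/bb.txt"

# Let printf generate the tab, then hand it to join inside double quotes.
tab=$(printf '\t')
joined=$(join -j 1 -t "$tab" "$tmp/aa.txt" "$tmp/bb.txt")
printf '%s\n' "$joined"
rm -r "$tmp"
```

Because the separator is a tab, the space inside "alpha one" stays part of its field.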

-- 
Life is short - so eat dessert first!

Eric Blake [EMAIL PROTECTED]




Re: Join separator field

2006-01-13 Thread Samuel GRANJEAUD

Hello !

Many thanks for your answer.

Eric Blake wrote:

> The fact that you mailed the obsolete bug-textutils instead of the current
> bug-coreutils makes me think that your installation is out of date.  The
> current stable version of coreutils is 5.93.  You should consider
> upgrading your installation (since you mentioned cygwin, you may also want
> to consider asking the cygwin AT cygwin DOT com mailing list for help in
> your upgrade process).


In fact I use the core-utils 5.93.0-9 of cygwin, and it is up to date.

bash-3.00$ join --version
join (GNU coreutils) 5.3.0
Written by Mike Haertel.

Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


> You didn't quote the backslash, so join was behaving as if it were called
> with join -j 1 -t 011 aa.txt bb.txt.  Had you really wanted join to
> see \011, you should have used -t '\011'.  But even then, older versions
> of join just silently used the first character (giving the effect of -t
> 0), and 5.93 now complains when multiple characters are present:
> join: multi-character tab `011'.

I wanted to use the tab character and thought that the octal code would
be the right solution because I didn't think of putting it in a file.


I tried quoting the octal code on the command line, but join gives me
no output (as you explained), and not even an error about multiple
characters. I didn't get this error with 011 either.


Nevertheless, I successfully joined my files with [ctrl-v][tab] in "" or ''.
Thanks for that solution. I should have thought of it and not bothered you.

Cheers,
--Samuel




groups additional flag

2006-01-13 Thread Reinhold Bader
 Dear coreutils maintainers,

   I am attaching a modified version of /usr/bin/groups which allows
   suppressing errors resulting from the occurrence of artificial GIDs
   which are used for authentication purposes. These errors cause
   difficulties, e.g. in Tcl scripts using groups via exec.

   May I ask for inclusion in the standard distribution tree?
   The basis used was the 5.2.1 coreutils release as used in Novell's SLES9
   distribution.

 Best regards

-- 
 Dr. Reinhold Bader

 Leibniz-Rechenzentrum, Abt. Hochleistungssysteme | Tel. +49 89 289 28825
 Barerstr. 21, 80333 Muenchen | email [EMAIL PROTECTED]

#!/bin/sh
# groups -- print the groups a user is in
# Copyright (C) 1991, 1997, 2000, 2002 Free Software Foundation, Inc.

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.  */

# Written by David MacKenzie [EMAIL PROTECTED].

# Make sure we get GNU id, if possible; also allow
# it to be somewhere else in PATH if not installed yet.
#
# LRZ fix: add switch to ignore errors induced by artificial GIDs without
# /etc/group entry.
PATH=/usr/bin:$PATH

usage="Usage: $0 [OPTION]... [USERNAME]...

  --help      display this help and exit
  --version   output version information and exit
  --noinvgid  ignore invalid GIDs and suppress error

Same as id -Gn.  If no USERNAME, use current process.

Report bugs to <bug-coreutils@gnu.org>."

fail=0
ignore=0
# Options are recognized only when exactly one argument is given.
case $# in
  1 )
    case "z${1}" in
      z--help )
         echo "$usage" || fail=1; exit $fail;;
      z--version )
         echo "groups (GNU coreutils) 5.2.1" || fail=1; exit $fail;;
      z--noinvgid )
         ignore=1
         shift;;
      * ) ;;
    esac
    ;;
  * ) ;;
esac

if [ $# -eq 0 ]; then
  if [ $ignore -eq 0 ] ; then
    id -Gn
    fail=$?
  else
    # Print the group list only if id succeeded.
    groups=$(id -Gn -- "$(whoami)")
    status=$?
    if test $status = 0; then
      echo "$groups"
    else
      fail=$status
    fi
  fi
else
  for name in "$@"; do
    groups=`id -Gn -- "$name"`
    status=$?
    if test $status = 0; then
      echo "$name : $groups"
    else
      fail=$status
    fi
  done
fi
exit $fail
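
The idea of skipping GIDs that have no group entry can also be illustrated in isolation. The sketch below is not the attached script's method: names_for_gids is a hypothetical helper that looks up numeric GIDs in a file in /etc/group format and silently skips any GID without an entry. The sample file and the GIDs are made up.

```shell
# names_for_gids GROUPFILE GID...: print the names of those GIDs that have
# an entry in GROUPFILE, separated by spaces; unknown GIDs are ignored.
names_for_gids() {
  file=$1; shift
  for gid in "$@"; do
    # Field 3 of a group line is the numeric GID, field 1 the group name.
    awk -F: -v gid="$gid" '$3 == gid { print $1; exit }' "$file"
  done | paste -s -d ' ' -
}

# Sample group database; GID 54321 stands in for an "artificial" GID.
tmp=$(mktemp -d)
printf 'root:x:0:\nusers:x:100:alice\nwheel:x:10:alice\n' > "$tmp/group"

names=$(names_for_gids "$tmp/group" 100 54321 10)
echo "$names"
rm -r "$tmp"
```

On a live system the numeric list would come from id -G, and the lookup would have to go through whatever name service the system uses, not a flat file.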


Re: RFC: How du counts size of hardlinked files

2006-01-13 Thread Phillip Susi
Maybe I misunderstood you but you seem to think that each hard link to 
the same file can have different ownerships.  This is not the case. 
Hard links are just additional names for the same inode, and permissions 
and ownership is associated with the inode, not the name(s).


Also I just tested it and du doesn't report the size used by duplicate 
hard links in the tree twice.  I did a cp -al foo bar, then a du -sh, du 
-sh foo, and they were both the same size.












Re: RFC: How du counts size of hardlinked files

2006-01-13 Thread Johannes Niess
Hi Phillip,

Hard links and file sizes are concepts that don't fit each other well. The 
best fit depends on what you are asking for.

bash-2.05b$ cp -al a/ b
bash-2.05b$ du -s a b .
34540   a
34540   b
34556   .
bash-2.05b$ du -sc a b .
34540   a
12  b
4   .
34556   total
bash-2.05b$ du -scl a b .
34540   a
34540   b
69084   .
138164  total
bash-2.05b$ 

Am Freitag, 13. Januar 2006 19:56 schrieb Phillip Susi:
> Maybe I misunderstood you but you seem to think that each hard link to
> the same file can have different ownerships.  This is not the case.
> Hard links are just additional names for the same inode, and permissions
> and ownership is associated with the inode, not the name(s).

I know that. So I made the distinction between physical (customer) and logical
(file system) owner. A file hardlinked between 2 customers belongs to both of
them. It is quite unpredictable which directory entry (i.e. one of the links
to the inode) du finds first. The directory containing that entry has the
inode size added to its sum.


> Also I just tested it and du doesn't report the size used by duplicate
> hard links in the tree twice.  I did a cp -al foo bar, then a du -sh, du
> -sh foo, and they were both the same size.

That's correct without -l. The sizes do not add up: 'du ./foo' +
'du ./bar' (my two customers' point of view) != 'du .' (the disk space I need
on the server).

'du -l' counts the links multiple times: 'du ./foo' = 'du ./bar' = 0.5 'du .'.
The overall size is from a customer perspective.

My approximate mode would count two halves: 0.5 'du ./foo' + 0.5 'du ./bar' =
'du .'. That's the admin's size perspective. In reality there is no fixed
factor to du -l.
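
The difference between the default mode and -l can be reproduced on a small tree. This is a sketch assuming GNU du (for -b, which reports apparent sizes in bytes) and a file system that supports hard links; the directory and file names are arbitrary.

```shell
# One 1000-byte file, hard-linked from two sibling directories.
tmp=$(mktemp -d)
mkdir "$tmp/foo" "$tmp/bar"
head -c 1000 /dev/zero > "$tmp/foo/data"
ln "$tmp/foo/data" "$tmp/bar/data"

# Default: each inode is counted once.  With -l: once per directory entry.
once=$(du -sb "$tmp" | cut -f1)
twice=$(du -sbl "$tmp" | cut -f1)
echo "default: $once  with -l: $twice  difference: $((twice - once))"
rm -r "$tmp"
```

The absolute totals depend on how the file system reports directory sizes, so only the difference between the two runs is meaningful here.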

Johannes




Re: groups additional flag

2006-01-13 Thread Eric Blake

According to Reinhold Bader on 1/13/2006 8:31 AM:
>  Dear coreutils maintainers,
>
>    I am attaching a modified version of /usr/bin/groups which allows
>    suppressing errors resulting from the occurrence of artificial GIDs
>    which are used for authentication purposes. These errors cause
>    difficulties, e.g. in Tcl scripts using groups via exec.
>
>    May I ask for inclusion in the standard distribution tree?
>    The basis used was the 5.2.1 coreutils release as used in Novell's SLES9
>    distribution.


Thanks for the ideas.  However, could you please regenerate this as a
unified diff against a more current distribution (the current stable
release is 5.93, and the CVS development is progressing towards an
eventual 6.0 release), rather than as a straight file?  Also, your changes
may be large enough that they would require copyright assignment, if it is
indeed worth applying the patch.  Without anything to compare against, it
is not obvious what you are trying to add.
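
A unified diff of the kind requested here can be produced with diff -u. The following is a minimal sketch with two stand-in files, since the real sources are not part of this message; the names groups.orig, groups.new and groups.patch are placeholders.

```shell
# Stand-ins for the pristine and the modified script.
tmp=$(mktemp -d)
printf 'fail=0\n' > "$tmp/groups.orig"
printf 'fail=0\nignore=0\n' > "$tmp/groups.new"

# diff exits with 0 if the files are identical, 1 if they differ,
# and a larger value on trouble, so capture the status explicitly.
status=0
diff -u "$tmp/groups.orig" "$tmp/groups.new" > "$tmp/groups.patch" || status=$?

# Count the lines the patch adds that match the new assignment.
added=$(grep -c '^+ignore=0$' "$tmp/groups.patch")
cat "$tmp/groups.patch"
rm -r "$tmp"
```

Lines starting with + or - in the result show exactly what was added or removed, which is what makes a patch reviewable without the full file.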

-- 
Life is short - so eat dessert first!

Eric Blake [EMAIL PROTECTED]

