bug#6330: Feature request: mktemp creates named pipes

2010-06-08 Thread Sebastien Andre
On Sat, Jun 5, 2010 at 9:25 AM, Jim Meyering j...@meyering.net wrote:

 Sebastien Andre wrote:
 ...
  I see two reasons why the addition of a --fifo option is better than
  using existing tools:
 
  * Creating a temporary directory to finally create a pipe just because
  it is safe this way is kind of a trick. For the clarity of scripts, it
  would be better having mktemp to ensure the uniqueness of a fifo, even
  if it's created in /

 "because it is safe" is often a good reason to learn and use a new idiom.
 When you first encounter such an idiom, it does indeed look like a trick.


Well, after all an idiom might just be a popular trick



  * mktemp is not only a tool to create unique files, it's also a name
  generator. The example given in the manual works for one or two fifos,
  but if the number of fifos is unknown there is no choice but implementing
  something to generate names, which is another potential source of bugs.

 Any code addition is an opportunity to introduce bugs ;-)

 What if someone wants a uniquely-named symlink?
 New option?  ... or a shared memory object?  Add another?
 Wouldn't it be better to use the tried and true (safe) idiom
 that will work with all versions of mktemp?  Then
 your script will work also on systems that use the
 version of mktemp that predated the one in coreutils.


Thank you for broadening the question.

You made me realize that what I really want is to use the tempnam(3)
function in shell scripts. But having a tempnam command in a shell script
would be unsafe unless something is created atomically when invoked. If
mktemp(1) is not the right place for this, how about adding a
-t/--template option to mkfifo(1)?

Wanting to get a unique name in a given directory according to a template
doesn't sound too specific to me; maybe some other people need it?


bug#6330: Feature request: mktemp creates named pipes

2010-06-08 Thread Jim Meyering
Sebastien Andre wrote:

 On Sat, Jun 5, 2010 at 9:25 AM, Jim Meyering j...@meyering.net wrote:
...
 "because it is safe" is often a good reason to learn and use a new idiom.
 When you first encounter such an idiom, it does indeed look like a trick.

 Well, after all an idiom might just be a popular trick

"Effective technique" is more accurate.
This is the unix way.

  * mktemp is not only a tool to create unique files, it's also a name
  generator. The example given in the manual works for one or two fifos,
  but if the number of fifos is unknown there is no choice but implementing
  something to generate names, which is another potential source of bugs.

 Any code addition is an opportunity to introduce bugs ;-)

 What if someone wants a uniquely-named symlink?
 New option?  ... or a shared memory object?  Add another?
 Wouldn't it be better to use the tried and true (safe) idiom
 that will work with all versions of mktemp?  Then
 your script will work also on systems that use the
 version of mktemp that predated the one in coreutils.

 Thank you for broadening the question.

 You made me realize that what I really want is to use the tempnam(3) function
 in shell scripts. But having a tempnam command in a shell script would be
 unsafe unless something is created atomically when invoked. If mktemp(1) is
 not the right place for this, how about adding a -t/--template option to the
 mkfifo(1) ?

And to mknod
and to ln?

 Wanting to get a unique name in a given directory according to a template
 doesn't sound too specific to me, maybe some other people need it?

If that given directory has only one writer, then using mktemp -u is fine.

Other people use mktemp -d, and then can create names however
they like (and safely) in the just-created directory.

The point that someone might want to create an object of a type other
than fifo (e.g., symlink, hard link, block dev., etc.) was to make
you think about adding a more generic option, but then you'd have to
consider how to specify the target of a symlink and major,minor numbers
for a block device.

I was hoping you would then conclude that this would lead to mktemp
subsuming most of the functionality of tools like ln, mkfifo and mknod,
and thus rethink your premise that it would be better to change mktemp
than to use it the way everyone else does.
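
The mktemp -d idiom mentioned above can be sketched as follows. This is
only an illustration, not code from the thread: the function name
make_fifos, the template fifo.XXXXXX and the pipe.N file names are made up
for the example. Because mktemp -d creates a fresh directory that only the
caller can write to, any number of fifos can then be named freely inside
it without a race.

```shell
#!/bin/sh
# Hypothetical helper: create N fifos inside a private directory.
# Prints the directory name; the fifos are called pipe.1 .. pipe.N.
make_fifos()
{
  n=$1
  # mktemp -d creates a new, uniquely named directory atomically.
  dir=$(mktemp -d "${TMPDIR:-/tmp}/fifo.XXXXXX") || return 1
  i=1
  while [ "$i" -le "$n" ]; do
    # Names inside the private directory cannot collide with other users.
    mkfifo "$dir/pipe.$i" || { rm -rf "$dir"; return 1; }
    i=$((i + 1))
  done
  printf '%s\n' "$dir"
}
```

A caller would run something like d=$(make_fifos 3), use the fifos under
"$d", and finish with rm -rf "$d".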





bug#6131: [PATCH]: fiemap support for efficient sparse file copy

2010-06-08 Thread Jim Meyering
Jim Meyering wrote:
 Jim Meyering wrote:
 ...
 Do you know of a tool other than filefrag that I could use?
 nope.

 It looks like a small script could filter filefrag -v output, detect
 split extents and rewrite to make the output match what's expected.
 Probably not worth it, though, since this is already a very fragile test.

 I went ahead and did it, after all.
 Here's the script, filefrag-extent-compare.
 With it, this test should pass when run on any of those four
 file system types.

 Not quite.
 Parsing filefrag -v output is part of what is fragile.

 Here are two examples:

 == ff1 ==
 Filesystem type is: ef53
 File size of j1 is 49152 (12 blocks, blocksize 4096)
  ext logical physical expected length flags
    0       3 10256380               3
    1       9 10740733 10256382      3 eof

 == ff2 ==
 Filesystem type is: ef53
 File size of j2 is 49152 (12 blocks, blocksize 4096)
  ext logical physical expected length flags
    0       3 10283520               3
    1       9 10283523               3 eof
                          ^^
 Note the missing expected number.
 That doesn't happen often, but it's enough to cause occasional
 false-positive failures, since the awk filter counts fields.
 I don't dare try to count columns, because those physical block
 numbers are likely to have width greater than 8 some of the time.

 Instead, I adjusted the filter to remove the eof
 and let the existing awk code handle the rest:

 # Extract logical block number and length pairs from filefrag -v output.
 # The initial sed is to remove the "eof" from the normally-empty "flags" field.
 # That is required when that final extent has no number in the "expected" field.
 f()
 {
   sed 's/ eof$//' $@ \
     | awk '/^ *[0-9]/ {printf "%d %d ", $2 ,NF < 5 ? $NF : $5 } END {print ""}'
 }

 I'll post a new patch soon.
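
To see what the adjusted filter does with the two listings above, here is a
small self-contained run. It is a sketch under assumptions: the quoting and
the "<" inside the awk program are restored as I read them, the ternary is
parenthesized for safety, and the sample files are retyped from the ff1 and
ff2 listings with guessed column alignment. With the trailing "eof" removed,
a line with no "expected" value has four fields and its length is the last
field; otherwise the length is field five.

```shell
#!/bin/sh
# The filter from the message above: print "logical length" pairs on one line.
f()
{
  sed 's/ eof$//' "$@" \
    | awk '/^ *[0-9]/ {printf "%d %d ", $2, (NF < 5 ? $NF : $5) } END {print ""}'
}

# Sample data modelled on the ff1 and ff2 listings (alignment is a guess).
tmp=$(mktemp -d) || exit 1
cat > "$tmp/ff1" <<'EOF'
Filesystem type is: ef53
File size of j1 is 49152 (12 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       3 10256380               3
   1       9 10740733 10256382      3 eof
EOF
cat > "$tmp/ff2" <<'EOF'
Filesystem type is: ef53
File size of j2 is 49152 (12 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       3 10283520               3
   1       9 10283523               3 eof
EOF

# Both files describe the same logical layout, so the outputs should agree.
out1=$(f "$tmp/ff1")
out2=$(f "$tmp/ff2")
rm -rf "$tmp"
printf '%s\n%s\n' "$out1" "$out2"
```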

Slowly but surely...

I'll squash the two most recent changes (at the end, below)
into yours, Jeff.

From 9f7a9882944455bfc2b6aec9c9f5431ac429b88b Mon Sep 17 00:00:00 2001
From: Jie Liu jeff@oracle.com
Date: Thu, 13 May 2010 22:09:30 +0800
Subject: [PATCH 01/10] cp: Add FIEMAP support for efficient sparse file copy

* src/fiemap.h: Add fiemap.h for fiemap ioctl(2) support.
Copied from linux's include/linux/fiemap.h, with minor formatting changes.
* src/copy.c (copy_reg): Now, when `cp' invoked with --sparse=[WHEN] option, we
will try to do FIEMAP-copy if the underlaying file system support it, fall back
to a normal copy if it fails.
---
 src/copy.c   |  159 ++
 src/fiemap.h |  102 +
 2 files changed, 261 insertions(+), 0 deletions(-)
 create mode 100644 src/fiemap.h

diff --git a/src/copy.c b/src/copy.c
index 171499c..2c15ca0 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -63,6 +63,10 @@

 #include <sys/ioctl.h>

+#ifndef HAVE_FIEMAP
+# include "fiemap.h"
+#endif
+
 #ifndef HAVE_FCHOWN
 # define HAVE_FCHOWN false
 # define fchown(fd, uid, gid) (-1)
@@ -149,6 +153,141 @@ clone_file (int dest_fd, int src_fd)
 #endif
 }

+#ifdef __linux__
+# ifndef FS_IOC_FIEMAP
+#  define FS_IOC_FIEMAP _IOWR ('f', 11, struct fiemap)
+# endif
+/* Perform FIEMAP(available in mainline 2.6.27) copy if possible.
+   Call ioctl(2) with FS_IOC_FIEMAP to efficiently map file allocation
+   excepts holes.  So the overhead to deal with holes with lseek(2) in
+   normal copy could be saved.  This would result in much faster backups
+   for any kind of sparse file.  */
+static bool
+fiemap_copy_ok (int src_fd, int dest_fd, size_t buf_size,
+                off_t src_total_size, char const *src_name,
+                char const *dst_name, bool *normal_copy_required)
+{
+  bool fail = false;
+  bool last = false;
+  char fiemap_buf[4096];
+  struct fiemap *fiemap = (struct fiemap *)fiemap_buf;
+  struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
+  uint32_t count = (sizeof (fiemap_buf) - sizeof (*fiemap)) /
+                    sizeof (struct fiemap_extent);
+  off_t last_ext_logical = 0;
+  uint64_t last_ext_len = 0;
+  uint64_t last_read_size = 0;
+  unsigned int i = 0;
+
+  /* This is required at least to initialize fiemap->fm_start,
+     but also serves (in May 2010) to appease valgrind, which
+     appears not to know the semantics of the FIEMAP ioctl. */
+  memset (fiemap_buf, 0, sizeof fiemap_buf);
+
+  do
+    {
+      fiemap->fm_length = FIEMAP_MAX_OFFSET;
+      fiemap->fm_extent_count = count;
+
+      /* When ioctl(2) fails, fall back to the normal copy only if it
+         is the first time we met.  */
+      if (ioctl (src_fd, FS_IOC_FIEMAP, fiemap) < 0)
+        {
+          /* If `i > 0', then at least one ioctl(2) has been performed before.  */
+          if (i == 0)
+            *normal_copy_required = true;
+          return false;
+        }
+
+      /* If 0 extents are returned, then more ioctls are not needed.  */
+      if (fiemap->fm_mapped_extents == 0)
+        break;
+
+      for (i = 0; i < fiemap->fm_mapped_extents; i++)
+        {

bug#6131: [PATCH]: fiemap support for efficient sparse file copy

2010-06-08 Thread Eric Blake
On 06/08/2010 05:33 AM, Jim Meyering wrote:
 new file mode 100644
 index 000..d33293b
 --- /dev/null
 +++ b/src/fiemap.h
 @@ -0,0 +1,102 @@
 +/* FS_IOC_FIEMAP ioctl infrastructure.
 +   Some portions copyright (C) 2007 Cluster File Systems, Inc
 +   Authors: Mark Fasheh mfas...@suse.com
 +Kalpak Shah kalpak.s...@sun.com
 +Andreas Dilger adil...@sun.com.  */

Shouldn't we also add an FSF Copyright 2010 to this new file, to cover
our changes?

 +++ b/tests/cp/sparse-fiemap
 @@ -0,0 +1,56 @@
 +#!/bin/sh
 +# Test cp --sparse=always through fiemap copy
 +
 +# Copyright (C) 2006-2010 Free Software Foundation, Inc.

How much of this content comes from other files from 2006, vs. new
content needing only 2010?

-- 
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org





bug#6377: Subject: inaccurate character class processing

2010-06-08 Thread Iosif Fettich

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu' 
-DCONF_VENDOR='unknown' -DLOCALEDIR='/usr/local/share/locale' 
-DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib 
-g -O2
uname output: Linux pony.netsoft.ro 2.6.32.12-115.fc12.x86_64 #1 SMP Fri 
Apr 30 19:46:25 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.1
Patch Level: 0
Release Status: release

Description:

(I'm not sure if this is a bash or a coreutils issue).

ls [A-Z]*

doesn't work as expected/documented.
I'd want/expect it to list the filenames starting with an uppercase 
letter.

Thank you for looking at it!


Repeat-By:
In an empty directory, create files like

touch a A b B z Z

Now,

ls [A-Z]*

outputs

A  b  B  z  Z

(why 'b' and 'z' - and/or where's 'a'...?!!)

and

ls [a-z]*

outputs

a  A  b  B  z

(why 'A' and 'B' - and/or where's 'Z'...?!!)








bug#6377: Subject: inaccurate character class processing

2010-06-08 Thread Pádraig Brady
tags 6377 + notabug

On 08/06/10 14:48, Iosif Fettich wrote:
 (I'm not sure if this is a bash or a coreutils issue).
 
 ls [A-Z]*
 
 doesn't work as expected/documented.

The logic is in bash but it's not an issue.
It's using the collating sequence of your locale

$ touch a A b B z Z
$ echo [A-Z]*
A b B z Z
$ export LANG=C
$ echo [A-Z]*
A B Z






bug#6377: Subject: inaccurate character class processing

2010-06-08 Thread Pierre Gaston
On Tue, Jun 8, 2010 at 4:48 PM, Iosif Fettich ifett...@netsoft.ro wrote:
...

        ls [a-z]*

        outputs

        a  A  b  B  z

        (why 'A' and 'B' - and/or where's 'Z'...?!!)


It's a classic problem with the locale: in some locale definitions the
range [a-z] contains the capital letters, i.e. a-z collates as aAbB z
(Z comes after the z).
As a workaround you can export LC_COLLATE=C, or maybe use [[:lower:]]
instead of [a-z].
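
The character-class workaround can be tried out with the files from the
report. This is a minimal sketch, assuming a case-sensitive file system and
a scratch directory from mktemp -d; the variable names are made up for the
example. The classes select by character type rather than by position in
the collating sequence, so the result does not change with LC_COLLATE.

```shell
#!/bin/sh
# Scratch directory holding the six files from the report.
dir=$(mktemp -d) || exit 1
( cd "$dir" && touch a A b B z Z ) || exit 1

# The globs are expanded after the cd, inside the scratch directory.
upper=$(cd "$dir" && echo [[:upper:]]*)
lower=$(cd "$dir" && echo [[:lower:]]*)

printf 'upper: %s\nlower: %s\n' "$upper" "$lower"
rm -rf "$dir"
```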





bug#6377: Subject: inaccurate character class processing

2010-06-08 Thread Greg Wooledge
On Tue, Jun 08, 2010 at 04:48:08PM +0300, Iosif Fettich wrote:
 ls [A-Z]*
 
 doesn't work as expected/documented.
 I'd want/expect it to list the filenames starting with an uppercase 
 letter.

The results of this are dependent upon your locale.  If your locale is
set to C or POSIX, you will get what you expect.  If your locale is set
to something else (such as en_US.utf8) then you will get something
completely different.

I explain why this happens, on http://mywiki.wooledge.org/locale.

The glob in your command is expanded by bash (not ls), so in order to
get the results you want, your locale variables would have to be set to
C/POSIX *before* expanding the glob.  In other words, LANG=C ls [A-Z]*
will not work, since that sets the variable after expanding the glob.

This would work, although it's extremely awkward (IMHO):

  LANG=C bash -c 'ls [A-Z]*'

Another approach would be to permanently (or semi-permanently, e.g. just
for one shell session) set the LC_COLLATE variable.  Thus,

  export LC_COLLATE=C
  ls [A-Z]*

This will cause the ordering of glob results (and also of results generated
by ls itself, for example ls with no arguments, or ls dirname) to be
in ASCII order, without throwing away the other locale features.
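
If changing the collating order for the whole session is too broad, the
same effect can be confined to a subshell. The sketch below is an
illustration with assumptions: it sets LC_ALL rather than LC_COLLATE so that
an LC_ALL already present in the environment cannot override it, and it
assumes a case-sensitive file system. The assignment is a separate command
that runs before the line containing the glob, which is the ordering the
explanation above calls for.

```shell
#!/bin/sh
dir=$(mktemp -d) || exit 1
( cd "$dir" && touch a A b B z Z ) || exit 1

# The command substitution runs in a subshell, so the locale change
# does not leak into the rest of the script.
c_range=$(
  cd "$dir" || exit 1
  LC_ALL=C
  export LC_ALL
  # Expanded only now, after the locale variable has been set.
  echo [A-Z]*
)

printf '%s\n' "$c_range"
rm -rf "$dir"
```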





bug#6131: [PATCH]: fiemap support for efficient sparse file copy

2010-06-08 Thread Sunil Mushran

On 06/08/2010 05:10 AM, Eric Blake wrote:

new file mode 100644
index 000..d33293b
--- /dev/null
+++ b/src/fiemap.h
@@ -0,0 +1,102 @@
+/* FS_IOC_FIEMAP ioctl infrastructure.
+   Some portions copyright (C) 2007 Cluster File Systems, Inc
+   Authors: Mark Fasheh mfas...@suse.com
+Kalpak Shah kalpak.s...@sun.com
+Andreas Dilger adil...@sun.com.  */
 

Shouldn't we also add an FSF Copyright 2010 to this new file, to cover
our changes?
   


+/* Copy from kernel, modified to respect GNU code style by Jie Liu.  */


The difference is only in style.





bug#6053: cp, ls, and mv bug: unknown error (252)

2010-06-08 Thread Callahan, Patrick M.
Bob,

Thanks for responding.  One thing I probably did not make clear is that
I've tried 8.5 using both the depots and the sources from the HP Porting
and Archive Centre (http://hpux.connect.org.uk/). I realize that the
changes they make are out of your control; however, for a sysadmin like me
they make my job easier in our heterogeneous environment.

I started to trace through the code, and it seems that the problem is that
getacl fails because our StorNext File System cvfs (aka snfs) does not
support the getacl call.  I could easily accept better warnings when using
cp or mv across file systems where ACLs are not supported on both, but the
problem is that we see the errors with other commands such as ls -l. Tusc
output for an ls is attached, and you can see that it is the getacl call
that fails.

Patrick M. Callahan Senior Engineer
pat.calla...@gd-ais.com
General Dynamics Advanced Information Systems
10467 White Granite Drive Ste 304
Oakton VA 22124
work: 703.277.1471


-Original Message-
From: Bob Proulx [mailto:b...@proulx.com] 
Sent: Monday, June 07, 2010 8:10 PM
To: Callahan, Patrick M.; 6...@debbugs.gnu.org
Subject: Re: bug#6053: cp, ls, and mv bug: unknown error (252)

Pádraig Brady wrote:
 Callahan, Patrick M. wrote:
  When using coreutils binaries either built from sources or installed
  from the Porting And Archive Centre for HP-UX we see errors of the type
  below when copying (cp), listing (ls), or moving (mv) files or
  directories on Quantum's StorNext file system (cvfs).  When we build
  --without-acl we do not see these errors.
  
  mv SEG_5_1* ~/release-input/dev-to-integration
  mv: preserving permissions for
  `/usr/people/archive/release-input/dev-to-integation/SEG_5_1.txt':
  Unknown error (252)
 
 It seems the file system is returning that which we're dutifully reporting.
 An strace or equivalent would be informative.
 Also the version of coreutils you're using would be good to note
 (8.5 was just released).

On HP-UX the tool 'tusc' (trace unix system call) is the equivalent
tool to 'strace'.  Using it you should be able to see the underlying
system calls and their status returns.  That would provide a valuable
clue as to where the problem lies.

You are using a combination of operating system and filesystem that
isn't very commonly seen.  Nor is it available to the developers
here.  You may very well have found a bug.  But is the bug in
coreutils mv or in the filesystem or in the kernel?  Any of those are
almost equally likely possibilities.  And not having such a system to
try it means that we will need to rely upon you to help.

Bob
# /usr/local/bin/tusc ./ls -l wc
execve(./ls, 0x96c8, 0x96e8) = 0 [32-bit]
mmap(NULL, 8192, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x777fc000
open(/usr/lib/hpux32/dld.so, O_RDONLY, 0) = 3
read(3, 7fE L F 0102010101\0\0\0\0\0\0\0.., 1024) = 1024
mmap(NULL, 735056, PROT_READ|PROT_EXEC, MAP_SHARED|MAP_FILE|MAP_SHLIB, 3, 0) = 0xc0016000
sysconf(_SC_PAGE_SIZE) = 4096
mmap(NULL, 6760, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_SHLIB, -1, 0) = 0x777fa000
mmap(0x777f7000, 11480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FILE|MAP_SHLIB, 3, 786432) = 0x777f7000
close(3) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x777f5000
sysconf(_SC_PAGE_SIZE) = 4096
stat(/usr/lib/hpux32/dpd, 0x72d0) = 0
open(/usr/lib/hpux32/dpd, O_RDONLY, 0) 

bug#6379: process substitution with a give suffix

2010-06-08 Thread Peng Yu
On Tue, Jun 8, 2010 at 1:02 PM, Greg Wooledge wool...@eeg.ccf.org wrote:
 On Tue, Jun 08, 2010 at 12:53:47PM -0500, Peng Yu wrote:
 I have a program that only accepts arguments with a given suffix

 ./program xxx.suffix

 If I use process substitution, which gives me /dev/fd/xx, it will not
 work with the program. Is there a way to make sure a suffix is added
 to the substitute process file handle in /dev/fd/, so that the program
 can work with process substitution?

  mkfifo myfifo.suffix
  something > myfifo.suffix &
  ./program myfifo.suffix
  wait
  rm myfifo.suffix

The above question was sent to bug-bash. But since it is related to
mkfifo, I am redirecting it to bug-coreutils.

I have more than one argument. I tried the following code. It doesn't
seem to work for more than one argument. Would you please let me know
what is wrong?

BTW, using a fifo is going to be much faster than using a temp file, as
it avoids the disk usage, right?

$ cat a.txt
In a.txt
$ cat b.txt
In b.txt
$ cat main.sh
#!/usr/bin/env bash

mkfifo a.suffix
cat a.txt > a.suffix &
mkfifo b.suffix
cat b.txt > b.suffix &
cat < a.suffix < b.suffix
wait
rm a.suffix b.suffix
$ ./main.sh
In b.txt

-- 
Regards,
Peng





bug#6379: process substitution with a give suffix

2010-06-08 Thread Eric Blake
On 06/08/2010 12:33 PM, Peng Yu wrote:
  mkfifo myfifo.suffix
  something > myfifo.suffix &
  ./program myfifo.suffix
  wait
  rm myfifo.suffix
 
 The above question was sent to bug-bash. But since it is related to
 mkfifo, I am redirecting it to bug-coreutils.

Your question is about how to use various Unix tools together; rather
than directing to bug-bash or bug-coreutils, you may be better off
directing to a generic shell-programming forum.

As it is, you didn't raise any bug report about the mkfifo program, so
redirecting to coreutils didn't really buy you anything.

 
 I have more than one argument. I tried the following code. It doesn't
 seem to work for more than one argument. Would you please let me know
 what is wrong?
 
 BTW, using a fifo is going to be much faster than using a temp file, as
 it avoids the disk usage, right?

A fifo will have the same speed as a pipe (another name for a fifo is
named pipe; unlike 'foo | bar', where the pipe is anonymous and exists
between exactly two processes, a fifo can be accessed from the file
system by multiple processes, but under the hood, it uses the same
kernel pipe handling code).  It has the drawback of being non-seekable
in comparison to temporary regular files.  It has the advantage of
atomic operations not guaranteed by disk files, provided you stick to
transactions below the size guaranteed by your kernel.  And if you use a
ramdisk backing store for /tmp, there is little difference in speed
(either way, a ramdisk or a fifo does not have to do disk I/O); but
since you can't guarantee that /tmp is a ramdisk, yes, a fifo can be
faster for interprocess communication.  And if used incorrectly, it has
the potential to deadlock your shell script (something that won't happen
with regular files).

 
 $ cat a.txt
 In a.txt
 $ cat b.txt
 In b.txt
 $ cat main.sh
 #!/usr/bin/env bash
 
 mkfifo a.suffix
 cat a.txt > a.suffix &
 mkfifo b.suffix
 cat b.txt > b.suffix &
 cat < a.suffix < b.suffix

This is redirecting stdin twice, with the net effect that cat only sees
the contents of the fifo b.suffix.  You probably meant to do:

cat a.suffix b.suffix

 wait
 rm a.suffix b.suffix
 $ ./main.sh
 In b.txt
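
The correction can be checked with a small self-contained run. This is a
sketch, not the poster's script: the fifos live in a directory from
mktemp -d, printf stands in for the cat of a.txt and b.txt, and the
redirection operators are written out as I read the thread. Naming both
fifos as operands makes cat read them one after the other, whereas two "<"
redirections on one command leave only the last one attached to stdin.

```shell
#!/bin/sh
dir=$(mktemp -d) || exit 1
mkfifo "$dir/a.suffix" "$dir/b.suffix" || exit 1

# Writers run in the background; each blocks until a reader opens its fifo.
printf 'In a.txt\n' > "$dir/a.suffix" &
printf 'In b.txt\n' > "$dir/b.suffix" &

# Both fifos are passed as file operands, so cat drains a, then b.
out=$(cat "$dir/a.suffix" "$dir/b.suffix")
wait
rm -rf "$dir"
printf '%s\n' "$out"
```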
 

-- 
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org





bug#6053: cp, ls, and mv bug: unknown error (252)

2010-06-08 Thread Jim Meyering
Callahan, Patrick M. wrote:
 Pádraig Brady wrote:
 Callahan, Patrick M. wrote:
  When using coreutils binaries either built from sources or installed
  from the Porting And Archive Centre for HP-UX we see errors of the type
  below when copying (cp), listing (ls), or moving (mv) files or
  directories on Quantum's StorNext file system (cvfs).  When we build
  --without-acl we do not see these errors.
 
  mv SEG_5_1* ~/release-input/dev-to-integration
  mv: preserving permissions for
  `/usr/people/archive/release-input/dev-to-integation/SEG_5_1.txt':
  Unknown error (252)

Thanks for the details.
Do you care about ACLs?
If not, then building --without-acl should be fine.

If you do require that ACLs be preserved in general,
then more work will be required.  The code in question is
probably part of gnulib, in copy-acl.c.  There, we already
ignore failure due to conditions that imply lack of support:

if ((errno == ENOSYS || errno == EOPNOTSUPP)
...
if ((errno == ENOSYS || errno == EINVAL || errno == ENOTSUP)

But that the errno value in question, 252, does not map
to a strerror string is ominous.  Could it be that your version
of HP-UX's C library is lacking patches that might provide that?

You could get in a debugger and determine where
to add || errno == 252 to solve what appears to be
an HP-UX-and/or-cvfs-specific problem.

However, such a change is not appropriate for others,
and I doubt it will be worthwhile to attempt an
upstream workaround.





bug#6131: [PATCH]: fiemap support for efficient sparse file copy

2010-06-08 Thread Jim Meyering
Jim Meyering wrote:
 Subject: [PATCH 01/10] cp: Add FIEMAP support for efficient sparse file copy

FYI, using those patches, I ran a test for the first time in a few days:

make check -C tests TESTS=cp/sparse-fiemap VERBOSE=yes

It failed like this on an ext4 partition using F13:

+ timeout 10 cp --sparse=always sparse fiemap
+ fail=1
++ stat --printf %s sparse
++ stat --printf %s fiemap
+ test 1099511628800 = 0
+ fail=1

That is very odd.  No diagnostic from cp, yet it failed
after creating a zero-length file.

Here's the corresponding piece of the script:

# It takes many minutes to copy this sparse file using the old method.
# By contrast, it takes far less than 1 second using FIEMAP-copy.
timeout 10 cp --sparse=always sparse fiemap || fail=1

# Ensure that the sparse file copied through fiemap has the same size
# in bytes as the original.
test $(stat --printf %s sparse) = $(stat --printf %s fiemap) || fail=1

However, so far I've been unable to reproduce the failure,
running hundreds of iterations:

for i in $(seq 300); do printf .; make check -C tests \
  TESTS=cp/sparse-fiemap VERBOSE=yes >& makerr-$i || break; done

Have any of you heard of a problem whereby a cold cache can cause
such a thing?  echo 3 > /proc/sys/vm/drop_caches didn't help.
I suspect that having so many extents is unusual, so maybe
this is a rarely exercised corner case.

===
As I wrote the above, I realized I probably had enough
information to deduce where things were going wrong, even
if so far I've been unable to reproduce it.

And sure enough.  There is a way to provoke exactly
that failure.  If the *second* (or later) FIEMAP ioctl fails:

  do
    {
      fiemap->fm_length = FIEMAP_MAX_OFFSET;
      fiemap->fm_extent_count = count;

      /* When ioctl(2) fails, fall back to the normal copy only if it
         is the first time we met.  */
      if (ioctl (src_fd, FS_IOC_FIEMAP, fiemap) < 0)
        {
          /* If the first ioctl fails, tell the caller that it is
             ok to proceed with a normal copy.  */
          if (i == 0)
            *normal_copy_required = true;
          return false;
        }

In that case, fiemap_copy returns false (with no diagnostic)
and cp fails silently.

Obviously I will now add code to diagnose the failure,
but do any of you know off hand how to reproduce this
or what the failure might have been?

Here's the patch I plan to merge:

diff --git a/src/copy.c b/src/copy.c
index eb67700..07d605e 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -200,6 +200,12 @@ fiemap_copy (int src_fd, int dest_fd, size_t buf_size,
              ok to proceed with a normal copy.  */
           if (i == 0)
             *normal_copy_required = true;
+          else
+            {
+              /* If the second or subsequent ioctl fails, diagnose it,
+                 since it ends up causing the entire copy/cp to fail.  */
+              error (0, errno, _("%s: FIEMAP ioctl failed"), quote (src_name));
+            }
           return false;
         }





bug#6131: [PATCH]: fiemap support for efficient sparse file copy

2010-06-08 Thread Jim Meyering
Eric Blake wrote:
 +++ b/tests/cp/sparse-fiemap
 @@ -0,0 +1,56 @@
 +#!/bin/sh
 +# Test cp --sparse=always through fiemap copy
 +
 +# Copyright (C) 2006-2010 Free Software Foundation, Inc.

 How much of this content comes from other files from 2006, vs. new
 content needing only 2010?

Good point.
This started as a largely copied test, but now it's mostly new.
I'm taking this opportunity to make it use init.sh, too.
I'll squash this.  Think of init.sh as the next-generation test-lib.sh.

Thanks.

From c7f9d3d0ff23d72cadd435ceef8d44b7eab7f072 Mon Sep 17 00:00:00 2001
From: Jim Meyering meyer...@redhat.com
Date: Tue, 8 Jun 2010 22:53:51 +0200
Subject: [PATCH] test-use init.sh

---
 tests/cp/sparse-fiemap |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/cp/sparse-fiemap b/tests/cp/sparse-fiemap
index 3e7c11f..dc0cf60 100755
--- a/tests/cp/sparse-fiemap
+++ b/tests/cp/sparse-fiemap
@@ -1,7 +1,7 @@
 #!/bin/sh
 # Test cp --sparse=always through fiemap copy

-# Copyright (C) 2006-2010 Free Software Foundation, Inc.
+# Copyright (C) 2010 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -21,7 +21,7 @@ if test $VERBOSE = yes; then
   cp --version
 fi

-. $srcdir/test-lib.sh
+. ${srcdir=.}/init.sh; path_prepend_ ../src

 if df -T -t btrfs -t xfs -t ext4 -t ocfs2 . ; then
   : # Current dir is on a partition with working extents.  Good!
--
1.7.1.501.g23b46





bug#6053: cp, ls, and mv bug: unknown error (252)

2010-06-08 Thread Pádraig Brady
On 08/06/10 21:51, Jim Meyering wrote:
 Callahan, Patrick M. wrote:
 Pádraig Brady wrote:
 Callahan, Patrick M. wrote:
 When using coreutils binaries either built from sources or installed
 from the Porting And Archive Centre for HP-UX we see errors of the type
 below when copying (cp), listing (ls), or moving (mv) files or
 directories on Quantum's StorNext file system (cvfs).  When we build
 --without-acl we do not see these errors.

 mv SEG_5_1* ~/release-input/dev-to-integration
 mv: preserving permissions for
 `/usr/people/archive/release-input/dev-to-integation/SEG_5_1.txt':
 Unknown error (252)
 
 Thanks for the details.
 Do you care about ACLs?
 If not, then building --without-acl should be fine.
 
 If you do require that ACLs be preserved in general,
 then more work will be required.  The code in question is
 probably part of gnulib, in copy-acl.c.  There, we already
 ignore failure due to conditions that imply lack of support:
 
 if ((errno == ENOSYS || errno == EOPNOTSUPP)
 ...
 if ((errno == ENOSYS || errno == EINVAL || errno == ENOTSUP)
 
 But that the errno value in question, 252, does not map
 to a strerror string is ominous.  Could it be that your version
 of HP-UX's C library is lacking patches that might provide that?
 
 You could get in a debugger and determine where
 to add || errno == 252 to solve what appears to be
 an HP-UX-and/or-cvfs-specific problem.
 
 However, such a change is not appropriate for others,
 and I doubt it will be worthwhile to attempt an
 upstream workaround.

Well a quick search says ENOTSUP is 252 on HPUX which is
different to EOPNOTSUPP. So perhaps we need to check
for both these errors in the if statements above.
I've mentioned before about amalgamating various errnos
when testing for not supported, but it seems dubious
to not treat ENOTSUP and EOPNOTSUPP as synonymous at least.

strerror(252) = unknown on HPUX is strange

cheers,
Pádraig.





bug#6131: [PATCH]: fiemap support for efficient sparse file copy

2010-06-08 Thread Paul Eggert
Jeff Liu and Jim Meyering wrote:

 diff --git a/src/fiemap.h b/src/fiemap.h
 new file mode 100644
 index 000..d33293b
 --- /dev/null
 +++ b/src/fiemap.h

Why is this file necessary?  fiemap.h is included only if it exists,
right?  Shouldn't we just use the kernel's fiemap.h rather than
copying it here and assuming kernel internals?

  if (lseek (src_fd, ext_logical, SEEK_SET) < 0LL)

For this sort of thing, please just use 0 rather than 0LL.
0 is easier to read and it has the same effect here.

  char buf[buf_size];

This assumes C99, since buf_size is not known at compile time.
Also, isn't there a potential segmentation-violation problem
if buf_size is sufficiently large?

More generally, since the caller is already allocating a buffer of the
appropriate size, shouldn't we just reuse that buffer, rather than
allocating a new one?  That would avoid the problems of assuming
C99 and possible segmentation violations.


   char fiemap_buf[4096];
   struct fiemap *fiemap = (struct fiemap *) fiemap_buf;
   struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
   uint32_t count = ((sizeof fiemap_buf - sizeof (*fiemap))
                     / sizeof (struct fiemap_extent));

This code isn't portable, since fiemap_buf is only char-aligned, and
struct fiemap may well require stricter alignment.  The code will work
on the x86 despite the alignment problem, but that's just a happy
coincidence.

A lesser point: the code assumes that 'struct fiemap' is sufficiently
small (considerably less than 4096 bytes in size); I expect that this
is universally true but we might as well check this assumption, since
it's easy to do so without any runtime overhead.

So I propose something like this instead:

   union { struct fiemap f; char c[4096]; } fiemap_buf;
   struct fiemap *fiemap = &fiemap_buf.f;
   struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
   enum { count = (sizeof fiemap_buf - sizeof *fiemap) / sizeof *fm_ext };
   verify (count != 0);






bug#6366: join can't join on numeric fields

2010-06-08 Thread Alex Shinn
2010/6/8 Pádraig Brady p...@draigbrady.com:
 On 07/06/10 06:19, Alex Shinn wrote:

 Ideally join should be able to handle files sorted in any order
 that sort provides, but as a bare minimum it should at least
 be able to join files sorted on numeric fields.

 Well if there were no aliases in the numbers, you could always
 sort the output numerically after the join if it was important.

By first sorting lexicographically, you mean?
In the use case I had, the data was already sorted
numerically.  So whenever I want to join two files,
currently I have to do:

  sort file1 > file1.tmp
  sort file2 > file2.tmp
  join file1.tmp file2.tmp | sort -n > out
  rm -f file1.tmp file2.tmp

instead of just

  join -n file1 file2 > out

In the small tools philosophy you want to avoid adding
redundancy, but in this case join isn't doing the same
thing as sort, it's just working with it better.  Not to mention
the fact that sort is an expensive operation to have to
perform multiple times, not just an extra O(n) filter
to throw in the middle of a pipeline.
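
For reference, the multi-step workaround can be wrapped up as below. This is
a sketch under assumptions, not a proposal from the thread: the function
name numeric_join is made up, the temporary files go into a mktemp -d
directory, and LC_ALL=C is forced so that sort and join agree on one
ordering. It also shows the cost being discussed: two extra sorts before
the join and one more after it.

```shell
#!/bin/sh
# Hypothetical helper: join two numerically sorted files on field 1
# and print the result in numeric order.
numeric_join()
{
  tmp=$(mktemp -d) || return 1
  # join wants both inputs sorted on the join field in the collating
  # order it uses itself, so re-sort them bytewise on field 1 first.
  LC_ALL=C sort -k 1,1 "$1" > "$tmp/1" &&
  LC_ALL=C sort -k 1,1 "$2" > "$tmp/2" &&
  LC_ALL=C join "$tmp/1" "$tmp/2" | sort -n
  status=$?
  rm -rf "$tmp"
  return "$status"
}
```

It would be called as numeric_join file1 file2 > out.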

 However if you wanted to join 01 and 1 then your patch is required.
 Are numeric aliases common enough to warrant this? I think so.

Leading zeros may not be so common, but don't forget
1.0 and 1 or 1e2 and 100 and 100.0, etc.

 I'd use -g, --general-numeric to correspond with `sort`.

Yes, that's probably better.

-- 
Alex