Re: you are not going to be able to sort this by the fifth field.

2010-03-04 Thread Eric Blake
According to jida...@jidanni.org on 3/4/2010 7:11 PM:
> Thanks. I see I neglected the -b.
> On the info page in the `--field-separator=SEPARATOR' discussion, do
> mention the effects of -b on ' foo' etc.
> PB> $ LC_CTYPE=C sort --debug -sb -k5,5 < taichung_county_atm.htm
> (Use .txt, not .htms in examples.)
> Anyway, your --debug stuff would be clearer with just pipes added
> inline:
> $ echo 'a   b c'|sort --debug=show_fields
> a|   b| c
> or something like that.

Except that you can specify overlapping keys.  I find the idea of multiple
separate lines of underscores, one per key, much easier to follow in
understanding how each line is broken down into fields and keys, than I
would in trying to parse inline notations that change the line's contents.

-- 
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


test failures on solaris

2010-03-04 Thread Ben Walton

Hi All,

I'm packaging coreutils for OpenCSW[1] to replace our (now ancient)
versions of {sh,file,text}utils and have hit a few test failures I'm
hoping someone can help me solve.

We're still targeting solaris 8[2] for one more release, and this is
where the results were gathered (sparc in this case although i386 is
affected too).

The failing tests are:
cp/into-self
du/long-from-unreadable

I've traced the first failure to a bug dependent on the ordering of
results from readdir() but I don't see the bug itself in the code.  In
many runs of the suite, the test passes.  I can exercise the failure
in a more regular fashion by making the following change to the
cp/into-self test:

$ diff cp/into-self{,-orig}
42c42
< cp -rl dir a dir 2>> out && fail=1
---
> cp -rl a dir dir 2>> out && fail=1

I haven't had much time to look into the du failure, but will be
investigating more in the next few days.  I haven't seen it occur on
i386 either, if that helps.

Thanks for any help you can provide.
-Ben

[1] http://opencsw.org/
[2] I'm groaning too.


coreutils-sol8sparc-fail.log
Description: Binary data


signature.asc
Description: PGP signature


Re: you are not going to be able to sort this by the fifth field.

2010-03-04 Thread jidanni
Thanks. I see I neglected the -b.
On the info page in the `--field-separator=SEPARATOR' discussion, do
mention the effects of -b on ' foo' etc.
PB> $ LC_CTYPE=C sort --debug -sb -k5,5 < taichung_county_atm.htm
(Use .txt, not .htms in examples.)
Anyway, your --debug stuff would be clearer with just pipes added
inline:
$ echo 'a   b c'|sort --debug=show_fields
a|   b| c
or something like that.




Re: say that cut can't handle more than one field demarcator

2010-03-04 Thread Pádraig Brady

On 04/03/10 20:05, jida...@jidanni.org wrote:

On (info "(coreutils) cut invocation") please add:
cut has no way to specify a group of blanks as a field demarcator.
If you want that, use perl -a. Also use perl's split if you want
regexp demarcators, etc.
If that is indeed the case.


Actually using tr from coreutils you can do:

tr -s '[:blank:]' '\t' | cut -f5

In fact I thought I had already added that to
the info docs. I had intended adding it at least,
but obviously I misremembered checking it in.
I'll do that very soon.

thanks,
Pádraig.




Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Pádraig Brady

On 05/03/10 00:39, Joey Degges wrote:



2010/3/4 Pádraig Brady mailto:p...@draigbrady.com>>
Have you considered the seek characteristics of SSDs
and how they might affect things (with consideration
that mechanical disks will be replaced by SSDs quite quickly).
There still would be some benefit splitting per SSD,
but it would be worth measuring to see.


I will post some results testing with various flash keys but I do not
have any proper SSD drives to play with. The extreme case here would be
sorting from multiple ramdisks in which case there is likely to be no
improvements whatsoever --


Right. In general it's worth posting the results for
counter cases like this to help with decisions.


supposing the underlying "do_sort" function
can process a single file in parallel. In this worst case it might be
useful to expose a "--no-multidisk" flag allowing the user to disable
this feature (or a "--multidisk" flag to enable it).


I'm not fond of options for this because if the user
needed to make that decision, then they could nearly
as easily and more generally do:

sort -m <(find /flash/ | xargs -P2 -n1 sort) \
<(find /mech/  | xargs -n1 sort)


+  unsigned long int np2 = num_processors (NPROC_CURRENT) / 2;

You probably want NPROC_CURRENT_OVERRIDABLE ?

Would we want to use the OpenMP environmental variable to affect the
number of pthreads that are used? A more generic PARALLEL variable might
be better suited.


That would be a better name but non standard.
OMP_NUM_THREADS can be used to config all OpenMP programs,
and also the coreutils nproc command honors it.
So for consistency at least _OVERRIDABLE might be best.

cheers,
Pádraig.




Re: you are not going to be able to sort this by the fifth field.

2010-03-04 Thread Pádraig Brady

On 04/03/10 19:59, jida...@jidanni.org wrote:

Try as you might, there is no way you are going to sort by this field,
$ LC_CTYPE=zh_TW.UTF-8 w3m -dump \
   
http://www.tcb-bank.com.tw/tcb/servicesloc/atm_location/taichung_county_atm.htm 
|
   perl -anlwe 'print $F[4] if exists $F[4]'|LC_CTYPE=C sort
without ripping it out of the table first using perl. Go ahead, try -t ... -k 
...,...
You won't be able to order that field in the same way one can after
ripping it out of the table.


This seems to work for me: LC_CTYPE=C sort -sb -k5,5
I confirmed by extracting the field as perl above _after_ the sort using:
sed 's/^ *//; s/ +/ /g; s/\r$//' | cut -d ' ' -s -f5 | sed '/^$/d'


sort (GNU coreutils) 8.4
P.S., perhaps add a --debug-fields mode which adds field boundary | pipe
symbols into the output.


Yes I agree it's very difficult to know exactly what's going on
with the field processing in sort. I actually proposed and
mostly implemented a --debug option. Here are some examples:

$ LC_CTYPE=C sort --debug -sb -k5,5 < taichung_county_atm.htm

** no match for key **

  


Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Joey Degges
2010/3/4 Pádraig Brady 

> On 04/03/10 06:01, Joey Degges wrote:
>
>> Hello,
>>
>> Matt (cc'd, not on list) and I have developed a patch that parallelizes
>> sort
>> using pthreads. The threading approach taken here is to break up input
>> files
>> into groups based on physical device and sort each of those groups in
>> parallel.
>> This allows for storage and processing resources to be more fully utilized
>> during the sort process.
>>
>>
>  multidisk sort with 4x 50M inputs
>> $ time -v sort_multidisk /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null
>> User time (seconds): 10.89
>> System time (seconds): 0.83
>> Percent of CPU this job got: 180%
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.50
>> Major (requiring I/O) page faults: 20
>> Minor (reclaiming a frame) page faults: 72843
>> Voluntary context switches: 1936
>> Involuntary context switches: 149
>> File system inputs: 393080
>>
>> current sort with 4x 50M inputs
>> $ time -v sort_current /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null
>> User time (seconds): 11.05
>> System time (seconds): 0.32
>> Percent of CPU this job got: 86%
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.19
>> Major (requiring I/O) page faults: 18
>> Minor (reclaiming a frame) page faults: 72532
>> Voluntary context switches: 915
>> Involuntary context switches: 48
>> File system inputs: 198936
>>
>
>  diff --git a/src/sort.c b/src/sort.c
>> +  --threads=N   use no more than N threads to improve
>> parallelism\n\
>>
>
> Threads is a bit of an implementation detail?
> Perhaps it would be better to use --parallel so one
> could move to multiprocess in future if desired?


> Since this is a new option, it requires documentation
> in docs/coreutils.texi which would include examples
> of how and why you would use this option.
>
>  +
>> +/* Sort NFILES FILES onto OUTPUT_FILE.
>> +
>> +   Threading approach: Break FILES up into several groups where each
>> contains
>> +   only files that can be found on the same physical device (according to
>> +   device_cmp()). Spawn threads to execute do_sort() on each group of
>> files in
>> +   parallel.
>> +
>> +   This allows for all concerned resources (storage devices and
>> processors) to
>> +   be more fully utilized.
>> +*/
>> +
>> +static void
>> +sort_multidisk (char * const *files, size_t nfiles, char const
>> *output_file,
>> +unsigned long int nthreads)
>>
>
> Have you considered the seek characteristics of SSDs
> and how they might affect things (with consideration
> that mechanical disks will be replaced by SSDs quite quickly).
> There still would be some benefit splitting per SSD,
> but it would be worth measuring to see.
>

I will post some results testing with various flash keys but I do not have
any proper SSD drives to play with. The extreme case here would be sorting
from multiple ramdisks in which case there is likely to be no improvements
whatsoever -- supposing the underlying "do_sort" function can process a
single file in parallel. In this worst case it might be useful to expose a
"--no-multidisk" flag allowing the user to disable this feature (or a
"--multidisk" flag to enable it).



>  @@ -3691,7 +3996,21 @@ main (int argc, char **argv)
>>   IF_LINT (free (sortfiles));
>> }
>>   else
>> -sort (files, nfiles, outfile);
>> +{
>> +  if (!nthreads)
>> +{
>> +  /* The default NTHREADS is 2 ** floor (log2 (number of
>> processors)).
>> + If comparisons do not vary in cost and threads are
>> + scheduled evenly, with the current algorithm there is no
>> + performance advantage to using a number of threads that
>> + is not a power of 2.  */
>> +  unsigned long int np2 = num_processors (NPROC_CURRENT) / 2;
>> +  for (nthreads = 1; nthreads <= np2; nthreads *= 2)
>> +continue;
>> +}
>>
>
> You probably want NPROC_CURRENT_OVERRIDABLE ?
>

Would we want to use the OpenMP environmental variable to affect the number
of pthreads that are used? A more generic PARALLEL variable might be better
suited.


Thanks,
Joey


say that cut can't handle more than one field demarcator

2010-03-04 Thread jidanni
On (info "(coreutils) cut invocation") please add:
   cut has no way to specify a group of blanks as a field demarcator.
   If you want that, use perl -a. Also use perl's split if you want
   regexp demarcators, etc.
If that is indeed the case.




you are not going to be able to sort this by the fifth field.

2010-03-04 Thread jidanni
Try as you might, there is no way you are going to sort by this field,
$ LC_CTYPE=zh_TW.UTF-8 w3m -dump \
  
http://www.tcb-bank.com.tw/tcb/servicesloc/atm_location/taichung_county_atm.htm 
|
  perl -anlwe 'print $F[4] if exists $F[4]'|LC_CTYPE=C sort
without ripping it out of the table first using perl. Go ahead, try -t ... -k 
...,...
You won't be able to order that field in the same way one can after
ripping it out of the table.
sort (GNU coreutils) 8.4
P.S., perhaps add a --debug-fields mode which adds field boundary | pipe
symbols into the output.




Re: [Patch In Progress] rm -d, --directory

2010-03-04 Thread Eric Blake
According to William Plusnick on 3/4/2010 9:02 AM:
> Hello,
> I am working on a patch that adds a FreeBSD style directory option, as
> suggested in the rm.c source code. The --directory (-d) option will
> delete empty directories, as well as the usual files, passed via
> command line.

Thanks for doing this.  Normally, adding new options (especially short
options, like -d) is difficult, but by pointing to prior practice in the
BSD implementation, you've already got the justification for this.

> I am now in the documentation stages and nearing
> completion (all the functionality is working). I have a few questions
> mostly regarding legal issues.
> 
> I know that I will have to sign some papers, after an email exchange
> with the Free Software Foundation. Who do I email to start this
> transaction? Also, I am only 15 years old, will this complicate
> anything, legal wise, anymore than it would be normally?

I'm sending you the contact information off-list.  As to whether your age
has an impact, that will be a question for the FSF contact to answer.

> 
> Where do I post my patch, when completed, for approval? Do you want
> the modified files each to have their own separate patches or all of the
> files
> to be in one large patch? Do you want the documentation patch separate
> from the source code patch?

Definitely an atomic commit per conceptual change, where a commit consists
of changes to multiple files.  Mail the proposed patches to this list.
The best thing to do would be reading up on our HACKING guidelines:
http://git.sv.gnu.org/cgit/coreutils.git/tree/HACKING

You are basing your patches on the latest git, right?

> 
> If the check to see if the directory is empty fails, I print an error
> message. Currently I am using this form:
> error (0, 0, "couldn't remove %s : Directory not empty");

error (ENOTEMPTY, 0, _("couldn't remove %s"));

Using the correct errno, error() will automatically append the ":
Directory not empty".  Also, note that I marked the string for
translation.  Actually, depending on the syscall you used to determine
whether the directory is non-empty, you may want to use errno directly
rather than hardcoding ENOTEMPTY, in case there are other failures also
worth reporting.

-- 
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Patch In Progress] rm -d, --directory

2010-03-04 Thread William Plusnick
Hello,
I am working on a patch that adds a FreeBSD style directory option, as
suggested in the rm.c source code. The --directory (-d) option will
delete empty directories, as well as the usual files, passed via
command line. I am now in the documentation stages and nearing
completion (all the functionality is working). I have a few questions
mostly regarding legal issues.

I know that I will have to sign some papers, after an email exchange
with the Free Software Foundation. Who do I email to start this
transaction? Also, I am only 15 years old, will this complicate
anything, legal wise, anymore than it would be normally?

Where do I post my patch, when completed, for approval? Do you want
the modified files each to have their own separate patches or all of the
files
to be in one large patch? Do you want the documentation patch separate
from the source code patch?

If the check to see if the directory is empty fails, I print an error
message. Currently I am using this form:
error (0, 0, "couldn't remove %s : Directory not empty");
(Paraphrased as I don't have the source code in front of me.) I went
ahead and manually put the ':' and why it wasn't removed, because we
know why it can't be removed but there isn't an errno that would print
the appropriate message. Is this acceptable or is there a better way?

Thanks,
William

-- 
"Once GNU is written, everyone will be able to obtain good system software
free, just like air."-- RMS in the GNU Manifesto


Re: Fw: side-effect implementing the mv command

2010-03-04 Thread Eric Blake
Hi Derick,

According to Derick Centeno on 3/4/2010 3:38 AM:
> 
>> [please keep replies on the list, so that others may chime in]

Thanks for adding the list back in.

>> I asked for 'ls -l' so we could see full details, such as which files are
>> symlinks.  According to your screenshot, you happen to have plain 'ls'
>> aliased to a coloring version; with the four files on that line colored
>> red, black, green, and blue (I'm guessing dangling symlink, regular file,
>> symlink, executable); but rather than having to decode your colors, an 'ls
>> -l' would have shown that unambiguously in black and white.
>>
>>>
> 
> Thanks for the explanation regarding the function of ls -l, quite instructive.
> Note:  All directories are on the same disk.
> Here is your request:
> 
> [agu...@arakus ~]$ cd /opt/ibm/java*/jre/plugin/ppc/ns7
> [agu...@arakus ns7]$ ls -l
> total 164
> -rwxr-xr-x 1 root root 163244 Dec 14 23:39 libjavaplugin_oji.so
> [agu...@arakus ns7]$ cd /usr/lib/mozilla/plugins
> [agu...@arakus plugins]$ ls -l
> total 52
> lrwxrwxrwx 1 root   root  20 Dec 15 16:59 libgnashplugin.la
> -> ../libgnashplugin.la -rw-r--r-- 1 root   root 869 Jan 24  2008
> libgnashplugin.lai -rwxr-xr-x 1 root   root   45000 Jan 24  2008
> libgnashplugin.so lrwxrwxrwx 1 aguila aguila60 Jan 13 22:39
> libjavaplugin_oji.so
> -> /opt/ibm/java-ppc-60/jre/plugin/ppc/ns7/libjavaplugin_oji.so 
> [agu...@arakus plugins]$ 
> 
> As you can see libjavaplugin_oji.so in the ns7 directory is a shared object 
> not
> a symlink. What I reported earlier is accurate.  I originally executed mv in a
> non-standard manner on this shared object file itself.
> For me, developing an understanding mv's program logic as it executed a 
> symbolic
> link back to the shared object file would be worth pursuing.  

I'm still not sure I understand the exact settings you had before you ran
your mv command.  A transcript of the session that has the events that you
are questioning would go a long way to understand what you are even asking
about.

> 
> By the term, side effect, I mean unexpected/undocumented/ambiguous results
> generated by malformed or unusual commands.  The command I executed to mv
> was non-standard and produced the creation of a symbolic link which I've not
> found referrenced as a standard function of mv.  I believe deducing the 
> program
> logic executed by mv as it followed the directive as I structured it, is worth
> investigating.

mv creates a symlink in the destination directory if the source was a
symlink.  That's standard behavior.  I'm still not sure why you think
there is a side effect; mv does not create any symlinks except when moving
a symlink across devices.  Without something that we can do to reproduce
your actions, it's hard to tell you reasons for what you are seeing.  And
without a more detailed explanation of what you typed, what you expected,
and why what happened is different from what you expected, we can't tell
whether it was your expectations or the tool that had the problem.

-- 
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Fw: side-effect implementing the mv command

2010-03-04 Thread Derick Centeno


Begin forwarded message:

Date: Wed, 3 Mar 2010 13:58:59 -0500
From: Derick Centeno 
To: Eric Blake 
Subject: Re: side-effect implementing the mv command


On Wed, 03 Mar 2010 08:08:55 -0700
Eric Blake  wrote:

> [please keep replies on the list, so that others may chime in]
> 
> According to Derick Centeno on 3/2/2010 5:50 PM:
> >> I'm not quite sure what you saw or what you are asking.  If this is still
> >> an issue for you, could you post more context, such as some 'ls -l *.so'
> >> listings in both the source and destination directories?  Also, are the
> >> two directories on the same disk, where rename(2) would work, or was it a
> >> cross-device move, where mv(1) has to create the copy before deleting the
> >> original?
> >>
> > 
> > Thanks for your response on this query, Eric.
> > 
> > Background:
> > I had downloaded a java plugin designed for PowerPC systems, as I'm running
> > Linux on a PowerPC.  That plugin is known as:  libjavaplugin_oji.so
> > Originally it was installed into: /opt/ibm/java-ppc-60/jre/plugin/ppc/ns7
> > 
> > It is supposed to be moved into: /usr/lib/mozilla/plugins
> > 
> > When I executed the mv command from within the ns7 directory by doing:
> > $ sudo mv ./libjavaplugin_oji.so /usr/lib/mozilla/plugins
> 
> Maybe the issue is related to the fact that you were trying to move a
> symlink, rather than an actual file?  If the symlink contains a relative
> filename, the moved symlink will contain the same relative name, but by
> virtue of the fact that it lives in a different directory, it will (most
> likely) resolve to something different than what it referred to in the
> original location.
> 
> > 
> > Eric, it was only after I studied the man/info pages more carefully
> > regarding how mv should be executed (and other references) and how it
> > should behave that I checked the directory where it was and where I
> > expected it to go.
> > 
> > Here it is in the ns7 directory:
> > 
> > [agu...@arakus ns7]$ ls
> > libjavaplugin_oji.so
> > 
> > Here is what the plugins directory looks like:
> > 
> > [agu...@arakus plugins]$ ls
> > libgnashplugin.la  libgnashplugin.lai  libgnashplugin.so
> > libjavaplugin_oji.so
> 
> I asked for 'ls -l' so we could see full details, such as which files are
> symlinks.  According to your screenshot, you happen to have plain 'ls'
> aliased to a coloring version; with the four files on that line colored
> red, black, green, and blue (I'm guessing dangling symlink, regular file,
> symlink, executable); but rather than having to decode your colors, an 'ls
> -l' would have shown that unambiguously in black and white.
> 
> > 

Thanks for the explanation regarding the function of ls -l, quite instructive.
Note:  All directories are on the same disk.
Here is your request:

[agu...@arakus ~]$ cd /opt/ibm/java*/jre/plugin/ppc/ns7
[agu...@arakus ns7]$ ls -l
total 164
-rwxr-xr-x 1 root root 163244 Dec 14 23:39 libjavaplugin_oji.so
[agu...@arakus ns7]$ cd /usr/lib/mozilla/plugins
[agu...@arakus plugins]$ ls -l
total 52
lrwxrwxrwx 1 root   root  20 Dec 15 16:59 libgnashplugin.la
-> ../libgnashplugin.la -rw-r--r-- 1 root   root 869 Jan 24  2008
libgnashplugin.lai -rwxr-xr-x 1 root   root   45000 Jan 24  2008
libgnashplugin.so lrwxrwxrwx 1 aguila aguila60 Jan 13 22:39
libjavaplugin_oji.so
-> /opt/ibm/java-ppc-60/jre/plugin/ppc/ns7/libjavaplugin_oji.so 
[agu...@arakus plugins]$ 

As you can see libjavaplugin_oji.so in the ns7 directory is a shared object not
a symlink. What I reported earlier is accurate.  I originally executed mv in a
non-standard manner on this shared object file itself.
For me, developing an understanding mv's program logic as it executed a symbolic
link back to the shared object file would be worth pursuing.  

By the term, side effect, I mean unexpected/undocumented/ambiguous results
generated by malformed or unusual commands.  The command I executed to mv
was non-standard and produced the creation of a symbolic link which I've not
found referrenced as a standard function of mv.  I believe deducing the program
logic executed by mv as it followed the directive as I structured it, is worth
investigating.

Thanks to Eric, and all others on this list in helping me work through this
conundrum. 

Derick.


=
Refranes/Popular sayings:
The Taino say:No hay mal que por bien no venga.
There is no evil out of which good cannot blossom.


=
Refranes/Popular sayings:
The Taino say:No hay mal que por bien no venga.
There is no evil out of which good cannot blossom.


signature.asc
Description: PGP signature


Re: a different approach to autocompletion

2010-03-04 Thread Eric Blake
According to Martin Walch on 3/3/2010 3:03 PM:
> Hello list,
> 
> sorry if this is not the appropriate place for the following suggestion, but 
> I 
> do not which place would be bette.
> 
> I think autocompletion in shells (that is usually triggered with tab) is a 
> great feature which I do not want to miss. However, over the years, I have 
> seen several autocomplete implementations from a user's point of view and I 
> think the whole mechanism is a mess. Many shells are trying to guess proper 
> program names and command line arguments for every single program.

No, most shells do pure filename completion only as a default fallback if
you have not defined a more powerful programmable completion.

> The
> autocompletion for program names is working well in most cases. But the 
> completion of command line arguments is broken by design. Why are shells even 
> trying to guess command line arguments? They can not know for every program 
> in 
> the world what arguments it will take.

Unless you use a library of shell completions designed for each particular
program.  For bash, see
 http://bash-completion.alioth.debian.org/
I know that zsh also has decent libraries built, but as I don't use zsh, I
can't easily point you to a link.

> Therefore I have an idea: let the program do the completion by itself. Every 
> program could implement an interface that takes the already typed command 
> line 
> (and maybe pwd) and returns one or more suggestions for completion, or no 
> suggestion or an error if there is no way to complete the argument.

Thanks for the suggestion.  However, I'm personally opposed to doing this
in-program.  It will make EVERY program larger, in order to add the
completion bits.  The UNIX philosophy is to do particular tasks well, and
the shell already does _programmable_ completion well.  I think the time
is better spent enhancing the programmable completion libraries associated
with each shell.

That said, it _does_ make sense for individual packages to contribute to
the programmable completion libraries.  For example, the bash-completion
project does not supply any completions for git.  Why?  Because git
supplies its own library of completions which can be hooked into the
bash-completion framework.  The question then becomes which packages are
so complex and frequently changing that the work of programmable
completion is better delegated to the package rather than to the
programmable completion library.  Coreutils probably falls in the latter
(it is stable and the interfaces change slowly enough that the
bash-completion project has no problem keeping up with changes in
coreutils' interfaces).

-- 
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Pádraig Brady

On 04/03/10 12:12, Chen Guo wrote:

Hi Padraig,
 The documentations coming in my patch, since we're both using the same
--threads option. I haven't put any examples in yet, but I suppose I should.
Thanks for the heads up.

 Why would there be a need to switch to processes? There's more memory
and communication overhead. The only advantage I'd see is address space,
which we'd likely never need.


No particular reason at present but possible future reasons are,
buggy pthreads on some platforms, simpler code, possibility
of processes running on separate systems (memory), ...

cheers,
Pádraig.




Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Chen Guo
Hi Padraig,
The documentations coming in my patch, since we're both using the same
--threads option. I haven't put any examples in yet, but I suppose I should.
Thanks for the heads up.

Why would there be a need to switch to processes? There's more memory
and communication overhead. The only advantage I'd see is address space,
which we'd likely never need.



- Original Message 
> From: Pádraig Brady 
> To: Joey Degges 
> Cc: Report bugs to ; Matt Ball 
> Sent: Thu, March 4, 2010 4:06:19 AM
> Subject: Re: [PATCH] sort: parallel external sort implementation
> 
> On 04/03/10 06:01, Joey Degges wrote:
> > Hello,
> >
> > Matt (cc'd, not on list) and I have developed a patch that parallelizes sort
> > using pthreads. The threading approach taken here is to break up input files
> > into groups based on physical device and sort each of those groups in
> > parallel.
> > This allows for storage and processing resources to be more fully utilized
> > during the sort process.
> >
> 
> > multidisk sort with 4x 50M inputs
> > $ time -v sort_multidisk /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null
> >  User time (seconds): 10.89
> >  System time (seconds): 0.83
> >  Percent of CPU this job got: 180%
> >  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.50
> >  Major (requiring I/O) page faults: 20
> >  Minor (reclaiming a frame) page faults: 72843
> >  Voluntary context switches: 1936
> >  Involuntary context switches: 149
> >  File system inputs: 393080
> >
> > current sort with 4x 50M inputs
> > $ time -v sort_current /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null
> >  User time (seconds): 11.05
> >  System time (seconds): 0.32
> >  Percent of CPU this job got: 86%
> >  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.19
> >  Major (requiring I/O) page faults: 18
> >  Minor (reclaiming a frame) page faults: 72532
> >  Voluntary context switches: 915
> >  Involuntary context switches: 48
> >  File system inputs: 198936
> 
> > diff --git a/src/sort.c b/src/sort.c
> > +  --threads=N   use no more than N threads to improve 
> parallelism\n\
> 
> Threads is a bit of an implementation detail?
> Perhaps it would be better to use --parallel so one
> could move to multiprocess in future if desired?
> 
> Since this is a new option, it requires documentation
> in docs/coreutils.texi which would include examples
> of how and why you would use this option.
> 
> > +
> > +/* Sort NFILES FILES onto OUTPUT_FILE.
> > +
> > +   Threading approach: Break FILES up into several groups where each 
> > contains
> > +   only files that can be found on the same physical device (according to
> > +   device_cmp()). Spawn threads to execute do_sort() on each group of 
> > files 
> in
> > +   parallel.
> > +
> > +   This allows for all concerned resources (storage devices and 
> > processors) 
> to
> > +   be more fully utilized.
> > +*/
> > +
> > +static void
> > +sort_multidisk (char * const *files, size_t nfiles, char const 
> > *output_file,
> > +unsigned long int nthreads)
> 
> Have you considered the seek characteristics of SSDs
> and how they might affect things (with consideration
> that mechanical disks will be replaced by SSDs quite quickly).
> There still would be some benefit splitting per SSD,
> but it would be worth measuring to see.
> 
> > @@ -3691,7 +3996,21 @@ main (int argc, char **argv)
> >IF_LINT (free (sortfiles));
> >  }
> >else
> > -sort (files, nfiles, outfile);
> > +{
> > +  if (!nthreads)
> > +{
> > +  /* The default NTHREADS is 2 ** floor (log2 (number of 
> processors)).
> > + If comparisons do not vary in cost and threads are
> > + scheduled evenly, with the current algorithm there is no
> > + performance advantage to using a number of threads that
> > + is not a power of 2.  */
> > +  unsigned long int np2 = num_processors (NPROC_CURRENT) / 2;
> > +  for (nthreads = 1; nthreads <= np2; nthreads *= 2)
> > +continue;
> > +}
> 
> You probably want NPROC_CURRENT_OVERRIDABLE ?
> 
> cheers,
> Pádraig.





Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Pádraig Brady

On 04/03/10 06:01, Joey Degges wrote:

Hello,

Matt (cc'd, not on list) and I have developed a patch that parallelizes sort
using pthreads. The threading approach taken here is to break up input files
into groups based on physical device and sort each of those groups in
parallel.
This allows for storage and processing resources to be more fully utilized
during the sort process.




multidisk sort with 4x 50M inputs
$ time -v sort_multidisk /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null
 User time (seconds): 10.89
 System time (seconds): 0.83
 Percent of CPU this job got: 180%
 Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.50
 Major (requiring I/O) page faults: 20
 Minor (reclaiming a frame) page faults: 72843
 Voluntary context switches: 1936
 Involuntary context switches: 149
 File system inputs: 393080

current sort with 4x 50M inputs
$ time -v sort_current /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null
 User time (seconds): 11.05
 System time (seconds): 0.32
 Percent of CPU this job got: 86%
 Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.19
 Major (requiring I/O) page faults: 18
 Minor (reclaiming a frame) page faults: 72532
 Voluntary context switches: 915
 Involuntary context switches: 48
 File system inputs: 198936



diff --git a/src/sort.c b/src/sort.c
+  --threads=N   use no more than N threads to improve 
parallelism\n\


Threads is a bit of an implementation detail?
Perhaps it would be better to use --parallel so one
could move to multiprocess in future if desired?

Since this is a new option, it requires documentation
in docs/coreutils.texi which would include examples
of how and why you would use this option.


+
+/* Sort NFILES FILES onto OUTPUT_FILE.
+
+   Threading approach: Break FILES up into several groups where each contains
+   only files that can be found on the same physical device (according to
+   device_cmp()). Spawn threads to execute do_sort() on each group of files in
+   parallel.
+
+   This allows for all concerned resources (storage devices and processors) to
+   be more fully utilized.
+*/
+
+static void
+sort_multidisk (char * const *files, size_t nfiles, char const *output_file,
+unsigned long int nthreads)


Have you considered the seek characteristics of SSDs
and how they might affect things (with consideration
that mechanical disks will be replaced by SSDs quite quickly).
There still would be some benefit splitting per SSD,
but it would be worth measuring to see.


@@ -3691,7 +3996,21 @@ main (int argc, char **argv)
   IF_LINT (free (sortfiles));
 }
   else
-sort (files, nfiles, outfile);
+{
+  if (!nthreads)
+{
+  /* The default NTHREADS is 2 ** floor (log2 (number of processors)).
+ If comparisons do not vary in cost and threads are
+ scheduled evenly, with the current algorithm there is no
+ performance advantage to using a number of threads that
+ is not a power of 2.  */
+  unsigned long int np2 = num_processors (NPROC_CURRENT) / 2;
+  for (nthreads = 1; nthreads <= np2; nthreads *= 2)
+continue;
+}


You probably want NPROC_CURRENT_OVERRIDABLE ?

cheers,
Pádraig.




Re: [PATCH] maint: rename the si_present variable in sort to iec_present

2010-03-04 Thread Jim Meyering
Pádraig Brady wrote:
>
> maint: rename the si_present variable in sort to iec_present
>
> * src/sort.c: The units containing 'i' are actually IEC not SI.
>
> diff --git a/src/sort.c b/src/sort.c
> index 02b2351..5a937dd 100644
> --- a/src/sort.c
> +++ b/src/sort.c
> @@ -181,7 +181,7 @@ struct keyfield
> Handle numbers in exponential notation. */
>bool human_numeric;  /* Flag for sorting by human readable
> units with either SI xor IEC prefixes. */
> -  int si_present;  /* Flag for checking for mixed SI and IEC. */
> +  int iec_present; /* Flag for checking for mixed SI and IEC. */

Good point.




Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Joey Degges
2010/3/4 Pádraig Brady 

> On 04/03/10 09:55, Chen Guo wrote:
>
>> Hi Padraig,
>> Actually we're in the same class. They were assigned external sort, my
>> group
>> worked on internal sort. Our patch submission's imminent, looks like you
>> guys
>> are gonna be busy with code reviews :-)
>>
>
> OK thanks :)
>
> If we do see a benefit from using threads in the internal sort,
> then it's probably better for sort to manage the threads
> for external sort also.


Thanks for the clarification Chen! My group is also finishing up another
patch that applies a similar approach to the merge process: instead of an
n-way merge, carry out a tree of binary merges from different disks in
parallel.

Here are the results you asked for using the current sort. I've included the
verbose output but those stats should only apply to the final merge stage.
It appears that the two methods offer similar speedups.

$ time -v sort -m <(sort /q/0/50M) <(sort /q/1/50M) <(sort /q/2/50M) <(sort
/q/3/50M) > /dev/null
User time (seconds): 1.39
System time (seconds): 0.12
Percent of CPU this job got: 22%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.86
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 531
Voluntary context switches: 5117
Involuntary context switches: 1
File system inputs: 552


The patch presented here can be somewhat easily adapted to manage a parallel
internal sort. The general idea is to compute the total size of each device
file group and use the relative difference in total size to assign threads
to each internal sort. This functionality does not exist yet.

Thanks,
Joey


[PATCH] maint: rename the si_present variable in sort to iec_present

2010-03-04 Thread Pádraig Brady

commit a7b4fa01ddde3f09eb642cb6378e16522917f8e0
Author: Pádraig Brady 
Date:   Thu Mar 4 10:54:21 2010 +

maint: rename the si_present variable in sort to iec_present

* src/sort.c: The units containing 'i' are actually IEC not SI.

diff --git a/src/sort.c b/src/sort.c
index 02b2351..5a937dd 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -181,7 +181,7 @@ struct keyfield
Handle numbers in exponential notation. */
   bool human_numeric;  /* Flag for sorting by human readable
units with either SI xor IEC prefixes. */
-  int si_present;  /* Flag for checking for mixed SI and IEC. */
+  int iec_present; /* Flag for checking for mixed SI and IEC. */
   bool month;  /* Flag for comparison by month name. */
   bool reverse;/* Reverse the sense of comparison. */
   bool version;/* sort by version number */
@@ -1754,10 +1754,10 @@ numcompare (const char *a, const char *b)
 static void
 check_mixed_SI_IEC (char prefix, struct keyfield *key)
 {
-  int si_present = prefix == 'i';
-  if (key->si_present != -1 && si_present != key->si_present)
+  int iec_present = prefix == 'i';
+  if (key->iec_present != -1 && iec_present != key->iec_present)
 error (SORT_FAILURE, 0, _("both SI and IEC prefixes present on units"));
-  key->si_present = si_present;
+  key->iec_present = iec_present;
 }

 /* Return an integer which represents the order of magnitude of
@@ -3211,7 +3211,7 @@ key_init (struct keyfield *key)
 {
   memset (key, 0, sizeof *key);
   key->eword = SIZE_MAX;
-  key->si_present = -1;
+  key->iec_present = -1;
   return key;
 }

@@ -3328,7 +3328,7 @@ main (int argc, char **argv)
   gkey.ignore = NULL;
   gkey.translate = NULL;
   gkey.numeric = gkey.general_numeric = gkey.human_numeric = false;
-  gkey.si_present = -1;
+  gkey.iec_present = -1;
   gkey.random = gkey.version = false;
   gkey.month = gkey.reverse = false;
   gkey.skipsblanks = gkey.skipeblanks = false;




Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Pádraig Brady

On 04/03/10 09:55, Chen Guo wrote:

Hi Padraig,
 Actually we're in the same class. They were assigned external sort, my 
group
worked on internal sort. Our patch submission's imminent, looks like you guys
are gonna be busy with code reviews :-)


OK thanks :)

If we do see a benefit from using threads in the internal sort,
then it's probably better for sort to manage the threads
for external sort also.

cheers,
Pádraig.




Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Chen Guo
Hi Padraig,
Actually we're in the same class. They were assigned external sort, my group
worked on internal sort. Our patch submission's imminent, looks like you guys
are gonna be busy with code reviews :-)




- Original Message 
> From: Pádraig Brady 
> To: Joey Degges 
> Cc: Report bugs to ; Matt Ball 
> Sent: Thu, March 4, 2010 1:41:23 AM
> Subject: Re: [PATCH] sort: parallel external sort implementation
> 
> On 04/03/10 06:01, Joey Degges wrote:
> > Hello,
> >
> > Matt (cc'd, not on list) and I have developed a patch that parallelizes sort
> > using pthreads. The threading approach taken here is to break up input files
> > into groups based on physical device and sort each of those groups in
> > parallel.
> > This allows for storage and processing resources to be more fully utilized
> > during the sort process.
> 
> Thanks for looking at this.
> Are you aware that there is a group already working on this?
> http://lists.gnu.org/archive/html/bug-coreutils/2010-02/msg00223.html
> 
> >
> > Below is a set of experimental results and the statistics of the machine
> > that
> > they were run on. The test below show speed improvements from 45% - 64%.
> >
> > In this patch the method used to determine which device a particular file is
> > on
> > relies on comparisons between struct stat ->  st_dev fields. This leads to
> > some
> > problems in cases where files exist on different partitions of the same
> > disk.
> 
> Issues like this are why it might be better to split the data
> externally to sort. I've commented before that a threading
> solution may only be appropriate for working on a single file:
> http://lists.gnu.org/archive/html/bug-coreutils/2009-10/msg00179.html
> 
> > Preliminary tests have shown that under these situations some performance
> > improvements are still seen, but they are not as dramatic as in the ideal
> > case.
> >
> >
> > system details
> > CPU: Intel(R) Core(TM)2 Quad CPUQ6600  @ 2.40GHz
> > cache size: 4096 KB
> > RAM: 2G
> >
> > disk details:
> > /q/0:
> >   Timing cached reads:   7556 MB in  2.00 seconds = 3781.13 MB/sec
> >   Timing buffered disk reads:  320 MB in  3.00 seconds = 106.50 MB/sec
> > /q/1:
> >   Timing cached reads:   7772 MB in  2.00 seconds = 3888.09 MB/sec
> >   Timing buffered disk reads:  224 MB in  3.03 seconds =  74.00 MB/sec
> > /q/2:
> >   Timing cached reads:   7444 MB in  2.00 seconds = 3724.34 MB/sec
> >   Timing buffered disk reads:  222 MB in  3.01 seconds =  73.77 MB/sec
> > /q/3:
> >   Timing cached reads:   7512 MB in  2.00 seconds = 3758.49 MB/sec
> >   Timing buffered disk reads:  240 MB in  3.02 seconds =  79.50 MB/sec
> >
> > before each run the cache was dropped with:
> >echo 3>  /proc/sys/vm/drop_caches
> >
> > store temp files in RAM:
> >mount -t tmpfs -o size=1000M /ram /ram
> >export TMPDIR=/ram
> >
> > multidisk sort with 4x 50M inputs
> > $ time -v sort_multidisk /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null
> 
> Could you compare:
> 
> time -v sort -m <(sort /q/0/50M) \
>  <(sort /q/1/50M) \
>  <(sort /q/2/50M) \
>  <(sort /q/3/50M) > /dev/null
> 
> cheers,
> Pádraig.





Re: [Fedora-livecd-list] LANGUAGE trumps LC_ALL

2010-03-04 Thread Jim Meyering
Jim Meyering wrote:
> Mads Kiilerich wrote:
...
>> be aware that LANGUAGE trumps:
>>
>> $ LC_ALL=fr_FR.UTF-8 LANGUAGE=da_DK /bin/cat no-such
>> /bin/cat: no-such: Ingen sådan fil eller filkatalog
>>
>> $ LANGUAGE=da_DK.UTF-8 LC_ALL=fr_FR.UTF-8 LANG=C /sbin/parted
>> WARNING: You are not superuser.  Watch out for permissions.
>> /dev/mapper/control: open failed: Adgang nægtet
>> Failure to communicate with kernel device-mapper driver.
>> Fejl: Ingen enhed fundet
>> Forsøg igen/Retry/Annullér/Cancel?
>
> Thanks for the info.
> Beware that using the LANGUAGE envvar like that takes advantage
> of a glibc extension, and hence is not portable.
> Knowing this, I've added "LANGUAGE" to the list of envvars
> that coreutils unsets when running its test suite.
> (for many Perl scripts it was already handled in individual scripts, but
> this change also protects shell scripts, none of which guarded against this)

As far as I can see, no test script is vulnerable to false positives,
either, since nothing failed (pre-patch), when I did this:

LC_ALL=fr_FR.UTF-8 LANGUAGE=da_DK make check

I've pushed the patch regardless.




Re: [PATCH] sort: parallel external sort implementation

2010-03-04 Thread Pádraig Brady

On 04/03/10 06:01, Joey Degges wrote:

Hello,

Matt (cc'd, not on list) and I have developed a patch that parallelizes sort
using pthreads. The threading approach taken here is to break up input files
into groups based on physical device and sort each of those groups in
parallel.
This allows for storage and processing resources to be more fully utilized
during the sort process.


Thanks for looking at this.
Are you aware that there is a group already working on this?
http://lists.gnu.org/archive/html/bug-coreutils/2010-02/msg00223.html



Below is a set of experimental results and the statistics of the machine
that
they were run on. The test below show speed improvements from 45% - 64%.

In this patch the method used to determine which device a particular file is
on
relies on comparisons between struct stat ->  st_dev fields. This leads to
some
problems in cases where files exist on different partitions of the same
disk.


Issues like this are why it might be better to split the data
externally to sort. I've commented before that a threading
solution may only be appropriate for working on a single file:
http://lists.gnu.org/archive/html/bug-coreutils/2009-10/msg00179.html


Preliminary tests have shown that under these situations some performance
improvements are still seen, but they are not as dramatic as in the ideal
case.


system details
CPU: Intel(R) Core(TM)2 Quad CPUQ6600  @ 2.40GHz
cache size: 4096 KB
RAM: 2G

disk details:
/q/0:
  Timing cached reads:   7556 MB in  2.00 seconds = 3781.13 MB/sec
  Timing buffered disk reads:  320 MB in  3.00 seconds = 106.50 MB/sec
/q/1:
  Timing cached reads:   7772 MB in  2.00 seconds = 3888.09 MB/sec
  Timing buffered disk reads:  224 MB in  3.03 seconds =  74.00 MB/sec
/q/2:
  Timing cached reads:   7444 MB in  2.00 seconds = 3724.34 MB/sec
  Timing buffered disk reads:  222 MB in  3.01 seconds =  73.77 MB/sec
/q/3:
  Timing cached reads:   7512 MB in  2.00 seconds = 3758.49 MB/sec
  Timing buffered disk reads:  240 MB in  3.02 seconds =  79.50 MB/sec

before each run the cache was dropped with:
   echo 3>  /proc/sys/vm/drop_caches

store temp files in RAM:
   mount -t tmpfs -o size=1000M /ram /ram
   export TMPDIR=/ram

multidisk sort with 4x 50M inputs
$ time -v sort_multidisk /q/0/50M /q/1/50M /q/2/50M /q/3/50M>  /dev/null


Could you compare:

time -v sort -m <(sort /q/0/50M) \
<(sort /q/1/50M) \
<(sort /q/2/50M) \
<(sort /q/3/50M) > /dev/null

cheers,
Pádraig.