Also, each external symbol (function, macro, variable) should have a comment
explaining what it does. Currently I'm at a bit of a loss trying to
figure out what things do, so my comments will be limited.
+#ifndef _CHECKSUM_H
+#define _CHECKSUM_H 1
+
#include <sys/types.h>
+#include
On Sat, Jul 16, 2005 at 07:01:53AM -0700, Frederik Eaton wrote:
If it's the pseudorandomness, I think mentioning that is redundant,
and the same thing I said about not wanting implementation in the API
applies - a good pseudorandom number generator should be externally
indistinguishable from
Frederik Eaton [EMAIL PROTECTED] wrote:
Attached is a second patch, which contains a ChangeLog entry and some
formatting changes as requested by Jim.
Can you update your patch to be relative to coreutils-CVS,
http://savannah.gnu.org/cvs/?group=coreutils
rather than to the aging 5.2.1?
Also,
Thanks for working on this. You've gotten further than anyone else
has! Some quick comments:
Frederik Eaton [EMAIL PROTECTED] writes:
Is there a script for making a patch with all the right files excluded
by the way?
Not yet. That's on the list of things to do. The fix will be to remove
Hi,
Attached is a third patch.
Is there a script for making a patch with all the right files excluded
by the way? cvs diff produces a huge amount of unrelated output
because of files that are both in the repository and touched by
configure, and it doesn't list new files. And diff doesn't seem to
Attached is a second patch, which contains a ChangeLog entry and some
formatting changes as requested by Jim.
On Tue, Jun 07, 2005 at 08:47:14AM +0200, Jim Meyering wrote:
Frederik Eaton [EMAIL PROTECTED] wrote:
Here is a preliminary patch for basic shuffling functionality in
'sort', with
Frederik Eaton [EMAIL PROTECTED] wrote:
Here is a preliminary patch for basic shuffling functionality in
'sort', with same-keys-sort-together behavior. It adds two options:
-R to compare based on a hash of key, and --seed to specify salt
for the hash. If --seed is not given then the default is
Here is a preliminary patch for basic shuffling functionality in
'sort', with same-keys-sort-together behavior. It adds two options:
-R to compare based on a hash of key, and --seed to specify salt
for the hash. If --seed is not given then the default is to read
from /dev/random or /dev/urandom.
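
The hash-of-key ordering described above can be sketched roughly as follows. This is a minimal Python illustration, not the C code from the patch: the helper name `random_sort` and its `seed` parameter are hypothetical, and the use of `os.urandom` as the fallback salt is an assumption standing in for reading /dev/random or /dev/urandom.

```python
import hashlib
import os

def random_sort(lines, seed=None):
    """Order lines by a salted MD5 digest of each line (hypothetical helper)."""
    # Assumption: with no explicit seed, take a fresh salt from the OS entropy pool.
    salt = seed.encode() if seed is not None else os.urandom(16)
    # Identical lines get identical digests, so they always end up adjacent.
    return sorted(lines, key=lambda line: hashlib.md5(salt + line.encode()).digest())

# With a fixed seed, the result depends only on the set of lines, not their input order.
a = random_sort(list("gfedcba"), seed="demo")
b = random_sort(list("abcdefg"), seed="demo")
print(a == b)  # prints True
```

Because the order is a function of the salt and the line contents alone, two runs agree only when they are given the same seed.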
So, the prototype runs a little slower than I had expected - it's
currently using md5 hashes, I could also look into CRC or something
faster (but less secure, for those concerned). Anyway here is a
sample:
$ time ./sort -R /usr/share/dict/words > /dev/null
./sort -R /usr/share/dict/words
$ print -l g f e d c b a | ./sort -R | md5sum
dda0a6660319917afd6ed021f27fb452 -
$ print -l a b c d e f g | ./sort -R | md5sum
dda0a6660319917afd6ed021f27fb452 -
By the way, this wouldn't actually be the default behavior, you'd have
to specify an explicit seed and have it be the same each
On Saturday 04 June 2005 01:11, Philip Rowlands wrote:
Extend sort.
In extending sort, would the O(n) shuffle algorithm be implemented? Or would
the existing O(n log n) mergesort logic be used via keys?
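
For comparison, the two approaches in question can be sketched as below. This is an illustrative Python sketch under stated assumptions, not code from any proposed patch: `shuffle_linear` is the standard Fisher-Yates (Knuth) shuffle, `shuffle_by_key` attaches a random key to each item and sorts, and both function names are hypothetical.

```python
import random

def shuffle_linear(items, rng=random):
    """Fisher-Yates (Knuth) shuffle: O(n) swaps, returns a new list."""
    out = list(items)
    for i in range(len(out) - 1, 0, -1):
        j = rng.randint(0, i)  # uniform index in [0, i], inclusive
        out[i], out[j] = out[j], out[i]
    return out

def shuffle_by_key(items, rng=random):
    """Attach a random key to each item and sort on it: O(n log n)."""
    keyed = [(rng.random(), item) for item in items]
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed]
```

The linear version needs the whole input in memory with random access, while the keyed version can reuse whatever machinery the sort already has, including its handling of inputs too large for memory.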
Though I am not a sort maintainer, and am probably the least qualified to pass
assumption
If you intended on making shuffle part of coreutils someday, then you
could use the GNU womb repository on Savannah. You'd need to get
proper papers from [EMAIL PROTECTED] though, and if you add code that was
written by someone else we'd need papers from them too. But this
would make putting
Frederik Eaton [EMAIL PROTECTED] writes:
How about this: Put an upper limit on the number of samples that your
adversary will be able to try before the earth blows up.
But that's not how adversarial attacks work. They work by exploiting
flaws in your pseudorandom number generator.
Thus, for
On Sat, Jun 04, 2005 at 04:29:50PM -0700, Paul Eggert wrote:
were truly random. The application to high-stakes poker games should
be obvious.
<snip>
(I agree that all this is overkill for non-adversarial applications.)
Aside from shuffling cards (which should rarely if ever involve more
than
On Sat, Jun 04, 2005 at 04:29:50PM -0700, Paul Eggert wrote:
Frederik Eaton [EMAIL PROTECTED] writes:
How about this: Put an upper limit on the number of samples that your
adversary will be able to try before the earth blows up.
But that's not how adversarial attacks work. They work by
Philip Rowlands [EMAIL PROTECTED] writes:
I'm still interested to read what Paul considers to be the
difficulties of such an implementation?
Suppose you're randomizing an input file of 10 million lines. And
suppose you want to approximate a truly random key by using a
128-bit random key for
On Thu, Jun 02, 2005 at 11:31:26PM -0700, Paul Eggert wrote:
Philip Rowlands [EMAIL PROTECTED] writes:
I'm still interested to read what Paul considers to be the
difficulties of such an implementation?
Suppose you're randomizing an input file of 10 million lines. And
suppose you want
On Wed, Jun 01, 2005 at 06:52:08PM -0700, Frederik Eaton wrote:
So, what is the current state of things? Who is in charge of accepting
patches?
The coreutils maintainers, who are all subscribed to this list I
think. So, you're asking in the right place.
Are we decided that a 'shuffle'
On Thu, 2 Jun 2005, James Youngman wrote:
I think the consensus is that the functionality belongs in sort.
Beyond that things are a bit less clear. However, Paul put forward a
proposed usage which adapts the current -k option (see
Frederik Eaton [EMAIL PROTECTED] wrote:
So, what is the current state of things? Who is in charge of accepting
patches? Are we decided that a 'shuffle' command but no 'sort -R'
facility would be best, or that it would be good to have both, or is
it still in question whether either would be
James I think the consensus is that the functionality belongs in sort.
James Beyond that things are a bit less clear. However, Paul put forward a
James proposed usage which adapts the current -k option (see
James http://lists.gnu.org/archive/html/bug-coreutils/2005-05/msg00179.html).
James Nobody
On Thu, 2 Jun 2005, Frederik Eaton wrote:
Phil Is it that the app must guarantee all lines of a
Phil non-seekable stdin must have an equal chance of any sort order?
See my comment to James above. I think one need not make this
guarantee, since only a tiny fraction of possible sort orders will be
There seems to be some sloppy thinking regarding efficiency and uniform
randomness. Regarding uniform randomness, the infamous Oleg of
comp.lang.{scheme,functional} writes:
Furthermore, if we have a sequence of N elements and associate with
each element a key -- a random number uniformly
Phil Is it that the app must guarantee all lines of a
Phil non-seekable stdin must have an equal chance of any sort order?
See my comment to James above. I think one need not make this
guarantee, since only a tiny fraction of possible sort orders will be
able to be tried by the user.
So, what is the current state of things? Who is in charge of accepting
patches? Are we decided that a 'shuffle' command but no 'sort -R'
facility would be best, or that it would be good to have both, or is
it still in question whether either would be accepted?
Frederik
--
On Monday 30 May 2005 23:02, Frederik Eaton wrote:
I hope that you aren't proposing an algorithm which is similar to
card-shuffling. That would be exactly like merge-sorting on a key hash
- i.e. no more efficient.
Agreed! The algorithm implemented is a slight variation on Knuth's shuffle
On Wed, May 25, 2005 at 10:58:41AM +0100, James Youngman wrote:
On Tue, May 24, 2005 at 09:55:35AM -0700, Paul Eggert wrote:
That way, you could use, e.g.:
sort -k 2,2 -k R
which would mean sort by the 2nd field, but if there are ties then
sort the ties randomly. sort -R would
I'm not following exactly - in part I think it is premature to discuss
implementation details now. And as for the idea to put shuffle
functionality in a separate command, this and other issues were
discussed at length in the previous thread which starts here:
Hi Frederik! I guess we're both a little confused :) My question is why would
I sort AND shuffle in the same command? Are we talking sort the whole data
set and shuffle a subset? I guess I'm having a hard time thinking why I would
randomize via key--not saying that there aren't reasons, I'm
On Mon, May 30, 2005 at 09:25:45AM +, Davis Houlton wrote:
Hi Frederik! I guess we're both a little confused :) My question is why would
I sort AND shuffle in the same command? Are we talking sort the whole data
set and shuffle a subset? I guess I'm having a hard time thinking why I would
On Tue, May 24, 2005 at 09:55:35AM -0700, Paul Eggert wrote:
That way, you could use, e.g.:
sort -k 2,2 -k R
which would mean sort by the 2nd field, but if there are ties then
sort the ties randomly. sort -R would be short for sort -k R.
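
The intended effect of `sort -k 2,2 -k R` can be sketched in Python as follows. This is only an illustration of the proposed semantics, not an implementation of the option: the function name `sort_field_then_random` is hypothetical, fields are assumed to be whitespace-separated, and every line is assumed to contain the requested field.

```python
import random

def sort_field_then_random(lines, field=2, rng=random):
    """Sort by one whitespace-separated field; order ties randomly."""
    def key(line):
        # Primary key: the chosen field. Secondary key: a fresh random number,
        # playing the role of the imaginary field R.
        return (line.split()[field - 1], rng.random())
    return sorted(lines, key=key)

lines = ["x b", "y a", "z b", "w a"]
for line in sort_field_then_random(lines):
    print(line)
```

Lines whose second field is "a" always come before those with "b", but the order within each group varies from run to run.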
Perhaps this approach avoids the problems that
On Tue, May 24, 2005 at 11:25:48AM +0100, [EMAIL PROTECTED] wrote:
James Youngman wrote:
Davis Houlton writes:-
I recently had to write a shuffle utility for a personal project and
was wondering if it would make a candidate for the coreutils
suite. It seems like the kind of utility the
On Mon, May 23, 2005 at 08:02:19PM +, Davis Houlton wrote:
On Monday 23 May 2005 16:35, you wrote:
So, I think that shuffle is a good idea.
Great! As I wasn't sure if this was a good idea or not, right now the
functionality is quite minimal. I agree that it needs to be expanded, and
James Youngman wrote:
Davis Houlton writes:-
I recently had to write a shuffle utility for a personal project and
was wondering if it would make a candidate for the coreutils
suite. It seems like the kind of utility the toolbox could use
(maybe under section 3. Output of entire files).
I'm just a lurker so my opinion doesn't count for much.
Certainly I don't expect everyone to be a programmer in order to be able
to shuffle their playlist, but perhaps an example needs to be added to
the sort man-page stating how easy it is to accomplish with tools that are
likely already
Lemley James - jlemle wrote:
Certainly I don't expect everyone to be a programmer in order to be able
to shuffle their playlist, but perhaps an example needs to be added to
the sort man-page stating how easy it is to accomplish with tools that are
likely already installed on your system
[EMAIL PROTECTED] writes:
Logically the only difference from sort is the low level ordering
algorithm, so I vote for an extra arg to sort:
--sort=random.
More generally, sort could pretend that every line had an extra field
called R whose contents are random. That way, you could use, e.g.:
On Tuesday 24 May 2005 15:33, Frederik Eaton wrote:
reason to expand the functionality of 'sort'. But in my opinion a more
important reason is that the set of commands that one runs on a unix
system comprise a language, which is a very important language from a
user's perspective, and if
Davis Houlton writes:-
I recently had to write a shuffle utility for a personal project and
was wondering if it would make a candidate for the coreutils
suite. It seems like the kind of utility the toolbox could use
(maybe under section 3. Output of entire files).
This behaviour was proposed
40 matches