Re: Mersenne: Factors Everywhere

1999-09-26 Thread Brian J. Beesley

Hi everyone,

A few points about publishing known factors.

1) I don't see any particular need to make available files in human-
readable format, provided that we provide a specification for the 
file, and also binary executables for common platforms for converting 
a machine-readable file to human-readbale format. On the other hand, 
it could be that "zipping" a human-readable format file provides near-
optimal compression.

2) Since factors are of the form 2kp+1, it is only neccessary to list 
p  k. This saves a significant amount of storage (compared with 
listing p and 2kp+1) and the proper factor is easily recovered.

3) If the data was held in variable-length records in a format 
similar to

p,n,k1,k2,k3...

where n is the number of known factors and k1, k2, k3,... are the 
multipliers for each of the known factors in turn, this would save 
much duplication compared with the current format (of factor.txt).

4) For values of p which fit into a (integer) CPU register and 
"small" values of k (up to approx. 64K), trial factoring is so fast 
that I doubt it would be any faster to search a file than to repeat 
the calculation - even if the file is already available locally. 
Therefore I suggest that, in order to reduce substantially the size 
of the frequently-updated publicly-available known factors file, 
"small" values of k are omitted, and a program be provided instead to 
generate the data (with binary executables for common platforms).

Perhaps the file should contain a "flag" denoting that factors are 
known, but have been omitted for this reason.

We should still have a large master file somewhere, but this need not 
be uploaded frequently.

5) Following on from (4), it seems to make sense to record the 
highest value of k tested by trial factoring, rather than the maximum 
number of bits.

6) PrimeNet trial factoring has moved well into the 10 millions, 
however George's factor.zip file contains data only up to 10 million. 
I know there is a problem with uploading the large files concerned; 
hopefully, the suggestions above will help to reduce the size of the 
file, sufficiently to make it feasible to expand the exponent range 
to cover (at least) the active Primenet assignments for long enough 
for cheap, high-speed Internet connectivity to become widespread.

7) I have several hundred megabytes of disk space available on my ftp 
server, which has reasonable Internet access - at least 8 Mbps 
connectivity to the Internet core - and would be happy to provide 
means for anyone interested to upload factoring data (or anything 
else, strictly relevant to Mersenne or closely-related numbers) for 
the purpose of making it publicly available.


Regards
Brian Beesley
_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Factors Everywhere

1999-09-26 Thread Will Edgington


Brian J. Beesley writes:

   1) I don't see any particular need to make available files in human-
   readable format, provided that we provide a specification for the 
   file, and also binary executables for common platforms for converting 
   a machine-readable file to human-readbale format.

I, personally, have no way of producing executables except for Intel
CPUs, and presently only for Linux (I just yesterday got a old P100
machine up running Win98 (and Prime95, of course:)).

I have thought about a binary data format off and on for a while now,
but mostly in terms of making updates easier by designing a format
that can be updated in place (at least most of the time), which is a
different problem.

   On the other hand, it could be that "zipping" a human-readable
   format file provides near-optimal compression.

Most likely, and this avoids having to provide executables to read it:
they're already out there.

   2) Since factors are of the form 2kp+1, it is only neccessary to list 
   p  k. This saves a significant amount of storage (compared with 
   listing p and 2kp+1) and the proper factor is easily recovered.

Unfortunately, it's only this simple when p is odd.  For even p,
factors can be one more than any multiple of p, not just one more than
even multiples of p.  And I'd really rather not have a different
format for even exponents ...

   3) If the data was held in variable-length records in a format 
   similar to

   p,n,k1,k2,k3...

   where n is the number of known factors and k1, k2, k3,... are the 
   multipliers for each of the known factors in turn, this would save 
   much duplication compared with the current format (of factor.txt).

This is also a good idea, but a few of the lines would be _very_ long,
which some text editors and other programs will have problems with.
I'd also suggest dropping n; including it doesn't really add anything
and it can be computed quickly from the other data.  Perhaps something
like:

n
p,pk1
,pk2
,pk3

Note that M(n) has no known factors.

   4) For values of p which fit into a (integer) CPU register and 
   "small" values of k (up to approx. 64K), trial factoring is so fast 
   that I doubt it would be any faster to search a file than to repeat 
   the calculation - even if the file is already available locally. 

Yes, but then there's no way to be sure whether the factor is already
known or not.  There was a bug in an early post-GIMPS-start factorer
of mine that caused it to miss _exactly_ one factor, of M(47).  If I
hadn't had an explicit list of factors to compare with, I would have
never found the bug.

   Perhaps the file should contain a "flag" denoting that factors are 
   known, but have been omitted for this reason.

This might be enough, though.  But note that this can already be done
using the DATABASE format, if you're willing to omit all factors.  Hm;
so perhaps a DATABASE file with all the exponents and trial factoring
extents and a second, human readable, file, as above, is a possibility.

And I've long thought about a small program ( function) to do fast,
binary lookups in large Mersenne data files like these (human readable
or binary), akin to the common UNIX "look" program for dictionary
files, but, of course, doing a numeric search rather than a dictionary
search.

   We should still have a large master file somewhere, but this need not 
   be uploaded frequently.

I update my local copy about twice a week usually; it takes only a
couple hours on my new PIII/450MHz machine and was well under a day
even on my old P233MMX system.  These times include automatically
downloading data out there every few days to a month, as appropriate
for the particular file.

   5) Following on from (4), it seems to make sense to record the 
   highest value of k tested by trial factoring, rather than the maximum 
   number of bits.

This is lacking in the DATABASE format, but my format implies this
info.

And, for a year or more, I've actually been treating the "distance" in
k's to get to the next power of two as a "gap" in the trial factoring
data and thus most of it is just above powers of two.  Note also that
exponents with known factors can be worked on by Prime95 again, after
mersfacgmp or some other factorer pushes the extent past the next
power of two above a factor.

   6) PrimeNet trial factoring has moved well into the 10 millions, 
   however George's factor.zip file contains data only up to 10 million. 

Yup.  This also means that my factors and his for exponents above 10
million have not been compared for some time.

   I know there is a problem with uploading the large files concerned;
   hopefully, the suggestions above will help to reduce the size of
   the file, sufficiently to make it feasible to expand the exponent
   range to cover (at least) the active Primenet assignments for long
   enough for cheap, high-speed Internet connectivity to become
   widespread.

Or perhaps George should simply shift to the factors of the exponents

Re: Mersenne: Factors Everywhere

1999-09-19 Thread Conrad Curry



On Sat, 18 Sep 1999, Eric Hahn wrote:

   What I'm looking for is the following two items for *all*
 Mersenne numbers 2^p-1 where p is prime and p1:

  It can be proven that there are an infinite number of these.


   1) All known factors (including, but not limited to,
  the smallest known factor (noted if it isn't))
   2) Largest potential factor attempted
   I ask that the two items are human-readable at the
 very least.

  Will Edgington maintains this information, but it may be
hundreds of megabytes in size.  If a website, such as
Entropia, has the space it will be useful to make this database
available (in many small compressed files) so that others may
use it.


_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Factors Everywhere

1999-09-19 Thread Eric Hahn


   What I'm looking for is the following two items for *all*
 Mersenne numbers 2^p-1 where p is prime and p1:

  It can be proven that there are an infinite number of these.

Yeah, right, I knew that...  I guess I should've clarified
and said for all of them that the information is known :(  
If no information is known where p100M, then what can I do??

   1) All known factors (including, but not limited to,
  the smallest known factor (noted if it isn't))
   2) Largest potential factor attempted
   I ask that the two items are human-readable at the
 very least.

  Will Edgington maintains this information, but it may be
hundreds of megabytes in size.  If a website, such as
Entropia, has the space it will be useful to make this database
available (in many small compressed files) so that others may
use it.

Isn't the majority of the information he has in
machine-readable format though??  I can't make much use
of it, if I can't read it...

Eric Hahn

_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers