Re: Mersenne: Two L-L tests at once?

2002-03-01 Thread Brian J. Beesley

On Thursday 28 February 2002 22:03, Guillermo Ballester Valor wrote:
 Hi,

 On Thu 28 Feb 2002 22:19, Brian J Beesley wrote:
[ snip ]
  The difference here is that your method generates memory bus traffic at
  twice the rate. George's method takes advantage of the fact that (with
  properly aligned operands) fetching the odd element data automatically
  fetches the adjacent even element data.

 The streams would be alternated: stream0_data(n), stream1_data(n),
 stream0_data(n+1), stream1_data(n+1).

 When fetching data(n) for one stream we also get the other.

Yes, this scheme does seem to work.
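
A minimal sketch of the alternated layout, written in Python for readability
rather than SSE2 assembly. The names `interleave` and `aligned_pair` are
illustrative assumptions, not taken from Prime95 or Glucas. Each consecutive
pair in the buffer models one 16-byte-aligned pair of doubles, so a single
aligned fetch delivers element n of both streams.

```python
def interleave(stream0, stream1):
    """Alternate two equal-length streams: s0[0], s1[0], s0[1], s1[1], ..."""
    if len(stream0) != len(stream1):
        raise ValueError("both streams must have the same length")
    buf = []
    for a, b in zip(stream0, stream1):
        buf.extend((a, b))
    return buf


def aligned_pair(buf, n):
    """Return the two values one aligned 16-byte fetch would deliver for index n."""
    return buf[2 * n], buf[2 * n + 1]
```

With this layout, `aligned_pair(buf, n)` always returns one value from each
stream, which is the property the alternated scheme relies on.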

 The memory bottleneck was the first thing I thought of, and I was close to
 discarding the idea when I realized that the trig data would be the same, and
 the required memory access would be less than double the single stream
 scheme. If a double stream version costs less than double the single one then
 we can speed up the project a bit.

On Friday 01 March 2002 00:37, George Woltman wrote:

 Well, that would be true if SSE2 had a multiply vector by scalar
 instruction. That is, to multiply two values by the same trig value, you
 must either load two copies of the trig value or add instructions to copy the
 value into both halves of the SSE2 register.

I can't see that being a major problem. Surely there's only one main memory
fetch to load the two halves of the SSE2 register with the same value, and
surely the loads can be done in parallel since there's no interaction
(M -> X; then X -> R1 and X -> R2 in parallel, where X is one of the temporary
registers available to the pipeline).
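
The "one fetch, then copy into both halves" idea can be modelled as below.
This is only a sketch in Python, not SSE2 code. `broadcast` and
`twiddle_pair` are hypothetical names, and a 2-tuple stands in for the two
halves of an SSE2 register holding (stream0, stream1).

```python
def broadcast(x):
    """Model 'M -> X; X -> R1 and X -> R2': one load, value copied to both halves."""
    return (x, x)


def twiddle_pair(re, im, w_re, w_im):
    """Multiply element n of both streams by the same complex trig value.

    re and im are 2-tuples holding the (stream0, stream1) real and
    imaginary parts; w_re + i*w_im is the shared trig value.
    """
    wr, wi = broadcast(w_re), broadcast(w_im)
    out_re = tuple(a * c - b * d for a, b, c, d in zip(re, im, wr, wi))
    out_im = tuple(a * d + b * c for a, b, c, d in zip(re, im, wr, wi))
    return out_re, out_im
```

Whether the extra copy is cheap on real hardware is exactly the open question
in this thread; the model only shows that one trig value per pair is enough.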

On Thursday 28 February 2002 21:20, Steinar H Gunderson wrote:

 Testing a number in parallel with itself is obviously a bad idea if there
 occurs an undetected error :-)

Sure. But the only way there would be a problem here (given that the data
values are independent because of the different random offsets) is if there
was a major error like miscounting the number of iterations. This is
relatively easy to test out.

I'm sort of marginally uneasy, rather than terrified, about running a
double-check in parallel with the first test on the same system at the same
time. Also, I think most people would rather complete one assignment in time
T rather than two assignments in time 2T with both results unknown till they
both complete. Against this is that Guillermo's suggestion does something to
counter the relatively low rate at which DCs are completed.

Regards
Brian Beesley

_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Two L-L tests at once?

2002-03-01 Thread Guillermo Ballester Valor

Hi,

On Friday 01 Mar 2002 21:22, Brian J Beesley wrote:
 [ snip ]


  The memory bottleneck was the first thing I thought of, and I was close to
  discarding the idea when I realized that the trig data would be the same,
  and the required memory access would be less than double the single
  stream scheme. If a double stream version costs less than double the
  single one then we can speed up the project a bit.

 On Friday 01 March 2002 00:37, George Woltman wrote:
  Well, that would be true if SSE2 had a multiply vector by scalar
  instruction. That is, to multiply two values by the same trig value, you
  must either load two copies of the trig value or add instructions to copy
  the value into both halves of the SSE2 register.

 I can't see that being a major problem. Surely there's only one main memory
 fetch to load the two halves of the SSE2 register with the same value, and
 surely the loads can be done in parallel since there's no interaction
 (M -> X; then X -> R1 and X -> R2 in parallel, where X is one of the
 temporary registers available to the pipeline).


We would have to evaluate the cost of memory traffic to load data with both
halves the same, or to load two different values and then duplicate each
across its own XMM register. I have no skill in SSE2 and no machine to try it
on. This morning I've been skimming the Intel PDF manual, and I saw that SSE2
was designed by Intel engineers thinking more of multimedia than of
mathematics (or GIMPS). There are some elementary ops they could have
implemented to make complex number multiplication easy, or a vector-by-scalar
multiply, or an exchange between halves :-(  Perhaps in SSE3 :)
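
To make the "less than double" argument concrete, here is a toy count of
8-byte loads per radix-2 butterfly. The model is an assumption made for
illustration only: it ignores caches, prefetching and the instruction cost of
copying a value between register halves, and `doubles_per_butterfly` is a
hypothetical name.

```python
def doubles_per_butterfly(streams, trig_duplicated_in_memory):
    """Toy count of 8-byte loads for one radix-2 butterfly (assumed model)."""
    data = 4 * streams  # two complex inputs (re, im) per stream
    # Either every stream gets its own stored copy of the trig value,
    # or one copy is loaded and duplicated inside the register.
    trig_copies = streams if trig_duplicated_in_memory else 1
    trig = 2 * trig_copies  # one complex trig value (re, im) per copy
    return data + trig


single = doubles_per_butterfly(1, False)
dual_copy_in_register = doubles_per_butterfly(2, False)
dual_copy_in_memory = doubles_per_butterfly(2, True)
```

In this model the two-stream version loads 10 doubles against 6 for a single
stream when the trig value is copied in the register, and 12 (exactly double)
when two copies are stored in memory. That is George's objection in numbers:
the saving exists only if the in-register copy is cheap.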

 On Thursday 28 February 2002 21:20, Steinar H Gunderson wrote:
  Testing a number in parallel with itself is obviously a bad idea if there
  occurs an undetected error :-)

 Sure. But the only way there would be a problem here (given that the data
 values are independent because of the different random offsets) is if there
 was a major error like miscounting the number of iterations. This is
 relatively easy to test out.

 I'm sort of marginally uneasy, rather than terrified, about running a
 double-check in parallel with the first test on the same system at the same
 time. Also, I think most people would rather complete one assignment in
 time T rather than two assignments in time 2T with both results unknown
 till they both complete. Against this is that Guillermo's suggestion does
 something to counter the relatively low rate at which DCs are completed.


I was also worried about that idea, but the more I think about it, the less
absurd it seems to me.

OTOH, I don't know how difficult the DWT carry and normalization code would
be for two _different_ exponents. As a first approximation, I recall some
branch-free code I wrote for Glucas, which in fact processes two streams at
once. So perhaps the cost is small.
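
A rough sketch of a carry pass over two interleaved streams with different
exponents follows. It is not the Glucas code: it uses exact non-negative
integers instead of rounded floating-point values and balanced digits, and the
names (`word_bits`, `carry_two_streams`, `value`) are hypothetical. The
per-word step uses only a shift and a mask, so the two exponents differ only
in their tables of word sizes.

```python
def ceil_div(a, b):
    return -(-a // b)


def word_bits(p, n):
    """Variable word sizes for exponent p over n words; they sum to p."""
    return [ceil_div(p * (j + 1), n) - ceil_div(p * j, n) for j in range(n)]


def value(digits, bits):
    """Integer represented by a list of digits with the given word sizes."""
    total, pos = 0, 0
    for d, b in zip(digits, bits):
        total += d << pos
        pos += b
    return total


def carry_two_streams(interleaved, bits0, bits1):
    """Normalize words laid out as s0[0], s1[0], s0[1], s1[1], ..."""
    n = len(bits0)
    out = list(interleaved)
    carry = [0, 0]
    while True:
        for j in range(n):
            for s, bits in ((0, bits0), (1, bits1)):
                v = out[2 * j + s] + carry[s]
                out[2 * j + s] = v & ((1 << bits[j]) - 1)  # keep low bits
                carry[s] = v >> bits[j]                    # pass the rest on
        if carry == [0, 0]:
            return out
        # A carry out of the top word wraps to word 0, since 2^p = 1 mod 2^p - 1.
```

For example, `word_bits(89, 8)` and `word_bits(107, 8)` give two different
size tables for the same FFT length of 8 words, and the same loop handles both.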

Regards

Guillermo



Re: Mersenne: Two L-L tests at once?

2002-02-28 Thread Guillermo Ballester Valor

Hi again:

I received the mail from the Mersenne list twice. Is it because of the subject? :)
The first copy is the mail I sent to the list; the second is the same mail
mirrored by a system unknown to me, 'ntsys24yucombe' !?


 Back to the subject, I'm wondering how fast we can do two L-L tests in
 parallel using these SSE2 extensions. Basically, I'm thinking of using two
 nearby exponents with the same FFT length. The memory access in the FFT phase
 would be the same, the trig data also the same; the most difficult part
 would be the carry-and-normalization pass.

This difficulty could also disappear by running the second test on the same
exponent. We then get the L-L test and the double check at once. Remember
that the scheme Prime95 uses for a double check is to shift the initial value
by a random number of bits. The DWT scrambles the data enough to be reasonably
sure both tests are independent. A matching result would give very high
confidence in the result. A non-matching result would tell us something was
wrong. It would also allow us to check interim results to be sure all is well
so far.
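
The shifted test can be sketched with exact integers as below. This is an
illustration of the idea under my own assumptions, not Prime95's code, and
`lucas_lehmer_residue` is a hypothetical name. The stored value is the true
L-L value multiplied by 2^k mod 2^p - 1; squaring doubles k, so the constant 2
must be shifted by the updated k before it is subtracted.

```python
def lucas_lehmer_residue(p, shift=0):
    """Final L-L residue for 2^p - 1 (p an odd prime), run with a bit shift.

    Returns the unshifted residue, so runs with different shifts can be
    compared directly. A residue of 0 means 2^p - 1 is prime.
    """
    m = (1 << p) - 1
    k = shift % p               # 2^p = 1 mod m, so shifts wrap modulo p
    x = (4 << k) % m            # shifted starting value s0 = 4
    for _ in range(p - 2):
        k = (2 * k) % p         # squaring doubles the shift
        x = (x * x - (2 << k)) % m
    # Undo the shift: multiply by 2^(p - k), which is 2^(-k) mod m.
    return (x * pow(2, p - k, m)) % m
```

Two runs such as `lucas_lehmer_residue(11, shift=0)` and
`lucas_lehmer_residue(11, shift=5)` work on different intermediate bit
patterns but must end with the same residue, which is what makes the
comparison a meaningful double check.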

Regards

Guillermo




Re: Mersenne: Two L-L tests at once?

2002-02-28 Thread Guillermo Ballester Valor


 At 11:03 PM 2/28/2002 +0100, Guillermo Ballester Valor wrote:
 The memory bottleneck was the first thing I thought of, and I was close to
 discarding the idea when I realized that the trig data would be the same, and
 the required memory access would be less than double the single stream
 scheme.

 Well, that would be true if SSE2 had a multiply vector by scalar
 instruction. That is, to multiply two values by the same trig value, you
 must either load two copies of the trig value or add instructions to copy the
 value into both halves of the SSE2 register.


Yes, I was thinking of copying the trig value from one half to the other,
although I don't know what the cost would be.