Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread ki7mt
Hi Bill,

Has the patch been checked in, or will it be available only via the patch?

73's
Greg, KI7MT

On 2/4/2015 10:01 AM, Bill Somerville wrote:
 Hi All,

 it looks like the latest jt9 using OpenMP and multi-threaded FFTs along
 with Joe's recent re-factorings for performance seem to be approaching
 stability. If anyone wants to try them out on air with WSJT-X, the
 attached patch will allow WSJT-X to be built with them enabled.

 Note that the patch enables a fairly lengthy FFT plan optimization, so
 the first decode cycle may take a few minutes to complete. Do not kill
 the program, as the accumulated FFTW wisdom is written out at the end of a
 session. Once the FFTW wisdom is saved there will be no further delays.

 Testing here results in decodes that are so fast that it hardly seems
 worth looking for any more performance improvements until we start
 getting hundreds of concurrent QSOs per band. Impressive stuff!

 Running on a quad-core Core i7 laptop here.

 73
 Bill
 G4WJS.


 --
 Dive into the World of Parallel Programming. The Go Parallel Website,
 sponsored by Intel and developed in partnership with Slashdot Media, is your
 hub for all things parallel software development, from weekly thought
 leadership blogs to news, videos, case studies, tutorials and more. Take a
 look and join the conversation now. http://goparallel.sourceforge.net/



 ___
 wsjt-devel mailing list
 wsjt-devel@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/wsjt-devel




Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Bill Somerville
On 04/02/2015 17:09, ki...@yahoo.com wrote:
 Hi Bill,
Hi Greg,

 Has the patch been checked in, or will it be available only via the patch?
As it will not work on Mac at this stage I cannot check it in. For now I
think the developers who build it themselves should try it out for a while.
Given that multi-threading is very hard to test empirically, there are
bound to be a few outstanding problems to solve anyway.

 73's
 Greg, KI7MT
73
Bill
G4WJS.



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread ki7mt
Hi Bill,

Ok, I'll just wait then, as I can't process / act upon more than 10 or 
12 decodes in 8 seconds anyway.

73's
Greg, KI7MT



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Bill Somerville
On 04/02/2015 19:49, Joe Taylor wrote:

Hi Joe,
 On 2/4/2015 2:26 PM, Michael Black wrote:
 Not sure we want more than 1 thread
 As I demonstrated and wrote here several hours ago:

 When using OpenMP to run JT9 and JT65 decoders in parallel, we gain
 almost nothing by using multi-threading for the FFTW plans.

 I think this will remain true.  I recommend using -w 2 -m 1 to set up
 the FFTW plans, and using two threads (and only two) for the parallel
 sections initiated in decoder.f90 on multi-core machines.
Before completely discarding MT FFTW3 I would like to try something. Can
you give a brief rough summary of all the FFT sizes used by jt9?

If there are many smaller FFTs being run then I think their plans should
be limited to 1 thread, with 2 or more threads unleashed only for the big FFTs.
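
A minimal sketch of that idea in C, assuming a single size cutoff. The
function name, the cutoff value and the max_threads parameter are all
hypothetical; nothing here is taken from the jt9 sources, and the real
FFT sizes would need to be measured before picking a threshold.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff: transforms shorter than this stay single-threaded.
   The value is an assumption for illustration, not a measured figure. */
#define BIG_FFT_MIN_POINTS ((size_t)1 << 16)

/* Return the number of threads to plan a transform of n_points with.
   Small transforms get 1 thread; big ones get up to max_threads. */
int fft_plan_threads(size_t n_points, int max_threads)
{
    assert(max_threads >= 1);
    if (n_points < BIG_FFT_MIN_POINTS) {
        return 1;            /* threading overhead likely outweighs the gain */
    }
    return max_threads;      /* only the big FFTs are allowed several threads */
}
```

With FFTW the returned value would typically be handed to
fftw_plan_with_nthreads() just before the plan for that size is created,
so each plan carries its own thread count.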

   -- Joe
73
Bill
G4WJS.



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Michael Black
Can't this be decided by objective testing?
Can anybody show any advantage to  -m 1 with r4940?

And we can always revisit this if it ever proves to be worthwhile.  

On both my machines -m 1 is the best time by about 20% now.
I'm using this now -- the argument you pass is the # of threads.  Easily
adaptable to Unix.
Mike W9MDB

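#include <stdio.h>
#include <stdlib.h>

/* Repeatedly time jt9_omp on a sample file and print a running average.
   The optional first argument is the number of FFTW threads (-m). */
int main(int argc, char *argv[])
{
    char cmd[4096];
    char buf[4096];
    double total = 0;
    int n = 0;
    int nthreads = 1;

    if (argc > 1) {
        nthreads = atoi(argv[1]);
    }
    printf("Testing %d threads\n", nthreads);
    /* Keep only the number after the colon on the "Elapsed" line. */
    snprintf(cmd, sizeof(cmd),
             "TimeMem-1.0.exe jt9_omp -p 1 -d 3 -w 2 -m %d 130610_2343.wav"
             " | grep Elapsed | cut -f2 -d: > doit.txt", nthreads);
    while (1) {
        system(cmd);
        FILE *fp = fopen("doit.txt", "r");
        if (fp == NULL || fgets(buf, sizeof(buf), fp) == NULL) {
            perror("doit.txt");
            return 1;
        }
        fclose(fp);
        double sec = atof(buf);
        ++n;
        total += sec;
        double avg = total / n;
        /* Flag runs that take more than 1.5 times the running average. */
        if (sec > avg * 1.5) {
            printf("\nlong run %.2f avg=%.2f\n", sec, avg);
        }
        printf("%d sec=%.2f avg=%.2f\r", n, sec, avg);
        fflush(stdout);
    }
}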



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Joe Taylor
On 2/4/2015 2:26 PM, Michael Black wrote:
 Not sure we want more than 1 thread

As I demonstrated and wrote here several hours ago:

When using OpenMP to run JT9 and JT65 decoders in parallel, we gain
almost nothing by using multi-threading for the FFTW plans.

I think this will remain true.  I recommend using -w 2 -m 1 to set up 
the FFTW plans, and using two threads (and only two) for the parallel 
sections initiated in decoder.f90 on multi-core machines.

-- Joe



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Bill Somerville
On 04/02/2015 19:26, Michael Black wrote:
Hi Mike,
 Not sure we want more than 1 thread; my testing shows this on my 8-core box
 since this patch would give me 6 threads.
Agreed, the patch uses a trivially crude algorithm for the FFT thread 
count which will use way too many threads on a processor with more than 
4 CPUs. In your case that will be 14 threads for FFTs :(
 I think the old improvement 1 thread showed was overtaken by multi-threading 
 the top level.
You need to ensure that you run at least one decode in each mode before 
running any timing tests. Changing the number of FFT threads requires 
new FFT wisdom to be calculated.

You could try:

   , "-m", QString::number (qMin (qMax (QThread::idealThreadCount () - 2, 1), 2)) //FFTW threads

for line 379 of mainwindow.cpp for a more controlled thread
utilization. That sort of approach will also give sane results for those
who run multiple instances of WSJT-X with multiple RX SDRs.

We need to fine tune the FFT plans with personalized thread counts and 
perhaps have some user setting to set a maximum number of compute 
intensive threads.
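
As a rough illustration of what that expression does, here is the same
clamp written in plain C, together with a possible user-configurable
cap. Both function names and the user_max parameter are hypothetical;
this is a sketch of the idea, not code from WSJT-X.

```c
#include <assert.h>

/* Same arithmetic as qMin (qMax (ideal - 2, 1), 2): leave two cores
   free, but always use at least 1 and at most 2 FFT threads. */
int fft_thread_count(int ideal_threads)
{
    int n = ideal_threads - 2;
    if (n < 1) n = 1;
    if (n > 2) n = 2;
    return n;
}

/* Hypothetical user setting: never exceed user_max compute threads. */
int capped_thread_count(int ideal_threads, int user_max)
{
    assert(user_max >= 1);
    int n = fft_thread_count(ideal_threads);
    return n < user_max ? n : user_max;
}
```

With this clamp a dual-core machine gets 1 FFT thread, and both a
quad-core and an 8-core machine get 2, so extra cores no longer
translate into extra FFT threads.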

 Thread   1     2     3     4     5     6
 4930     1.10  1.27  1.24  1.23  1.29  1.31
          1.10  1.21  1.25  1.25  1.27  1.31
          1.09  1.25  1.23  1.22  1.29  1.32
          1.09  1.22  1.24  1.22  1.27  1.29
          1.11  1.23  1.22  1.23  1.27  1.33
          1.12  1.23  1.25  1.27  1.29  1.31
          1.08  1.22  1.24  1.23  1.26  1.30
          1.10  1.25  1.23  1.25  1.31  1.30
          1.14  1.23  1.24  1.26  1.27  1.32
          1.13  1.22  1.22  1.23  1.28  1.29
          1.11  1.23  1.25  1.25  1.27  1.32
          1.12  1.23  1.23  1.24  1.27  1.29
          1.12  1.25  1.24  1.23  1.27  1.29
          1.08  1.24  1.25  1.24  1.28  1.34
          1.08  1.23  1.23  1.21  1.30  1.29
          1.08  1.27  1.24  1.25  1.28  1.27
          1.15  1.25  1.25  1.29  1.28  1.30
          1.13  1.26  1.22  1.27  1.27  1.30
          1.10  1.25  1.23  1.24  1.27  1.30
 Avg      1.09  1.25  1.24  1.24  1.28  1.30

73
Bill
G4WJS.


Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Michael Black
Not sure we want more than 1 thread; my testing shows this on my 8-core box
since this patch would give me 6 threads.
I think the old improvement 1 thread showed was overtaken by multi-threading 
the top level.

Thread   1     2     3     4     5     6
4930     1.10  1.27  1.24  1.23  1.29  1.31
         1.10  1.21  1.25  1.25  1.27  1.31
         1.09  1.25  1.23  1.22  1.29  1.32
         1.09  1.22  1.24  1.22  1.27  1.29
         1.11  1.23  1.22  1.23  1.27  1.33
         1.12  1.23  1.25  1.27  1.29  1.31
         1.08  1.22  1.24  1.23  1.26  1.30
         1.10  1.25  1.23  1.25  1.31  1.30
         1.14  1.23  1.24  1.26  1.27  1.32
         1.13  1.22  1.22  1.23  1.28  1.29
         1.11  1.23  1.25  1.25  1.27  1.32
         1.12  1.23  1.23  1.24  1.27  1.29
         1.12  1.25  1.24  1.23  1.27  1.29
         1.08  1.24  1.25  1.24  1.28  1.34
         1.08  1.23  1.23  1.21  1.30  1.29
         1.08  1.27  1.24  1.25  1.28  1.27
         1.15  1.25  1.25  1.29  1.28  1.30
         1.13  1.26  1.22  1.27  1.27  1.30
         1.10  1.25  1.23  1.24  1.27  1.30
Avg      1.09  1.25  1.24  1.24  1.28  1.30
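
To put the Avg row in relative terms, the small helper below computes
how much slower a given thread count is than the single-thread case.
The helper name is hypothetical and the figures are assumed to be
elapsed times in the same unit; only the Avg values themselves come
from the table above.

```c
#include <assert.h>

/* Avg row from the table above, indexed by (thread count - 1). */
static const double avg_by_threads[6] = {1.09, 1.25, 1.24, 1.24, 1.28, 1.30};

/* Percentage by which time t exceeds the baseline time base. */
double slowdown_percent(double base, double t)
{
    assert(base > 0.0);
    return (t / base - 1.0) * 100.0;
}

/* Slowdown of the given thread count (1..6) relative to 1 thread. */
double slowdown_for_threads(int threads)
{
    assert(threads >= 1 && threads <= 6);
    return slowdown_percent(avg_by_threads[0], avg_by_threads[threads - 1]);
}
```

On these averages every multi-threaded setting comes out slower than
1 thread, by roughly 14% (3 or 4 threads) up to about 19% (6 threads).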

