[wsjt-devel] WSJT-X on Fedora 64 Rawhide

2015-02-04 Thread Chuck Forsberg WA7KGX
Running on Fedora Rawhide Linux 3.19 I downloaded wsprx and wsjtx.

svn co svn://svn.code.sf.net/p/wsjt/wsjt/branches/wsjtx
svn co svn://svn.code.sf.net/p/wsjt/wsjt/branches/wsprx

Wsjtx compiled, and with a year-old kvasd I made the odd contact.
I can't get it to write a log file, however.

The cmake process in wsprx generates a spurious reference to
a Windows DLL, which is obviously broken on Linux.

-- 
  Chuck Forsberg WA7KGX   c...@omen.com   www.omen.com
Developer of Industrial ZMODEM(Tm) for Embedded Applications
   Omen Technology Inc  "The High Reliability Software"
10255 NW Old Cornelius Pass Portland OR 97231   503-614-0430


--
Dive into the World of Parallel Programming. The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net/
___
wsjt-devel mailing list
wsjt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wsjt-devel


[wsjt-devel] WSJT-X - Wide Graph: Spectral Display Enhancement Request

2015-02-04 Thread Paul DU2/WA8UGN
Hello Developers:

If not already on the "To Do" list, would you consider the following 
suggested "improvement" for the spectral display in Wide Graph?

The waterfall is excellent - I only hope that this suggestion doesn't have 
to impair the waterfall display in any way just to enhance the spectral 
display.

The version of WSJT-X that I am currently using (v1.5.0-devel r4848) allows 
for three selections of data for the spectral 
display:  "Current," "Cumulative," and "Linear Avg." Of the three, "Linear 
Avg" is the only selection to have the baseline trace clamped at a specific 
level, where it remains regardless of input signal levels, 60-second 
resets, and the like. It's a nice solid line across the spectrum with 
signal representations being the only variations.

The other two selections are not clamped, but are free to venture up and 
down the vertical graduals. The trace's movement can be caused by most 
anything - noise, signal, setting of volume-related controls, etc. Most 
disconcerting is the downward movement of the trace that occurs when a very 
strong signal is present - with the trace sometimes leaving the display at 
the bottom of the graph. This also causes the waterfall to "go dark" for 
those portions of the spectrum close to the very strong signal. Having two 
such signals at either end of the spectrum often results in only those two 
signals appearing on the waterfall with the rest of the graphic display 
blanked out (other weak to medium-strong signals are blanked out).

MY REQUEST: For the "Current" and "Cumulative" selections, is it possible 
to have the spectrum display's baseline trace appear clamped to a specific 
level on the display? Having the baseline appear clamped for these two 
selections, similar to the "Linear Avg" selection, would produce a cleaner 
representation of the spectrum and could eliminate the blanking of other 
signals on the waterfall when accompanied by a very strong signal.

I ask this so as to obtain the best information from the Wide Graph display 
that I can. I know that the bandpass of the filtered audio input doesn't 
change with the introduction of a very strong signal - it's a constant, 
more or less. Plus, I already know the effects that a very strong signal 
will have on weaker signals within the bandpass - my ears let me know.

Thanks for your consideration (and, hopefully, implementation).

Best 73 de Paul DU2/WA8UGN




Re: [wsjt-devel] r4932

2015-02-04 Thread Joe Taylor
Hi all,

On 2/4/2015 4:49 PM, Michael Black wrote:
> So does this also mean if you don't check the Flatten box that your decodes
> would be slower since it does the flatten later if the GUi doesn't do it?

No.  And I realized after hitting "Send" that I was wrong to say the 
decodes would be different with nflatten=0.  "Flatten" affects only the 
displayed spectrum and waterfall.  The decoders take care of their own 
flattening as required.

> What does this if check look for in jt9.f90
> if(nhsym.ge.1 .and. nhsym.ne.nhsym0) then

I have no idea what you're asking about here.

That if() statement just checks (in jt9 command-line mode) to see 
whether enough data have been read from disk that it's time to call symspec.

-- Joe
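
A rough sketch of the kind of check described above, written in C++ rather than Fortran: data arrive in blocks, and the spectrum routine is called only when the half-symbol count has advanced since the last call. The function name, the block size, the step size `kstep`, and the way `nhsym` is derived from the sample count are all assumptions made for illustration; none of them are taken from jt9.f90.

```cpp
#include <cstddef>

// Hypothetical sketch: count how many times a symspec-like routine would be
// called while reading totalSamples in blocks of blockSize samples.
// kstep (samples per half symbol) and nhsym = k / kstep are assumptions.
int countSpectrumCalls(std::size_t totalSamples, std::size_t blockSize,
                       std::size_t kstep)
{
    if (blockSize == 0 || kstep == 0) {
        return 0;                           // nothing sensible to do
    }
    int calls = 0;
    std::size_t nhsym0 = 0;                 // half-symbol count at the last call
    for (std::size_t k = blockSize; k <= totalSamples; k += blockSize) {
        std::size_t nhsym = k / kstep;      // half symbols available so far
        // Same shape as: if(nhsym.ge.1 .and. nhsym.ne.nhsym0) then
        if (nhsym >= 1 && nhsym != nhsym0) {
            ++calls;                        // the real code would call symspec here
            nhsym0 = nhsym;
        }
    }
    return calls;
}
```

With blocks smaller than `kstep`, several blocks pass between calls; the `nhsym != nhsym0` test is what prevents the same half symbol from being processed twice.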



Re: [wsjt-devel] r4932

2015-02-04 Thread Bill Somerville
On 04/02/2015 21:49, Michael Black wrote:
Hi Mike,
> So does this also mean if you don't check the Flatten box that your decodes
> would be slower since it does the flatten later if the GUi doesn't do it?
> Or is always being done anyways...just not for display unless checked?
That's above my pay grade at the moment :(
> What does this if check look for in jt9.f90
> if(nhsym.ge.1 .and. nhsym.ne.nhsym0) then
That code is calling symspec for each symbol-sized chunk (or is that 
half-symbol-sized), which is similar to what happens in the GUI. The 
audio input thread sends the audio received in chunks to accumulate the 
symbol spectra and to drive the waterfall and RX meter.

Note that the GUI path to the decoder is back up at the call to jt9a, 
this code is command line jt9 only.
>
>
> BTW... running this now and it seems like decodes are screaming by
> comparatively... good job guys...
It's going to be painful going back to crunching CAT issues after 
this brief period of big improvements :(
>
> Mike W9MDB
73
Bill
G4WJS.
>
> -Original Message-
> From: Joe Taylor [mailto:j...@princeton.edu]
> Sent: Wednesday, February 04, 2015 3:21 PM
> To: WSJT software development
> Subject: Re: [wsjt-devel] r4932
>
> Hi Mike,
>
> The change was to replace "slope", a currently undefined, vestigial remnant
> of some old code, to "nflatten" -- the correct argument to pass
> to symspec.   The variable "slope" was undefined in jt9.f90.  Inside
> symspec, its value supposedly controlled whether the spectrum being computed
> for the waterfall would be flattened, or not.  Since the variable was
> undefined, it might sometimes be zero, sometimes nonzero.
> Not a good situation if we're trying to make comparative timing tests.
>
> Perhaps you will like it better to set nflatten=0 in jt9.f90.  That will
> make jt9 or jt9_omp run faster from the command line, especially since
> "flat3" is kinda slow.  But if you normally check the "Flatten" box in the
> GUI, your decoding results may not be exactly the same.
>
> In normal operation the timing difference is moot, because the extra work
> takes place during the Rx minute rather than at its end.  It's for the
> CPU-bound stuff at the end of the Rx minute that I have been speeding things
> up.
>
> Perhaps I should look at "flat3", to see why it's so slow -- even though its
> slow behavior has essentially no effect noticeable to an operator.
>
>   -- Joe
>
> On 2/4/2015 3:45 PM, Michael Black wrote:
>> The change on jt9.for replacing nflatten with slope really made
>> processing times much worse.
>>
>> Mainwindow.cpp is using nflatten.so I don't quite understand the log
>> comment about making them consistent.
>>
>>
>>
>> On my testing jt9_omp went from 1.12 to 1.81 seconds.
>>
>>
>>
>> Mike W9MDB

Re: [wsjt-devel] r4932

2015-02-04 Thread Michael Black
So does this also mean if you don't check the Flatten box that your decodes
would be slower since it does the flatten later if the GUI doesn't do it?
Or is it always being done anyway... just not for display unless checked?
What does this if check look for in jt9.f90?
if(nhsym.ge.1 .and. nhsym.ne.nhsym0) then


BTW... running this now and it seems like decodes are screaming by
comparatively... good job guys...

Mike W9MDB

-Original Message-
From: Joe Taylor [mailto:j...@princeton.edu] 
Sent: Wednesday, February 04, 2015 3:21 PM
To: WSJT software development
Subject: Re: [wsjt-devel] r4932

Hi Mike,

The change was to replace "slope", a currently undefined, vestigial remnant
of some old code, to "nflatten" -- the correct argument to pass 
to symspec.   The variable "slope" was undefined in jt9.f90.  Inside 
symspec, its value supposedly controlled whether the spectrum being computed
for the waterfall would be flattened, or not.  Since the variable was
undefined, it might sometimes be zero, sometimes nonzero. 
Not a good situation if we're trying to make comparative timing tests.

Perhaps you will like it better to set nflatten=0 in jt9.f90.  That will
make jt9 or jt9_omp run faster from the command line, especially since
"flat3" is kinda slow.  But if you normally check the "Flatten" box in the
GUI, your decoding results may not be exactly the same.

In normal operation the timing difference is moot, because the extra work
takes place during the Rx minute rather than at its end.  It's for the
CPU-bound stuff at the end of the Rx minute that I have been speeding things
up.

Perhaps I should look at "flat3", to see why it's so slow -- even though its
slow behavior has essentially no effect noticeable to an operator.

-- Joe

On 2/4/2015 3:45 PM, Michael Black wrote:
> The change on jt9.for replacing nflatten with slope really made 
> processing times much worse.
>
> Mainwindow.cpp is using nflatten.so I don't quite understand the log 
> comment about making them consistent.
>
>
>
> On my testing jt9_omp went from 1.12 to 1.81 seconds.
>
>
>
> Mike W9MDB


Re: [wsjt-devel] r4932

2015-02-04 Thread Michael Black
Changed to nflatten=0 and times are back to what they were.

Now comparing 4930 with 4933 we've gained a little ground...
Mike W9MDB


          4930            4933 nflatten=0
Thread    1       2       1       2
4930      1.10    1.27    1.03    1.13
          1.10    1.21    1.04    1.12
          1.09    1.25    1.02    1.14
          1.09    1.22    1.06    1.12
          1.11    1.23    1.03    1.14
          1.12    1.23    1.04    1.11
          1.08    1.22    1.05    1.11
          1.10    1.25    1.05    1.14
          1.14    1.23    1.05    1.12
          1.13    1.22    1.05    1.13
          1.11    1.23    1.03    1.12
          1.12    1.23    1.03    1.13
          1.12    1.25    1.03    1.12
          1.08    1.24    1.07    1.12
          1.08    1.23    1.04    1.14
          1.08    1.27    1.03    1.11
          1.15    1.25    1.03    1.11
          1.13    1.26    1.05    1.12
          1.10    1.25    1.05    1.12
Avg       1.09    1.25    1.06    1.09
Diff      2.75%   12.80%
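
The two Diff figures are consistent with a relative reduction, (before - after) / before, applied to the Avg row as printed: 1.09 vs 1.06 for one thread and 1.25 vs 1.09 for two. A minimal C++ sketch of that arithmetic follows; the function name is made up for illustration and is not part of any WSJT-X tooling.

```cpp
// Percentage by which a run time dropped, relative to the earlier time.
// Returns 0 for a non-positive baseline to avoid dividing by zero.
double percentReduction(double before, double after)
{
    if (before <= 0.0) {
        return 0.0;
    }
    return 100.0 * (before - after) / before;
}
```

Calling `percentReduction(1.09, 1.06)` and `percentReduction(1.25, 1.09)` gives the one-thread and two-thread figures respectively.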




Re: [wsjt-devel] r4932

2015-02-04 Thread Joe Taylor
Hi Mike,

The change was to replace "slope", a currently undefined, vestigial 
remnant of some old code, with "nflatten" -- the correct argument to pass 
to symspec.   The variable "slope" was undefined in jt9.f90.  Inside 
symspec, its value supposedly controlled whether the spectrum being 
computed for the waterfall would be flattened, or not.  Since the 
variable was undefined, it might sometimes be zero, sometimes nonzero. 
Not a good situation if we're trying to make comparative timing tests.

Perhaps you will like it better to set nflatten=0 in jt9.f90.  That will 
make jt9 or jt9_omp run faster from the command line, especially since 
"flat3" is kinda slow.  But if you normally check the "Flatten" box in 
the GUI, your decoding results may not be exactly the same.

In normal operation the timing difference is moot, because the extra 
work takes place during the Rx minute rather than at its end.  It's for 
the CPU-bound stuff at the end of the Rx minute that I have been 
speeding things up.

Perhaps I should look at "flat3", to see why it's so slow -- even though 
its slow behavior has essentially no effect noticeable to an operator.

-- Joe

On 2/4/2015 3:45 PM, Michael Black wrote:
> The change on jt9.for replacing nflatten with slope really made processing
> times much worse.
>
> Mainwindow.cpp is using nflatten.so I don't quite understand the log comment
> about making them consistent.
>
>
>
> On my testing jt9_omp went from 1.12 to 1.81 seconds.
>
>
>
> Mike W9MDB


Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Bill Somerville
On 04/02/2015 20:30, Joe Taylor wrote:
> Hi Bill,
>
>> Before completely discarding MT FFTW3 I would like to try something. Can
>> you give a brief rough summary of all the FFT sizes used by jt9?
> The file wisdom1.bat in .../wsjtx/lib shows the following FFT plans
> being used:
>
> rif672000 cif77175 cib77175 rif16384 rif884736 cib2048 rif8192 rif512
> rib512 cib512
>
> Several days ago the big FFT in downsam9 was changed from length 884736
> to 604800.  I changed the two big ones to "out of place" transforms.  So
> I think the new lineup is
>
> rof672000 cif77175 cib77175 rif16384 rof604800 cib2048 rif8192 rif512
> rib512 cib512
Thanks for the quick response on that.
>
>> If there are many smaller FFT being run then I think their plans should
>> be limited to 1 thread and only unleash 2 or more threads for the big FFTS.
> I think the only ones for which MT will help are rof672000 and
> rof604800.  For these, three threads (on a 4-core machine) helps
> significantly:
OK, I have amended filbig and downsam9 to use the '-m #' argument for 
those two big FFTs; all the rest use 1 thread.
>
> (JTSDK-QT) C:\JTSDK\src\wsjtx\lib)timefft 1 4 or672000
>
> Problem   Threads  Plan   Time      Gflops  RMS    iters
>
> or672000  1        0.005  0.004878  13.34   0.002  100
> or672000  2        1.427  0.004469  14.55   0.002  100
> or672000  3        1.828  0.003406  19.10   0.002  100
> or672000  4        2.037  0.003459  18.81   0.002  100
>
> (JTSDK-QT) C:\JTSDK\src\wsjtx\lib)timefft 1 4 or604800
>
> Problem   Threads  Plan   Time      Gflops  RMS    iters
>
> or604800  1        0.858  0.005361  10.83   0.002   94
> or604800  2        1.901  0.003405  17.06   0.002  100
> or604800  3        2.509  0.002658  21.85   0.002  100
> or604800  4        2.544  0.002618  22.19   0.002  100
>
> However, these long FFTs make up only about 10% of the total running
> time.  Speeding them up by a factor of 2 will shave about 5% off the
> running time, at best.  And probably not that much, when we're already
> running the two decoders in parallel.
>
> It's not that MT FFTs won't help at all; they just won't help much.
Agreed, but we should remember that at least one of the decoding threads 
is stalled when the FFT executes so, apart from the context switching 
overhead which should be relatively small, there are several CPU threads 
waiting for work on all but the lowest end dual core non-hyperthreaded 
processors.

I am using jt9_omp launch parameters as follows at the moment:

   , "-m", QString::number (qMin (qMax (QThread::idealThreadCount () - 1, 1), 3)) //FFTW threads

which will use 3-thread big FFTs if the processor has at least 4 CPU 
threads, 2 threads on a 3-thread processor, and only 1 thread on 
processors with lesser capability.
>
>   -- Joe
73
Bill
G4WJS.
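
The clamp in the launch parameters above can be checked without Qt by substituting std::min and std::max for qMin and qMax. This is only a sketch: `fftThreads` is a name invented here, its first argument stands in for the value returned by QThread::idealThreadCount (), and the `reserved` and `cap` parameters are an assumed generalization so that the "- 2 ... 2" variant suggested elsewhere in this thread can be tried with the same function.

```cpp
#include <algorithm>

// Sketch of the FFT thread-count clamp. Defaults mirror
// qMin (qMax (idealThreadCount - 1, 1), 3).
// reserved: CPU threads left for other work (assumed parameter).
// cap:      upper limit on FFT threads (assumed parameter).
int fftThreads(int idealThreadCount, int reserved = 1, int cap = 3)
{
    return std::min(std::max(idealThreadCount - reserved, 1), cap);
}
```

For example, `fftThreads(4)` and `fftThreads(8)` both yield 3, `fftThreads(2)` yields 1, and `fftThreads(8, 2, 2)` yields 2, so the result never exceeds the cap however many CPU threads the machine reports.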



[wsjt-devel] r4932

2015-02-04 Thread Michael Black
The change on jt9.for replacing nflatten with slope really made processing
times much worse.

Mainwindow.cpp is using nflatten... so I don't quite understand the log comment
about making them consistent.

In my testing jt9_omp went from 1.12 to 1.81 seconds.

Mike W9MDB



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Joe Taylor
Hi Bill,

> Before completely discarding MT FFTW3 I would like to try something. Can
> you give a brief rough summary of all the FFT sizes used by jt9?

The file wisdom1.bat in .../wsjtx/lib shows the following FFT plans 
being used:

rif672000 cif77175 cib77175 rif16384 rif884736 cib2048 rif8192 rif512 
rib512 cib512

Several days ago the big FFT in downsam9 was changed from length 884736 
to 604800.  I changed the two big ones to "out of place" transforms.  So 
I think the new lineup is

rof672000 cif77175 cib77175 rif16384 rof604800 cib2048 rif8192 rif512 
rib512 cib512

> If there are many smaller FFT being run then I think their plans should
> be limited to 1 thread and only unleash 2 or more threads for the big FFTS.

I think the only ones for which MT will help are rof672000 and 
rof604800.  For these, three threads (on a 4-core machine) helps 
significantly:

(JTSDK-QT) C:\JTSDK\src\wsjtx\lib)timefft 1 4 or672000

Problem   Threads  Plan   Time      Gflops  RMS    iters

or672000  1        0.005  0.004878  13.34   0.002  100
or672000  2        1.427  0.004469  14.55   0.002  100
or672000  3        1.828  0.003406  19.10   0.002  100
or672000  4        2.037  0.003459  18.81   0.002  100

(JTSDK-QT) C:\JTSDK\src\wsjtx\lib)timefft 1 4 or604800

Problem   Threads  Plan   Time      Gflops  RMS    iters

or604800  1        0.858  0.005361  10.83   0.002   94
or604800  2        1.901  0.003405  17.06   0.002  100
or604800  3        2.509  0.002658  21.85   0.002  100
or604800  4        2.544  0.002618  22.19   0.002  100

However, these long FFTs make up only about 10% of the total running 
time.  Speeding them up by a factor of 2 will shave about 5% off the 
running time, at best.  And probably not that much, when we're already 
running the two decoders in parallel.

It's not that MT FFTs won't help at all; they just won't help much.

-- Joe
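
The "about 5%" estimate above is the usual Amdahl's-law arithmetic: if a fraction p of the running time is made s times faster, the total time saved is p * (1 - 1/s). A small C++ sketch follows; the function name is invented for illustration, and the 10% share is the rough figure quoted in the message, not a measurement made here.

```cpp
// Fraction of total running time saved when a portion p of the work
// (0 <= p <= 1) is made s times faster (s > 0). Amdahl's law.
double fractionSaved(double p, double s)
{
    return p * (1.0 - 1.0 / s);
}
```

With p = 0.10 and s = 2 this gives 0.05, the 5% quoted. Feeding in the measured one-thread and three-thread times for or604800 as `fractionSaved(0.10, 0.005361 / 0.002658)` lands close to the same figure, and even an unlimited speedup of the FFTs could not save more than p itself.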



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Michael Black
Can't this be decided by objective testing?
Can anybody show any advantage to > "-m 1" with r4940?

And we can always revisit this if it ever proves to be worthwhile.  

On both my machines -m 1 is the best time by about 20% now.
I'm using this now -- the argument you pass is the # of threads.  Easily
adaptable to Unix.
Mike W9MDB

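#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char cmd[4096];
    char buf[4096];
    double total = 0;
    int n = 0;
    int nthreads = 1;

    /* Optional first argument: number of FFTW threads passed via -m. */
    if (argc > 1) {
        nthreads = atoi(argv[1]);
    }
    printf("Testing %d threads\n", nthreads);
    /* Time one jt9_omp run and keep only the elapsed-time field. */
    sprintf(cmd, "TimeMem-1.0.exe jt9_omp -p 1 -d 3 -w 2 -m %d "
                 "130610_2343.wav | grep Elapsed | cut -f2 -d: >doit.txt",
            nthreads);
    while (1) {   /* runs until interrupted */
        system(cmd);
        FILE *fp = fopen("doit.txt", "r");
        if (fp == NULL) {
            perror("doit.txt");
            return 1;
        }
        if (fgets(buf, sizeof(buf), fp) == NULL) {
            buf[0] = '\0';   /* no output: counted as 0.00 seconds */
        }
        fclose(fp);
        double sec = atof(buf);
        ++n;
        total += sec;
        double avg = total / n;
        /* Flag runs more than 50% above the running average. */
        if (sec > avg * 1.5) {
            printf("\nlong run %.2f avg=%.2f\n", sec, avg);
        }
        printf("%d sec=%.2f avg=%.2f\r", n, sec, avg);
        fflush(stdout);
    }
}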

-Original Message-
From: Bill Somerville [mailto:g4...@classdesign.com] 
Sent: Wednesday, February 04, 2015 1:55 PM
To: wsjt-devel@lists.sourceforge.net
Subject: Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in
WSJT-X

On 04/02/2015 19:49, Joe Taylor wrote:

Hi Joe,
> On 2/4/2015 2:26 PM, Michael Black wrote:
>> Not sure we want more than 1 thread
> As I demonstrated and wrote here several hours ago:
>
> "When using OpenMP to run JT9 and JT65 decoders in parallel, we gain 
> almost nothing by using multi-threading for the FFTW plans."
>
> I think this will remain true.  I recommend using "-w 2 -m 1" to set 
> up the FFTW plans, and using two threads (and only two) for the 
> parallel sections initiated in decoder.f90 on multi-core machines.
Before completely discarding MT FFTW3 I would like to try something. Can you
give a brief rough summary of all the FFT sizes used by jt9?

If there are many smaller FFT being run then I think their plans should be
limited to 1 thread and only unleash 2 or more threads for the big FFTS.
>
>   -- Joe
73
Bill
G4WJS.




Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Bill Somerville
On 04/02/2015 19:49, Joe Taylor wrote:

Hi Joe,
> On 2/4/2015 2:26 PM, Michael Black wrote:
>> Not sure we want more than 1 thread
> As I demonstrated and wrote here several hours ago:
>
> "When using OpenMP to run JT9 and JT65 decoders in parallel, we gain
> almost nothing by using multi-threading for the FFTW plans."
>
> I think this will remain true.  I recommend using "-w 2 -m 1" to set up
> the FFTW plans, and using two threads (and only two) for the parallel
> sections initiated in decoder.f90 on multi-core machines.
Before completely discarding MT FFTW3 I would like to try something. Can 
you give a brief rough summary of all the FFT sizes used by jt9?

If there are many smaller FFT being run then I think their plans should 
be limited to 1 thread and only unleash 2 or more threads for the big FFTS.
>
>   -- Joe
73
Bill
G4WJS.



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Joe Taylor
On 2/4/2015 2:26 PM, Michael Black wrote:
> Not sure we want more than 1 thread

As I demonstrated and wrote here several hours ago:

"When using OpenMP to run JT9 and JT65 decoders in parallel, we gain
almost nothing by using multi-threading for the FFTW plans."

I think this will remain true.  I recommend using "-w 2 -m 1" to set up 
the FFTW plans, and using two threads (and only two) for the parallel 
sections initiated in decoder.f90 on multi-core machines.

-- Joe



Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Bill Somerville
On 04/02/2015 19:26, Michael Black wrote:
Hi Mike,
> Not sure we want more than 1 thread... my testing shows this on my 8-core box 
> since this patch would give me 6 threads.
Agreed, the patch uses a trivially crude algorithm for the FFT thread 
count which will use way too many threads on a processor with more than 
4 CPUs. In your case that will be 14 threads for FFTs :(
> I think the old improvement >1 thread showed was overtaken by multi-threading 
> the top level.
You need to ensure that you run at least one decode in each mode before 
running any timing tests. Changing the number of FFT threads requires 
new FFT wisdom to be calculated.

You could try:

   , "-m", QString::number (qMin (qMax (QThread::idealThreadCount () - 2, 1), 2)) //FFTW threads

for line 379 of mainwindow.cpp for a more controlled thread 
utilization. That sort of approach will also give sane results for those 
who run multiple instances of WSJT-X with multiple RX SDRs.

We need to fine tune the FFT plans with "personalized" thread counts and 
perhaps have some user setting to set a maximum number of compute 
intensive threads.
>
> Thread    1       2       3       4       5       6
> 4930      1.10    1.27    1.24    1.23    1.29    1.31
>           1.10    1.21    1.25    1.25    1.27    1.31
>           1.09    1.25    1.23    1.22    1.29    1.32
>           1.09    1.22    1.24    1.22    1.27    1.29
>           1.11    1.23    1.22    1.23    1.27    1.33
>           1.12    1.23    1.25    1.27    1.29    1.31
>           1.08    1.22    1.24    1.23    1.26    1.30
>           1.10    1.25    1.23    1.25    1.31    1.30
>           1.14    1.23    1.24    1.26    1.27    1.32
>           1.13    1.22    1.22    1.23    1.28    1.29
>           1.11    1.23    1.25    1.25    1.27    1.32
>           1.12    1.23    1.23    1.24    1.27    1.29
>           1.12    1.25    1.24    1.23    1.27    1.29
>           1.08    1.24    1.25    1.24    1.28    1.34
>           1.08    1.23    1.23    1.21    1.30    1.29
>           1.08    1.27    1.24    1.25    1.28    1.27
>           1.15    1.25    1.25    1.29    1.28    1.30
>           1.13    1.26    1.22    1.27    1.27    1.30
>           1.10    1.25    1.23    1.24    1.27    1.30
> Avg       1.09    1.25    1.24    1.24    1.28    1.30
>
73
Bill
G4WJS.
> -Original Message-
> From: Bill Somerville [mailto:g4...@classdesign.com]
> Sent: Wednesday, February 04, 2015 11:02 AM
> To: WSJT software development
> Subject: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X
>
> Hi All,
>
> it looks like the latest jt9 using OpenMP and multi-threaded FFTs, along with 
> Joe's recent re-factorings for performance, is approaching stability. 
> If anyone wants to try them out on air with WSJT-X, the attached patch will 
> allow WSJT-X to be built with them enabled.
>
> Note that the patch enables a fairly lengthy FFT plan optimization and the 
> first decode cycle may take a few minutes to complete, do not kill the 
> program as the accumulated FFT wisdom is written out at the end of a session. 
> Once the FFTW wisdom is saved there will be no further delays.
>
> Testing here results in decodes that are so fast that it hardly seems worth
> looking for any more performance improvements until we start getting 100s
> of concurrent QSOs per band. Impressive stuff!
>
> Running on a quad-core Core i7 laptop here.
>
> 73
> Bill
> G4WJS.
>
>


Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Michael Black
Not sure we want more than 1 thread. My testing shows this on my 8-core box,
where this patch would give me 6 threads.
I think the improvement that >1 thread used to show was overtaken by
multi-threading at the top level.

Thread  1     2     3     4     5     6
4930    1.10  1.27  1.24  1.23  1.29  1.31
        1.10  1.21  1.25  1.25  1.27  1.31
        1.09  1.25  1.23  1.22  1.29  1.32
        1.09  1.22  1.24  1.22  1.27  1.29
        1.11  1.23  1.22  1.23  1.27  1.33
        1.12  1.23  1.25  1.27  1.29  1.31
        1.08  1.22  1.24  1.23  1.26  1.30
        1.10  1.25  1.23  1.25  1.31  1.30
        1.14  1.23  1.24  1.26  1.27  1.32
        1.13  1.22  1.22  1.23  1.28  1.29
        1.11  1.23  1.25  1.25  1.27  1.32
        1.12  1.23  1.23  1.24  1.27  1.29
        1.12  1.25  1.24  1.23  1.27  1.29
        1.08  1.24  1.25  1.24  1.28  1.34
        1.08  1.23  1.23  1.21  1.30  1.29
        1.08  1.27  1.24  1.25  1.28  1.27
        1.15  1.25  1.25  1.29  1.28  1.30
        1.13  1.26  1.22  1.27  1.27  1.30
        1.10  1.25  1.23  1.24  1.27  1.30
Avg     1.09  1.25  1.24  1.24  1.28  1.30


-Original Message-
From: Bill Somerville [mailto:g4...@classdesign.com] 
Sent: Wednesday, February 04, 2015 11:02 AM
To: WSJT software development
Subject: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

Hi All,

it looks like the latest jt9 using OpenMP and multi-threaded FFTs along with 
Joe's recent re-factorings for performance seem to be approaching stability. If 
anyone wants to try them out on air with WSJT-X, the attached patch will allow 
WSJT-X to be built with them enabled.

Note that the patch enables a fairly lengthy FFT plan optimization and the 
first decode cycle may take a few minutes to complete, do not kill the program 
as the accumulated FFT wisdom is written out at the end of a session. Once the 
FFTW wisdom is saved there will be no further delays.

Testing here results in decodes that are so fast that it hardly seems worth
looking for any more performance improvements until we start getting 100s
of concurrent QSOs per band. Impressive stuff!

Running on a quad-core Core i7 laptop here.

73
Bill
G4WJS.




Re: [wsjt-devel] git-apply -v /src/wsjtx/wsjtx_omp.patch

2015-02-04 Thread Bill Somerville
On 04/02/2015 18:54, John N1ISA wrote:
> WSJTX Developers...
Hi John,
>
> I use the build process posted here a while ago. Recently, I had to add
> "--without-cxx-binding" to obtain an error-free build. It has worked well
> for me.
OK, the C++ binding of Hamlib is not used in WSJT-X so it need not be
enabled in the Hamlib build, although it does build OK for me on all
platforms.
>
> Question about applying Bill's wsjtx_omp.patch.
>
> I build out of my ~/HOME directory. My WSJTX source code is located at
> ~/src/wsjtx. If I move Bill's patch into the ~/src/wsjtx directory, and:
>
> cd ~/src/wsjtx/
>
> git-apply -v ~/src/wsjtx/wsjtx_omp.patch
OK but given most people will be using Subversion a simple:

$ patch -p1 
> Bill's patch will be applied to: mainwindow.cpp...
>
> and then just build as I normally would, and follow Bill's directions related
> to the first decode process of WSJTX.
>
> Will this work, and apply the patch correctly?
Yes but see above regarding using patch rather than git to apply the patch.
>
> 73, John, N1ISA
73
Bill
G4WJS.



[wsjt-devel] git-apply -v /src/wsjtx/wsjtx_omp.patch

2015-02-04 Thread John N1ISA
WSJTX Developers...

I use the build process posted here a while ago. Recently, I had to add
"--without-cxx-binding" to obtain an error-free build. It has worked well
for me.

Question about applying Bill's wsjtx_omp.patch.

I build out of my ~/HOME directory. My WSJTX source code is located at 
~/src/wsjtx. If I move Bill's patch into the ~/src/wsjtx directory, and:

cd ~/src/wsjtx/

git-apply -v ~/src/wsjtx/wsjtx_omp.patch

Bill's patch will be applied to: mainwindow.cpp...

and then just build as I normally would, and follow Bill's directions related 
to the first decode process of WSJTX.

Will this work, and apply the patch correctly?

73, John, N1ISA





Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread ki7mt
Hi Bill,

Ok, I'll just wait then, as I can't process / act upon more than 10 or 
12 decodes in 8 seconds anyway.

73's
Greg, KI7MT

On 2/4/2015 10:13 AM, Bill Somerville wrote:
> On 04/02/2015 17:09, ki...@yahoo.com wrote:
>> Hi Bill,
> Hi Greg,
>>
>> Has the patch been checked in, or will it be available only via the patch?
> As it will not work on Mac at this stage I cannot check it in. For now I
> think the developers who build themselves should try it out for a while.
> Given that multi-threading is very hard to empirically test, there are
> bound to be a few outstanding problems to solve anyway.
>>
>> 73's
>> Greg, KI7MT
> 73
> Bill
> G4WJS.
>>
>> On 2/4/2015 10:01 AM, Bill Somerville wrote:
>>> Hi All,
>>>
>>> it looks like the latest jt9 using OpenMP and multi-threaded FFTs along
>>> with Joe's recent re-factorings for performance seem to be approaching
>>> stability. If anyone wants to try them out on air with WSJT-X, the
>>> attached patch will allow WSJT-X to be built with them enabled.
>>>
>>> Note that the patch enables a fairly lengthy FFT plan optimization and
>>> the first decode cycle may take a few minutes to complete, do not kill
>>> the program as the accumulated FFT wisdom is written out at the end of a
>>> session. Once the FFTW wisdom is saved there will be no further delays.
>>>
>>> Testing here results in decodes that are so fast that it hardly seems
>>> worth looking for any more performance improvements until we start
>>> getting 100s of concurrent QSOs per band. Impressive stuff!
>>>
>>> Running on a quad-core Core i7 laptop here.
>>>
>>> 73
>>> Bill
>>> G4WJS.
>>>
>>>


Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Michael Black
I removed all the flush(6) calls except the one in decoder.f90.  There was an
unprotected one in jt9c.f90, which may explain the long runtimes I see
once in a great while on my Windows 10 system.  The last long runtime was 7
seconds using 2 threads, before I removed the flushes.
I am now running a loop test to see if any long run times are seen on both
my computers.
Mike W9MDB
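#include <stdio.h>
#include <stdlib.h>

int main(int argc,char *argv[])
{
    /* Time one decode and keep only the "Elapsed" value (seconds). */
    const char *cmd = "TimeMem-1.0.exe jt9_omp -p 1 -d 3 -w 2 -m 2 "
                      "130610_2343.wav | grep Elapsed | cut -f2 -d: >doit.txt";
    double total=0;
    int n=0;
    char buf[4096];
    while(1) {
        system(cmd);
        FILE *fp=fopen("doit.txt","r");
        if (fp == NULL) continue;   /* no result file this pass; try again */
        if (fgets(buf,sizeof(buf),fp) == NULL) buf[0] = '\0';
        fclose(fp);
        double sec = atof(buf);
        ++n;
        total+=sec;
        double avg = total/n;
        /* Report any run more than 50% slower than the running average. */
        if (sec > avg*1.5) {
            printf("long run %.2f avg=%.2f\n",sec,avg);
        }
        printf("%d\r",n);
        fflush(stdout);
    }
}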

Looking at how the output comes out of jt9_omp it would appear to me these
flushes are not necessary as it appears each line is being flushed anyways.

Not really any change in the timing
Mike W9MDB

                                    !flush
Thread  1       2       %Diff       1       2       %Diff
4930    1.1     1.28    -16.36%     1.11    1.21    -9.01%
        1.08    1.21    -12.04%     1.11    1.22    -9.91%
        1.08    1.2     -11.11%     1.09    1.22    -11.93%
        1.1     1.22    -10.91%     1.07    1.22    -14.02%
        1.08    1.23    -13.89%     1.07    1.23    -14.95%
        1.08    1.22    -12.96%     1.14    1.22    -7.02%
        1.09    1.22    -11.93%     1.08    1.23    -13.89%
        1.07    1.23    -14.95%     1.08    1.24    -14.81%
        1.09    1.23    -12.84%     1.09    1.25    -14.68%
        1.13    1.22    -7.96%      1.1     1.26    -14.55%
        1.09    1.22    -11.93%     1.09    1.26    -15.60%
        1.08    1.22    -12.96%     1.08    1.26    -16.67%
        1.08    1.25    -15.74%     1.09    1.21    -11.01%
        1.08    1.22    -12.96%     1.09    1.23    -12.84%
        1.11    1.24    -11.71%     1.06    1.26    -18.87%
        1.09    1.22    -11.93%     1.09    1.24    -13.76%
        1.11    1.24    -11.71%     1.07    1.23    -14.95%
        1.1     1.22    -10.91%     1.08    1.24    -14.81%
        1.08    1.2     -11.11%     1.07    1.22    -14.02%
        1.09    1.2     -10.09%     1.08    1.21    -12.04%
Avg     1.0905  1.2245  -12.30%     1.087   1.233   -13.47%





Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Michael Black
Doing the same 20-pass run on my Windows 10 HP Envy with i7-4702MQ @ 2.2 GHz
(compared to the dual X5450 4-core CPU at 3 GHz) -- you can see GHz doesn't
tell the whole story...
With 2 threads on Windows 10 I see a long run once in a great while.

Thread   1       2       %Diff
4930     0.79    0.82    -3.80%
         0.78    0.79    -1.28%
         0.78    0.79    -1.28%
         0.75    0.78    -4.00%
         0.78    0.78    0.00%
         0.77    0.8     -3.90%
         0.78    0.78    0.00%
         0.77    0.79    -2.60%
         0.78    0.79    -1.28%
         0.79    0.82    -3.80%
         0.8     0.78    2.50%
         0.76    0.77    -1.32%
         0.77    0.8     -3.90%
         0.78    0.83    -6.41%
         0.8     0.82    -2.50%
         0.78    0.79    -1.28%
         0.78    0.78    0.00%
         0.78    0.78    0.00%
         0.77    0.78    -1.30%
         0.77    0.78    -1.30%
Average  0.778   0.7925  -1.87%





Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Bill Somerville
On 04/02/2015 17:09, ki...@yahoo.com wrote:
> Hi Bill,
Hi Greg,
>
> Has the patch been checked in, or will it be available only via the patch?
As it will not work on Mac at this stage I cannot check it in. For now I 
think the developers who build themselves should try it out for a while. 
Given that multi-threading is very hard to empirically test, there are 
bound to be a few outstanding problems to solve anyway.
>
> 73's
> Greg, KI7MT
73
Bill
G4WJS.
>
> On 2/4/2015 10:01 AM, Bill Somerville wrote:
>> Hi All,
>>
>> it looks like the latest jt9 using OpenMP and multi-threaded FFTs along
>> with Joe's recent re-factorings for performance seem to be approaching
>> stability. If anyone wants to try them out on air with WSJT-X, the
>> attached patch will allow WSJT-X to be built with them enabled.
>>
>> Note that the patch enables a fairly lengthy FFT plan optimization and
>> the first decode cycle may take a few minutes to complete, do not kill
>> the program as the accumulated FFT wisdom is written out at the end of a
>> session. Once the FFTW wisdom is saved there will be no further delays.
>>
>> Testing here results in decodes that are so fast that it hardly seems
>> worth looking for any more performance improvements until we start
>> getting 100s of concurrent QSOs per band. Impressive stuff!
>>
>> Running on a quad-core Core i7 laptop here.
>>
>> 73
>> Bill
>> G4WJS.
>>
>>


Re: [wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread ki7mt
Hi Bill,

Has the patch been checked in, or will it be available only via the patch?

73's
Greg, KI7MT

On 2/4/2015 10:01 AM, Bill Somerville wrote:
> Hi All,
>
> it looks like the latest jt9 using OpenMP and multi-threaded FFTs along
> with Joe's recent re-factorings for performance seem to be approaching
> stability. If anyone wants to try them out on air with WSJT-X, the
> attached patch will allow WSJT-X to be built with them enabled.
>
> Note that the patch enables a fairly lengthy FFT plan optimization and
> the first decode cycle may take a few minutes to complete, do not kill
> the program as the accumulated FFT wisdom is written out at the end of a
> session. Once the FFTW wisdom is saved there will be no further delays.
>
> Testing here results in decodes that are so fast that it hardly seems
> worth looking for any more performance improvements until we start
> getting 100s of concurrent QSOs per band. Impressive stuff!
>
> Running on a quad-core Core i7 laptop here.
>
> 73
> Bill
> G4WJS.
>
>


[wsjt-devel] WSJT-X: Using the latest decoder improvements in WSJT-X

2015-02-04 Thread Bill Somerville

Hi All,

It looks like the latest jt9 changes, using OpenMP and multi-threaded FFTs
along with Joe's recent refactorings for performance, are approaching
stability. If anyone wants to try them out on air with WSJT-X, the
attached patch will allow WSJT-X to be built with them enabled.

Note that the patch enables a fairly lengthy FFT plan optimization, so
the first decode cycle may take a few minutes to complete. Do not kill
the program, as the accumulated FFT wisdom is written out at the end of a
session. Once the FFTW wisdom is saved there will be no further delays.

Testing here results in decodes that are so fast that it hardly seems
worth looking for any more performance improvements until we start
getting 100s of concurrent QSOs per band. Impressive stuff!


Running on a quad-core Core i7 laptop here.

73
Bill
G4WJS.
diff --git a/mainwindow.cpp b/mainwindow.cpp
index 26658bd..5c26815 100644
--- a/mainwindow.cpp
+++ b/mainwindow.cpp
@@ -345,7 +345,7 @@ MainWindow::MainWindow(bool multiple, QSettings * settings, QSharedMemory *shdme
 {
   while(true)
 {
-  int iret=killbyname("jt9.exe");
+  int iret=killbyname("jt9_omp.exe");
   if(iret == 603) break;
   if(iret != 0) msgBox("KillByName return code: " + QString::number(iret));
@@ -375,14 +375,14 @@ MainWindow::MainWindow(bool multiple, QSettings * settings, QSharedMemory *shdme
 
   QStringList jt9_args {
 "-s", QApplication::applicationName ()
-, "-w", "1" //FFTW patience
-, "-m", "1" //FFTW threads
+  , "-w", "2"   //FFTW patience
+  , "-m", QString::number (qMax (QThread::idealThreadCount () - 2, 1)) //FFTW threads
   , "-e", QDir::toNativeSeparators (m_appDir)
   , "-a", QDir::toNativeSeparators (m_dataDir.absolutePath ())
   , "-t", QDir::toNativeSeparators (m_config.temp_dir ().absolutePath ())
   };
   proc_jt9.start(QDir::toNativeSeparators (m_appDir) + QDir::separator () +
-  "jt9", jt9_args, QIODevice::ReadWrite | QIODevice::Unbuffered);
+  "jt9_omp", jt9_args, QIODevice::ReadWrite | QIODevice::Unbuffered);
 
   QString fname {QDir::toNativeSeparators(m_dataDir.absoluteFilePath ("wsjtx_wisdom.dat"))};
   QByteArray cfname=fname.toLocal8Bit();
@@ -1321,7 +1321,7 @@ void MainWindow::decode()   //decode()
 void MainWindow::jt9_error (QProcess::ProcessError e)
 {
   if(!m_killAll) {
-msgBox("Error starting or running\n" + m_appDir + "/jt9 -s");
+msgBox("Error starting or running\n" + m_appDir + "/jt9_omp -s");
 qDebug() << e;   // silence compiler warning
 exit(1);
   }


Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Michael Black
I sent this in via HTML and it got blocked...so here it is plain text...
Mike W9MDB

Did a 20-pass run on the last two versions of interest. I have a dual
4-core CPU, so apparently I would have 8 threads available on r4928, cut to 2
threads on r4930.  So there is an ever-so-slight improvement with 1 thread; 2
threads got worse, though they were already worse to start with.
Col1 = TimeMem-1.0.exe jt9_omp -p 1 -d 3 -w 2 -m 1 130610_2343.wav | grep
Elapsed | cut -f2 -d:
Col2 = TimeMem-1.0.exe jt9_omp -p 1 -d 3 -w 2 -m 2 130610_2343.wav | grep
Elapsed | cut -f2 -d:

Thread   1       2       %Diff
4928     1.13    1.14    -0.88%
         1.08    1.13    -4.63%
         1.1     1.2     -9.09%
         1.1     1.13    -2.73%
         1.19    1.19    0.00%
         1.12    1.1     1.79%
         1.1     1.14    -3.64%
         1.08    1.13    -4.63%
         1.11    1.18    -6.31%
         1.1     1.12    -1.82%
         1.09    1.12    -2.75%
         1.1     1.13    -2.73%
         1.09    1.18    -8.26%
         1.1     1.12    -1.82%
         1.09    1.17    -7.34%
         1.08    1.2     -11.11%
         1.09    1.31    -20.18%
         1.11    1.23    -10.81%
         1.1     1.24    -12.73%
         1.09    1.19    -9.17%
Average  1.1025  1.1675  -5.94%

Thread   1       2       %Diff
4930     1.1     1.28    -16.36%
         1.08    1.21    -12.04%
         1.08    1.2     -11.11%
         1.1     1.22    -10.91%
         1.08    1.23    -13.89%
         1.08    1.22    -12.96%
         1.09    1.22    -11.93%
         1.07    1.23    -14.95%
         1.09    1.23    -12.84%
         1.13    1.22    -7.96%
         1.09    1.22    -11.93%
         1.08    1.22    -12.96%
         1.08    1.25    -15.74%
         1.08    1.22    -12.96%
         1.11    1.24    -11.71%
         1.09    1.22    -11.93%
         1.11    1.24    -11.71%
         1.1     1.22    -10.91%
         1.08    1.2     -11.11%
         1.09    1.2     -10.09%
Average  1.0905  1.2245  -12.30%
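
The %Diff column in these tables appears to be the 1-thread time minus the 2-thread time, taken as a percentage of the 1-thread time; for example (1.1 - 1.28) / 1.1 gives -16.36%. A small C++ sketch of that calculation, under that assumption (the function names are illustrative only and are not part of the test harness above):

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Percentage difference relative to the 1-thread time.
// A negative result means the 2-thread run was slower.
double percent_diff(double t1, double t2)
{
    return (t1 - t2) / t1 * 100.0;
}

// Arithmetic mean of a column of timings in seconds (0 for an empty column).
double mean(const std::vector<double>& v)
{
    if (v.empty()) return 0.0;
    return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}
```

Feeding each row's two timings to percent_diff reproduces the %Diff entries to two decimal places, and mean gives the values on the Average row for the timing columns.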

-Original Message-
From: Bill Somerville [mailto:g4...@classdesign.com] 
Sent: Wednesday, February 04, 2015 9:31 AM
To: wsjt-devel@lists.sourceforge.net
Subject: Re: [wsjt-devel] WSJT-X Decoder Performance

On 04/02/2015 15:27, Joe Taylor wrote:
> Hi Bill,
Hi Joe,
>
> OK, by all means go ahead.
>
> BTW: I notice that jt9_omp.exe r4929 always runs with 4 threads on my 
> 4-core machine.  Since we have only two tasks running in parallel, I 
> can see little reason to use more than 2 threads.  Should we specify 
> two threads explicitly?
Yes, I have addressed that as well.
>
>   -- Joe
73
Bill
G4WJS.
>
> On 2/4/2015 10:24 AM, Bill Somerville wrote:
>> On 04/02/2015 15:21, Joe Taylor wrote:
>>> Hi Bill and all,
>> Hi Joe,
>>
>> 
>>> Note that decoder.f90 now decodes the two modes in parallel sections
>>> *ONLY* if txmode is JT9.  I will fix this.
>> Joe, I already have this in hand, I can check it in if you wish.
>>
>> 
>>> -- Joe
>> 73
>> Bill
>> G4WJS.
>>

Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Michael Black
Did a 20-pass run on the last two versions of interest. I have a dual
4-core CPU, so apparently I would have 8 threads available on r4928, cut to 2
threads on r4930.  So there is an ever-so-slight improvement with 1 thread; 2
threads got worse, though they were already worse to start with.

Col1 = TimeMem-1.0.exe jt9_omp -p 1 -d 3 -w 2 -m 1 130610_2343.wav | grep
Elapsed | cut -f2 -d:

Col2 = TimeMem-1.0.exe jt9_omp -p 1 -d 3 -w 2 -m 2 130610_2343.wav | grep
Elapsed | cut -f2 -d:

 

 


Threads  1       2       %Diff
4928     1.13    1.14    -0.88%
         1.08    1.13    -4.63%
         1.1     1.2     -9.09%
         1.1     1.13    -2.73%
         1.19    1.19    0.00%
         1.12    1.1     1.79%
         1.1     1.14    -3.64%
         1.08    1.13    -4.63%
         1.11    1.18    -6.31%
         1.1     1.12    -1.82%
         1.09    1.12    -2.75%
         1.1     1.13    -2.73%
         1.09    1.18    -8.26%
         1.1     1.12    -1.82%
         1.09    1.17    -7.34%
         1.08    1.2     -11.11%
         1.09    1.31    -20.18%
         1.11    1.23    -10.81%
         1.1     1.24    -12.73%
         1.09    1.19    -9.17%
Average  1.1025  1.1675  -5.94%

Threads  1       2       %Diff
4930     1.1     1.28    -16.36%
         1.08    1.21    -12.04%
         1.08    1.2     -11.11%
         1.1     1.22    -10.91%
         1.08    1.23    -13.89%
         1.08    1.22    -12.96%
         1.09    1.22    -11.93%
         1.07    1.23    -14.95%
         1.09    1.23    -12.84%
         1.13    1.22    -7.96%
         1.09    1.22    -11.93%
         1.08    1.22    -12.96%
         1.08    1.25    -15.74%
         1.08    1.22    -12.96%
         1.11    1.24    -11.71%
         1.09    1.22    -11.93%
         1.11    1.24    -11.71%
         1.1     1.22    -10.91%
         1.08    1.2     -11.11%
         1.09    1.2     -10.09%
Average  1.0905  1.2245  -12.30%

 



Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Bill Somerville
On 04/02/2015 15:27, Joe Taylor wrote:
> Hi Bill,
Hi Joe,
>
> OK, by all means go ahead.
>
> BTW: I notice that jt9_omp.exe r4929 always runs with 4 threads on my
> 4-core machine.  Since we have only two tasks running in parallel, I can
> see little reason to use more than 2 threads.  Should we specify two
> threads explicitly?
Yes, I have addressed that as well.
>
>   -- Joe
73
Bill
G4WJS.
>
> On 2/4/2015 10:24 AM, Bill Somerville wrote:
>> On 04/02/2015 15:21, Joe Taylor wrote:
>>> Hi Bill and all,
>> Hi Joe,
>>
>> 
>>> Note that decoder.f90 now decodes the two modes in parallel sections
>>> *ONLY* if txmode is JT9.  I will fix this.
>> Joe, I already have this in hand, I can check it in if you wish.
>>
>> 
>>> -- Joe
>> 73
>> Bill
>> G4WJS.
>>


Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Joe Taylor
Hi Bill,

OK, by all means go ahead.

BTW: I notice that jt9_omp.exe r4929 always runs with 4 threads on my 
4-core machine.  Since we have only two tasks running in parallel, I can 
see little reason to use more than 2 threads.  Should we specify two 
threads explicitly?

-- Joe

On 2/4/2015 10:24 AM, Bill Somerville wrote:
> On 04/02/2015 15:21, Joe Taylor wrote:
>> Hi Bill and all,
> Hi Joe,
>
> 
>> Note that decoder.f90 now decodes the two modes in parallel sections
>> *ONLY* if txmode is JT9.  I will fix this.
> Joe, I already have this in hand, I can check it in if you wish.
>
> 
>>  -- Joe
> 73
> Bill
> G4WJS.
>


Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Bill Somerville
On 04/02/2015 15:21, Joe Taylor wrote:
> Hi Bill and all,
Hi Joe,


> Note that decoder.f90 now decodes the two modes in parallel sections
> *ONLY* if txmode is JT9.  I will fix this.
Joe, I already have this in hand, I can check it in if you wish.


>   -- Joe
73
Bill
G4WJS.



Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Joe Taylor
Hi Bill and all,

Tests here suggest that r4929 produces a Windows jt9_omp.exe that runs 
correctly.  At least, it runs to completion on my sequence of 25 test 
files -- which r4928 does not.

Timing results on a 4-core Win7 machine:

Params       jt9      jt9_omp
------------------------------
-w 2 -m 1    25.5 s   21.1 s
-w 2 -m 2    24.9 s   21.0 s

When using OpenMP to run JT9 and JT65 decoders in parallel, we gain 
almost nothing by using multi-threading for the FFTW plans.
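
As a rough check of the figures in the table, the relative savings can be worked out directly. The sketch below uses only the four timings quoted above; the helper name percent_saved is made up for illustration.

```python
# Timings copied from the table above (seconds, 25 test files, 4-core Win7).
timings = {
    "-w 2 -m 1": {"jt9": 25.5, "jt9_omp": 21.1},
    "-w 2 -m 2": {"jt9": 24.9, "jt9_omp": 21.0},
}


def percent_saved(before, after):
    """Percentage reduction in run time going from `before` to `after`."""
    return 100.0 * (before - after) / before


# Gain from running the JT9 and JT65 decoders in OpenMP parallel sections.
for params, t in timings.items():
    gain = percent_saved(t["jt9"], t["jt9_omp"])
    print(f"{params}: jt9_omp saves {gain:.1f}% of the jt9 run time")

# Gain from a second FFTW thread (-m 2 versus -m 1) within each executable.
for exe in ("jt9", "jt9_omp"):
    gain = percent_saved(timings["-w 2 -m 1"][exe], timings["-w 2 -m 2"][exe])
    print(f"{exe}: second FFTW thread saves {gain:.1f}%")
```

On these numbers the saving from the second FFTW thread is much smaller in jt9_omp than in jt9, which is consistent with the remark above.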

Note that decoder.f90 now decodes the two modes in parallel sections 
*ONLY* if txmode is JT9.  I will fix this.

I may also look for additional places where concurrent processing could 
help performance... but I don't consider this a very high priority.

-- Joe



Re: [wsjt-devel] v4926 OpenMP

2015-02-04 Thread Bill Somerville
On 04/02/2015 14:29, Michael Black wrote:
Hi Mike,
> I don't think you'll find any gain using FFTW openmp.  WSJT-X does not do
> big enough FFTs to overtake the thread create/delete overhead.
That's not so, Mike. Joe has already determined that FFTW3, given roughly 
2 threads, has a small performance gain on the larger FFTs in the decoders.

There may be some confusion here. The talk about using the OpenMP 
version of FFTW3 is as an alternative to the native/pthreads version. Both 
are multi-threaded and have similar performance. The OpenMP version has 
the benefit that it is aware of the threads also being used elsewhere in 
the application and therefore plays well with the dynamic number of 
threads algorithm in OpenMP. This is currently not relevant to us, as we 
are simply dividing the work in half and running the two halves (JT65 
decode and JT9 decode) in parallel. The thread allocation for that is 
trivial, i.e. 2 if there are at least 2 CPUs available. We also have 
direct control of the number of threads FFTW3 uses, so we can allocate 
any spare CPUs, above the two used for the decoder threads, to the 
larger FFT plans.
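
The allocation described above could be sketched roughly as follows. This is only an illustration in Python, not code from WSJT-X; the function name allocate_threads and its defaults are assumptions.

```python
import os


def allocate_threads(cpus, decoder_tasks=2):
    """Split CPUs between the decoder tasks and FFTW3 (hypothetical helper).

    Returns (decoder_threads, fft_threads).
    """
    # One thread per decoder task (JT65 and JT9), if that many CPUs exist.
    decoder_threads = min(decoder_tasks, max(1, cpus))
    # Any CPUs left over are offered to the larger FFT plans.
    spare = max(0, cpus - decoder_threads)
    # FFTW3 always needs at least one thread to run at all.
    fft_threads = max(1, spare)
    return decoder_threads, fft_threads


cpus = os.cpu_count() or 1
print(f"{cpus} CPUs -> (decoder, FFT) threads = {allocate_threads(cpus)}")
```

On a 4-CPU machine this gives 2 decoder threads and 2 FFT threads; on a 2-CPU machine the FFTs fall back to a single thread.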
>
> I worked a job a few years ago on a 512-core machine doing FFTs on synthetic
> aperture radar systems.  Using FFTW with OpenMP did very little good.  Using
> OpenMP at the layer above did...which is the same thing I think we'll find
> here.
> OpenMP inside FFTW for small FFTs will have the overhead dominate and defeat
> it.
wsjtx/jt9 use a number of different FFT sizes. Currently the '-m #' 
argument is being used as the thread count for all of them. We probably 
need to use more than one thread only for the large FFTs, as you are 
correct that there is a high proportion of thread synchronization overhead 
for small FFTs, but FFTW3 does address this internally by only using 
multiple threads on FFTs larger than ~2^11.
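
A caller-side version of that policy might look like the sketch below. The ~2^11 threshold is the figure mentioned above; the function name threads_for_fft and the example sizes are hypothetical and not taken from the WSJT-X source.

```python
def threads_for_fft(n_points, max_threads, threshold=2**11):
    """Pick a thread count for one FFT plan (illustrative policy only).

    Small transforms get a single thread, because synchronization overhead
    would dominate; larger ones may use up to `max_threads`.
    """
    if max_threads < 1:
        raise ValueError("max_threads must be at least 1")
    return max_threads if n_points > threshold else 1


# Example sizes chosen only for illustration.
for n in (512, 2**11, 2**14, 2**20):
    print(f"FFT of {n} points -> {threads_for_fft(n, max_threads=2)} thread(s)")
```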
>
> We're already seeing only a 20-25% improvement in openmp at this level which
> is a clear indication to me that we're not getting anywhere near 100% gain
> for threading so doing it at a lower level isn't worth it.
That is comparing Apples and Screwdrivers ;) The threading strategy for 
the decoders is one task per thread, whereas the FFT strategy is a true 
divide and conquer algorithm with a recursive distribution to threads. 
They are both able to deliver performance improvements in the same 
application given enough CPUs to run on (2 for decoders + ~2 for FFTs 
has been shown to be optimal). Note that there is absolutely no 
threading contention or overhead between the FFTs and the decoders, even 
though the latter uses the former. So given that the average low end PC 
these days is usually at least a dual core hyper-threaded Intel 
processor or equivalent, we can assume that 4 CPUs are available.

Not achieving 100% improvement at this stage from parallel decoding is 
likely to be due to overheads that we can and should address, like not 
having the correct granularity on locks and being too pessimistic about 
data sharing controls. The FFTW3 concurrency is in and working, but the 
direct use of OpenMP for parallel decoding is yet to be fully implemented.
> When I did my 512-core system I was getting over 90% gain for each thread I
> added.
> Much like "don't sweat the small stuff" I think you'll find "don't
> multi-thread the small FFTs" is a good paradigm...
> When you get a ~50% gain then it's time to look at multi-threading below
> that level.
OK, but the FFTW3 threading is almost free in terms of complexity. The 
FFTW developers have done all the hard stuff, we just need to turn it 
on. That means even quite small gains are cost effective. OTOH the 
direct use of OpenMP in the decoder is adding a lot of complexity, since 
we have to design and implement or eliminate the data sharing controls. 
The potential gain is large, so it is probably worth the cost in development 
effort and complexity.
>
> Mike W9MDB
73
Bill
G4WJS.
>
>
> -Original Message-
> From: Bill Somerville [mailto:g4...@classdesign.com]
> Sent: Wednesday, February 04, 2015 8:21 AM
> To: wsjt-devel@lists.sourceforge.net
> Subject: Re: [wsjt-devel] v4926 OpenMP
>
> On 04/02/2015 14:05, John Nelson wrote:
>> Hi Bill and Joe,
> Hi John,
>> With regard to Mac builds, your [Bill] code test with workspace and
> workspace_mt executes correctly with my gfortran compiler.   However, as you
> point out the current clang/clang++ do not [yet] have OpenMP support.
>> So when I compile fftw_3.3.4 with --enable-threads, I cannot also use
> --with-openmp.  I also get:
>> -- Try OpenMP C flag = [ ]
>> -- Performing Test OpenMP_FLAG_DETECTED
>> -- Performing Test OpenMP_FLAG_DETECTED - Failed
> I am experimenting with the MacPorts gcc 4.9 suite with building WSJT-X.
> That needs changes to the CMake script which I have not committed yet.
> So far it doesn't seem to be necessary to build or use the OpenMP version of
> FFTW3, the native/pthreads version is working well and seems to be
> compatible with an OpenMP program. 

Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Michael Black
Doing some testing on the 4928 jt9_omp on my Windows 10 box using a command
line test.  I'm getting periodic long runs of 40-100 seconds...kind of like
it's running wisdom again or such.

There are a lot more page faults when that happens too.

I haven't seen this behavior on Windows 7 yet.

Mike W9MDB








Re: [wsjt-devel] v4926 OpenMP

2015-02-04 Thread Michael Black
I don't think you'll find any gain using FFTW openmp.  WSJT-X does not do
big enough FFTs to overtake the thread create/delete overhead.

I worked a job a few years ago on a 512-core machine doing FFTs on synthetic
aperture radar systems.  Using FFTW with OpenMP did very little good.  Using
OpenMP at the layer above did...which is the same thing I think we'll find
here.
OpenMP inside FFTW for small FFTs will have the overhead dominate and defeat
it.

We're already seeing only a 20-25% improvement in openmp at this level which
is a clear indication to me that we're not getting anywhere near 100% gain
for threading so doing it at a lower level isn't worth it.
When I did my 512-core system I was getting over 90% gain for each thread I
added.
Much like "don't sweat the small stuff" I think you'll find "don't
multi-thread the small FFTs" is a good paradigm...
When you get a ~50% gain then it's time to look at multi-threading below
that level.

Mike W9MDB


-Original Message-
From: Bill Somerville [mailto:g4...@classdesign.com] 
Sent: Wednesday, February 04, 2015 8:21 AM
To: wsjt-devel@lists.sourceforge.net
Subject: Re: [wsjt-devel] v4926 OpenMP

On 04/02/2015 14:05, John Nelson wrote:
> Hi Bill and Joe,
Hi John,
>
> With regard to Mac builds, your [Bill] code test with workspace and
workspace_mt executes correctly with my gfortran compiler.   However, as you
point out the current clang/clang++ do not [yet] have OpenMP support.
>
> So when I compile fftw_3.3.4 with --enable-threads, I cannot also use
--with-openmp.  I also get:
>
> -- Try OpenMP C flag = [ ]
> -- Performing Test OpenMP_FLAG_DETECTED
> -- Performing Test OpenMP_FLAG_DETECTED - Failed
I am experimenting with the MacPorts gcc 4.9 suite with building WSJT-X. 
That needs changes to the CMake script which I have not committed yet. 
So far it doesn't seem to be necessary to build or use the OpenMP version of
FFTW3, the native/pthreads version is working well and seems to be
compatible with an OpenMP program. I believe the only issue is that we need
to control the number of threads used by FFTW3 and OpenMP manually to a
certain extent. If it does become necessary to use the OpenMP version of
FFTW3, that can be built on Mac, again I have the MacPorts version
available.

There also appears to be a bug in CMake that is causing it not to pass on
the portability options to the gcc compilers/linker (MAC_OSX_SYSROOT and
MAC_OSX_DEPLOYMENT_TARGET). This is not serious and can be worked around if
necessary but I want to get it sorted out properly if possible.

My current focus apart from v1.4 issues is to help Joe with multi-threading
hazards in jt9 but I am working on the Mac builds with OpenMP as well.
>
> when building WSJT-X r4928 which is currently executing successfully - and
certainly decodes rapidly.
You are getting the latest performance increases which are significant. 
The OpenMP jt9, which is not in WSJT-X yet, has the potential to almost halve
decoding times in dual JT65+JT9 mode when there is equivalent work to be
done in each mode.
>
> --- John G4KLA
73
Bill
G4WJS.




Re: [wsjt-devel] v4926 OpenMP

2015-02-04 Thread Claude Frantz

Hi all,

I have just run jt9_omp under valgrind. All the other arguments are the 
same as used previously. You will find the log attached.


Perhaps this can help a little bit.

Best 88 de Claude



valog.txt.bz2
Description: application/bzip


Re: [wsjt-devel] v4926 OpenMP

2015-02-04 Thread Bill Somerville
On 04/02/2015 14:05, John Nelson wrote:
> Hi Bill and Joe,
Hi John,
>
> With regard to Mac builds, your [Bill] code test with workspace and 
> workspace_mt executes correctly with my gfortran compiler.   However, as you 
> point out the current clang/clang++ do not [yet] have OpenMP support.
>
> So when I compile fftw_3.3.4 with --enable-threads, I cannot also use  
> --with-openmp.  I also get:
>
> -- Try OpenMP C flag = [ ]
> -- Performing Test OpenMP_FLAG_DETECTED
> -- Performing Test OpenMP_FLAG_DETECTED - Failed
I am experimenting with the MacPorts gcc 4.9 suite with building WSJT-X. 
That needs changes to the CMake script which I have not committed yet. 
So far it doesn't seem to be necessary to build or use the OpenMP 
version of FFTW3, the native/pthreads version is working well and seems 
to be compatible with an OpenMP program. I believe the only issue is 
that we need to control the number of threads used by FFTW3 and OpenMP 
manually to a certain extent. If it does become necessary to use the 
OpenMP version of FFTW3, that can be built on Mac, again I have the 
MacPorts version available.

There also appears to be a bug in CMake that is causing it not to pass 
on the portability options to the gcc compilers/linker (MAC_OSX_SYSROOT 
and MAC_OSX_DEPLOYMENT_TARGET). This is not serious and can be worked 
around if necessary but I want to get it sorted out properly if possible.

My current focus apart from v1.4 issues is to help Joe with 
multi-threading hazards in jt9 but I am working on the Mac builds with 
OpenMP as well.
>
> when building WSJT-X r4928 which is currently executing successfully - and 
> certainly decodes rapidly.
You are getting the latest performance increases which are significant. 
The OpenMP jt9, which is not in WSJT-X yet, has the potential to almost 
halve decoding times in dual JT65+JT9 mode when there is equivalent work 
to be done in each mode.
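
The "equivalent work" condition can be seen from a simple model: if the two decoders run concurrently, the wall time is set by the slower one. The sketch below is a simplified model under that assumption only; the workloads are made-up numbers, not measurements, and the function names are hypothetical.

```python
def sequential_time(t_jt65, t_jt9, t_serial=0.0):
    """Wall time when the two decoders run one after the other."""
    return t_serial + t_jt65 + t_jt9


def parallel_time(t_jt65, t_jt9, t_serial=0.0):
    """Wall time when the two decoders run concurrently on separate CPUs."""
    return t_serial + max(t_jt65, t_jt9)


def speedup(t_jt65, t_jt9, t_serial=0.0):
    """Ratio of sequential to parallel wall time."""
    return sequential_time(t_jt65, t_jt9, t_serial) / parallel_time(
        t_jt65, t_jt9, t_serial
    )


# Made-up workloads in seconds, purely for illustration.
for t65, t9 in [(1.0, 1.0), (1.5, 0.5), (1.9, 0.1)]:
    print(f"JT65={t65}s JT9={t9}s -> speedup {speedup(t65, t9):.2f}x")
```

Only the balanced case reaches a factor of two; the more lopsided the split, the closer the speedup falls towards one. Any serial portion (t_serial) reduces it further.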
>
> --- John G4KLA
73
Bill
G4WJS.



Re: [wsjt-devel] v4926 OpenMP

2015-02-04 Thread John Nelson
Hi Bill and Joe,

With regard to Mac builds, your [Bill] code test with workspace and 
workspace_mt executes correctly with my gfortran compiler.   However, as you 
point out the current clang/clang++ do not [yet] have OpenMP support.

So when I compile fftw_3.3.4 with --enable-threads, I cannot also use  
--with-openmp.  I also get:

-- Try OpenMP C flag = [ ]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Failed

when building WSJT-X r4928 which is currently executing successfully - and 
certainly decodes rapidly.

--- John G4KLA


Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Michael Black
Just tested 4928 on Windows 7...jt9_omp would not finish running in 4926.
Runs fine now...a 20% improvement on a dual-CPU quad-core 3 GHz Intel X5450.
Also tested on my Windows 10 machine, but the runtime on jt9_omp was very
unstable, taking anywhere from 0.8 seconds to 8 or so seconds.

TimeMem-1.0.exe jt9 -p 1 -d 3 -w 2 -m 1 130610_2343.wav
2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
2343  14  0.1 3490 @ CQ AG4M EM75
2343 -20 -1.3 3567 @ CQ TA4A KM37
2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
2343 -16  0.2 3774 @ CQ M0ABA JO01
2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
2343  -7  0.3  815 # KK4DSD W7VP -16
2343 -10  0.5  975 # CQ DL7ACA JO40
2343  -9  0.8 1089 # N2SU W0JMW R-14
2343 -11  0.8 1259 # YV6BFE F6GUU R-08
2343  -9  1.7 1471 # VA3UG F1HMR 73
2343  -1  0.6 1718 # BG THX JOE 73
2343 -15  1.3 1951 # RA3Y VE3NLS 73
2343 -20  0.4 2065 # K2OI AJ4UU R-20
   0   1
Exit code  : 0
Elapsed time   : 1.55
Kernel time: 0.03 (2.0%)
User time  : 1.22 (78.3%)
page fault #   : 11742
Working set: 45468 KB
Paged pool : 231 KB
Non-paged pool : 8 KB
Page file size : 71504 KB

TimeMem-1.0.exe jt9_omp -p 1 -d 3 -w 2 -m 1 130610_2343.wav
2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
2343  -7  0.3  815 # KK4DSD W7VP -16
2343 -10  0.5  975 # CQ DL7ACA JO40
2343  -9  0.8 1089 # N2SU W0JMW R-14
2343  14  0.1 3490 @ CQ AG4M EM75
2343 -11  0.8 1259 # YV6BFE F6GUU R-08
2343 -20 -1.3 3567 @ CQ TA4A KM37
2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
2343  -9  1.7 1471 # VA3UG F1HMR 73
2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
2343 -16  0.2 3774 @ CQ M0ABA JO01
2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
2343  -1  0.6 1718 # BG THX JOE 73
2343 -15  1.3 1951 # RA3Y VE3NLS 73
2343 -20  0.4 2065 # K2OI AJ4UU R-20
   0   1
Exit code  : 0
Elapsed time   : 1.10
Kernel time: 0.08 (7.1%)
User time  : 1.23 (111.8%)
page fault #   : 13340
Working set: 51648 KB
Paged pool : 212 KB
Non-paged pool : 11 KB
Page file size : 74824 KB
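
Because jt9_omp interleaves the JT9 (@) and JT65 (#) decodes, a line-by-line diff of the two outputs will not match even when the decodes are identical. One way to compare them is as sets, sketched below in Python with four lines copied from each run above; the helper name decode_set is made up.

```python
def decode_set(output):
    """Return the decode lines as a set, ignoring order and extra spaces."""
    return {" ".join(line.split()) for line in output.splitlines() if line.strip()}


# Four lines from the jt9 run, in the order jt9 printed them.
serial_out = """
2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
2343  14  0.1 3490 @ CQ AG4M EM75
2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
"""

# The same four decodes in the order jt9_omp printed them.
omp_out = """
2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
2343  14  0.1 3490 @ CQ AG4M EM75
"""

missing = decode_set(serial_out) - decode_set(omp_out)
extra = decode_set(omp_out) - decode_set(serial_out)
print("missing from jt9_omp:", sorted(missing))
print("only in jt9_omp:", sorted(extra))
```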




Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Bill Somerville
On 04/02/2015 13:37, Joe Taylor wrote:
> Hi Claude,
Hi Claude & Joe,
>
> Thanks for your timing report.
>
> Your first test may have used the correct executables, too.  To get a
> good test you must run a configuration at least twice.  In the first
> run, the program accumulates "wisdom" about the best way to configure
> the FFT calculations.  This wisdom is saved and used for subsequent
> runs.  If you change the "-w #" or "-m #" parameters, new wisdom will
> need to be accumulated.
That is also the case if the number of threads (the '-m #' option) is 
changed.
>
>   -- 73, Joe, K1JT
73
Bill
G4WJS.
>
> On 2/4/2015 5:44 AM, Claude Frantz wrote:
>> On 02/04/2015 08:42 AM, Claude Frantz wrote:
>>
>>> Please see here the result I have got with SVNVERSION 4928.
>> I'm very sorry, I have used the wrong executables. Here the right output
>> now:
>>
>> $ time ./jt9 -p 1 -d 3 -w 2 -m 1
>> /home/claude/.wsjtx/bin/save/samples/130610_2343.wav
>> 2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
>> 2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
>> 2343  14  0.1 3490 @ CQ AG4M EM75
>> 2343 -20 -1.3 3567 @ CQ TA4A KM37
>> 2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
>> 2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
>> 2343 -16  0.2 3774 @ CQ M0ABA JO01
>> 2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
>> 2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
>> 2343  -7  0.3  815 # KK4DSD W7VP -16
>> 2343 -10  0.5  975 # CQ DL7ACA JO40
>> 2343  -9  0.8 1089 # N2SU W0JMW R-14
>> 2343 -11  0.8 1259 # YV6BFE F6GUU R-08
>> 2343  -9  1.7 1471 # VA3UG F1HMR 73
>> 2343  -1  0.6 1718 # BG THX JOE 73
>> 2343 -15  1.3 1951 # RA3Y VE3NLS 73
>> 2343 -20  0.4 2065 # K2OI AJ4UU R-20
>> 0   1
>>
>> real 0m2.407s
>> user 0m2.324s
>> sys  0m0.073s
>>
>> 
>>
>> $ time ./jt9_omp -p 1 -d 3 -w 2 -m 1
>> /home/claude/.wsjtx/bin/save/samples/130610_2343.wav
>> 2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
>> 2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
>> 2343  -7  0.3  815 # KK4DSD W7VP -16
>> 2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
>> 2343 -10  0.5  975 # CQ DL7ACA JO40
>> 2343  -9  0.8 1089 # N2SU W0JMW R-14
>> 2343 -11  0.8 1259 # YV6BFE F6GUU R-08
>> 2343  14  0.1 3490 @ CQ AG4M EM75
>> 2343 -20 -1.3 3567 @ CQ TA4A KM37
>> 2343  -9  1.7 1471 # VA3UG F1HMR 73
>> 2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
>> 2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
>> 2343 -16  0.2 3774 @ CQ M0ABA JO01
>> 2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
>> 2343  -1  0.6 1718 # BG THX JOE 73
>> 2343 -15  1.3 1951 # RA3Y VE3NLS 73
>> 2343 -20  0.4 2065 # K2OI AJ4UU R-20
>> 0   1
>>
>> real 0m1.663s
>> user 0m2.502s
>> sys  0m0.090s
>>
>>
>>


Re: [wsjt-devel] v4926 OpenMP

2015-02-04 Thread Bill Somerville
On 04/02/2015 13:31, Joe Taylor wrote:
> Hi Bill and all,
Hi Joe,
>
> Probably you already know this, but just in case...
>
> I ran WSJT-X overnight using jt9_omp.exe (renamed to jt9.exe, 
> and dropped in as a replacement) under r4928.  It ran OK for a while, 
> producing interspersed decodes in both modes.  The decoder then 
> crashed, popping up the attached error message.
That is different from the ones I am seeing, although if you are using a 
release configuration build that might be collateral damage from an 
uncaught array bounds violation.
>
> I assume that we still have one or more thread-safety issues.
timer.f90 is the main culprit. Also four2a.f90 may still have some 
issues but I think that one is OK.

There is a secondary problem with stack size. Because OpenMP forces all 
local arrays to be on the stack, there is an overly large stack 
requirement. This could be reduced by giving the SAVE attribute to some 
of them. The downside of that is that they then become shared between 
threads, and we either have to make sure that only one thread can ever 
modify them, or serialize access to them if they can and should be shared 
between threads.
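
The "serialize access" option can be illustrated with a small Python analogue. This is not the Fortran in timer.f90, whose contents are not shown in this thread; it only assumes that a timing routine keeps per-section totals shared by all threads. The class SectionTimer and the worker function are hypothetical.

```python
import threading
import time


class SectionTimer:
    """Accumulates elapsed time per named section from several threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._totals = {}  # shared between threads, so guarded by _lock
        self._calls = {}

    def add(self, name, seconds):
        with self._lock:  # serialize updates to the shared tables
            self._totals[name] = self._totals.get(name, 0.0) + seconds
            self._calls[name] = self._calls.get(name, 0) + 1

    def report(self):
        with self._lock:
            return {n: (self._calls[n], self._totals[n]) for n in self._totals}


def worker(timer, name, repeats):
    for _ in range(repeats):
        start = time.perf_counter()
        sum(range(1000))  # stand-in for real decoder work
        timer.add(name, time.perf_counter() - start)


timer = SectionTimer()
threads = [
    threading.Thread(target=worker, args=(timer, name, 500))
    for name in ("jt65", "jt9")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(timer.report())
```

The alternative mentioned above, ensuring only one thread ever modifies the data, would avoid the lock at the cost of restricting which thread may call the routine.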

Another outstanding issue is Mac builds. Currently we use the Mac native 
compilers for C & C++ (clang/clang++). Those compilers do not have 
(working) OpenMP support. I am experimenting with using gcc/g++ to 
build. I am also experimenting with giving only the Fortran compiler 
OpenMP support and continuing to mix gfortran compiled object 
files with clang/clang++, but I doubt this is going to work. Hopefully the 
former will be viable.
>
> -- Joe, K1JT
73
Bill
G4WJS.



Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Joe Taylor
Hi Claude,

Thanks for your timing report.

Your first test may have used the correct executables, too.  To get a 
good test you must run a configuration at least twice.  In the first 
run, the program accumulates "wisdom" about the best way to configure 
the FFT calculations.  This wisdom is saved and used for subsequent 
runs.  If you change the "-w #" or "-m #" parameters, new wisdom will 
need to be accumulated.
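
A timing harness that follows this advice might look like the sketch below: run the same configuration several times and set the first run aside. The helper names are made up, and the command shown is a stand-in so that the sketch runs anywhere; the real jt9 invocation would be substituted for it.

```python
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return the wall-clock time of each run."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


def warm_times(times):
    """Drop the first run, during which the FFT wisdom is accumulated."""
    if len(times) < 2:
        raise ValueError("need at least two runs to discard the first one")
    return times[1:]


# Stand-in command (an assumption). A real test would use something like
# ["./jt9", "-p", "1", "-d", "3", "-w", "2", "-m", "1", "<file>.wav"].
cmd = [sys.executable, "-c", "pass"]
all_times = time_command(cmd, runs=3)
print("first run :", f"{all_times[0]:.3f} s")
print("later runs:", [f"{t:.3f} s" for t in warm_times(all_times)])
```

The first run has to be repeated whenever the "-w #" or "-m #" values change, since new wisdom is then accumulated.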

-- 73, Joe, K1JT

On 2/4/2015 5:44 AM, Claude Frantz wrote:
> On 02/04/2015 08:42 AM, Claude Frantz wrote:
>
>> Please see here the result I have got with SVNVERSION 4928.
>
> I'm very sorry, I have used the wrong executables. Here the right output
> now:
>
> $ time ./jt9 -p 1 -d 3 -w 2 -m 1
> /home/claude/.wsjtx/bin/save/samples/130610_2343.wav
> 2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
> 2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
> 2343  14  0.1 3490 @ CQ AG4M EM75
> 2343 -20 -1.3 3567 @ CQ TA4A KM37
> 2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
> 2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
> 2343 -16  0.2 3774 @ CQ M0ABA JO01
> 2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
> 2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
> 2343  -7  0.3  815 # KK4DSD W7VP -16
> 2343 -10  0.5  975 # CQ DL7ACA JO40
> 2343  -9  0.8 1089 # N2SU W0JMW R-14
> 2343 -11  0.8 1259 # YV6BFE F6GUU R-08
> 2343  -9  1.7 1471 # VA3UG F1HMR 73
> 2343  -1  0.6 1718 # BG THX JOE 73
> 2343 -15  1.3 1951 # RA3Y VE3NLS 73
> 2343 -20  0.4 2065 # K2OI AJ4UU R-20
> 0   1
>
> real  0m2.407s
> user  0m2.324s
> sys   0m0.073s
>
> 
>
> $ time ./jt9_omp -p 1 -d 3 -w 2 -m 1
> /home/claude/.wsjtx/bin/save/samples/130610_2343.wav
> 2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
> 2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
> 2343  -7  0.3  815 # KK4DSD W7VP -16
> 2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
> 2343 -10  0.5  975 # CQ DL7ACA JO40
> 2343  -9  0.8 1089 # N2SU W0JMW R-14
> 2343 -11  0.8 1259 # YV6BFE F6GUU R-08
> 2343  14  0.1 3490 @ CQ AG4M EM75
> 2343 -20 -1.3 3567 @ CQ TA4A KM37
> 2343  -9  1.7 1471 # VA3UG F1HMR 73
> 2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
> 2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
> 2343 -16  0.2 3774 @ CQ M0ABA JO01
> 2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
> 2343  -1  0.6 1718 # BG THX JOE 73
> 2343 -15  1.3 1951 # RA3Y VE3NLS 73
> 2343 -20  0.4 2065 # K2OI AJ4UU R-20
> 0   1
>
> real  0m1.663s
> user  0m2.502s
> sys   0m0.090s
>
>
>


Re: [wsjt-devel] v4926 OpenMP

2015-02-04 Thread Joe Taylor

Hi Bill and all,

Probably you already know this, but just in case...

I ran WSJT-X overnight using jt9_omp.exe (renamed to jt9.exe, and 
dropped in as a replacement) under r4928.  It ran OK for a while, 
producing interspersed decodes in both modes.  The decoder then crashed, 
popping up the attached error message.


I assume that we still have one or more thread-safety issues.

-- Joe, K1JT


Re: [wsjt-devel] v4926 OpenMP

2015-02-04 Thread Bill Somerville

On 04/02/2015 10:12, Josh Rovero wrote:
Hi Josh,
> v4926 seems to compile and run fine on Fedora Core 20 x86_64.
> Compile output says it built with OpenMP.
Only jt9_omp is built with OpenMP, WSJT-X is still using the non-OpenMP 
jt9 for decoding. There is still some code to write and test before 
jt9_omp can be used by WSJT-X. For now jt9_omp is just a test bed for 
the OpenMP trial.

I do have a patch to allow WSJT-X to use jt9_omp for decoding but it 
crashes at the moment, I will post the patch here when it works a bit 
better so those who are building the development branch can test the 
multi-threaded decoder on air.
>
> Not a ton of JT9 signals last night, but didn't crash or otherwise
> misbehave for the hour or so I ran it.
>
> --
> P.J. "Josh" Rovero   Ham Radio: KK1D
> http://www.roveroresearch. net 
> http://www.roveroresearch.info
73
Bill
G4WJS.


Re: [wsjt-devel] WSJT-X Decoder Performance

2015-02-04 Thread Claude Frantz
On 02/04/2015 08:42 AM, Claude Frantz wrote:

> Please see here the result I have got with SVNVERSION 4928.

I'm very sorry, I have used the wrong executables. Here the right output 
now:

$ time ./jt9 -p 1 -d 3 -w 2 -m 1 
/home/claude/.wsjtx/bin/save/samples/130610_2343.wav
2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
2343  14  0.1 3490 @ CQ AG4M EM75
2343 -20 -1.3 3567 @ CQ TA4A KM37
2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
2343 -16  0.2 3774 @ CQ M0ABA JO01
2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
2343  -7  0.3  815 # KK4DSD W7VP -16
2343 -10  0.5  975 # CQ DL7ACA JO40
2343  -9  0.8 1089 # N2SU W0JMW R-14
2343 -11  0.8 1259 # YV6BFE F6GUU R-08
2343  -9  1.7 1471 # VA3UG F1HMR 73
2343  -1  0.6 1718 # BG THX JOE 73
2343 -15  1.3 1951 # RA3Y VE3NLS 73
2343 -20  0.4 2065 # K2OI AJ4UU R-20
   0   1

real    0m2.407s
user    0m2.324s
sys     0m0.073s



$ time ./jt9_omp -p 1 -d 3 -w 2 -m 1 /home/claude/.wsjtx/bin/save/samples/130610_2343.wav
2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
2343  -7  0.3  815 # KK4DSD W7VP -16
2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
2343 -10  0.5  975 # CQ DL7ACA JO40
2343  -9  0.8 1089 # N2SU W0JMW R-14
2343 -11  0.8 1259 # YV6BFE F6GUU R-08
2343  14  0.1 3490 @ CQ AG4M EM75
2343 -20 -1.3 3567 @ CQ TA4A KM37
2343  -9  1.7 1471 # VA3UG F1HMR 73
2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
2343 -16  0.2 3774 @ CQ M0ABA JO01
2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
2343  -1  0.6 1718 # BG THX JOE 73
2343 -15  1.3 1951 # RA3Y VE3NLS 73
2343 -20  0.4 2065 # K2OI AJ4UU R-20
   0   1

real    0m1.663s
user    0m2.502s
sys     0m0.090s
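
The two runs list their decodes in a different order, so a line-by-line diff is not a useful comparison. The following is a minimal Python sketch, not part of WSJT-X, that checks whether both runs produced the same set of decodes and works out the wall-clock ratio from the "real" times. The helper names decodes() and seconds() are made up for this illustration, the decode text is copied from the two listings above, and the trailing "0   1" line is left out.

```python
# Decodes copied from the jt9 run above.
JT9 = """\
2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
2343  14  0.1 3490 @ CQ AG4M EM75
2343 -20 -1.3 3567 @ CQ TA4A KM37
2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
2343 -16  0.2 3774 @ CQ M0ABA JO01
2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
2343  -7  0.3  815 # KK4DSD W7VP -16
2343 -10  0.5  975 # CQ DL7ACA JO40
2343  -9  0.8 1089 # N2SU W0JMW R-14
2343 -11  0.8 1259 # YV6BFE F6GUU R-08
2343  -9  1.7 1471 # VA3UG F1HMR 73
2343  -1  0.6 1718 # BG THX JOE 73
2343 -15  1.3 1951 # RA3Y VE3NLS 73
2343 -20  0.4 2065 # K2OI AJ4UU R-20
"""

# Decodes copied from the jt9_omp run above (same file, different order).
JT9_OMP = """\
2343 -20  0.3  718 # VE6WQ SQ2NIJ -14
2343  -9  0.3 3196 @ WB8QPG IZ0MIT -11
2343  -7  0.3  815 # KK4DSD W7VP -16
2343 -18  1.0 3372 @ KK4HEG KE0CO CN87
2343 -10  0.5  975 # CQ DL7ACA JO40
2343  -9  0.8 1089 # N2SU W0JMW R-14
2343 -11  0.8 1259 # YV6BFE F6GUU R-08
2343  14  0.1 3490 @ CQ AG4M EM75
2343 -20 -1.3 3567 @ CQ TA4A KM37
2343  -9  1.7 1471 # VA3UG F1HMR 73
2343 -15  0.1 3627 @ CT1FBK IK5YZT R+02
2343 -23  0.3 3721 @ KF5SLN KB1SUA FN42
2343 -16  0.2 3774 @ CQ M0ABA JO01
2343  -2  0.2 3843 @ EI3HGB DD2EE JO31
2343  -1  0.6 1718 # BG THX JOE 73
2343 -15  1.3 1951 # RA3Y VE3NLS 73
2343 -20  0.4 2065 # K2OI AJ4UU R-20
"""


def decodes(text):
    """Return the set of decode lines with runs of whitespace collapsed."""
    return {" ".join(line.split()) for line in text.splitlines() if line.strip()}


def seconds(stamp):
    """Convert a bash `time` stamp such as '0m2.407s' to seconds."""
    minutes, rest = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(rest)


same_decodes = decodes(JT9) == decodes(JT9_OMP)
speedup = seconds("0m2.407s") / seconds("0m1.663s")

print(f"same decodes: {same_decodes} ({len(decodes(JT9))} messages)")
print(f"wall-clock ratio jt9 / jt9_omp: {speedup:.2f}")
```

Comparing sets rather than sequences is a deliberate choice here: a multi-threaded decoder can reasonably print results in whatever order its threads finish. The "user" time is slightly higher for jt9_omp, which is what one would expect when the same work is spread over more than one thread.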





[wsjt-devel] v4926 OpenMP

2015-02-04 Thread Josh Rovero
v4926 seems to compile and run fine on Fedora Core 20 x86_64.
Compile output says it built with OpenMP.

Not a ton of JT9 signals last night, but didn't crash or otherwise
misbehave for the hour or so I ran it.

-- 
P.J. "Josh" Rovero   Ham Radio: KK1D
http://www.roveroresearch.net
http://www.roveroresearch.info