[linrad] Re: Network speed problems.

2007-01-06 Thread J.D. Bakker

Is there something wrong in what I do? Does Windows behave
the same way?


On Linux, try this:

- if it's not installed already, install the netcat tool through your 
distribution's package system

- make sure your multicast routing is set up properly
- do:

  dd if=/dev/zero bs=1k count=100k | nc -u -q 1 239.255.0.16 1234

(on some distributions the nc command is called netcat, with the same syntax)

This sends 100MB of zeroes to the multicast address. On my PII/350 this gives:

  102400+0 records in
  102400+0 records out
  104857600 bytes (105 MB) copied, 8.88197 seconds, 11.8 MB/s

...which is as close to the wire speed as one can expect. I cannot 
test this on my PI/166, since it has no Fast Ethernet interface.


Note that this test may give false negatives. The netcat tool is not 
optimized for speed (or processor loading), but if your computer can 
get >11MB/sec with netcat, you can expect linrad to be able to get 
the same amount of bandwidth. The opposite may not be true.
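For anyone who would rather run this test from C than from the shell, the
netcat pipeline above can be approximated with a short sender. This is only a
sketch under assumptions: udp_send_rate() and its parameters are made-up names
for illustration and are not part of linrad or netcat, and the 1024-byte block
simply mirrors the bs=1k given to dd.

```c
/* Sketch: push zero-filled UDP datagrams at a destination and time it.
 * The function name and parameters are illustrative, not from linrad. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Sends `count` datagrams of `size` bytes (1..1024) to addr:port.
 * Returns the achieved rate in bytes per second, or -1.0 on error. */
double udp_send_rate(const char *addr, unsigned short port,
                     size_t size, long count)
{
    static const char zeroes[1024];   /* zero-initialised payload */
    struct sockaddr_in dst;
    struct timeval t0, t1;
    double secs;
    long i;
    int fd;

    if (size == 0 || size > sizeof zeroes || count <= 0)
        return -1.0;

    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, addr, &dst.sin_addr) != 1)
        return -1.0;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1.0;

    gettimeofday(&t0, NULL);
    for (i = 0; i < count; i++) {
        if (sendto(fd, zeroes, size, 0,
                   (struct sockaddr *)&dst, sizeof dst) < 0) {
            close(fd);
            return -1.0;
        }
    }
    gettimeofday(&t1, NULL);
    close(fd);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    if (secs <= 0.0)
        secs = 1e-6;   /* avoid dividing by zero on very short runs */
    return (double)size * (double)count / secs;
}
```

Calling udp_send_rate("239.255.0.16", 1234, 1024, 102400) should send roughly
the same 100MB as the dd/nc pipeline; a result near 11.8e6 would correspond to
the figure quoted above.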


JD 'insomnia' B.
--
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/

#
This message is sent to you because you are subscribed to
 the mailing list .
To unsubscribe, E-mail to: <[EMAIL PROTECTED]>
To switch to the DIGEST mode, E-mail to <[EMAIL PROTECTED]>
To switch to the INDEX mode, E-mail to <[EMAIL PROTECTED]>
Send administrative queries to  <[EMAIL PROTECTED]>



[linrad] Re: Network speed problems.

2007-01-06 Thread J.D. Bakker

Presumably this is well known, I am just a newcomer
to networking and I could not guess it would behave
this way.


It's not well known; with modern hardware (P3 and newer) you should 
easily be able to saturate a 100Base-TX network (i.e. get full 
bandwidth). Why else would all modern machines come with 1Gb 
networking cards ?


Not all network hardware is created equal; some Ethernet chips have 
trouble running at full speed (I seem to recall older RTL8xxx parts, 
but it's been a while).



Are there some settings I should change? One possibility
seems to be to open several sockets on different ports
simultaneously to increase the throughput.


The kernel socket layer should never be the bottleneck on non-ancient hardware.


Is there something wrong in what I do?


Do you have a short piece of code that shows this behaviour ? A 
standalone program would be best.



To be able to run at 96 kHz with fft1 transfer it seems
I will have to use at least 6 sockets in parallel with
different port numbers.


That should never happen.


Trying to find some hints on the Internet I came across this:
http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/IBM/usr/share/man/info/en_US/a_doc_lib/aixbman/prftungd/2365c93.htm

It seems to indicate that I should send much larger packets???
(a multiple of 4096 bytes 'header' included)


That's for a completely different operating system, and has nothing 
to do with the way Linux works.


JDB.
--
Riddoch's Myth of computing:
Any computer problem is invariably the fault of the closest
sysadmin.




[linrad] Re: Network standards for SDR

2007-01-05 Thread J.D. Bakker

 > As for the ReceiveData() function, that line can be directly replaced
 > with your recvfrom() call (I just was too lazy to look up recvfrom()
 > when I wrote that example).

Well, if you care to write the full code you will find that
this statement is not quite what it looks like.


It's not much different. This is what a modified version of your 
program looks like:


typedef struct {
  short header_len;  // This field is always present, and always first
  short data_len;    // This field is always present, and always second
  [ header contents, including header version, type, etc. goes here ]
  char data[NET_MULTICAST_PAYLOAD];
} NET_RX_STRUCT;

NET_RX_STRUCT msg;
rxin_char=(void*)(&timf1_char[timf1p_pa]);
timf1p_pa=(timf1p_pa+ad_read_bytes)&timf1_bytemask;
for(j=0; j [ rest of the for() header and any statements before the memcpy were lost in the archive ]
  {
  memcpy(&rxin_char[j], ((char *)&msg) + msg.header_len,
         NET_MULTICAST_PAYLOAD);
  }


For the time being I kept the data size constant, to not mix the 
issues (variable data size adds 5-6 lines). And yes, there is a 
memcpy. But read below...


By the way, there appears to be an inconsistency in your program. Here:

  timf1p_pa=(timf1p_pa+ad_read_bytes)&timf1_bytemask;

you make sure that the address pointer for the circular buffer wraps 
around, but I see no such protection in the for() loop. Or am I 
missing something ?
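
If the loop can indeed run past the end of the circular buffer, one way to
guard against it is to split the copy at the wrap point. A minimal sketch,
assuming the buffer size is a power of two (as the & timf1_bytemask suggests);
ring_write() and its arguments are hypothetical names, not linrad code:

```c
#include <stddef.h>
#include <string.h>

/* Copies len bytes from src into a circular buffer of mask + 1 bytes
 * (a power of two), starting at offset pos (pos <= mask). The copy is
 * split in two when it would run past the end of the buffer.
 * Requires len <= mask + 1. Returns the new write offset. */
size_t ring_write(char *ring, size_t mask, size_t pos,
                  const void *src, size_t len)
{
    size_t size = mask + 1;
    size_t first = size - pos;          /* room before the wrap point */

    if (first > len)
        first = len;
    memcpy(ring + pos, src, first);
    /* Whatever did not fit continues at the start of the buffer. */
    memcpy(ring, (const char *)src + first, len - first);
    return (pos + len) & mask;
}
```

If the packet size always divides the buffer size and the write pointer only
ever advances by whole packets, the second memcpy never copies anything and
the plain version is already safe.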



 > even if it did, it matters very little on modern CPUs (packets this
 > size will remain entirely within the CPU cache).

Linrad is intended to run on elderly computers and it is also
intended to run at much higher bandwidths on modern ones.
You suggest that the data is put into a buffer to which a
pointer is returned by ReceiveData. The next step would be to
store the payload into a circular buffer. This will cause the data
to become written into memory twice.


An extra memcpy makes little difference in this loop. Note that the 
recvfrom() needs to do the equivalent of a memcpy() anyway. I wrote a 
little test program (see the bottom of this mail) to test the speed 
difference between 1 and 2 copy instructions if the destination 
buffer is larger than the cache.


On a Pentium MMX 166MHz, a Thinkpad laptop with X running, I get:

Single copy: 1000 loops in 129.44 seconds, or 79.11 MiBps.
Double copy: 1000 loops in 147.21 seconds, or 69.56 MiBps.

The first copy takes about two cycles per byte; adding a second copy 
adds less than 0.3 cycles per byte.


On a Pentium II 350MHz (rescued from the garbage a month ago):

Single copy: 1000 loops in 55.76 seconds, or 183.64 MiBps.
Double copy: 1000 loops in 66.36 seconds, or 154.30 MiBps.

The ratio is similar: first copy just under 2 cycles/byte, second 
copy adds 0.36 cycles/byte.


Your scenario will likely be even closer, since the kernel will need 
to read the UDP datagrams from main memory, too.
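
The test program referred to above is not part of this archive. A rough
sketch of such a measurement, not the original code, might look like the
following; time_copies(), PACKET_BYTES and the staging buffer are assumptions
chosen for illustration:

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define PACKET_BYTES 1024   /* assumed packet payload size */

/* Fills a destination buffer of dst_bytes packet by packet, `loops`
 * times. With double_copy set, each packet goes through a small
 * staging buffer first. Returns CPU seconds used, or -1.0 on error. */
double time_copies(size_t dst_bytes, int loops, int double_copy)
{
    /* Calling memcpy through a volatile pointer keeps the compiler
     * from optimising the copies away. */
    void *(*volatile copy)(void *, const void *, size_t) = memcpy;
    static char src[PACKET_BYTES];
    char staging[PACKET_BYTES];
    char *dst;
    size_t off;
    clock_t t0, t1;
    int i;

    if (dst_bytes < PACKET_BYTES || loops <= 0)
        return -1.0;
    dst = malloc(dst_bytes);
    if (dst == NULL)
        return -1.0;

    t0 = clock();
    for (i = 0; i < loops; i++) {
        for (off = 0; off + PACKET_BYTES <= dst_bytes; off += PACKET_BYTES) {
            if (double_copy) {
                copy(staging, src, PACKET_BYTES);
                copy(dst + off, staging, PACKET_BYTES);
            } else {
                copy(dst + off, src, PACKET_BYTES);
            }
        }
    }
    t1 = clock();
    free(dst);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Comparing time_copies(64u << 20, 100, 0) with time_copies(64u << 20, 100, 1)
gives the cost of the extra copy; the destination size should be chosen well
above the cache size of the machine under test. Cycles per byte follow as
clock frequency divided by bytes per second, e.g. 166e6 / (79.11 * 1048576)
is about 2.0 for the single-copy figure above.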



 Processing of the data is
in another thread that requires hundreds of packages in the circular
buffer. It will fetch its input from memory because other
threads have been using the cache in the meantime.


Is there any way at all that you can avoid that, and process the data 
as it comes in ? My first big multi-threaded program was a real-time 
streaming video encoder for a quad Pentium Pro machine, and switching 
processing from a frame at a time to a macroblock (16x16pixels) at a 
time sped the encoder up tremendously, even though the required 
number of operations almost doubled.



 > Zero-copy architectures make sense for hi-speed packet switching on
 > slow computers; as soon as you add any processing on the data, that
 > single extra copy gets lost in the noise. Cache line/block alignment
 > is much more important for performance.

Actually this is not in agreement with my observations. It does depend
on how efficiently "processing" is done.


In some cases, yes. I've re-written a fixed-point FFT for ARM so that 
reading the first word would trigger the loading of a full cache 
line, so that the FFT would never have to wait for its data. But even 
that would get lost in the noise once you actually started processing 
the data.



The most demanding task is the full bandwidth, full dynamic
range FFT. It would be identical in all computers and it does
not make any sense to do it in more than one computer.


Why ? Because this one computer would be much faster than the others ?


 > Do you want to have an exact, synchronized display on multiple
 > machines ?

This would be the case also if raw data were used.


[snip]


The "innocent" slave does not have to know that a data stream is
"cooked". It can be processed as if it were raw data, but a clever
slave can make use of complex information that it might want to
ask for. If you want to compute the noise floor power density
you want to know what percentage of samples were blanked
out because of noise pulses for example. Normally one would not care
at all.


So would it be correct to say that:

(a) if all comp

[linrad] Re: Network standards for SDR

2007-01-05 Thread J.D. Bakker

 > This is still easy to parse, since all a user needs to do is something like


NET_RX_STRUCT *rx_packet;
char *my_data;
short i;

rx_packet = ReceiveData();
my_data = ((char *) rx_packet) + rx_packet->header_len;
for(i = 0; i < rx_packet->data_len; i++)
  DoSomethingWithMyData(my_data[i]);

Yes, but these modern ways of writing scare off all my
friends who can use old-fashioned C but not C++.

First of all ReceiveData() has to be written, separate
buffers of size NET_RX_STRUCT have to be allocated and
managed etc. I do not currently have such code and I suspect
it involves needless copy operations. I am looking
for bandwidths of 2 MHz and above (for VHF noise blanking
to remove static rain) so needless copy - probably
up and down to main memory is something I want to avoid.


I've never written a single line of C++ in my life. As for the 
ReceiveData() function, that line can be directly replaced with your 
recvfrom() call (I just was too lazy to look up recvfrom() when I 
wrote that example). I don't see that having a header in front of the 
package needs more copy calls than one after the package; even if it 
did, it matters very little on modern CPUs (packets this size will 
remain entirely within the CPU cache). Zero-copy architectures make 
sense for hi-speed packet switching on slow computers; as soon as you 
add any processing on the data, that single extra copy gets lost in 
the noise. Cache line/block alignment is much more important for 
performance.
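
To illustrate the point about copies: recvfrom() can write the whole datagram,
header included, into the caller's buffer, and the payload is then just a
pointer offset into it. A sketch under assumptions (receive_payload() and
struct pkt_prefix are invented names; both length fields are taken to be in
the sender's byte order, which would need fixing for mixed-endian networks):

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* The two fields that are always present and always first. */
struct pkt_prefix {
    short header_len;   /* bytes from packet start to first data byte */
    short data_len;     /* payload bytes following the header */
};

/* Receives one datagram into buf and returns a pointer to its payload
 * inside buf, or NULL if the datagram is too short or its length
 * fields are inconsistent. No payload bytes are copied here. */
const char *receive_payload(int fd, char *buf, size_t buflen,
                            size_t *data_len)
{
    struct pkt_prefix p;
    ssize_t n = recvfrom(fd, buf, buflen, 0, NULL, NULL);

    if (n < (ssize_t)sizeof p)
        return NULL;
    memcpy(&p, buf, sizeof p);   /* 4 bytes; sidesteps alignment issues */
    if (p.header_len < (short)sizeof p || p.data_len < 0 ||
        (size_t)p.header_len + (size_t)p.data_len > (size_t)n)
        return NULL;
    *data_len = (size_t)p.data_len;
    return buf + p.header_len;
}
```

A caller that wants the samples in a circular buffer still needs one copy
from the returned pointer, the same as with a trailer-style packet.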



 > This is enough for basic decoding of any stream, no ? Even the center
 > frequency can be seen as superfluous (since it's only for display and
 > not strictly needed for decoding).

The primary usage of the Linrad network was for the second operator
in a contest station. It is an obvious advantage that the display
is always correct - particularly if several bands are monitored
simultaneously.


 I cannot imagine what systems would evolve over the coming five years
 that couldn't fit in this framework.

Linrad can also send data in the frequency domain and there is
quite a lot of info that a slave will need. Admittedly those
formats are likely to be used by Linrad only but they carry
many more complications.


OK, I see.

I was assuming that the multicast connection would be used for 
distributing raw data only. Re-reading your earlier posts it looks 
like you want to be able to send both raw *and* cooked (processed) 
data. Why ? Do you expect the slaves to be much slower than the 
master ? Do you want to have an exact, synchronized display on 
multiple machines ?


Looking at other network protocols (especially streaming), it has 
historically been a bad idea to combine multiple modes into one 
protocol, for reasons of maintainability, performance and clarity. 
Linrad is your code, and it's completely up to you, but might I 
suggest you consider either splitting the transmission modes in raw 
and cooked, or (better) multicasting only raw, unprocessed data and 
sending all filter parameters over a separate channel ?



 > [about an ADC-to-Ethernet]
Probably it would be better to connect it to a socket on a
dedicated ethernet port on one computer, the one which has
the controls for the radio hardware connected to this soundcard.
The master wants 100% reliable data because I assume you do not
want to put the master system clock on the audio-to-Ethernet
converter.


Why not ? It has a GPS-controlled OCXO to synchronize all sampling 
clocks and to keep time, isn't that sufficient ?



Are you aware of any standard format for streaming unprocessed
audio data?


At my previous job we had a few, but they were paper-only. There is 
MADI (digital audio) and SMPTE-259M (digital video) over ATM, but 
that doesn't quite apply here. I believe the AES have some, but those 
are for-pay documents, and I'm not an AES member anymore.



What data format were you contemplating before this discussion
started?


Pretty much what I described above: header with header length, data 
length, number of channels, sample size, sample rate, timestamp and a 
few descriptive fields (with a version field, so that -- if truly 
necessary -- upgrades are possible). Nothing that isn't strictly 
required: less is more.
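
As a rough illustration only, such a header could be serialized as below. The
field names, widths and ordering are assumptions made for this sketch (the
thread does not fix them), and header_pack() / header_unpack() are
hypothetical helpers. All multi-byte fields go on the wire in network byte
order via htons() / htonl().

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; names, widths and order are assumptions. */
struct stream_header {
    uint16_t header_len;    /* bytes from packet start to first sample */
    uint16_t data_len;      /* payload bytes following the header */
    uint8_t  version;
    uint8_t  channels;
    uint16_t sample_bits;
    uint32_t sample_rate;   /* Hz */
    uint32_t time_sec;      /* seconds since the Unix epoch */
    uint32_t time_usec;
};

#define STREAM_HEADER_BYTES 20   /* size of the fields above on the wire */

static void put16(unsigned char *p, uint16_t v) { v = htons(v); memcpy(p, &v, 2); }
static void put32(unsigned char *p, uint32_t v) { v = htonl(v); memcpy(p, &v, 4); }
static uint16_t get16(const unsigned char *p) { uint16_t v; memcpy(&v, p, 2); return ntohs(v); }
static uint32_t get32(const unsigned char *p) { uint32_t v; memcpy(&v, p, 4); return ntohl(v); }

/* Writes the header into the first STREAM_HEADER_BYTES of out. */
void header_pack(unsigned char *out, const struct stream_header *h)
{
    put16(out + 0, h->header_len);
    put16(out + 2, h->data_len);
    out[4] = h->version;
    out[5] = h->channels;
    put16(out + 6, h->sample_bits);
    put32(out + 8, h->sample_rate);
    put32(out + 12, h->time_sec);
    put32(out + 16, h->time_usec);
}

/* Parses a received packet of n bytes. Returns 0 on success, -1 if the
 * packet is too short or its length fields do not fit within n. */
int header_unpack(struct stream_header *h, const unsigned char *in, size_t n)
{
    if (n < STREAM_HEADER_BYTES)
        return -1;
    h->header_len  = get16(in + 0);
    h->data_len    = get16(in + 2);
    h->version     = in[4];
    h->channels    = in[5];
    h->sample_bits = get16(in + 6);
    h->sample_rate = get32(in + 8);
    h->time_sec    = get32(in + 12);
    h->time_usec   = get32(in + 16);
    if (h->header_len < STREAM_HEADER_BYTES ||
        (size_t)h->header_len + h->data_len > n)
        return -1;
    return 0;
}
```

Because a reader finds the samples through header_len rather than a fixed
offset, a later version can append fields without breaking old receivers.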


It's what everybody else does. I know that that's not much of an 
argument ('50 million Elvis fans can't be wrong'), but in 15 years of 
working on network protocols, this is pretty much the only way that 
I've seen working reliably for successful sampled AV or radio 
projects (I've seen a similar system used on an antenna array for 
MIMO trials). Conversely, I have never ever seen a combined 
raw/cooked protocol that worked, or better: that remained working. Or 
it evolved into something like WAV: a historical accident that 
everyone loves to hate.


JDB.
--
In protocol design, perfection has been reached not when there is 
nothing left to add, but when there is nothing left to take away.

   --

[linrad] Re: Network standards for SDR

2007-01-04 Thread J.D. Bakker

Leif and all,


Would you agree on milliseconds since midnight? From JDB I
learned that a double with seconds since Unix epoch would be a bad
idea since conversion may be difficult on non-PC platforms. (It is
the internal time format within Linrad however)


Yes, milliseconds since  UTC would be OK.  Maybe you should send 
BOTH this quantity AND a double with seconds since Unix epoch (which 
I would actually prefer).  I don't see the conversion issue as a big 
deal; little-endian to big-endian conversion is trivial, and 
doesn't nearly everybody use IEEE floating point these days?


Pretty hard to fit in a 256-cell CPLD.

JDB.
--
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/




[linrad] Re: Network standards for SDR

2007-01-04 Thread J.D. Bakker

The newcomer who wants to write his own software does not have
to know anything about the header, he can just use the 1024
bytes of data and ignore whatever has been appended. Having a
header which has to be properly decoded in order to extract
the data builds a threshold that makes it more difficult to
get started. (Processing simple .wav files has a pretty high
threshold in decoding the header. Common practice between
amateurs has been to just discard the header and read the actual
data using the information supplied with the file. Those
who worked with the UNKN422 challenge were advised to do
so for example.)


WAV is an example of a file format where *everyone* added their own 
custom headers/chunks, without any planning. As a result, no program 
can read all existing WAV files; WAV is considered an example of how 
*not* to design a file format.


Would you consider a very simple, easy to parse header like this:

typedef struct {
  short header_len;  // This field is always present, and always first
  short data_len;    // This field is always present, and always second
  [ header contents, including header version, type, etc. goes here ]
  char data[];
} NET_RX_STRUCT;

This is still easy to parse, since all a user needs to do is something like

  NET_RX_STRUCT *rx_packet;
  char *my_data;
  short i;

  rx_packet = ReceiveData();
  my_data = ((char *) rx_packet) + rx_packet->header_len;
  for(i = 0; i < rx_packet->data_len; i++)
    DoSomethingWithMyData(my_data[i]);



 > That, too, makes it harder for dedicated hardware receivers; ideally
 > these would not need _any_ communication from the slave to the
 > master. As I see it, encoding this information in the header of each
 > package is a low-overhead way to reduce ambiguity, too.

The problem is that there are so many possibilities. I do not
want to invent a complicated scheme for describing the myriad
of things I can imagine now only to discover in a few years that
something entirely different has evolved.


I would suggest keeping it extremely simple. There is not very much 
information that varies between sampled systems:


- sample size
- sample rate
- number of channels (could even be fixed to 'always I/Q')

and, for radio systems,

- center frequency

This is enough for basic decoding of any stream, no ? Even the center 
frequency can be seen as superfluous (since it's only for display and 
not strictly needed for decoding).


I cannot imagine what systems would evolve over the coming five years 
that couldn't fit in this framework.


At 17:47 +0100 04-01-2007, Leif Asbrink wrote (in another mail):

Would you agree on milliseconds since midnight? From JDB I
learned that a double with seconds since Unix epoch
would be a bad idea since conversion may be difficult on
non-PC platforms. (It is the internal time format within
Linrad however)


I would use the same interface that gettimeofday() uses: a long with 
seconds since the Epoch (Jan 1 1970), and a long with microseconds.
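
A small sketch of what that could look like in a packet, and how the
milliseconds-since-midnight value asked about above falls out of it. The
struct and function names are hypothetical, and fixed 32-bit fields are an
assumption made here so that the layout does not depend on the size of long:

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical packet timestamp: same split as struct timeval. */
struct pkt_time {
    uint32_t sec;    /* seconds since the Epoch (Jan 1 1970, UTC) */
    uint32_t usec;   /* microseconds within that second */
};

/* Fills *t from the system clock. Returns 0 on success, -1 on error. */
int pkt_time_now(struct pkt_time *t)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    t->sec = (uint32_t)tv.tv_sec;
    t->usec = (uint32_t)tv.tv_usec;
    return 0;
}

/* Milliseconds since midnight UTC, derived from the same timestamp
 * using integer arithmetic only. */
uint32_t pkt_time_ms_of_day(const struct pkt_time *t)
{
    return (t->sec % 86400u) * 1000u + t->usec / 1000u;
}
```

Both fields are plain integers, so a receiver that prefers a double can
compute sec + usec / 1e6 itself, while small hardware never has to touch
floating point.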



The formats I intend to use within Linrad will use IA32 little endian
(as well as IA32 float) I have no intention to make Linrad portable
to other platforms and I am pretty sure I will not change my mind
on this point for the next 5 years or more. Probably never.


OK, that's fine, so please document this somewhere so those of us on 
non-IA32 can deal with it.


As an example: I'm currently soldering the prototype of an 
audio-to-Ethernet converter as part of a portable hard disk recorder. 
This design uses the CS5381 ADC, one of the best professional audio 
converters on the market with a dynamic range approaching 120dB. This 
is an open-hardware system[1], and with a few modifications I could 
see it being usable for Linrad. A lot of the limitations (time jitter 
on the system clock etc) that are present on a PC platform simply do 
not appear for such a dedicated device. How would you like me to 
interface such a system to Linrad ? Should it be able to act as a 
Linrad master ?


JDB
[1] Converter schematics are here:
http://www.lartmaker.nl/recbox-adc-cs5381-main.png
http://www.lartmaker.nl/recbox-adc-cs5381-power.png
http://www.lartmaker.nl/recbox-adc-cs5381.pdf
--
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/




[linrad] Re: Network standards for SDR

2007-01-03 Thread J.D. Bakker

A general point: in virtually all communications protocols the 
(descriptive) header comes before the data block, since the receiver 
usually needs to decode the header to be sure what to do with the 
data. This also makes it possible to vary the length of the data 
block, if desired (for instance, to tune to FFT block sizes or 
sampling hardware word length).



 > - I would want a timestamp in there somewhere. It might be derived
 > from block_no, but why not make it explicit ?

I do not see what it would be good for. Why do you want the clock
from the master while there is another one in the slave?


Array processing. It would be very useful for a situation where you 
have multiple masters on one network (either during a contest, or -in 
my case- with a few servers each connected to an antenna+receiver). 
Time sync is not hard over either GPS/TAC or ntp.


Even in one-master situations it could be useful: with timestamps, it 
is very easy to make something similar to the Time Machine.



 > - how is the sampling rate communicated ?
The slave(client) asks the server for the meaning of the data.
Number of channels, nominal sampling rate, whether the format is
real or complex etc.


That, too, makes it harder for dedicated hardware receivers; ideally 
these would not need _any_ communication from the slave to the 
master. As I see it, encoding this information in the header of each 
package is a low-overhead way to reduce ambiguity, too.



 > - if you are not doing so already, please please _please_ use the
 > functions htons() / ntohs() and friends to convert between host byte
 > order and network byte order (or forever determine that linrad
 > communicates with either little endian (IA32) or big endian (SPARC,
 > PowerPC etc) byte order). I would want to be able to use a PC as the
 > server and my PowerBook as the client, for instance.

I do not see how it matters. Linrad does not put port numbers or
addresses in the packages, that is done by the operating system
and the inner workings of Linrad is not visible from the network.


Byte ordering is not restricted to port numbers or addresses. Every 
time you put an integer which is larger than one byte into a packet, 
the transmitter and receiver need to agree on the byte order. See


http://en.wikipedia.org/wiki/Endianness

for details. Taking my example, if the master runs on an Intel 
machine and the slave on my PowerBook, if the master transmits a 
block_no of 0x01020304, my PowerBook will see that as 0x04030201. Not 
good.
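
The effect can be reproduced on any machine by decoding the same four wire
bytes both ways. A minimal sketch; read_le32() and read_be32() are
illustrative helpers, not linrad functions:

```c
#include <stdint.h>

/* Decodes 4 wire bytes as a 32-bit value, least significant byte
 * first (the order an IA32 sender produces when it copies an int
 * straight into the packet). */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decodes the same bytes most significant byte first (what a
 * big-endian receiver does when it loads them as a native int). */
uint32_t read_be32(const unsigned char *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] << 8 | (uint32_t)p[3];
}
```

For the bytes 04 03 02 01, which is how an Intel master stores 0x01020304,
read_le32() returns 0x01020304 and read_be32() returns 0x04030201. Decoding
with explicit shifts like this works regardless of the host's own byte order,
as long as both ends agree on which of the two functions applies.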


JDB.
--
Years from now, if you are doing something quick and dirty,
you imagine that I am looking over your shoulder and say to
yourself, "Dijkstra would not like this," well that would be
immortality for me.  -- Edsger Dijkstra, 1930 - 2002




[linrad] Re: Network standards for SDR

2007-01-03 Thread J.D. Bakker

// Structure for multicasting receive data on the network.
#define NET_MULTICAST_PAYLOAD 1024
typedef struct {
char buf[NET_MULTICAST_PAYLOAD];
double passband_center;
float userx_freq;
unsigned int block_no;
unsigned char userx_no;
char passband_direction;
} NET_RX_STRUCT;


Very interesting ! A couple of observations:

- I would want a timestamp in there somewhere. It might be derived 
from  block_no, but why not make it explicit ?

- how is the sampling rate communicated ?
- using float/double makes it much harder for dedicated hardware 
receivers to act as server.
- if you are not doing so already, please please _please_ use the 
functions htons() / ntohs() and friends to convert between host byte 
order and network byte order (or forever determine that linrad 
communicates with either little endian (IA32) or big endian (SPARC, 
PowerPC etc) byte order). I would want to be able to use a PC as the 
server and my PowerBook as the client, for instance.


JDB.
--
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/




[linrad] Re: complete answer from Roger

2006-11-08 Thread J.D. Bakker

 > - it is just like opening a "regular" socket with a group target
 > address. See
 > http://www.cs.unc.edu/~jeffay/dirt/FAQ/comp249-001-F99/mcast-socket.html
 > or the bible of Stevens


My problem is that this is far too cryptic.

I could spend a lot of time searching the net, but maybe someone
can point me to something a little more novice oriented.
I do not have "the bible of Stevens".


A little Googling produces:

http://jungla.dit.upm.es/~jmseyas/linux/mcast.lj/mcast-lj.html


...the original of which appears to be here:

http://www.linuxjournal.com/article/3041

JDB
[ta-ta-ta-talking to myself]
--
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/




[linrad] Re: complete answer from Roger

2006-11-08 Thread J.D. Bakker

 > - it is just like opening a "regular" socket with a group target
 > address. See
 > http://www.cs.unc.edu/~jeffay/dirt/FAQ/comp249-001-F99/mcast-socket.html
 > or the bible of Stevens


My problem is that this is far too cryptic.

I could spend a lot of time searching the net, but maybe someone
can point me to something a little more novice oriented.
I do not have "the bible of Stevens".


A little Googling produces:

http://jungla.dit.upm.es/~jmseyas/linux/mcast.lj/mcast-lj.html
http://www.linuxjunkies.org/html/Multicast-HOWTO.html#s6
http://www.wlug.org.nz/SourceSpecificMulticastExample

(easiest one first)

HTH,

JDB
[looking into bolting Ethernet+multicasting onto an audio ADC]
--
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/
