Wow, Greg,

great insight! I've never actually played around with software like
Hadoop, but always planned to (on a small scale – the number of nodes in
my very heterogeneous "home/office cluster" would be an insane 5).

I really think we should be harnessing the power of big data tools in
SDR quite a bit more. I totally agree: a lot of SDR folks have a lot of
data to deal with in the first place, and they might really want to
filter that down, in a distributed fashion, to a data stream that is
actually worth storing. In most applications, you don't care about the
original IQ data once you've extracted the features you wanted, be it
the presence of e.g. aircraft transponders, interference, radar-detected
meteors or ocean waves, atmospheric data, or whatever you're doing SDR
for in the first place.
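
To make "filtering down to what's worth storing" a bit more concrete, here is a minimal sketch of one possible reduction: keep only the blocks whose mean power exceeds a threshold. The function names, block length, threshold and the synthetic test signal are all assumptions for illustration; a real application would pick a detector that matches the feature of interest.

```python
import cmath
import math
import random

def block_powers(samples, block_len):
    """Mean power |x|^2 of each consecutive, complete block of block_len samples."""
    powers = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        powers.append(sum(abs(x) ** 2 for x in block) / block_len)
    return powers

def detect_events(samples, block_len, threshold):
    """Return (block_index, power) only for blocks whose mean power exceeds threshold."""
    return [(i, p) for i, p in enumerate(block_powers(samples, block_len))
            if p > threshold]

# Synthetic stand-in for a recording: weak noise plus one strong tone burst.
random.seed(0)
iq = [complex(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(8192)]
for n in range(4096, 4096 + 1024):
    iq[n] += cmath.exp(2j * math.pi * 0.1 * n)  # unit-amplitude burst in block 4

events = detect_events(iq, block_len=1024, threshold=0.5)
print(len(iq), "samples in,", len(events), "event(s) out")
```

Only the event list would need to be stored; the raw samples of the quiet blocks can be dropped.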

> I have always thought HDFS + Gnuradio are destined for each other.  It
> may be a bit early for this with today's hardware; Mr. Moore is
> helping us along just fine, so is AWS.  

Hm, do you *think* there are reasonably easy and yet interesting
applications that a student could implement during a Summer of Code? As
far as I understand, HDFS is more or less the storage subsystem for
Hadoop; maybe that's not the only thing worth considering in the context
of SDR here. Is there an interesting application of the MapReduce
paradigm that you'd love to see combined with GNU Radio?

Best regards,
Marcus

On 01/21/2017 05:33 PM, Gregory Ratcliff wrote:
> I spend my working hours on big data and Hadoop.
> It occurs to me you really need to be thinking about something outside
> of a normal file system.  HDFS lets you write out data in chunks that
> you later combine when you have time.  There are some really (really)
> fast implementation projects that write to HDFS.  Most of the new work
> is in Java, but I think you are asking for something pretty light.
>
> I can visualize a "gatherer" for RF and a "filer" in HDFS that writes
> out xx MB chunks every period.  Now as others have said, you don't
> just slap some stuff together, you will need to optimize the
> integration points and think about the best caching and write speeds
> of the "filer" system and the persistent storage.
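
A rough sketch of the "filer" half of that idea, writing fixed-size chunks to plain local files rather than HDFS. `ChunkedWriter`, the file naming scheme and the tiny chunk size are hypothetical and only meant to show the rotation logic; getting finished chunks into HDFS would be a separate step that is not shown here.

```python
import os
import tempfile

class ChunkedWriter:
    """Append bytes to numbered chunk files, starting a new file once a chunk is full."""

    def __init__(self, directory, chunk_bytes, prefix="iq"):
        self.directory = directory
        self.chunk_bytes = chunk_bytes
        self.prefix = prefix
        self.index = 0        # number of the next chunk file to create
        self.written = 0      # bytes already in the currently open chunk
        self.file = None

    def _open_next(self):
        if self.file is not None:
            self.file.close()
        name = "%s_%06d.bin" % (self.prefix, self.index)
        self.file = open(os.path.join(self.directory, name), "wb")
        self.index += 1
        self.written = 0

    def write(self, data):
        view = memoryview(data)
        while len(view) > 0:
            if self.file is None or self.written == self.chunk_bytes:
                self._open_next()
            room = self.chunk_bytes - self.written
            n = min(room, len(view))
            self.file.write(view[:n])
            self.written += n
            view = view[n:]

    def close(self):
        if self.file is not None:
            self.file.close()
            self.file = None

outdir = tempfile.mkdtemp()
writer = ChunkedWriter(outdir, chunk_bytes=1000)
for _ in range(5):
    writer.write(bytes(500))  # stand-in for buffers arriving from the "gatherer"
writer.close()

chunks = sorted(os.listdir(outdir))
sizes = [os.path.getsize(os.path.join(outdir, c)) for c in chunks]
print(chunks, sizes)
```

Once a chunk file is closed, it can be processed, moved or deleted while recording continues into the next one.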
>
> Likewise, there are plenty of Apache tools that will recombine the
> HDFS chunks back into files of arbitrary size, which you can then
> analyze later with GNU Radio, when time doesn't matter as much.  You
> might not need much of Hadoop other than the file system and some tools.
>
> I have always thought HDFS + Gnuradio are destined for each other.  It
> may be a bit early for this with today's hardware; Mr. Moore is
> helping us along just fine, so is AWS.  
>
> Greg
> Nz8r
>
> On Jan 20, 2017, at 2:46 PM, Marcus Müller wrote:
>
>> I can assure you that 32 GHz is not your sampling rate. Do you mean
>> 32 MHz?
>>
>> The problem here is that at first, your operating system can be smart
>> and cache write accesses to files on mass storage devices in RAM (or
>> you use a RAM disk, so everything happens in RAM). But at some point,
>> RAM is going to run out, and then your recording speed is
>> effectively limited by how fast you can write to your storage (in
>> case of a RAM disk, you simply run full, or your OS starts
>> "swapping", i.e. writing RAM to storage, which is the same problem).
>>
>> So, unless you find a way to *reduce* the amount of data you want to
>> record, or simply buy a faster storage system, there's not much you
>> can do.
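
A back-of-the-envelope sketch of that limit. The 8 bytes per complex sample match the figure used elsewhere in this thread; the 16 GB of cache and the 150 MB/s sustained disk rate are assumed numbers, not measurements of anyone's system.

```python
def data_rate_bytes_per_s(sample_rate_hz, bytes_per_sample=8):
    """Bytes per second produced by a stream of complex samples."""
    return sample_rate_hz * bytes_per_sample

def seconds_until_full(buffer_bytes, in_rate, out_rate):
    """Time until a buffer overflows when data arrives at in_rate and drains
    at out_rate (both in bytes/s). Returns None if the buffer never fills."""
    net = in_rate - out_rate
    if net <= 0:
        return None
    return buffer_bytes / net

rate = data_rate_bytes_per_s(32e6)  # 32 MS/s of complex samples
ram = 16e9                          # assumed: 16 GB usable as write cache
disk = 150e6                        # assumed: 150 MB/s sustained write speed

t = seconds_until_full(ram, rate, disk)
print("input rate: %.0f MB/s" % (rate / 1e6))
print("cache lasts about %.0f s" % t)
```

With these numbers the cache only postpones the overflow by a few minutes; only a storage rate at or above the input rate, or less data, changes the outcome.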
>>
>>
>> Best regards,
>>
>> Marcus
>>
>>
>> On 01/20/2017 08:42 PM, Mallesham Dasari wrote:
>>> Hi Marcus,
>>>
>>> Thanks for the quick response. I am recording the FFT samples
>>> continuously, but I get overflows after some time, once the file
>>> has become huge. My sample rate is high (32GHz), so writing to the
>>> file takes very long and usrp_spectrum_sense overflows.
>>>
>>> On Fri, Jan 20, 2017 at 2:33 PM, Marcus Müller wrote:
>>>
>>>     Hello Mallesham,
>>>
>>>     I'm afraid not: to my current understanding, what you want is
>>>     mathematically impossible. Either you want a lot of data (and
>>>     that seems to be the case, since you want to record 24h of raw
>>>     IQ data), or you want to be able to store it in the comparatively
>>>     little RAM that modern computers have. You can't have both.
>>>
>>>     Maybe, however, we haven't fully understood the problem. Can
>>>     you, mathematically, define what you want to observe and record?
>>>
>>>     Best regards,
>>>
>>>     Marcus
>>>
>>>
>>>
>>>     On 01/20/2017 08:28 PM, Mallesham Dasari wrote:
>>>>     Hello everyone,
>>>>
>>>>     Can anyone give some solution for this? Even writing to the
>>>>     ramdisk is not enough for running the flow graph for so long. I
>>>>     am facing the same issue.  
>>>>
>>>>     Thank you!
>>>>
>>>>     On Thu, Jan 12, 2017 at 5:41 PM, Hasini Abeywickrama wrote:
>>>>
>>>>         Hi all,
>>>>
>>>>         Thank you very much for the informative responses.
>>>>
>>>>         My requirement is to run the flowgraph for a long time
>>>>         (ideally 24 hours) and store the FFT data in memory
>>>>         (ramdisk) so that it can be processed later or in chunks,
>>>>         not everything at the same time.
>>>>
>>>>         So far, I have increased the size of the ramdisk and it
>>>>         works fine for a few hours. But it is still not the
>>>>         solution I'm looking for.
>>>>
>>>>         Regards,
>>>>         Hasini
>>>>
>>>>         On Thu, Jan 12, 2017 at 8:30 PM, Marcus Müller wrote:
>>>>
>>>>             But if you do a single 1024-FFT, you'd only operate on
>>>>             1024 of the input samples!
>>>>
>>>>             And: the FFT doesn't just give you power values, but
>>>>             complex values; mathematically, the FFT is a DFT, and
>>>>             the DFT is an invertible linear operator
>>>>             $\mathcal F$:
>>>>
>>>>             $\mathcal F: \mathbb C^N \to \mathbb C^N$
>>>>
>>>>             which maps complex vectors to complex vectors of size
>>>>             $N$.  It is, in fact, representable
>>>>             as a square matrix with column (and row) vectors being
>>>>             samples of the orthogonal complex sinusoids
>>>>             $e^{j\frac{2\pi}N nk},\, k=0,\ldots,N-1$; that is, it
>>>>             can also be understood as a /base change matrix/ that
>>>>             just represents the "input vector" with respect to a
>>>>             different, orthogonal basis.
>>>>             In the physical sense: the input vector was
>>>>             represented in the standard basis $\mathbf e_N$,
>>>>             meaning that each basis vector represents a single point
>>>>             in time, the sample time of the respective entry; the
>>>>             "output" of the transform is represented in a basis of
>>>>             orthogonal frequencies. This is an invertible operation,
>>>>             really just another way to look at *the same signal*.
>>>>             I think this is really important to keep in mind:
>>>>
>>>>             The Fourier transforms are /not/ magical by any means.
>>>>             What they do is represent *the same signal* from a
>>>>             different point of view. It can be /interpreted/ as a
>>>>             transform between time and frequency domain (or position
>>>>             and momentum, or...). The DFT is still just a boring,
>>>>             old, square, invertible matrix (unitary up to a scale
>>>>             factor) that produces output of the same dimensionality
>>>>             as it takes input.
>>>>
>>>>             As you can see, the DFT/FFT itself never reduces the
>>>>             amount of data.
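
A small sketch to check this numerically: a naive O(N²) DFT and its inverse, written out directly from the definition. The sign convention (negative exponent in the forward transform, 1/N in the inverse) is the common one and is an assumption here; the test vector is arbitrary.

```python
import cmath

def dft(x):
    """Naive forward DFT: N complex values in, N complex values out."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Naive inverse DFT, undoing dft() up to rounding error."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

x = [complex(n % 3, -n) for n in range(16)]  # arbitrary complex test vector
X = dft(x)
x_back = idft(X)

max_err = max(abs(a - b) for a, b in zip(x, x_back))
print("input length:", len(x), "output length:", len(X))
print("round-trip error:", max_err)
```

The output has exactly as many complex values as the input, and the input can be recovered from it, so storing the DFT output saves nothing.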
>>>>
>>>>             What you might be referring to is some kind of PSD
>>>>             estimate, done by first taking |·|² of a lot of DFTed
>>>>             vectors and then averaging them. The data reduction here
>>>>             lies in the magnitude-square operation and the average,
>>>>             not in the DFT.
>>>>             The point here is that you're throwing away a whole lot
>>>>             of information, and I'm not convinced that's what
>>>>             Hasini needs!
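
A minimal sketch of such an averaged estimate, to show where the reduction comes from. `averaged_psd`, the FFT length of 32, the 20 blocks and the synthetic tone-plus-noise signal are assumptions for illustration; no windowing or scaling to physical units is done.

```python
import cmath
import random

def dft(x):
    """Naive forward DFT: N complex values in, N complex values out."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def averaged_psd(samples, nfft):
    """Average |DFT|^2 over consecutive blocks: num_blocks * nfft complex
    samples go in, nfft real values come out (phase and time detail are lost)."""
    num_blocks = len(samples) // nfft
    acc = [0.0] * nfft
    for b in range(num_blocks):
        X = dft(samples[b * nfft:(b + 1) * nfft])
        for k in range(nfft):
            acc[k] += abs(X[k]) ** 2
    return [a / num_blocks for a in acc]

random.seed(1)
nfft = 32
tone_bin = 5
iq = [cmath.exp(2j * cmath.pi * tone_bin * n / nfft)
      + complex(random.gauss(0, 0.1), random.gauss(0, 0.1))
      for n in range(nfft * 20)]

psd = averaged_psd(iq, nfft)
peak = max(range(nfft), key=lambda k: psd[k])
print(len(iq), "complex samples in,", len(psd), "real values out, peak at bin", peak)
```

The 20-fold reduction in the number of values comes entirely from the averaging, and discarding the phase halves the storage per value; the DFT step itself contributes nothing to either.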
>>>>
>>>>             Best regards,
>>>>
>>>>             Marcus
>>>>
>>>>
>>>>             On 12.01.2017 05:54, Mallesham Dasari wrote:
>>>>>             Hi Marcus,
>>>>>
>>>>>             Raw IQ samples take lots of memory because each sample
>>>>>             is around 8 bytes. Suppose we have a 1 Msps sample
>>>>>             rate: for just 10 minutes of data, we get 10*60*1M*8B
>>>>>             = 4.8GB of data. On the other hand, if you store just
>>>>>             the FFT with 1024 bins, we get 4.8GB/1024 power values,
>>>>>             right (which is much smaller)?
>>>>>
>>>>>             Please correct me if I am wrong.
>>>>>
>>>>>             Thanks
>>>>>
>>>>>             On Wed, Jan 11, 2017 at 7:32 AM, Marcus Müller wrote:
>>>>>
>>>>>                 Hi Mallesham,
>>>>>
>>>>>                 I don't understand – the raw IQ samples and their
>>>>>                 FFT have the same size, and data type.
>>>>>
>>>>>                 Maybe you've understood something that I (and
>>>>>                 Martin) didn't – could you elaborate?
>>>>>
>>>>>                 Best regards,
>>>>>                 Marcus
>>>>>
>>>>>
>>>>>                 On 01/11/2017 12:56 AM, Mallesham Dasari wrote:
>>>>>>                 Hi Hasini,
>>>>>>
>>>>>>                 If you are trying to print just the FFT, it
>>>>>>                 should not be an issue. If you print raw iq
>>>>>>                 samples, then you will run out of memory. By
>>>>>>                 long, you mean how long? Days?
>>>>>>
>>>>>>                 On Tue, Jan 10, 2017 at 3:16 PM, Martin Braun wrote:
>>>>>>
>>>>>>                     Hasini,
>>>>>>
>>>>>>                     can you please re-state what you're trying to
>>>>>>                     do? That might help you
>>>>>>                     get some answers. It is not quite clear
>>>>>>                     from this email.
>>>>>>
>>>>>>                     Cheers,
>>>>>>                     Martin
>>>>>>
>>>>>>
>>>>>>                     On 01/02/2017 09:16 PM, Hasini Abeywickrama
>>>>>>                     wrote:
>>>>>>                     > Hi all,
>>>>>>                     >
>>>>>>                     > I have a flowgraph that reads a signal and
>>>>>>                     > writes its FFT samples to a file. I need to
>>>>>>                     > run this continuously (for a long time),
>>>>>>                     > without running out of memory.
>>>>>>                     >
>>>>>>                     > I tried deleting the earlier FFT samples
>>>>>>                     > from the file, but that messes up reading
>>>>>>                     > the data. I also tried switching to a
>>>>>>                     > different file after some time, so that the
>>>>>>                     > initial file can be completely deleted. But
>>>>>>                     > that did not work either.
>>>>>>                     >
>>>>>>                     > What would be the best approach for this?
>>>>>>                     > Any thoughts would be very much appreciated.
>>>>>>                     >
>>>>>>                     > Regards,
>>>>>>                     > Hasini
>>>>>>                     >
>>>>>>                     >
>>>>>>                     > _______________________________________________
>>>>>>                     > Discuss-gnuradio mailing list
>>>>>>                     > https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>                 -- 
>>>>>>                 Best Regards,
>>>>>>                 *Mallesham Dasari*
>>>>>>                 Department of Computer Science
>>>>>>                 Stony Brook University
>>>>>>                 USA - 11794
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>