Hi Marcus and Kyeong,

Thanks for the suggestions, particularly for the fixed point storage.
Coming to Marcus's question: my aim is to find the optimal sampling of
spectrum sensor data. A large number of low-cost spectrum sensors are
deployed around an area to monitor the spectrum. As these sensors
generate a huge amount of data (around 100mbps, considering raw IQ) at
32 Msps (USRP B12), I want to find suitable duty cycle parameters for
this particular setup that give minimal error compared to the actual
spectrum data.
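
To put rough numbers on the data volume, here is a back-of-the-envelope sketch. It assumes 4 bytes per complex sample (16-bit I plus 16-bit Q), which is an assumption on my side and not a measured figure; the function name and the duty cycle values are purely illustrative.

```python
def raw_iq_rate_bytes_per_s(sample_rate_hz, bytes_per_sample=4, duty_cycle=1.0):
    """Average byte rate of a raw IQ stream that is only captured
    for a fraction `duty_cycle` of the time.

    bytes_per_sample=4 assumes 16-bit I + 16-bit Q (an assumption);
    use 8 for complex float32 samples.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return sample_rate_hz * bytes_per_sample * duty_cycle


# Illustrative duty cycles only; finding the right one is the open question.
for dc in (1.0, 0.5, 0.1, 0.01):
    rate = raw_iq_rate_bytes_per_s(32e6, duty_cycle=dc)
    print(f"duty cycle {dc:>5.0%}: {rate / 1e6:7.2f} MB/s")
```

With these assumptions, the average rate scales linearly with the duty cycle, so the question becomes how small the duty cycle can get before the error against the full spectrum data is no longer acceptable.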

On Sat, Jan 21, 2017 at 9:23 PM, Kyeong Su Shin <[email protected]> wrote:

> To Whom it may concern:
>
> I don't know much about distributed computing, but I also agree that it is
> the right idea to store PSD data in a dBFS (or dBm, if calibrated) scale
> fixed-point format. Microsoft Spectrum Observatory uses a 16-bit Q format
> fixed-point number to store PSD data in dBFS, and you can probably go down
> to 8 bits per point if you are happy with 0.5~1 dB resolution.
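
A minimal sketch of what such an 8-bit storage could look like, assuming a 0.5 dB step and a range of -127.5 to 0 dBFS. The floor, the step and the function names are hypothetical choices for illustration; this is not the Microsoft Spectrum Observatory format.

```python
def quantize_db(power_db, floor_db=-127.5, step_db=0.5):
    """Map a power value in dB to an unsigned 8-bit code (0..255).

    floor_db and step_db are assumed values: 256 codes at 0.5 dB
    cover -127.5 dB up to 0 dB. Values outside that range are clamped.
    """
    code = round((power_db - floor_db) / step_db)
    return max(0, min(255, code))


def dequantize_db(code, floor_db=-127.5, step_db=0.5):
    """Recover the dB value represented by an 8-bit code."""
    return floor_db + code * step_db


psd_db = [-101.3, -98.74, -63.2, -12.05, 0.0]       # made-up example values
packed = bytes(quantize_db(p) for p in psd_db)      # one byte per PSD point
restored = [dequantize_db(c) for c in packed]

for original, value in zip(psd_db, restored):
    print(f"{original:8.2f} dB -> {value:7.1f} dB")
```

Inside the covered range the rounding error stays within half a step (0.25 dB here), and each point takes one byte instead of the four bytes of a float32.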
>
> Regards,
> Kyeong Su Shin
>
> On Sat, Jan 21, 2017 at 11:04 AM, Marcus Müller <[email protected]>
> wrote:
>
>> Hi Mallesham,
>>
>> that does indeed sound interesting, but you first of all have a local
>> problem – that of data volume concentration on your single receiver node.
>> 32MS/s is already more than you can shift out through a single Gigabit
>> Ethernet connection – so either you must immediately update to more
>> datacenter-style interconnects, or you must start thinking about
>> consolidating your data where it happens. On the other hand, compared to
>> other SDR systems, a mere 32 MS/s from a single channel with a non-100%
>> duty cycle is "not that much"; I really feel like you might be running this
>> on slightly undersized hardware.
>>
>> I, again, ask you to describe what you *want* rather than what you *do* –
>> a system specification is very crucial here, and I hope that Greg agrees
>> with my opinion that the possibility to handle Big Data (whatever that is,
>> in the end) alone is not a solution to a data problem. Partitioning,
>> analyzing, reducing / compressing, filtering and discarding of data can
>> only be designed if you have a clear concept of what your target is – and
>> in the case of signal processing, much more than in many other big data
>> applications, that concept is often pretty well-known a priori.
>>
>> So, whilst I really think that you're on to something very interesting
>> here, combining distributed computing with SDR, and hope you can share a
>> lot of your insights in the future, I also really think that you should
>> start with a well-thought-out design of what you want to process and store.
>> So far, you've only told us you have "FFT data" (by which you imply
>> "spectral power estimates", which already is a reduction by half), but
>> you haven't really explained how much of it you need, or in how much
>> detail. A lot of interesting aspects might arise from that – for example,
>> if you're really after power spectra, a logarithmic storage (dB!) would
>> make a lot of sense; combining that with storing these logarithmic values
>> in a fixed-point format could easily save you another factor of two in
>> storage bandwidth – without you ever losing the "essence" of your data.
>> The way in which you
>> capture your data might, as Greg mentioned, be a key indicator of the
>> granularity in which you distribute it.
>>
>> In short: it might be helpful if you could formulate what you want to
>> *do* with your data, not only how you want to do that.
>>
>> Best regards,
>> Marcus
>>
>> On 01/21/2017 07:37 PM, Mallesham Dasari wrote:
>>
>> Dear Greg,
>>
>> Thanks for bringing this into the picture. My long-term goal is exactly
>> what you have just explained. My plan is to use Spark for storing this big
>> data and come up with data processing algorithms to monitor the spectrum
>> data. For instance, a simple case would be to find the duty cycle
>> parameters that tell me how coarse-grained my sub-sampling could be so
>> that I would not lose much of the spectrum data. Similarly, there could be
>> many applications from integrating big data and SDR platforms.
>>
>> I will share the same if I can integrate these successfully. Thanks!
>>
>> On Sat, Jan 21, 2017 at 11:33 AM, Gregory Ratcliff <[email protected]> wrote:
>>
>>> I spend my working hours on big data and Hadoop.
>>> It occurs to me you really need to be thinking about something outside
>>> of a normal file system.  HDFS lets you write out data in chunks that you
>>> later combine when you have time.  There are some really (really) fast
>>> implementation projects that write to hdfs.  Most of the new work is in
>>> java, but I think you are asking for something pretty light.
>>>
>>> I can visualize a "gatherer" for RF and a "filer" in HDFS that writes
>>> out xx MB chunks every period.  Now as others have said, you don't just
>>> slap some stuff together, you will need to optimize the integration points
>>> and think about the best caching and write speeds of the "filer" system and
>>> the persistent storage.
>>>
>>> Likewise, there are plenty of apache tools that will recombine the HDFS
>>> chunks back into files of arbitrary size.....which you can then analyze
>>> later with gnuradio...when time doesn't matter as much.  You might not need
>>> much of Hadoop other than the file system and some tools.
>>>
>>> I have always thought HDFS + Gnuradio are destined for each other.  It
>>> may be a bit early for this with today's hardware; Mr. Moore is helping us
>>> along just fine, so is AWS.
>>>
>>> Greg
>>> Nz8r
>>>
>>> On Jan 20, 2017, at 2:46 PM, Marcus Müller <[email protected]>
>>> wrote:
>>>
>>> I can assure you that 32 GHz is not your sampling rate. Do you mean 32
>>> MHz?
>>>
>>> The problem here is that at first, your operating system can be smart
>>> and cache write accesses to files on mass storage devices in RAM (or you
>>> use a RAM disk, so everything happens in RAM). But at some point, RAM is
>>> going to run out – and then, your recording speed is effectively limited by
>>> how fast you can write to your storage (in case of a RAM disk, you simply
>>> run full, or your OS starts "swapping", i.e. writing RAM to storage. Same
>>> problem).
>>>
>>> So, unless you find a way to *reduce* the amount of data you want to
>>> record, or simply buy a faster storage system, there's not much you can do.
>>>
>>>
>>> Best regards,
>>>
>>> Marcus
>>>
>>>
>>> On 01/20/2017 08:42 PM, Mallesham Dasari wrote:
>>>
>>> Hi Marcus,
>>>
>>> Thanks for the quick response. I am recording the FFT samples
>>> continuously, but I get overflows after some time, once the file size
>>> has become huge. My sample rate is high (32GHz), so writing to the
>>> file takes too long and usrp_spectrum_sense overflows.
>>>
>>> On Fri, Jan 20, 2017 at 2:33 PM, Marcus Müller <[email protected]
>>> > wrote:
>>>
>>>> Hello Mallesham,
>>>>
>>>> I'm afraid not, since to my current understanding, what
>>>> you want is mathematically impossible. Either you want much data – and that
>>>> seems to be the case, since you want to record 24h of raw IQ data – or you
>>>> can store it in what comparatively little RAM modern computers have.
>>>>
>>>> Maybe, however, we haven't fully understood the problem. Can you,
>>>> mathematically, define what you want to observe and record?
>>>>
>>>> Best regards,
>>>>
>>>> Marcus
>>>>
>>>>
>>>>
>>>> On 01/20/2017 08:28 PM, Mallesham Dasari wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> Can anyone give some solution for this? Even writing to the ramdisk is
>>>> not enough for running the flow graph for so long. I am facing the same
>>>> issue.
>>>>
>>>> Thank you!
>>>>
>>>> On Thu, Jan 12, 2017 at 5:41 PM, Hasini Abeywickrama <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Thank you very much for the informative responses.
>>>>>
>>>>> My requirement is to run the flowgraph for a long time (ideally 24
>>>>> hours) and store the FFT data in memory (ramdisk) so it can be
>>>>> processed later or in chunks, not everything at the same time.
>>>>>
>>>>> So far, I have increased the size of the ramdisk and it works fine for
>>>>> a few hours. But it still is not the solution I'm looking for.
>>>>>
>>>>> Regards,
>>>>> Hasini
>>>>>
>>>>> On Thu, Jan 12, 2017 at 8:30 PM, Marcus Müller <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> But if you do a single 1024-FFT, you'd only operate on 1024 of the
>>>>>> input samples!
>>>>>>
>>>>>> And: the FFT doesn't just give you power values, but complex values;
>>>>>> mathematically, the FFT is a DFT, and the DFT is an invertible linear
>>>>>> operator $\mathcal{F}\colon \mathbb{C}^N \to \mathbb{C}^N$, which maps
>>>>>> complex vectors to complex vectors of size $N$.  It is, in fact,
>>>>>> representable as a square matrix with column (and row) vectors being
>>>>>> samples of the orthogonal complex sinusoids
>>>>>> $e^{j\frac{2\pi}N nk},\, k=0,\ldots,N-1$; that is, it can also be
>>>>>> understood as a *base change matrix*, that just represents the
>>>>>> "input vector" according to a different, orthogonal base.
>>>>>>
>>>>>> In the physical sense: the input vector base was represented by the
>>>>>> standard basis $\mathbf e_N$, meaning that each base vector represents a
>>>>>> single point in time – the sample time of the respective entry; the
>>>>>> "output" of the transform is represented on a base of orthogonal
>>>>>> frequencies. This is an invertible operation – really just another way to
>>>>>> look at *the same signal*. I think this is really important to keep
>>>>>> in mind:
>>>>>>
>>>>>> The Fourier transforms are *not* magical by any means. What they do
>>>>>> is represent *the same signal* from a different point of view. It
>>>>>> can be *interpreted* as a transform between time and frequency domain
>>>>>> (or space and impulse, or...). The DFT is still just a boring, old,
>>>>>> square, orthogonal, invertible matrix that produces output of the same
>>>>>> dimensionality as it takes input.
>>>>>>
>>>>>> As you can see, the DFT/FFT itself never reduces the amount of data.
>>>>>>
>>>>>> What you might be referring to is some kind of PSD estimate, done by
>>>>>> first taking |·|² of a lot of DFTed vectors and then averaging them. The
>>>>>> data reduction here lies in the magnitude-square operation and the
>>>>>> average, not in the DFT.
>>>>>> The point here is that you're throwing away a whole lot of
>>>>>> information, and I'm not convinced that's what Hasini needs!
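
As a small numerical illustration of the point quoted above, here is a sketch in plain Python. It uses a naive O(N²) DFT with the common negative-exponent forward convention (an assumption; an FFT library would be used in practice), and the helper names and test tone are made up for the example.

```python
import cmath


def dft(x):
    """Naive forward DFT (negative exponent convention assumed)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(): recovers the original samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * t * k / n)
                for k in range(n)) / n
            for t in range(n)]


def averaged_psd(samples, fft_size):
    """Average |DFT|^2 over consecutive non-overlapping blocks."""
    blocks = [samples[i:i + fft_size]
              for i in range(0, len(samples) - fft_size + 1, fft_size)]
    if not blocks:
        raise ValueError("need at least fft_size samples")
    acc = [0.0] * fft_size
    for block in blocks:
        for k, value in enumerate(dft(block)):
            acc[k] += abs(value) ** 2
    return [a / len(blocks) for a in acc]


fft_size = 8
# Made-up test signal: a complex tone sitting exactly on bin 2.
samples = [cmath.exp(2j * cmath.pi * 2 * t / fft_size) for t in range(4 * fft_size)]

block = samples[:fft_size]
spectrum = dft(block)            # same number of complex values as the input
restored = idft(spectrum)        # the transform loses nothing
psd = averaged_psd(samples, fft_size)

print(len(block), "complex in ->", len(spectrum), "complex out")
print("max round-trip error:", max(abs(a - b) for a, b in zip(restored, block)))
print(len(samples), "complex samples ->", len(psd), "real PSD values")
```

The DFT alone returns as many complex values as it was given and can be undone; only the magnitude-square step (which drops the phase) and the averaging over blocks shrink the data.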
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Marcus
>>>>>>
>>>>>> On 12.01.2017 05:54, Mallesham Dasari wrote:
>>>>>>
>>>>>> Hi Marcus,
>>>>>>
>>>>>> Raw IQ samples take lots of memory because each sample will be around
>>>>>> 8 bytes. Suppose we have a 1 Msps sample rate: just 10 minutes of data
>>>>>> gives 10*60*1M*8B = 4.8GB. On the other hand, if you store just the FFT
>>>>>> with 1024 bins, we get 4.8GB/1024 power values, right (which is much
>>>>>> smaller)?
>>>>>>
>>>>>> Please correct me if I am wrong.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Wed, Jan 11, 2017 at 7:32 AM, Marcus Müller <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi Mallesham,
>>>>>>>
>>>>>>> I don't understand – the raw IQ samples and their FFT have the same
>>>>>>> size, and data type.
>>>>>>> Maybe you've understood something that I (and Martin) didn't – could
>>>>>>> you elaborate?
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Marcus
>>>>>>>
>>>>>>>
>>>>>>> On 01/11/2017 12:56 AM, Mallesham Dasari wrote:
>>>>>>>
>>>>>>> Hi Hasini,
>>>>>>>
>>>>>>> If you are trying to print just the FFT, it should not be an issue.
>>>>>>> If you print raw IQ samples, then you will run out of memory. By long,
>>>>>>> how long do you mean? Days?
>>>>>>>
>>>>>>> On Tue, Jan 10, 2017 at 3:16 PM, Martin Braun <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hasini,
>>>>>>>>
>>>>>>>> can you please re-state what you're trying to do? That might help
>>>>>>>> you get some answers. It is not quite clear from this email.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Martin
>>>>>>>>
>>>>>>>>
>>>>>>>> On 01/02/2017 09:16 PM, Hasini Abeywickrama wrote:
>>>>>>>> > Hi all,
>>>>>>>> >
>>>>>>>> > I have a flowgraph that reads a signal and writes its FFT samples to a
>>>>>>>> > file. I need to run this continuously (for a long time), without running
>>>>>>>> > out of memory.
>>>>>>>> >
>>>>>>>> > I tried deleting the earlier FFT samples from the file but that messes
>>>>>>>> > up reading the data. I also tried starting to write to a different
>>>>>>>> > file after some time so the initial file can be completely deleted. But
>>>>>>>> > that did not work either.
>>>>>>>> >
>>>>>>>> > What would be the best approach for this? Any thoughts would be very much
>>>>>>>> > appreciated.
>>>>>>>> >
>>>>>>>> > Regards,
>>>>>>>> > Hasini
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > _______________________________________________
>>>>>>>> > Discuss-gnuradio mailing list
>>>>>>>> > [email protected]
>>>>>>>> > https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
>>>>>>>> >
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> *Mallesham Dasari*
>>>>>>> Department of Computer Science
>>>>>>> Stony Brook University
>>>>>>> USA - 11794
>>>>>>>


-- 
Best Regards,
*Mallesham Dasari*
Department of Computer Science
Stony Brook University
USA - 11794