To Whom it may concern:

I don't know much about distributed computing, but I also agree that it is
the right idea to store PSD data on a dBFS (or dBm, if calibrated) scale in a
fixed-point format. Microsoft Spectrum Observatory uses a 16-bit Q-format
fixed-point number to store PSD data in dBFS, and you can probably go down to
8 bits per point if you are happy with 0.5 to 1 dB resolution.
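
To make the 8-bit idea concrete, here is a minimal Python sketch. The 0.5 dB
step, the -127.5 dBFS floor and the function names are assumptions chosen for
illustration; this is not the Spectrum Observatory format.

```python
import math

STEP_DB = 0.5       # assumed resolution: one code step = 0.5 dB
FLOOR_DB = -127.5   # assumed lowest level; maps to code 0, so code 255 = 0 dBFS


def encode_dbfs(power_linear, full_scale=1.0):
    """Quantize linear power values to unsigned 8-bit dBFS codes."""
    codes = bytearray()
    for p in power_linear:
        # Zero or negative power has no logarithm; clamp it to the floor.
        db = 10.0 * math.log10(p / full_scale) if p > 0 else FLOOR_DB
        code = round((db - FLOOR_DB) / STEP_DB)
        codes.append(min(255, max(0, code)))
    return bytes(codes)


def decode_dbfs(codes):
    """Convert 8-bit codes back to dBFS values."""
    return [FLOOR_DB + STEP_DB * c for c in codes]


if __name__ == "__main__":
    stored = encode_dbfs([1.0, 0.1, 1e-20])
    print(len(stored), "bytes:", decode_dbfs(stored))
```

With these assumed parameters, one byte per bin covers 0 dBFS down to
-127.5 dBFS; anything weaker is clipped to the floor.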

Regards,
Kyeong Su Shin

On Sat, Jan 21, 2017 at 11:04 AM, Marcus Müller <[email protected]>
wrote:

> Hi Mallesham,
>
> that does indeed sound interesting, but you first of all have a local
> problem – that of data volume concentration on your single receiver node.
> 32 MS/s is already more than you can shift out through a single Gigabit
> Ethernet connection – so either you must immediately upgrade to more
> datacenter-style interconnects, or you must start thinking about
> consolidating your data where it happens. On the other hand, compared to
> other SDR systems, a mere 32 MS/s from a single channel with a non-100%
> duty cycle is "not that much"; I really feel like you might be running this
> on slightly undersized hardware.
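
As a rough check of the Gigabit Ethernet remark quoted above, the following
Python sketch computes the payload rate of a 32 MS/s stream. The two sample
formats (4 bytes for 16-bit I/Q, 8 bytes for complex float32) are assumptions,
and protocol overhead is ignored.

```python
GIGABIT_ETHERNET_BPS = 1_000_000_000  # nominal line rate, before any overhead


def stream_rate_bps(sample_rate_hz, bytes_per_sample):
    """Payload bit rate of a continuous sample stream."""
    return sample_rate_hz * bytes_per_sample * 8


def fits_on_link(sample_rate_hz, bytes_per_sample, link_bps=GIGABIT_ETHERNET_BPS):
    """True if the stream's payload rate does not exceed the link rate."""
    return stream_rate_bps(sample_rate_hz, bytes_per_sample) <= link_bps


if __name__ == "__main__":
    # Assumed sample formats; the thread does not say which one is in use.
    for label, size in (("16-bit I/Q", 4), ("complex float32", 8)):
        rate = stream_rate_bps(32_000_000, size)
        fits = fits_on_link(32_000_000, size)
        print(f"{label}: {rate / 1e9:.3f} Gbit/s, fits on 1 GbE: {fits}")
```

Under either assumed format, the payload alone exceeds the nominal 1 Gbit/s.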
>
> I, again, ask you to describe what you *want* rather than what you *do* –
> a system specification is very crucial here, and I hope that Greg agrees
> with my opinion that the possibility to handle Big Data (whatever that is,
> in the end) alone is not a solution to a data problem. Partitioning,
> analyzing, reducing / compressing, filtering and discarding of data can
> only be designed if you have a clear concept of what your target is – and
> in the case of signal processing, much more than in many other big data
> applications, that concept is often pretty well-known a priori.
>
> So, whilst I really think that you're on to something very interesting
> here, combining distributed computing with SDR, and hope you can share a
> lot of your insights in the future, I also really think that you should
> start with a well-thought-out design of what you want to process and store.
> So far, you've only told us you have "FFT data" (by which you imply
> "spectral power estimates", which already is a reduction by half), but
> you haven't really explained how much of it, and in how much detail, you
> need. A lot of interesting aspects might arise from that – for example, if
> you're really after power spectra, logarithmic storage (dB!) would make a
> lot of sense; combining that with storing these logarithmic values in a
> fixed-point format could easily save you another factor of two in storage
> bandwidth – without you ever losing the "essence" of your data. The way in
> which you capture your data might, as Greg mentioned, be a key indicator of
> the granularity in which you distribute it.
>
> In short: it might be helpful if you could formulate what you want to *do*
> with your data, not only how you want to do that.
>
> Best regards,
> Marcus
>
> On 01/21/2017 07:37 PM, Mallesham Dasari wrote:
>
> Dear Greg,
>
> Thanks for bringing this into the picture. My long-term goal is exactly
> what you have just explained. My plan is to use Spark for storing this big
> data and to come up with data processing algorithms to monitor the spectrum
> data. For instance, a simple case would be to find the duty cycle
> parameters that tell me how coarse-grained my sub-sampling can be without
> losing much of the spectrum data. Similarly, there could be many
> applications that integrate big data and SDR platforms.
>
> I will share my results if I can integrate these successfully. Thanks!
>
> On Sat, Jan 21, 2017 at 11:33 AM, Gregory Ratcliff <[email protected]> wrote:
>
>> I spend my working hours on big data and Hadoop.
>> It occurs to me you really need to be thinking about something outside of
>> a normal file system.  HDFS lets you write out data in chunks that you
>> later combine when you have time.  There are some really (really) fast
>> implementation projects that write to HDFS.  Most of the new work is in
>> Java, but I think you are asking for something pretty light.
>>
>> I can visualize a "gatherer" for RF and a "filer" in HDFS that writes out
>> xx MB chunks every period.  Now as others have said, you don't just slap
>> some stuff together, you will need to optimize the integration points and
>> think about the best caching and write speeds of the "filer" system and the
>> persistent storage.
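
The gatherer/filer split quoted above could be prototyped roughly as below.
This is only a sketch on a local directory: the class name `ChunkFiler`, the
file naming and the chunk size are hypothetical, and no HDFS client API is
used or implied.

```python
import os


class ChunkFiler:
    """Buffer incoming bytes and write them out as fixed-size chunk files."""

    def __init__(self, directory, chunk_bytes):
        self.directory = directory
        self.chunk_bytes = chunk_bytes
        self.index = 0
        self.buffer = bytearray()

    def write(self, data):
        """Called by the "gatherer"; flushes every time a full chunk is ready."""
        self.buffer.extend(data)
        while len(self.buffer) >= self.chunk_bytes:
            self._flush(self.chunk_bytes)

    def close(self):
        """Write whatever is left as a final, shorter chunk."""
        if self.buffer:
            self._flush(len(self.buffer))

    def _flush(self, n):
        path = os.path.join(self.directory, f"chunk_{self.index:06d}.bin")
        with open(path, "wb") as f:
            f.write(self.buffer[:n])
        del self.buffer[:n]
        self.index += 1
```

Because every chunk is a closed, complete file, older chunks can be moved,
processed or deleted while the recording keeps running.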
>>
>> Likewise, there are plenty of Apache tools that will recombine the HDFS
>> chunks back into files of arbitrary size, which you can then analyze
>> later with GNU Radio, when time doesn't matter as much.  You might not need
>> much of Hadoop other than the file system and some tools.
>>
>> I have always thought HDFS + GNU Radio are destined for each other.  It may
>> be a bit early for this with today's hardware; Mr. Moore is helping us
>> along just fine, so is AWS.
>>
>> Greg
>> Nz8r
>>
>> On Jan 20, 2017, at 2:46 PM, Marcus Müller <[email protected]>
>> wrote:
>>
>> I can assure you that 32 GHz is not your sampling rate. Do you mean 32
>> MHz?
>>
>> The problem here is that at first, your operating system can be smart and
>> cache write accesses to files on mass storage devices in RAM (or you use a
>> RAM disk, so everything happens in RAM). But at some point, RAM is going to
>> run out – and then your recording speed is effectively limited by how fast
>> you can write to your storage (in the case of a RAM disk, it simply fills
>> up, or your OS starts "swapping", i.e. writing RAM to storage, which is the
>> same problem).
>>
>> So, unless you find a way to *reduce* the amount of data you want to
>> record, or simply buy a faster storage system, there's not much you can do.
>>
>>
>> Best regards,
>>
>> Marcus
>>
>>
>> On 01/20/2017 08:42 PM, Mallesham Dasari wrote:
>>
>> Hi Marcus,
>>
>> Thanks for the quick response. I am recording the FFT samples
>> continuously, but I get overflows after some time, once the file has
>> become huge. My sample rate is high (32GHz), so writing to the file takes
>> so long that usrp_spectrum_sense overflows.
>>
>> On Fri, Jan 20, 2017 at 2:33 PM, Marcus Müller <[email protected]>
>> wrote:
>>
>>> Hello Mallesham,
>>>
>>> I'm afraid not, since, to my current understanding, what you want is
>>> mathematically impossible. Either you want a lot of data – and that
>>> seems to be the case, since you want to record 24h of raw IQ data – or you
>>> can store it in what comparatively little RAM modern computers have.
>>>
>>> Maybe, however, we haven't fully understood the problem. Can you,
>>> mathematically, define what you want to observe and record?
>>>
>>> Best regards,
>>>
>>> Marcus
>>>
>>>
>>>
>>> On 01/20/2017 08:28 PM, Mallesham Dasari wrote:
>>>
>>> Hello everyone,
>>>
>>> Can anyone suggest a solution for this? Even writing to the ramdisk is
>>> not enough to run the flow graph for that long. I am facing the same
>>> issue.
>>>
>>> Thank you!
>>>
>>> On Thu, Jan 12, 2017 at 5:41 PM, Hasini Abeywickrama <[email protected]>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Thank you very much for the informative responses.
>>>>
>>>> My requirement is to run the flowgraph for a long time (ideally 24
>>>> hours) and store the FFT data in memory (ramdisk) so that it can be
>>>> processed later or in chunks, not everything at the same time.
>>>>
>>>> So far, I have increased the size of the ramdisk and it works fine for
>>>> a few hours. But it still is not the solution I'm looking for.
>>>>
>>>> Regards,
>>>> Hasini
>>>>
>>>> On Thu, Jan 12, 2017 at 8:30 PM, Marcus Müller <[email protected]> wrote:
>>>>
>>>>> But if you do a single 1024-FFT, you'd only operate on 1024 of the
>>>>> input samples!
>>>>>
>>>>> And: the FFT doesn't just give you power values, but complex values;
>>>>> mathematically, the FFT is a DFT, and the DFT is an invertible linear
>>>>> operator which maps complex vectors of size $N$ to complex vectors of
>>>>> size $N$. It is, in fact, representable as a square matrix with column
>>>>> (and row) vectors being samples of the orthogonal complex sinusoids
>>>>> $e^{j\frac{2\pi}N nk},\, k=0,\ldots,N-1$; that is, it can also be
>>>>> understood as a *change-of-basis matrix* that just represents the "input
>>>>> vector" in a different, orthogonal basis.
>>>>>
>>>>> In the physical sense: the input vector was represented in the
>>>>> standard basis $\mathbf e_N$, meaning that each basis vector represents a
>>>>> single point in time – the sample time of the respective entry; the
>>>>> "output" of the transform is represented in a basis of orthogonal
>>>>> frequencies. This is an invertible operation – really just another way to
>>>>> look at *the same signal*. I think this is really important to keep
>>>>> in mind:
>>>>>
>>>>> The Fourier transforms are *not* magical by any means. What they do
>>>>> is represent *the same signal* from a different point of view. It can
>>>>> be *interpreted* as a transform between time and frequency domain (or
>>>>> space and momentum, or...). The DFT is still just a boring, old, square,
>>>>> orthogonal, invertible matrix that produces output of the same
>>>>> dimensionality as it takes input.
>>>>>
>>>>> As you can see, the DFT/FFT itself never reduces the amount of data.
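
The point quoted above can be checked numerically with a small Python sketch
that builds the DFT as an explicit N-by-N matrix. The helper names are
hypothetical, and the sign convention (negative exponent in the forward
transform, 1/N in the inverse) is an assumption; the thread itself only names
the sinusoids.

```python
import cmath


def dft_matrix(n, sign=-1):
    """N x N matrix whose entries are samples of complex sinusoids."""
    return [[cmath.exp(sign * 2j * cmath.pi * r * c / n) for c in range(n)]
            for r in range(n)]


def apply_matrix(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def dft(x):
    """Forward transform: N complex values in, N complex values out."""
    return apply_matrix(dft_matrix(len(x), -1), x)


def idft(spectrum):
    """Inverse transform: conjugate matrix, scaled by 1/N."""
    n = len(spectrum)
    return [v / n for v in apply_matrix(dft_matrix(n, +1), spectrum)]


if __name__ == "__main__":
    x = [1 + 2j, -1j, 0.5, 3]
    X = dft(x)
    print(len(x), len(X))                              # same number of values
    print(max(abs(a - b) for a, b in zip(x, idft(X))))  # round-trip error
```

The output has exactly as many complex values as the input, and the inverse
recovers the input up to rounding error.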
>>>>>
>>>>> What you might be referring to is some kind of PSD estimate done by first
>>>>> taking |·|² of a lot of DFTed vectors and then averaging them. The data
>>>>> reduction here lies in the magnitude-square operation and the average, not
>>>>> in the DFT. The point here is that you're throwing away a whole lot of
>>>>> information, and I'm not convinced that's what Hasini needs!
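
A minimal sketch of the kind of estimate described in the quoted paragraph:
cut the stream into frames, DFT each frame, take the magnitude squared and
average. Function names are hypothetical, and no window, overlap or
normalization is applied.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, sufficient for a small illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


def averaged_psd(samples, nfft):
    """Average |DFT|^2 over all complete frames of length nfft."""
    frames = len(samples) // nfft
    if frames == 0:
        raise ValueError("need at least nfft samples")
    acc = [0.0] * nfft
    for i in range(frames):
        spectrum = dft(samples[i * nfft:(i + 1) * nfft])
        for k, value in enumerate(spectrum):
            acc[k] += abs(value) ** 2   # the phase is discarded here
    return [a / frames for a in acc]    # many frames collapse into one


if __name__ == "__main__":
    nfft = 8
    # Test tone sitting exactly on bin 3, four frames long.
    tone = [cmath.exp(2j * cmath.pi * 3 * t / nfft) for t in range(4 * nfft)]
    psd = averaged_psd(tone, nfft)
    print(len(tone), "complex samples ->", len(psd), "real values")
```

Here 32 complex input samples end up as 8 real numbers; the reduction comes
from dropping the phase and from averaging, and it cannot be undone.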
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Marcus
>>>>>
>>>>> On 12.01.2017 05:54, Mallesham Dasari wrote:
>>>>>
>>>>> Hi Marcus,
>>>>>
>>>>> Raw IQ samples take lots of memory because each sample is around
>>>>> 8 bytes. Suppose we use a 1 Msps sample rate: just 10 minutes of data
>>>>> gives 10*60*1M*8B = 4.8GB. On the other hand, if you store just the FFT
>>>>> with 1024 bins, we get 4.8GB/1024 power values, which is much smaller,
>>>>> right?
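
For reference, the numbers in this paragraph can be reproduced as follows.
The sketch assumes 8 bytes per complex sample as stated above, plus a
hypothetical 4-byte power value per bin and a hypothetical averaging factor;
it shows that the stored size depends on how many frames are averaged, not on
the FFT itself.

```python
def raw_iq_bytes(sample_rate_hz, seconds, bytes_per_sample=8):
    """Size of the raw complex sample stream."""
    return sample_rate_hz * seconds * bytes_per_sample


def averaged_psd_bytes(sample_rate_hz, seconds, nfft,
                       frames_per_average, bytes_per_bin=4):
    """Size when only one power spectrum per group of frames is stored."""
    total_frames = (sample_rate_hz * seconds) // nfft
    stored_spectra = total_frames // frames_per_average
    return stored_spectra * nfft * bytes_per_bin


if __name__ == "__main__":
    rate, duration = 1_000_000, 10 * 60   # 1 Msps for 10 minutes
    print("raw IQ:        ", raw_iq_bytes(rate, duration))
    print("PSD, no avg:   ", averaged_psd_bytes(rate, duration, 1024, 1))
    print("PSD, 100x avg: ", averaged_psd_bytes(rate, duration, 1024, 100))
```

Without averaging, storing one real power value per bin only halves the
volume; a larger saving needs an explicit averaging factor.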
>>>>>
>>>>> Please correct me if I am wrong.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Wed, Jan 11, 2017 at 7:32 AM, Marcus Müller <[email protected]> wrote:
>>>>>
>>>>>> Hi Mallesham,
>>>>>>
>>>>>> I don't understand – the raw IQ samples and their FFT have the same
>>>>>> size, and data type.
>>>>>> Maybe you've understood something that I (and Martin) didn't – could
>>>>>> you elaborate?
>>>>>>
>>>>>> Best regards,
>>>>>> Marcus
>>>>>>
>>>>>>
>>>>>> On 01/11/2017 12:56 AM, Mallesham Dasari wrote:
>>>>>>
>>>>>> Hi Hasini,
>>>>>>
>>>>>> If you are trying to print just the FFT, it should not be an issue.
>>>>>> If you print raw IQ samples, then you will run out of memory. How long
>>>>>> do you mean by "long"? Days?
>>>>>>
>>>>>> On Tue, Jan 10, 2017 at 3:16 PM, Martin Braun <[email protected]> wrote:
>>>>>>
>>>>>>> Hasini,
>>>>>>>
>>>>>>> can you please re-state what you're trying to do? That might help you
>>>>>>> get some answers. It is not quite clear from this email.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>> On 01/02/2017 09:16 PM, Hasini Abeywickrama wrote:
>>>>>>> > Hi all,
>>>>>>> >
>>>>>>> > I have a flowgraph that reads a signal and writes its FFT samples to a
>>>>>>> > file. I need to run this continuously (for a long time) without
>>>>>>> > running out of memory.
>>>>>>> >
>>>>>>> > I tried deleting the earlier FFT samples from the file, but that messes
>>>>>>> > up reading the data. I also tried switching to a different file after
>>>>>>> > some time so that the initial file can be completely deleted, but that
>>>>>>> > did not work either.
>>>>>>> >
>>>>>>> > What would be the best approach for this? Any thoughts would be very
>>>>>>> > much appreciated.
>>>>>>> >
>>>>>>> > Regards,
>>>>>>> > Hasini
>>>>>>> >
>>>>>>> >
>>>>>>> > _______________________________________________
>>>>>>> > Discuss-gnuradio mailing list
>>>>>>> > [email protected]
>>>>>>> > https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> *Mallesham Dasari*
>>>>>> Department of Computer Science
>>>>>> Stony Brook University
>>>>>> USA - 11794
>>>>>>
