Hi Marcus and Kyeong,

Thanks for the suggestions, particularly the one about fixed-point storage.

Coming to Marcus's question: my aim is to find the optimal sampling of spectrum sensor data. A large number of low-cost spectrum sensors are deployed around an area to monitor the spectrum. Since these sensors generate a huge amount of data (around 100 Mbps, considering raw IQ) at 32 Msps (USRP B12), I want to find the duty cycle parameters for this particular setup that give minimal error compared to the actual spectrum data.
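To make the fixed-point idea concrete, here is a minimal sketch of storing PSD values in dBFS as 8-bit codes with 0.5 dB steps, along the lines of Kyeong's suggestion quoted below. The step size, the -127.5 dBFS floor and the function names are assumptions chosen for illustration only; this is not the 16-bit Q format used by Microsoft Spectrum Observatory, and it has not been run against real sensor data.

```python
import math
from array import array

# Assumed parameters for this sketch (not taken from any existing tool):
STEP_DB = 0.5        # resolution of one code step, in dB
FLOOR_DBFS = -127.5  # level represented by code 0; code 255 is 0 dBFS


def quantize_dbfs(power_linear):
    """Map linear power values (1.0 = full scale) to unsigned 8-bit codes."""
    codes = array("B")
    for p in power_linear:
        db = 10.0 * math.log10(max(p, 1e-20))  # clamp to avoid log10(0)
        code = round((db - FLOOR_DBFS) / STEP_DB)
        codes.append(min(255, max(0, code)))   # clip to the 8-bit range
    return codes


def dequantize_dbfs(codes):
    """Recover approximate dBFS values from the 8-bit codes."""
    return [FLOOR_DBFS + STEP_DB * c for c in codes]


# Hypothetical PSD bins, as linear power relative to full scale.
psd = [1.0, 0.5, 1e-3, 1e-6, 2.5e-9]
codes = quantize_dbfs(psd)
restored = dequantize_dbfs(codes)
print(list(codes))
print(restored)
```

Compared with 32-bit floats this is a factor of four per bin, at the cost of at most 0.25 dB of rounding error for levels inside the representable range; anything below the assumed floor is clipped to code 0.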
On Sat, Jan 21, 2017 at 9:23 PM, Kyeong Su Shin <[email protected]> wrote:

> To Whom it may concern:
>
> I don't know much about distributed computing, but I also agree that it is the right idea to store PSD data in a dBFS (or dBm, if calibrated) scale fixed-point format. Microsoft Spectrum Observatory uses a 16-bit Q-format fixed-point number to store PSD data in dBFS, and you can probably go down to 8 bits per point if you are happy with 0.5~1 dB resolution.
>
> Regards,
> Kyeong Su Shin
>
> On Sat, Jan 21, 2017 at 11:04 AM, Marcus Müller <[email protected]> wrote:
>
>> Hi Mallesham,
>>
>> that does indeed sound interesting, but you first of all have a local problem – that of data volume concentration on your single receiver node. 32 MS/s is already more than you can shift out through a single Gigabit Ethernet connection – so either you must immediately upgrade to more datacenter-style interconnects, or you must start thinking about consolidating your data where it happens. On the other hand, compared to other SDR systems, a mere 32 MS/s from a single channel with a non-100% duty cycle is "not that much"; I really feel like you might be running this on slightly undersized hardware.
>>
>> I, again, ask you to describe what you *want* rather than what you *do* – a system specification is crucial here, and I hope that Greg agrees with my opinion that the possibility to handle Big Data (whatever that is, in the end) alone is not a solution to a data problem. Partitioning, analyzing, reducing / compressing, filtering and discarding of data can only be designed if you have a clear concept of what your target is – and in the case of signal processing, much more than in many other big data applications, that concept is often pretty well-known a priori.
>>
>> So, whilst I really think that you're on to something very interesting here, combining distributed computing with SDR, and hope you can share a lot of your insights in the future, I also really think that you should start with a well-thought-out design of what you want to process and store. So far, you've only told us you have "FFT data" (with which you imply "spectral power estimates", which already is a reduction by a half), but you haven't really explained how much of it, and in how much detail, you need. A lot of interesting aspects might arise from that – for example, if you're really after power spectra, a logarithmic storage (dB!) would make a lot of sense; combining that with storing these logarithmic values in a fixed-point format could easily save you another factor of two in storage bandwidth – without you ever losing the "essence" of your data. The way in which you capture your data might, as Greg mentioned, be a key indicator of the granularity in which you distribute it.
>>
>> In short: it might be helpful if you could formulate what you want to *do* with your data, not only how you want to do that.
>>
>> Best regards,
>> Marcus
>>
>> On 01/21/2017 07:37 PM, Mallesham Dasari wrote:
>>
>> Dear Greg,
>>
>> Thanks for bringing this into the picture. My long-term goal is exactly what you have just explained. My plan is to use Spark for storing this big data and come up with data processing algorithms to monitor the spectrum data. For instance, a simple case would be to find the duty cycle parameters that tell me how coarse-grained my sub-sampling could be so that I would not lose much of the spectrum data. Similarly, there could be many applications of integrating big data and SDR platforms.
>>
>> I will share the same if I can integrate these successfully. Thanks!
>>
>> On Sat, Jan 21, 2017 at 11:33 AM, Gregory Ratcliff <[email protected]> wrote:
>>
>>> I spend my working hours on big data and Hadoop.
>>> It occurs to me you really need to be thinking about something outside of a normal file system. HDFS lets you write out data in chunks that you later combine when you have time. There are some really (really) fast implementation projects that write to HDFS. Most of the new work is in Java, but I think you are asking for something pretty light.
>>>
>>> I can visualize a "gatherer" for RF and a "filer" in HDFS that writes out xx MB chunks every period. Now as others have said, you don't just slap some stuff together; you will need to optimize the integration points and think about the best caching and write speeds of the "filer" system and the persistent storage.
>>>
>>> Likewise, there are plenty of Apache tools that will recombine the HDFS chunks back into files of arbitrary size, which you can then analyze later with GNU Radio, when time doesn't matter as much. You might not need much of Hadoop other than the file system and some tools.
>>>
>>> I have always thought HDFS + GNU Radio are destined for each other. It may be a bit early for this with today's hardware; Mr. Moore is helping us along just fine, so is AWS.
>>>
>>> Greg
>>> Nz8r
>>>
>>> On Jan 20, 2017, at 2:46 PM, Marcus Müller <[email protected]> wrote:
>>>
>>> I can assure you that 32 GHz is not your sampling rate. Do you mean 32 MHz?
>>>
>>> The problem here is that at first, your operating system can be smart and cache write accesses to files on mass storage devices in RAM (or you use a RAM disk, so everything happens in RAM). But at some point, RAM is going to run out – and then, your recording speed is effectively limited by how fast you can write to your storage (in case of a RAM disk, you simply run full, or your OS starts "swapping", i.e. writing RAM to storage; same problem).
>>>
>>> So, unless you find a way to *reduce* the amount of data you want to record, or simply buy a faster storage system, there's not much you can do.
>>>
>>> Best regards,
>>>
>>> Marcus
>>>
>>> On 01/20/2017 08:42 PM, Mallesham Dasari wrote:
>>>
>>> Hi Marcus,
>>>
>>> Thanks for the quick response. I am recording the FFT samples continuously. But I am getting overflows after some time, once the file size has become huge. My sample rate is high (32 GHz), so writing to the file takes too long and usrp_spectrum_sense overflows.
>>>
>>> On Fri, Jan 20, 2017 at 2:33 PM, Marcus Müller <[email protected]> wrote:
>>>
>>>> Hello Mallesham,
>>>>
>>>> I'm afraid not, since to my current understanding, what you want is mathematically impossible. Either you want much data – and that seems to be the case, since you want to record 24h of raw IQ data – or you can store it in what comparably little RAM modern computers have.
>>>>
>>>> Maybe, however, we haven't fully understood the problem. Can you, mathematically, define what you want to observe and record?
>>>>
>>>> Best regards,
>>>>
>>>> Marcus
>>>>
>>>> On 01/20/2017 08:28 PM, Mallesham Dasari wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> Can anyone give some solution for this? Even writing to the ramdisk is not enough for running the flow graph for so long. I am facing the same issue.
>>>>
>>>> Thank you!
>>>>
>>>> On Thu, Jan 12, 2017 at 5:41 PM, Hasini Abeywickrama <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Thank you very much for the informative responses.
>>>>>
>>>>> My requirement is to run the flowgraph for a long time (ideally 24 hours) and store the FFT data in memory (ramdisk) so they can be processed later or in chunks, not everything at the same time.
>>>>>
>>>>> So far, I have increased the size of the ramdisk and it works fine for a few hours.
>>>>> But it still is not the solution I'm looking for.
>>>>>
>>>>> Regards,
>>>>> Hasini
>>>>>
>>>>> On Thu, Jan 12, 2017 at 8:30 PM, Marcus Müller <[email protected]> wrote:
>>>>>
>>>>>> But if you do a single 1024-FFT, you'd only operate on 1024 of the input samples!
>>>>>>
>>>>>> And: the FFT doesn't just give you power values, but complex values; mathematically, the FFT is a DFT, and the DFT is an invertible linear operator which maps complex vectors to complex vectors of size $N$. [Equation images from the original message are not preserved.] It is, in fact, representable as a square matrix with column (and row) vectors being samples of the orthogonal complex sinusoids $e^{j\frac{2\pi}N nk},\, k=0,\ldots,N-1$; that is, it can also be understood as a *base change matrix* that just represents the "input vector" according to a different, orthogonal base.
>>>>>>
>>>>>> In the physical sense: the input vector was represented by the standard basis $\mathbf e_N$, meaning that each base vector represents a single point in time – the sample time of the respective entry; the "output" of the transform is represented on a base of orthogonal frequencies. This is an invertible operation – really just another way to look at *the same signal*. I think this is really important to keep in mind:
>>>>>>
>>>>>> The Fourier transforms are *not* magical by any means. What they do is represent *the same signal* from a different point of view. It can be *interpreted* as a transform between time and frequency domain (or space and impulse, or...). The DFT is still just a boring, old, square, orthogonal, invertible matrix that produces output of the same dimensionality as it takes input.
>>>>>>
>>>>>> As you can see, the DFT/FFT itself never reduces the amount of data.
>>>>>>
>>>>>> What you might be referring to is some kind of PSD estimate, done by first taking |·|² of a lot of DFTed vectors and then averaging them. The data reduction here lies in the magnitude square operation and the average, not in the DFT.
>>>>>> The point here is that you're throwing away a whole lot of information, and I'm not convinced that's what Hasini needs!
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Marcus
>>>>>>
>>>>>> On 12.01.2017 05:54, Mallesham Dasari wrote:
>>>>>>
>>>>>> Hi Marcus,
>>>>>>
>>>>>> Raw IQ samples take lots of memory because each sample will be around 8 bytes. Suppose we use a 1 Msps sample rate: just for 10 minutes of data, we get 10 * 60 * 1M * 8 B = 4.8 GB of data. On the other hand, if you store just the FFT with 1024 bins, we get 4.8 GB / 1024 power values, right (which is much smaller)?
>>>>>>
>>>>>> Please correct me if I am wrong.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Wed, Jan 11, 2017 at 7:32 AM, Marcus Müller <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Mallesham,
>>>>>>>
>>>>>>> I don't understand – the raw IQ samples and their FFT have the same size, and data type.
>>>>>>> Maybe you've understood something that I (and Martin) didn't – could you elaborate?
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Marcus
>>>>>>>
>>>>>>> On 01/11/2017 12:56 AM, Mallesham Dasari wrote:
>>>>>>>
>>>>>>> Hi Hasini,
>>>>>>>
>>>>>>> If you are trying to print just the FFT, it should not be an issue. If you print raw IQ samples, then you will run out of memory. By long, you mean how long? Days?
>>>>>>>
>>>>>>> On Tue, Jan 10, 2017 at 3:16 PM, Martin Braun <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hasini,
>>>>>>>>
>>>>>>>> can you please re-state what you're trying to do? That might help you get some answers. It is not quite clear from this email.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> On 01/02/2017 09:16 PM, Hasini Abeywickrama wrote:
>>>>>>>> > Hi all,
>>>>>>>> >
>>>>>>>> > I have a flowgraph that reads a signal and writes its FFT samples to a file. I need to run this continuously (for a long time), without running out of memory.
>>>>>>>> >
>>>>>>>> > I tried deleting the earlier FFT samples from the file, but that messes up reading the data. I also tried starting to write to a different file after some time so the initial file can be completely deleted. But that did not work either.
>>>>>>>> >
>>>>>>>> > What would be the best approach for this? Any thoughts would be very much appreciated.
>>>>>>>> >
>>>>>>>> > Regards,
>>>>>>>> > Hasini
>>>>>>>> >
>>>>>>>> > _______________________________________________
>>>>>>>> > Discuss-gnuradio mailing list
>>>>>>>> > [email protected]
>>>>>>>> > https://lists.gnu.org/mailman/listinfo/discuss-gnuradio

--
Best Regards,
*Mallesham Dasari*
Department of Computer Science
Stony Brook University
USA - 11794
_______________________________________________
Discuss-gnuradio mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
