To Whom it may concern:

I don't know much about distributed computing, but I also agree that it is the right idea to store PSD data in a fixed-point format on a dBFS (or dBm, if calibrated) scale. Microsoft Spectrum Observatory uses a 16-bit Q-format fixed-point number to store PSD data in dBFS, and you can probably go down to 8 bits per point if you are happy with 0.5 to 1 dB resolution.
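To make the storage idea concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption for illustration: the function names, the -127.5 dBFS floor, the 8 fractional bits of the 16-bit layout and the 0.5 dB step of the 8-bit layout are hypothetical choices, not the format Microsoft Spectrum Observatory actually uses.

```python
import math
from array import array

FLOOR_DB = -127.5  # assumed lowest level worth storing, in dBFS


def power_to_dbfs(power, full_scale=1.0):
    """Convert one linear power estimate to dBFS, clamped at FLOOR_DB."""
    if power <= 0.0:
        return FLOOR_DB
    return max(FLOOR_DB, 10.0 * math.log10(power / full_scale))


def encode_q16(db_values, frac_bits=8):
    """Pack dB values as signed 16-bit fixed point (hypothetical Q7.8 layout)."""
    scale = 1 << frac_bits
    return array("h", (max(-32768, min(32767, round(db * scale)))
                       for db in db_values))


def decode_q16(codes, frac_bits=8):
    """Recover dB values from the 16-bit codes."""
    scale = 1 << frac_bits
    return [c / scale for c in codes]


def encode_u8(db_values, step_db=0.5):
    """Pack dB values as bytes: code 0 is FLOOR_DB, one count is step_db."""
    return array("B", (max(0, min(255, round((db - FLOOR_DB) / step_db)))
                       for db in db_values))


def decode_u8(codes, step_db=0.5):
    """Recover dB values from the 8-bit codes."""
    return [FLOOR_DB + c * step_db for c in codes]


if __name__ == "__main__":
    powers = [1.0, 0.5, 1e-3, 1e-9, 0.0]  # made-up linear power estimates
    db = [power_to_dbfs(p) for p in powers]
    q16, u8 = encode_q16(db), encode_u8(db)
    print("dBFS:  ", db)
    print("16-bit:", decode_q16(q16), q16.itemsize * len(q16), "bytes")
    print("8-bit: ", decode_u8(u8), u8.itemsize * len(u8), "bytes")
```

Compared with 32-bit floats, the 16-bit form halves the storage and the 8-bit form quarters it. The price is a rounding error of at most half a step (1/512 dB and 0.25 dB with the values assumed here) and clipping outside the representable range.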
Regards,
Kyeong Su Shin

On Sat, Jan 21, 2017 at 11:04 AM, Marcus Müller <[email protected]> wrote:

> Hi Mallesham,
>
> that does indeed sound interesting, but you first of all have a local
> problem – that of data volume concentration on your single receiver node.
> 32 MS/s is already more than you can shift out through a single Gigabit
> Ethernet connection – so either you must immediately upgrade to more
> datacenter-style interconnects, or you must start thinking about
> consolidating your data where it happens. On the other hand, compared to
> other SDR systems, a mere 32 MS/s from a single channel with a non-100%
> duty cycle is "not that much"; I really feel like you might be running this
> on slightly undersized hardware.
>
> I, again, ask you to describe what you *want* rather than what you *do* –
> a system specification is crucial here, and I hope that Greg agrees
> with my opinion that the possibility to handle Big Data (whatever that is,
> in the end) alone is not a solution to a data problem. Partitioning,
> analyzing, reducing / compressing, filtering and discarding of data can
> only be designed if you have a clear concept of what your target is – and
> in the case of signal processing, much more than in many other big data
> applications, that concept is often pretty well-known a priori.
>
> So, whilst I really think that you're on to something very interesting
> here, combining distributed computing with SDR, and hope you can share a
> lot of your insights in the future, I also really think that you should
> start with a well-thought-out design of what you want to process and store.
> Thus far, you've only told us you have "FFT data" (with which you imply
> "spectral power estimates", which already is a reduction by a half), but
> you haven't really explained how much of it, and in how much detail, you
> need. A lot of interesting aspects might arise from that – for example, if
> you're really after power spectra, a logarithmic storage (dB!)
> would make a lot of
> sense; combining that with storing these logarithmic values in a fixed-point
> format could easily save you another factor of two in storage bandwidth –
> without you ever losing the "essence" of your data. The way in which you
> capture your data might, as Greg mentioned, be a key indicator of the
> granularity in which you distribute it.
>
> In short: it might be helpful if you could formulate what you want to *do*
> with your data, not only how you want to do that.
>
> Best regards,
> Marcus
>
> On 01/21/2017 07:37 PM, Mallesham Dasari wrote:
>
> Dear Greg,
>
> Thanks for bringing this into the picture. My long-term goal is exactly
> what you have just explained. My plan is to use Spark for storing this big
> data and come up with data processing algorithms to monitor the spectrum
> data. For instance, a simple case would be to find the duty cycle
> parameters that tell me how coarse-grained my sub-sampling could be so
> that I would not lose much of the spectrum data. Similarly, there could be
> many applications that integrate big data and SDR platforms.
>
> I will share the results if I can integrate these successfully. Thanks!
>
> On Sat, Jan 21, 2017 at 11:33 AM, Gregory Ratcliff <[email protected]> wrote:
>
>> I spend my working hours on big data and Hadoop.
>> It occurs to me you really need to be thinking about something outside of
>> a normal file system. HDFS lets you write out data in chunks that you
>> later combine when you have time. There are some really (really) fast
>> implementation projects that write to HDFS. Most of the new work is in
>> Java, but I think you are asking for something pretty light.
>>
>> I can visualize a "gatherer" for RF and a "filer" in HDFS that writes out
>> xx MB chunks every period.
>> Now, as others have said, you don't just slap
>> some stuff together; you will need to optimize the integration points and
>> think about the best caching and write speeds of the "filer" system and the
>> persistent storage.
>>
>> Likewise, there are plenty of Apache tools that will recombine the HDFS
>> chunks back into files of arbitrary size, which you can then analyze
>> later with GNU Radio, when time doesn't matter as much. You might not need
>> much of Hadoop other than the file system and some tools.
>>
>> I have always thought HDFS + GNU Radio are destined for each other. It may
>> be a bit early for this with today's hardware; Mr. Moore is helping us
>> along just fine, so is AWS.
>>
>> Greg
>> Nz8r
>>
>> On Jan 20, 2017, at 2:46 PM, Marcus Müller <[email protected]> wrote:
>>
>> I can assure you that 32 GHz is not your sampling rate. Do you mean 32
>> MHz?
>>
>> The problem here is that at first, your operating system can be smart and
>> cache write accesses to files on mass storage devices in RAM (or you use a
>> RAM disk, so everything happens in RAM). But at some point, RAM is going to
>> run out – and then, your recording speed is effectively limited by how fast
>> you can write to your storage (in case of a RAM disk, you simply run full,
>> or your OS starts "swapping", i.e. writing RAM to storage. Same problem).
>>
>> So, unless you find a way to *reduce* the amount of data you want to
>> record, or simply buy a faster storage system, there's not much you can do.
>>
>> Best regards,
>>
>> Marcus
>>
>> On 01/20/2017 08:42 PM, Mallesham Dasari wrote:
>>
>> Hi Marcus,
>>
>> Thanks for the quick response. I am recording the FFT samples
>> continuously. But I am getting an overflow after some time, when the file
>> has become huge. My sample rate is high (32GHz), so writing to the
>> file takes too long and usrp_spectrum_sense overflows.
>>
>> On Fri, Jan 20, 2017 at 2:33 PM, Marcus Müller <[email protected]> wrote:
>>
>>> Hello Mallesham,
>>>
>>> I'm afraid not, since, to my current understanding, what
>>> you want is mathematically impossible. Either you want much data – and that
>>> seems to be the case, since you want to record 24h of raw IQ data – or you
>>> can store it in what comparatively little RAM modern computers have.
>>>
>>> Maybe, however, we haven't fully understood the problem. Can you,
>>> mathematically, define what you want to observe and record?
>>>
>>> Best regards,
>>>
>>> Marcus
>>>
>>> On 01/20/2017 08:28 PM, Mallesham Dasari wrote:
>>>
>>> Hello everyone,
>>>
>>> Can anyone suggest a solution for this? Even writing to the ramdisk is
>>> not enough for running the flow graph for so long. I am facing the same
>>> issue.
>>>
>>> Thank you!
>>>
>>> On Thu, Jan 12, 2017 at 5:41 PM, Hasini Abeywickrama <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Thank you very much for the informative responses.
>>>>
>>>> My requirement is to run the flowgraph for a long time (ideally 24
>>>> hours) and store the FFT data in memory (ramdisk) so they can be
>>>> processed later or in chunks, not everything at the same time.
>>>>
>>>> So far, I have increased the size of the ramdisk and it works fine for
>>>> a few hours. But it still is not the solution I'm looking for.
>>>>
>>>> Regards,
>>>> Hasini
>>>>
>>>> On Thu, Jan 12, 2017 at 8:30 PM, Marcus Müller <[email protected]> wrote:
>>>>
>>>>> But if you do a single 1024-FFT, you'd only operate on 1024 of the
>>>>> input samples!
>>>>>
>>>>> And: the FFT doesn't just give you power values, but complex values;
>>>>> mathematically, the FFT is a DFT, and the DFT is an invertible linear
>>>>> operator <mime-attachment.png>:
>>>>>
>>>>> <mime-attachment.png>
>>>>>
>>>>> which maps complex vectors to complex vectors of size
>>>>> <mime-attachment.png>.
>>>>> It is, in fact, representable as a square matrix with
>>>>> column (and row) vectors being samples of the orthogonal complex sinusoids
>>>>> $e^{j\frac{2\pi}N nk},\, k=0,\ldots,N-1$; that is, it can also be
>>>>> understood as a *change-of-basis matrix* that just represents the "input
>>>>> vector" according to a different, orthogonal basis.
>>>>>
>>>>> In the physical sense: the input vector was represented in the
>>>>> standard basis $\mathbf e_N$, meaning that each basis vector represents a
>>>>> single point in time – the sample time of the respective entry; the
>>>>> "output" of the transform is represented on a basis of orthogonal
>>>>> frequencies. This is an invertible operation – really just another way to
>>>>> look at *the same signal*. I think this is really important to keep
>>>>> in mind:
>>>>>
>>>>> The Fourier transforms are *not* magical by any means. What they do
>>>>> is represent *the same signal* from a different point of view. It can
>>>>> be *interpreted* as a transform between time and frequency domain (or
>>>>> space and momentum, or...). The DFT is still just a boring, old, square,
>>>>> orthogonal, invertible matrix that produces output of the same
>>>>> dimensionality as it takes input.
>>>>>
>>>>> As you can see, the DFT/FFT itself never reduces the amount of data.
>>>>>
>>>>> What you might be referring to is some kind of PSD estimate done by first
>>>>> taking |·|² of a lot of DFTed vectors and then averaging them. The data
>>>>> reduction here lies in the magnitude square operation and the average,
>>>>> not in the DFT.
>>>>> The point here is that you're throwing away a whole lot of
>>>>> information, and I'm not convinced that's what Hasini needs!
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Marcus
>>>>>
>>>>> On 12.01.2017 05:54, Mallesham Dasari wrote:
>>>>>
>>>>> Hi Marcus,
>>>>>
>>>>> Raw IQ samples take lots of memory because each sample will be around
>>>>> 8 bytes.
>>>>> Suppose we have a 1 Msps sample rate; just for 10 minutes of data, we
>>>>> get 10*60*1M*8B = 4.8GB of data. On the other hand, if you store just the
>>>>> FFT with 1024 bins, we get 4.8GB/1024 power values, right (which is much
>>>>> smaller)?
>>>>>
>>>>> Please correct me if I am wrong.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Wed, Jan 11, 2017 at 7:32 AM, Marcus Müller <[email protected]> wrote:
>>>>>
>>>>>> Hi Mallesham,
>>>>>>
>>>>>> I don't understand – the raw IQ samples and their FFT have the same
>>>>>> size and data type.
>>>>>> Maybe you've understood something that I (and Martin) didn't – could
>>>>>> you elaborate?
>>>>>>
>>>>>> Best regards,
>>>>>> Marcus
>>>>>>
>>>>>> On 01/11/2017 12:56 AM, Mallesham Dasari wrote:
>>>>>>
>>>>>> Hi Hasini,
>>>>>>
>>>>>> If you are trying to print just the FFT, it should not be an issue.
>>>>>> If you print raw IQ samples, then you will run out of memory. How long
>>>>>> do you mean by "long"? Days?
>>>>>>
>>>>>> On Tue, Jan 10, 2017 at 3:16 PM, Martin Braun <[email protected]> wrote:
>>>>>>
>>>>>>> Hasini,
>>>>>>>
>>>>>>> can you please re-state what you're trying to do? That might help you
>>>>>>> get some answers. It is not quite clear from this email.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Martin
>>>>>>>
>>>>>>> On 01/02/2017 09:16 PM, Hasini Abeywickrama wrote:
>>>>>>> > Hi all,
>>>>>>> >
>>>>>>> > I have a flowgraph that reads a signal and writes its FFT samples to a
>>>>>>> > file. I need to run this continuously (for a long time), without running
>>>>>>> > out of memory.
>>>>>>> >
>>>>>>> > I tried deleting the earlier FFT samples from the file, but that messes
>>>>>>> > up reading the data. I also tried switching to a different
>>>>>>> > file after some time so the initial file can be completely deleted. But
>>>>>>> > that did not work either.
>>>>>>> >
>>>>>>> > What would be the best approach for this?
>>>>>>> > Any thought would be very much appreciated.
>>>>>>> >
>>>>>>> > Regards,
>>>>>>> > Hasini
>>>>>>> >
>>>>>>> > _______________________________________________
>>>>>>> > Discuss-gnuradio mailing list
>>>>>>> > [email protected]
>>>>>>> > https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> *Mallesham Dasari*
>>>>>> Department of Computer Science
>>>>>> Stony Brook University
>>>>>> USA - 11794
_______________________________________________ Discuss-gnuradio mailing list [email protected] https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
