On Wed, Nov 17, 2010 at 09:14:47AM -0800, John Andrews wrote:
> Hi,
> I am posting this question again with better explanation as I got no help
> yet.
>
> I have a custom C++ block that I use in the modified dbpsk.py modulation
> scheme. This block basically spreads each input data bit by 1023.
>
> The flowgraph connect looks like this
> self.connect(self,self.bytes2chunks,self.symbol_mapper,self.diffenc,self.CUSTOM_BLOCK,self.chunks2symbols,self.rrc_filter,self)
>
> The CUSTOM_BLOCK outputs 5115 bytes for every input byte read therefore, in
> the flowgraph the input rate at self.chunks2symbols is 5115 times to that of
> input at self.CUSTOM_BLOCK. This causes the flowgraph to slow down
> incredibly to such an extent that I have to force kill it. I am using
> benchmark_tx.py to pass data to the flowgraph.

Stating the obvious, you have just increased the workload by a factor
of 5115. You seem surprised that it's taking 5000 times longer to run...

> I implemented the custom block in 2 different ways once by inheriting
> gr_block and the other by using gr_sync_interpolator but the result is still
> the same. What should I do to make it work smoothly?
>
> Thanks
>
> P.S - The work function is shown below when using gr_sync_interpolator.
>
> dsss_sync_spread_b::work(int noutput_items,gr_vector_const_void_star
> &input_items,gr_vector_void_star &output_items)
> {
>   const unsigned char *in = (const unsigned char *)input_items[0];
>   unsigned char *out = (unsigned char *)output_items[0];
>   int data_items=noutput_items/interpolation(); // interpolation() returns
> (d_length_PN * d_n_pn which is equal to 1023 * 5)
>   int nout=0;
>   for(int i=0;i<data_items;i++){
>     if(in[i]&0x01){
>       for(int j=0;j<interpolation();j++){
>         out[nout]=d_pn_array1[j%d_length_PN]; // the array d_pn_array1
> has datatype 'char' and is of size 1023. d_length_PN = 1023 and is
> initialised in the constructor and is never changed
>         nout++;
>       }

The modulo operator in the inner loop isn't helping matters.
div and mod are not free. Q: How many cycles does an integer divide
take on the Core 2 microarchitecture?

Have you used oprofile or some other tool to see where you're
actually spending your cycles?

With a bit of restructuring, you could turn the inner loop into a
memcpy. Left as an exercise...

However, I strongly recommend using oprofile or some other tool to
see where you're spending your cycles before you change anything.

>     }
>     else{
>       for(int j=0;j<d_length_PN*d_n_pn;j++){
>         out[nout]=d_pn_array0[j%d_length_PN]; // the array d_pn_array0
> has datatype 'char' and is of size 1023. d_length_PN = 1023 and is
> initialised in the constructor and is never changed
>         nout++;
>       }
>     }
>   }
>   return noutput_items;
> }