Re: [Discuss-gnuradio] Help - Custom block generating data @ 1:5115 input to output ratio causes flowgraph to hang -- how to prevent this

2010-11-17 Thread Eric Blossom
On Wed, Nov 17, 2010 at 09:14:47AM -0800, John Andrews wrote:
> Hi,
> I am posting this question again with better explanation as I got no help
> yet.
> 
> I have a custom C++ block that I use in the modified dbpsk.py modulation
> scheme. This block basically spreads each input data bit by 1023.
> 
> The flowgraph connect looks like this
> self.connect(self,self.bytes2chunks,self.symbol_mapper,self.diffenc,self.CUSTOM_BLOCK,self.chunks2symbols,self.rrc_filter,self)
> 
> The CUSTOM_BLOCK outputs 5115 bytes for every input byte read therefore, in
> the flowgraph the input rate at self.chunks2symbols is 5115 times to that of
> input at self.CUSTOM_BLOCK. This causes the flowgraph to slow down
> incredibly to such an extent that I have to force kill it. I am using
> benchmark_tx.py to pass data to the flowgraph.

Stating the obvious, you have just increased the workload by a factor
of 5115.  You seem surprised that it's taking 5000 times longer to run...



> I implemented the custom block in 2 different ways once by inheriting
> gr_block and the other by using gr_sync_interpolator but the result is still
> the same. What should I do to make it work smoothly?
> 
> Thanks
> 
> P.S - The work function is shown below when using gr_sync_interpolator.
> 
>  dsss_sync_spread_b::work(int noutput_items,gr_vector_const_void_star
> &input_items,gr_vector_void_star &output_items)
> {
>   const unsigned char *in = (const unsigned char *)input_items[0];
>   unsigned char *out = (unsigned char *)output_items[0];
>   // interpolation() returns d_length_PN * d_n_pn, which is equal to 1023 * 5
>   int data_items=noutput_items/interpolation();
>   int nout=0;
>   for(int i=0;i<data_items;i++){
>   if(in[i]&0x01){
>   // the array d_pn_array1 has datatype 'char' and is of size 1023.
>   // d_length_PN = 1023 and is initialised in the constructor and is never changed
>   for(int j=0;j<[...];j++){
>   out[nout]=d_pn_array1[j%d_length_PN];
>   nout++;
>   }

The modulo operator in the inner loop isn't helping matters.  div and
mod are not free.  Q: How many cycles does an integer divide take on
the Core 2 microarchitecture?

Have you used oprofile or some other tool to see where you're actually
spending your cycles?

With a bit of restructuring, you could turn the inner loop into a
memcpy.  Left as an exercise...

However, I strongly recommend using oprofile or some other tool to see
where you're spending your cycles before you change anything.


>   }
>   else{
>   // the array d_pn_array0 has datatype 'char' and is of size 1023.
>   // d_length_PN = 1023 and is initialised in the constructor and is never changed
>   for(int j=0;j<[...];j++){
>   out[nout]=d_pn_array0[j%d_length_PN];
>   nout++;
>   }
>   }
>   }
>   return noutput_items;
> }
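
For reference, the memcpy restructuring suggested above might look roughly like the sketch below. This is a stand-alone helper, not a GNU Radio block: the function name spread_bits and its parameter names are made up for illustration, and it assumes both PN arrays have the same length pn_len and that the output buffer has room for n_in * pn_len * n_pn bytes. The modulo disappears because each repetition of the PN sequence is copied as one contiguous chunk.

```cpp
#include <cstring>

// Hypothetical helper (not GNU Radio API): spreads each input bit into
// n_pn back-to-back copies of a PN sequence of length pn_len.
// pn1 is used when the input bit is 1, pn0 when it is 0.
// Returns the number of output bytes written.
static int spread_bits(const unsigned char *in, int n_in,
                       unsigned char *out,
                       const unsigned char *pn1, const unsigned char *pn0,
                       int pn_len, int n_pn)
{
    unsigned char *dst = out;
    for (int i = 0; i < n_in; i++) {
        // Pick the PN sequence once per input bit.
        const unsigned char *pn = (in[i] & 0x01) ? pn1 : pn0;
        // One memcpy per repetition replaces the per-byte loop and the modulo.
        for (int k = 0; k < n_pn; k++) {
            std::memcpy(dst, pn, pn_len);
            dst += pn_len;
        }
    }
    return static_cast<int>(dst - out);
}
```

Inside a work function this would be called with n_in = noutput_items / interpolation(), pn_len = d_length_PN and n_pn = d_n_pn. Whether it actually helps is something only a profiler can tell you.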


___
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio


Re: [Discuss-gnuradio] Help - Custom block generating data @ 1:5115 input to output ratio causes flowgraph to hang -- how to prevent this

2010-11-17 Thread Tom Rondeau
On Wed, Nov 17, 2010 at 9:14 AM, John Andrews wrote:
> Hi,
> I am posting this question again with better explanation as I got no help
> yet.
>
> I have a custom C++ block that I use in the modified dbpsk.py modulation
> scheme. This block basically spreads each input data bit by 1023.
>
> The flowgraph connect looks like this
> self.connect(self,self.bytes2chunks,self.symbol_mapper,self.diffenc,self.CUSTOM_BLOCK,self.chunks2symbols,self.rrc_filter,self)
>
> The CUSTOM_BLOCK outputs 5115 bytes for every input byte read therefore, in
> the flowgraph the input rate at self.chunks2symbols is 5115 times to that of
> input at self.CUSTOM_BLOCK. This causes the flowgraph to slow down
> incredibly to such an extent that I have to force kill it. I am using
> benchmark_tx.py to pass data to the flowgraph.
>
> I implemented the custom block in 2 different ways once by inheriting
> gr_block and the other by using gr_sync_interpolator but the result is still
> the same. What should I do to make it work smoothly?
>
> Thanks
>
> P.S - The work function is shown below when using gr_sync_interpolator.

John,
There's nothing obvious that I would think would kill your
application, but there are definitely some modifications that I think
could help. See below.

Are you familiar with using OProfile or valgrind --tool=cachegrind? They
can help you isolate areas of particular trouble.

Because you know that you're creating N items out for every 1 item in,
use the sync_interpolator.

>  dsss_sync_spread_b::work(int noutput_items,gr_vector_const_void_star
> &input_items,gr_vector_void_star &output_items)
> {
>   const unsigned char *in = (const unsigned char *)input_items[0];
>   unsigned char *out = (unsigned char *)output_items[0];
>   // interpolation() returns d_length_PN * d_n_pn, which is equal to 1023 * 5
>   int data_items=noutput_items/interpolation();
>   int nout=0;
>   for(int i=0;i<data_items;i++){
>   if(in[i]&0x01){
>       // the array d_pn_array1 has datatype 'char' and is of size 1023.
>       // d_length_PN = 1023 and is initialised in the constructor and is never changed
>       for(int j=0;j<[...];j++){
>       out[nout]=d_pn_array1[j%d_length_PN];
>       nout++;
>       }


Use a memcpy here instead of the for loop. Same for below.

>   }
>   else{
>       for(int j=0;j<[...];j++){

From the comments, these sound like the same value.

>       // the array d_pn_array0 has datatype 'char' and is of size 1023.
>       // d_length_PN = 1023 and is initialised in the constructor and is never changed
>       out[nout]=d_pn_array0[j%d_length_PN];
>       nout++;
>       }
>   }
>   }
>   return noutput_items;
> }
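
As a rough illustration of the memcpy suggestion, one option is to expand each PN sequence to its full per-bit length once, up front, so that the work function does a single memcpy per input bit. The class name spread_table and its members below are hypothetical and not part of GNU Radio; the sketch assumes both PN sequences are non-empty and of equal length, and that the caller provides an output buffer of n_in * interpolation() bytes.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not GNU Radio API): holds the fully expanded chip
// sequence for a 1 bit and for a 0 bit, each n_pn repetitions of the PN code.
class spread_table {
public:
    spread_table(const std::vector<unsigned char> &pn1,
                 const std::vector<unsigned char> &pn0, int n_pn)
    {
        // Build the expanded sequences once, e.g. in the block constructor.
        for (int k = 0; k < n_pn; k++) {
            d_chips1.insert(d_chips1.end(), pn1.begin(), pn1.end());
            d_chips0.insert(d_chips0.end(), pn0.begin(), pn0.end());
        }
    }

    // Number of output bytes produced per input bit.
    std::size_t interpolation() const { return d_chips1.size(); }

    // Writes n_in * interpolation() bytes to out.
    void spread(const unsigned char *in, int n_in, unsigned char *out) const
    {
        const std::size_t n = d_chips1.size();
        for (int i = 0; i < n_in; i++) {
            const std::vector<unsigned char> &src =
                (in[i] & 0x01) ? d_chips1 : d_chips0;
            // One copy per input bit; no inner loop and no modulo.
            std::memcpy(out + static_cast<std::size_t>(i) * n, src.data(), n);
        }
    }

private:
    std::vector<unsigned char> d_chips1; // expanded sequence for bit 1
    std::vector<unsigned char> d_chips0; // expanded sequence for bit 0
};
```

With d_length_PN = 1023 and d_n_pn = 5, each expanded table is 5115 bytes, which is small enough that the extra memory should not matter.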


Tom


> The general_work function when using gr_block is shown below,
>
> int
> dsss_spreading_b::general_work(int noutput_items,gr_vector_int
> &ninput_items,gr_vector_const_void_star &input_items,gr_vector_void_star
> &output_items)
> {
>   const unsigned char *in = (const unsigned char *)input_items[0];
>   unsigned char *out = (unsigned char *)output_items[0];
>   // d_length_PN = 1023, d_n_pn = 5
>   int data_items=noutput_items/(d_length_PN*d_n_pn);
>   int nout=0;
>   for(int i=0;i<data_items;i++){
>   if(in[i]&0x01){
>       for(int j=0;j<[...];j++){
>       out[nout]=d_pn_array1[j%d_length_PN];
>       nout++;
>       }
>   }
>   else{
>       for(int j=0;j<[...];j++){
>       out[nout]=d_pn_array0[j%d_length_PN];
>       nout++;
>       }
>   }
>   }
>
>    consume(0,data_items);
>    return noutput_items;
> }