Hi Evan,

> I need to scatter the elements of a vector out to multiple processors.
> The mapping is one to many (vector elements can go to many procs). I
> would like to do this with a permutation matrix which has 1 nonzero
> per row.
>
> I'd like the process to run on the GPU, so a warp would need to
> operate on multiple rows of the permutation matrix. What do you think
> the best approach would be?

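To make the setup concrete, here is a minimal host-side sketch in plain C++ of a matrix with one nonzero per row applied to a vector. It assumes CSR storage and makes no ViennaCL calls; the names SelectionCsr, make_selection and spmv are hypothetical and exist only for this illustration. Strictly speaking, a one-to-many mapping gives a rectangular selection matrix rather than a square permutation matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CSR storage for a matrix with exactly one nonzero (value 1) per row.
// Hypothetical helper type for illustration; not a ViennaCL type.
struct SelectionCsr {
  std::size_t cols;                  // length of the source vector
  std::vector<std::size_t> row_ptr;  // size rows + 1; here always 0, 1, 2, ...
  std::vector<std::size_t> col_idx;  // source element read by each row
  std::vector<double> values;        // all 1.0, but stored explicitly
};

// Build the matrix from the rule "row i reads source element indices[i]".
SelectionCsr make_selection(const std::vector<std::size_t>& indices,
                            std::size_t cols) {
  SelectionCsr P;
  P.cols = cols;
  P.row_ptr.reserve(indices.size() + 1);
  P.row_ptr.push_back(0);
  for (std::size_t i = 0; i < indices.size(); ++i) {
    assert(indices[i] < cols);  // every row must point at a valid source entry
    P.col_idx.push_back(indices[i]);
    P.values.push_back(1.0);
    P.row_ptr.push_back(i + 1);
  }
  return P;
}

// Generic CSR matrix-vector product y = P * x.
// Rows are independent of each other, so each could be handled by its own thread.
std::vector<double> spmv(const SelectionCsr& P, const std::vector<double>& x) {
  assert(x.size() == P.cols);
  std::vector<double> y(P.row_ptr.size() - 1, 0.0);
  for (std::size_t row = 0; row + 1 < P.row_ptr.size(); ++row) {
    for (std::size_t k = P.row_ptr[row]; k < P.row_ptr[row + 1]; ++k) {
      y[row] += P.values[k] * x[P.col_idx[k]];
    }
  }
  return y;
}
```

For x = {10, 20, 30} and indices = {2, 0, 0, 1, 2}, spmv(make_selection(indices, 3), x) yields {30, 10, 10, 20, 30}: source elements 0 and 2 are each delivered to two output rows.
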
The best approach with the existing API would be to fill such a
'permutation matrix'. However, this is a little wasteful in terms of
memory bandwidth, because the row pointers and the unit values are
stored and read in addition to the column indices.

There's also another option available soon: a couple of days ago I
added some first support for viennacl::vector<int> and friends, and for
the next release we already want to provide an assignment of the form

  vector<double> x, y;
  vector<int> indices;
  ...
  x = y(indices); // or similar. API not decided yet.

Over time this will gradually be extended to support more complicated
expressions, but a robust implementation is quite an undertaking, so
this won't happen 'immediately'.

Best regards,
Karli

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel