Also: when I use a wrapping constructor to initialize a MAIN_MEMORY
matrix around a preexisting row-major buffer and then try to use the
matrix, I get the message:

ViennaCL: Internal memory error: not initialised!

Why?
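
In case it helps, this is roughly what I am doing (just a sketch; I am
using the constructor that takes a raw pointer, a memory type and the
dimensions -- if I have its signature or usage wrong, that may well be
the problem):

  #include <vector>
  #include <viennacl/matrix.hpp>

  int main()
  {
    // preexisting, tightly packed row-major host buffer
    std::vector<double> buf(1000 * 1000, 1.0);

    // wrap it in a host (MAIN_MEMORY) matrix -- no copy intended
    viennacl::matrix<double, viennacl::row_major>
        A(buf.data(), viennacl::MAIN_MEMORY, 1000, 1000);

    // any subsequent use of A, e.g. A += A, then reports
    // "ViennaCL: Internal memory error: not initialised!"
    A += A;
  }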


On Wed, Jul 13, 2016 at 2:01 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> So fast_copy() still copies the memory and incurs copying overhead, even
> with a MAIN_MEMORY context?
>
> Is there a way to do a shallow copy (i.e. just pointer initialization)
> into the matrix data buffer? Isn't that what some constructors of matrix
> or matrix_base do?
>
> What I am getting at: it looks like I am paying a significant overhead
> just for copying -- actually, it seems I am paying it twice -- once when
> I prepare the padding required by internal_size1()/internal_size2(), and
> again when I pass the buffer to fast_copy(), which apparently copies once
> more, even when we are using host-memory matrices.
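>
> Roughly, my current copy-in path looks like this (a sketch; sizes and
> types are simplified, the helper name is just for illustration, and the
> repacking loop stands in for what our side actually does):
>
>   #include <vector>
>   #include <viennacl/matrix.hpp>
>
>   void to_vcl(const double* src, std::size_t rows, std::size_t cols,
>               viennacl::matrix<double, viennacl::row_major>& dst)
>   {
>     // copy #1: repack the tightly packed row-major data into a buffer
>     // padded to dst.internal_size1() x dst.internal_size2()
>     std::vector<double> padded(dst.internal_size1() * dst.internal_size2(), 0.0);
>     for (std::size_t i = 0; i < rows; ++i)
>       for (std::size_t j = 0; j < cols; ++j)
>         padded[i * dst.internal_size2() + j] = src[i * cols + j];
>
>     // copy #2: fast_copy() copies the padded buffer again, even when
>     // dst lives in a MAIN_MEMORY context
>     viennacl::fast_copy(&padded[0], &padded[0] + padded.size(), dst);
>   }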
>
> All in all, by my estimate this copying back and forth (which, granted,
> is not greatly optimized on our side) takes ~15-17 seconds out of the
> 60 seconds total when multiplying 10k x 10k dense arguments via ViennaCL.
> I also compile with -march=haswell and -ffast-math; without those I seem
> to fall too far behind what R + OpenBLAS can do in this test -- my
> processing time swells to about 2 minutes without the non-IEEE-compliant
> arithmetic optimizations.
>
> If I can wrap the buffer and avoid copying for the MAIN_MEMORY context,
> I'd be shaving off another 10% or so of the execution time. That would
> make me happier, as I would probably be able to beat OpenBLAS given the
> custom CPU architecture flags.
>
> On the other hand, BIDMat (which allegedly uses MKL) does the same test,
> in double precision, in under 10 seconds. I can't fathom how, but it does.
> I have a Haswell-E platform.
>
> Thank you,
> Dmitriy
>
> On Tue, Jul 12, 2016 at 9:27 AM, Karl Rupp <r...@iue.tuwien.ac.at> wrote:
>
>> Hi,
>>
>>> One question: you mentioned padding for the `matrix` type. When I
>>> initialize the `matrix` instance, I only specify the dimensions. How do
>>> I know the padding values?
>>>
>>
>> If you want to provide your own padded dimensions, consider using
>> matrix_base directly. If you want to query the padded dimensions, use
>> internal_size1() and internal_size2() for the internal number of rows
>> and columns.
>>
>> http://viennacl.sourceforge.net/doc/manual-types.html#manual-types-matrix
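>>
>> For instance (a quick sketch, sizes illustrative):
>>
>>   viennacl::matrix<double> A(1000, 1000);       // logical size 1000 x 1000
>>   std::size_t padded_rows = A.internal_size1(); // padded number of rows
>>   std::size_t padded_cols = A.internal_size2(); // padded number of columns
>>   // a buffer handed to fast_copy() must hold padded_rows * padded_cols
>>   // entries, laid out according to the matrix' memory layout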
>>
>> Best regards,
>> Karli
>>
>>
>>
>>
>>> On Tue, Jul 12, 2016 at 5:53 AM, Karl Rupp <r...@iue.tuwien.ac.at> wrote:
>>>
>>>     Hi Dmitriy,
>>>
>>>     On 07/12/2016 07:17 AM, Dmitriy Lyubimov wrote:
>>>
>>>         Hi,
>>>
>>>         I am trying to create some elementary wrappers for VCL in
>>>         JavaCPP.
>>>
>>>         Everything goes fine, except I really would rather not use those
>>>         "cpu" types (std::map, std::vector) and would rather initialize
>>>         matrices directly by feeding them row-major or CCS data.
>>>
>>>         I see that the matrix() constructor accepts this form of
>>>         initialization, but it states that the "wrapping" is for device
>>>         memory.
>>>
>>>
>>>     Yes, the constructors either create their own memory buffer
>>>     (zero-initialized) or wrap an existing buffer. These are the only
>>>     reasonable options.
>>>
>>>
>>>         Now, I can create a host matrix() using host memory and row-major
>>>         packing. This seems to work OK.
>>>
>>>         However, these are still host instances. Can I copy host
>>>         instances to instances in an OpenCL context?
>>>
>>>
>>>     Did you look at viennacl::copy() or viennacl::fast_copy()?
>>>
>>>
>>>         That might be one way of bypassing the (in my case unnecessary)
>>>         complexity of working with the std::vector and std::map classes
>>>         from the Java side.
>>>
>>>         But it looks like there's no copy() variant that accepts a
>>>         matrix-on-host and a matrix-on-OpenCL argument (or rather, the
>>>         compiler of course declares the call ambiguous, since two
>>>         overloads fit).
>>>
>>>
>>>     If you want to copy your OpenCL data into a viennacl::matrix, you
>>>     may wrap the memory handle (obtained with .elements()) into a vector
>>>     and copy that. If you have plain host data, use
>>>     viennacl::fast_copy() and mind the data layout (padding of
>>>     rows/columns!).
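>>>
>>>     A minimal sketch of the host-data route (assuming a row_major target
>>>     matrix so that the row-major buffer layout below matches; needs
>>>     <vector> and viennacl/matrix.hpp):
>>>
>>>       viennacl::matrix<double, viennacl::row_major> M(1000, 800);
>>>       // the source buffer must already have the padded (internal) layout
>>>       std::vector<double> host(M.internal_size1() * M.internal_size2(), 0.0);
>>>       // ... fill host[i * M.internal_size2() + j] for the logical entries ...
>>>       viennacl::fast_copy(&host[0], &host[0] + host.size(), M);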
>>>
>>>
>>>         For compressed_matrix, there seems to be a set() method, but I
>>>         guess this also requires the CCS arrays to be in device memory if
>>>         I use it. Same question: is there a way to send-and-wrap CCS
>>>         arrays to an OpenCL-device instance of compressed_matrix without
>>>         using std::map?
>>>
>>>
>>>     Currently you have to use .set() if you want to bypass
>>>     viennacl::copy() and std::map.
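>>>
>>>     Roughly like this (a quick, untested sketch -- check the signature
>>>     of set() in compressed_matrix.hpp for the exact argument types; the
>>>     index arrays are assumed to be 32-bit unsigned integers):
>>>
>>>       #include <viennacl/compressed_matrix.hpp>
>>>
>>>       void fill_sparse()
>>>       {
>>>         // 3x3 matrix with 4 nonzeros, row-wise compressed storage
>>>         unsigned int row_jumper[]  = {0, 1, 3, 4};   // row start offsets
>>>         unsigned int col_indices[] = {0, 0, 2, 1};   // column of each entry
>>>         double       values[]      = {1.0, 2.0, 3.0, 4.0};
>>>
>>>         viennacl::compressed_matrix<double> S;
>>>         S.set(row_jumper, col_indices, values, 3, 3, 4);
>>>       }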
>>>
>>>     I acknowledge that the C++ type system is a pain when interfacing
>>>     from other languages. We will make this much more convenient in
>>>     ViennaCL 2.0. The existing interface in ViennaCL 1.x is too hard to
>>>     fix without breaking lots of user code, so we won't invest time in
>>>     that (contributions welcome, though :-) )
>>>
>>>     Best regards,
>>>     Karli
>>>
>>>
>>>
>>>
>>
>