Hi,

Can you share your ignite::binary::BinaryType::Read method, where the
std::vector is being read?

Also, are your strings large too, or only the vectors?

Best Regards,
Igor


On Mon, Sep 28, 2020 at 8:29 PM Brett Elliott <belli...@icr-team.com> wrote:

> Hello,
>
> TL;DR: I'm doing some profiling, and the C++ thin client is spending a lot
> of time in memset, almost as much as it spends in the socket recv call.
>
> Longer version:
>
> I was profiling a test harness that does a Get from our Ignite grid. The
> test harness is written in C++ using the thin client. I profiled the code
> using gperftools, and I found that memset (__memset_sse2 in my case) was
> taking a large portion of the execution time. I'm Get-ting a BinaryType
> which contains a std::string, an int32_t, and an array of int8_t. For my
> test case, the size of the int8_t array can vary, but I got the best
> throughput on my machine using about 100MB for the size of that array.
>
> I profiled the test code doing a single Get, and doing 8 Gets. In the 8
> Get case, the number of memset calls increased, but the percentage of
> overall time spent in memset was reduced. However, it was not reduced as
> much as I'd hoped. I was hoping that the first Get call would have a large
> memset, and the rest of the Get calls would skip it, but that's maybe not
> the case.
>
> I'm seeing almost as much time spent in memset as is being spent in recv
> (__libc_recv in this case). That seems like a lot of time spent
> initializing. I suspect that it's std::vector initialization caused by
> resize.
>
> I believe memset is being invoked by a std::vector::resize operation
> inside of ignite::binary::BinaryType::Read. I believe the source file is
> modules/platforms/cpp/binary/src/impl/binary/binary_reader_impl.cpp. In the
> code I'm looking at it's line 905. There are only two calls to resize in
> this source file, and it's the one in ReadTopObject0 which I think is the
> culprit. I didn't compile with debug symbols to confirm the particular
> resize call, but my profiler's callstack shows that resize is to blame for
> all the memset calls.
>
> Is there any way we can avoid std::vector::resize? I suspect that
> ultimately the problem is that the buffer somewhere gets passed to a socket
> recv call, and recv call takes a pointer and length. In that case, there's
> no way that I know of to use a std::vector for the buffer and avoid the
> unnecessary initialization/memset in the resize call.
>
> Could another container be used instead of a vector?
> Could the vector be reused, so on subsequent calls we don't need to resize
> it again?
> Could something like uvector (which skips initialization) be used instead?
>
> Thanks,
> Brett
>
>
>
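
For reference, the two ideas raised in the quoted message (reusing the
buffer, and a uvector-style container that skips initialization) can be
sketched roughly as below. This is only an illustration, assuming C++11 or
later; DefaultInitAllocator, RawBuffer, FakeRecv and ReadInto are hypothetical
names that are not part of Ignite, and FakeRecv merely stands in for a real
socket read. Whether the Ignite reader could accept a vector with a custom
allocator is not something this sketch addresses.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical allocator adaptor (not part of Ignite). An argument-less
// construct() default-initializes instead of value-initializing, so resize()
// on a vector of trivial types leaves new elements uninitialized (no memset).
template <typename T, typename A = std::allocator<T>>
class DefaultInitAllocator : public A {
    using Traits = std::allocator_traits<A>;

public:
    template <typename U>
    struct rebind {
        using other =
            DefaultInitAllocator<U, typename Traits::template rebind_alloc<U>>;
    };

    using A::A;

    // Used by resize() for each new element: default-init, no zero fill.
    template <typename U>
    void construct(U* ptr) noexcept(
        std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(ptr)) U;
    }

    // Every other construction is forwarded to the underlying allocator.
    template <typename U, typename... Args>
    void construct(U* ptr, Args&&... args) {
        Traits::construct(static_cast<A&>(*this), ptr,
                          std::forward<Args>(args)...);
    }
};

using RawBuffer = std::vector<std::int8_t, DefaultInitAllocator<std::int8_t>>;

// Hypothetical stand-in for a socket read: writes `len` bytes to `dst`.
void FakeRecv(std::int8_t* dst, std::size_t len) {
    std::memset(dst, 0x2A, len);
}

// Reuses `buf` across calls. Capacity only grows, so a later read of the
// same or a smaller size neither reallocates nor zero-fills.
void ReadInto(RawBuffer& buf, std::size_t len) {
    buf.resize(len);            // no memset with DefaultInitAllocator
    FakeRecv(buf.data(), len);  // every byte is overwritten here anyway
}
```

With this, a caller would keep one RawBuffer alive and pass it to ReadInto
for every Get, so only the first call (or a larger payload) pays for an
allocation. The trade-off is that the bytes are indeterminate until the read
fills them, so the code must never read past what was actually received.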
