Re: IGNITE-13

Вадим Опольский Thu, 02 Mar 2017 13:28:36 -0800

Hi Valentin!

I've created:


new method strToUtf8BytesDirect in BinaryUtilsNew
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java

new method doWriteStringDirect in BinaryWriterExImplNew
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java

benchmarks for BinaryWriterExImpl doWriteString and
BinaryWriterExImplNew doWriteStringDirect
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
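
For readers who do not want to open the repository, here is a minimal
sketch of the difference between the two approaches. It is not the code
from the links above: the class and method names are hypothetical, a plain
java.io.ByteArrayOutputStream stands in for BinaryOutputStream, and the
encoder is standard UTF-8 for well-formed strings, which is not claimed to
match BinaryUtils#strToUtf8Bytes byte for byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; ByteArrayOutputStream stands in for BinaryOutputStream.
class DirectUtf8Sketch {
    /** Two-step approach: encode into a temporary array, then copy it into the stream. */
    public static void writeViaArray(String val, ByteArrayOutputStream out) {
        byte[] arr = val.getBytes(StandardCharsets.UTF_8); // intermediate array
        out.write(arr, 0, arr.length);                     // extra copy into the stream
    }

    /** Direct approach: emit every encoded byte straight into the stream. */
    public static void writeDirect(String val, ByteArrayOutputStream out) {
        // Assumes well-formed input (no unpaired surrogates).
        for (int i = 0; i < val.length(); ) {
            int cp = val.codePointAt(i);
            i += Character.charCount(cp);

            if (cp < 0x80) {
                out.write(cp);                         // 1 byte
            } else if (cp < 0x800) {
                out.write(0xC0 | (cp >> 6));           // 2 bytes
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                out.write(0xE0 | (cp >> 12));          // 3 bytes
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                out.write(0xF0 | (cp >> 18));          // 4 bytes
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream viaArray = new ByteArrayOutputStream();
        ByteArrayOutputStream direct = new ByteArrayOutputStream();

        writeViaArray("Test", viaArray);
        writeDirect("Test", direct);

        // Both approaches should produce identical bytes.
        System.out.println(Arrays.equals(viaArray.toByteArray(), direct.toByteArray()));
    }
}
```

The only point of the sketch is that the direct variant never allocates
the temporary byte[] and never performs the final array copy.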

This is the result of the comparison:

Benchmark                                   Mode  Cnt        Score       Error  Units
ExampleTest.binaryHeapOutputStreamDirect    avgt   50  1128448,743 ± 13536,689  ns/op
ExampleTest.binaryHeapOutputStreamInDirect  avgt   50  1127270,695 ± 17309,256  ns/op
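
The two scores differ by roughly 1178 ns/op, which is much less than
either error value. A small sketch of that check, treating each "± Error"
as the half-width of an interval around the score (the class and method
names are hypothetical, and this is only a rough comparison, not a
statistical test):

```java
// Hypothetical helper: checks whether two "score ± error" intervals overlap.
class ScoreIntervalSketch {
    /** True if [a - errA, a + errA] and [b - errB, b + errB] share at least one point. */
    public static boolean overlaps(double a, double errA, double b, double errB) {
        return Math.abs(a - b) <= errA + errB;
    }

    public static void main(String[] args) {
        // Values copied from the result table above (ns/op).
        double direct = 1128448.743, directErr = 13536.689;
        double inDirect = 1127270.695, inDirectErr = 17309.256;

        System.out.println("difference (ns/op): " + (direct - inDirect));
        System.out.println("intervals overlap: "
            + overlaps(direct, directErr, inDirect, inDirectErr));
    }
}
```

When the intervals overlap like this, the measurement alone does not show
that one variant is faster than the other.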

Vadim

2017-03-02 1:02 GMT+03:00 Valentin Kulichenko <[email protected]>:

> Hi Vadim,
>
> We're getting closer :) I would actually like to see the test for actual
> implementation of BinaryWriterExImpl#doWriteString method. Logic in
> binaryHeapOutputInDirect() confuses me a bit and I'm not sure comparison is
> valid.
>
> Can you please do the following:
>
> 1. Create new BinaryUtils#strToUtf8BytesDirect method, copy-paste the
> code from existing BinaryUtils#strToUtf8Bytes and modify it so that it
> takes BinaryOutputStream as an argument and writes to it directly. Do not
> create stream inside this method, as it's the same as creating new array.
> 2. Create new BinaryWriterExImpl#doWriteStringDirect, copy-paste the code
> from existing BinaryWriterExImpl#doWriteString and modify it so that it
> uses BinaryUtils#strToUtf8BytesDirect and doesn't call out.writeByteArray.
> 3. Create benchmark for BinaryWriterExImpl#doWriteString method. I.e.,
> create an instance of BinaryWriterExImpl and call doWriteString() in
> benchmark method.
> 4. Similarly, create benchmark for BinaryWriterExImpl#doWriteStringDirect.
> 5. Compare results.
>
> This will give us clear picture of how these two approaches perform. Your
> current results are actually promising, but I would like to confirm them.
>
> -Val
>
> On Wed, Mar 1, 2017 at 6:17 AM, Вадим Опольский <[email protected]>
> wrote:
>
>> Hi Valentin!
>>
>> Thank you for comments.
>>
>> There is a new method which writes directly to BinaryOutputStream instead
>> of intermediate array.
>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
>>
>> There is benchmark.
>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java
>>
>> Unit test
>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java
>>
>> Statistics
>> https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt
>>
>> Benchmark                                 Mode  Cnt    Score   Error  Units
>> MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
>> MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op
>>
>>
>> Vadim
>>
>> 2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <
>> [email protected]>:
>>
>>> Hi Vadim,
>>>
>>> Looks like you accidentally removed dev list from the thread, adding it
>>> back.
>>>
>>> I think there is still a misunderstanding. What I propose is to modify
>>> BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream
>>> instead of an intermediate array. This should decrease memory consumption
>>> and can also increase performance, as we will avoid the 'writeByteArray'
>>> step at the end.
>>>
>>> Does it make sense to you?
>>>
>>> -Val
>>>
>>> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <[email protected]>
>>> wrote:
>>>
>>>> Hi, Valentin!
>>>>
>>>> What do you think about using the methods of BinaryOutputStream:
>>>>
>>>> 1) writeByteArray(byte[] val)
>>>> 2) writeCharArray(char[] val)
>>>> 3) write (byte[] arr, int off, int len)
>>>>
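>>>> // Needs: java.io.ByteArrayInputStream, java.io.InputStream,
>>>> // java.nio.charset.StandardCharsets; "out" is a BinaryOutputStream.
>>>>
>>>> // Option 1: encode the whole string, then write the byte array.
>>>> String val = "Test";
>>>> out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
>>>>
>>>> // Option 2: write the characters without encoding them.
>>>> String val = "Test";
>>>> out.writeCharArray(val.toCharArray());
>>>>
>>>> // Option 3: copy the encoded bytes through a fixed-size buffer.
>>>> String val = "Test";
>>>> InputStream stream = new ByteArrayInputStream(
>>>>     val.getBytes(StandardCharsets.UTF_8));
>>>> byte[] buffer = new byte[1024];
>>>> int len;
>>>> while ((len = stream.read(buffer)) != -1) {
>>>>     out.write(buffer, 0, len); // write only the bytes actually read
>>>> }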
>>>>
>>>> What else can we use ?
>>>>
>>>> Vadim
>>>>
>>>>
>>>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>>>> [email protected]>:
>>>>
>>>>> Hi Vadim,
>>>>>
>>>>> Which method implements the approach described in the ticket? From
>>>>> what I see, all writeToStringX versions still encode into an
>>>>> intermediate array and then call out.writeByteArray. What we need to test
>>>>> is the approach where bytes are written directly into the stream during
>>>>> encoding. The encoding algorithm itself should stay the same for now;
>>>>> otherwise we will not know how to interpret the result.
>>>>>
>>>>> It looks like there is some misunderstanding here, so please let me
>>>>> know if anything is still unclear. I will be happy to answer your questions.
>>>>>
>>>>> -Val
>>>>>
>>>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Vadim,
>>>>>>
>>>>>> Thanks, I will review this week.
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi Valentin!
>>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>
>>>>>>> I created BinaryWriterExImplNew (which extends BinaryWriterExImpl) and
>>>>>>> added new methods with the changes described in the ticket
>>>>>>>
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>>>
>>>>>>> I created a benchmark for BinaryWriterExImplNew
>>>>>>>
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>>>
>>>>>>> I ran the benchmark and compared the results
>>>>>>>
>>>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>>>
>>>>>>> # Run complete. Total time: 00:10:24
>>>>>>> Benchmark                                    Mode  Cnt        Score       Error  Units
>>>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>>>
>>>>>>> Is it OK? What's the next step? Do I have to move this JMH benchmark
>>>>>>> to the Ignite project?
>>>>>>>
>>>>>>> Vadim Opolski
>>>>>>>
>>>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>>>> [email protected]>:
>>>>>>>
>>>>>>>> Hi Vadim,
>>>>>>>>
>>>>>>>> I'm not sure I understand your benchmarks and how they verify the
>>>>>>>> optimization discussed here. Basically, here is what needs to be done:
>>>>>>>>
>>>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>>>> 2. Run the benchmark with current implementation.
>>>>>>>> 3. Make the change described in the ticket.
>>>>>>>> 4. Run the benchmark with these changes.
>>>>>>>> 5. Compare results.
>>>>>>>>
>>>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>>>
>>>>>>>> -Val
>>>>>>>>
>>>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hello everybody!
>>>>>>>>>
>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>
>>>>>>>>> Valentin, I have just finished the benchmark (with JMH) -
>>>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>>>
>>>>>>>>> It collects data on serialization time.
>>>>>>>>>
>>>>>>>>> For instance - https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>>>
>>>>>>>>> To run it, do the following:
>>>>>>>>>
>>>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>>>
>>>>>>>>> 2) install it - mvn install
>>>>>>>>>
>>>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar target\benchmarks.jar
>>>>>>>>>
>>>>>>>>> Vadim Opolski
>>>>>>>>>
>>>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>>>> [email protected]>:
>>>>>>>>>
>>>>>>>>>> Vladimir,
>>>>>>>>>>
>>>>>>>>>> I think we misunderstood each other. My understanding of this
>>>>>>>>>> optimization is the following.
>>>>>>>>>>
>>>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>>>
>>>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val); // Encode string into byte array.
>>>>>>>>>> out.writeByteArray(strArr);               // Write byte array into stream.
>>>>>>>>>>
>>>>>>>>>> What this ticket suggests is to write directly into the stream while
>>>>>>>>>> the string is encoded, without an intermediate array. This both reduces
>>>>>>>>>> memory consumption and eliminates the array copy step.
>>>>>>>>>>
>>>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>>>
>>>>>>>>>> Vadim, can you create a micro benchmark and check if it gives any
>>>>>>>>>> improvement?
>>>>>>>>>>
>>>>>>>>>> -Val
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> It is hard to say whether it makes sense or not. No doubt, it
>>>>>>>>>>> could speed up the marshalling process at the cost of 2x memory
>>>>>>>>>>> required for strings. From my previous experience with marshalling
>>>>>>>>>>> micro-optimizations, we will hardly ever notice a speedup in a
>>>>>>>>>>> distributed environment.
>>>>>>>>>>>
>>>>>>>>>>> But there is another side: it could speed up our queries,
>>>>>>>>>>> because we will not have to unmarshal a string on every field access.
>>>>>>>>>>> So I would try to make this optimization optional and then measure query
>>>>>>>>>>> performance with classes having lots of strings. It could give us
>>>>>>>>>>> interesting results.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Vladimir,
>>>>>>>>>>>>
>>>>>>>>>>>> Can you please take a look and provide your thoughts? Can this
>>>>>>>>>>>> be applied to the binary marshaller? From what I recall, it serializes
>>>>>>>>>>>> strings a bit differently from the optimized marshaller, so I'm not sure.
>>>>>>>>>>>>
>>>>>>>>>>>> -Val
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > I don't think it makes much sense to invest into OptimizedMarshaller.
>>>>>>>>>>>>> > However, I would check if this optimization is applicable to
>>>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it.
>>>>>>>>>>>>> >
>>>>>>>>>>>>>
>>>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > -Val
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <[email protected]> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>>>> > >
>>>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>>>> > >
>>>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>>>> > >
>>>>>>>>>>>>> > > Vadim Opolski
>>>>>>>>>>>>> > >
>>>>>>>>>>>>> >
>>>>>>>>>>>>>
