This perf testing is definitely a good way to approach things, Harbs. I can't reconcile it with my experience from the past, but I was using far more 'rudimentary' tests using times for simple loops over much larger ranges, and using the Date class for timing (precision was one reason why I used much larger loops)
I have to concede for sure that the TypedArray approach is slower. I could have sworn it was faster when I did comparisons as I was working on this (which was quite some time ago now). It may have been a result of the way I was 'testing' as well perhaps. I don't expect things to change I guess, but it would theoretically be better testing with the compiled version of BinaryData. Things like the function call overhead from getTypedArray() vs. referencing something directly and the repeated_position++ versus one _position op to increment (e.g. for DataView) in some cases mean there is more work at the byte level. But I don't expect this to change things based on these tests. Maybe I can troubleshoot the MD5 thing in the coming weeks. One of the other things I wondered about for BinaryData is whether it should have it's own 'autogrow' strategy internally for the buffer with sequential writes. This is not really PAYG though, so I guess not. It definitely performs much better on writing with a preallocated buffer length, so maybe that is the recommended approach for users. On Mon, Jun 25, 2018 at 8:45 AM Harbs <[email protected]> wrote: > I switched float and double functions to always use DataView. If we could > eliminate DataView instantiation we could probably improve performance with > that. I tried a single instance, but I got bounds errors. > > All read and write functions should work correctly now (the DM5 error > not-withstanding). There should be test cases to cover everything. (Unless > I missed some edge cases.) It’s possible that the float methods could be > improved with math, but I’m done for now. If anyone wants to improve things > further, feel free… > > Harbs > > > On Jun 24, 2018, at 4:46 AM, Greg Dove <[email protected]> wrote: > > > > I think DataView might be a lot faster than it was a couple of years ago. > > At least it seems to be (on Chrome/windows on the machine I am currently > > using) (and it should only need a single instance like the Uint8Array, > for > > the current buffer, I think, instead of 'new DataView' ) > > > > On Sun, Jun 24, 2018 at 1:19 PM Greg Dove <[email protected]> wrote: > > > >> > >> I tried to add in the extra test, but messed up I think because it did > not > >> copy over the original tests in the revision (even though I was > 'adding' a > >> new test). I have not used jsperf before, so probably did something > stupid. > >> On the face of it, it certainly does look like the byte level stuff is > >> faster. > >> > >> > >> On Sun, Jun 24, 2018 at 10:32 AM Greg Dove <[email protected]> wrote: > >> > >>> > >>> Harbs, for the typed array test, try using the ArrayBuffer in the > >>> constructor instead of the Uint8Array, I think you might see a > difference > >>> here. > >>> > >>> > >>> On Sun, Jun 24, 2018 at 10:09 AM Harbs <[email protected]> wrote: > >>> > >>>> I created a jsperf which compares different read methods, and the one > I > >>>> changed it to seems fastest by far: > >>>> https://jsperf.com/typedarray-read-performance/1 < > >>>> https://jsperf.com/typedarray-read-performance/1> > >>>> > >>>> The only optimization which might give slight gains is to use shift > >>>> operators instead of multiplication. Enabling the > --enable-benchmarking > >>>> flag in Chrome seemed to indicate that there’s up to a 10% performance > >>>> increase of the shift operators over multiplication. Logically, shift > >>>> operators should be cheaper. In Safari, the shift operator gave a > >>>> performance boost of 3 times the multiplication. > >>>> > >>>> I only tested 16 bit uints. It might be worthwhile adding ints and 32 > >>>> bit. > >>>> > >>>> Thanks, > >>>> Harbs > >>>> > >>>>> On Jun 22, 2018, at 8:58 AM, Alex Harui <[email protected]> > >>>> wrote: > >>>>> > >>>>> It is definitely true that I don't know how expensive instantiation > of > >>>> small instances is in the browser. Might even be different in > different > >>>> browsers. There should also be function call overhead in calling the > >>>> constructor as well. On the other hand, maybe using the typed arrays > helps > >>>> with the browser's type inferencing and there is some advantage there > that > >>>> also assumes the equivalent of word/long alignments. > >>>>> > >>>>> If TypeArrays really aren't that expensive to instantiate, it may be > >>>> that the way to handle floats/doubles is to copy bytes into a new > >>>> BinaryData and convert that to the appropriate TypedArray. That > should > >>>> ensure that there won't be alignment issues. > >>>>> > >>>>> IMO, this discussion shows the importance of providing choices. You > >>>> can have a much simpler variant that assumes certain endian and thus > save a > >>>> whole bunch of byte swapping. You might in fact have a faster > >>>> implementation if you can guarantee/assume alignment. > >>>>> > >>>>> HTH, > >>>>> -Alex > >>>>> > >>>>> On 6/21/18, 3:01 PM, "Greg Dove" <[email protected]> wrote: > >>>>> > >>>>> ' If you have the time to resolve these remaining issues, that > >>>> would be > >>>>> awesome.' > >>>>> > >>>>> Happy to commit to doing this by mid-July if that works. I do have > >>>> downtime > >>>>> in July that I can use for Royale, but I also need to set up for > >>>> Royale > >>>>> development again because I have changed machines. > >>>>> > >>>>> > >>>>> On Fri, Jun 22, 2018 at 9:13 AM Harbs <[email protected]> > >>>> wrote: > >>>>> > >>>>>> Cool. > >>>>>> > >>>>>> FYI there are still some issues to resolve: > >>>>>> > >>>>>> 1. readFloat and readDouble should have the same issues as the > >>>> various int > >>>>>> read methods. I did not fix that yet because calculating floats and > >>>> doubles > >>>>>> is harder than ints. I’m not sure of the best way to handle that. > One > >>>> way > >>>>>> is to create a new temporary ArrayBuffer with only the necessary > >>>> bytes and > >>>>>> read that. Another would be to use DataViews for floats and > doubles. A > >>>>>> third would be to actually do the math for floating point numbers… > >>>>>> > >>>>>> 2. I did not fix any of the write methods. They should have the same > >>>> issue > >>>>>> as the read methods. > >>>>>> > >>>>>> If you have the time to resolve these remaining issues, that would > be > >>>>>> awesome. :-) > >>>>>> > >>>>>> Harbs > >>>>>> > >>>>>>> On Jun 21, 2018, at 11:55 PM, Greg Dove <[email protected]> > wrote: > >>>>>>> > >>>>>>> Thanks Harbs - that short explanation makes it very clear. Thanks > >>>> for > >>>>>>> fixing my broken implementation :). I am guilty of not covering > 'real > >>>>>>> world' combinations in the tests, which was what I used to guide > >>>>>>> development when I was working on this originally. > >>>>>>> > >>>>>>> I am pretty sure I did have also have an interim implementation > >>>> doing the > >>>>>>> bytewise reads/writes and dealing with endian-ness branching along > >>>> the > >>>>>> way, > >>>>>>> then I switched to using the typed arrays when I saw a performance > >>>> boost. > >>>>>>> I wish I had written something like your new test before I did that > >>>> :). > >>>>>>> I did a quick search for this and read up on the issue about the > >>>> offset. > >>>>>> I > >>>>>>> guess that also partly explains why the typed array performance is > >>>>>> 'faster' > >>>>>>> (possibly due to the alignment I mean). And I did not do memory > >>>>>> profiling, > >>>>>>> so maybe Alex is right - there could also have been a massive GC > dump > >>>>>> after > >>>>>>> my large loop tests, something that I was not seeing because I was > >>>> only > >>>>>>> focused on speed at the time. > >>>>>>> > >>>>>>> I had to add the 'sysEndian' flag approach internally when I > >>>> switched to > >>>>>>> using the typed arrays (and only reordering the result if it was > >>>>>> different > >>>>>>> to the 'native' endian used in the typed arrays). If everything is > >>>> based > >>>>>> on > >>>>>>> bytewise approach now, that flag should no longer be needed I think > >>>> (you > >>>>>>> may already have done that, I did not look at your changes in > detail > >>>> yet > >>>>>> - > >>>>>>> will check them out this weekend). > >>>>>>> > >>>>>>> Obviously having this working correctly comes first, but with a > >>>>>>> comprehensive set of tests, we do have the ability to more easily > >>>> respond > >>>>>>> to changes in the browser engines over time and update the > >>>> implementation > >>>>>>> if, for example, DataView becomes faster across the board. I did > see > >>>> it > >>>>>>> working faster in one of the browsers I tested in, but I am not > >>>> confident > >>>>>>> that I kept a good record of all this... > >>>>>>> > >>>>>>> I will look in more detail this weekend, but for now it sounds like > >>>> you > >>>>>>> have resolved the issue - well done! I hope I can get into some > more > >>>> work > >>>>>>> on Royale again soon. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On Thu, 21 Jun 2018, 21:04 Harbs, <[email protected]> wrote: > >>>>>>> > >>>>>>>> I just added a unit test for this. > >>>>>>>> > >>>>>>>> The test will fail with a RTE using the old implementation. > >>>>>>>> > >>>>>>>> Explanation: After reading the first byte, the position is 1 which > >>>>>> doesn’t > >>>>>>>> align with a 16IntArray. > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Harbs > >>>>>>>> > >>>>>>>>> On Jun 21, 2018, at 1:48 AM, Greg Dove <[email protected]> > >>>> wrote: > >>>>>>>>> > >>>>>>>>> I am not sure what the int16/int-aligned issue is yet (I have not > >>>> read > >>>>>>>> back > >>>>>>>>> through all the relevant history) but if we can set up a unit > test > >>>> for > >>>>>>>> it, > >>>>>>>>> this stuff all becomes much clearer I think. > >>>>>>>> > >>>>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> > >>>> > >>>> > >
