Re: FastSaxSerializer::write ...
Hi, On Wed, Jan 20, 2016 at 11:54:41AM -0600, Norbert Thiebaud wrote: > On Wed, Jan 20, 2016 at 9:37 AM, David Tardonwrote: > > Hi, > > > > On Mon, Jan 18, 2016 at 12:53:07PM +1100, Chris Sherlock wrote: > >> I think a unit test might be helpful. They are actually quite easy to > >> write - in fact, I wrote a very simple one the other day. > >> > >> Have a look on master at vcl/qa/cppunit/font.cxx > > > > Note that you don't have to create a whole new CppunitTest every time. > > You can add new source files to an existing one. You just have to make > > sure that only one of the sources contains CPPUNIT_PLUGIN_IMPLEMENT(), > > otherwise you'd get a linker error. > > > > Speaking about excessively granular tests: would anyone protest against > > an Easy Hack to merge the tests in sal module into bigger groups, e.g., > > just one test for osl, one for rtl and another one for sal? I.e., > > reduction of the number of CppunitTest makefiles from 33 to 3. > > Yeah but in general merging create trouble with parallelism > if you have only 3 unit test, then that can use only 3 cores > > This is particularly important for the larger one later, which > consistently add long tail to the build, period of 30s to 2 minutes > where the build is stuck on waiting for one or two tests to finish I understand that. And I remember that some of the big import/export tests have been artificially split for that reason (I was among those who complained about too long running times). But we are not talking about long running import/export tests here. We are talking about "real" unit tests, testing a single class--with no heavyweight setup, no UNO etc. IMHO a reasonable granularity for those is at the level of subdirs. Possibly even whole modules if there's just a few of them ATM. D. ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: FastSaxSerializer::write ...
On 01/20/2016 09:25 PM, Michael Meeks wrote: Amusingly, while the sal/ tests -look- good ;-) many of them are simply not run because they fail to register the actual tests - which is not ideal (or did that get fixed?); At least cases where a test function is not added with CPPUNIT_TEST should be found by loplugin:unreffun (and it did find some back when it got introduced). ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: FastSaxSerializer::write ...
> On 21 Jan 2016, at 2:37 AM, David Tardonwrote: > > Hi, > > On Mon, Jan 18, 2016 at 12:53:07PM +1100, Chris Sherlock wrote: >> I think a unit test might be helpful. They are actually quite easy to write >> - in fact, I wrote a very simple one the other day. >> >> Have a look on master at vcl/qa/cppunit/font.cxx > > Note that you don't have to create a whole new CppunitTest every time. > You can add new source files to an existing one. You just have to make > sure that only one of the sources contains CPPUNIT_PLUGIN_IMPLEMENT(), > otherwise you'd get a linker error. > > Speaking about excessively granular tests: would anyone protest against > an Easy Hack to merge the tests in sal module into bigger groups, e.g., > just one test for osl, one for rtl and another one for sal? I.e., > reduction of the number of CppunitTest makefiles from 33 to 3. > > D. That sounds like it would make managing the unit tests a lot easier. Sure, why not? Chris ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: FastSaxSerializer::write ...
Hi, On Mon, Jan 18, 2016 at 12:53:07PM +1100, Chris Sherlock wrote: > I think a unit test might be helpful. They are actually quite easy to write - > in fact, I wrote a very simple one the other day. > > Have a look on master at vcl/qa/cppunit/font.cxx Note that you don't have to create a whole new CppunitTest every time. You can add new source files to an existing one. You just have to make sure that only one of the sources contains CPPUNIT_PLUGIN_IMPLEMENT(), otherwise you'd get a linker error. Speaking about excessively granular tests: would anyone protest against an Easy Hack to merge the tests in sal module into bigger groups, e.g., just one test for osl, one for rtl and another one for sal? I.e., reduction of the number of CppunitTest makefiles from 33 to 3. D. ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: FastSaxSerializer::write ...
On Wed, 2016-01-20 at 11:54 -0600, Norbert Thiebaud wrote: > > Speaking about excessively granular tests: would anyone protest against > > an Easy Hack to merge the tests in sal module into bigger groups, e.g., > > Yeah but in general merging create trouble with parallelism > if you have only 3 unit test, then that can use only 3 cores Of course; there is a happy medium here - but sal/ takes it to the extreme here: > (1) of course too small tests are not good either because then the > set-up tear-down overhead dominate.. which is not good for perf > either. No I do not know the magic optimal value, but it is something > to keep in mind: merging a bunch of test into one is not all positive Amusingly, while the sal/ tests -look- good ;-) many of them are simply not run because they fail to register the actual tests - which is not ideal (or did that get fixed?); I guess now we have the CI infra - the easy hack to fix that becomes even more of an easy hack - I recall filing it as one; but can't find it just now ... odd. I forget which macro it is necessary to add to nail that. ATB, Michael. -- michael.me...@collabora.com <><, Pseudo Engineer, itinerant idiot ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: FastSaxSerializer::write ...
On Wed, Jan 20, 2016 at 9:37 AM, David Tardonwrote: > Hi, > > On Mon, Jan 18, 2016 at 12:53:07PM +1100, Chris Sherlock wrote: >> I think a unit test might be helpful. They are actually quite easy to write >> - in fact, I wrote a very simple one the other day. >> >> Have a look on master at vcl/qa/cppunit/font.cxx > > Note that you don't have to create a whole new CppunitTest every time. > You can add new source files to an existing one. You just have to make > sure that only one of the sources contains CPPUNIT_PLUGIN_IMPLEMENT(), > otherwise you'd get a linker error. > > Speaking about excessively granular tests: would anyone protest against > an Easy Hack to merge the tests in sal module into bigger groups, e.g., > just one test for osl, one for rtl and another one for sal? I.e., > reduction of the number of CppunitTest makefiles from 33 to 3. Yeah but in general merging create trouble with parallelism if you have only 3 unit test, then that can use only 3 cores This is particularly important for the larger one later, which consistently add long tail to the build, period of 30s to 2 minutes where the build is stuck on waiting for one or two tests to finish From a build wall-time point of view reasonably small and balanced execution time for tests is important for the overall performances of the big boxes. (1) Norbert (1) of course too small tests are not good either because then the set-up tear-down overhead dominate.. which is not good for perf either. No I do not know the magic optimal value, but it is something to keep in mind: merging a bunch of test into one is not all positive ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: FastSaxSerializer::write ...
Hi Mark, On Sat, 2016-01-16 at 21:36 +0800, Mark Hung wrote: > I'd look into performance issue. Thanks =) should be an easy tweak there. > Is there any benchmark or unit test that I can use to check > performance enhancement ? Looking at a profile quickly; it seems that I get about 300k calls to this method while saving - 140m pcycles - which to be fair is 0.1% of save time =) so in fact not a big issue now I look more carefully [ at least for XLSX. DOCX etc. may use it more, but there is far less XML content there ]. > And what do you mean by regression test for the characters? Ah - that is simple; just create a document, add a character string that hits this code-path, save as XLSX (using fast-parser) and re-load it, and check that the string is the same. sc/qa/unit/subsequent_export-test.cxx Is a great piece of code to read for that =) but of course calc focused; more ideally we'd test just the FastSerializer =) Thanks ! Michael. -- michael.me...@collabora.com <><, Pseudo Engineer, itinerant idiot ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: FastSaxSerializer::write ...
I think a unit test might be helpful. They are actually quite easy to write - in fact, I wrote a very simple one the other day. Have a look on master at vcl/qa/cppunit/font.cxx I’m trying to write a unit test for every low level component I change or refactor, I’ve found that it doesn’t take much time to do and I’m trying to do it if only to ensure that any code I modify doesn’t break unexpectedly when I make changes further down the line. Let me know if you get stuck :-) Chris > On 17 Jan 2016, at 12:36 AM, Mark Hung <mark...@gmail.com> wrote: > > Hi Michael, > > I'd look into performance issue. > Is there any benchmark or unit test that I can use to check performance > enhancement? > And what do you mean by regression test for the characters? > > > 2016-01-16 19:33 GMT+08:00 Michael Meeks <michael.me...@collabora.com > <mailto:michael.me...@collabora.com>>: > Hi Mark, > > Great to see: > > commit e99f22bbc499ab0566621ee0bb01e4a7747efe76 > Author: Mark Hung <mark...@gmail.com <mailto:mark...@gmail.com>> > Date: Sun Jan 10 00:28:14 2016 +0800 > > Fix FastSaxSerializer::write() for non-BMP unicode characters. > > Clearly we don't want to mangle UTF-16 etc. characters - and the code > looks dodgy indeed for that =) > > I'm somewhat suspicious though that this will cause some unhelpful > performance regression; and I was wondering if you could look into > fixing that ? and (ideally) helping out with a regression test for these > characters ? =) > > Memory allocation of OStrings etc. via rtl/sal is rather expensive in > most of the profiles I've seen; and particularly doing it for every > attribute ;-) > > Would it be possible to replace the original code - but - as soon as > we > hit a non-ASCII character to convert the rest of the string, and write > that - so something like: > > write (OString(sOutput[i], nLength-i, RTL_TEXTENCODING_UTF8), > bEscape); > return; > > So we can get the character conversion and pairing correct - while > keeping the lack of allocation there ? =) > > Anyhow - thanks muchly for the fix ! great to see complex text support > improve. > > All the best, > > Michael. > > -- > michael.me...@collabora.com <mailto:michael.me...@collabora.com> <><, > Pseudo Engineer, itinerant idiot > > > > > -- > Mark Hung > ___ > LibreOffice mailing list > LibreOffice@lists.freedesktop.org <mailto:LibreOffice@lists.freedesktop.org> > http://lists.freedesktop.org/mailman/listinfo/libreoffice > <http://lists.freedesktop.org/mailman/listinfo/libreoffice> ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
FastSaxSerializer::write ...
Hi Mark, Great to see: commit e99f22bbc499ab0566621ee0bb01e4a7747efe76 Author: Mark Hung <mark...@gmail.com> Date: Sun Jan 10 00:28:14 2016 +0800 Fix FastSaxSerializer::write() for non-BMP unicode characters. Clearly we don't want to mangle UTF-16 etc. characters - and the code looks dodgy indeed for that =) I'm somewhat suspicious though that this will cause some unhelpful performance regression; and I was wondering if you could look into fixing that ? and (ideally) helping out with a regression test for these characters ? =) Memory allocation of OStrings etc. via rtl/sal is rather expensive in most of the profiles I've seen; and particularly doing it for every attribute ;-) Would it be possible to replace the original code - but - as soon as we hit a non-ASCII character to convert the rest of the string, and write that - so something like: write (OString(sOutput[i], nLength-i, RTL_TEXTENCODING_UTF8), bEscape); return; So we can get the character conversion and pairing correct - while keeping the lack of allocation there ? =) Anyhow - thanks muchly for the fix ! great to see complex text support improve. All the best, Michael. -- michael.me...@collabora.com <><, Pseudo Engineer, itinerant idiot ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice
Re: FastSaxSerializer::write ...
Hi Michael, I'd look into performance issue. Is there any benchmark or unit test that I can use to check performance enhancement? And what do you mean by regression test for the characters? 2016-01-16 19:33 GMT+08:00 Michael Meeks <michael.me...@collabora.com>: > Hi Mark, > > Great to see: > > commit e99f22bbc499ab0566621ee0bb01e4a7747efe76 > Author: Mark Hung <mark...@gmail.com> > Date: Sun Jan 10 00:28:14 2016 +0800 > > Fix FastSaxSerializer::write() for non-BMP unicode characters. > > Clearly we don't want to mangle UTF-16 etc. characters - and the > code > looks dodgy indeed for that =) > > I'm somewhat suspicious though that this will cause some unhelpful > performance regression; and I was wondering if you could look into > fixing that ? and (ideally) helping out with a regression test for these > characters ? =) > > Memory allocation of OStrings etc. via rtl/sal is rather expensive > in > most of the profiles I've seen; and particularly doing it for every > attribute ;-) > > Would it be possible to replace the original code - but - as soon > as we > hit a non-ASCII character to convert the rest of the string, and write > that - so something like: > > write (OString(sOutput[i], nLength-i, RTL_TEXTENCODING_UTF8), > bEscape); > return; > > So we can get the character conversion and pairing correct - while > keeping the lack of allocation there ? =) > > Anyhow - thanks muchly for the fix ! great to see complex text > support > improve. > > All the best, > > Michael. > > -- > michael.me...@collabora.com <><, Pseudo Engineer, itinerant idiot > > -- Mark Hung ___ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice