Re: FastSaxSerializer::write ...

2016-01-21 Thread David Tardon
Hi,

On Wed, Jan 20, 2016 at 11:54:41AM -0600, Norbert Thiebaud wrote:
> On Wed, Jan 20, 2016 at 9:37 AM, David Tardon  wrote:
> > Hi,
> >
> > On Mon, Jan 18, 2016 at 12:53:07PM +1100, Chris Sherlock wrote:
> >> I think a unit test might be helpful. They are actually quite easy to 
> >> write - in fact, I wrote a very simple one the other day.
> >>
> >> Have a look on master at vcl/qa/cppunit/font.cxx
> >
> > Note that you don't have to create a whole new CppunitTest every time.
> > You can add new source files to an existing one. You just have to make
> > sure that only one of the sources contains CPPUNIT_PLUGIN_IMPLEMENT(),
> > otherwise you'd get a linker error.
> >
> > Speaking about excessively granular tests: would anyone protest against
> > an Easy Hack to merge the tests in sal module into bigger groups, e.g.,
> > just one test for osl, one for rtl and another one for sal? I.e.,
> > reduction of the number of CppunitTest makefiles from 33 to 3.
> 
> Yeah but in general merging create trouble with parallelism
> if you have only 3 unit test, then that can use only 3 cores
> 
> This is particularly important for the larger one later, which
> consistently add long tail to the build, period of 30s to 2 minutes
> where the build is stuck on waiting for one or two tests to finish

I understand that. And I remember that some of the big import/export
tests have been artificially split for that reason (I was among those
who complained about too long running times). But we are not talking
about long running import/export tests here. We are talking about "real"
unit tests, testing a single class--with no heavyweight setup, no UNO
etc. IMHO a reasonable granularity for those is at the level of subdirs.
Possibly even whole modules if there's just a few of them ATM.

D.
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: FastSaxSerializer::write ...

2016-01-21 Thread Stephan Bergmann

On 01/20/2016 09:25 PM, Michael Meeks wrote:

Amusingly, while the sal/ tests -look- good ;-) many of them are simply
not run because they fail to register the actual tests - which is not
ideal (or did that get fixed?);


At least cases where a test function is not added with CPPUNIT_TEST 
should be found by loplugin:unreffun (and it did find some back when it 
got introduced).

___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: FastSaxSerializer::write ...

2016-01-20 Thread Chris Sherlock

> On 21 Jan 2016, at 2:37 AM, David Tardon  wrote:
> 
> Hi,
> 
> On Mon, Jan 18, 2016 at 12:53:07PM +1100, Chris Sherlock wrote:
>> I think a unit test might be helpful. They are actually quite easy to write 
>> - in fact, I wrote a very simple one the other day. 
>> 
>> Have a look on master at vcl/qa/cppunit/font.cxx
> 
> Note that you don't have to create a whole new CppunitTest every time.
> You can add new source files to an existing one. You just have to make
> sure that only one of the sources contains CPPUNIT_PLUGIN_IMPLEMENT(),
> otherwise you'd get a linker error.
> 
> Speaking about excessively granular tests: would anyone protest against
> an Easy Hack to merge the tests in sal module into bigger groups, e.g.,
> just one test for osl, one for rtl and another one for sal? I.e.,
> reduction of the number of CppunitTest makefiles from 33 to 3.
> 
> D.

That sounds like it would make managing the unit tests a lot easier. Sure, why 
not?

Chris
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: FastSaxSerializer::write ...

2016-01-20 Thread David Tardon
Hi,

On Mon, Jan 18, 2016 at 12:53:07PM +1100, Chris Sherlock wrote:
> I think a unit test might be helpful. They are actually quite easy to write - 
> in fact, I wrote a very simple one the other day. 
> 
> Have a look on master at vcl/qa/cppunit/font.cxx

Note that you don't have to create a whole new CppunitTest every time.
You can add new source files to an existing one. You just have to make
sure that only one of the sources contains CPPUNIT_PLUGIN_IMPLEMENT(),
otherwise you'd get a linker error.

Speaking about excessively granular tests: would anyone protest against
an Easy Hack to merge the tests in sal module into bigger groups, e.g.,
just one test for osl, one for rtl and another one for sal? I.e.,
reduction of the number of CppunitTest makefiles from 33 to 3.

D.
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: FastSaxSerializer::write ...

2016-01-20 Thread Michael Meeks

On Wed, 2016-01-20 at 11:54 -0600, Norbert Thiebaud wrote:
> > Speaking about excessively granular tests: would anyone protest against
> > an Easy Hack to merge the tests in sal module into bigger groups, e.g.,
>
> Yeah but in general merging create trouble with parallelism
> if you have only 3 unit test, then that can use only 3 cores

Of course; there is a happy medium here - but sal/ takes it to the
extreme here:

> (1) of course too small tests are not good either because then the
> set-up tear-down overhead dominate.. which is not good for perf
> either. No I do not know the magic optimal value, but it is something
> to keep in mind: merging a bunch of test into one is not all positive

Amusingly, while the sal/ tests -look- good ;-) many of them are simply
not run because they fail to register the actual tests - which is not
ideal (or did that get fixed?); I guess now we have the CI infra - the
easy hack to fix that becomes even more of an easy hack - I recall
filing it as one; but can't find it just now ... odd. I forget which
macro it is necessary to add to nail that.

ATB,

Michael.

-- 
 michael.me...@collabora.com  <><, Pseudo Engineer, itinerant idiot

___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: FastSaxSerializer::write ...

2016-01-20 Thread Norbert Thiebaud
On Wed, Jan 20, 2016 at 9:37 AM, David Tardon  wrote:
> Hi,
>
> On Mon, Jan 18, 2016 at 12:53:07PM +1100, Chris Sherlock wrote:
>> I think a unit test might be helpful. They are actually quite easy to write 
>> - in fact, I wrote a very simple one the other day.
>>
>> Have a look on master at vcl/qa/cppunit/font.cxx
>
> Note that you don't have to create a whole new CppunitTest every time.
> You can add new source files to an existing one. You just have to make
> sure that only one of the sources contains CPPUNIT_PLUGIN_IMPLEMENT(),
> otherwise you'd get a linker error.
>
> Speaking about excessively granular tests: would anyone protest against
> an Easy Hack to merge the tests in sal module into bigger groups, e.g.,
> just one test for osl, one for rtl and another one for sal? I.e.,
> reduction of the number of CppunitTest makefiles from 33 to 3.

Yeah but in general merging create trouble with parallelism
if you have only 3 unit test, then that can use only 3 cores

This is particularly important for the larger one later, which
consistently add long tail to the build, period of 30s to 2 minutes
where the build is stuck on waiting for one or two tests to finish

From a build wall-time point of view reasonably small and balanced
execution time for tests is important for the overall performances of
the big boxes. (1)

Norbert

(1) of course too small tests are not good either because then the
set-up tear-down overhead dominate.. which is not good for perf
either. No I do not know the magic optimal value, but it is something
to keep in mind: merging a bunch of test into one is not all positive
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: FastSaxSerializer::write ...

2016-01-18 Thread Michael Meeks
Hi Mark,

On Sat, 2016-01-16 at 21:36 +0800, Mark Hung wrote:
> I'd look into performance issue. 

Thanks =) should be an easy tweak there.

> Is there any benchmark or unit test that I can use to check
> performance enhancement ?

Looking at a profile quickly; it seems that I get about 300k calls to
this method while saving - 140m pcycles - which to be fair is 0.1% of
save time =) so in fact not a big issue now I look more carefully [ at
least for XLSX. DOCX etc. may use it more, but there is far less XML
content there ].

> And what do you mean by regression test for the characters?

Ah - that is simple; just create a document, add a character string
that hits this code-path, save as XLSX (using fast-parser) and re-load
it, and check that the string is the same.

sc/qa/unit/subsequent_export-test.cxx

Is a great piece of code to read for that =) but of course calc
focused; more ideally we'd test just the FastSerializer =)

Thanks !

Michael.

-- 
 michael.me...@collabora.com  <><, Pseudo Engineer, itinerant idiot

___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: FastSaxSerializer::write ...

2016-01-17 Thread Chris Sherlock
I think a unit test might be helpful. They are actually quite easy to write - 
in fact, I wrote a very simple one the other day. 

Have a look on master at vcl/qa/cppunit/font.cxx

I’m trying to write a unit test for every low level component I change or 
refactor, I’ve found that it doesn’t take much time to do and I’m trying to do 
it if only to ensure that any code I modify doesn’t break unexpectedly when I 
make changes further down the line. 

Let me know if you get stuck :-)

Chris

> On 17 Jan 2016, at 12:36 AM, Mark Hung <mark...@gmail.com> wrote:
> 
> Hi Michael,
> 
> I'd look into performance issue. 
> Is there any benchmark or unit test that I can use to check performance 
> enhancement? 
> And what do you mean by regression test for the characters?
> 
> 
> 2016-01-16 19:33 GMT+08:00 Michael Meeks <michael.me...@collabora.com 
> <mailto:michael.me...@collabora.com>>:
> Hi Mark,
> 
> Great to see:
> 
> commit e99f22bbc499ab0566621ee0bb01e4a7747efe76
> Author: Mark Hung <mark...@gmail.com <mailto:mark...@gmail.com>>
> Date:   Sun Jan 10 00:28:14 2016 +0800
> 
> Fix FastSaxSerializer::write() for non-BMP unicode characters.
> 
> Clearly we don't want to mangle UTF-16 etc. characters - and the code
> looks dodgy indeed for that =)
> 
> I'm somewhat suspicious though that this will cause some unhelpful
> performance regression; and I was wondering if you could look into
> fixing that ? and (ideally) helping out with a regression test for these
> characters ? =)
> 
> Memory allocation of OStrings etc. via rtl/sal is rather expensive in
> most of the profiles I've seen; and particularly doing it for every
> attribute ;-)
> 
> Would it be possible to replace the original code - but - as soon as 
> we
> hit a non-ASCII character to convert the rest of the string, and write
> that - so something like:
> 
> write (OString(sOutput[i], nLength-i, RTL_TEXTENCODING_UTF8), 
> bEscape);
> return;
> 
> So we can get the character conversion and pairing correct - while
> keeping the lack of allocation there ? =)
> 
> Anyhow - thanks muchly for the fix ! great to see complex text support
> improve.
> 
> All the best,
> 
> Michael.
> 
> --
>  michael.me...@collabora.com <mailto:michael.me...@collabora.com>  <><, 
> Pseudo Engineer, itinerant idiot
> 
> 
> 
> 
> -- 
> Mark Hung
> ___
> LibreOffice mailing list
> LibreOffice@lists.freedesktop.org <mailto:LibreOffice@lists.freedesktop.org>
> http://lists.freedesktop.org/mailman/listinfo/libreoffice 
> <http://lists.freedesktop.org/mailman/listinfo/libreoffice>
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


FastSaxSerializer::write ...

2016-01-16 Thread Michael Meeks
Hi Mark,

Great to see:

commit e99f22bbc499ab0566621ee0bb01e4a7747efe76
Author: Mark Hung <mark...@gmail.com>
Date:   Sun Jan 10 00:28:14 2016 +0800

    Fix FastSaxSerializer::write() for non-BMP unicode characters.

Clearly we don't want to mangle UTF-16 etc. characters - and the code
looks dodgy indeed for that =)

I'm somewhat suspicious though that this will cause some unhelpful
performance regression; and I was wondering if you could look into
fixing that ? and (ideally) helping out with a regression test for these
characters ? =)

Memory allocation of OStrings etc. via rtl/sal is rather expensive in
most of the profiles I've seen; and particularly doing it for every
attribute ;-)

Would it be possible to replace the original code - but - as soon as we
hit a non-ASCII character to convert the rest of the string, and write
that - so something like:

write (OString(sOutput[i], nLength-i, RTL_TEXTENCODING_UTF8), bEscape);
return;

So we can get the character conversion and pairing correct - while
keeping the lack of allocation there ? =)

Anyhow - thanks muchly for the fix ! great to see complex text support
improve.

All the best,

Michael. 

-- 
 michael.me...@collabora.com  <><, Pseudo Engineer, itinerant idiot

___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: FastSaxSerializer::write ...

2016-01-16 Thread Mark Hung
Hi Michael,

I'd look into performance issue.
Is there any benchmark or unit test that I can use to check performance
enhancement?
And what do you mean by regression test for the characters?


2016-01-16 19:33 GMT+08:00 Michael Meeks <michael.me...@collabora.com>:

> Hi Mark,
>
> Great to see:
>
> commit e99f22bbc499ab0566621ee0bb01e4a7747efe76
> Author: Mark Hung <mark...@gmail.com>
> Date:   Sun Jan 10 00:28:14 2016 +0800
>
> Fix FastSaxSerializer::write() for non-BMP unicode characters.
>
> Clearly we don't want to mangle UTF-16 etc. characters - and the
> code
> looks dodgy indeed for that =)
>
> I'm somewhat suspicious though that this will cause some unhelpful
> performance regression; and I was wondering if you could look into
> fixing that ? and (ideally) helping out with a regression test for these
> characters ? =)
>
> Memory allocation of OStrings etc. via rtl/sal is rather expensive
> in
> most of the profiles I've seen; and particularly doing it for every
> attribute ;-)
>
> Would it be possible to replace the original code - but - as soon
> as we
> hit a non-ASCII character to convert the rest of the string, and write
> that - so something like:
>
> write (OString(sOutput[i], nLength-i, RTL_TEXTENCODING_UTF8),
> bEscape);
> return;
>
> So we can get the character conversion and pairing correct - while
> keeping the lack of allocation there ? =)
>
> Anyhow - thanks muchly for the fix ! great to see complex text
> support
> improve.
>
> All the best,
>
> Michael.
>
> --
>  michael.me...@collabora.com  <><, Pseudo Engineer, itinerant idiot
>
>


-- 
Mark Hung
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice