On Sat, May 20, 2017 at 5:48 AM, Botond Ballo <bba...@mozilla.com> wrote:
> On Fri, May 19, 2017 at 10:38 PM, Nicholas Nethercote
> <n.netherc...@gmail.com> wrote:
>> There's also a pre-processor constant that we define in Valgrind/ASAN/etc.
>> builds that you can check in order to free more stuff than you otherwise
>> would. But I can't for the life of me remember what it's called :(
>
> It looks like some code checks for MOZ_ASAN and MOZ_VALGRIND.

Thanks.

On Fri, May 19, 2017 at 10:09 PM, Jeff Muizelaar <jmuizel...@mozilla.com> wrote:
> We use functions like cairo_debug_reset_static_data() on shutdown to
> handle cases like this.

The comment there is encouraging, since it suggests that Cairo doesn't
attempt to deal with cairo_debug_reset_static_data() getting called
too early.

On Fri, May 19, 2017 at 9:58 PM, Kris Maglione <kmagli...@mozilla.com> wrote:
> On Fri, May 19, 2017 at 08:44:58AM +0300, Henri Sivonen wrote:
>>
>> The downsides would be that the memory for the tables wouldn't be
>> reclaimed if the tables aren't needed anymore (the browser can't
>> predict the future) and executions where any of the tables has been
>> created wouldn't be valgrind-clean.
>
>
> If we do this, it would be nice to flush the tables when we get a
> memory-pressure event, which should at least mitigate some of the effects
> for users on memory-constrained systems.

How large would the tables have to be for it to be worthwhile, in your
estimate, to engineer a mechanism for dynamically dropping them (for
non-valgrind reasons)?

If there is only a one-way transition (first a table doesn't exist and
after some point in time it exists and will continue to exist), there
can be an atomic pointer to the table and no mutex involved when the
pointer is read as non-null. That is, only threads that see the
pointer as null would then obtain a mutex to make sure that only one
thread creates the table.
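
To make that concrete, here is a rough sketch of that one-way initialization in Rust (the table type, its size and the function names are made up for illustration; this is not what encoding_rs actually has):

use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::sync::Mutex;

// Hypothetical table shape; the real acceleration table would look different.
type Table = [u16; 0x4000];

static TABLE: AtomicPtr<Table> = AtomicPtr::new(ptr::null_mut());
static INIT_LOCK: Mutex<()> = Mutex::new(());

fn acceleration_table() -> &'static Table {
    // Fast path: a reader that sees a non-null pointer never touches the mutex.
    let p = TABLE.load(Ordering::Acquire);
    if !p.is_null() {
        return unsafe { &*p };
    }
    // Slow path: only threads that saw null serialize on the mutex, and only
    // one of them actually builds the table.
    let _guard = INIT_LOCK.lock().unwrap();
    let p = TABLE.load(Ordering::Acquire);
    if !p.is_null() {
        return unsafe { &*p };
    }
    let raw = Box::into_raw(Box::new(build_table()));
    TABLE.store(raw, Ordering::Release);
    // Intentionally leaked: the table lives until process exit.
    unsafe { &*raw }
}

fn build_table() -> Table {
    // Placeholder for the real table construction.
    [0u16; 0x4000]
}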

If the tables can go away, every use of the table becomes a critical
section, and entering the critical section on a per-character basis
probably becomes a bad idea. Hoisting the mutex acquisition to cover a
larger swath of work means that the fact that the table is dynamically
created leaks from behind some small lookup abstraction. That's
doable, of course, but I'd really like to avoid over-designing this,
when there's a good chance that users wouldn't even notice if GBK and
Shift_JIS got the same slowdown as Big5 got in Firefox 43.
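
For comparison, a sketch of what the droppable-table variant might look
like, with the dynamic lifetime showing through the lookup API (again,
all the names and the table shape here are made up):

use std::sync::{Mutex, MutexGuard};

// Hypothetical droppable table behind a mutex.
static DROPPABLE_TABLE: Mutex<Option<Box<[u16]>>> = Mutex::new(None);

// The caller has to hold this guard for the whole encode run, so the
// dynamic lifetime of the table is visible in the API instead of staying
// hidden behind a simple lookup function.
struct TableGuard {
    guard: MutexGuard<'static, Option<Box<[u16]>>>,
}

impl TableGuard {
    fn acquire() -> TableGuard {
        let mut guard = DROPPABLE_TABLE.lock().unwrap();
        if guard.is_none() {
            *guard = Some(build_table());
        }
        TableGuard { guard }
    }

    fn lookup(&self, index: usize) -> u16 {
        self.guard.as_ref().unwrap()[index]
    }
}

// A memory-pressure handler can drop the table precisely because every
// reader goes through the mutex.
fn flush_table() {
    *DROPPABLE_TABLE.lock().unwrap() = None;
}

fn build_table() -> Box<[u16]> {
    // Placeholder for the real table construction.
    vec![0u16; 0x4000].into_boxed_slice()
}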

I guess instead of looking at the relative slowness and pondering
acceleration tables, I should measure how much Chinese or Japanese
text a Raspberry Pi 3 (the underpowered ARM device I have access to,
and one whose scheduling is predictable enough to be benchmarkable in
a usefully repeatable way, unlike Android devices) can legacy-encode
in a tenth of a second or 1/24th of a second without an acceleration
table. (I posit that with the network roundtrip happening afterwards,
no one is going to care if the form encode step in the legacy case
takes up to one movie frame duration. Possibly, the "don't care"
allowance is much larger.)
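
Something along these lines is the kind of measurement I have in mind
(the sample text and its length are made up, and this assumes a current
encoding_rs):

use std::time::Instant;

use encoding_rs::SHIFT_JIS;

fn main() {
    // Made-up sample: repeat a short Japanese string to get a form-sized payload.
    let sample = "日本語のテキストです。".repeat(50_000);
    let start = Instant::now();
    // Encoding::encode() returns the bytes, the encoding actually used, and
    // whether there were unmappable characters.
    let (bytes, _, _) = SHIFT_JIS.encode(&sample);
    let elapsed = start.elapsed();
    println!(
        "{} bytes of UTF-8 -> {} bytes of Shift_JIS in {:?}",
        sample.len(),
        bytes.len(),
        elapsed
    );
}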

> And is there a reason ClearOnShutdown couldn't be used to deal with valgrind
> issues?

I could live with having a valgrind/ASAN-only clean-up method that
would be UB to call too early, provided that I'm genuinely not on the
hook for someone calling it too early. I don't want to deal with what
we have now: first we tell our encoding framework to shut down, but
then we still occasionally do stuff like parse URLs afterwards.

> That said, can we try to get some telemetry on how often we'd need to build
> these tables, and how likely they are to be needed again in the same
> process, before we make a decision?

I'd really like to proceed with work sooner than it takes to do the
whole round-trip of setting up telemetry and getting results. What
kind of thresholds would we be looking for to make decisions?

On Fri, May 19, 2017 at 10:38 PM, Eric Rahm <er...@mozilla.com> wrote:
> I'd be less concerned about overhead if we had a good way of sharing these
> static tables across processes

It seems sad to add process-awareness complexity to a library that
otherwise doesn't need to know about processes when the necessity of
fast legacy encode is itself doubtful.

> (ICU seems like a good candidate as well).

What do you mean?

On Fri, May 19, 2017 at 10:22 PM, Jet Villegas <jville...@mozilla.com> wrote:
> Might be good to serialize to/from disk after the first run, so only
> the first process pays the compute cost?

Building the acceleration table would be more a matter of writing
memory in a non-sequential order than of doing serious math. Reading
from disk would mostly have the benefit of making the memory writes
sequential. I'd expect the general overhead of disk access to be worse
than a recompute. In any case, I don't want encoding_rs to have to
know about file system locations or file IO error situations.

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
