Measuring interactivity in a browser (especially during load)

2019-07-31 Thread Randell Jesup
 to enable page authors to do likewise.

Options:
* Jank
  * TTI/TTFI (or FCI)
    * See above for issues with it
    * 5-second window mostly ensures the page is actually done
      loading/initializing.
  * MID (Median Input Delay - see my posting here)
    * Measures expected input delay during load
    * Exact definition of what value to derive from the data TBD
      (Mean, Median, 95%, Max, etc) - see the sketch after this list
    * Issues:
      * Since it's calculating a (single) value-over-time, when you
        stop has a major impact on the final value.  Starting time
        also has a major impact, since while the browser is loading
        the initial page data it's likely highly interactive.
        * Stop on Page Ready?
        * Start on last byte delivered of main page?  FCP?
        * Stop at the TTI/TTFI point?
        * Stop at Visually Complete (or 85%) plus some delta?  Start
          at 10% visually complete?  30%?  (and which "visually
          complete"?)
  * Graphs of expected input delay
    * Gives continuous readout of delay
    * Can work with the profiler counter display
    * No issue defining a stop-time, though you need to stop
      measuring sometime.
    * Issues:
      * Doesn't provide a single number to compare
      * Can provide mean/avg/95%/etc numbers, but then you have the
        "stopping" problem again
        * Could use visually complete
* Page Ready
  * Detect when page becomes idle
    * Page going idle CPU-wise isn't sufficient
      * Need to ignore housekeeping activity, like animations,
        polls, etc.
      * TTFI/FCI is meant to be this, more or less
    * All scripts of type  must be ready
    * [All short-duration timeouts during load have run.]
      * Likely unnecessary
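
A minimal sketch (not Gecko code; assumes a non-empty, uniformly-sampled
series of expected-input-delay values, in ms) of deriving the candidate
summary values mentioned above:

#include <algorithm>
#include <vector>

struct DelaySummary {
  double median, p95, max;
};

// Derive candidate MID-style summary values from delay samples (ms).
DelaySummary Summarize(std::vector<double> delays) {
  std::sort(delays.begin(), delays.end());
  size_t n = delays.size();
  return {delays[n / 2],           // median
          delays[(n * 95) / 100],  // 95th percentile
          delays.back()};          // max
}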

In reality, we want to measure more than one thing here.  We want to
know when the page is ready; and we want to understand how janky the
browser is during a page load (and later as well, though that’s a
different question).  Graphs are intuitive visually, but a graph of
delay over a load is hard for automation to do something with directly,
or alert on.  We can track Max Delay easily, since that doesn’t have the
stopping (or starting) problems.  We could also track
time-with-delay-worse-than-X, which is relatively free of the
starting/stopping problem, but still may be slightly impacted.
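
For instance, time-with-delay-worse-than-X falls out of the same sampled
series (a sketch; assumes one sample every intervalMs milliseconds):

#include <vector>

// Total time (ms) during the window where the expected input delay
// exceeded thresholdMs, counting each sample's slice of the window.
double TimeNonResponsiveMs(const std::vector<double>& delays,
                           double thresholdMs, double intervalMs) {
  double total = 0.0;
  for (double d : delays) {
    if (d > thresholdMs) {
      total += intervalMs;
    }
  }
  return total;
}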

For example, we could have these sorts of results (hypothetical):
* Foo.com:
   * Onload: 10.2s
   * TTFI: 15.4s
   * FCP: 3.4s
   * Time-To-Page-Ready: 11.2s
   * Max-Input-Delay: 900ms
   * Time-Non-Responsive(50ms): 6.5s  (over period start-to-TTFI)
   * Time-Non-Responsive(100ms): 4.3s
   * Time-Non-Responsive(250ms): 1.2s
   * Avg, Mean, 5%, 95% for Responsiveness (over what period?)

The most interesting of these to track may be Time-To-Page-Ready (TTPR),
Max-Input-Delay, and perhaps Time-Non-Responsive(50ms).


Proposal:

We should track the input delay loading “real-world” pages such as in
the load tests in Raptor, and let the tests run until TTFI or until N
(10?) seconds after the load event.  We should track the input delay as
finely as we can (preferably 1ms), but no worse than once every
~33ms.
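
One way to sample the delay is an event-loop probe along these lines (a
rough sketch using real Gecko helpers; RecordDelaySample() is a
hypothetical hook, and throttling to the ~1-33ms sampling rate is
omitted):

#include "mozilla/TimeStamp.h"
#include "nsThreadUtils.h"

static void RecordDelaySample(double aDelayMs);  // hypothetical hook

static void ProbeOnce() {
  mozilla::TimeStamp dispatched = mozilla::TimeStamp::Now();
  NS_DispatchToMainThread(NS_NewRunnableFunction(
      "InputDelayProbe", [dispatched] {
        // How long this runnable sat in the queue approximates the
        // delay an input event arriving at dispatch time would see.
        RecordDelaySample(
            (mozilla::TimeStamp::Now() - dispatched).ToMilliseconds());
        ProbeOnce();  // re-arm for the next sample
      }));
}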

This should be available via browsertime, the Gecko Profiler, and also
in Raptor run in automation.

Since the delay graph is closely tied to the specific run, we’ll want to
view the series in Raptor or browsertime, as well as get the raw numbers
directly.  Viewing them in the profiler (as a track) is especially
useful, since they’re directly tied to the code and events running.

With this data, we can develop appropriate metrics that can then be
alerted on or tracked.  Ideally we can define one or a few alertable
metrics from the data.  Once we have graphed data, we can calculate what
makes sense and how well it maps to reality.  Initially we should try
calculating several of the above metrics, and setting up some datasets
where we can play with the start and end-points and what we calculate
from the values.  We can then start tracking some of these in perfherder
and see if they're stable enough to use as alerts.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Remove browser and OS architecture from Firefox's User-Agent string?

2019-05-20 Thread Randell Jesup
>On Fri, May 10, 2019 at 11:40 PM Chris Peterson  wrote:
>> I propose that Win64 and WOW64 use the unadorned Windows UA already used
>> by Firefox on x86 and AArch64 Windows:
>>
>> < "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101
>> Firefox/66.0"
>> > "Mozilla/5.0 (Windows NT 10.0; rv:66.0) Gecko/20100101 Firefox/66.0"
>
>Would there be significant downsides to hard-coding the Windows
>version to "10.0" in order to put Windows 7 and 8.x users in the same
>anonymity set with Windows 10 users?
>
>(We could still publish statistics of Windows version adoption at
>https://data.firefox.com/dashboard/hardware )

I wonder if any sites distributing Windows executables might key off the
OS version to default to the correct exe for your version of Windows?

If we do this, one would presume that such sites already let you select
alternate versions to download.  And if not, they could add that, but
they might not bother/notice.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Lack of browser mochitests in non-e10s configuration and support for turning off e10s on desktop going forward

2019-05-01 Thread Randell Jesup
Jean-Yves writes:
>> If anyone is chomping at the bit to remove 1proc support from their module,
>> please speak up.
>
>I am one of those developers that find non-e10s essential to debug core 
>features.

I've depended on using non-e10s for ages as well.  Most of my debugging
can in theory be done in e10s, but whenever I have to have multiple GDBs
running, and deal with breaking in one causing breaks in the other (or
timeouts, etc), it's really painful, especially for any work that
crosses the Content/master boundary.  Also: how will Fission affect our
debugging workflows?  I imagine that will make things much more
complicated without better tools.

>Of course we can do without, but at the expense of time and
>convenience. Before non-e10s support is removed, I'd love to see better
>development/debugging tools, particularly on Windows added to help our
>workflow.

+100 -- we need better tools for e10s debugging, especially with Fission coming.
We can get away without them, and without non-e10s - but it will mean
more friction that lots of us will face on a daily basis.

[Part of the problem is that lots of debugging tools don't deal with, or
deal poorly with, multi-process apps. No revelation here, of course.]
-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Announcing ./mach busted

2019-04-28 Thread Randell Jesup
>TL;DR - In bug 1543241 (alias 'mach-busted'), we now have a central
>clearinghouse of bugs for broken mozilla-central tooling. You can invoke
>|./mach busted| to get a list of known outstanding issues, and |./mach
>busted file| to report a new one.

In case it's not obvious (it wasn't to me), 'file' is a keyword, not a
file to include, and all you really need to do is file a bug that blocks
bug 1543241 (or add that to an existing bug).  We also should name that
bug, for ease of referencing it in the future (I propose "mach-busted" ;-) )

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


PSA on MOZ_LOG entries in profiles

2019-03-15 Thread Randell Jesup
A short PSA about MOZ_LOG() messages:

In Bug 1518030 back in January, I landed support for mirroring MOZ_LOG()
messages into profiles captured with the Gecko Profiler
(https://profiler.firefox.com/ formerly https://perf-html.io/).

You can now add "profilermarkers" to your MOZ_LOG env variable to cause
log messages to be mirrored as profiler markers in profiles.
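
For example (module name and level hypothetical; "profilermarkers" goes
in the usual comma-separated MOZ_LOG list):

$ MOZ_LOG="example:5,profilermarkers" ./mach run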

I would suggest being a little careful about enabling an excessive
number of log messages when using this; it allocates memory for each
marker, so the overhead is a bit higher than normal.  That said, you
can probably throw a lot of logs at it safely.

You can find them in the Marker Chart or Marker Table when viewing a
profile.

Now back to your regularly scheduled programming ;-)

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Running different instances of Firefox side-by-side

2019-03-12 Thread Randell Jesup
>-no-remote and -new-instance still exist. Right now they do the same thing,
>they make Firefox not look for existing instances and not listen for
>remoting from future instances. They are pretty pointless now though, the
>only case where they would have an effect is when a second instance is
>trying to use a profile that is already used by an existing instance ... at
>which point we'll show the profile locked dialog on startup and refuse to
>startup anyway.
[snip]
>The other thing that might be confusing is that the version or install of
>Firefox you try to launch doesn't affect which version or install of
>Firefox you might end up remoting to. This has always been the case on
>Windows and normally the case on Linux, unless you pass an extra command
>line argument though so I'm not too concerned here.

-no-remote is still quite useful when debugging; if I build in one of my
dev repos, and then ./firefox -P test -no-remote, it will warn me if I
have another instance using that profile (quite possibly from a
different repo/directory) instead of silently loading it in that other
instance - which can be very confusing if you're trying to test a fix.
("I *swear* I made that change; why didn't it take?")

>Hopefully this all makes sense. I'd like to hear if folks think that this
>is the wrong way to support this and if you spot any issues with it that I
>haven't.

Thanks for doing this; the current system kinda fell out of various
needs mostly around 20+ years ago and hadn't really been revisited
since then.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: try pushes: use a base revision from 2019-02-06 or newer

2019-02-15 Thread Randell Jesup
>On 2/12/19 9:15 PM, Randell Jesup wrote:
>>> if you push to the Try server, use base revisions (= the shared revision on
>>> top of which you add your changes) from 2019-02-06 or newer, else there
>>> will be many test failures due to expired certificates. The newer base
>>> revisions have the fixes from
>>> https://bugzilla.mozilla.org/show_bug.cgi?id=1525191
>> 
>> What if you need to push a Try involving an older rev (i.e. to verify a
>> problem or fix)?  Is there a patch you can add on top of an older rev to
>> avoid failures?
>> 
>> Has the ESR branch been updated so it can be used as a base as well?
>
>ESR was fixed with
>https://hg.mozilla.org/releases/mozilla-esr60/rev/2a0b77c6fa9b
>Unfortunately there isn't a preexisting single patch you can use to
>update the certificates in mozilla-central, but depending on the tests
>you're running, using some or all of the "original commit"s mentioned in
>https://bugzilla.mozilla.org/show_bug.cgi?id=1525191#c26 should work.

Could you pull them together into one patch, and add it to that bug?
That would make it a lot easier for people in the future if they need to
verify something - unless you think it would be easy enough for them to
do themselves.

Thanks!
-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: try pushes: use a base revision from 2019-02-06 or newer

2019-02-12 Thread Randell Jesup
>if you push to the Try server, use base revisions (= the shared revision on
>top of which you add your changes) from 2019-02-06 or newer, else there
>will be many test failures due to expired certificates. The newer base
>revisions have the fixes from
>https://bugzilla.mozilla.org/show_bug.cgi?id=1525191

What if you need to push a Try involving an older rev (e.g. to verify a
problem or fix)?  Is there a patch you can add on top of an older rev to
avoid failures?

Has the ESR branch been updated so it can be used as a base as well?

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Process Priority Manager enabled on Nightly for Windows (only)

2019-02-06 Thread Randell Jesup
>Hey Valentin,
>
>That's a good question. I haven't yet noticed any difference yet, but
>I'm hoping people can keep an eye out to see if there's appreciable lag
>when switching back to a tab with video, or if background audio starts
>to stutter.

This could also impact WebRTC calls, or perhaps WebRTC-based DataChannel
applications (file transfers, games, etc.) - though if it's just
priority, this likely won't be a problem.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New and improved "about:config" for Firefox Desktop

2019-02-06 Thread Randell Jesup
>On 1/26/2019 10:09 AM, khagar...@gmail.com wrote:
>> Does it take into account that the sorting is preserved between about:config 
>> calls?
>
>No, but 0.4% is still very low. We could imagine that a lot of people
>keep the table sorted by type at all times, or that only a few people
>do or even know that they can sort, depending on where our confirmation
>bias stands. We're aware of this, and this is why this data point is
>definitely not the only element that will influence the direction here.

I've frequently wished to see (only) modified prefs... and never
realized I could sort on modification status.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Proposal to adjust testing to run on PGO builds only and not test on OPT builds

2019-01-07 Thread Randell Jesup
>* do we turn off builds as well?  I had proposed just the tests, if we decide 
>to turn off talos it would make sense to turn off builds.

Would turning off opt builds cause problems if you want to mozregression
an opt build?  And would this be an issue?  (Obviously it might be for
opt-only failures, or for verifying whether a regression identified in
mozregression for PGO was a PGO bug or not - though that could probably
be checked at the cost of a build or four even if we don't build opt.)

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: The upcoming C/C++ coding style change

2018-12-13 Thread Randell Jesup
>>If I'm understanding your situation correctly, I believe you can use rebase
>>to update a whole queue of dependent patches, provided you have the
>>format-source extension enabled.

Ok.  For anyone with mq patch queues, here's a sequence that should
work.  I hit a problem, and will file a bug on clang-format - the
sequence below side-steps the bug.

I found when I reviewed my queues in more detail that I only have maybe
6 active queues (mostly for specific projects) with perhaps 50-100
active patches; a number of other queues for areas I'm not actively
working on at the moment (like my webrtc datachannels queue), and quite
a few WIP queues for work I abandoned or decided to mothball, or for
completed work (like webrtc import patch queues).  Most of those date
back to 2016 or earlier.

At times in the past I've had multi-hundred patches for a few dozen
active queues.  That would have made this issue more problematic,
time-wise.

(Note: this sequence below could probably be automated.  It's not worth
my time to do so, however.)


# First, ensure the tools are set up:

# do these once; no need to do them again
./mach bootstrap
./mach clang-format
# that forces clang-tidy/etc to be downloaded and set up! (bug)

# Now here's how to reformat your patches:

hg pull -u
hg tip
# record rev of tip of tree

Now for each set of patches that can be applied together (assuming
they're at 0-N of the current queue; if not adjust accordingly):

hg qshow 0 | head -10
# Use this to find the parent rev this sequence was applied on.
# In some cases that rev is gone (a patch that has since been pushed);
# if it's not found, use PRE_TREEWIDE_CLANG_FORMAT instead.
hg update -r <parent rev>

hg qpush
# resolve conflicts if any and hg qref
# repeat hg qpush/resolve for all patches in the sequence (N)

hg qser >/tmp/patch_names
hg qfin -a
# if not already on PRE_TREEWIDE_CLANG_FORMAT:
hg rebase -d PRE_TREEWIDE_CLANG_FORMAT
# resolve any conflicts before we reformat

hg rebase -d <tip rev recorded above>
# resolve conflicts

# repeat N times:
hg qimport -r tip -n <name>
# -n name is optional but recommended
hg qpop

Poof!  That *should* be all.

Thanks to jkew, pehrsons, ehsan for suggestions!

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: The upcoming C/C++ coding style change

2018-12-13 Thread Randell Jesup
>> tl;dr: I need to figure out how I'm going to dig out of the rebasing hole
>> I'm now in, and could use advice/help, and I think the above sequence
>> doesn't work for queues of dependent patches.
>
>If I'm understanding your situation correctly, I believe you can use rebase
>to update a whole queue of dependent patches, provided you have the
>format-source extension enabled.

[snip]

Thanks - that was the sequence I thought might work, after hearing from
pehrsons that the rebase would reformat all the pushed patches.  hg qfin -a
is critical I think.

If I wanted to optimize this, perhaps for each independent sequence:
hg update -r <parent rev>; hg qpush (*N);
hg rebase -d PRE_TREEWIDE_CLANG_FORMAT (assuming
this does what I expect); resolve any bitrots;
hg rebase -d <reformatted tip>;
(hg qimport -r tip; hg qpop) (*N)

Almost automatable (probably automatable; may not be worth it).

Thanks!  I'll let people know if it works.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: The upcoming C/C++ coding style change

2018-12-13 Thread Randell Jesup
>But still all is not lost here.  When you do decide to do the manual
>merging when you needed those patches, you would need to:
>
>  * Update your working tree to the parent of the commit that did the
>reformat.
>  * Apply your patch to that tree and reformat the tree.
>  * Diff the reformat commit and your current working directory.  That
>would give the reformatted diff.

tl;dr: I need to figure out how I'm going to dig out of the rebasing hole
I'm now in, and could use advice/help, and I think the above sequence
doesn't work for queues of dependent patches.

So in my 6 (current) repos on my main dev machine, I have ~150 mq patch
queues with a total of ~3000-3500 patches (though perhaps only 2500
unique ones) - some ancient/abandoned, but most have (unfinished) work
on some subject (not generally single bugs, but something like
"datachannel", with some related patches and some independent patches,
and usually with a tail of some temporary/abandoned/replaced patches).
Perhaps not an optimal flow, but it has worked for me for many years.
Some of these are under revision control (hg init --mq), and are synced
with (shared) patch repos on hg.mozilla.org (for example webrtc.org
update import patch queues, which typically got to 100-150 patches
before squashing for landing).  I also used that to share queues with my
windows dev machine.

Some queues I could drop, but probably the majority of queues and majority of
patches within each queue I'll want to save, if I can.  The "saving" could
be done on an as-needed basis, they don't need to all be converted (and
that helps with the issue of having to decide if they're still
relevant).  But I need a process that won't take crazy amounts of manual
labor to save my patches.  I'd guess I have about 1000 patches I may
need to save, eventually, and several hundred in the short term.

An example is a queue of work to redo Dispatch() to require an
already_AddRefed<>; that has about 30 directory specific patches,
hitting ~250 files (plus some cruft patches after those).  Another is my
profiler work queue, which has 42 patches, mostly for single issues (or
2 or 3 related patches for an issue), and perhaps 1/2 of them are still
relevant.

I'm used to dealing with normal bitrot when going back to one of these,
but the reformatting will make pretty much every patch be 100%
conflicts, I assume.

So what are reasonable options here? Are any of the options an
improvement on "deal with every patch totally manually with 100%
conflicts"? (Which, over time as I go back to various queues, will use a
lot of my time doing manual rebasing.)  Note that the line above "Apply
your patch to that tree" implies doing a manual un-bitrot if needed
(like updating today), which is ok (as I said, it's the same as before
the reformat; most will not need major surgery unless they're ancient).
From the discussion here, it sounds like some manual fixups might be
needed after the tool runs as well.

But doing this sequence for every patch sounds time-consuming.  And
a much bigger issue: I think that breaks down for a queue of patches that
depend on each other - if I reformat the first patch, I can't go back to
before the big reformat and just apply the second patch, as it may
depend on the first.  I'd have to (I guess) squash all the patches,
which in most cases would be horribly worse than doing it 100% manually.

So what are usable options for a (long) queue of patches, which may be
dependent? (Some aren't and so perhaps I could do the sequence above for
each one, like the Dispatch queue with per-directory patches, but that's
not the common case.)  Can we come up with a way to partially script
this, if there's a workable sequence?  Or do I just have to do a ton of
manual rebasing?

Thanks for any help/ideas.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Performance profiling improvements #3

2018-10-31 Thread Randell Jesup
>I think sudo will let you have symbolicated kernel stacks which can be handy.

"$ sudo perf record ./firefox" has a problem:
"Running Nightly as root in a regular user's session is not supported."
You can work around this with either:

$ sudo perf record -g -F 999 -p <pid>
(this only gets one of firefox's N processes, though) or

$ sudo perf record -g -F 999 sudo -u <user> ./firefox -P whatever -no-remote

However: you may find that the extra detail doesn't help much.

Normally, I just use user-space profiling:

$ perf record -g -F 999 ./firefox -P my-profile -no-remote
$ perf script -F +pid > /tmp/firefox.perf

Note that you'll want /proc/sys/kernel/perf_event_paranoid to be 1 (or
less).
To persist across reboots:
sudo sh -c 'echo kernel.perf_event_paranoid=1 >> /etc/sysctl.d/99.local.conf'
(or add to /etc/sysctl.conf, etc - whatever is correct for your system)

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Intent to implement and ship: WebP image support

2018-10-11 Thread Randell Jesup
>Are we bringing in a new third party library for this? (Seems like yes?)

libwebp (see https://bugzilla.mozilla.org/show_bug.cgi?id=1294490)

>Who else uses it/audits it? Does anyone else fuzz it? Is it in OSS-fuzz?
>Are we fuzzing it?

http://developers.google.com/speed/webp - Chrome uses it.  They fuzz it
(including with private fuzzing).

It's in OSS-fuzz: see
https://groups.google.com/a/webmproject.org/forum/#!topic/webp-discuss/aqHRxQqJpH0

I don't believe we're fuzzing the patches yet, but I imagine we will.

>How does upstream behave? Do they cut releases or do they just have
>continual development and downstreams grab random versions of it? How do we
>plan to track security issues upstream? How do we plan to update it
>(mechanically and how often)?

You can see how they handle releases above.  Version 1.0.0 was cut in
April (though there were a number before then).
See https://chromium.googlesource.com/webm/libwebp

I don't know how they track sec issues; probably similar to other
google/chrome/chromium projects.
See https://bugs.chromium.org/p/webp/issues/list
You can report issues as "Security" issues.

> bz wrote:
>> In the past, I believe we objected to adding WebP for various reasons.
>> Do we feel that those reasons are now outweighed by the compat problems?

(Personal opinion) Yes, unfortunately.  And the AVIF image format both
isn't ready and isn't universally supported; it will take a while.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Better incremental builds - Tup backend available on Linux

2018-10-01 Thread Randell Jesup
>Tup is a modern build system that enables fast and correct builds. I'm
>pleased to announce the availability of Tup for building Firefox on Linux.
>Compared to Make, building with Tup will result in faster incremental build
>times and reduce the need for clobber builds. See the in-tree docs for
>details [1].

So this seems to be faster for "./mach build" if you have small changes
to C++ code.  However, "./mach build binaries" doesn't work at all.
(Perhaps with this it's no longer necessary, but if so a message stating
that would be good.)

>cd ~/.mozbuild && mach artifact toolchain --from-build linux64-tup

This doesn't work... no "mach" in my path.  Going to a tree and typing
'./mach artifact toolchain --from-build linux64-tup' works though.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Input Delay Metric proposal

2018-09-26 Thread Randell Jesup
>For in-the-wild input delay not specifically during pageload we also
>measure INPUT_EVENT_RESPONSE_DELAY_MS[1] and (from Firefox 53-60 and then
>resuscitated in 64) INPUT_EVENT_RESPONSE_COALESCED_MS[2] which record[3]
>from the time the OS created the input event until the time when the
>process is done with the event.

Sure, those are interesting... though it's the tail that's generally the
concerning part here - and most of the tail will be during-pageload.
(and those that aren't can actually be interesting for other reasons).

MID is intended for automation and to a lesser extent for investigation
of responsiveness on specific pages, or for devtools (so developers can
see how their page is working).  It's arguable how useful this is in
telemetry - though I think it might be worth tracking there, at least
for a while to compare.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Enabling (many) assertions in opt builds locally and eventually Nightly

2018-09-24 Thread Randell Jesup
>On 9/20/18 5:59 PM, Andrew McCreight wrote:
>> On Wed, Sep 19, 2018 at 5:44 PM Kris Maglione  wrote:
>>
>>> On Wed, Sep 19, 2018 at 05:37:46PM -0700, Bobby Holley wrote:
>>>> So, I don't think we need to do anything fancy with forking - we'd just
>>>> need to capture stacks and send them via telemetry rather than as a crash
>>>> report. This was the idea behind bug 1209131, which got pretty far along
>>>> but never made it to the finish line.
>>>
>>> This would actually potentially even give us better information
>>> than fork-and-crash, since we should be able to include JS
>>> stacks in that setup too. We've never been able to do that in
>>> ordinary crash reports, since breakpad doesn't know how to
>>> unwind JS stacks, and we can't safely ask the JS runtime to do
>>> it while we're crashing.
>>>
>>
>> Though keep in mind that any stack that includes content JS is going to
>> likely count as PII, so it would have to be hidden by default on Soccorro.
>
>
>Please note that it would be illegal to collect such data
>without asking for explicit user consent first.
>The GDPR requires a "positive opt-in" mechanism:
>https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/consent/
>Our current Telemetry permission is an opt-out mechanism.

Right - this would need to be handled in a similar way to real crashes -
pop a crashreporter dialog to let the user submit it.  We just wouldn't
kill the browser (and probably disable future semi-assertions until
restart once we hit and report one to avoid bugging the user too much).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Enabling (many) assertions in opt builds locally and eventually Nightly

2018-09-19 Thread Randell Jesup
>On Thu, Sep 20, 2018 at 09:22:12AM +1000, Cameron McCormack wrote:
>> On Thu, Sep 20, 2018, at 1:52 AM, Ehsan Akhgari wrote:
>> > While it may be the case that we may need to be more stable for
>> > MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT in our
>> > codebase is actually a guaranteed to never fail, so promoting them all to
>> > be enabled in something like Nightly is extremely dangerous in terms of the
>> > stability of Nightly for users who are trying to use the browser to get
>> > their normal work done.
>> 
>> If it's truly useful to know whether Nightly users are failing these
>> MOZ_ASSERT assertions, we could use Telemetry to report their failure
>> instead of crashing.> 
>> (I wonder if we could collect all the same data, and use the same crash 
>> reporting infrastructure, for non-crashing crash reports like this.)
>
>On Linux and Mac, we could make MOZ_ASSERT fork and crash in the child,
>and continue in the parent process.

Oooh, tricky.  (Though threads aren't preserved in forks, the current
thread's stack is, so if we care about all threads, we might need to
grab pointers to their stacks/etc and store them before fork().)  But
even without that it would be good.
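
A minimal sketch of the fork-and-crash idea (POSIX only; hand-waves the
actual breakpad/crashreporter integration):

#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

void ReportAssertionNonFatally() {
  pid_t child = fork();
  if (child == 0) {
    // Child: gets a copy-on-write snapshot of the calling thread's
    // stack (other threads aren't duplicated); abort so the crash
    // machinery can capture a minidump of it.
    abort();
  }
  // Parent: keeps running, optionally disabling further reports.
}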

And yes, we don't have to have it crash the browser (though if it later
crashes we'd want to know about assertions we'd hit).  Telemetry isn't
quite appropriate for reporting it; it could however save a crash report
at ASSERT(), then (probably) disable MOZ_ASSERT reports and let the
browser continue (and it may well now really crash).  This disabling
would avoid the cascade effect, but still get us crashreports.  This
would be a bunch(?) of work in crashreporter - ted?  It might be easy to
invoke a crash report snapshot and continue, but I suspect it's not, and
might be quite hard.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Enabling (many) assertions in opt builds locally and eventually Nightly

2018-09-19 Thread Randell Jesup
Ehsan wrote:
>On Tue, Sep 18, 2018 at 1:30 AM Randell Jesup  wrote:
>> We already *need* to be stable in that case, given MOZ_RELEASE_ASSERT
>> (and normal just-because crashes). And as best I can tell, we are stable
>> (with regards to user profiles).  Once upon a time we weren't (5(?)
>> years ago?)
>
>I do come across MOZ_ASSERTs that trigger while browsing every once in a
>while (using a debug build).  I almost never file bugs for these, because
>the issues are rarely reproducible, I often have little information that
>would be helpful as to how exactly the assertion was triggered, and I often
>am very actively working on something else and the assertion failure is
>just getting in the way of me getting my work done.  I sometimes have to
>comment out a broken MOZ_ASSERT in my local build to be able to proceed.

Good point, though if we have people actually browsing with these, we'll
find these bogus MOZ_ASSERTs - and a bogus MOZ_ASSERT may mean we have
a sec bug that we assume "can't happen", since it doesn't fail in
mochitests.

>While it may be the case that we may need to be more stable for
>MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT in our
>codebase is actually a guaranteed to never fail, so promoting them all to
>be enabled in something like Nightly is extremely dangerous in terms of the
>stability of Nightly for users who are trying to use the browser to get
>their normal work done.

Again, good point -- though I would absolutely start with "get
developers to use this".  If we're kicking a MOZ_ASSERT, someone should
be either fixing it or removing the assert and dealing with it.  So
please file against whomever is maintaining that bit of code, and let
them sort out why it would fail.

>This is, I believe, the way that we advertise
>Nightly these days: we advertise it as a mostly stable experimental version
>of Firefox and we encourage people to download it to try out the latest and
>greatest.  Exposing that population to crashes as a result of promoting all
>MOZ_ASSERTs to become MOZ_RELEASE_ASSERTs seems like a proposition that
>needs to be backed by some data demonstrating that such a build would match
>the current stability requirements of Nightly.

Sure.

>FWIW, I think your original proposal, having a way to opt into assertions
>without a slow build locally, is extremely valuable.  I could see myself
>using that option intentionally even with the pain points I described
>above, but for dogfooding rather than while working on a patch and such.

Cool, thanks. I have the start of a patch to enable a version of this.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Input Delay Metric proposal

2018-09-19 Thread Randell Jesup
Problem:
Various measures have been tried to capture user frustration with having
to wait to interact with a site they're loading (or to see the site
data).  This includes:

FID - First Input Delay --
https://developers.google.com/web/updates/2018/05/first-input-delay
TTI - Time To Interactive --
https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#time_to_interactive
related to: FCP - First Contentful Paint and FMP - First Meaningful Paint --
https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#first_paint_and_first_contentful_paint
TTVC (Time To Visually Complete), etc.

None of these do a great job capturing the reality around pageload and
interactivity.  FID is the latest suggestion, but it's very much based
on watching user actions and reporting on them, and thus depends on how
ready the user thinks the page is to interact with, and dozens of other
things.  It's only good for bulk field measurements of a specific
site, by the site author.  In particular, FID cannot reasonably be used
in automation (or before wide deployment).

Proposal:

We should define a new measure based on FID, named MID, for Median Input
Delay, which is measurable in automation and captures the expected delay
a user experiences during a load.  We can run this in automation against
a set of captured pages, while also measuring related values like FCP
and TTI, and dump this into a set of per-page graphs (perhaps on
"areweinteractiveyet.com" :-) ).

While FID depends on measuring the delay when the user *happens* to
click, MID would measure the median (etc) delay that would be
experienced at any point between (suggestion) FCP and TTI.  I.e. it
would be based on "if a user input event were generated this
millisecond, how long would it be before it ran?"  This would measure
delay in the input event queue (probably 0 for this case) plus the time
remaining until the currently-running event on the main thread finishes.
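
Roughly, the quantity sampled each millisecond t would be (my
formulation, hand-waving the queue term):

  expected_delay(t) = queued_input_delay(t)            // ~0 here
                      + (current_event_end_time - t)   // 0 if idle

i.e. how long an input event arriving at time t would wait before it
starts to run.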

This inherently assumes we measure TTI and FCP (or something
approximating it).  This is somewhat problematic, as TTI is very noisy.
I have a first cut at TTI measurement (fed into profiler markers) in
bug 1299118 (without the "no more than 2 connections in flight" part).

Value calculation:
Median seems to be the best measure, but once we have data we can look
at the distributions on real sites and our test harness and decide what
has the most correlation to user experience.  We could also measure the
95% point, for example.  In automation, there might be some advantage to
recording/reporting more data, like median and 95%, or median, average,
and 95%, and max.

Another issue with the calculation is that it won't capture burstiness
in the results well (a distribution would).

Range measured over:
We could modify the starting point to be when the first object that
could be interacted with is rendered (input object, link, adding a key
event handler, etc).  This would be a more-accurate measure for web
developers, and would matter only a little for our use.  Note that
getting content on the screen earlier might in some cases hurt you by
starting the measurement "early" when the MainThread is presumably busy.

Likewise, there might very well be alternatives to TTI for the end-point
(and on some pages, you never get to TTI, or it's a Long Time).  Using
TTI does imply we must collect data until 5 seconds after the last "Long
Task", and since some sites will never go 5 seconds without a long
task, we'll need to upper-bound it (or progressively reduce the 5
seconds over time, which may help).   Alternatively, we could use a
shorter window, or put an arbitrary limit on it (5 seconds past
'loaded', or just to 'loaded'), etc.

Issues:

Defining the start and stop point, and the details around the exact way
we calculate the result (I hand-waved about it above).  Note that
"longer" endpoints will generally result in better scores, since the
average would be taken over a probably longer tail where less is
happening (presumably).  OTOH if it ends at TTI on a "Long Task" (50+ms
event), that rather implies that it was at least intermittently busy
until then.

If we want to start when something interact-able is rendered, there may
be some work to figure that out.

Note that this inherently is measuring the delay until the input event
*starts* processing, not how long it takes to process (since there is no
actual input event here).

Once we have some experience with this, we could propose it for the
Performance API WG.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Enabling (many) assertions in opt builds locally and eventually Nightly

2018-09-17 Thread Randell Jesup
>On 9/17/2018 1:31 PM, Botond Ballo wrote:
>> One potential issue here is that an assertion failure in Nightly
>> builds might be simultaneously very annoying (because it crashes the
>> browser of Nightly users every time it occurs) and not very actionable
>> (because no reliable STR can be found for it).
>
>I would also be concerned about data loss and profile corruption when an
>assertion crashes the browser. I don't think we'll have a very easy time
>convincing people to do anything non-trivial with Nightly if there's a
>pretty good chance that they completely lose the fruits of their labour.
>
>I'd hope that this wouldn't be too frequent a problem, but it probably
>depends on the assertion, and the state of the durability of our user
>profiles.

We already *need* to be stable in that case, given MOZ_RELEASE_ASSERT
(and normal just-because crashes). And as best I can tell, we are stable
(with regards to user profiles).  Once upon a time we weren't (5(?)
years ago?).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Enabling (many) assertions in opt builds locally and eventually Nightly

2018-09-17 Thread Randell Jesup
It may be possible to narrow down the performance-regressing MOZ_ASSERTs
by pushing a series of Talos try runs with MOZ_ASSERT enabled for
specific top-level directories - then take those with significant
regressions, and recurse down a layer, lather, rinse, repeat to at least
the leaf-dir level, or (if there are a lot in the dir) the file level.

Then we can look with eyeballs and figure out what's up.

It might also be possible to use trace APIs to measure the cost of each
assertion... though the traces themselves might swamp most assertion
costs.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Enabling (many) assertions in opt builds locally and eventually Nightly

2018-09-17 Thread Randell Jesup
There are quite a few things that may be caught by assertions by
developers before committing, especially sec issues - but most
developers don't run debug builds as their main local test builds, or
for local browsing use, because they are almost unusably slow.  And
turning on compiler optimization doesn't actually help much here - the
problem is mostly debug-only assertions and code that's debug only, such
as bug 1441052 ("Don't do full grey-node checking in local debug
builds").

These added checks may be useful for CI tests, but they make the builds
unusable, so the vast majority of assertions don't run in the builds our
developers are using.  So enabling most of the existing MOZ_ASSERTs for
local opt builds (minus the worst performance-limiting ones) would help
developers catch bugs before landing them (and perhaps shipping
them).  It's a lot like using ASAN builds to browse - but with much less
perf degradation, one hopes.

Even better, if we can make the overall hit to perf low enough (1%?), we
could enable it for some/all Nightly users/builds.  Or make "early
nightly" (and developer builds) enable them, and late nightly and beta
not.

If we moved some of the most expensive checks to an alternative macro
(MOZ_ASSERT_DEBUG()), we could (selectively?) enable MOZ_ASSERT checks
in some opt builds.  Alternatively we could promote some MOZ_ASSERTs to
MOZ_ASSERT_NIGHTLY() (or MOZ_DIAGNOSTIC_ASSERT), which would assert in
(all) debug builds, and in nightly opt builds - and maybe promote some
to MOZ_RELEASE_ASSERT if we can take the perf hit, and they're in an
important spot.
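
A sketch of what the nightly variant might look like (placeholder
definition, not actual MFBT code; NIGHTLY_BUILD is the existing
Nightly-only define):

#include "mozilla/Assertions.h"

#if defined(DEBUG) || defined(NIGHTLY_BUILD)
#  define MOZ_ASSERT_NIGHTLY(expr) MOZ_RELEASE_ASSERT(expr)
#else
#  define MOZ_ASSERT_NIGHTLY(expr) \
     do { /* compiled out in non-Nightly opt builds */ } while (0)
#endif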

And as a stepping-stone to the above, having local opt builds enable
(most) MOZ_ASSERTs (excluding really expensive ones, like bug 1441052)
even if the hit is larger (5, 10%) would also increase the likelihood
that we'll catch things before we land them, or before they get to beta.
(Note: --enable-release would turn this local-build behavior off - and
anyone doing speed tests/profiling/etc needs that already, and probably
we should have a specific enable/disable like
--disable-extra-assertions).  This wouldn't be all that hard - most of
the work would be finding "expensive" assertions like bug 1441052.

(and perhaps we should not continue to overload --enable-release for "I
want to benchmark/profile this build")

Enabling most MOZ_ASSERTs in automation tests on opt builds would also
slightly increase our odds of finding problems - though this may not
help much, as the same tests are being run in debug builds.  The only
advantage would be that races may be more likely in the faster opt builds.
It might be worth trying once we have this for a few weeks, and see if
it helps or not.

Note: I've discussed this concept with various people including bz in
the past, and mostly have had agreement that this would be useful - thus
my filing of bug 1441052.  If we agree this is worth doing, we should
file bugs on it and see what the effort to get this would be, and if we
could ship some of this to users.  Much like nightly-asan, this would
shake out bugs we'll never find from crash-stats reports, and which can
lead to critical sec bugs, and turn them into frequently-actionable
bugs.  If needed this can be enabled incrementally once the build/config
infra and macros are in place; there are several options for that.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: C++ standards proposal for a embedding library

2018-07-20 Thread Randell Jesup
Ted wrote:
>Honestly I think at this point growth of the C++ standard library is an
>anti-feature. The committee should figure out how to get modules specified
>(which I understand is a difficult thing, I'm not trying to minimize the
>work there) so that tooling can be built to provide a first-class module
>ecosystem for C++ like Rust and other languages have. The language should
>provide a better means for extensibility and code reuse so that the
>standard library doesn't have to solve everyone's problems.

(and various others wrote things along the same lines, or similar
concerns)

I'm strongly in agreement.  This is not going to help people in the long
run, and may well be, as Ted puts it, an anti-feature that causes all
sorts of problems we've chipped away at to re-rear their heads (ossified
unsafe impls, lack of any improvements, etc).

A good module system is a much more useful and liberating thing to do.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: PSA: pay attention when setting multiple reviewers in Phabricator

2018-07-16 Thread Randell Jesup
>>> A better solution would be to have in-tree metadata files providing
>>> subscription rules for code review (e.g. a mapping of usernames to a list
>>> of patterns matching files). Module owners would be responsible for
>>> reviewing changes to these rules to ensure that automatic delegation
>>> happens to the correct people.
>> I still don't believe this would work well in practice.  It would work,
>> but would have frequent problem cases and often require a lot of updating.
>at the end of the day tooling needs to ensure that reviewer has the
>authority to approve changes.

Well, that's an assertion.  I'm not sure that the tooling *needs* to do
more to ensure it than it does today; I believe that some people believe
it *should* do more.  And perhaps they're right.

>i don't think the issues you've raised are show stoppers; it's certainly a
>system we will iterate on over time.
>keep in mind _how_ we work is also something we should be iterating on too.

Agreed.  And if something can work better, great; but I'm also leery of
adding a lot of process or gatekeepers, especially when you need to
touch a module with few official peers, who might not be available when
you need a rubber stamp of a nit fix.  (Initial review has nits; revise,
now the reviewer is on PTO or simply away for the weekend or a holiday
and you have a big landing gated on this, or a sec issue, or will miss a
release if you can't land before uplift/soft-freeze, etc, etc, etc)

An alternative is more guidelines (and training, especially for new
devs) for reviewers/peers/devs to follow, and deal with any oversteps on
a case-by-case basis, and if there are continual issues then add
enforced process.  Right now it's kinda learn-by-watching-other-devs -
which works, but makes it hard to be sure you learned the right things.

>the current system is extremely coarse - everyone with scm-level-3 can
>approve any change tree-wide.
>there is a strong desire to make this finer grained

I believe that; though I'd be interested to know with whom, the reasons,
and the observed problems, and what other solutions have been
considered.  Of course, I don't need to know these; I'm just interested
as while I've seen occasional fubars, I haven't seen persistent problems
that would require that (especially over improved guidelines).  Also,
for example, one could add these gate-checks with overrides that devs
can use at their discretion (like for nits).

I'm sure added checks can avoid some fubars - but I'm also sure it will
cause added friction to development, extra load on some people, and slow
some things down.

>> [snip]
>> Some modules (i.e. DOM) already to have a hard requirement for peer
>> review.  That should be continued and should be enforced as it is now,
>> and it sounds like Lando will (does?) support that.  If another module
>> wants to enforce it we can flip a bit in the list of peers and have it
>> enforced.
>the enforcement you're referring to is controlled by a hardcoded list of
>peers inside a mercurial hook.
>
>this doesn't scale anywhere close to our needs, and all of the exact same
>concerns you raise with regards to in-tree ownership tracking applies
>(however it would be worse as it's imposed by a system separate from the
>review/landing tooling).

Agreed.  Moving those to in-tree lists is certainly a win over the
current system.


-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)

2018-07-13 Thread Randell Jesup
>On 13/07/2018 04:55, Randell Jesup wrote:
>> Correct - we need to have observers/what-have-you for
>> background/foreground state (and we may want an intermediate state or
>> two - foreground-but-not-focused (for example a visible window that
>> isn't the focused window); recently-in-foreground (switching back and
>> forth); background-for-longer-than-delta, etc.
>> 
>> Modules can use these to drop caches, shut down unnecessary threads,
>> change strategies, force GCs/CCs, etc.

>Also note that dealing with the "importance" of a page is not just a
>matter of visibility and focus. There are other factors to take into
>account such as if the page is playing audio or video (like listening to
>music on YouTube), if it's self-updating and so on.

Absolutely

>The only mechanism to reduce memory consumption we have now is
>memory-pressure events which while functional are still under-used. We
>might also need more fine grained mechanisms than "drop as much memory
>as you can".

This is also very important for GeckoView.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)

2018-07-13 Thread Randell Jesup
>On Thu, Jul 12, 2018 at 08:56:28AM -0700, Andrew McCreight wrote:
>>On Thu, Jul 12, 2018 at 3:57 AM, Emilio Cobos Álvarez 
>>wrote:
>>
>>> Just curious, is there a bug on file to measure excess capacity on
>>> nsTArrays and hash tables?
[snip]
>I kind of suspect that improving the storage efficiency of hashtables (and
>probably nsTArrays too) will have an out-sized effect on per-process
>memory. Just at startup, for a mostly empty process, we have a huge amount
>of memory devoted to hashtables that would otherwise be shared across a
>bunch of origins—enough that removing just 4 bytes of padding per entry
>would save 87K per process. And that number tends to grow as we populate
>caches that we need for things like layout and atoms.

Hash tables are a big issue.  There are a lot of 64K/128K/256K
allocations at the moment for hashtables.  When we started looking at
this in bug 1436250, we had a 256K, ~4 128K, and a whole bunch of 64K
hashtable allocs (on linux).  Some may be smaller or gone now, but it's
still big.

I wonder if it's worth the perf hit to realloc to exact size hash tables
that are build-once - probably.  hashtable->Finalize()?  (I wonder if
that would let us make any other memory/speed optimizations if we know
the table is now static.)
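
A sketch of the idea with a standard-library table (Gecko's hashtables
would need an equivalent shrink-to-fit; this Finalize() is
hypothetical):

#include <string>
#include <unordered_map>

// Once a build-once table is fully populated, shrink its bucket array
// to the minimum needed for the current element count.
void Finalize(std::unordered_map<std::string, int>& aTable) {
  aTable.max_load_factor(1.0f);
  aTable.rehash(0);  // requests the smallest bucket count that fits
}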

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: PSA: pay attention when setting multiple reviewers in Phabricator

2018-07-13 Thread Randell Jesup
>On 05/07/2018 18:19, Mark Côté wrote:
>> I sympathize with the concerns here; however, changing the default would
>> be a very invasive change to Phabricator, which would not only be complex
>> to implement but troublesome to maintain, as we upgrade Phabricator every
>> week or two.

Makes sense.

>> This is, however, something we can address with our new custom
>> commit-series-friendly command-line tool. We are also working towards the
>> superior solution of automatically selecting reviewers based on module
>> owners and peers and enforcing this in Lando.
>
>Automatically selecting reviewers sounds like a huge improvement,
>particularly for people making changes who haven't yet internalised the
>ownership status of the files they are touching (notably any kind of
>first-time or otherwise infrequent contributor to a specific piece of
>code). So I'm very excited about this change.

Agreed that this is an improvement for those who are new to mozilla
development, though the bugzilla reviewer suggestions are also useful
today.  This would be the equivalent of adding "Select for me" in Bugzilla.

>That said, basing it on the list of module owners & peers seems like it may
>not be the right decision for a number of reasons:
>
>* The number of reviews for a given module can be very large and being
>unconditionally selected for every review in a module may be overwhelming.
>
>* The list of module owners and peers is not uniformly well maintained (in
>at least some cases it suggests that components are owned by people who
>have not been involved with the project for several years). Although this
>should certainly be cleaned up, the fact is that the current data is not
>reliable in many cases.

This can be cleaned up (and it is being), but it will never be perfect.

Also, unless you're amazingly careful, this will often route reviews to
people on PTO, sick, in the "wrong" time zone, on a holiday day, or
simply too busy to review in a reasonable period due to whatever.  Some
of these are known ahead of time, others aren't.  And then there are the
"I need a review now to fix something" where the automatic selection
might be someone who won't be available to review it for 12 hours, or 3
days - or 3 weeks, requiring noticing this and having someone manually
correct it.

>* Oftentimes there is substructure within a module that means that some
>people should be reviewers in certain files/directories but have no
>knowledge of other parts.

Or you spend a lot of time deciding and updating who can review what
bits - and it doesn't always match the directory hierarchy!

>* It usually desirable to have people perform code review for some time as
>part of the process of becoming a module owner or peer.

Absolutely.

>A better solution would be to have in-tree metadata files providing
>subscription rules for code review (e.g. a mapping of usernames to a list
>of patterns matching files). Module owners would be responsible for
>reviewing changes to these rules to ensure that automatic delegation
>happens to the correct people.

I still don't believe this would work well in practice.  It would work,
but would have frequent problem cases and often require a lot of updating.

In-tree metadata to support suggestions (and to support "select for me"
when someone explicitly asks) is good, I think.  Note that not all
modules are directly related to directories in the tree, so something
out-of-tree or some escape valve in-tree is needed to record those.

I think automatically selecting reviewers always (as is implied, but it
isn't clear that this is the plan - Mark?) is quite problematic, and
enforcing reviewers-are-listed-in-tree also has real problems (and may
require too-frequent manual intervention), such as not supporting peers
assigning reviews to non-peers to train them (to be peers), or cases
where the best person to review a patch isn't a full module peer but
an SME on the issue/code in question, etc.  These problems can be
gamed/wallpapered (assign a review to both the non-peer and a peer who
doesn't really review or reviews-the-review, etc), but that adds pain
and time and unneeded process.

Some modules (e.g. DOM) already have a hard requirement for peer
review.  That should be continued and enforced as it is now, and it
sounds like Lando will (does?) support that.  If another module wants to
enforce it, we can flip a bit in the list of peers and have it
enforced.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)

2018-07-12 Thread Randell Jesup
>On 07/12/2018 11:08 PM, Randell Jesup wrote:
>> We may need to trade first-load time against memory use by lazy-initing
>> more things than now, though we did quite a bit on that already for
>> reducing startup time.
>
>One thing to remember that some of the child processes will be more
>important than others. For example all the processes used for browsing
>contexts in the foreground tab should probably prefer performance over
>memory (in cases that is something we can choose from), but if a
>process is only used for browsing contexts in background tabs and isn't
>playing any audio or such, it can probably use less memory hungry
>approaches.

Correct - we need to have observers (or what-have-you) for
background/foreground state, and we may want an intermediate state or
two: foreground-but-not-focused (for example a visible window that
isn't the focused window); recently-in-foreground (switching back and
forth); background-for-longer-than-delta; etc.

Modules can use these to drop caches, shut down unnecessary threads,
change strategies, force GCs/CCs, etc.

Some of this certainly already exists, but may need to be extended (and
used a lot more).
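
To make that concrete, here's a rough sketch of what a module-side hook
could look like (the "module-state:*" topic names are invented for
illustration; the real notifications would need to be defined):

  // Hypothetical sketch - assumes someone defines and fires these topics.
  #include <cstring>
  #include "nsCOMPtr.h"
  #include "nsIObserver.h"
  #include "nsIObserverService.h"
  #include "mozilla/Services.h"

  class CacheTrimObserver final : public nsIObserver {
   public:
    NS_DECL_ISUPPORTS

    NS_IMETHOD Observe(nsISupports*, const char* aTopic,
                       const char16_t*) override {
      if (!strcmp(aTopic, "module-state:background")) {
        // Drop caches, stop unneeded threads, maybe request a GC/CC.
      } else if (!strcmp(aTopic, "module-state:foreground")) {
        // Rebuild lazily - don't pay the cost until it's needed.
      }
      return NS_OK;
    }

    void Register() {
      nsCOMPtr<nsIObserverService> obs =
          mozilla::services::GetObserverService();
      if (obs) {
        obs->AddObserver(this, "module-state:background", false);
        obs->AddObserver(this, "module-state:foreground", false);
      }
    }

   private:
    ~CacheTrimObserver() = default;
  };

  NS_IMPL_ISUPPORTS(CacheTrimObserver, nsIObserver)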

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)

2018-07-12 Thread Randell Jesup
>I do hope that the 100-process figure that was given is a worst-case
>scenario though...

It's not.  Worst case is a LOT worse.

Shutting down threads/threadpools when not needed, or off an idle timer,
is a Good Thing.  There may be some perf hit, since it may sometimes
mean starting a thread instead of just sending a message; this may
require some tuning in specific cases, or leaving one or more threads
running anyway.
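
For the idle-timer case, the configuration side is tiny - a sketch,
assuming nsIThreadPool's idle attributes behave as their names suggest:

  // Let a pool's workers exit when idle instead of lingering forever.
  #include "nsIThreadPool.h"

  nsresult ConfigureLazyPool(nsIThreadPool* aPool) {
    nsresult rv = aPool->SetThreadLimit(4);         // cap under load
    NS_ENSURE_SUCCESS(rv, rv);
    rv = aPool->SetIdleThreadLimit(0);              // keep no idle workers
    NS_ENSURE_SUCCESS(rv, rv);
    return aPool->SetIdleThreadTimeout(30 * 1000);  // ms before teardown
  }

The timeout is the tuning knob mentioned above: too short and you pay
thread startup on every burst of work; too long and you're back to idle
threads holding stacks in 100 processes.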

Stylo will be an interesting case here.

We may need to trade first-load time against memory use by lazy-initing
more things than now, though we did quite a bit on that already for
reducing startup time.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Proposed W3C Charter: Web Performance Working Group

2018-07-11 Thread Randell Jesup
>Adding to what Tom said...
>
>1. "Web developers want the ability to observe the performance
>characteristics of their applications" - they want to do so, but
>*should* they be allowed to do so? The API would give access to deep
>performance data that could be used for all sorts of nefarious purposes
>(profiling, fingerprinting, probing for vulnerabilities, etc.).

The extreme version of this is what Vlad and Benoit (Facebook) have
proposed in WICG, which is an interface to profiling data for the page
(origin): https://github.com/vdjeric/js-self-profiling
Discussion (mozilla) here:
https://github.com/mozilla/standards-positions/issues/92

One can understand why they'd *want* to be able to profile their code
in-the-field.

Exposing this today would have serious same-origin and Spectre impacts;
in a Fission world these problematic impacts would be (more) limited,
though perhaps not "safe".  (Implied in the current Gecko Profiler impl
is that other processes can affect how fast your origin's process runs,
and thus how much progress is made between profiler 'ticks', leading to
some leakage of information cross-origin between processes.  How large
an impact this is I'm not 100% sure of at this point.)  If we changed
sampling to be runtime-based instead of wallclock-based, that would
(mostly) hide the impact of other processes, though secondary effects
would still exist (cache impacts, etc).

Making it runtime-based would be a largish change...
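
For the curious, the distinction is roughly this (a POSIX sketch; the
real profiler plumbing is much more involved):

  // CLOCK_THREAD_CPUTIME_ID only advances while this thread is actually
  // scheduled, so CPU stolen by other processes doesn't show up as
  // "progress" between profiler ticks.
  #include <cstdio>
  #include <time.h>

  static double Now(clockid_t aClock) {
    timespec ts;
    clock_gettime(aClock, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
  }

  void CompareClocks() {
    double wall0 = Now(CLOCK_MONOTONIC);         // wallclock
    double cpu0 = Now(CLOCK_THREAD_CPUTIME_ID);  // runtime
    // ... do work; possibly get preempted by other processes ...
    double wallDelta = Now(CLOCK_MONOTONIC) - wall0;
    double cpuDelta = Now(CLOCK_THREAD_CPUTIME_ID) - cpu0;
    // wallDelta >> cpuDelta is exactly the cross-process signal a
    // wallclock-based sampler would leak to the page.
    printf("wall %.6fs vs cpu %.6fs\n", wallDelta, cpuDelta);
  }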

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)

2018-07-11 Thread Randell Jesup
>On 7/11/18 5:42 AM, David Bruant wrote:
>> I've seen this information of 100 content processes in a couple places but
>> i haven't been able to find the rationale for it. How was the 100 number
>> picked?
>
>I believe this is based on telemetry for number of distinct sites involved
>in browsing sessions.

As an example, 10 randomly chosen tabs in Chrome site isolation (a few
months ago) yielded ~80 renderers (Content processes).  Some sites
generate a lot; that list of 10 included some which likely don't
generate more than 1 or 2: google.com, mozilla.org, facebook login page,
wikipedia (might spawn a few?).

>> Would 90 prevent a release of project fission?
>
>It would make it harder to ship to users, yes...  Whether it "prevents"
>would depend on other considerations.

It's a continuum - the more memory we use, the more OOMs, the worse
we'll look (relative to Chrome), the larger impact on system perf, etc.
There's likely no hard line, but there may be a defined "we need to get
at least here" line, and for now that's 100 apparently (I wasn't
directly involved in picking it, so I don't know how "hard" it is).

We'll have to do more than just limit process sizes, but limiting
process sizes is basically table stakes, IMO.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Fission MemShrink Newsletter #1: What (it is) and Why (it matters to you)

2018-07-10 Thread Randell Jesup
>Welcome to the first edition of the Fission MemShrink newsletter.[1]

This is awesome and critical.

I'll note (and many of you know this well) that in addition to getting
rid of allocations (or making them lazy), another primary solution is to
move data out of the Content processes, and into the master process (or
some other shared process, if that's advisable for security or other
reasons), and access the data over IPC.  Or you can move it to a shared
memory block (with appropriate locking if not static).  For example, on
linux one of our worst offenders is fontconfig; Chrome for example
remotes much of that to the master process.
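
The shared-memory flavor looks roughly like this (a generic POSIX
sketch, *not* our actual IPC shmem API; error handling omitted, and in
practice the fd would be handed to sandboxed content over IPC rather
than opened by name):

  #include <cstring>
  #include <fcntl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  // Parent: build the (static) data once and publish it read-only.
  void* PublishFontData(const char* aBytes, size_t aLen) {
    int fd = shm_open("/shared-font-data", O_CREAT | O_RDWR, 0600);
    (void)ftruncate(fd, aLen);
    void* p = mmap(nullptr, aLen, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, 0);
    memcpy(p, aBytes, aLen);
    mprotect(p, aLen, PROT_READ);  // static data => no locking needed
    close(fd);
    return p;
  }

  // Content process: map the same pages instead of building its own
  // copy N times over.
  const void* MapFontData(size_t aLen) {
    int fd = shm_open("/shared-font-data", O_RDONLY, 0);
    const void* p = mmap(nullptr, aLen, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    return p;
  }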

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: open socket and read file inside Webrtc

2018-07-06 Thread Randell Jesup
>> On 05/07/2018 10:16, amantell...@gmail.com wrote:
>> > I want to open a file inside webrtc core part.
>> 
>> If you could extend the reasons on the why it would allow us to help you.
>> 
>> What is your ultimate objective? why do you think you need to open a 
>> file inside webrtc core?

>We are modifying the webrtc library to use another transport protocol (based 
>on ICN).
>The problem is that in this preliminary phase we have to configure the socket
>using static parameters inside a file.
>The new socket is implemented in a library that communicates via TCP with an 
>external daemon.
>I want to use IPC, but I don’t have examples about this approach.

If this is just for personal experimentation, you can locally disable
the sandbox via about:config, and directly (locally) modify the
media/mtransport code to use your transport protocol instead of UDP via
IPC to the SocketTransportThread in the main process.

Note that there's a lot of code that implements ICE to determine what
ports are open, etc.  That may complicate what you're doing.

If you want to do more than personal/local experimentation, much more
extensive discussions and work would be required.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Compile webrtc from source in firefox

2018-07-06 Thread Randell Jesup
>Il giorno giovedì 5 luglio 2018 10:17:32 UTC+2, amant...@gmail.com ha scritto:
>> Hi,
>> I don't understand which webrtc code is used by firefox builder.
>> I need to modify the webrtc part and recompile it to be used by firefox.
>> any help?

>I tried to break webrtc to understand if it is compiled or not (in
>particular media/webrtc/trunk/webrtc/pc/channel.cc), but I did not receive
>errors.

We don't use (or compile) all the imported webrtc code.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Windows Address Sanitizer enabled on trunk

2018-06-18 Thread Randell Jesup
>Bug 1360120 on inbound enables Windows ASan builds and tests on trunk branches.
...
>This platform has taken several years to stand up. Thank you to
>everyone who helped out, especially Ehsan for getting this started and
>Ting-Yu for working through a bunch of hurdles on automation.
>
>Happy sanitizing!

A late response, but this is truly awesome.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Removing tinderbox-builds from archive.mozilla.org

2018-05-15 Thread Randell Jesup
>On 5/11/18 7:06 PM, Gregory Szorc wrote:
>> Artifact retention and expiration boils down to a
>> trade-off between the cost of storage and the convenience of accessing
>> something immediately (as opposed to waiting several dozen minutes to
>> populate the cache).
>
>Just to be clear, when doing a bisect, one _can_ just deal with local
>builds.  But the point is that then it takes tens of minutes per build as
>you point out.  So a bisect task that might otherwise take 10-15 minutes
>total (1 minute per downloaded build) ends up taking hours...

Also (as others have pointed out) going too far back (often not that
far) may run you into tool differences that break re-building old revs.
Hopefully you don't get variable behavior, just a failure-to-build at
some point.  I'm not sure how much Rust has made this worse.  

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Intent to ship: media-capabilities

2018-05-15 Thread Randell Jesup
>If my understanding is correct, Media Capabilities does expose quite a
>larger fingerprinting surface in practice. While it may have been
>theoretically possible for all trackers to gather statistics on video
>playback for each configuration, the only scripts that could practically
>carry out those attacks without degrading user experience would have been
>video providers. This will be especially true if browsers start blocking
>autoplay by default (https://bugzilla.mozilla.org/show_bug.cgi?id=1376321),
>since users will never interact with media elements from fingerprinting
>scripts. With the Media Capabilities API, it seems that a script like
>fingerprintjs2 (https://github.com/Valve/fingerprintjs2) could run through
>a big list of types/codecs and retrieve device information regarding
>smoothness and energy efficiency with relatively little overhead?

Probably so, yes.  We could reduce but not eliminate the exposure by
rate-limiting requests (perhaps even on a sliding scale, allowing a
small number before delays are introduced).  This is likely insufficient
as a mitigation, however.
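
For illustration, a sliding-scale limiter could be as simple as this
sketch (the thresholds are invented):

  #include <algorithm>
  #include <chrono>

  class QueryThrottle {
    using Clock = std::chrono::steady_clock;
    static constexpr int kFreeQueries = 8;              // no delay at first
    static constexpr std::chrono::seconds kWindow{10};  // then start stalling

    Clock::time_point mWindowStart = Clock::now();
    int mCount = 0;

   public:
    // How long to stall the next decodingInfo()-style query.
    std::chrono::milliseconds DelayForNextQuery() {
      auto now = Clock::now();
      if (now - mWindowStart > kWindow) {
        mWindowStart = now;  // window expired: be generous again
        mCount = 0;
      }
      int over = std::max(0, ++mCount - kFreeQueries);
      return std::chrono::milliseconds(50 * over);  // 0, 0, ..., 50, 100
    }
  };

A patient script can still spread its probes out over time, which is why
this raises the cost of the attack without closing the channel.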

>If autoplay is eventually blocked by default could we gate the response of
>this API on user interaction with the media element?

That might be possible, but if so it should be discussed in the spec,
along with how to get "real" data after user interaction.  (Perhaps
giving fake data until user interaction - but then developers need to be
warned about this, and told how to reliably get real data once
interaction occurs.)

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Is super-review still a thing?

2018-05-10 Thread Randell Jesup
>On Thu, Apr 26, 2018, at 12:41 AM, Mark Côté wrote:
>> How we might use blocking reviewers in our workflow is still open, but 
>> it could be used for training up reviewers, in which case the trainee 
>> would be a regular reviewer and the module peer/owner would be a 
>> blocking reviewer.
>
>It's not uncommon for me to submit patches for review from multiple people
>where I require all reviewers to sign off on the patch, usually because I
>ask them to look at different parts of the patch.  I don't think I have
>ever sent a patch to get review from more than one person with the
>intention of landing it if only one person OKs it.  (I'd probably needinfo
>people to get that kind of feedback.)  So ignoring potential new workflows
>like the training one you mention, I think I would exclusively use blocking
>reviewers.  It's good to know that Phabricator will help me avoid
>accidentally landing patches when not all of my reviewers have signed off.

Agreed... but it sounds like they're not quite sure how this will be
used.  I'm concerned that developers may be confused and just ask for
review by N people, not realizing the patch can land when only one of
them r+'s.  (Especially if the developer doesn't frequently use multiple
reviewers.)  At minimum, if this is how it works, there will need to be
clear communication about when to use this - or (better!) switch the
default to blocking, and have the unusual/more-work-to-ask-for case be
any-of.

Once in a blue moon I'll ask for a fast review from any of N people,
then cancel the r?'s once I have one r+.  But this is really rare, and
mostly when there's a time deadline to land something ASAP (a sec issue,
or to get in ahead of a merge date).  99% of the time when I ask for N
people, I need r+ from all of them.

On the other side, I do know that the build peers (IIRC) use a shared
reviewing setup, where most bugs are thrown into a pool for any of them
to review.  But that's not the normal workflow for any other group I
know of in Mozilla, at least in Firefox.  (I don't know them all,
though.)

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Intent to unship: URL.createObjectURL(MediaStream)

2018-05-10 Thread Randell Jesup
date=2018-04-18&keys=__none__!__none__!__none__&max_channel_version=beta%252F60&measure=USE_COUNTER2_DEPRECATED_URLCreateObjectURL_MediaStream_PAGE&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2018-03-04&table=0&trim=1&use_submission_date=0
>
>[2]
>https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2018-03-01&keys=__none__!__none__!__none__&max_channel_version=beta%252F59&measure=USE_COUNTER2_DEPRECATED_URLCreateObjectURL_MediaStream_PAGE&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-11-13&table=0&trim=1&use_submission_date=0


-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Phabricator and Bugzilla

2018-04-09 Thread Randell Jesup
>As I indicated, those posts go into detail on why we are avoiding both
>comment and more complicated flag mirroring.
>
>Mark

There's no obvious discussion of "flags" in the linked discussions you
gave; I find only a couple of references to "flag" - in a question from
gps.  Given how long the thread is (52 posts), perhaps you can point to
something more specific?  Thanks.

>On Sat, Mar 31, 2018 at 10:14 AM, Ben Kelly  wrote:
>
>> On Sat, Mar 31, 2018, 10:09 AM Mark Côté  wrote:
>>
>>> Regarding comment and flag mirroring, we've discussed this before:
>>>
>>> https://groups.google.com/d/msg/mozilla.dev.platform/
>>> Y8kInYxo8UU/e3Pi-_FpBgAJ
>>> https://groups.google.com/d/msg/mozilla.dev.platform/
>>> Y8kInYxo8UU/tsF7UfxvBgAJ
>>>
>>> Given that Phabricator is still new, I don't see any reason to reopen
>>> that discussion at this point, aside from noting that we have work in
>>> progress to include Phabricator requests in BMO's My Dashboard and
>>> notifications indicator (https://bugzilla.mozilla.org/
>>> show_bug.cgi?id=1440828).
>>>
>>
>> What about comment mirroring?  On my mobile so I haven't read all the past
>> threads, but my recollection is that your team did not want to implement
>> that feature.  Personally, this is a huge concern for me.
>>
>> Thanks.
>>
>> Ben
>>
>>
>>> As for interdiffs, feel free to file a bug with any problems you see.  We
>>> have a good relationship with upstream and can pass those on.  Similarly
>>> with method names (which has been mentioned before but I can't find where
>>> at the moment).
>>>
>>> There is official documentation at https://secure.phabricator.
>>> com/book/phabricator/ which is linked from our Mozilla-specific docs (
>>> http://moz-conduit.readthedocs.io/en/latest/phabricator-user.html) which
>>> in turn is linked in the left-hand menu in Phabricator.  We can expand our
>>> own docs as needed if there are areas that are particularly confusing due
>>> to, say, expectations carried over from our other code-review tools.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: testing event targets for memory leaks

2018-04-09 Thread Randell Jesup
>Hi all,
>
>I recently landed some test infrastructure for testing event targets for
>memory leaks.  This was part of fixing my service worker memory leak in bug
>1447871.  I wanted to let people know this existed and also ask for help
>writing tests for more event targets.
>
>To repeat, I need help writing more tests.  Please see the list of untested
>APIs at the bottom of this email.

>There are many, many event targets in the system, though.  Using searchfox
>to look for subclasses of DOMEventTargetHelper I came up with a list of
>event targets that are not currently tested.  Some of these are deprecated
>or not exposed.  But a lot of these are web exposed.  It would be really
>great to get this kind of simple leak test written for these APIs.

I'm surprised that DOMDataChannel wasn't found:
nsDOMDataChannel.h:
  class nsDOMDataChannel final : public mozilla::DOMEventTargetHelper,

Perhaps you were looking for "public DOMEventTargetHelper"?

I also find nsScreen and nsDOMOfflineResourceList using
mozilla::DOMEventTargetHelper

Are there any events implemented in JS that we need to worry about?  For
example, there are a lot of events (and a number of objects) defined for
WebRTC as part of dom/media/PeerConnection.js

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Faster gecko builds with IceCC on Mac and Linux

2018-01-31 Thread Randell Jesup
>On 1/16/18 2:59 PM, smaug wrote:

>Would it be possible that when I do an hg pull of mozilla-central or
>mozilla-inbound, I can also choose to download the object files from the
>most recent ancestor that had an automation build? (It could be a separate
>command, or ./mach pull.) They would go into a local ccache (or probably
>sccache?) directory. The files would need to be atomically updated with
>respect to my own builds, so I could race my build against the
>download. And preferably the download would go roughly in the reverse order
>as my own build, so they would meet in the middle at some point, after
>which only the modified files would need to be compiled. It might require
>splitting debug info out of the object files for this to be practical,
>where the debug info could be downloaded asynchronously in the background
>after the main build is complete.

Stolen from a document on Workflow Efficiencies I worked on:

Some type of aggressive pull-and-rebuild in the background may help
by providing a ‘hot’ objdir that can be switched to in place of the
normal “hg pull -u; ./mach build” sequence.

Users would need to deal with reloading editor buffers after
switching, but that’s normal after a pull.  If the path changes it
might require more magic; Emacs could deal with that easily with an
elisp macro; not sure about other editors people use.  Keeping paths
to source the same after a pull is a win, though.

Opportunistic rebuilds as you edit source might help, but the win is
much smaller and would be more work.  Still worth looking at,
especially if you happen to touch something central.

We'd need to be careful how it interacts with things like hg pull,
switching branches, etc. (defer starting builds slightly until the
source has been unchanged for N seconds?)

I talked a fair bit about this with ted and others.  The main trick here
would be in dealing with cache directories, and with sccache we could
make it support a form of hierarchy for caches (local and remote), so
you could leverage either local rebuilds-in-background (triggered by
automatic pulls on repo updates), or remote build resources (such as
from the m-c build machines).
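
The lookup logic itself is the easy part - a toy sketch of the layered
idea (names invented; sccache's real internals differ):

  #include <optional>
  #include <string>
  #include <vector>

  struct CacheLayer {  // local disk, LAN peer, remote bucket, ...
    virtual std::optional<std::string> Get(const std::string& aKey) = 0;
    virtual void Put(const std::string& aKey, const std::string& aObj) = 0;
    virtual ~CacheLayer() = default;
  };

  // Try the nearest layer first; on a hit, backfill the layers we
  // skipped so the next lookup stays local.
  std::optional<std::string> Lookup(std::vector<CacheLayer*>& aLayers,
                                    const std::string& aKey) {
    for (size_t i = 0; i < aLayers.size(); ++i) {
      if (auto hit = aLayers[i]->Get(aKey)) {
        for (size_t j = 0; j < i; ++j) {
          aLayers[j]->Put(aKey, *hit);
        }
        return hit;
      }
    }
    return std::nullopt;  // full miss: compile locally, then Put()
  }

The hard part is making the cache *key* honest, which is the next point.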

Note that *any* remote-cache utilization depends on a fixed (or at least
identical-and-checked) configuration *and* compiler and system
includes.  The easiest way to achieve this might be to leverage a local
VM instance of taskcluster, since system includes vary
machine-to-machine, even for the same OS version.  (Perhaps this is less
of an issue on Mac or Windows...)

This requirement greatly complicates things (and requires building a
"standard" config, which many do not).  Leveraging local background
builds would be much easier in many ways, though also less of a win.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Hiding 'new' statements - Good or Evil?

2018-01-04 Thread Randell Jesup
[SNIP]
>> If foo->bar() can mutate the lifetime of foo, of course you must take a 
>> strong
>> reference to foo. Nothing about auto or range-for changes this.
>
>What auto does is make it really hard to tell whether a strong reference is
>being taken.
>
>> If you don't understand your lifetimes, you get bugs.
>
>Fully agreed.  The discussion is about whether auto helps or hinders that
>understanding.  The answer is that it depends on the surrounding code, from
>where I sit...
>
>-Boris

So, where do we go from here?

It's clear that auto is not as safe as some/many believe it to be; it
can, as Boris (and smaug and Xidorn) say, hide lifetime issues and lead
to non-obvious UAFs and the like.  Some uses are certainly safe, but it
isn't always obvious looking at a patch (per above), requiring more code
investigation by the reviewer.  If the resultant type isn't obvious,
then using it isn't helping comprehension.  When reading the code in the
future, if the reader has to do non-trivial investigation just to know
what's going on, it's hurting our work.

If the win is just avoiding typing, I'd say it's not worth the risk.  Or
allow it, but only with commentary as to why it's safe, or with rules
about what sorts of types it's allowed with - anything outside those
requiring justification in comments/etc.
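
To illustrate what reviewers are up against, here's a contrived,
self-contained sketch (in Gecko the strong reference would be a
RefPtr/nsCOMPtr; shared_ptr is used here so the sketch stands alone):

  #include <memory>
  #include <vector>

  struct Node {
    void RunScript() {}  // stand-in for code that can re-enter script
                         // and mutate/free other nodes
    void Update() {}
  };

  void Risky(const std::vector<Node*>& aChildren) {
    for (auto child : aChildren) {  // deduces Node* - no strong ref,
      child->RunScript();           // but nothing on this line says so;
      child->Update();              // potential use-after-free
    }
  }

  void Safer(const std::vector<std::shared_ptr<Node>>& aChildren) {
    for (const std::shared_ptr<Node>& child : aChildren) {
      child->RunScript();  // the spelled-out type makes the lifetime
      child->Update();     // guarantee visible at the call site
    }
  }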

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Bigger hard drives wanted (was Re: More ThinkStation P710 Nvidia tips (was Re: Faster gecko builds with IceCC on Mac and Linux))

2017-11-10 Thread Randell Jesup
>On 11/7/17 4:13 PM, Sophana "Soap" Aik wrote:
>For the work I do (e.g. backporting security fixes every so often) I need a
>release tree, a beta tree, and ESR tree, and at least 3 tip trees.  That's
>at least 150GB.  If I want to have an effective ccache, that's about
>20-30GB (recall that each objdir is 9+GB!).  Call it 175GB.
>
>If I want to dual-boot or have a VM so I can do both Linux and Windows
>work, that's 350GB.  Plus the actual operating systems involved.  Plus any
>data files that might be being generated as part of work, etc.

I've "solved" this by having a 2T rotating disk for the stuff I don't
use constantly - release and ESR trees, local backups, if need be I'll
move other large things there (media files, RR storage which is
currently in ~/.rr)  I have 4 inbound trees (one dedicated to ASAN) and
head/beta trees, plus a couple of "mothball" trees for reference from
old instances of alder (those could be moved, though I trust rotating
disks far less than SSD.  That said, I have had a (personal/retail) SSD
die.)

Right now, on my ~350GB Linux /home partition (there's a Windows one
too, though I rarely use it), I have ~220GB used.  (There's also a 50GB
/ partition.)  src/mozilla is 120GB (including objdirs, though I kill
them fairly aggressively if they're out-of-date).  I should move my
final aurora repo to rotating disk.

I probably am not giving anywhere near enough space to ccache, though.

Rotating disks are cheap (and easy if you have a desktop; less so,
though not horrible, if you have a laptop, especially with a dock).
They don't necessarily solve Boris's problem, however.  He could really
use a 1TB SSD, I suspect.

When I got my current laptop, I asked for some options I saw on Lenovo's
site that weren't the default config.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Website memory leaks

2017-11-02 Thread Randell Jesup
>On Thu, Nov 02, 2017 at 05:37:30PM +0200, smaug wrote:
>>This has been an issue forever, and there aren't really good tools on any 
>>browser, as far as
>>I know, for web devs to debug their leaks.
>>Internally we do have useful data (CC and GC graphs and such), but would need 
>>quite some ux skills to design some good
>>UI to deal with leaks. Also, the data to deal with is often massive, so the 
>>tool should be implemented that in mind.
>
>We do have memory tools in devtools now, and the dominator tree in
>particular can provide some useful data for tracking down leaks. But those
>tools aren't especially well-maintained at the moment, and I honestly think
>they're too flaky at this point to recommend to webdevs for general use :(
>
>It would be nice if we could prioritize them again.

Also, tools for developing a page are one thing, but usage in the field
is another.  The ability for a loaded page to know about its memory use
(and about the usage of an embedded iframe) would give web devs
information, and a way to apply their own pressure (and maybe limits).

Plus, most devs unless they're dealing with a report about a problem
won't go looking.  Something that proactively can poke them is far more
likely to get action.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Website memory leaks

2017-11-02 Thread Randell Jesup
>Many of the pages causing these leaks are major sites, like nytimes.com,
>washington post, cnn, arstechnica, Atlantic, New Yorker, etc.
...
>Perhaps we can also push to limit memory use (CPU use??) in embedded
>ads/restricted-iframes/etc, so sites can stop ads from destroying the
>website performance for users over time.  I've often seen ads taking
>300MB-2GB.

So in support of this general concept of limiting ad memory use:

https://www.scientificamerican.com/article/should-iconic-lake-powell-be-drained/
doesn't leak if you load it -- until you scroll down, and the ads load.
Then it leaks forever... to the tune of 1GB leaked in 5-10 minutes.
Differential about:memory reports show that what's primarily leaking are
ads, in particular at this moment:

400.25 MB (43.45%) -- detached
  398.81 MB (43.30%) -- 
window(https://tpc.googlesyndication.com/safeframe/1-0-13/html/container.html)
seems like the worst culprit, plus
406.69 MB (44.15%) --  
top(https://www.scientificamerican.com/article/should-iconic-lake-powell-be-drained/,id=NNN)
  327.79 MB (35.59%) -- js-zone(0xNNN)

Other ads have leaked a few MB to 75MB each.

Also, as soon as I scrolled down CPU use went from ~0% for the process
to ~20% (on a 4-thread/core AMD CPU).

Worse yet (perhaps bug 1410381??), when I hit reload, CPU use dropped
(not to 0).  Ten seconds later, memory use climbed from 3.7GB to 4.5GB,
then dropped to 3.7GB and climbed back to 4.5GB.  Then it dropped to
2.7GB - barely above where it was before I scrolled down - and stayed
stable.  (Note: I hit reload with the current position near the bottom,
with one ad visible (no video).)

Using an additional GB+ of memory in order to free 1GB of memory
seems... excessive.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Website memory leaks

2017-11-02 Thread Randell Jesup
>about:performance can provide the tab/pid mapping, with some performance
>indexes.
>This might help solve your issue listed in the side note.

mconley told me in IRC that today's nightly has brought back the PID in
the tooltip (in Nightly only); it was accidentally removed.

about:performance can be useful, but "3 tabs vs. all tabs" is too coarse
for me, and I presume things like "site leaked memory and is slowing CC"
don't show up in the 'heat' for the tab there.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Website memory leaks

2017-11-02 Thread Randell Jesup
 moving (more)
security-sensitive and/or crashy services into separate processes -- all
of which come into play here.

Side-note: removing the process-id from tabs in multi-e10s has made it
harder for me to hunt down offending tabs, since I track down runaway
memory/CPU in things like Process Explorer (usually after I notice that
the browser has gotten slow/non-responsive).  Totally reasonable, though
I'd like to see the ID back on Nightly at least - it makes starting GDB
against the right content process *much* easier, and I don't have to
limit the browser to one content process to avoid the problem.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Visual Studio 2017 coming soon

2017-11-01 Thread Randell Jesup
>2017-10-30 19:19 GMT+01:00 Kris Maglione :
>
>> Our static analysis tools are pretty good at catching a lot of lambda
>> capture bugs at this point, though. I'd be much less comfortable using them
>> if they weren't.
>>
>> It's probably worth considering whether we need to write static analysis
>> tools for a new feature before we turn it on, though...
>
>We can probably help with introducing more static analysis to avoid
>incorrect usages of C++{11,14,17} features.

Sure - but in most cases we've only realized that a static-analysis bit
was needed after hitting and solving a few (or a bunch of)
crash/security bugs -- static analysis tends to be reactive.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Default Rust optimization level decreased from 2 to 1

2017-10-31 Thread Randell Jesup
>On 2017-10-25 1:34 PM, Gregory Szorc wrote:
>> Adding --enable-release to your mozconfig (the configuration for builds we
>> ship to users) enables -Copt-level=2. (i.e. we didn't change optimization
>> settings for builds we ship to users.)
>
>I've added a note about this to our benchmarking instructions at
>https://developer.mozilla.org/en-US/docs/Mozilla/Benchmarking#Rust_optimization_level

Note that this means that profiles taken with local builds will be "off"
compared to Nightly/etc., unless you rebuild with extra options.  How
much off?  Probably not a lot unless you're super-focused on Rust code,
but that's just a guess.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Intent to ship: New default action, horizontal scroll, of wheel with Shift key (except Firefox for macOS)

2017-10-19 Thread Randell Jesup
>SGTM. BTW, bug 143038 was filed 16 years ago. Is that a bugzilla record for
>oldest fixed bug?

Not even close I think... a couple of years ago I remember some low-4-digit
bugs getting fixed (maybe even a 3-digit?) :-)

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Changes to tab min-width

2017-10-08 Thread Randell Jesup
>> I find (perhaps because I'm a mozilla user, and a tab-hoarder) that
>> Chrome's UI is DREADFUL for anyone with lots of tabs - and probably
>> intentionally to push users into closing them, since large numbers of
>> tabs simply doesn't work as well in Chrome as in Firefox.  So mimicing
>> their behavior is NOT a good idea, IMHO.
>
>I'm one of the people (a minority in this thread, it appears) who prefers
>Chrome's behavior and who'd like a skinny-tab option in Firefox, though not
>necessarily the default option. I use all the major browsers, typically
>with a dozen or more tabs open in the browser I'm using for my primary work
>at any given time. I vastly prefer Chrome's tab behavior for several
>reasons and rely on it as my primary browser in part because of it.

Sounds like a good option for an extension.

>I believe the complaint of "XYZ pixels is too narrow because it hides
>necessary information in the tab" misses another point. For me, having tabs
>vanish off one side or the other of the tab strip is a worse omission of
>important information.

I have enough tabs that there's No Way they can even be visible as close
boxes (and it's useless to me when it gets down to favicon).  And I get
that you have a different set of behaviors/preferences.

An un-discoverable feature of the Awesome Bar is
"%url-fragment/title-fragment": it will show you completions from the
list of open tabs only.  This (and using the scroll wheel to spin
quickly through an overflowing tab bar) makes large numbers of tabs
feasible without going the huge-number-of-windows route.

I think we could do skinnier tabs if we retained the ability to see what
they are (not just favicon) easily.  The more I think of it, the more I
like the dock-like expand-what's-near-the-mouse idea - I wonder how easy
it is?

>I typically navigate my 20 or 30 or 40 tabs mostly by keyboard,
>cycling one way or the other across the tab strip, and for me the spatial
>arrangement is very important (as is tab-switching speed).

I find spatial organization is also quite useful, but I orient myself to
it visually.  I never use the keyboard nav - don't even know what the
bindings are. :-)

>There is an argument to be made for making life easier for people moving to
>Firefox from Chrome, which clearly is an ambition in the current Firefox 57
>Quantum push. I don't have any data about how widespread my preferences are
>or how much of a barrier it is adjusting to Firefox's disappearing tabs,
>but this heavy Chrome user prefers Chrome's approach.
>
>I'm not trying to argue that my preferences are universal. But I expect
>Google made its choices pretty carefully and not as a way to punish people
>for using too many tabs. I use lots more tabs than the average user, but I
>expect the general trend is drifting toward more and more tabs, so graceful
>handling of overflow will become important for a larger fraction of people
>as time goes on.

Having an option for tab-handling might be good; it's a primary way
users interact with the browser -- and *no* single way is the Right Way
for all users.  Extensions can help here, modulo most users don't look
for them.  You could also offer that (extension or option) when they
import profile data from Chrome, or put a link in Prefs to a Chrome-like
tab WebExtension (Prefs *could* have links to relevant extensions).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Changes to tab min-width

2017-10-06 Thread Randell Jesup
>On Fri, Oct 6, 2017 at 12:57 AM, Lars Hansen  wrote:
>
>> even if I don't exactly remember the ID I'm looking for I can narrow
>> it down to one or two tabs and then hover if I need to.
>>
>> Many other sites also have tabs that can be distinguished
>> from the first few letters - if you can see them.​  (Too many do not.)
>> Indeed now that I have this pref to play with I have set it to 200 and this
>> seems even better.
>
>​I certainly feel your pain wrt bugzilla tabs​, but outside our community's
>unique intense use-case it's not that common amongst release users (see
>stats given earlier about high percentages with < 10 and < 20 tabs).

I don't know about you, but a common case for me is go-to-news-site,
open-N-articles-in-tabs, read-articles (maybe ;-) ).  Probably learned
that in the days of less bandwidth; stuff can pull down in the
background.  Saves a lot of go-back,
wait-for-page-to-load/render/scroll/etc.

For that usage, I need at least a word or so of the title - N tabs
together with the same favicon.  Current width mostly works for this.

I find (perhaps because I'm a mozilla user, and a tab-hoarder) that
Chrome's UI is DREADFUL for anyone with lots of tabs - and probably
intentionally to push users into closing them, since large numbers of
tabs simply doesn't work as well in Chrome as in Firefox.  So mimicing
their behavior is NOT a good idea, IMHO.

I totally hate the favicon-only view I'm seeing now.  It's miserable.
If I were a user and didn't know about the hidden pref, I'd downgrade,
and if it wasn't fixed soon I'd look for another browser (or some
extension).  That's how bad it is.

That said: there are other possible solutions.  There's the infamous
"add something to user Preferences".  There's "publish an extension that
lets you fiddle with the width" (doable today).  You could put it on the
right-mouse-button menu on the bar.  Etc.  Or... you could do some sort
of Apple-dock-like "tabs (in the bar) expand as you move the mouse
over/near them", which allows much better "find the tab I want" than
narrow tabs - even better than 110px/etc., because you can see more/all
of the title without having to wait for a hover to take hold (and you
could probably see more of the ones near the mouse).

Of course, that's a lot easier said than done. :-)

>Better still would be making vertical tabs a selectable option in
>about:preferences as a common (and unique to Firefox[1]) power-user option
>rather than making us find and install an add-on.

Yes!

>> I feel that the many-tabs use case is treated as a stepchild in
>> Firefox.​
>>
>> Notably the drag-to-reorganize scheme does not work well for many
>> tabs; a tab pane a la what tabgroups had would be superior.
>
>​Completely agree. Even if we don't support "groups"​ of hidden tabs, the
>tableau visualization was great for managing overstuffed windows and
>quickly culling the tabs spawned during some now-complete task or research.

Yes!  I actually often cull tabs from session-restore (vertical list
with titles!) or from about:tabs (Tab-stats extension, now broken -- Glandium!?).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Follow up on clang-format

2017-09-28 Thread Randell Jesup
>On Thursday, September 28, 2017 at 10:15:05 AM UTC+2, Chris Pearce wrote:
>Oh d'oh! Looks like I replied to an old thread, and the plan now is in fact
>to clang-format the entire tree after 57. Sweet as!

Where did you find that?  Was this plan communicated?  It's not in this
thread... (which started with smaug replying to a 2014 post by Anthony).
Personally, this will mostly make me sad/annoyed and make figuring out
why code was landed a pain (it can be a real pain for people who do a
lot of reviews), but I'll deal with it.  I'm more concerned that this
wasn't communicated (or discussed!) well, since 58 is already open...

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Coding style: Argument alignment

2017-09-11 Thread Randell Jesup
>On 8/31/17 6:50 PM, Ehsan Akhgari wrote:
>> On 08/30/2017 09:03 PM, Eric Rahm wrote:
>>> Okay sounds like there's agreement on a).
>>
>> Indeed.  I have edited the Style Guide to reflect this.
>
>Fwiw, I disagree with this decision.
>
>Here's a more realistic example of *actual use* from our code base:
>http://searchfox.org/mozilla-central/rev/d29c832536839b60a9ee5b512eeb934274b879b0/layout/forms/nsFormControlFrame.h#57-64
>
>This example shows that it's easier to read arguments at a glance
>when they are aligned, IMO.

I agree with Mats here - it's easier to read, specifically in the case
where you have a large number of args, especially of widely varying
type-lengths.  Obviously, one could disagree, but that's a good example.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: disabled non-e10s tests on trunk

2017-09-11 Thread Randell Jesup
>you could edit the taskcluster/ci/test/tests.yml file and add
>|linux64/debug: both| in the places we have |windows7-32/debug: both|, as
>seen here:
>http://searchfox.org/mozilla-central/source/taskcluster/ci/test/tests.yml#138
>
>If there are concerns with windows build times, please ask for those to be
>faster- it will make it better for all of us :)

Where's my magic wand...?  If it were easy to make them faster, I
presume that would have been done long ago.

I frequently do (only) Linux try runs unless I suspect platform-specific
issues.  Saves resources...  Can't "mochitest-media" (and other non-e10s
versions) be not-run by default, unless you specify it explicitly in -u?

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Phabricator and confidential reviews

2017-09-02 Thread Randell Jesup
>> Bite the bullet and at least make all CC'd people able to see all
>> patches, always.  It's needed.
>
>Yeah, that's the direction I think we should take.

Good, thanks.

>For now, we will implement exact syncing of the CC list + reporter as the
>revision's subscriber list.  This means that anyone being added as a
>subscriber who isn't CCed will unfortunately be quickly removed (we may
>even prevent manual additions to the subscriber list).  The reasoning here
>is that I believe it's important that anyone looking at the bug knows
>exactly who can see the patch.  If the two lists are kept in sync, it will
>be obvious.  However, if we let people add subscribers directly, then it
>may turn out that a bug with only, say, 10 CCed people might have a patch
>that 100 people can view, and that's probably something you don't want to
>hide.
>
>We'll also investigate what bidirectional syncing would take.  Due to race
>conditions and circular references, though, it won't be straightforward,
>and I suspect our time would be best spent elsewhere.

Don't bother with bidirectional syncing - your point about being able to
look at the bug and know who can see it is important, and I don't see
any advantage in mirroring subscribers into the CC list.  I would (as
you suggest) block subscribing on sec issues, to avoid confusion (if you
can, tell people to use CC instead, but that's just gravy).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: ./mach try fuzzy: A Try Syntax Alternative

2017-09-02 Thread Randell Jesup
>+to...@lists.mozilla.org
>
>There have been a bunch of new features added here I'd like to highlight:

>   - --env: You can now set environment variables in your tasks directly
>   from the command line, e.g:
>  - ./mach try fuzzy --env XPCOM_DEBUG_BREAK=stack --env
>  MOZ_LOG="example_logger:3"|

This is *awesome*; I've been wanting this for YEARS.  Does this work
without 'fuzzy'?

>   - --save/--preset: Works the same as try syntax, using the --query
>   argument. E.g:
>  - ./mach try fuzzy --save reftest -q "!cov !pgo 'reftest !jsreftest"
>  - ./mach try --preset reftest

Also really great.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Quantum Flow Engineering Newsletter #22

2017-09-01 Thread Randell Jesup
>   - We found out that MediaStreamGraphStableStateRunnable is #20 on this
>   list, which was surprising as this runnable is only supposed to be used for
>   WebRTC and WebAudio, neither being extremely popular features on the Web.
>   Randell Jesup found out that there is a bug
>   <https://bugzilla.mozilla.org/show_bug.cgi?id=1395012> causing the
>   runnable to be continually dispatched after a WebRTC or WebAudio session is
>   over.

This may have already been fixed; after updating and restarting I've
been unable to reproduce, and padenot indicates he fixed a shutdown bug
in MSG in the last week or so before I tested.  Also, further analysis
by billm showed that only 10% of sessions had MSGStableStateRunnables,
but in those 10% it was the most frequent runnable.  This may be as
expected, since when in use the runnables update MainThread media state
50-100 times per second.

We hope to at some point move (most of) these runnables to atomics for
updating MainThread, and we'll look to see if we can throttle the
events/s from MediaStreamGraph.
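
The shape of the atomics change would be something like this coalescing
sketch (not the actual MSG code):

  #include <atomic>

  class StableStateNotifier {
    std::atomic<bool> mPending{false};

   public:
    // Graph thread, ~50-100x/sec: only dispatch if no runnable is
    // already in flight; a queued one will pick up the fresh state.
    void GraphUpdated() {
      bool expected = false;
      if (mPending.compare_exchange_strong(expected, true)) {
        DispatchToMainThread();
      }
    }

    // Main thread.  Clear the flag *before* reading state, so an
    // update that races with us triggers a fresh dispatch.
    void RunOnMainThread() {
      mPending.store(false);
      // ... read the latest state snapshot, update DOM-side objects ...
    }

   private:
    void DispatchToMainThread() {
      // e.g. NS_DispatchToMainThread(<runnable>) in Gecko
    }
  };

This bounds the dispatch rate at one runnable in flight rather than one
per graph iteration.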

Note that monitoring entries in about:telemetry can be handy for
tracking down issues.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Phabricator and confidential reviews

2017-08-25 Thread Randell Jesup
>On Wed, Aug 9, 2017 at 11:32 AM, Mark Côté  wrote:
>
>> I actually like Gijs's proposal, to mirror *from* Phabricator *to* BMO.
>> That way, if you're looking at the bug and want to pull someone in, you CC
>> them; if you're looking at the fix and want to involve someone, you add
>> them as a subscriber which then incidentally lets them view the bug, for
>> background and updates and such.
>
>
>​What if there's not "the" fix but multiple patches? This is quite common
>for security bugs where a testcase that reveals the vulnerability is done
>in a separate patch so it can be checked in after we release the fix. Or
>occasionally bugs like https://bugzilla.mozilla.org/show_bug.cgi?id=1371259
>that had 9 separate patches. Would the same people have to be separately
>subscribed to each?​

I've seen landings (for non-security bugs) that involved up to 100
patches on one bug... or even more.  While that's probably never
happened for a sec bug, multiple patches is not unusual (and in fact
common where there's a test added).

>And don't forget reporter and assignees. Occasionally a reporter not in the
>security group will notice that a patch is insufficient which is nicer to
>find before the patch is committed than after the commit link is added to
>the bug. Sometimes someone other than the official assignee adds a patch
>for an alternate approach to a fix and asks the assignee for feedback,
>though that's uncommon and the assignee could just be manually subscribed
>to the patch at that point.

In fact it's quite common for cc'd people other than the patch writer
and reviewer(s) to look at the bug and comment on it, or to submit an
alternate or modified version of a previously uploaded patch.  And as
Dan points out, frequently the reporter (when an external 'researcher')
will weigh in on the patches being posted, or use them to verify that
they fix the problem they found (since many times only they can reliably
reproduce the problem - private test setups, drivers, HW, etc.).

>We can wait and see how common the "please subscribe me to the patch"
>requests are, but I suspect they'll be common enough.

Bite the bullet and at least make all CC'd people able to see all
patches, always.  It's needed.

Another aspect of all this that needs to be thought out and verified is
what happens when a non-sec bug becomes a sec bug (and vice-versa,
though I'm far less concerned about that).   When a bug becomes a sec
bug, all patches associated with that bug must become confidential.  And
when a bug moves to be open (not sec-restricted), the patches should
also become open.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Phabricator Update, July 2017

2017-07-19 Thread Randell Jesup
>On 2017-07-14 1:31 AM, Jim Blandy wrote:
>> Many people seem to be asking, essentially: What will happen to old bugs?
>> I'm trying to follow the discussion, and I'm not clear on this myself.
>>
>> For example, "Splinter will be turned off." For commenting and reviewing,
>> okay, understood. What about viewing patches on old bugs?
>
>Migration-specific issues are still in flux, since we have still have
>plenty of time to sort them out.  I'll be clarifying the migration plan as
>we get closer to turning off old tools.  So far we've agreed that keeping
>Splinter in read-only mode to view old patches is totally fine, and Dylan
>(BMO lead) mentioned that there are even nicer diff viewers that we can
>integrate with BMO, like  https://diff2html.xyz/, which is currently in use
>by Red Hat's Bugzilla installation.

Good!  Your recent posting made me believe you'd gone back to "splinter
must be totally turned off".  Thanks.  I still have significant
questions about how security-bug access and review (and the security of
such items) will work; is there a design?  What has changed since our
discussion in May, and have you met with the security engineers and
major security reviewers/triagers?

Thanks

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Phabricator Update, July 2017

2017-07-19 Thread Randell Jesup
>On 7/13/17 9:04 PM, Mark Côté wrote:
>> It is also what newer systems
>> do today (e.g. GitHub and the full Phabricator suite)
>
>I should note that with GitHub what this means is that you get discussion
>on the PR that should have gone in the issue, with the result that people
>following the issue don't see half the relevant discussion. In particular,
>it's common to go off from "reviewing this line of code" to "raising
>questions about what the desired behavior is", which is squarely back in
>issue-land, not code-review land.

Yes, exactly.  Once the data is in two places, there's no way to avoid
some discussions occurring in the "wrong" place.  And GitHub shows this
clearly happens all the time, so you still have to flip back and forth
between N review-streams of comments and the bug - maybe checking the
dates of comments - in order to understand everything.

>Unfortunately, I don't know how to solve that problem without designating a
>"central point where all discussion happens" and ensuring that anything
>outside it is mirrored there...

Ditto - I think it's inherent if you can reply to review comments in the
separate tool.  Making the review tool *worse* for commenting
(feature/ease-wise) can actually kinda help, at the cost of making it
harder to respond to review comments.  Google's issue-vs-code-review
tooling kinda takes this approach, from what I've seen, with *very*
non-integrated patches vs. bugs.

GPS's comments about the pain of syncing also make total sense, in that
MozReview tried to mostly attach to existing API points of Bugzilla and
do things no one anticipated would be done.

I don't think limiting comments in Phabricator is a viable option, nor
making it less-featured, so we may have to eat the pain and lash people
with wet noodles if they post in the "wrong" place.

-- 
Randell Jesup, Mozilla Corp
remove ".news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Phabricator Update, July 2017

2017-07-13 Thread Randell Jesup
>On Wed, Jul 12, 2017 at 11:27 AM, Byron Jones  wrote:
>
>> But indeed having also the patches in bugzilla would be good.
>>>
>> no, it would be bad for patches to be duplicated into bugzilla.  we're
>> moving from bugzilla/mozreview to phabricator for code review, duplicating
>> phabricate reviews back into the old systems seems like a step backwards
>> (and i'm not even sure what the value is of doing so).
>
>I find this a strange argument.  We don't know how successful phrabricator
>is going to be.  The last attempt to switch review tools seems to be
>getting killed.  I think its somewhat reasonable to be skeptical of further
>review tool changes.

I quite agree...  And I hate the idea of having the data spread across
systems (which I also disliked in MozReview, which I tried and it ended
up being really bad for some aspects of my ability to land patches).

>Also, I have to say the whole phabricator thing has been kind of presented
>as a fait accompli. As someone who continues to use splinter on a daily
>basis I'm 1) skeptical and 2) kind of unhappy.
>
>Maybe phabricator will be great, but I'll believe it when I see it.  Please
>don't force future data and workflow into an as-yet unproven tool.

This whole sequence strikes me as "Let's launch our new boat - and oh,
by the way, we're going to sink the old boat(s) so no one is tempted not
to move to the new boat.  And, BTW, we're leaving behind some important
things, and we'll think about how to deal with those sometime: maybe
before we launch, maybe later, maybe never."

While this may force people onto the new boat, I'm seriously unconvinced
this will be better in the short term, or that it won't end up slowing
down N*100 developers, either for a while or permanently.  And the lack
of transparency in developing this plan is also very concerning.

IMO

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Phabricator Update, July 2017

2017-07-13 Thread Randell Jesup
>>> To answer the other part of your question, MozReview will be disabled for
>>> active use across the board, but it is currently used by a small number of
>>> projects.  Splinter will be disabled on a per-product basis, as there may be
>>> some projects that can't, won't, or shouldn't be migrated to Phabricator.
>>
>> Splinter is still a nice UI to look at patches already attached to bugs.
>> Please don't disable it.
>
>excellent point; thanks for that feedback.
>
>instead of disabling splinter for phabricator backed products, we could
>make it a read-only patch viewer.

I would consider it a massive fail if we disabled even read-only
Splinter viewing of patches.  As noted before, let's make sure we don't
lose things we agreed on as issues.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Intent to unship: moz*Frames attributes of HTMLVideoElement

2017-07-10 Thread Randell Jesup
>On Monday, July 10, 2017 at 3:28:07 PM UTC+12, Randell Jesup wrote:
>I added these stats originally, and they are now mostly superseded by the
>stats provided by VideoPlaybackQuality. So I support their removal (in fact
>I suggested to Tim that he remove them).
>
>Adding telemetry to learn how often these stats are used in the wild seems
>like a good idea. I'd be surprised if these are still used, but I'm happy
>to be proved wrong!

I know various webrtc app developers have asked about how to determine
if video elements are showing frames, which they've used as "is video
still flowing" indicators, among other things.  We can try to find
alternative ways to get them what they need; our tests, for example, use
it to know when to sample the image to see if media is successfully
flowing end-to-end, and when to sample with a canvas to see if the
"right" color is there (IIRC).

>Randall: doesn't VideoPlaybackQuality support the use cases you're concerned 
>about?

If VideoPlaybackQuality is supported for srcObject streams, and
totalVideoFrames works (frames as received from the MediaStream by the
video element), then it can substitute.  The other fields in the
VideoPlaybackQuality object are likely much harder to hook up (or
define) for WebRTC or MediaStream/srcObject inputs in general - it would
imply additional streams of metadata flowing independently of the actual
frames (i.e. droppedFrames would need to be updated even though no frame
was sent down the stream from the PeerConnection.)

And there are other sources of MediaStreams, in particular
mediaelement.captureStream() and canvas.captureStream(); someone needs
to define how those interact with srcObject and VideoPlaybackQuality
(and fun cases like PeerConnection -> MediaStream -> MediaElement ->
canvas (webGL, etc) -> captureStream -> MediaElement)

I presume no one working on VideoPlaybackQuality was thinking about
non-streaming sources for video when they defined it.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Switching to per-channel profiles

2017-07-10 Thread Randell Jesup
>What happens when users do that? Because they do.
>
>A variety of kinda-horrible things will happen.
>
>The two copied profiles will compete for the Sync client record. That means
>sent tabs will appear on one or the other, the Tabs from Other Devices list
>will flip-flop between each of the two devices, the count of devices will
>be wrong for scheduling purposes, and the two devices will unpredictably
>handle bookmark repair requests, leading to distributed inconsistency.

Ouch.  I'm glad that when I copied profiles I didn't (and still don't)
use Sync; having this happen would be ... surprising - and painful, it
sounds like.

I wonder if it's possible to flag moved (or copied) profiles, and force
them to go through some sort of re-sync (effectively becoming a "new"
client for the account, even if the profile was only moved to a new disk
location).  That *should* in theory avoid the consistency problem, and
only annoy users without reason in the odd case where they moved their
profile(s).
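
Roughly what I have in mind, as a minimal sketch (all names here are
hypothetical, not actual Firefox code): stamp the profile with the path
it was last opened from, and treat a mismatch at startup as "moved or
copied - re-register as a new Sync client":

  #include <filesystem>
  #include <fstream>
  #include <string>

  // Returns true when the profile directory is not where it claims it
  // was last opened from, i.e. it was moved or copied and should get a
  // fresh Sync client identity.
  bool ProfileWasMovedOrCopied(const std::filesystem::path& profileDir)
  {
    const auto marker = profileDir / "profile-origin.txt";
    std::string recorded;
    if (std::ifstream in{marker}; in) {
      std::getline(in, recorded);
    }
    const std::string current = profileDir.string();
    if (recorded == current) {
      return false;                    // opened from the recorded location
    }
    std::ofstream{marker} << current << '\n';  // re-stamp for next startup
    return !recorded.empty();          // a missing marker is first-run, not a move
  }

A caller would then force the "become a new client" re-sync when this
returns true.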

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Intent to unship: moz*Frames attributes of HTMLVideoElement

2017-07-09 Thread Randell Jesup
>On Fri, Jul 7, 2017 at 11:54 AM, Jet Villegas  wrote:
>
>> What do we expect to break?
>
>
>I can see that video quality auto adjusment which is based on these APIs
>will become malfunction. But, I don't know is this a real use case that a
>website implement video quality adjusment based on these APIs.

This will break a number of our internal mochitests.  I'm not sure
without some checking if the standardized properties can replace them (I
think the answer is no in some cases).  We could put them behind a pref
for testing use.

>
>> Who's out there using these APIs now?
>>
>

I do believe a number of webrtc services may use these; it's hard to be
sure because many of them are not easily findable by searching github
or search engines.

>There are some addons using these APIs to report media statistics. For our
>internal use, there are only some test code using them. And I think the
>VideoPlaybackQuality can be used to replace them since it provides a
>similar feature.

In webrtc and related use, quality is not what these measure; we want to
know if frames have arrived or been painted (generally using
mozPaintedFrames).  There is one test using mozParsedFrames.

Before landing any deprecation warnings, you should check with the
media/webrtc teams, and we may want to check how one or two external
users are using it.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: switch to macosx cross-compiled builds on taskcluster on trunk

2017-06-21 Thread Randell Jesup
>At 11:00PT today, we will be landing patches to run mac opt builds on trunk
>as cross compiled builds on Linux machines on taskcluster.  As this change
>is uplifted to m-c, nightly builds for mac will also switch to run on
>taskcluster on Linux.  We will be testing to ensure that updates work as
>expected.  We don’t expect any impact to developers as this has been tested
>extensively on a project branch as well as by manual testing by PI.

Does this have any effect on our still using the 10.7 Mac SDK?  I've had
to add a number of hacks to imported code to make it buildable with the
10.7 SDK, and we can't use certain newer features/APIs because of it.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: BaseThreadInitThunk potential collateral benefits

2017-06-19 Thread Randell Jesup
Liz wrote:
>I'm so happy for this hammer to slam down. Goodbye, at least a good bit of
>horrible DLL injections and crashes, and horrible BaseThreadInitThunk.
>Thanks for the explanation. The suggestions for next actions sound good
>too.

Hear hear!  And thanks to Carl (and David Durst, and JoeH) for jumping
on this when Liz and I (and others) from the regression triage group put
the screws to Joe Hildebrand (I threatened to withhold dessert!) to
prioritize it.

>On Mon, Jun 5, 2017 at 2:36 PM, David Durst  wrote:
>> I wanted to call attention to https://bugzilla.mozilla.org/
>> show_bug.cgi?id=1322554 (Startup crashes in BaseThreadInitThunk since
>> 2016-12-03) -- a startup crash resulting from DLL injection early in the
>> process -- which was resolved fixed for 32bit Windows on 5/24.
>>
>> This particular fix[0] has the potential side effect of solving other bugs
>> as well, ones that may have very different signatures that deal with
>> remotely-injected code[1].
>>
>> Just a caution that it's not going to impact all injection issues. We know
>> that there is an as-yet-unidentified increase and decrease in this crash
>> prior to patch landing[2]. Carl Corcoran, who landed this, will be looking
>> at that time period as well to attempt to find contributing causes (any
>> help diagnosing is probably welcome, though I defer to ccorcoran).
>>
>> Also, just a shout out that this was Carl's first patch.
>>
>> Thanks!
>>
>>
>> [0] The proposed/chosen solution is roughly here:
>> https://bugzilla.mozilla.org/show_bug.cgi?id=1322554#c69
>> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1322554#c168
>> [2] https://crash-stats.mozilla.com/signature/?product=
>> Firefox&release_channel=release&signature=BaseThreadInitThunk&date=%3E%
>> 3D2016-12-05T14%3A42%3A31.000Z&date=%3C2017-06-05T14%3A42%3A31.000Z#graphs

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: The future of commit access policy for core Firefox

2017-06-14 Thread Randell Jesup
e all the
>time. So you have to pick a reasonable default then debate about adding
>complexity to placate the long tail or handle corner cases. Standard
>product design challenges. On top of that, technical users are the 1% and
>are a very difficult crowd to please.

I've dealt with these by handling rebases myself (i.e. reviewers don't
get involved), with the rare exception that a rebase requires a
non-trivial rewrite, in which case I just ask for re-review (of the
whole patch or just that part), or write the rebase as a new patch on
top of the rejects.  This is rarely needed, however.

Interdiffs for responding to review comments - sometimes (for trivial
patches) I just replace and r? the entire patch.  More complex ones I'll
set up as a separate patch (an interdiff) and ask them to review that,
then fold/merge before checkin (yes, I still use mq and splinter - but
the general idea would apply regardless).  And yes, MozReview does a
MUCH better (and more automatic) job of dealing with interdiffs.  But
that exposes you to the rebase problem.  As you say, there isn't an easy
answer.  Or just never rebase until the patch is "done", then deal with
rebasing all at the end.  (And, of course, for some long-in-review patch
queues, this won't be possible.)

>I'm sensitive to concerns that removal of "r+ with nits" will provide a net
>productivity regression. We should be thinking of ways to make landing code
>easier, not harder. This is a larger discussion spanning several topics, of
>course. But the point I want to make is we have major, systemic problems
>with code review latency today. Doing away with "r+ with nits" exacerbates
>them, which is part of the reason I think people feel so passionately about
>it. I feel like there needs to be a major cultural shift that emphasizes
>low review latency, especially for new and high volume contributors. Tools
>can help some. But tools only encourage behavior, not enforce it. I feel
>like some kind of forcing function (either social/cultural or formal in
>policy) is needed. What exactly, I'm not sure. At this point (where I see
>very senior engineers with massive review queues), I fear the problem may
>have to be addressed by management to adequately correct. (Yes, I hated
>having to type the past few sentences.)

We have greatly improved review latency in the last year or two (an
extreme case was people leaving reviews active for 9 months or a year
because they disagreed with the patch direction).  It's much better now,
but it isn't perfect, and trying to drive it a lot lower can have
negative impacts that may be hard to measure.  Cf. Peopleware, and the
impact of interruptions on productivity (and quality).  Many people
don't read email for stretches because they're busy working on something
(or meetings, etc).  Some even disconnect from the internet to focus
(roc did).  People have PTO, they have doctor's visits, they have p1
quantum bugs that are blocking other people - not all reviews are equal.

>I think Steven MacLeod's response was highly insightful and is worth
>re-reading multiple times. Pursuant to his lines of thought, I think there
>is room to have more flexible policies for different components. For
>example, high-value code like anything related to cryptography or security
>could be barred from having "r+ with nits" but low-impact, non-shipping
>files like in-tree documentation could have a very low barrier to
>change.

Having tighter requirements for select portions of the tree would be ok.
We effectively have that for things like NSS since it's imported
periodically.  Such a limited tightening would be very different than
what's proposed.

>Of course, this implies flexible tools to accommodate flexible workflows
>which may imply Mozilla's continued investment in those tools (which is in
>contrast to a mindset held by some that we should use tools in an
>off-the-shelf configuration instead of highly tailoring them to our
>workflows). I like the simplicity of one policy for everything but am not
>convinced it is best. I'd like more details on what other large companies
>do here.
>
>I hope this post doesn't revive this thread and apologize in advance if it
>does. Please consider replying privately instead of poking the bear.

I did consider it.  But I can't stay publicly quiet with a posting
stating "this *will* happen" (emphasis added) on the table.

I certainly know of the vulnerabilities here, but I also see how much
friction this might cause for our development cycle, with little
real-world benefit (IMHO).  The last thing we need is to move a lot slower.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Fwd: Changes to code review tools and processes

2017-05-12 Thread Randell Jesup
>This was posted on mozilla.tools yesterday but doesn't seem to have made it
>to dev.platform. It is likely to be of interest to many people on this
>list. Note the comment "If you have questions or concerns beyond those
>addressed below, please
>use the mozilla.tools group and/or #phabricator on IRC and Slack to raise
>them. " in the message

I tried cross-posting my mozilla.tools response here, but it appears
not to have worked.  I had extensive comments in mozilla.tools about a
number of issues; I strongly advise anyone even slightly interested to
go there and read them.  Most important were issues around who trialed
it, security bug/patch access, control of data, archiving, the ability
to look at old patches as more than a raw diff, and where review
comments live (and more).  (Mark responded, and I followed up with more
comments.)

They do plan to put up a Wiki page on all this soon.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Is there a way to improve partial compilation times?

2017-03-07 Thread Randell Jesup
>On 07/03/17 20:29, zbranie...@mozilla.com wrote:
>
>> I was just wondering if really two days of patches landing in Gecko should 
>> result
>> in what seems like basically full rebuild.
>> 
>> A clean build takes 65-70, a rebuild after two days of patches takes 
>> 50-60min.
>
>That seems pretty normal to me nowadays.  My very vague impression is that
>this has gotten worse in the past few months.  Nowadays I assume that every
>m-i to m-c sync (twice a day) will entail more or less a full rebuild.

Someone hits Clobber, or someone touches an MFBT or xpcom header, and
poof, full rebuild.

>I would also say that, if your machine takes 65-70 mins for a clean build,
>and that's on Linux, you need a faster machine.  I reckon on about 23 mins
>for gcc6 "-g -Og", and clang can do it in just about 20.  Linking is also
>a bottleneck -- you really need 16GB for sane linking performance.

Right.  My 5.5-year-old desktop Linux machine (admittedly 1 (old) XEON,
16GB, SSD) does builds in around 25ish min (up from about 12 a couple of
years ago!).  I suspect the issue is that the laptop has a lower-HP/MHz
CPU (maybe), and/or the cooling solution is causing thermal throttling -
mobile CPUs often can't run all-cores-flat-out for long in typical
laptop cooling.

Windows builds slower.  Of course.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Sandboxing plans for Media, WebRTC and Networking

2017-01-23 Thread Randell Jesup
Following some discussions at the all-hands around audio sandboxing and
mtransport's blocking of linux sandbox tightening, and discussions with
the necko team, I decided to explore the options available to us before
we got too far down a path-of-least-resistance.  The following is the
result of that analysis and of discussions with the people involved
(Media/WebRTC/Networking/Sandboxing teams).

The full document with details on the pluses and minuses of each option
is here:
https://docs.google.com/document/d/1cwc153l1Vo6CDuzCf7M7WbfFyHLqOcPq3JMwwYuJMRQ

  Randell Jesup

Media, WebRTC and Network Sandboxing plans

This document is meant to lay out the options for Media and WebRTC
sandboxing, as well as options for Necko.  Changes need to be made here
in order to tighten the Content sandboxes further, in particular for Audio
input/output and because some mtransport code (ICE/TURN) uses OS calls
for IP/interface discovery that we want to lock down.  Before we start
making changes, we should vet the design and determine if it makes sense
to move more code & where the moved code should live - Master (“Chrome
Process” in MDN docs), Content, or a separate sandbox.

Conclusion: I recommend two new sandboxes in the short term (option 4.5
below) - one for Audio in/out and Video in, one for mtransport.  (We'd
combine mtransport into the audio/video one if the sandboxes cause too
much (memory, startup) overhead.)  For later stages, we should first
consider sandboxing Necko protocols (especially new/external ones) a la
GMP, and
finally we should consider adding to the media sandbox MSG, WebAudio,
gUM, and PeerConnection/pipeline code.  That last would be a much larger
change, though with some significant architectural security advantages
potentially.

The current state:
* Video input lives in Master (CamerasParent) and talks to CamerasChild in 
Content
* Video display is handled by Compositor (via independent sandbox and IPC 
channel)
* Audio input and output are handled in Content - both by cubeb in
  full_duplex, and output by cubeb and input by webrtc.org code for
  non-full-duplex.
* WebRTC networking code (media/mtransport) handles ICE/TURN and routing
  RTP/RTCP over the ICE/TURN channels via IPC, using UDPSocket/TCPSocket
  code with packet filters. 
* mtransport talks to PeerConnections and JSEP/signaling
* Some IP/interface discovery code in mtransport makes OS calls from Content
* PeerConnections run in the Content code
* WebRTC codecs run there too, except for things like OpenH264 (in GMP sandbox)
* MediaDecoders run in Content (jya has made some changes here on some
  platforms; DXVA decoders run in the GPU process) 
* Codecs run in Content, GPU, or GMP sandboxes (especially EME)
* Mac may do similarly in the future
* MediaStreamGraph (MSG) runs in each Content process, as needed.  In
  some cases if used by chrome code it will run in Master as well.
* MSG runs either off a timer (if there’s no sound output) or off of
  audio driver callbacks on OS driver threads (with sound output).
* There can be multiple MSGs in the same process (for example for
  disconnected non-realtime MSGs used for WebAudio, and in the future
  multiple MSGs to handle output device selection) 
* WebAudio runs as part of MSG
* Necko code (used by HTTP/etc and by mtransport for low-level IO) runs
  in Master
  This includes protocols, encryption, cache and many other bits

Needs:
* For sandboxing work, we want to lock down Content process such that
  audio must move out of Content. 
* Deadlines for output data must be met to avoid underruns; tougher
  though doable when the request must transit IPC 
* mtransport needs to stop making OS calls from Content

Wants:
* Minimize complexity
* IPC adds complexity, especially when anything is synchronous.
* Complexity adds bugs
* Complexity slows down maintenance and makes code harder to understand
* Security
* A sandbox is more secure than the Master process
* Especially for code that touches arbitrary data from outside -
  i.e. codec data (encode (can come from canvas) or especially decode),
  and network data (packets can contain most anything if it gets by the
  packet filter, which is mostly about origin). 
* Complex code (codecs) often has hidden bugs, and fuzzing helps but
  doesn’t guarantee holes don’t exist. 
* ICE/TURN is pretty complex networking code, and a good chunk of it is
  in legacy C (no refcounting/etc). 
* Developing new wireline protocols (such as QUIC) in userspace adds new
  risks, especially as fixes and improvements will be rapidly iterated
  on, and they’re exposed to crafted packets. 
* Failure of the code in a sandbox doesn’t give up the whole system
* Firewalling (sandboxing) vulnerable media/networking code from Content
  is also useful, since a break of the media/networking code would
  require a sandbox escape to get access to raw content (passwords, bank
  data, etc). 
* This is part of why GMP is used (though not the only reason).
* Running imported code in a sandbox, e

Re: Intent to implement and ship: RTCDTMFSender

2016-08-26 Thread Randell Jesup
>>David, do you know whether Edge is generally planning to implement this too
>>as they implement the WebRTC API?  What about Safari?
>
>Edge is generally onboard with the DTMF API from what I know, though I
>haven't discussed specifics with them.  We did talk about it a little at
>Cluecon (Freeswitch conference) two weeks ago, in a panel discussion.

Also, the MS guys indicated they will be launching Skype over WebRTC (in
alpha now), and Skype For Business (formerly Lync) via WebRTC soon.
Skype and especially Skype For Business need DTMF support, and they
would like to see the API in Firefox.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Intent to implement and ship: RTCDTMFSender

2016-08-26 Thread Randell Jesup
>On 8/26/16 8:30 AM, Anne van Kesteren wrote:
>I filed a few more.  Mostly editorial, but
>https://github.com/w3c/webrtc-pc/issues/775 is substantive.

Thanks.  We'll get that dealt with.

>David, do you know whether Edge is generally planning to implement this too
>as they implement the WebRTC API?  What about Safari?

Edge is generally onboard with the DTMF API from what I know, though I
haven't discussed specifics with them.  We did talk about it a little at
Cluecon (Freeswitch conference) two weeks ago, in a panel discussion.
MS guys were in remotely using Edge, connecting via webrtc to a
Freeswitch conference (which uses adapter.js).  (Video in Edge (H264
only so far) requires flipping a flag in about:flags in the anniversary
release of win10.)  We had someone in via Safari (using a plugin for
webrtc) as well.

Freeswitch developers are very interested in DTMF support since they're
generally PBX/IP-phones and SIP trunking/etc people.

Apple of course never says anything, but they're clearly working on
WebRTC support.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Rationalising Linux audio backend support

2016-08-25 Thread Randell Jesup
>I just landed some telemetry to measure the usage of all audio backends,
>we'll have data soon. 
>
>This was bug 1280630, and the probe is at [0]. This also measures
>failures to open a stream and usage of backends that should not be used
>on certain platform, like winmm on windows vista+. 
>
>Also I support this proposal. 

We have some data now: in Aurora 50 it's 96.5%/3.5% (Pulse/ALSA), and
Nightly 51 is 98%/2%.

Of course, even Aurora users aren't necessarily "typical" users, but I
imagine these numbers will not move much in favor of ALSA in Beta or
release, and will more likely move towards Pulse ("users" are more
likely to be running plain-jane distros which have Pulse enabled by
default - but we'll see).

We'll start getting beta results in a few weeks.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: realtime audio on linux

2016-08-11 Thread Randell Jesup
>On 07/14/2016 05:36 PM, ajo...@mozilla.com wrote:
>Last I knew, JACK was the only way to get basically no dropouts and still
>be able to do nontrivial audio processing. But a JACK backend for the
>browser just seems kind of silly; it's too much of a niche "market" to try
>for anytime soon.

Likely so, though I'll note that Jack support for cubeb + Firefox has
recently been updated by a contributor; so you can build with a Jack
backend (or so I understand; I haven't tried it).

Padenot or achronop can tell you more about this.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Checkin-needed requests - Please include complete information in the commit message :)

2016-07-11 Thread Randell Jesup
>On Mon, Jul 11, 2016 at 2:18 PM, Xidorn Quan  wrote:
>> I also use checkin-needed for small changes which I don't think it's
>> worth to run a full testset for, to save some infra resources.
>
>Hmm, that's an odd optimization.  I'd have thought that sheriff time
>is more valuable than infra.

It's not just the value of sheriff time vs infra cost, it's the delta
cost to other developers who end up waiting longer in Try/etc queues
before *they* can land.  When you fairly often see 4/6/12 or even
18/24-hour turnarounds on Try, anything that noticeably ups the load on
infra when there's a low chance of catching something is a bad trade.
Of course, this depends on the devs being smart about what they skip
try/autoland on (for example, nit-fix changes are a good choice - usually).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Shutdown leak detection - is there a better way?

2016-05-09 Thread Randell Jesup
I wrote the following in bug 1270308 comment 14, and thought it might be
worth mentioning/discussing here.  Not a new observation, of course, but
perhaps worth thinking about.  I should re-emphasize that the effort
that goes into making clean shutdown work without intermittents is a LOT
of work for a lot of people.  That said, any radical change here is also
a lot of work, at least to change over, but it could save time in the
long run, and might incrementally improve security.

  Randell

(In reply to Andrew McCreight [:mccr8] from comment #13)
> (In reply to Jan-Ivar Bruaroey [:jib] from comment #12) 
> > Aren't there other ways we could find runtime leaks?
> 
> Unfortunately, there are none that I know of. In order to check for leaks,
> you have to know whether every live allocation should be alive or not.
> Waiting for shutdown and declaring that nothing should be alive is the
> easiest way to do this.

Right, though there are a couple of ways to do that.  One is what we're
(mostly) doing, which is to unwind everything and free, then see what's
left.  The other way (which we do with a few bits) is to mark
allocations or classes as being ok to exist at exit (some Mutexes, for
example, IIRC).  The big advantage there is that cleanup/unwind code has
risks of its own and can introduce race conditions.  The downside is
that the analysis of leaks is trickier; you end up needing basically a
GC scanner to see if allocations are tied to an "ok to leak" root.

That said... overall I think that's a better/safer/lower-effort approach
(at least ongoing) than the complexity of the unwind code, just to avoid
leaks.  Also: the real leaks we care about are either large single
allocations, or allocations that can increase in number the more times
you do something (which includes leaks-occasionally-due-to-races).
One-off leaks (a leak of a single per-process instance of something) are
at most an annoyance unless large or eating CPU.
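
A minimal sketch of the mark-as-ok-at-exit approach (hypothetical names;
this is not Gecko's actual leak machinery):

  #include <mutex>
  #include <set>

  // Objects deliberately allowed to live until process exit register
  // here; the final leak scan reports only allocations *not* rooted in
  // this set (which, per the above, still needs a GC-style reachability
  // pass rather than a simple pointer lookup).
  class ExitAllowedRegistry {
   public:
    static void Allow(const void* aPtr) {
      std::lock_guard<std::mutex> lock(Mutex());
      Set().insert(aPtr);
    }
    static bool IsAllowed(const void* aPtr) {
      std::lock_guard<std::mutex> lock(Mutex());
      return Set().count(aPtr) != 0;
    }
   private:
    static std::mutex& Mutex() { static std::mutex m; return m; }
    static std::set<const void*>& Set() {
      static std::set<const void*> s;  // intentionally immortal itself
      return s;
    }
  };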

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Firefox Hello new data collection

2016-04-05 Thread Randell Jesup
>The privacy review bug is
>https://bugzilla.mozilla.org/show_bug.cgi?id=1261467.
>More details added below.
>> On 04/04/2016 10:01, Romain Testard wrote:
>>
>>> We would use a whitelist client-side to only collect domains that are
>>> part of the top 2000 domains (Alexa list of top domains). This
>>> prevents
>>> personal identification based on obscure domain usage.
>>>
>>
>> Mathematically, the combination of a set of (popular) domains shared could
>> still be uniquely identifying, especially as, AIUI, you will get the counts
>> of each domain and in what sequence they were visited / which ones were
>> visited in which session. It all depends on the number of unique users and
>> the number of domains they visit / share (not clear: see above). Because
>> the total number of Hello users compared with the number of Firefox users
>> is quite low, this still seems somewhat concerning to me. Have you tried to
>> remedy this in any way?
>>
>
>We are aggregating domain names, and are not storing session histories.
>These are submitted at the end of the session, so exact timestamps of any
>visit are not included.

There have been a bunch of surprises over the last few years where
"anonymized" data turned out to be de-anonymizable.  This is the sort of
data that feels like it could lead to surprises.  I think this would
need more looks from someone who actually understands that space and
where those risks come from (not me).

There are added risks if you include the case of someone using our data
*and* data from one or more 3rd-party sites, and that's not easy to
reason about, which is why this needs careful consideration.

>> Finally, I am surprised that you're sharing this 2 weeks before we're
>> releasing Firefox 46. Hasn't this been tested and verified on Nightly
>> and/or other channels? Why was no privacy update made at/before that time?
>>
>
>We are shipping Hello through Go Faster. The Go Faster process allows us to
>uplift directly to Beta 46 directly since we're a system add-on
>(development was done about 2 weeks ago).
>Firefox Hello has its own privacy notice (details here
><https://www.mozilla.org/en-US/privacy/firefox-hello/>).

Since the collection is not currently enabled anywhere, how known-stable
is it for beta?  Having the code safely in a disabled state is one
thing; knowing it's safe to turn on is another.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Exit code -11 must die

2016-02-29 Thread Randell Jesup
>On 2/27/2016 9:06 PM, Randell Jesup wrote:
>> months until recently it popped up a bit).  Note that this failure
>> *never* results in a crashdump, and I've never seen it locally, just in
>> Automation.
>
>What we do know:
>
> * Exit code -11 is evidence a SIGSEGV (crash).
>
>This I don't know, but somebody may know (+ted):
>
> * Are we sure that the crash is happening in firefox.exe? Or is it
>   possible that some other process is crashing and taking down our
>   test harness with it?
> * Can somebody point to exactly what line of code in the test harness
>   collects the -11 code?

See Andrew's comments; I don't know.

> * Is there no crash dump because the crash reporter is turned off?
> o If it's turned on, is the crash reporter itself crashing? Or is
>   the test harness not picking up the crash dump?

I doubt it's turned off; but it may for some reason be unable to catch it.

>> We *need* to find some solution to it -- even if it's to decide it's a
>> (safe) artifact of some underlying problem outside of our control.
>
>Is "we" you? Are you asking somebody else to help you with this, or own the
>problem completely?

I need help here, or preferably someone to own this, as I'm out of my
area of expertise trying to get stuff to run and debug it on the Try
VMs.  I have a loaner; my (very painful) attempts to run the same tests
with the same packages there haven't reproed it.

Ryan will confirm that this has been messing with us all over the tree
heavily for at least a year.

It may be that they're all valid Shutdown bugs that get tagged to
"whichever test ran last" - if so fine, but we need *some* way to get
useful info out about what/why/where it crashed.

>> I'd far rather find a true cause and either fix or wallpaper it.  But right
>> now it's stopping me from landing some important code changes.
>>
>> On the plus side, I have a nice Try run which will cause it 100% of the
>> time - though when I tried to provoke it on a loaner Test VM after
>> painfully emulating what's needed to run tests, it wouldn't fail -- but
>> I don't trust that was a well-setup recreation of a real Try run.
>>
>> https://treeherder.mozilla.org/#/jobs?repo=try&revision=b2eb01359621
>>
>IIRC, there was recently a post about how you can submit a try job and have
>the VM stay alive afterwards for postmortem debugging. I don't
>remember/can't find the details right now

I think that's only for TaskCluster jobs; these aren't.

>Can we also submit a try job with rr enabled, and get a recording of the
>failure? That combination could lead to a pretty quick cause diagnosis of
>this, since it's Linux.

I'd love if we could do that, but rr means I *must* get into the machine
or a clone thereof to run rr replay.

>Also, does this failure happen if you disable all the tests except for the
>one which is permafailing
>(dom/media/tests/mochitest/identity/test_setIdentityProviderWithErrors.html)?
>If so, that would make it easier to record and debug.

I've tried disabling the failing test, and it just moves to a previous
test; I disabled that one, and it moved again.  It is most certainly
permafailing, but it's also clearly touchy timing-wise, so disabling too
much may make it disappear.  May be worth trying, but has a really long
cycle time.

Thanks!
-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Exit code -11 must die

2016-02-27 Thread Randell Jesup
We have a raft of intermittent oranges where we get "exit code -11".

I'm trying to land a patch which is causing a largely-unrelated set of
tests (identity) to generate the infamous exit code -11 problem (there's
an existing intermittent filed against that test; it had disappeared for
months until recently, when it popped up a bit).  Note that this failure
*never* results in a crashdump, and I've never seen it locally, just in
Automation.

My changes apparently tweak the timings just a little, and whichever
test runs last in the identity suite permafails - but only on linux64
opt non-e10s (at lower frequency in linux32-opt-non-e10s).  Adding a
patch which changes the order of some releases and what waits on what
causes it to go from permafail to 10-25%.

However, this error is not confined to the code I'm playing with.
Currently a simple search for exit code yields ~75ish open intermittents
that mention exit code -11: http://mzl.la/1TGByje

We need to either get a handle on these, or decide we don't care.  These
have been widespread in the tree for a year now.

There are many more closed as WorksForMe since the test that gets
'tagged' for the failure seems to not have much to do with the
cause. Either it's something about channel/etc shutdown (like mine
appears to be), or it's a landmine, or it's just timing.  Often it will
tag a test for a while, then disappear/move to another test and perhaps
not come back (in that test; eventually people close the bug as WFM).

We *need* to find some solution to it -- even if it's to decide it's a
(safe) artifact of some underlying problem outside of our control.  I'd
far rather find a true cause and either fix or wallpaper it.  But right
now it's stopping me from landing some important code changes.

On the plus side, I have a nice Try run which will cause it 100% of the
time - though when I tried to provoke it on a loaner Test VM after
painfully emulating what's needed to run tests, it wouldn't fail -- but
I don't trust that was a well-setup recreation of a real Try run.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b2eb01359621

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Presto: Comparing Firefox performance with other browsers (and e10s with non-e10s)

2016-02-12 Thread Randell Jesup
>TL;DR - Firefox does pretty well when compared to Chrome.

Excellent!  Cool stats!

>We've begun this project by doing a set of performance runs on
>WebPageTest.com in order to compare Firefox's performance against Chrome on
>the home pages of the Alexa 100 (the top 100 web sites worldwide). We've
>made a visualization tool which plots all of the runs on a graph, so we may
>easily compare them:
>
>http://valenting.github.io/presto-plot/plot.html
>
>- click on a domain link to see results for each site.
>- you are able to view the results sorted or unsorted, or filter by the
>browser version
>- you can also compare metrics other than load time - such as speed
>Index, number of DNS queries, etc
>- you can also compare the browsers on several connectivity profiles
>(Cable, 3G, Dialup)
>- You can click on each individual point to go to the WPT run and view
>the results in greater detail

What does "run index" mean in the graphs?  The values appear to be
sorted from best to worst; so it's comparing best to best, next-best to
next-best, etc?  Ah, and I see "sorted" undoes the sorting.

I'd think displaying mean/median and std-deviation (or a bell-curve-ish
plot) might be easier to understand.  But I'm no statistician. :-)  It's
also likely easier to read when the numbers of samples don't match (or
you'd need to stretch them all to the same "width"); using a bell-curve
plot of median/mean/std-dev avoids that problem.
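
For concreteness, the kind of per-site/browser summary I mean (a sketch,
not tied to the actual WPT data format):

  #include <algorithm>
  #include <cmath>
  #include <cstddef>
  #include <vector>

  struct Summary { double mean; double median; double stddev; };

  // Reduce one site/browser's load-time samples (ms, assumed non-empty)
  // to the values a bell-curve-style plot would need.
  Summary Summarize(std::vector<double> samples)
  {
    std::sort(samples.begin(), samples.end());
    const size_t n = samples.size();
    double sum = 0.0, sumsq = 0.0;
    for (double s : samples) { sum += s; sumsq += s * s; }
    const double mean = sum / n;
    const double median = (n % 2) ? samples[n / 2]
                                  : (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
    // Population std-dev; good enough for eyeballing run-to-run spread.
    return { mean, median, std::sqrt(sumsq / n - mean * mean) };
  }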

Thumbnails, or columns on the right for each selected browser with the
median (or mean), with the best (for that site) in green and the worst
in red, would allow eyeballing the results and finding interesting
differences without clicking on 100 links...  (please!)  Or, to avoid
overloading the page, one page with graphs like today, and another with
the columns I described (where clicking on a row takes you to the graph
page for that site).

>--
>Error sources:
>
>Websites may return different content depending on the UA string. While
>this optimization makes sense for a lot of websites, in this situation it
>is difficult to determine if the browser's performance or the website's
>optimizations have more impact on the page load.

Might be interesting to force our UA for a series of tests to match
theirs, or vice-versa, just to check which sites appear to care (and
then mark them).

Great!  It'll be interesting to track how these change over time as well
(or as versions get added to the list).  Again, medians/means/etc may
help with evaluating and tracking this (or automating notices, a la
Talos).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: nsThread now leaks runnables if dispatch fails

2016-01-05 Thread Randell Jesup
>> In bug 1218297 we saw a case where dispatch to a thread (the socket
>> transport service thread in this case) fails because the thread has
>> already shut down.  In our brave new world, nsThread simply leaks the
>> runnable.  It can't release the reference it holds, because that would
>> reintroduce the race condition we wanted to avoid, and it can't
>> release the reference on the correct thread so it's already gone away.
>>
>
>I am a bit confused with this, if the executing thread has already shut
>down, why would releasing the reference dispatching thread holds
>reintroduce the race condition? Who is racing with dispatching thread?

It doesn't re-introduce a race.  What it may do is cause the same effect
as a lost race, whereby an object meant to be released on the target
thread gets released on the sending thread.  If the object is a
non-threadsafe object (say, for example, a JS success or failure
callback object), then releasing it on the sending thread is Bad.  This
is the same thing that can happen in a race where you have:

What used to happen:
  {
    RefPtr<FooRunnable> bar = new FooRunnable(nonthreadsafe_object);
target_thread->Dispatch(bar, ...);
// Note: Dispatch in this (old-style) case takes a new reference to
// bar.  And the runnable may run and release that reference before
// Dispatch() returns!  And if Dispatch() fails, you always 'lose'
// the race.
  }
  // bar drops the reference.  If the runnable ran already, this is
  // the last reference and it gets destroyed here.  If it hasn't run,
  // it gets destroyed on the target thread.

New way (already_AddRefed<>):
  {
    RefPtr<FooRunnable> bar = new FooRunnable(nonthreadsafe_object);
target_thread->Dispatch(bar.forget(), ...);
// bar is always destroyed on the target thread - unless
// Dispatch() fails, in which case it leaks (for safety).
  }

or:
  other_thread->Dispatch(do_AddRef(new FooRunnable(nonthreadsafe_object)),
 ...);

If you use the old-style (raw ptr Dispatch) as a lot of the tree still
does, it will now leak on Dispatch() failure instead of possibly
wrong-thread releasing:

nsThread.h:
  nsresult Dispatch(nsIRunnable* aEvent, uint32_t aFlags) {
    return Dispatch(nsCOMPtr<nsIRunnable>(aEvent).forget(), aFlags);
  }

Since Dispatch(already_AddRefed<>,...) leaks on failure, this means that
Dispatch(raw_ptr,...) now leaks on failure (which is what engendered
this discussion).  You can override this behavior with a check of the
result of Dispatch() and a manual Release if it fails (and you know
the object is entirely threadsafe).

We could add DispatchFallible() (DispatchNeverLeaks()?) if that's a
wide-enough pattern; part of my question in that case is "did the caller
really expect Dispatch to possibly fail, and what implications does such
a failure have for the rest of the system?"
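
For illustration, the manual-override pattern (what a DispatchFallible()
would presumably wrap - the helper name here is made up) looks roughly
like this, and is only valid when the runnable's contents are known to
be threadsafe:

  #include "nsCOMPtr.h"
  #include "nsIEventTarget.h"
  #include "nsIRunnable.h"

  nsresult DispatchOrRelease(nsIEventTarget* aTarget, nsIRunnable* aEvent)
  {
    nsCOMPtr<nsIRunnable> event(aEvent);
    nsIRunnable* raw = event.get();  // keep a raw pointer past the forget()
    nsresult rv = aTarget->Dispatch(event.forget(), NS_DISPATCH_NORMAL);
    if (NS_FAILED(rv)) {
      // Dispatch() leaked its reference on failure (by design); balance
      // it here on the sending thread, since this runnable is threadsafe.
      raw->Release();
    }
    return rv;
  }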

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: nsThread now leaks runnables if dispatch fails

2016-01-04 Thread Randell Jesup
oMainthread; this has forced a number of
tricks or bits of code to handle Dispatch failure.

Once upon a time we didn't do very much with other threads, and also
didn't pass JS references around, etc.  We use threads a lot more and in
much more complex ways now.

As Nathan said in the bug:
> Well, the runnables themselves are certainly safe to refcount
> anywhere, but the contents of the runnables might not be...

Sec bugs and crashes are worse than leaks, and they can occur
if we have non-testable random off-thread releases, which is what we
had, and still do for things not switched to already_AddRefed<>.

I'd be good with flagging/asserting on Dispatch failures *before* shutdown
begins as a step to making it infallible, since those have much more
potential to be logic flaws and/or create accumulating leaks.
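
Something as small as this would be a start (sketch only;
gShutdownStarted is a stand-in for whatever real shutdown signal we'd
key off of):

  #include <utility>
  #include "mozilla/AlreadyAddRefed.h"
  #include "mozilla/Assertions.h"
  #include "nsIEventTarget.h"
  #include "nsIRunnable.h"

  extern bool gShutdownStarted;  // hypothetical "shutdown has begun" flag

  // Make pre-shutdown Dispatch failures loud; the failure still leaks
  // (as today), it just no longer fails silently.
  void DispatchOrComplain(nsIEventTarget* aTarget,
                          already_AddRefed<nsIRunnable> aEvent)
  {
    nsresult rv = aTarget->Dispatch(std::move(aEvent), NS_DISPATCH_NORMAL);
    MOZ_ASSERT(NS_SUCCEEDED(rv) || gShutdownStarted,
               "Dispatch failed before shutdown began - likely a logic flaw");
    (void)rv;  // release builds keep today's leak-on-failure behavior
  }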

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Disabling C++ tests by default?

2015-10-02 Thread Randell Jesup
>On Thu, Oct 1, 2015 at 10:25 PM, Mike Hommey  wrote:
>
>> On Thu, Oct 01, 2015 at 10:10:39PM -0700, Gregory Szorc wrote:
>> > Currently, the Firefox build system builds C++ tests by
>> > default. This adds extra time to builds for something that a
>> > significant chunk of developers don't care about because they don't
>> > run them.
>> >
>> > An easy way to get faster builds is to disable C++ tests by default
>> > (requiring a mozconfig entry to enable them - which would be enabled in
>> > automation, of course). A more involved solution is to build them on
>> > demand, when tests are executed (unless a build option says to build them
>> > like today).
>> >
>> > I was curious if Gecko developers (the audience I perceive that
>> > cares about them the most) would be generally opposed to disabling
>> > building C++ tests by default. If not, we can go with an easy
>> > solution (and require people who care to add a mozconfig entry). If
>> > so, we go with the harder solution.
>> >
>> > Is disabling building C++ tests by default a reasonable change?
>>
>> I don't think it is worth the effort. Specifically, we're not really
>> equiped to actually do it currently, and all in all, it's not taking that
>> much of the build time relative to all the other stuff. The one C++ test
>> that has a significant build time impact is gtest, and we're already
>> skipping it.

This is a relevant comment.

>> I'm more interested in growing the faster-make backend to deal with C++
>> compilation in a sane way than working around the recursive-make
>> deficiencies like this.
>
>+10.
>
>When I do build Gecko, I need the C++ unit tests, so this wouldn't save
>me time at all. Are these tests being built every time someone does an
>incremental build or just when someone changes something in the test?

They're being *built* on every incremental build if anything they depend
on changed, I believe.  Ones that don't depend on anything that changed
don't seem to be rebuilt.

Most devs who build/rebuild firefox will not run the cpp unittests from
that build.  Even those of us who do use them don't generally do it on
each build cycle for testing something locally; often it's more of a
pre-checkin pass.  So disabling (or deferring) building them locally on
"./mach build" could be a small win for 99% of developers - though a
very small one usually; rebuilding the signaling unit tests after a file
they depend on changed (5 separate unittests) appears to have taken
around 0.05 seconds.  If I touch all the cpp files for the
webrtc/mtransport tests (rare outside of clobber), it adds about 1.5-2
seconds (local 4-year-old XEON desktop linux, building >20 unittests).

For a clobber build, rebuilding all the cpp unittests for the tree could
be noticeable, however.

Because the win is small, it shouldn't be a priority, but requiring
something like --enable-cpp-tests should be ok to do (and of course
automation would do that) if/when someone finds time.

>If the latter, then this seems like it's of pretty small value since
>most builds are incremental.  If the former, then this seems to point
>to a problem in the build system.

It appears to be the latter; I just poked something in editor and it
didn't rebuild any of the cpp unittests.

>Reading Sylvestre's mail, it seems like there is some thought to
>disable them on at least some automation? I would be strongly opposed
>to this. C++ unit tests are the main test suite for WebRTC, so this
>would be bad.

Sylvestre seems to be thinking about moving them to a separate suite,
and disabling *running* them in every cycle of automation (which wasn't
the request above).  Also, they are already a separate testsuite
(cppunit) IIRC; they no longer run as part of the build itself.

We very much need our cppunit tests to get run when things that might
break them change.  That most certainly isn't everything, note.

One smart thing that *could* be done for automation (and save a *bunch*
of VM/tester time) would be to not re-run cppunit tests that didn't get
rebuilt - since inbound/etc do incremental builds by default, many
checkins rebuild few or none of the cppunit tests, so running the same
executable again has limited value (not none, but limited).
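
Schematically (the hashing and cache plumbing here are made up, not real
harness code):

  #include <filesystem>
  #include <fstream>
  #include <functional>
  #include <sstream>
  #include <string>
  #include <unordered_map>

  // Remember a content hash per cppunit binary; only re-run a test when
  // its binary actually changed since the last automation run.
  bool ShouldRerun(const std::filesystem::path& binary,
                   std::unordered_map<std::string, size_t>& lastRunHashes)
  {
    std::ostringstream buf;
    buf << std::ifstream(binary, std::ios::binary).rdbuf();
    const size_t hash = std::hash<std::string>{}(buf.str());
    auto [it, inserted] = lastRunHashes.try_emplace(binary.string(), hash);
    if (inserted || it->second != hash) {
      it->second = hash;  // new or rebuilt binary: run it
      return true;
    }
    return false;         // identical binary already ran on a prior push
  }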

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Intent to ship: Canvas CaptureStream

2015-09-18 Thread Randell Jesup
>On Fri, Sep 18, 2015 at 9:02 PM, Ms2ger  wrote:
>
>> That bug does not seem to support the claim that they intend to ship
>> this feature. No intent-to-ship is mentioned, and it appears the
>> intent-to-implement has not been sent.
>>
>
>Here's evidence they intend to implement:
>https://docs.google.com/document/d/1JmWfOtUP6ZqsYJ--U8y0OtHkBt-VyjX4N-JqIjb1t78/edit#heading=h.2ki8zk71f7w2

The working group originally gave broad consensus to this a year ago at
an interim with Google and Microsoft in agreement there.  We just held
another interim and covered this, with focus on nits in the wordings and
a few surprises we found in implementing it which needed minor
adjustments.  Again, Google and Microsoft were directly involved, and
there was no opposition to moving forward and finishing it.

Everyone I've talked to in the WG involved expects it to be widely
implemented (and I believe Edge will be implementing it; MS has been
involved all along, and have recently started doing a lot of
implementation work related to WebRTC including VP8 and Opus support.
They're very clearly focused on using WebRTC/ORTC to implement
conferencing solutions (i.e. Lync and Skype), and this feature helps
with that.)

We showed demos of canvas.captureStream() at KrankyGeek in SF last week,
to major interest from developers, and at the event fielded questions
from Google engineers and Google's lead product person for WebRTC about
the interaction with DRM (which is fine; it can't be used to break EME).

I think it's clear that Google intends to implement and ship it, and I
believe Microsoft will do likewise in Edge.  It adds a ton of power to
WebRTC and really (along with the associated
mediaElement.captureStream/etc) fills out the "ecosystem" of media
routing and the ability to transform it (as we can with WebAudio &
MediaStreams).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Announcing the Content Performance program

2015-06-28 Thread Randell Jesup
>> I was under the impression that because e10s is only a single process for
>all content (at least right now) a background tab can still negatively
>affect the foreground tab.
>
>That's right, but we also tested e10s in the process-per-tab configuration

There are other options for background tab effects on the foreground,
though all have some sort of impact.  (For example, you could 'freeze'
background tabs more aggressively than we currently do.  This may have
some impact on functionality - we already limit timeouts, but IIRC we
don't turn them off entirely, and other events still occur that we could
defer until the tab is touched, or ignore - but again, that could cause
sites to break or make switching back problematic/annoying.  However,
I'm out of touch with the exact state of our 'freezing' of background
tabs.)  And there are certainly sites where they'll eat your performance
(eventually) unless you block everything.
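
The timeout limiting I mentioned amounts to a clamp along these lines
(schematic only; the real behavior is pref-controlled, e.g. something
like dom.min_background_timeout_value):

  #include <algorithm>
  #include <cstdint>

  // Hidden tabs get a floor on setTimeout()/setInterval() delays rather
  // than a full freeze.
  uint32_t ClampTimeoutDelay(uint32_t requestedMs, bool isBackgroundTab)
  {
    const uint32_t kMinBackgroundMs = 1000;  // assumed 1-second floor
    return isBackgroundTab ? std::max(requestedMs, kMinBackgroundMs)
                           : requestedMs;
  }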

>> Have we ever considered building something like the unload tab addon into
>the platform or Firefox directly?
>
>We have talked about it (BarTab Heavy is another example) and the code that
>Yoric wrote for measuring per-compartment main-thread CPU-usage in
>about:performance could be used for this. It's unclear how to prioritize it
>though because doing 100% reliable heavy tab detection will require
>non-trivial effort (see issues with slow-addon info bar) and the background
>tab problem will mostly be mitigated by process-per-tab e10s.

The problem with auto-unloaders is that you don't want that for some
sites - real context or data can be lost.  Yes, you can let users
white-list things, but only (some) power users would even realize that
this is an option or needed.  I think serious care needs to be given
when thinking of this as a default (or even as a particularly easy option).

I think flagging sites as causing slowdowns/etc is a great thing and a
good first step, which puts the control in the user's hands.  A nice
fire animation on the tab, perhaps?  ;-)  Seems we used to have those in
Tinderbox.

>> https://addons.mozilla.org/en-US/firefox/addon/unloadtab/
>>
>> """
>> An essential add-on for power users that have many tabs open. Curb your
>> resource usage by unloading tabs that you haven't visited for a while. An
>> unloaded tab is restored back to its original state when you need it again.
>> """

"original state" likely doesn't include sites that dynamically modify
themselves, which is a lot of sites.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: SpiderMonkey and XPConnect style changing from |T *p| to |T* p|

2015-05-29 Thread Randell Jesup
>On 3/30/15 6:08 PM, Kan-Ru Chen (陳侃如) wrote:
>>>> >>Is there an hg or git equivalent to Perforce's "Time-Lapse View" tool? It
>>>> >>shows a global view of a file's blame history and you can visually 
>>>> >>"scrub"
>>>> >>through the blame history.
>>>> >>
>>> >
>>> >You mean like git log -p FILE? I'm sure there are fancier graphic ways to
>>> >visualize it, but that's the way I usually scrub my way back through
>>> >restyles etc.
>> I think Chris might want to follow the history of a particular line,
>> that is usually what we use flame for. I find |git gui blame |
>> very useful.
>
>Just to clarify what I meant by "visually scrub through blame history",
>here is a video demo of Perforce's GUI:
>
>https://youtu.be/r1Zk5IaHZBQ?t=1m4s

*WANT* :-)  Seriously, is anything close to this available for hg or git?

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Replacing PR_LOG levels

2015-05-26 Thread Randell Jesup
>On Sat, May 23, 2015 at 4:46 AM, Randell Jesup  wrote:
>> This is used extensively in WebRTC and related media bits to enable
>> *huge* amounts of debugs (like every-frame debugs for audio or video or
>> per-network-packet debugs, which will swamp a system normally), and since
>> people are used to enabling "debug" on random modules (or all:5), having
>> verbose debugs in the "normal" :5 setting will cause pain.
>
>Can this be controlled some other way? Via a #define? Via another
>environment variable?

Definitely *not* a #define.  That makes it close-to-useless.

>I ask because in discussions with Eric I've been encouraging him to
>reduce the number of logging levels as much as possible. It would be
>sad to have to complicate the general mechanism because one module
>does things differently to the rest of the system.

My point (as repeated here in other posts) is that you're reducing
complexity (in number of log levels) by adding complexity elsewhere (and
more total complexity, IMHO).  I'd consider your suggestion above an
example of "complicating the general mechanism", but doing it indirectly
so it can *appear* to be simple (but isn't really).

This isn't a single module using values above DEBUG - this is *all* over
the media modules in different debug tags.  Audio, MediaDecoder,
MediaStreamGraph, MediaManager, WebM, etc.  Also, the current patch
misses a number of uses like MediaEngine:

#define LOG(msg) PR_LOG(GetMediaManagerLog(), PR_LOG_DEBUG, msg)
#define LOGFRAME(msg) PR_LOG(GetMediaManagerLog(), 6, msg)

(yes, this uses 6, such that the common habit of blah:5 logging doesn't
turn on the spam - you have to *want* it)

I like fixing up logging!  I just think shoehorning a level *under*
DEBUG confuses things, while adding a level *above* DEBUG would let us
standardize/regularize current practice (and avoid the (temporary)
"which version am I using" problem).

/me stops arguing, having said his piece

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Replacing PR_LOG levels

2015-05-26 Thread Randell Jesup
>Thanks Adam, these are good points.
>
>On Friday, May 22, 2015 at 2:08:33 PM UTC-7, Adam Roach wrote:
>> On 5/22/15 15:51, Eric Rahm wrote:
>> > I agree, we shouldn't make it harder to turn on logging. The easy
>> > solution is just to add a separate logger for verbose messages (if we
>> > choose not to add Verbose/Trace).
>> 
>> I don't know why we wouldn't just add a more verbose log level (Verbose, 
>> Trace... I don't care what we call it). The presence of "DEBUG + 1" in 
>> our code is evidence of a clear, actual need.
>
>This was my thought as well initially (my first proposal had a Verbose
>level). After auditing the code I changed my mind.
>
>I believe adding the Info level will serve this need. The feeling I got
>from most code that did 'DEBUG + 1' was that they needed to use DEBUG for
>the informational stuff (because there was no Info level) and wanted a
>lower level for the more spammy stuff (DEBUG was occupied, so fine DEBUG +
>1).
>
>This was exemplified with the many variations of:
>  #define LOGI(...) PR_LOG(..., PR_LOG_DEBUG)
>  #define LOGV(...) PR_LOG(..., PR_LOG_DEBUG + 1)
>
>Note: For LOGI, to me at least, 'I' => Info
>Flipside of course the 'V' => Verbose, but I think this is a semantic
>issue. Debug and Verbose are essentially the same thing now that we have
>Info.

Only if you avoid the "which module debug pattern am I in" question by
switching *all* non-verbose uses of PR_LOG_DEBUG to PR_LOG_INFO
(basically remapping DEBUG to INFO, so that verbose logging can own the
"DEBUG" level)... but this seems way overkill (and liable to cause
horrible ongoing confusion) compared to allowing a "PR_LOG_VERBOSE" level.

PR_LOG_VERBOSE is clear and understandable.  Modules where normal debugs
are on INFO and you-likely-don't-want-it is on DEBUG will cause
ongoing mental anguish and mismatch, and confusion for new contributors.
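
To make the two competing patterns concrete, here's what a hypothetical
module's macros would look like under each scheme.  (GetFooLog() is
invented for illustration, and neither PR_LOG_INFO nor PR_LOG_VERBOSE
exists in NSPR today - they stand in for the proposed Info level and
the level argued for here.)

  // Pattern A (the proposal): normal debugs move up to INFO,
  // and the spammy stuff takes over DEBUG.
  #define LOG(msg)  PR_LOG(GetFooLog(), PR_LOG_INFO, msg)
  #define LOGV(msg) PR_LOG(GetFooLog(), PR_LOG_DEBUG, msg)

  // Pattern B (argued for here): DEBUG keeps meaning "debug", and a
  // new level above it owns the spam.
  #define LOG(msg)  PR_LOG(GetFooLog(), PR_LOG_DEBUG, msg)
  #define LOGV(msg) PR_LOG(GetFooLog(), PR_LOG_VERBOSE, msg)  // hypothetical level

Under Pattern B, "turn on debugging for Foo" means the same thing in
every module, and the firehose only shows up when you explicitly ask
for it.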

>We can see this ambiguity in various logging frameworks, generally there's
>a debug, but if there's not there's a verbose/trace, and usually not both.

Though Android has both.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Replacing PR_LOG levels

2015-05-26 Thread Randell Jesup
>Thanks Randell, these are good points. I'll address a few of your comments
>that have not already been covered in the rest of the conversation.
>
>> This is used extensively in WebRTC and related media bits to enable
>> *huge* amounts of debugs (like every-frame debugs for audio or video or
>> per-network-packet debugs, which will swamp a system normally), and since
>> people are used to enabling "debug" on random modules (or all:5), having
>> verbose debugs in the "normal" :5 setting will cause pain.
>> 
>
>Here lies half the problem, that number does not mean what we were led to
>think it means. Using :5 would include PR_LOG_DEBUG + 1 [1].
>
>> The above will also be surprising since it will work "differently" than
>> other modules, making the same sorts of debugs appear at different
>> levels.  This would be expecially confusing to people not frequently
>> working in the module or when external users are asked to turn on
>> debugging.
>
>This assumes log levels have been used consistently across modules, I can
>assure you that is not the case :) I hope that we can get to a more
>consistent state of course!

Sure - but having "debug" logs at DEBUG in module A, and INFO in module
B (with "verbose-you-only-want-them-if-you-want-insane-data" being at
DEBUG in B) will cause confusion.  It means "turn on logging for X"
requires knowledge of which of two patterns X follows.  It also
means all:4 (or all:debug) will cause surprising (and useless) results.
And when coding, it'll be easy to use the "wrong" pattern when switching
between modules.  I see serious pain in a two-model system.  And we have
real *need* for logging at a makes-the-browser-almost-unusable volume to
track down certain sorts of things.

Even splitting say mediamanager:5 into mediamanager:4 (or :debug) and
mediamanager_verbose:4 (or :debug) will have the same problem for all:.
And it will make getting verbose logging more painful for no real reason.

>I don't think it will be any more confusing when telling the external user
>what level to turn on (it's already rather confusing). Do we need super
>spammy? Turn on debug. Do we need just the basics? Turn on info.
> 
>> Then there's an oddball: webrtc.org "Trace" logging (a separate
>> subsystem with low-overhead buffered-IO circular-log maxed at 10MB),
>> which requires a 16-bit mask.  Currently this is exposed as
>> webrtc_trace:65535 (for example), with a separate env var telling it
>> where to put the logfile (or 'nspr' to revector the logs through NSPR
>> logging, which *really* will cause problems with realtime code but is
>> handy at times).  For this oddball, we could do other things like move
>> it to a separate env var (which is annoying to people using it, but not
>> horribly).
>
>I think it's probably best to keep this a separate env rather than trying to 
>shoehorn it in.

Understandable - but I've been trying to move in the opposite direction;
avoiding N different env vars (and also, if I can avoid that, I have
a better chance of leveraging on-the-fly log level changes without
having to reinvent the wheel).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: Replacing PR_LOG levels

2015-05-22 Thread Randell Jesup
>As part of the effort to improve logging in gecko we'd like to introduce a new 
>set of unified log levels.

>*PR_LOG_DEBUG + 1 aka log level 5*
>
>Various bits of code invented a log level that was less important than
>debug (I would call this verbose). This was not specified in NSPR logging,
>but just kind of worked. With the addition of |Info| we can transition code
>that was using this pseudo log level to:
>  - map PR_LOG_DEBUG => Info
>  - map PR_LOG_DEBUG + 1 => Debug
>
>In this case we could have added a |Verbose| log level, but with the
>addition of |Info| and feeling that fewer is better we have decided against
>that.

This is used extensively in WebRTC and related media bits to enable
*huge* amounts of debugs (like every-frame debugs for audio or video or
per-network-packet debugs, which will swamp a system normally), and since
people are used to enabling "debug" on random modules (or all:5), having
verbose debugs in the "normal" :5 setting will cause pain.

The above will also be surprising since it will work "differently" than
other modules, making the same sorts of debugs appear at different
levels.  This would be especially confusing to people not frequently
working in the module or when external users are asked to turn on
debugging.

At minimum, I'd like to keep some form of Verbose (or VerboseDebug, or
what have you) to avoid this "surprise" factor.

The alternative would be to add more log types -- split signaling:6 into
signaling:debug,signaling_verbose:debug for example.  Somewhat annoying
but works.  I'd prefer a 'Verbose' setting.
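
For concreteness, the split-module alternative would look something
like this (names hypothetical, mirroring the MediaManager macros
quoted elsewhere in this thread):

  static PRLogModuleInfo* gSignalingLog = PR_NewLogModule("signaling");
  static PRLogModuleInfo* gSignalingVerboseLog =
    PR_NewLogModule("signaling_verbose");
  #define LOG(msg)  PR_LOG(gSignalingLog, PR_LOG_DEBUG, msg)
  #define LOGV(msg) PR_LOG(gSignalingVerboseLog, PR_LOG_DEBUG, msg)

with the firehose enabled via
NSPR_LOG_MODULES=signaling:5,signaling_verbose:5.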

Then there's an oddball: webrtc.org "Trace" logging (a separate
subsystem with low-overhead buffered-IO circular-log maxed at 10MB),
which requires a 16-bit mask.  Currently this is exposed as
webrtc_trace:65535 (for example), with a separate env var telling it
where to put the logfile (or 'nspr' to revector the logs through NSPR
logging, which *really* will cause problems with realtime code but is
handy at times).  For this oddball, we could do other things like move
it to a separate env var (which is annoying to people using it, but not
horribly).

>*How will log levels be controlled?*
>
>Similar to how it works now (if not terribly clear), turning on:
>  - LogLevel::Error   => Error messages will be printed
>  - LogLevel::Warning => Error and Warning messages will be printed
>  - LogLevel::Info=> Error, Warning, and Info messages will be printed
>  - LogLevel::Debug   => Error, Warning, Info, and Debug messages will be 
> printed

what's the interface to NSPR_LOG_MODULES?  still with the numerics, or
do we have to use
NSPR_LOG_MODULES=signaling:debug,mediamanager:debug,getusermedia:debug,mtransport:debug,mediastreamgraph:debug
etc?  (That gets a bit old... not that numerics are "better", but
they're faster to type/read and lots of existing scripts/etc have them.)
Obviously one could have numbers and names.

>As a future improvement we plan to add the ability to toggle these
>levels at runtime.

This would be wonderful!  Been looking for that for years (and years).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Runnables and thread-unsafe members

2015-04-14 Thread Randell Jesup
(was: Re: Proposal to ban the usage of refcounted objects inside C++
lambdas in Gecko)

tl;dr: We should make DispatchToMainThread take already_AddRefed and
move away from raw ptrs for Dispatches in general.

So:

What I want to avoid is this pattern for runnables that hold
thread-restricted pointers (mostly MainThread-restricted, such as JS
callbacks):

 FooRunnable::FooRunnable(nsCOMPtr<nsIFoo>& aFoo)
 {
   mFoo.swap(aFoo); // Never AddRef/Release off MainThread
 }
...
{
  nsRefPtr<FooRunnable> foo = new FooRunnable(myFoo);
  // RefCnt == 1, myFoo.get() == nullptr now
  NS_DispatchToMainThread(foo); // infallible (soon), takes a ref to foo (refcnt 2)

  // XXX what is foo's refcnt here?  You don't know!
}

The reason is that foo may go out of scope here *after* it has Run() and
had the event ref dropped on mainthread.  I.e.: at the XXX comment, it
may "usually" be 2, but sometimes may be 1, and when foo goes out of
scope we delete it here, and violate thread-safety for mFoo.

Blindly taking an NS_DispatchToMainThread(new FooRunnable) and
converting it to the pattern above (with nsRefPtr) will introduce a
timing-based thread safety violation IF it holds thread-locked items
(which isn't super-common).  But also, it adds extra thread-safe
AddRef/Release traffic when we don't need it.

A safe pattern is this:
{
  nsRefPtr<FooRunnable> foo = new FooRunnable(myFoo);
  // RefCnt == 1, myFoo.get() == nullptr now
  // This requires adding a version of
  // NS_DispatchToMainThread(already_AddRefed<nsIRunnable>) or some such
  NS_DispatchToMainThread(foo.forget()); // infallible (soon)
  // foo is nullptr
}

And to be honest, that's generally a better pattern for runnables
anyways - we *want* to give them away on Dispatch()/DispatchToMainThread().

Note that you can also do ehsan's trick for safety instead:

   ~FooRunnable()
   {
     if (!NS_IsMainThread()) {
       // get mainthread
       NS_ProxyRelease(mainthread, mFoo);
     }
   }

I don't love this though.  It adds almost-never-executed code, we
addref/release extra times, and it could bitrot or fail to be updated
when new mainthread-only members are added.  And it wastes code and adds semantic boilerplate
to ignore when reading code.

You could add a macro to build these and hide the boilerplate some, but
that's not wonderful either.

So, if I have my druthers:  (And really only #1 is needed, probably)

1) Add already_AddRefed variants of Dispatch/DispatchToMainThread, and
   convert things as needed to .forget() (Preferably most
   Dispatch/DispatchToMainThreads would become this.)  If the Dispatch()
   can fail and it's not a logic error somewhere (due to Dispatch to a
   doomed thread perhaps in some race conditions), then live with a leak
   - we can't send it there to die.  DispatchToMainThread() isn't
   affected by this and will be infallible soon.

2) If you're building a runnable with a complex lifetime and a
   MainThread or thread-locked object, also add the ProxyRelease
   destructor.  Otherwise consider simple boilerplate (for
   MainThread-only runnables) to do:
   
 ~FooRunnable() { if (!NS_IsMainThread()) { MOZ_CRASH(); } }

   (Perhaps DESTROYED_ON_MAINTHREAD_ONLY(FooRunnable)).  Might not be a
   bad thing to have in our pockets for other reasons.
   
3) Consider replacing nsCOMPtr<nsIFoo> mFoo in the runnable with
   nsReleaseOnMainThread<nsCOMPtr<nsIFoo>> mFoo and have
   ~nsReleaseOnMainThread() do NS_ProxyRelease, and also have
   nsReleaseOnMainThread block attempts to do assignments that AddRef
   (only set via swap or assign from already_AddRefed).  Unlike the
   MainThreadPtrHolder this could be created on other threads since it
   never AddRefs; it only inherits references and releases them on
   MainThread.  (See the sketch after this list.)
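
A minimal sketch of what (3) might look like.  The class name, the
template parameterization (taking the interface type directly rather
than the smart-pointer type), and the exact NS_ProxyRelease overload
are all assumptions here, not a finished design:

  // Sketch only: a holder that never AddRefs.  It can only adopt an
  // existing reference, and on destruction it proxies the Release to
  // MainThread if we're on some other thread.
  template <class T>
  class nsReleaseOnMainThread
  {
  public:
    nsReleaseOnMainThread() {}
    explicit nsReleaseOnMainThread(already_AddRefed<T> aPtr) : mPtr(aPtr) {}

    ~nsReleaseOnMainThread()
    {
      if (mPtr && !NS_IsMainThread()) {
        nsCOMPtr<nsIThread> mainThread;
        NS_GetMainThread(getter_AddRefs(mainThread));
        // Hand our single reference to MainThread for the Release.
        NS_ProxyRelease(mainThread, mPtr.forget().take());
      }
      // If we're already on MainThread, mPtr just releases normally here.
    }

    // Adoption-only assignment; plain T* assignment (which would have
    // to AddRef) is deliberately not provided.
    nsReleaseOnMainThread& operator=(already_AddRefed<T> aPtr)
    {
      mPtr = aPtr;
      return *this;
    }

    T* get() const { return mPtr.get(); }
    T* operator->() const { return mPtr.get(); }

  private:
    nsCOMPtr<T> mPtr;
  };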

I wonder if we could move to requiring already_AddRefed for
DispatchToMainThread (and Dispatch?), and thus block all attempts to do
DispatchToMainThread(new FooRunnable), etc.  :-)

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: PSA: Network 'jank' - get your blocking IO off of STS thread!

2015-03-26 Thread Randell Jesup
>Can we stop exposing the socket transport service's nsIEventTarget outside
>of Necko?

If we move media/mtransport to necko... or make an exception for it (and
dom/network/UDPSocket and TCPSocket, etc).  Things that remove loaded
footguns (or at least lock them down) are good.

Glad the major real problem was too-similar-names (I'd never heard of
STREAMTRANSPORTSERVICE (or if I had, it had been long-forgotten, or
mis-read as SOCKETTRANSPORTSERVICE)).

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


Re: FYI: Serious windows bug affecting WebRTC when mem use is over 2GB

2015-03-26 Thread Randell Jesup
>Force a buffer in <2GB memory and always copy into/out of that buffer?

That may work, other than it's inconvenient to force a buffer into <2GB
space at the time when you need it, and thus we'd have to perma-allocate
a "large enough" buffer for this.  (Note GetAdapters*() is typically used
with a first call to get the size, then a second to get the real data,
or with a 16K-ish buffer then redone if that isn't enough.)

Each user of GetAdapters*() (I don't think we have any uses of
GetHostByName/Address) would need to pre-allocate a buffer during
startup probably (or use a static buffer).  You still would have to
dynamically allocate one if the static buffer wasn't big enough, and
that could then fail of course.
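
A rough sketch of that static-buffer approach.  Everything here is a
placeholder - buffer size, names, fallback policy - and it assumes the
module image (and hence its static data) is mapped below 2GB, which is
typical:

  #include <winsock2.h>
  #include <iphlpapi.h>

  // ~16KB static buffer in the module's data segment, which normally
  // sits in low memory, so the broken >2GB path never gets hit.
  static alignas(IP_ADAPTER_ADDRESSES) char sAdapterBuf[16 * 1024];

  ULONG
  GetAdaptersAddressesLowMem(IP_ADAPTER_ADDRESSES** aResult)
  {
    ULONG len = sizeof(sAdapterBuf);
    ULONG rv = GetAdaptersAddresses(
      AF_UNSPEC, 0, nullptr,
      reinterpret_cast<IP_ADAPTER_ADDRESSES*>(sAdapterBuf), &len);
    if (rv != ERROR_SUCCESS) {
      // e.g. ERROR_BUFFER_OVERFLOW: a dynamic fallback could land
      // above 2GB and re-trigger the bug, so just report failure.
      return rv;
    }
    *aResult = reinterpret_cast<IP_ADAPTER_ADDRESSES*>(sAdapterBuf);
    return ERROR_SUCCESS;
  }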

Better than turning off >2GB memory, though.

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email


PSA: Network 'jank' - get your blocking IO off of STS thread!

2015-03-25 Thread Randell Jesup
As a result of some conversations on IRC, it came to light that a bunch
of blocking IO operations were apparently moved off of MainThread
(presumably to avoid UI 'jank'), and instead Dispatched to
STREAMTRANSPORTSERVICE (aka STS thread).

Some examples pointed out to me: FilePicker, the spell-checker, the
DeviceStorage DOM code, DOM cache code in Manager.cpp (via
BodyStartWriteStream()), even perhaps ResolvedCallback in
ServiceWorkers. (I haven't looked closely at all of the uses yet.)

Good, right?  No jank in the UI!  No - now the jank is in the network
code!

STS is the primary network thread, and blocking that without a Darn Good
Reason will cause all sorts of negative effects.  Page loads, XHR,
WebSockets, WebRTC (which will react by adding delay to cover the
'jitter' in apparent network RTT) and many other things will have
small-to-large negative impacts.

Perhaps some of those listed above (and others) don't actually block,
or even have a legitimate need to run on the STS thread - OK; if so,
add a comment explaining why it's ok/needed.  Everyone else should not
be using the STS thread as a convenient target.

It does point out that, given the need to get a lot of blocking
operations off of MainThread, it would help other modules to have a
single service or ThreadPool to hand such operations to.  This would
mean less code and fewer incorrect/undone thread shutdowns, etc.
Thoughts?  (Perhaps such a service could use LazyIdleThreads, which
shut themselves down after inactivity; see the sketch below.)
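
A hypothetical shape for such a service.  The names here are invented
(LazyIdleThread itself is real and shuts down after the given idle
timeout), and real code would also need to clear the static at
shutdown:

  #include "mozilla/LazyIdleThread.h"
  #include "mozilla/StaticPtr.h"

  static mozilla::StaticRefPtr<mozilla::LazyIdleThread> sBackgroundIO;

  nsresult
  DispatchBlockingIO(nsIRunnable* aRunnable)
  {
    if (!sBackgroundIO) {
      // The underlying thread goes away on its own after 30s idle.
      sBackgroundIO = new mozilla::LazyIdleThread(
        30 * 1000, NS_LITERAL_CSTRING("BackgroundIO"));
    }
    return sBackgroundIO->Dispatch(aRunnable, NS_DISPATCH_NORMAL);
  }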

-- 
Randell Jesup, Mozilla Corp
remove "news" for personal email

