Re: Can we remove nsIEntityConverter?

2016-05-01 Thread Frédéric Wang
On 01/05/2016 02:16, smaug wrote:
> What would source view for mathml look like if we removed
> nsIEntityConverter?

AFAIK, the only point is to replace things like "∑" with "&sum;"
in order to make it more readable. However, with appropriate fonts
installed I think reading "∑" is also fine.

On 30/04/2016 12:25, Henri Sivonen wrote:
>  In desktop Firefox, these data tables are used only for the
> MathML View Source feature.

Personally, I don't really use this feature, as I find the DOM inspector
or the "MathML Copy" add-on (*) more convenient to check or copy a
MathML formula.

I guess we can move this feature from the Desktop front-end to a
separate Add-on (that could potentially work on mobile too in the
future). However, I can't speak for the users. Maybe we should write to
the Math WG mailing list in order to get more feedback.

(*) https://addons.mozilla.org/en-US/firefox/addon/mathml-copy/

-- 
Frédéric Wang
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Static analysis for "use-after-move"?

2016-05-01 Thread Gerald Squelart
On Monday, May 2, 2016 at 9:49:24 AM UTC+10, Jim Blandy wrote:
> On Fri, Apr 29, 2016 at 4:43 PM, Gerald Squelart  wrote:
> 
> > For example, we know how strings behave when moved from* (the original
> > becomes empty), and it'd be nice to be able to use that trick when possible
> > and really needed.
> >
> 
> No, we don't know that. The contract of a move in C++ is simply that the
> source object is safe to destruct, but otherwise in an undefined state. You
> must not make any assumptions about its value.

"contract of a move" -- Where is this written? (Really: If there is an official 
position on this, I'd like to see it!)

The little bit of legalese I could find was about the standard library:
"""
17.6.5.15 Moved-from state of library types [lib.types.movedfrom]
Objects of types defined in the C++ standard library may be moved from. Move 
operations may be explicitly specified or implicitly generated. Unless 
otherwise specified, such moved-from objects shall be placed in a valid but 
unspecified state.
"""

"Valid but unspecified" is quite different from the loaded word "undefined".
"Valid" to me means that operations should still be possible on the object, but 
the results will just depend on what's left in there after the move.

And one specific example:
"""
30.3.1 Class thread [thread.thread.class]
[...] Objects of class thread can be in a state that does not represent a 
thread of execution. [ Note: A thread object does not represent a thread of 
execution after default construction, after being moved from, or after a 
successful call to detach or join. -- end note ]
"""
This to me implies that a 'thread' object could be moved from, and after that 
its state is valid but does not actually represent a thread of execution.
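
To illustrate the difference between "valid but unspecified" and a state the 
standard actually specifies, here is a minimal sketch, assuming a C++14 
compiler. The function name CheckMovedFromStates is invented for this example. 
It relies only on operations without preconditions for the moved-from string, 
and on the documented postcondition that a moved-from std::unique_ptr is null.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper for illustration; returns true if all checks hold.
bool CheckMovedFromStates() {
  const std::string original =
      "long enough that most implementations heap-allocate it";
  std::string source = original;
  std::string dest = std::move(source);

  // Valid but unspecified: size() has no preconditions, so calling it is
  // allowed, but the value it returns is not something to rely on.
  std::size_t unspecifiedSize = source.size();
  (void)unspecifiedSize;

  // Giving the object a known value again is always allowed.
  source.clear();
  bool stringOk = source.empty() && dest == original;

  // std::unique_ptr is one of the "otherwise specified" cases: the
  // standard guarantees the source is null after a move.
  auto p = std::make_unique<int>(42);
  std::unique_ptr<int> q = std::move(p);
  assert(p == nullptr);
  bool pointerOk = (p == nullptr) && (q != nullptr) && (*q == 42);

  return stringOk && pointerOk;
}
```

Note that the sketch never inspects the moved-from string's contents before 
clearing it, which is the part that stays unspecified.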

> It is not always the case that the fastest move implementation leaves the
> source empty. For example, if the string is using inline storage, then a
> move would need to take extra steps to clear the original.

Agreed, sometimes copy is faster, e.g., for PODs, or yes, objects with inline 
storage.

> You write about "us[ing] that trick when possible and really needed", when
> what you're actually saying is "let's depend on undefined behavior." That
> approach is common, and its history is not pretty.

My (more-restricted) point from the following post is that *we* can decide what 
some objects will contain after a move, and we could work with that *defined* 
behavior where possible...

And because it could be misused as you fear, we could introduce annotations to 
ensure that that kind of use is controlled. E.g., mark reusable-after-move 
objects as such, and use a different word than 'Move' when making these objects 
xvalues. And use type traits to optionally take this optimized path from 
generic functions.


Thinking of it, I suppose lots (all?) of these optimized content-stealing 
actions could be done through differently-named methods (e.g. 'Take()'), so 
they could not possibly be confused with C++ move semantics.
So I could live with a communal decision to forbid any use-after-move ever.
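
A rough sketch of what I mean by a differently-named method, assuming nothing 
about existing Gecko classes: the Buffer class and its members below are 
hypothetical. The point is that Take() documents its own postcondition (the 
source is empty afterwards), so callers never depend on what a C++ move leaves 
behind.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical wrapper class, invented for this example.
class Buffer {
 public:
  explicit Buffer(std::string data) : mData(std::move(data)) {}

  // Steals the contents and leaves this object empty. Unlike a plain
  // move, the empty state afterwards is part of this method's contract.
  std::string Take() {
    std::string out;
    out.swap(mData);  // mData now holds the empty string `out` started as.
    assert(mData.empty());  // Postcondition callers may rely on.
    return out;
  }

  void Append(const std::string& s) { mData += s; }
  bool IsEmpty() const { return mData.empty(); }
  const std::string& Data() const { return mData; }

 private:
  std::string mData;
};
```

After `std::string s = buffer.Take();` the buffer can be appended to again 
without any question about its state.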


Re: Static analysis for "use-after-move"?

2016-05-01 Thread Jim Blandy
On Fri, Apr 29, 2016 at 4:43 PM, Gerald Squelart  wrote:

> For example, we know how strings behave when moved from* (the original
> becomes empty), and it'd be nice to be able to use that trick when possible
> and really needed.
>

No, we don't know that. The contract of a move in C++ is simply that the
source object is safe to destruct, but otherwise in an undefined state. You
must not make any assumptions about its value.

It is not always the case that the fastest move implementation leaves the
source empty. For example, if the string is using inline storage, then a
move would need to take extra steps to clear the original.
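
As a toy illustration of the inline-storage case (a sketch only: InlineString 
is a made-up class, not any real string implementation), the cheapest move 
here is a byte copy, and the source keeps its old contents unless the 
implementation spends extra work clearing it:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical fixed-capacity string that only has inline storage.
class InlineString {
 public:
  static constexpr std::size_t kCapacity = 15;

  explicit InlineString(const char* s) {
    std::size_t n = std::strlen(s);
    assert(n <= kCapacity && "toy class: no heap fallback");
    if (n > kCapacity) n = kCapacity;  // Truncate in release builds.
    std::memcpy(mChars, s, n);
    mChars[n] = '\0';
    mLength = n;
  }

  // There is no heap pointer to steal, so moving is just copying the
  // bytes. The source is deliberately left untouched; emptying it would
  // cost two extra stores that a plain move does not need.
  InlineString(InlineString&& other) noexcept : mLength(other.mLength) {
    std::memcpy(mChars, other.mChars, mLength + 1);
  }

  std::size_t Length() const { return mLength; }
  const char* Chars() const { return mChars; }

 private:
  char mChars[kCapacity + 1];
  std::size_t mLength;
};
```

Code that assumed "moved-from means empty" would see the old value still 
sitting in the source here.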

You write about "us[ing] that trick when possible and really needed", when
what you're actually saying is "let's depend on undefined behavior." That
approach is common, and its history is not pretty.


Re: ICU proposing to drop support for WinXP (and OS X 10.6)

2016-05-01 Thread Jim Blandy
What are the distributions of memory and flash sizes for the devices people
currently run Fennec on? It'll be almost impossible to have a good
discussion about Fennec size without those numbers. I seem to remember that
is data we felt was okay to collect.





On Sun, May 1, 2016 at 2:21 PM, Boris Zbarsky  wrote:

> On 4/29/16 11:30 AM, sn...@snorp.net wrote:
>
>> The Fennec team has been very clear about why they oppose inclusion of
>> ICU in bug 1215247.
>>
>
> Sort of.  There's been a fair amount of moving of goalposts to get from
> https://bugzilla.mozilla.org/show_bug.cgi?id=1215247#c14 to
> https://bugzilla.mozilla.org/show_bug.cgi?id=1215247#c43 as far as I can
> tell.
>
> I sympathize with the Fennec team's position here: The amount of code in
> libxul keeps growing (not always by as little as possible, I agree!) as we
> add support for more stuff the web is coming to depend on, but some of the
> features being added are perhaps not a big deal in the markets that want a
> small APK download.  It's not clear to me who (if anyone) knows what
> features these are; clearly the JS Intl API (yes, not the only reason to
> include ICU) is one of them, but are there others we've identified?
>
> Of course https://bugzilla.mozilla.org/show_bug.cgi?id=1215247#c43 more
> or less flat-out disagrees with the suggestion that we should have fewer
> Gecko configurations, on a much broader front than ICU support...
>
> I know we have places where we use more space than we should in Gecko, and
> in particular some places where we have traded off space for speed by
> having largish static data tables instead of more dynamic checks... not to
> mention having static bindings code instead of much smaller dynamic XPConnect
> code.  This tradeoff was very conscious, akin to Fennec's decision to not
> compress .so, but may have been the wrong one for Fennec in practice.
>
> If we, as an organization, really want to try to reduce the size of the
> Fennec APK, and are actually willing to put platform resources into it
> (which requires either hiring accordingly or starving other goals, in the
> usual way), then we should do that.  So far I've unfortunately seen
> precious little willingness to staff such an effort appropriately.  :(
>
>> This type of attitude is why we have people in the Firefox org wanting to
>> axe Gecko.
>>
>
> For the Android case, I expect the only viable replacement that hits the
> desired size limits would be an iOS-like solution, right?  That is, a UI
> using whatever browser engine is already installed on the device?
>
> Just to be clear as to what our real alternatives are here.
>
>> The engineers in Platform consistently want to dismiss mobile-specific
>> issues
>>
>
> I think you're painting with a _very_ broad brush here.
>
> -Boris
>


Re: Can we remove nsIEntityConverter?

2016-05-01 Thread Karl Tomlinson
Cross-posting to mozilla.dev.tech.mathml so that this is seen by
people who are interested.

Please follow-up to mozilla.dev.platform.

Henri Sivonen writes:

> We ship data tables for converting from Unicode to HTML entities.
> These tables obviously take space. (They are not optimized for space
> usage, either.) As far as I can tell, these tables are not used at all
> in Fennec. In desktop Firefox, these data tables are used only for the
> MathML View Source feature.
>
> Additionally, a subset of the tables is used by some XPCOM-based
> extensions, but those extensions seem to be obsolete or abandoned or
> don't seem to be using the feature for a very good reason.
>
> These data tables are not exposed to the Web Platform.
>
> In https://bugzilla.mozilla.org/show_bug.cgi?id=1048191 I proposed
> removing this for mobile only, but how about we just remove this
> altogether in order to make both Fennec and desktop Firefox smaller?


Re: ICU proposing to drop support for WinXP (and OS X 10.6)

2016-05-01 Thread Boris Zbarsky

On 4/29/16 11:30 AM, sn...@snorp.net wrote:

> The Fennec team has been very clear about why they oppose inclusion of ICU in 
> bug 1215247.


Sort of.  There's been a fair amount of moving of goalposts to get from 
https://bugzilla.mozilla.org/show_bug.cgi?id=1215247#c14 to 
https://bugzilla.mozilla.org/show_bug.cgi?id=1215247#c43 as far as I can 
tell.


I sympathize with the Fennec team's position here: The amount of code in 
libxul keeps growing (not always by as little as possible, I agree!) as 
we add support for more stuff the web is coming to depend on, but some 
of the features being added are perhaps not a big deal in the markets 
that want a small APK download.  It's not clear to me who (if anyone) 
knows what features these are; clearly the JS Intl API (yes, not the 
only reason to include ICU) is one of them, but are there others we've 
identified?


Of course https://bugzilla.mozilla.org/show_bug.cgi?id=1215247#c43 more 
or less flat-out disagrees with the suggestion that we should have fewer 
Gecko configurations, on a much broader front than ICU support...


I know we have places where we use more space than we should in Gecko, 
and in particular some places where we have traded off space for speed 
by having largish static data tables instead of more dynamic checks... 
not to mention having static bindings code instead of much smaller dynamic 
XPConnect code.  This tradeoff was very conscious, akin to Fennec's 
decision to not compress .so, but may have been the wrong one for Fennec 
in practice.


If we, as an organization, really want to try to reduce the size of the 
Fennec APK, and are actually willing to put platform resources into it 
(which requires either hiring accordingly or starving other goals, in 
the usual way), then we should do that.  So far I've unfortunately seen 
precious little willingness to staff such an effort appropriately.  :(



> This type of attitude is why we have people in the Firefox org wanting to axe 
> Gecko.


For the Android case, I expect the only viable replacement that hits the 
desired size limits would be an iOS-like solution, right?  That is, a UI 
using whatever browser engine is already installed on the device?


Just to be clear as to what our real alternatives are here.


> The engineers in Platform consistently want to dismiss mobile-specific issues


I think you're painting with a _very_ broad brush here.

-Boris


Re: ICU proposing to drop support for WinXP (and OS X 10.6)

2016-05-01 Thread Henri Sivonen
On Sat, Apr 30, 2016 at 11:26 PM, L. David Baron  wrote:
> On Friday 2016-04-29 10:43 +0300, Henri Sivonen wrote:
> I still find it sad that ECMAScript Intl came (as I understand it)
> very close to just standardizing on a piece of software (ICU),

Looking at the standard, it seems intentionally vague about what data
sources are supported, and that's not good for a Web standard.
However, it seems to me that in practice there is no standardized
dependency on ICU but on the CLDR database maintained by Unicode.org.
In a C or C++ program, the easiest and least-NIH way to expose CLDR is
to use ICU like Google and Apple do and like we do on desktop. I'm not
sure what Microsoft does, considering that these days they are no
longer opposed to using open source software, but I believe that Edge
exposes CLDR via some non-ICU Microsoft-developed mechanism. So it
seems like there are two independent interoperable implementations as
far as code goes.

> and
> also find it disturbing that we're going to extend that
> standardization on a particular piece of software (possibly even
> more rigidly) into other areas.

As noted, these other areas are why I care about having ICU
unconditionally available on all platforms that Gecko runs on and why
I think it's harmful when ICU is blocked on one platform. Also, as
noted, I don't care about ICU per se, but I care about being able to
treat operations like Unicode normalization and locale-aware collation
as foundational operations whose availability or correctness is not
optional. I think it would be ideal if we had a library or set of
libraries written in Rust that provided this functionality, but until
such a library written in Rust shows up, ICU is the only option on the
table today that is (if bundled in Gecko in its latest version)
correct and cross-platform consistent.

I think it is harmful that we have to maintain abstractions for
foundational operations to support a configuration where the back end
isn't correct (to latest Unicode data) and cross-platform consistent.
Until Rust-based replacements show up, the most reasonable way to
perform operations that depend on Unicode.org data is to bundle ICU
and to call its APIs directly without abstraction layers in between.

Again, talking about ICU as just an enabler of the ECMAScript
Internationalization API is a bad framing for the issue, because it
makes it seem like blocking ICU "just" turns off one fairly recent Web
API. Yet, Gecko has needs for functionality exposed by ICU in various
places. For example:

 * Hyphenation, spellchecking, layout, gfx and JavaScript parsing need
access to the character properties of the Unicode database. Currently,
we duplicate ICU functionality (in an out-of-date manner, I believe) to
implement these in libxul.

 * Internationalized domain names and text shaping need Unicode
Normalization. Currently, we duplicate ICU functionality (in an
out-of-date manner, I believe) to implement these in libxul.

 * IndexedDB, XPath, XUL, SQL storage and history search UI use
locale-sensitive sorting. Currently, we duplicate ICU functionality on
Android for these by calling into the thread-unsafe C standard library
API for this stuff. This is fundamentally broken, because the design
of the C standard library is fundamentally broken: In the C standard
library, there's no way to ask for comparison according to a given locale.
Instead, we set the locale process-wide (all threads!), then ask the C
standard library to do a comparison, and then unset the locale
process-wide.

 * Parts of the Firefox UI do locale-sensitive datetime formatting by
calling legacy platform APIs that duplicate ICU functionality and
import system-specific bugs.

 * Based on open bugs, it seems we duplicate ICU functionality for
bidi, but it's not clear to me if we're already building that part of
ICU anyway and the relative correctness is unclear to me.
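
The set-compare-restore pattern described in the sorting item above can be
sketched as follows. This is only an illustration of the C standard
library design problem, not the code Gecko actually uses; the function
name CompareInLocale is invented here.

```cpp
#include <cassert>
#include <clocale>
#include <cstring>
#include <string>

// Hypothetical helper: compare two strings under `localeName` using only
// the C standard library. setlocale() changes process-wide state, so every
// other thread observes the temporary locale while this function runs.
int CompareInLocale(const char* localeName, const char* a, const char* b) {
  assert(localeName && a && b);

  // Remember the current collation locale. Copy the name, because the
  // returned pointer may be invalidated by the next setlocale() call.
  const char* current = std::setlocale(LC_COLLATE, nullptr);
  std::string saved = current ? current : "C";

  if (!std::setlocale(LC_COLLATE, localeName)) {
    // The requested locale is not installed; fall back to byte order.
    return std::strcmp(a, b);
  }

  int result = std::strcoll(a, b);

  // Restore the previous locale, again for the whole process.
  std::setlocale(LC_COLLATE, saved.c_str());
  return result;
}
```

Between the two setlocale() calls, any other thread that collates strings
gets the temporary locale, which is exactly the thread-safety hazard. An
API that takes the locale as an argument to the comparison avoids the
shared state entirely.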

I think it's neither good use of developer time nor holistic
management of product size in bytes to have this duplication sprinkled
around. (Though I don't believe that getting rid of the above
duplication of ICU functionality would add up to the size of ICU
itself: We should expect ICU to be a net addition to APK size in any
case.)

It's worth noting that the above items split into two groups: on one
hand, the Unicode character property database and associated algorithms
(normalization, bidi, line breaking, script run identification), and on
the other hand, the CLDR database and associated algorithms
(locale-sensitive sorting, date formatting, number formatting, etc.).
We have more foundational dependency needs on the former than the
latter, but the discussion about ICU size as well as the ECMAScript
Internationalization API exposure is mainly about the latter.

Again, ideally, we'd have an actively-maintained Rust library for
the Unicode character property database and associated algorithms and
another actively-maintained Rust library for the CLDR database and
associated algorithms. But absent