[gwt-contrib] Re: RFC: sharded linking

Brendan Kenny Thu, 11 Feb 2010 17:58:22 -0800

On Feb 11, 6:43 pm, Scott Blum <sco...@google.com> wrote:
> I have a few comments, but first I wanted to raise the point that I'm not
> sure why we're having this argument about maximally sharded Precompiles at
> all.  For one thing, it's already implemented, and optional, via
> "-XshardPrecompile".  I can't think of any reason to muck with this, or why
> it would have any relevance to sharded linking.  Can we just table that part
> for now, or is there something I'm missing?
>
> Okay, so now on to sharded linking itself.  Here's what I love:
>
> - Love the overall goals: do more work in parallel and eliminate
> serialization overhead.
> - Love the idea of simulated sharding because it enforces consistency.
> - Love that the linkers all run in the same order.
>
> Here's what I don't love:
>
> - I'm not sure why development mode wouldn't run a sharded link first.
>  Wouldn't it make sense if development mode works just like production
> compile, it just runs a single "development mode" permutation shard link
> before running the final link?
>
> - I dislike the whole transition period followed by having to forcibly
> update all linkers, unless there's a really compelling reason to do so.
>  Maybe I'm missing some use cases, but I don't see what problems result from
> having some linkers run early and others run late.  As Lex noted, all the
> linkers are largely independent of each other and mostly won't step on each
> other's toes.
>
> - It seems unnecessary to have to annotate Artifacts to say which ones are
> transferable, because I thought we already mandated that all Artifacts have
> to be transferable.
>
> I have in mind a different proposal that I believe addresses the same goals,
> but in a less-disruptive fashion.  Please feel free to poke holes in it:
>
> 1) Linker was made an abstract class specifically so that it could be
> extended later.  I propose simply adding a new method "linkSharded()" with
> the same semantics as "link()".  Linkers that don't override this method
> would simply do nothing on the shards and possibly lose out on the
> opportunity to shard work.  Linkers that can effectively do some work on
> shards would override this method to do so.  (We might also have a
> "relinkSharded()" for development mode.)
>
> 2) Instead of trying to do automatic thinning, we just let the linkers
> themselves do the thinning.  For example, one of the most
> serialization-expensive things we do is serialize/deserialize symbolMaps.  To
> avoid this, we update SymbolMapsLinker to do most of its work during
> sharding, and update IFrameLinker (et al) to remove the CompilationResult
> during the sharded link so it never gets sent across to the final link.
>
> The pros to this idea are (I think) that you don't break anyone... instead
> you opt-in to the optimization.  If you don't do anything, it should still
> work, but maybe slower than it could.
>
> The cons are... well maybe it's too simplistic and I'm missing some of the
> corner cases, or ways this could break down.
>
> Thoughts?
> Scott


If this is indeed the direction to go in (and I'm a big fan of the
goals as well), it's probably also worth making a more formal
definition for "won't step on each other's toes". As a use case, I'm
working on a PRE linker that (currently) removes CompilationResults,
alters them based on information collected from across all
permutations, and then emits new ones. Obviously this isn't ideal--it's
expensive and CompilationResults were written to be (mostly)
immutable--but it's also perfectly acceptable within the current
design of the artifactSet/linker chain. The primary linker only cares
about the set of compilation results it receives, and if an earlier
linker altered them, it need never know.
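
To make that use case concrete, here is a rough sketch of the unsharded
chain as I understand it. Everything in it (ChainSketch, Result, preLink,
primaryLink) is a hypothetical stand-in written for illustration, not the
real Linker, ArtifactSet or CompilationResult API. It only shows that the
primary step sees the replaced results and cannot tell they were altered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins only; none of these are the real GWT types.
final class ChainSketch {
    /** Immutable stand-in for one permutation's compiled output. */
    static final class Result {
        final String permutation;
        final String js;
        Result(String permutation, String js) {
            this.permutation = permutation;
            this.js = js;
        }
    }

    /** PRE step: looks at every permutation, then emits replacement results. */
    static List<Result> preLink(List<Result> in) {
        int total = in.size(); // information collected across all permutations
        List<Result> out = new ArrayList<>();
        for (Result r : in) {
            // The old result is dropped and a new one emitted in its place.
            out.add(new Result(r.permutation, "/* perms=" + total + " */" + r.js));
        }
        return out;
    }

    /** Primary step: consumes whatever set it is handed. */
    static List<String> primaryLink(List<Result> in) {
        List<String> files = new ArrayList<>();
        for (Result r : in) {
            files.add(r.permutation + ".cache.js:" + r.js);
        }
        return files;
    }

    /** Unsharded chain: PRE runs to completion before the primary step starts. */
    static List<String> run(List<Result> compiled) {
        return primaryLink(preLink(compiled));
    }
}
```

The point of the sketch is that preLink needs the full list before it can
emit anything, which is exactly what a single shard does not have.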

It seems (and I could definitely be misinterpreting here) that in both
the simulated sharding procedure and Scott's alternate proposal, there
will be sections of primary and post linkers running before a
non-shardable pre linker. If that's true, then neither will be able to
fully honor the ordering of linkers when shardable and non-shardable
linkers are mixed. But, then again, when I started on this one I think
I could find only one other PRE linker in existence, so now would be
the time to change.
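
Here is a small model of the ordering problem, assuming my reading of the
opt-in proposal is right. The linkSharded() name comes from Scott's mail;
the classes, the signatures and the log are my own assumptions for the sake
of the example and are not the actual GWT API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the opt-in proposal; not GWT's actual Linker API.
final class ShardOrderSketch {
    abstract static class Linker {
        final String name;
        Linker(String name) { this.name = name; }
        // Default: do nothing on shards, as a linker that does not opt in would.
        void linkSharded(int shard, List<String> log) {}
        // Final link: every linker takes part, in chain order.
        void link(List<String> log) { log.add(name + ".link"); }
    }

    /** A primary linker that opts in to doing work on each shard. */
    static final class ShardablePrimary extends Linker {
        ShardablePrimary() { super("primary"); }
        @Override void linkSharded(int shard, List<String> log) {
            log.add(name + ".linkSharded[" + shard + "]");
        }
    }

    /** A PRE linker that needs all permutations, so it cannot opt in. */
    static final class NonShardablePre extends Linker {
        NonShardablePre() { super("pre"); }
    }

    /** Runs the shard phase for every shard, then the final link. */
    static List<String> run(List<Linker> chain, int shards) {
        List<String> log = new ArrayList<>();
        for (int s = 0; s < shards; s++) {
            for (Linker l : chain) {
                l.linkSharded(s, log);
            }
        }
        for (Linker l : chain) {
            l.link(log);
        }
        return log;
    }
}
```

With a chain of [pre, primary] and two shards, the log in this model reads
primary.linkSharded[0], primary.linkSharded[1], pre.link, primary.link. The
primary linker has already done its shard work before the pre linker runs
at all, which is the ordering I'm worried about.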

Continuing to think out loud, it seems that the way to alter my linker
is probably either to statically derive what all permutations will
need in every shard (as opposed to just having each triggered
generator emit an artifact and collecting them at the end), or to keep
that the same and create a custom primary linker, which I was hoping
not to do as it would tend to limit adoption. If that's the largest
price to pay, though, the trade-off would seem worth it.

-- 
http://groups.google.com/group/Google-Web-Toolkit-Contributors
