Re: tree-shaking a jarred Clojure app?

2009-11-21 Thread David Brown
On Fri, Nov 20, 2009 at 06:37:18PM +, Jim Downing wrote:

> I might have misunderstood, but isn't the problem the same as in Java;
> you can't know from a static analysis which classes are going to be
> loaded?

Except that Clojure will load all of them so it can bind them to the
vars in each namespace.  Java code is usually much less dynamic, and
makes some static analysis a lot easier.

JVMs are pretty good about not loading classes until they are used, so
I think the real problem is that the init code for each namespace
loads all of the classes before it does anything.  If that could be
delayed, I suspect most of the startup delay would go away.  The trick
is figuring out how to do this without adding yet another level of
indirection to vars.
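
A minimal sketch of the lazy behaviour being described, in plain Java. The class names (LazyInitDemo, Used, Unused) are made up for illustration. It shows that a class's static initializer runs only when the class is first actually used, which is the laziness an eager namespace init would defeat. It demonstrates class initialization rather than measuring class loading itself.

```java
import java.util.ArrayList;
import java.util.List;

class LazyInitDemo {
    // Records which helper classes have run their static initializers.
    static final List<String> initialized = new ArrayList<>();

    static class Used {
        static { initialized.add("Used"); }
        static int answer() { return 42; }
    }

    static class Unused {
        static { initialized.add("Unused"); }
        static int answer() { return 7; }
    }

    // Unused is referenced in the source, but it is only initialized
    // if the branch that calls it actually executes.
    static int run(boolean touchUnused) {
        int result = Used.answer();
        if (touchUnused) {
            result += Unused.answer();
        }
        return result;
    }

    public static void main(String[] args) {
        run(false);
        System.out.println(initialized);
    }
}
```

If every namespace's init code touched each such class up front, all of the initializers would run before the first line of application code.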

David

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en


Re: tree-shaking a jarred Clojure app?

2009-11-21 Thread John Harrop
On Sat, Nov 21, 2009 at 8:30 PM, David Brown cloj...@davidb.org wrote:

> On Fri, Nov 20, 2009 at 06:37:18PM +, Jim Downing wrote:
>
> > I might have misunderstood, but isn't the problem the same as in Java;
> > you can't know from a static analysis which classes are going to be
> > loaded?
>
> Except that Clojure will load all of them so it can bind them to the
> vars in each namespace.  Java code is usually much less dynamic, and
> makes some static analysis a lot easier.
>
> JVMs are pretty good about not loading classes until they are used, so
> I think the real problem is that the init code for each namespace
> loads all of the classes before it does anything.  If that could be
> delayed, I suspect most of the startup delay would go away.  The trick
> is figuring out how to do this without adding yet another level of
> indirection to vars.


Are you talking about binding things like String.class to vars referenced by
symbols like String? All you need for that is a not-yet-loaded flag in the
var's metadata, next to the flag that says it's a Java class and not a
normal Clojure var. The held value can just be the class's (fully-qualified)
name until there's an attempt to use it. When the var is dereferenced with
the flag set, the name is replaced by the class object and the flag is
cleared; the class object is returned. Subsequently the flag is not set, so
the value is simply returned. Checking the flag is the only added step in
most var dereferences, and var dereference is already somewhat slow, so this
shouldn't cause performance problems, nor would it add yet another level of
indirection to anything except, temporarily, classname vars.
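
A rough sketch of that idea in plain Java, not Clojure's actual Var implementation. The class LazyClassRef and its methods are hypothetical names. It holds only the class name until the first dereference, then swaps in the Class object; a null field plays the role of the "not loaded yet" flag.

```java
class LazyClassRef {
    private final String className;      // held until the first use
    private volatile Class<?> resolved;  // null means "not loaded yet"

    LazyClassRef(String className) {
        this.className = className;
    }

    boolean isLoaded() {
        return resolved != null;
    }

    // First call loads the class by name; later calls only pay for the null check.
    Class<?> deref() {
        Class<?> c = resolved;
        if (c == null) {
            try {
                c = Class.forName(className);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("cannot load " + className, e);
            }
            resolved = c;
        }
        return c;
    }

    public static void main(String[] args) {
        LazyClassRef ref = new LazyClassRef("java.util.ArrayList");
        System.out.println(ref.isLoaded());
        System.out.println(ref.deref());
        System.out.println(ref.isLoaded());
    }
}
```

Two threads racing on the first deref may both call Class.forName, which is harmless here because both get the same Class object.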

It wouldn't affect the other performance concern here though, that
dependency mapping could help with: trimming jars to the bare essentials to
shrink downloads and the amount of disk space Clojure apps chew up when
installed.


Re: tree-shaking a jarred Clojure app?

2009-11-21 Thread David Brown
On Sat, Nov 21, 2009 at 08:42:26PM -0500, John Harrop wrote:

> Are you talking about binding things like String.class to vars referenced by
> symbols like String?

Not just String.class, every single class referenced by a given
namespace will be loaded, and most of them instantiated before a
single line of my code runs.  It's why:

   $ time ant
   ... (no ant file, it just fails)
   real 0m0.155s

   $ time clj -e '(System/exit 0)'
   real 0m0.960s

is so drastically different.  Compiling this:

   (ns foo (:gen-class)) (defn -main [])

and running

   $ time java -cp classes:clojure.jar foo
   real 0m0.749s

still loads and instantiates every single function defined in
core.clj.

David



Re: tree-shaking a jarred Clojure app?

2009-11-21 Thread John Harrop
On Sat, Nov 21, 2009 at 8:57 PM, David Brown cloj...@davidb.org wrote:

> On Sat, Nov 21, 2009 at 08:42:26PM -0500, John Harrop wrote:
>
> > Are you talking about binding things like String.class to vars referenced by
> > symbols like String?
>
> Not just String.class, every single class referenced by a given
> namespace will be loaded, and most of them instantiated before a
> single line of my code runs.  It's why:
>
>    $ time ant
>    ... (no ant file, it just fails)
>    real 0m0.155s
>
>    $ time clj -e '(System/exit 0)'
>    real 0m0.960s
>
> is so drastically different.


1 second instead of 1/6 of a second. Yeah, like users will notice that
difference in startup times. :)

> running
>
>    $ time java -cp classes:clojure.jar foo
>    real 0m0.749s
>
> still loads and instantiates every single function defined in
> core.clj.


Avoiding instantiating every Clojure function, whether used or not, is a
whole 'nother kettle of fish. Barring eval, you can find out which
functions are used in a codebase by loading it all, reading it in as Clojure
data, taking it apart form by form, subjecting every form to macroexpand,
and then resolving all symbols; this gives you all defn-defined functions
called. You then assume that every closure in any of these functions is also
potentially called, and that in theory gives you your answer, with a
procedure that can theoretically be automated (and in Clojure itself perhaps
without too much difficulty).
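
Once the symbols are resolved, what remains is a plain graph traversal. A minimal sketch in Java, assuming a "who calls whom" map has already been extracted somehow; the Reachability class and the function names in the map are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Reachability {
    // Returns every function reachable from root by following call edges.
    static Set<String> reachable(Map<String, List<String>> calls, String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(root);
        while (!work.isEmpty()) {
            String fn = work.pop();
            if (seen.add(fn)) {
                // Functions with no recorded callees are leaves.
                for (String callee : calls.getOrDefault(fn, Collections.<String>emptyList())) {
                    work.push(callee);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> calls = Map.of(
            "main", List.of("parse", "report"),
            "parse", List.of("tokenize"),
            "unused", List.of("tokenize"));
        System.out.println(reachable(calls, "main"));
    }
}
```

Anything not in the returned set is a candidate for shaking out, subject to the eval caveat above.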


Re: tree-shaking a jarred Clojure app?

2009-11-21 Thread David Brown
On Sat, Nov 21, 2009 at 11:14:52PM -0500, John Harrop wrote:

> 1 second instead of 1/6 of a second. Yeah, like users will notice that
> difference in startup times. :)

I'm not actually complaining, but I do notice every single time I fire
up a REPL.  The more code that you have, the longer it takes.  It's
basically completely thwarting the JVM's attempt to lazily load
classes.

I have the same complaint about JRuby and Scala.  Scala cheats a
little, and their REPL prints a prompt before actually loading the
classes.

Honestly, I think it's a reasonable penalty to pay for a system that
is both dynamic and very fast.

David



Re: tree-shaking a jarred Clojure app?

2009-11-21 Thread John Harrop
On Sat, Nov 21, 2009 at 11:21 PM, David Brown cloj...@davidb.org wrote:

> On Sat, Nov 21, 2009 at 11:14:52PM -0500, John Harrop wrote:
>
> > 1 second instead of 1/6 of a second. Yeah, like users will notice that
> > difference in startup times. :)
>
> I'm not actually complaining, but I do notice every single time I fire
> up a REPL.  The more code that you have, the longer it takes.  It's
> basically completely thwarting the JVM's attempt to lazily load
> classes.
>
> I have the same complaint about JRuby and Scala.  Scala cheats a
> little, and their REPL prints a prompt before actually loading the
> classes.
>
> Honestly, I think it's a reasonable penalty to pay for a system that
> is both dynamic and very fast.


There's a certain irony in saying that being slower is a reasonable
penalty to pay for a system that is fast. :)


tree-shaking a jarred Clojure app?

2009-11-20 Thread Graham Fawcett
Hi folks,

This is somewhat a Java question, but it's in the context of Clojure,
so here goes. Playing with Leiningen got me thinking about bundling a
Clojure application as a JAR, which might include a host of classes
that are loaded but never used. Is it possible to tree-shake such a
jarfile, and eliminate any classes that are not required for the
main-class' operation? (Assuming the program doesn't need 'eval' with
access to all of those classes at runtime.)

This might not save a lot of startup time, but where startup time
matters, maybe it might shave off a meaningful fraction. I'm just
curious whether there is enough dependency information in a set of
class files to calculate a tree-shaking plan; and whether there are
existing tools to do the job.

Best,
Graham



Re: tree-shaking a jarred Clojure app?

2009-11-20 Thread Jim Downing
Hi Graham

2009/11/20 Graham Fawcett graham.fawc...@gmail.com:
> Hi folks,
>
> This is somewhat a Java question, but it's in the context of Clojure,
> so here goes. Playing with Leiningen got me thinking about bundling a
> Clojure application as a JAR, which might include a host of classes
> that are loaded but never used. Is it possible to tree-shake such a
> jarfile, and eliminate any classes that are not required for the
> main-class' operation? (Assuming the program doesn't need 'eval' with
> access to all of those classes at runtime.)
>
> This might not save a lot of startup time, but where startup time
> matters, maybe it might shave off a meaningful fraction. I'm just
> curious whether there is enough dependency information in a set of
> class files to calculate a tree-shaking plan; and whether there are
> existing tools to do the job.

I might have misunderstood, but isn't the problem the same as in Java;
you can't know from a static analysis which classes are going to be
loaded?

Best regards,

jim



Re: tree-shaking a jarred Clojure app?

2009-11-20 Thread Graham Fawcett
On Fri, Nov 20, 2009 at 1:37 PM, Jim Downing jim.down...@gmail.com wrote:
> Hi Graham
>
> 2009/11/20 Graham Fawcett graham.fawc...@gmail.com:
> > Hi folks,
> >
> > This is somewhat a Java question, but it's in the context of Clojure,
> > so here goes. Playing with Leiningen got me thinking about bundling a
> > Clojure application as a JAR, which might include a host of classes
> > that are loaded but never used. Is it possible to tree-shake such a
> > jarfile, and eliminate any classes that are not required for the
> > main-class' operation? (Assuming the program doesn't need 'eval' with
> > access to all of those classes at runtime.)
> >
> > This might not save a lot of startup time, but where startup time
> > matters, maybe it might shave off a meaningful fraction. I'm just
> > curious whether there is enough dependency information in a set of
> > class files to calculate a tree-shaking plan; and whether there are
> > existing tools to do the job.
>
> I might have misunderstood, but isn't the problem the same as in Java;
> you can't know from a static analysis which classes are going to be
> loaded?

If it's not possible in Java, then yes, it wouldn't be any different
in Clojure. But after an admittedly casual search, I don't know what
the specific blockers are to tree-shaking in Java.

I know there are dynamic load-class-by-name facilities in the JVM, and
of course that would be a barrier for static analysis. I've only ever
seen these called with literal strings, though (e.g. in setting a JDBC
driver), and that could be statically analyzed.
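
To make the distinction concrete, here is a small sketch of the two cases. The class DynamicLoad and the "impl" property key are hypothetical; Class.forName is the standard load-by-name call.

```java
import java.util.Properties;

class DynamicLoad {
    // The class name is a string literal, so a static analyzer can read it
    // straight out of the compiled class file's constant pool.
    static Class<?> literalName() {
        try {
            return Class.forName("java.util.ArrayList");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // The class name only exists at run time (config file, XML, user input),
    // so inspecting the class files cannot tell which class gets loaded.
    static Class<?> configuredName(Properties config) {
        try {
            return Class.forName(config.getProperty("impl"));
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("impl", "java.util.LinkedList");
        System.out.println(literalName());
        System.out.println(configuredName(config));
    }
}
```

A tree-shaker could keep java.util.ArrayList on the strength of the first method alone, but would have to be told about whatever the second one might load.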

But I should be able to know, through class inspection, whether my
'main' program depends on a class which uses, say, the clojure.zip
namespace, and decide whether or not to include it. Or so I am
wondering.

I suppose a better question might be: would a tree-shaker have a
reasonable chance of shaking a typical Clojure jar, or are there too
many dynamic obstacles to a good analysis?

Just curious,
Graham



> Best regards,
>
> jim



Re: tree-shaking a jarred Clojure app?

2009-11-20 Thread Richard Newman
> But I should be able to know, through class inspection, whether my
> 'main' program depends on a class which uses, say, the clojure.zip
> namespace, and decide whether or not to include it. Or so I am
> wondering.

There are impediments to that, too -- your namespace might require  
another, and so on, and your namespace can refer to symbols further  
down the chain without itself including the necessary `require` form.

If you had the entire classpath available, and were willing to  
transitively examine the entire tree (probably including code-walking)  
then you might be able to solve this problem... but as soon as you hit  
a call to `read` where *read-eval* is not known to be false, or a call  
to `eval`, or maybe even some uses of reflection, you have to give up.

Furthermore, some code adjusts itself at compile-time according to its  
environment (e.g., clojure.contrib.logging, which generates different  
functions depending on which logging libraries are available). That's  
not very amenable to static analysis.

> I suppose a better question might be: would a tree-shaker have a
> reasonable chance of shaking a typical Clojure jar, or are there too
> many dynamic obstacles to a good analysis.

I'm not sure it's worth solving this through low-level analysis. Far  
better, IMO, is to rely on correct descriptors and namespace  
definitions -- convention and configuration can save the day. If you  
can sweep the hard parts under the proverbial rug, the rest can be  
solved in a handful of lines of code!



Re: tree-shaking a jarred Clojure app?

2009-11-20 Thread John Harrop
On Fri, Nov 20, 2009 at 2:28 PM, Richard Newman holyg...@gmail.com wrote:

> > But I should be able to know, through class inspection, whether my
> > 'main' program depends on a class which uses, say, the clojure.zip
> > namespace, and decide whether or not to include it. Or so I am
> > wondering.
>
> There are impediments to that, too -- your namespace might require
> another, and so on, and your namespace can refer to symbols further
> down the chain without itself including the necessary `require` form.


There's the possibility of a macro expanding to a nonobvious require or use
form. Also of eval being used. Even in the Java world, class names that
get dynamically loaded sometimes come from XML files rather than being in
the Java source anywhere.

But there is an alternative to static analysis, at least in theory. One
could, in principle, run one's new app or servlet or whatever on a JVM with
a modified class-loading infrastructure set up to log all loaded classes,
then put it through its paces in typical usage scenarios and the anticipated
atypical situations. If you have good test coverage, you could simply run
the test suite with class-loading logging enabled, and make any appropriate
substitutions of real classes (and their dependencies) for mock-object
classes. Or if the tests don't use any mock objects in really obscure cases
whose real counterparts aren't also used in the typical situation, just log
a normal run and a run through the test suite and dump all the mock object
classes and test-harness classes.

Sun probably provides debug tools that can log classes loaded on a JVM
session, so everything needed for this is in theory available off-the-shelf.

End product: a list of every used class. At least in theory.
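
For the off-the-shelf route, the JVM's standard -verbose:class option prints each class as it is loaded. As an in-process alternative, here is a minimal sketch of a logging class loader; LoggingClassLoader and its load helper are made-up names. It only records requests that pass through this loader, so classes that the parent loader resolves on its own account will not show up.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

class LoggingClassLoader extends ClassLoader {
    // Names of all classes requested through this loader, sorted.
    final Set<String> requested = new ConcurrentSkipListSet<>();

    LoggingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        requested.add(name);                   // log first,
        return super.loadClass(name, resolve); // then delegate as usual
    }

    // Convenience wrapper that turns the checked exception into an unchecked one.
    Class<?> load(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("cannot load " + name, e);
        }
    }

    public static void main(String[] args) {
        LoggingClassLoader loader = new LoggingClassLoader(ClassLoader.getSystemClassLoader());
        loader.load("java.util.ArrayList");
        System.out.println(loader.requested);
    }
}
```

Running the test suite with something like this installed (or with -verbose:class) and collecting the names gives the list of used classes described above.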
