Re: tree-shaking a jarred Clojure app?
On Fri, Nov 20, 2009 at 06:37:18PM +, Jim Downing wrote:

> I might have misunderstood, but isn't the problem the same as in Java; you can't know from a static analysis which classes are going to be loaded?

Except that Clojure will load all of them so it can bind them to the vars in each namespace. Java code is usually much less dynamic, and makes some static analysis a lot easier.

JVMs are pretty good about not loading classes until they are used, so I think the real problem is that the init code for each namespace loads all of the classes before it does anything. If that could be delayed, I suspect most of the startup delay would go away. The trick is figuring out how to do this without adding yet another level of indirection to vars.

David

--
You received this message because you are subscribed to the Google Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/clojure?hl=en
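To make the point about lazy JVM behaviour concrete, here is a minimal Java sketch. It is not from the thread: the names LazyInitDemo and Heavy are made up for illustration, and it observes class initialization (the static initializer running), which is the part that is easy to see from inside a program. It only shows that the JVM defers this work until first use; it says nothing about what Clojure's namespace init code does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Clojure or of the thread.
final class LazyInitDemo {
    // Records events in the order they actually happen.
    static final List<String> EVENTS = new ArrayList<>();

    // Stand-in for a class that is expensive to initialize.
    static final class Heavy {
        static {
            // Runs once, on first active use of Heavy.
            EVENTS.add("Heavy initialized");
        }

        static int answer() {
            return 42;
        }
    }

    // Runs a tiny "program" that touches Heavy only when asked to.
    static List<String> run(boolean useHeavy) {
        EVENTS.clear();
        EVENTS.add("main started");
        if (useHeavy) {
            EVENTS.add("answer = " + Heavy.answer());
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [main started]
        System.out.println(run(true));  // [main started, Heavy initialized, answer = 42]
    }
}
```

The first run never mentions Heavy, so its static initializer does not run. An init routine that eagerly touches every class it might need gives up exactly this laziness.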
Re: tree-shaking a jarred Clojure app?
On Sat, Nov 21, 2009 at 8:30 PM, David Brown cloj...@davidb.org wrote:

> On Fri, Nov 20, 2009 at 06:37:18PM +, Jim Downing wrote:
>
>> I might have misunderstood, but isn't the problem the same as in Java; you can't know from a static analysis which classes are going to be loaded?
>
> Except that Clojure will load all of them so it can bind them to the vars in each namespace. Java code is usually much less dynamic, and makes some static analysis a lot easier.
>
> JVMs are pretty good about not loading classes until they are used, so I think the real problem is that the init code for each namespace loads all of the classes before it does anything. If that could be delayed, I suspect most of the startup delay would go away. The trick is figuring out how to do this without adding yet another level of indirection to vars.

Are you talking about binding things like String.class to vars referenced by symbols like String? All you need for that is a flag in the var's metadata of loaded-yet?, next to the flag that says it's a Java class and not a normal Clojure var. The held value can just be the class's (fully-qualified) name until there's an attempt to use it. When it's dereferenced, if the flag is set, the name is replaced by the class object and the flag is cleared; the class object is returned. Subsequently, the flag's not set, so the value is simply returned.

Checking the flag is the only added step in most var dereferences, and var dereference is already somewhat slow, so this shouldn't cause performance problems, nor would it add yet another level of indirection to anything except, temporarily, classname vars.

It wouldn't affect the other performance concern here though, that dependency mapping could help with: trimming jars to the bare essentials to shrink downloads and the amount of disk space Clojure apps chew up when installed.
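The flag-and-name idea above can be sketched in Java roughly as follows. This is only an illustration under assumptions: LazyClassRef is a hypothetical holder, not Clojure's actual Var implementation, and a null field stands in for the "not loaded yet" flag described in the message.

```java
// Hypothetical holder illustrating the proposal; not Clojure's Var.
final class LazyClassRef {
    private final String className;   // the value held until first use
    private volatile Class<?> loaded; // null plays the role of the "not loaded yet" flag

    LazyClassRef(String className) {
        this.className = className;
    }

    boolean isLoaded() {
        return loaded != null;
    }

    // Dereference: one extra null check, then the usual return.
    Class<?> deref() {
        Class<?> c = loaded;
        if (c == null) {
            try {
                // First use: swap the name for the class object.
                c = Class.forName(className);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("cannot load " + className, e);
            }
            loaded = c; // a racing thread would store the same Class object
        }
        return c;
    }

    public static void main(String[] args) {
        LazyClassRef ref = new LazyClassRef("java.util.ArrayList");
        System.out.println(ref.isLoaded()); // false
        System.out.println(ref.deref());    // class java.util.ArrayList
        System.out.println(ref.isLoaded()); // true
    }
}
```

After the first deref the only added cost is the null check, which matches the claim that nothing gains a permanent extra level of indirection.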
Re: tree-shaking a jarred Clojure app?
On Sat, Nov 21, 2009 at 08:42:26PM -0500, John Harrop wrote:

> Are you talking about binding things like String.class to vars referenced by symbols like String?

Not just String.class, every single class referenced by a given namespace will be loaded, and most of them instantiated before a single line of my code runs.

It's why:

    $ time ant ... (no ant file, it just fails)
    real 0m0.155s

    $ time clj -e '(System/exit 0)'
    real 0m0.960s

is so drastically different.

Compiling this:

    (ns foo (:gen-class))
    (defn -main [])

and running

    $ time java -cp classes:clojure.jar foo
    real 0m0.749s

still loads and instantiates every single function defined in core.clj.

David
Re: tree-shaking a jarred Clojure app?
On Sat, Nov 21, 2009 at 8:57 PM, David Brown cloj...@davidb.org wrote:

> On Sat, Nov 21, 2009 at 08:42:26PM -0500, John Harrop wrote:
>
>> Are you talking about binding things like String.class to vars referenced by symbols like String?
>
> Not just String.class, every single class referenced by a given namespace will be loaded, and most of them instantiated before a single line of my code runs.
>
> It's why:
>
>     $ time ant ... (no ant file, it just fails)
>     real 0m0.155s
>
>     $ time clj -e '(System/exit 0)'
>     real 0m0.960s
>
> is so drastically different.

1 second instead of 1/6 of a second. Yeah, like users will notice that difference in startup times. :)

> running
>
>     $ time java -cp classes:clojure.jar foo
>     real 0m0.749s
>
> still loads and instantiates every single function defined in core.clj.

Avoiding instantiating all the Clojure functions, used or not, is a whole 'nother kettle of fish. Barring eval, you can find out what functions are used in a codebase by loading it all, reading it in as Clojure data, taking it apart form by form, subjecting every form to macroexpand, and then resolving all symbols; this gives you all defn-defined functions called. You then assume that every closure in any of these functions is also potentially called, and that in theory gives you your answer, with a procedure that can theoretically be automated (and in Clojure itself perhaps without too much difficulty).
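Once the reading, macroexpanding and symbol-resolving steps have produced a "who calls whom" table, the last step is plain graph reachability from the entry point. The Java sketch below (Java 9 or later for `Map.of` and `List.of`) assumes that table already exists as a map. The class name ReachableFns and the sample function names are invented for illustration, and producing the map from real Clojure source is the hard part that this sketch leaves out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: computes which functions are reachable from a root.
final class ReachableFns {
    // calls maps a function name to the names it calls directly.
    static Set<String> reachable(Map<String, List<String>> calls, String root) {
        Set<String> seen = new TreeSet<>();      // sorted, so output is stable
        Deque<String> work = new ArrayDeque<>();
        work.push(root);
        while (!work.isEmpty()) {
            String fn = work.pop();
            if (seen.add(fn)) {                  // first visit only
                for (String callee : calls.getOrDefault(fn, List.of())) {
                    work.push(callee);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Made-up call table for a made-up app.
        Map<String, List<String>> calls = Map.of(
            "app/-main", List.of("app/run"),
            "app/run", List.of("clojure.core/map", "app/helper"),
            "app/helper", List.of(),
            "app/unused", List.of("clojure.zip/zipper"));
        // prints [app/-main, app/helper, app/run, clojure.core/map]
        System.out.println(reachable(calls, "app/-main"));
    }
}
```

Anything absent from the result, such as app/unused and clojure.zip/zipper here, is a candidate for shaking out, subject to the eval caveat in the message.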
Re: tree-shaking a jarred Clojure app?
On Sat, Nov 21, 2009 at 11:14:52PM -0500, John Harrop wrote:

> 1 second instead of 1/6 of a second. Yeah, like users will notice that difference in startup times. :)

I'm not actually complaining, but I do notice every single time I fire up a REPL. The more code that you have, the longer it takes. It's basically completely thwarting the JVM's attempt to lazily load classes.

I have the same complaint about JRuby and Scala. Scala cheats a little, and their REPL prints a prompt before actually loading the classes.

Honestly, I think it's a reasonable penalty to pay for a system that is both dynamic and very fast.

David
Re: tree-shaking a jarred Clojure app?
On Sat, Nov 21, 2009 at 11:21 PM, David Brown cloj...@davidb.org wrote:

> On Sat, Nov 21, 2009 at 11:14:52PM -0500, John Harrop wrote:
>
>> 1 second instead of 1/6 of a second. Yeah, like users will notice that difference in startup times. :)
>
> I'm not actually complaining, but I do notice every single time I fire up a REPL. The more code that you have, the longer it takes. It's basically completely thwarting the JVM's attempt to lazily load classes.
>
> I have the same complaint about JRuby and Scala. Scala cheats a little, and their REPL prints a prompt before actually loading the classes.
>
> Honestly, I think it's a reasonable penalty to pay for a system that is both dynamic and very fast.

There's a certain irony in saying something being slower is a reasonable penalty to pay for a system that is fast. :)
tree-shaking a jarred Clojure app?
Hi folks,

This is somewhat a Java question, but it's in the context of Clojure, so here goes.

Playing with Leiningen got me thinking about bundling a Clojure application as a JAR, which might include a host of classes that are loaded but never used. Is it possible to tree-shake such a jarfile, and eliminate any classes that are not required for the main-class' operation? (Assuming the program doesn't need 'eval' with access to all of those classes at runtime.)

This might not save a lot of startup time, but where startup time matters, maybe it might shave off a meaningful fraction.

I'm just curious whether there is enough dependency information in a set of class files to calculate a tree-shaking plan; and whether there are existing tools to do the job.

Best,
Graham
Re: tree-shaking a jarred Clojure app?
Hi Graham

2009/11/20 Graham Fawcett graham.fawc...@gmail.com:

> Hi folks,
>
> This is somewhat a Java question, but it's in the context of Clojure, so here goes.
>
> Playing with Leiningen got me thinking about bundling a Clojure application as a JAR, which might include a host of classes that are loaded but never used. Is it possible to tree-shake such a jarfile, and eliminate any classes that are not required for the main-class' operation? (Assuming the program doesn't need 'eval' with access to all of those classes at runtime.)
>
> This might not save a lot of startup time, but where startup time matters, maybe it might shave off a meaningful fraction.
>
> I'm just curious whether there is enough dependency information in a set of class files to calculate a tree-shaking plan; and whether there are existing tools to do the job.

I might have misunderstood, but isn't the problem the same as in Java; you can't know from a static analysis which classes are going to be loaded?

Best regards,
jim
Re: tree-shaking a jarred Clojure app?
On Fri, Nov 20, 2009 at 1:37 PM, Jim Downing jim.down...@gmail.com wrote:

> Hi Graham
>
> 2009/11/20 Graham Fawcett graham.fawc...@gmail.com:
>
>> Hi folks,
>>
>> This is somewhat a Java question, but it's in the context of Clojure, so here goes.
>>
>> Playing with Leiningen got me thinking about bundling a Clojure application as a JAR, which might include a host of classes that are loaded but never used. Is it possible to tree-shake such a jarfile, and eliminate any classes that are not required for the main-class' operation? (Assuming the program doesn't need 'eval' with access to all of those classes at runtime.)
>>
>> This might not save a lot of startup time, but where startup time matters, maybe it might shave off a meaningful fraction.
>>
>> I'm just curious whether there is enough dependency information in a set of class files to calculate a tree-shaking plan; and whether there are existing tools to do the job.
>
> I might have misunderstood, but isn't the problem the same as in Java; you can't know from a static analysis which classes are going to be loaded?

If it's not possible in Java, then yes, it wouldn't be any different in Clojure. But after an admittedly casual search, I don't know what the specific blockers are to tree-shaking in Java.

I know there are dynamic load-class-by-name facilities in the JVM, and of course that would be a barrier for static analysis. I've only ever seen these called with literal strings, though (e.g. in setting a JDBC driver), and that could be statically analyzed.

But I should be able to know, through class inspection, whether my 'main' program depends on a class which uses, say, the clojure.zip namespace, and decide whether or not to include it. Or so I am wondering.

I suppose a better question might be: would a tree-shaker have a reasonable chance of shaking a typical Clojure jar, or are there too many dynamic obstacles to a good analysis?

Just curious,
Graham

> Best regards,
> jim
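The difference between a literal class name and a computed one can be shown with a small Java sketch. It uses `Class.forName` on standard-library classes so it runs without a JDBC driver on the classpath. The class name DynamicLoadDemo and its method names are made up for illustration.

```java
// Hypothetical demo of the two load-by-name cases discussed above.
final class DynamicLoadDemo {
    // Case 1: the name is a string literal, so it is stored as a constant in
    // this class file, where an analysis tool has a chance of finding it.
    static Class<?> literalName() {
        try {
            return Class.forName("java.util.ArrayList");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Case 2: the name is assembled at run time (it could equally come from a
    // config file or user input), so no tool can know it from the code alone.
    static Class<?> computedName(String pkg, String simpleName) {
        String name = pkg + "." + simpleName;
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("no such class: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(literalName());                           // class java.util.ArrayList
        System.out.println(computedName("java.util", "LinkedList")); // class java.util.LinkedList
    }
}
```

A tree-shaker could plausibly handle the first case by scanning string constants passed to `Class.forName`; the second is the kind of dynamic obstacle the question is asking about.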
Re: tree-shaking a jarred Clojure app?
> But I should be able to know, through class inspection, whether my 'main' program depends on a class which uses, say, the clojure.zip namespace, and decide whether or not to include it. Or so I am wondering.

There are impediments to that, too -- your namespace might require another, and so on, and your namespace can refer to symbols further down the chain without itself including the necessary `require` form.

If you had the entire classpath available, and were willing to transitively examine the entire tree (probably including code-walking) then you might be able to solve this problem... but as soon as you hit a call to `read` where *read-eval* is not known to be false, or a call to `eval`, or maybe even some uses of reflection, you have to give up.

Furthermore, some code adjusts itself at compile-time according to its environment (e.g., clojure.contrib.logging, which generates different functions depending on which logging libraries are available). That's not very amenable to static analysis.

> I suppose a better question might be: would a tree-shaker have a reasonable chance of shaking a typical Clojure jar, or are there too many dynamic obstacles to a good analysis?

I'm not sure it's worth solving this through low-level analysis. Far better, IMO, is to rely on correct descriptors and namespace definitions -- convention and configuration can save the day. If you can sweep the hard parts under the proverbial rug, the rest can be solved in a handful of lines of code!
Re: tree-shaking a jarred Clojure app?
On Fri, Nov 20, 2009 at 2:28 PM, Richard Newman holyg...@gmail.com wrote:

>> But I should be able to know, through class inspection, whether my 'main' program depends on a class which uses, say, the clojure.zip namespace, and decide whether or not to include it. Or so I am wondering.
>
> There are impediments to that, too -- your namespace might require another, and so on, and your namespace can refer to symbols further down the chain without itself including the necessary `require` form.

There's the possibility of a macro expanding to a nonobvious require or use form. Also of eval being used. Even in the Java world, class names that get dynamically loaded sometimes come from XML files rather than being in the Java source anywhere.

But there is an alternative to static analysis, at least in theory. One could, in principle, run one's new app or servlet or whatever on a JVM with a modified class-loading infrastructure set up to log all loaded classes, then put it through its paces in typical usage scenarios and the anticipated atypical situations.

If you have good test coverage, you could simply run the test suite with class-loading logging enabled, and make any appropriate substitutions of real classes (and their dependencies) for mock-object classes. Or if the tests don't use any mock objects in really obscure cases whose real counterparts aren't also used in the typical situation, just log a normal run and a run through the test suite and dump all the mock object classes and test-harness classes.

Sun probably provides debug tools that can log classes loaded on a JVM session, so everything needed for this is in theory available off-the-shelf.

End product: a list of every used class. At least in theory.
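As a rough illustration of the logging idea, here is a Java sketch of a class loader that records every class name it is asked for before delegating to its parent. The class LoggingClassLoader and its `record` helper are hypothetical, not an existing tool. The sketch also has a real limitation: it only sees requests routed through this loader, so classes pulled in by code the parent loader defined are not recorded. For a whole-JVM log, the HotSpot launcher's `-verbose:class` option prints classes as they are loaded and is probably the more practical route.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical loader that records requested class names, then delegates.
final class LoggingClassLoader extends ClassLoader {
    private final Set<String> requested =
        Collections.synchronizedSet(new LinkedHashSet<>());

    LoggingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        requested.add(name);                   // log first
        return super.loadClass(name, resolve); // then normal parent-first lookup
    }

    // Snapshot of the names seen so far, in request order.
    Set<String> requested() {
        synchronized (requested) {
            return new LinkedHashSet<>(requested);
        }
    }

    // Convenience for the demo: load the given names and return the log.
    static Set<String> record(String... names) {
        LoggingClassLoader loader =
            new LoggingClassLoader(ClassLoader.getSystemClassLoader());
        for (String name : names) {
            try {
                loader.loadClass(name);
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException("no such class: " + name, e);
            }
        }
        return loader.requested();
    }

    public static void main(String[] args) {
        // prints [java.util.ArrayList, java.util.HashMap]
        System.out.println(record("java.util.ArrayList", "java.util.HashMap"));
    }
}
```

Running the test suite under such logging and keeping only the recorded classes gives the "list of every used class", with the caveat from the message that it is only as complete as the scenarios exercised.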