Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself
On 2016-01-10 17:10, Peter Bex wrote: > I really want to apply this optimization to core, so for the time being > I decided that we can do this, but only when compiling core. That means > that core is priviliged in being able to externally declare types for a > compilation units that it will compile. In other words, if eggs have > types files, the specializations won't be applied when compiling the egg > itself. I think this is an acceptable compromise. > > For CHICKEN 5, we'll always apply the specializations, because we will > not have any toplevel identifiers in types.db anymore when it is > finished. Very good, I've pushed this along with the annotation name change we discussed. Thank you for taking care of that! Evan signature.asc Description: PGP signature ___ Chicken-hackers mailing list Chicken-hackers@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-hackers
Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself
On Sun, Dec 20, 2015 at 06:11:24PM +1300, Evan Hanson wrote: > Unfortunately the scrutinizer change is a bit hairier. One problem is > that it causes the "-emit-types-file" option to include system-defined > types (i.e. those that come from from types.db) in generated .types > files, whereas it should only include those for values defined in the > file under compilation. That file's entries are generated by walking the > analysis database, so I don't think we'll be able to avoid keeping track > of whether a given type came from the user or from a types file so that > we can determine whether to generate a type for it in this situation. I've now added this, see the attached patch. It simply changes the ##compiler#declared-type from a boolean to a symbol indicating the origin of the declaration. The code in "scrutinize" for "set!" is a bit strange in that it seems to automatically mark variables as declared even if they are only inferred. I don't completely understand this code, but I've changed the marking to be 'implicit, so that we can differentiate between these kinds of types and types that were actually declared by the user using "the", ":" or "(declare (type ...))" syntaxes. To keep the behaviour consistent, now -emit-type-file emits both 'implicit and 'local types, thereby skipping the ones loaded from the db (which have a 'from-db marking). > There's also now the problem of what to do when the user redefines an > identifier without redeclaring its type. For example, with this change, > if I redefine string-split with a different signature, csc will produce > inaccurate scrutiny warnings because calls to *my* string-split won't > match core's but it'll still have a declared-type, now. I haven't dug > into this too deeply yet (I'm hoping to take a crack at these issues > over the holidays) but I do think we'll have to make a call about what's > correct in this and similar situations. I really want to apply this optimization to core, so for the time being I decided that we can do this, but only when compiling core. That means that core is priviliged in being able to externally declare types for a compilation units that it will compile. In other words, if eggs have types files, the specializations won't be applied when compiling the egg itself. I think this is an acceptable compromise. We could make this more sophisticated by allowing namespaced identifiers (i.e., with a "module#" prefix) to be specialised when compiling the module as well. For CHICKEN 5, we'll always apply the specializations, because we will not have any toplevel identifiers in types.db anymore when it is finished. Cheers, Peter From fe88996d070b790633de2e36bd212f935150580d Mon Sep 17 00:00:00 2001 From: Peter Bex Date: Wed, 18 Nov 2015 19:28:08 +0100 Subject: [PATCH] Mark external type declarations as declared. By not being marked as "declared", types loaded from a types database would be considered to be inferred via flow analysis. When scrutinizing procedure definitions, "initial-argument-types" and "variable-result" would simply return '* or '(*) as the type, which doesn't match the loaded declaration. This had the effect of blocking specialization. This fixes the most important part of #1219. --- core.scm | 4 ++-- distribution/manifest | 1 + scrutinizer.scm | 8 +--- tests/runtests.bat| 2 +- tests/runtests.sh | 2 +- tests/specialization-test-2.scm | 6 ++ tests/specialization-test-2.types | 3 +++ 7 files changed, 19 insertions(+), 7 deletions(-) create mode 100644 tests/specialization-test-2.types diff --git a/core.scm b/core.scm index f9ae772..aeca37c 100644 --- a/core.scm +++ b/core.scm @@ -91,7 +91,7 @@ ; ##compiler#pure -> BOOL referentially transparent ; ##compiler#clean -> BOOLdoes not modify local state ; ##compiler#type -> TYPE -; ##compiler#declared-type -> BOOL +; ##compiler#declared-type -> 'from-db | 'local | 'implicit ; - Source language: ; @@ -1659,7 +1659,7 @@ (symbol? (cadr type))) (set-car! (cdr type) name)) (mark-variable name '##compiler#type type) - (mark-variable name '##compiler#declared-type) + (mark-variable name '##compiler#declared-type 'local) (when pure (mark-variable name '##compiler#pure #t)) (when pred diff --git a/distribution/manifest b/distribution/manifest index 2b1c3bd..0160f89 100644 --- a/distribution/manifest +++ b/distribution/manifest @@ -176,6 +176,7 @@ tests/loopy-loop.scm tests/r5rs_pitfalls.scm tests/specialization-test-1.scm tests/specialization-test-2.scm +tests/specialization-test-2.types tests/test-irregex.scm tests/re-tests.txt tests/lolevel-tests.scm diff --git a/scrutinizer.scm b/scrutinizer.scm index 81c2f82..fed2a7a 100644 --- a/scrutinizer.scm +++ b/scrutinizer.scm @@ -88,7 +88,7 @@ ; global symbol properti
Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself
Peter Bex scripsit: > We *could* declare this situation hopelessly lost in CHICKEN 4 because > it uses unqualified names everywhere in types.db. Then we can apply > the fix to CHICKEN 5, where the core definitions will all be properly > namespaced into modules. +1 to that. -- John Cowan http://www.ccil.org/~cowanco...@ccil.org The man that wanders far from the walking tree --first line of a non-existent poem by me ___ Chicken-hackers mailing list Chicken-hackers@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-hackers
Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself
On Sun, Dec 20, 2015 at 06:11:24PM +1300, Evan Hanson wrote: > Hi Peter, > > I've pushed the read-char and types.db fixes. They make sense and offer > a good speedup -- nice work! Thanks! > Unfortunately the scrutinizer change is a bit hairier. One problem is > that it causes the "-emit-types-file" option to include system-defined > types (i.e. those that come from from types.db) in generated .types > files, whereas it should only include those for values defined in the > file under compilation. Damn, I didn't notice that. Well spotted! > That file's entries are generated by walking the > analysis database, so I don't think we'll be able to avoid keeping track > of whether a given type came from the user or from a types file so that > we can determine whether to generate a type for it in this situation. Maybe we can simply mark locally defined variables with another property which is used to determine whether or not to export it to a types file? > There's also now the problem of what to do when the user redefines an > identifier without redeclaring its type. For example, with this change, > if I redefine string-split with a different signature, csc will produce > inaccurate scrutiny warnings because calls to *my* string-split won't > match core's but it'll still have a declared-type, now. I'd consider that pretty much broken, because when you link that definition to code that's assumes the types.db definitions are correct, it will produce invalid code. For example, if your string-split accepts either symbols or strings, the #:enforce declaration would be no longer be correct (because it'll now accept symbols too), and the assumption that the argument is a string is invalidated. For example, in the following code: (define (split-and-original-length x) (let* ((split-up (string-split x)) (len (string-length x))) (values split-up len))) the compiler will specialize "string-length" to ##sys#size. If used with the new overridden user definition, the procedure would return a list of strings and the number 3 if you pass in a symbol, because symbols are always size 3. > I haven't dug into this too deeply yet (I'm hoping to take a crack at > these issues over the holidays) but I do think we'll have to make a > call about what's correct in this and similar situations. We *could* declare this situation hopelessly lost in CHICKEN 4 because it uses unqualified names everywhere in types.db. Then we can apply the fix to CHICKEN 5, where the core definitions will all be properly namespaced into modules. I don't know how to fix it properly to allow user redefinitions of existing procedures with different types, but I also think it makes very little sense to allow this at all. Cheers, Peter signature.asc Description: Digital signature ___ Chicken-hackers mailing list Chicken-hackers@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-hackers
Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself
Hi Peter, I've pushed the read-char and types.db fixes. They make sense and offer a good speedup -- nice work! Unfortunately the scrutinizer change is a bit hairier. One problem is that it causes the "-emit-types-file" option to include system-defined types (i.e. those that come from from types.db) in generated .types files, whereas it should only include those for values defined in the file under compilation. That file's entries are generated by walking the analysis database, so I don't think we'll be able to avoid keeping track of whether a given type came from the user or from a types file so that we can determine whether to generate a type for it in this situation. There's also now the problem of what to do when the user redefines an identifier without redeclaring its type. For example, with this change, if I redefine string-split with a different signature, csc will produce inaccurate scrutiny warnings because calls to *my* string-split won't match core's but it'll still have a declared-type, now. I haven't dug into this too deeply yet (I'm hoping to take a crack at these issues over the holidays) but I do think we'll have to make a call about what's correct in this and similar situations. Evan signature.asc Description: PGP signature ___ Chicken-hackers mailing list Chicken-hackers@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-hackers
Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself
On Wed, Nov 18, 2015 at 07:58:49PM +0100, Peter Bex wrote: > Attached is also a diff of the benchmarks, and you can clearly see the > benefit this has in the difference between "test" and "master" on the > kernwyk-wc and kernwyk-cat benchmarks, as I hoped it would. I also ran the intarweb benchmarks (which read/write a lot to string ports), and it's consistently faster with patch. The difference isn't huge, but noticeable regardless. For those that are interested, here are the benchmark run outputs (sorry, it has no fancy tabular output): Without patch: -- Unparsers: == --- Response unparsing --- Unparsing a minimal HTTP/1.1 response many times 0.224s CPU time, 0.004s GC time (major), 108656/1440 mutations (total/tracked), 1/1341 GCs (major/minor) Unparsing a realistic HTTP/1.1 response many times 2.752s CPU time, 0.06s GC time (major), 1037114/7054 mutations (total/tracked), 27/17342 GCs (major/minor) --- Request unparsing --- Unparsing a minimal HTTP/1.1 request many times 0.22s CPU time, 0.004s GC time (major), 108296/1459 mutations (total/tracked), 1/1483 GCs (major/minor) Unparsing a realistic HTTP/1.1 request many times 5.548s CPU time, 0.252s GC time (major), 8687532/240933 mutations (total/tracked), 73/36245 GCs (major/minor) Parsers: == --- Request parsing --- Parsing a minimal HTTP/1.0 request many times 6.912s CPU time, 0.48s GC time (major), 4630912/486536 mutations (total/tracked), 174/47762 GCs (major/minor) Parsing a realistic HTTP/1.1 request many times 29.836s CPU time, 2.876s GC time (major), 11884605/2145481 mutations (total/tracked), 1089/236099 GCs (major/minor) --- Response parsing --- Parsing a minimal HTTP/1.0 response many times 3.24s CPU time, 0.212s GC time (major), 4413434/259920 mutations (total/tracked), 76/20950 GCs (major/minor) Parsing a realistic HTTP/1.1 response many times 18.368s CPU time, 1.604s GC time (major), 8999267/2480008 mutations (total/tracked), 559/131810 GCs (major/minor) With patch: --- Unparsers: == --- Response unparsing --- Unparsing a minimal HTTP/1.1 response many times 0.184s CPU time, 0.004s GC time (major), 108656/1580 mutations (total/tracked), 1/1261 GCs (major/minor) Unparsing a realistic HTTP/1.1 response many times 2.7s CPU time, 0.076s GC time (major), 1037114/6886 mutations (total/tracked), 26/17065 GCs (major/minor) --- Request unparsing --- Unparsing a minimal HTTP/1.1 request many times 0.212s CPU time, 108296/1438 mutations (total/tracked), 1/1403 GCs (major/minor) Unparsing a realistic HTTP/1.1 request many times 5.452s CPU time, 0.204s GC time (major), 8687532/240245 mutations (total/tracked), 75/35694 GCs (major/minor) Parsers: == --- Request parsing --- Parsing a minimal HTTP/1.0 request many times 6.644s CPU time, 0.472s GC time (major), 4629832/484472 mutations (total/tracked), 174/46161 GCs (major/minor) Parsing a realistic HTTP/1.1 request many times 28.796s CPU time, 2.848s GC time (major), 11883525/2141135 mutations (total/tracked), 1064/228281 GCs (major/minor) --- Response parsing --- Parsing a minimal HTTP/1.0 response many times 3.152s CPU time, 0.192s GC time (major), 4411292/258693 mutations (total/tracked), 68/20155 GCs (major/minor) Parsing a realistic HTTP/1.1 response many times 17.552s CPU time, 1.46s GC time (major), 8997089/2459681 mutations (total/tracked), 562/125354 GCs (major/minor) Cheers, Peter signature.asc Description: Digital signature ___ Chicken-hackers mailing list Chicken-hackers@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-hackers