Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself

2016-01-10 Thread Evan Hanson
On 2016-01-10 17:10, Peter Bex wrote:
> I really want to apply this optimization to core, so for the time being
> I decided that we can do this, but only when compiling core.  That means
> that core is priviliged in being able to externally declare types for a
> compilation units that it will compile.  In other words, if eggs have
> types files, the specializations won't be applied when compiling the egg
> itself.  I think this is an acceptable compromise.
> 
> For CHICKEN 5, we'll always apply the specializations, because we will
> not have any toplevel identifiers in types.db anymore when it is
> finished.

Very good, I've pushed this along with the annotation name change we
discussed.

Thank you for taking care of that!

Evan


signature.asc
Description: PGP signature
___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers


Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself

2016-01-10 Thread Peter Bex
On Sun, Dec 20, 2015 at 06:11:24PM +1300, Evan Hanson wrote:
> Unfortunately the scrutinizer change is a bit hairier. One problem is
> that it causes the "-emit-types-file" option to include system-defined
> types (i.e. those that come from from types.db) in generated .types
> files, whereas it should only include those for values defined in the
> file under compilation. That file's entries are generated by walking the
> analysis database, so I don't think we'll be able to avoid keeping track
> of whether a given type came from the user or from a types file so that
> we can determine whether to generate a type for it in this situation.

I've now added this, see the attached patch.  It simply changes the
##compiler#declared-type from a boolean to a symbol indicating the origin
of the declaration.  The code in "scrutinize" for "set!" is a bit strange
in that it seems to automatically mark variables as declared even if they
are only inferred.  I don't completely understand this code, but I've
changed the marking to be 'implicit, so that we can differentiate between
these kinds of types and types that were actually declared by the user
using "the", ":" or "(declare (type ...))" syntaxes.

To keep the behaviour consistent, now -emit-type-file emits both 'implicit
and 'local types, thereby skipping the ones loaded from the db (which have
a 'from-db marking).

> There's also now the problem of what to do when the user redefines an
> identifier without redeclaring its type. For example, with this change,
> if I redefine string-split with a different signature, csc will produce
> inaccurate scrutiny warnings because calls to *my* string-split won't
> match core's but it'll still have a declared-type, now. I haven't dug
> into this too deeply yet (I'm hoping to take a crack at these issues
> over the holidays) but I do think we'll have to make a call about what's
> correct in this and similar situations.

I really want to apply this optimization to core, so for the time being
I decided that we can do this, but only when compiling core.  That means
that core is priviliged in being able to externally declare types for a
compilation units that it will compile.  In other words, if eggs have
types files, the specializations won't be applied when compiling the egg
itself.  I think this is an acceptable compromise.

We could make this more sophisticated by allowing namespaced identifiers
(i.e., with a "module#" prefix) to be specialised when compiling the
module as well.

For CHICKEN 5, we'll always apply the specializations, because we will
not have any toplevel identifiers in types.db anymore when it is
finished.

Cheers,
Peter
From fe88996d070b790633de2e36bd212f935150580d Mon Sep 17 00:00:00 2001
From: Peter Bex 
Date: Wed, 18 Nov 2015 19:28:08 +0100
Subject: [PATCH] Mark external type declarations as declared.

By not being marked as "declared", types loaded from a types database
would be considered to be inferred via flow analysis.  When scrutinizing
procedure definitions, "initial-argument-types" and "variable-result"
would simply return '* or '(*) as the type, which doesn't match the
loaded declaration.  This had the effect of blocking specialization.

This fixes the most important part of #1219.
---
 core.scm  | 4 ++--
 distribution/manifest | 1 +
 scrutinizer.scm   | 8 +---
 tests/runtests.bat| 2 +-
 tests/runtests.sh | 2 +-
 tests/specialization-test-2.scm   | 6 ++
 tests/specialization-test-2.types | 3 +++
 7 files changed, 19 insertions(+), 7 deletions(-)
 create mode 100644 tests/specialization-test-2.types

diff --git a/core.scm b/core.scm
index f9ae772..aeca37c 100644
--- a/core.scm
+++ b/core.scm
@@ -91,7 +91,7 @@
 ;   ##compiler#pure -> BOOL referentially transparent
 ;   ##compiler#clean -> BOOLdoes not modify local state
 ;   ##compiler#type -> TYPE
-;   ##compiler#declared-type -> BOOL
+;   ##compiler#declared-type -> 'from-db | 'local | 'implicit
 
 ; - Source language:
 ;
@@ -1659,7 +1659,7 @@
 	 (symbol? (cadr type)))
 (set-car! (cdr type) name))
 			  (mark-variable name '##compiler#type type)
-			  (mark-variable name '##compiler#declared-type)
+			  (mark-variable name '##compiler#declared-type 'local)
 			  (when pure
 (mark-variable name '##compiler#pure #t))
 			  (when pred
diff --git a/distribution/manifest b/distribution/manifest
index 2b1c3bd..0160f89 100644
--- a/distribution/manifest
+++ b/distribution/manifest
@@ -176,6 +176,7 @@ tests/loopy-loop.scm
 tests/r5rs_pitfalls.scm
 tests/specialization-test-1.scm
 tests/specialization-test-2.scm
+tests/specialization-test-2.types
 tests/test-irregex.scm
 tests/re-tests.txt
 tests/lolevel-tests.scm
diff --git a/scrutinizer.scm b/scrutinizer.scm
index 81c2f82..fed2a7a 100644
--- a/scrutinizer.scm
+++ b/scrutinizer.scm
@@ -88,7 +88,7 @@
 ; global symbol properti

Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself

2015-12-20 Thread John Cowan
Peter Bex scripsit:

> We *could* declare this situation hopelessly lost in CHICKEN 4 because
> it uses unqualified names everywhere in types.db.  Then we can apply
> the fix to CHICKEN 5, where the core definitions will all be properly
> namespaced into modules.

+1 to that.

-- 
John Cowan  http://www.ccil.org/~cowanco...@ccil.org
The man that wanders far from the walking tree
--first line of a non-existent poem by me

___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers


Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself

2015-12-20 Thread Peter Bex
On Sun, Dec 20, 2015 at 06:11:24PM +1300, Evan Hanson wrote:
> Hi Peter,
> 
> I've pushed the read-char and types.db fixes. They make sense and offer
> a good speedup -- nice work!

Thanks!

> Unfortunately the scrutinizer change is a bit hairier. One problem is
> that it causes the "-emit-types-file" option to include system-defined
> types (i.e. those that come from from types.db) in generated .types
> files, whereas it should only include those for values defined in the
> file under compilation.

Damn, I didn't notice that.  Well spotted!

> That file's entries are generated by walking the
> analysis database, so I don't think we'll be able to avoid keeping track
> of whether a given type came from the user or from a types file so that
> we can determine whether to generate a type for it in this situation.

Maybe we can simply mark locally defined variables with another property
which is used to determine whether or not to export it to a types file?

> There's also now the problem of what to do when the user redefines an
> identifier without redeclaring its type. For example, with this change,
> if I redefine string-split with a different signature, csc will produce
> inaccurate scrutiny warnings because calls to *my* string-split won't
> match core's but it'll still have a declared-type, now.

I'd consider that pretty much broken, because when you link that
definition to code that's assumes the types.db definitions are correct,
it will produce invalid code.

For example, if your string-split accepts either symbols or strings, the
#:enforce declaration would be no longer be correct (because it'll now
accept symbols too), and the assumption that the argument is a string
is invalidated.  For example, in the following code:

(define (split-and-original-length x)
  (let* ((split-up (string-split x))
 (len (string-length x)))
(values split-up len)))

the compiler will specialize "string-length" to ##sys#size.  If used
with the new overridden user definition, the procedure would return a
list of strings and the number 3 if you pass in a symbol, because
symbols are always size 3.

> I haven't dug into this too deeply yet (I'm hoping to take a crack at
> these issues over the holidays) but I do think we'll have to make a
> call about what's correct in this and similar situations.

We *could* declare this situation hopelessly lost in CHICKEN 4 because
it uses unqualified names everywhere in types.db.  Then we can apply
the fix to CHICKEN 5, where the core definitions will all be properly
namespaced into modules.

I don't know how to fix it properly to allow user redefinitions of
existing procedures with different types, but I also think it makes
very little sense to allow this at all.

Cheers,
Peter


signature.asc
Description: Digital signature
___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers


Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself

2015-12-19 Thread Evan Hanson
Hi Peter,

I've pushed the read-char and types.db fixes. They make sense and offer
a good speedup -- nice work!

Unfortunately the scrutinizer change is a bit hairier. One problem is
that it causes the "-emit-types-file" option to include system-defined
types (i.e. those that come from from types.db) in generated .types
files, whereas it should only include those for values defined in the
file under compilation. That file's entries are generated by walking the
analysis database, so I don't think we'll be able to avoid keeping track
of whether a given type came from the user or from a types file so that
we can determine whether to generate a type for it in this situation.

There's also now the problem of what to do when the user redefines an
identifier without redeclaring its type. For example, with this change,
if I redefine string-split with a different signature, csc will produce
inaccurate scrutiny warnings because calls to *my* string-split won't
match core's but it'll still have a declared-type, now. I haven't dug
into this too deeply yet (I'm hoping to take a crack at these issues
over the holidays) but I do think we'll have to make a call about what's
correct in this and similar situations.

Evan


signature.asc
Description: PGP signature
___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers


Re: [Chicken-hackers] [PATCH] Fix (restore?) specializations from types.db when compiling core itself

2015-11-28 Thread Peter Bex
On Wed, Nov 18, 2015 at 07:58:49PM +0100, Peter Bex wrote:
> Attached is also a diff of the benchmarks, and you can clearly see the
> benefit this has in the difference between "test" and "master" on the
> kernwyk-wc and kernwyk-cat benchmarks, as I hoped it would.

I also ran the intarweb benchmarks (which read/write a lot to string
ports), and it's consistently faster with patch.  The difference isn't
huge, but noticeable regardless.

For those that are interested, here are the benchmark run outputs (sorry,
it has no fancy tabular output):

Without patch:
--

Unparsers:
==


---  Response unparsing ---
Unparsing a minimal HTTP/1.1 response many times
0.224s CPU time, 0.004s GC time (major), 108656/1440 mutations (total/tracked), 
1/1341 GCs (major/minor)
Unparsing a realistic HTTP/1.1 response many times
2.752s CPU time, 0.06s GC time (major), 1037114/7054 mutations (total/tracked), 
27/17342 GCs (major/minor)

---  Request unparsing ---
Unparsing a minimal HTTP/1.1 request many times
0.22s CPU time, 0.004s GC time (major), 108296/1459 mutations (total/tracked), 
1/1483 GCs (major/minor)
Unparsing a realistic HTTP/1.1 request many times
5.548s CPU time, 0.252s GC time (major), 8687532/240933 mutations 
(total/tracked), 73/36245 GCs (major/minor)

Parsers:
==


---  Request parsing ---
Parsing a minimal HTTP/1.0 request many times
6.912s CPU time, 0.48s GC time (major), 4630912/486536 mutations 
(total/tracked), 174/47762 GCs (major/minor)
Parsing a realistic HTTP/1.1 request many times
29.836s CPU time, 2.876s GC time (major), 11884605/2145481 mutations 
(total/tracked), 1089/236099 GCs (major/minor)

---  Response parsing ---
Parsing a minimal HTTP/1.0 response many times
3.24s CPU time, 0.212s GC time (major), 4413434/259920 mutations 
(total/tracked), 76/20950 GCs (major/minor)

Parsing a realistic HTTP/1.1 response many times
18.368s CPU time, 1.604s GC time (major), 8999267/2480008 mutations 
(total/tracked), 559/131810 GCs (major/minor)


With patch:
---

Unparsers:
==


---  Response unparsing ---
Unparsing a minimal HTTP/1.1 response many times
0.184s CPU time, 0.004s GC time (major), 108656/1580 mutations (total/tracked), 
1/1261 GCs (major/minor)
Unparsing a realistic HTTP/1.1 response many times
2.7s CPU time, 0.076s GC time (major), 1037114/6886 mutations (total/tracked), 
26/17065 GCs (major/minor)

---  Request unparsing ---
Unparsing a minimal HTTP/1.1 request many times
0.212s CPU time, 108296/1438 mutations (total/tracked), 1/1403 GCs (major/minor)
Unparsing a realistic HTTP/1.1 request many times
5.452s CPU time, 0.204s GC time (major), 8687532/240245 mutations 
(total/tracked), 75/35694 GCs (major/minor)

Parsers:
==


---  Request parsing ---
Parsing a minimal HTTP/1.0 request many times
6.644s CPU time, 0.472s GC time (major), 4629832/484472 mutations 
(total/tracked), 174/46161 GCs (major/minor)
Parsing a realistic HTTP/1.1 request many times
28.796s CPU time, 2.848s GC time (major), 11883525/2141135 mutations 
(total/tracked), 1064/228281 GCs (major/minor)

---  Response parsing ---
Parsing a minimal HTTP/1.0 response many times
3.152s CPU time, 0.192s GC time (major), 4411292/258693 mutations 
(total/tracked), 68/20155 GCs (major/minor)

Parsing a realistic HTTP/1.1 response many times
17.552s CPU time, 1.46s GC time (major), 8997089/2459681 mutations 
(total/tracked), 562/125354 GCs (major/minor)


Cheers,
Peter


signature.asc
Description: Digital signature
___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers