[COMMIT] New Assembler in place

2002-05-31 Thread Jeff

As noted, the new assembler is now in place. There isn't much
resemblance to the old assembler, except that they both (almost) read
the same .pasm files. The new macro syntax is documented in assemble.pl
and can be read with perldoc, but I'll summarize it later in this
message.

The new assembler introduces two major changes and three new features,
which should get us a long way towards the 1.0 release. Let's look at
the fun stuff first.

Support for keyed parameters now exists. I need to change the
'set_keyed' and 'get_keyed' operators to just 'set', since they should
be relatively unambiguous. Anyway, here's the new syntax available at
the assembly level:

set_keyed P15["foo"],I7
get_keyed N3,P2[S5]

This is, of course, a simple example. There isn't much magic going on
here; all the assembler is doing is the following transform:

set_keyed P15["foo"],I7

into

set_keyed P15,"foo",I7

It's really just syntactic sugar, with slightly better error checking.

The next feature is manifest constants, something Clinton used in his
BASIC compiler but had to do by hand. You can use the following syntax
to define an assemble-time constant:

.constant PerlHash 6  # Important, because the special names 'PerlHash'
                      # &c went away
new P5,.PerlHash      # Again, note the pervasive '.', telling you that
                      # you're dealing with a macro expansion.

These constants are kept in a dynamic hash, so redefining one as you
progress through the file simply replaces the earlier value from that
point on. The last feature was alluded to earlier, but not spelled out.

Constants such as '.Array' and '.IntQueue' are predefined for you, so
you can use them as follows without defining them yourself:

.constant ARGV P0
new .ARGV,.PerlHash   # .PerlHash is defined for you.

However, it's currently a static list inside assemble.pl, and not
explicitly generated from include/parrot/pmc.h. It should be, but it's
not. I'm planning to add '.include' and '.sub' later on, but I'm not
going to spend too much time on things that won't really be used by
compilers. I'd hope that they spend their time optimizing out uses of
macros and such during register allocation and other passes.

Now, onto the changes.

The changes here are designed to make the various syntaxes more
orthogonal. In the new assembler, every command or argument preceded by
a '.' is a macro construct of some variety. Therefore, after expansion,
nothing in a .pasm file should ever have a '.' prepended to it.

Macros used to look like:
answer macro ( A, B )
  print A
  print "\n"
  print B
endm
answer(1,5)

The new style changes 'macro' and 'endm' so the assembler knows they're
not new instructions, and does the same for 'A' and 'B' so the assembler
knows they're not labels. Here's how that now looks, with comments:

.macro answer ( A, B )  # Macro directives are prefaced by '.' and come
                        # first, so no lookahead is required to determine
                        # whether 'macro' is a label.
  print .A              # At one time, 'A' might have been confused with
                        # a forward-referenced label. No longer.
  print "\n"
  print .B
.endm
.answer(1,5)            # Again, anything to be expanded is prefaced by a '.'.

The next change is to local labels. These are no longer treated
specially by the assembler; inside a macro definition they are handled
simply as macro expansions. Local labels really shouldn't be necessary
outside of a macro definition. At the global level there isn't much need
for more than one label with the same name, although I'll admit it's
convenient. Inside a macro there is a real need, because three macro
invocations that each contain 'LOOP_START' probably shouldn't die in the
assembler with "multiple label definitions".

I removed local labels at the global level for really only one reason: I
couldn't decide how the semantics should work. Multiple 'ret'
instructions can occur in a single subroutine, so we can't always tell
whether a reference that sits between two definitions of the same local
label should point to the first definition or the second. This is sort
of hard to explain at 12:30 am, so I won't continue.

Anyway, the old local label syntax looks like:

$foo: set I0,32
  branch $foo

Now, the local label syntax (only available in macros, remember) is:

.local $foo: set I0,32  # The '.local' now sets off the macro.
  branch .$foo          # And '.' now prefaces the macro invocation.

So, hopefully the bad news isn't too bad. Local labels are still
available where you -really- need them, and macro syntax hasn't changed
too much, but it should be easier to assemble.
--
Jeff <[EMAIL PROTECTED]>



Re: [netlabs #644] [PATCH] More tests

2002-05-31 Thread Josh Wilmes

At 23:25 on 05/31/2002 -, Simon Glover (via RT) 
<[EMAIL PROTECTED]> wrote:

>  This patch adds tests for the index, depth & intdepth ops, as well
>  as adding an extra test for intsave/intrestore.
> 
>  Simon

Committed, daddio.

--Josh




The great inline debacle (longish)

2002-05-31 Thread David M. Lloyd

Nearly two years ago, a great debate began on this mailing list about
when to inline code and when not to, as well as the actual merits of
inlining in the first place.  I will attempt to summarize here, because
this debate was never concluded and we are now at a point where some
things are inlined, and some other things are bogusly marked as inline
but are not actually usable as such (did that sentence make sense?).

This is my attempt to paraphrase the discussion.  I'm just laying it out
as I understand it, so if I misinterpret somebody please forgive me.

John Tobey started off the whole thing by suggesting that all internal
Parrot functions be declared inline (meaning with the "inline" keyword
where supported), with any external API functions existing as wrappers
that "call" the equivalent inlined function.  [Question: At least on GCC,
inline funcs can be external, and don't have to be static, true?  If true,
why have a wrapper func at all?  My understanding is that inline funcs
will be inlined, but an extern version will also be generated in that case.]
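
For concreteness, the wrapper scheme being proposed looks roughly like
this (a sketch with invented names, not actual Parrot code):

  /* internal header: the inline candidate itself */
  struct IntStack { int used; };       /* type invented for the example */

  static inline int
  intstack_depth_internal(const struct IntStack *s)
  {
      return s->used;                  /* tiny body, cheap to inline */
  }

  /* api.c: the external wrapper, so embedders still get a real,
   * linkable symbol even when the internal version is inlined away */
  int
  Parrot_IntStack_depth(const struct IntStack *s)
  {
      return intstack_depth_internal(s);
  }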

He alluded to using a preprocessor (which I believe was a popular idea at
the time) to generate these wrappers.  Simon Cozens referenced Sapphire as
an example of the advantages of this approach.

Some discussion was raised about premature optimization.  That thread
didn't seem to go anywhere, though; the arguments for future flexibility
(tied into the preprocessor idea) seemed to win out.

Dan Sugalski expressed his wish to be able to squish all the Parrot source
into one monster .c file, to let the compiler do the most magic possible.
He also mentioned a point brought up later (and louder) by Nick
Ing-Simmons: that inline may not help at all.  Nick also stated that most
modern compilers can intelligently decide whether or not to inline
functions in the first place.

At this point there was some discussion about the differences between
using a macro to inline code and using the "inline" keyword where supported
and letting the compiler do it.  It was generally agreed that macros for
this purpose are Bad, with many reasons cited (among them some pretty
gross Perl5-isms).

It was also generally agreed that neither the caller nor the compiler
should be forced to inline (or not inline) anything.  Whether GCC's
inline keyword is treated as a hint or an order was not made clear,
and no one mentioned other compilers (MSVC comes to mind) that support
inline.

Nick I-S summarized his views as follows (here I use a direct quote
because he was most efficient):

> >So aren't we all saying the same thing?
>
> I don't think so - it is a question which way we code the source:
>
> A. Use 'inline' every where and trust compiler not to do what we told it
>if it knows better.
> B. No inline hints in the source and trust the compiler to be able to
>do the right thing when prodded with -O9 or whatever.
> C. Make "informed" guesses at to which calls should be inlined.
>
> My view is that (B) is the way to go, aiming for (C) eventually, because
> (A) gives worst-case cache purging.

To which Tobey replied:

> Moving from a first-guess (C) to an optimal (C) (where we make
> reasonable hints all the time, no doubt with the aid of some Configure
> tests or machine-dependent conditionals) can be an ongoing pursuit.

He did not elaborate on the means through which this would happen.

Nick expressed "Nick's performance theory" as follows:

> I have said this before but the gist of the Nick-theory is:
>
> Page boundaries are a don't care unless there is a page miss. Page
> misses are so costly that everything else can be ignored, but for sane
> programs they should only be incured at "startup". (Reducing code size
> e.g. no inline only helps here - less pages to load.)

also stating:

> ...I _fundamentally_ believe inlining is nearly always sub-optimal for
> real programs.
>
> But -O3 (or -finline-functions) is there for the folk that want to
> believe the opposite.
>
> And there is -Dinline -D__inline__ for the inline case. What there isn't
> though is -fhash_define-as-inline or -fno-macros so at very least lets
> avoid _that_ path.

At which point a technical discussion ensued involving page faults and
TLBs, and L1 and L2 cache latency and penalties, not really resulting in
any policy decisions but surely enlightening many of us.

And then the discussion seemed to burn itself out... just when it was
getting good!  Believe it or not, most of this took place over just a
couple of days.

But that still leaves us with an INLINE macro that seems to be largely
irrelevant, because, to be useful, inlined functions have to be defined
in the same module that uses them.

So the question boils down to this:

Do we use "inline", or not?

Note that (in my estimation) a "yes" answer should NOT mean "smear the
place with INLINE"; instead it should mean that we use it judiciously
where it seems to give a performance boost.
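
By "judiciously" I mean something like the following distinction
(invented examples; the INLINE stand-in at the top is only there so the
sketch stands alone):

  #ifndef INLINE
  #  define INLINE                 /* stand-in so this sketch compiles alone */
  #endif

  struct Pool { int used; int size; };          /* made up for the example */
  struct Interpreter;                           /* likewise */

  /* Good candidate: trivial accessor on a hot path; the body is smaller
   * than the call overhead. */
  static INLINE int
  pool_is_full(const struct Pool *p)
  {
      return p->used == p->size;
  }

  /* Poor candidate: big, cold-path routine; leaving it a normal call
   * keeps the hot code small. */
  extern void report_gc_statistics(struct Interpreter *interp);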

Have a n

Re: [netlabs #645] [PATCH] Packfile warnings clean-up

2002-05-31 Thread Josh Wilmes


At 23:25 on 05/31/2002 -, Simon Glover (via RT) 
<[EMAIL PROTECTED]> wrote:

> # New Ticket Created by  Simon Glover 
> # Please include the string:  [netlabs #645]
> # in the subject line of all future correspondence about this issue. 
> # http://bugs6.perl.org/rt2/Ticket/Display.html?id=645 >
> 
> 
> 
>  This patch fixes a few "No previous prototype..." warnings in
>  packfile.h, and corrects an obvious bug in the packfile.c
>  documentation.
> 
>  Simon

Applied, thanks.

--Josh




[netlabs #645] [PATCH] Packfile warnings clean-up

2002-05-31 Thread via RT

# New Ticket Created by  Simon Glover 
# Please include the string:  [netlabs #645]
# in the subject line of all future correspondence about this issue. 
# http://bugs6.perl.org/rt2/Ticket/Display.html?id=645 >



 This patch fixes a few "No previous prototype..." warnings in
 packfile.h, and corrects an obvious bug in the packfile.c
 documentation.

 Simon


--- include/parrot/packfile.h.old   Fri May 31 18:05:15 2002
+++ include/parrot/packfile.h   Fri May 31 18:08:12 2002
@@ -162,6 +162,15 @@

 opcode_t PackFile_fetch_op(struct PackFile *pf, opcode_t *stream);

+INTVAL
+PackFile_fetch_iv(struct PackFile *pf, opcode_t *stream);
+
+FLOATVAL
+PackFile_fetch_nv(struct PackFile *pf, opcode_t *stream);
+
+void
+PackFile_assign_transforms(struct PackFile *pf);
+
 /*
 ** Byte Ordering Functions (byteorder.c)
 */


--- packfile.c.old  Fri May 31 18:03:55 2002
+++ packfile.c  Fri May 31 18:04:11 2002
@@ -131,9 +131,9 @@

 /***

-=item fetch_iv
+=item fetch_nv

-Fetch an INTVAL from the stream, converting
+Fetch a FLOATVAL from the stream, converting
 byteorder if needed.

 =cut






[netlabs #644] [PATCH] More tests

2002-05-31 Thread via RT

# New Ticket Created by  Simon Glover 
# Please include the string:  [netlabs #644]
# in the subject line of all future correspondence about this issue. 
# http://bugs6.perl.org/rt2/Ticket/Display.html?id=644 >



 This patch adds tests for the index, depth & intdepth ops, as well
 as adding an extra test for intsave/intrestore.

 Simon

--- t/op/string.t.old   Fri May 31 17:04:18 2002
+++ t/op/string.t   Fri May 31 17:33:25 2002
@@ -1,6 +1,6 @@
 #! perl -w

-use Parrot::Test tests => 77;
+use Parrot::Test tests => 82;

 output_is( <<'CODE', <<'OUTPUT', [...]

--- t/op/stacks.t.old
+++ t/op/stacks.t
@@ ... @@
-use Parrot::Test tests => 29;
+use Parrot::Test tests => 32;
 use Test::More;

 # Tests for stack operations, currently push*, push_*_c and pop*
@@ -589,6 +589,32 @@
 OUTPUT
 }

+output_is(<<'CODE', <<'OUTPUT', "depth op");
+depth I0
+print I0
+print "\n"
+
+save "Foo"
+depth I0
+print I0
+print "\n"
+restore S0
+
+set I1, 0
+LOOP:   save I1
+inc I1
+lt I1, 1024, LOOP
+depth I0
+print I0
+print "\n"
+
+end
+CODE
+0
+1
+1024
+OUTPUT
+
 output_is(<<'CODE', <<'OUTPUT', "intstack");
intsave -1
intsave 0
@@ -622,6 +648,52 @@
 43210-1
 OUTPUT

+output_is(<<'CODE', <<'OUTPUT', "intstack stress test");
+set I0, 0
+LOOP:  intsave I0
+inc I0
+lt I0, 2048, LOOP
+
+LOOP2:  dec I0
+intrestore I1
+ne I0, I1, ERROR
+gt I0, 0, LOOP2
+print "ok\n"
+end
+
+ERROR:  print "Not ok\n"
+end
+
+CODE
+ok
+OUTPUT
+
+output_is(<<'CODE', <<'OUTPUT', "intdepth");
+intdepth I0
+print I0
+print "\n"
+
+intsave 1
+intdepth I0
+print I0
+print "\n"
+intrestore I2
+
+set I1, 0
+LOOP:   intsave I1
+inc I1
+lt I1, 1024, LOOP
+intdepth I0
+print I0
+print "\n"
+
+end
+CODE
+0
+1
+1024
+OUTPUT
+
 ##

 # set integer registers to some value given by $code...





Re: GC, exceptions, and stuff

2002-05-31 Thread Dan Sugalski

At 4:15 PM -0400 5/29/02, Josh Wilmes wrote:
>At 15:22 on 05/29/2002 EDT, Dan Sugalski <[EMAIL PROTECTED]> wrote:
>
>>  I think we'll be safe using longjmp as a C-level exception handler.
>>  I'm right now trying to figure whether it's a good thing to do or
>>  not. (I'd like to unify C and Parrot level exceptions if I can)
>
>Just make sure that we end up with something portable to be able to build
>a miniparrot with just ANSI C.  I assume that's still a design goal.

setjmp and longjmp, for better or worse, are ANSI C, so if we use 
them conservatively (which we will) we're fine.
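
The conservative pattern is basically this (just a sketch -- not the
actual Parrot exception API, and the names are made up):

  #include <setjmp.h>
  #include <stdio.h>

  static jmp_buf handler;       /* real code would keep a stack of these */

  static void
  risky_op(int fail)
  {
      if (fail)
          longjmp(handler, 1);  /* "throw": unwind back to the setjmp point */
      puts("worked");
  }

  int
  main(void)
  {
      if (setjmp(handler) == 0) /* "try": returns 0 on the initial call */
          risky_op(1);
      else                      /* "catch": reached via longjmp */
          puts("caught a C-level exception");
      return 0;
  }
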
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk



Re: ICU and Parrot

2002-05-31 Thread Autrijus Tang

On Sat, Jun 01, 2002 at 02:20:15AM +0900, Dan Kogai wrote:
> >2) If not, would a Encode::ICU be wise?
> I'm not so sure.  But if I were the one to implement Encode::ICU, it 
> will not be just a compiled collection of UCM files but a wrapper to all 
> library functions that ICU has to offer.  I, for one, am too lazy for 
> that.

That would be Text::Uconv's job, wouldn't it? Then Encode::ICU could just
interface to that module instead.

> >3) A number of encodings are in HanExtra but not their ucm repository,
> >   namely big5plus, big5ext and cccii. Is it wise to feed back to them
> >   under the name of e.g. perl-big5plus.ucm?
> You should in time and I should, too, because I have expanded UCM a 
> little so that you can define combined characters commonly seen in 
> Mac*.  But I don't see any reason to be in hurry for the time being.

Understood.

On a related note:

http://www.li18nux.org/docs/html/CodesetAliasTable-V10.html

has spurred quite a bit of discussion in Taiwan because of the mandated
standardization of Big5 => TCA-BIG5, and Big5-HKSCS => HKSCS-BIG5 (i.e.
the standard body first.)  But it struck me as making lots of sense,
if in a rather rigid way.

Should Encode.pm perhaps add them to the Alias table, in the name of
being 'practical'? In particular, supporting CP-xxx (=> CPxxx) and ISO-646-US
(=> US-ASCII) should be rather beneficial.

/Autrijus/





Re: ICU and Parrot

2002-05-31 Thread Melvin Smith


> On Fri, May 31, 2002 at 06:18:55AM +0900, Dan Kogai wrote:
>actually adopted.  Useful it may be I found raw ICM too Big and too
>Blue :)

What's that mean, too Blue? :)

-Melvin





Re: ICU and Parrot

2002-05-31 Thread Dan Kogai

On Saturday, June 1, 2002, at 12:34 AM, Autrijus Tang wrote:
> On Fri, May 31, 2002 at 06:18:55AM +0900, Dan Kogai wrote:
>> As a matter of fact GB18030 is ALREADY supported via Encode::HanExtra 
>> by
>> Autrijus Tang.  The only reason GB18030 was not included in Encode main
>> is sheer size of the map.
>
> Yes, partly because it was not implemented algorithmically. :)
>
> I was browsing http://www-124.ibm.com/cvs/icu/charset/data/ucm/ and 
> toying
> with uconv, and wondered:
>
> 1) Does Encode have (or intend to have) them all covered?

No, unless they appear in www.unicode.org.  Though some of them are 
actually adopted.  Useful as it may be, I found raw ICM too Big and too 
Blue :)

> 2) If not, would a Encode::ICU be wise?

I'm not so sure.  But if I were the one to implement Encode::ICU, it 
will not be just a compiled collection of UCM files but a wrapper to all 
library functions that ICU has to offer.  I, for one, am too lazy for 
that.

> 3) A number of encodings are in HanExtra but not their ucm repository,
>namely big5plus, big5ext and cccii. Is it wise to feed back to them
>under the name of e.g. perl-big5plus.ucm?

You should in time and I should, too, because I have expanded UCM a 
little so that you can define combined characters commonly seen in 
Mac*.  But I don't see any reason to be in hurry for the time being.

If any of you are a member of team ICU you may redirect this dialogue to 
your team so we can work together in future (after 5.8.0, that is).

Dan the Encode Maintainer




Re: ICU and Parrot

2002-05-31 Thread Autrijus Tang

On Fri, May 31, 2002 at 06:18:55AM +0900, Dan Kogai wrote:
> As a matter of fact GB18030 is ALREADY supported via Encode::HanExtra by 
> Autrijus Tang.  The only reason GB18030 was not included in Encode main 
> is sheer size of the map.

Yes, partly because it was not implemented algorithmically. :)

I was browsing http://www-124.ibm.com/cvs/icu/charset/data/ucm/ and toying
with uconv, and wondered:

1) Does Encode have (or intend to have) them all covered?
2) If not, would a Encode::ICU be wise?
3) A number of encodings are in HanExtra but not their ucm repository,
   namely big5plus, big5ext and cccii. Is it wise to feed back to them
   under the name of e.g. perl-big5plus.ucm?

Thanks,
/Autrijus/





RE: inline functions (was Re: [netlabs #629] [PATCH] Memory manager/garbage collector -major revision)

2002-05-31 Thread Brent Dax

Melvin Smith:
# The common way is to define our own INLINE definition and 
# have Configure check for it, define it null if needed, and 
# conditionally include it into a file as extern if so.
# 
# Sounds like a job for... BrentDax++!

We already *have* an INLINE, and it's done with #ifdefs.  :^)

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

blink:  Text blinks (alternates between visible and invisible).
Conforming user agents are not required to support this value.
--The W3C CSS-2 Specification




Re: inline functions (was Re: [netlabs #629] [PATCH] Memorymanager/garbage collector -major revision)

2002-05-31 Thread David M. Lloyd

On Fri, 31 May 2002, Melvin Smith wrote:

>
> The common way is to define our own INLINE definition and have Configure
> check for it, define it null if needed, and conditionally include it
> into a file as extern if so.
>
> Sounds like a job for... BrentDax++!

Also, some compilers (like Sun Workshop) need to be told via a #pragma,
after the function is declared, so maybe a separate macro is also needed.

IMHO, compilers that use non-GCC methods for declaring inline functions
are often forgotten about.

- D

<[EMAIL PROTECTED]>




Re: inline functions (was Re: [netlabs #629] [PATCH] Memory manager/garbagecollector -major revision)

2002-05-31 Thread Melvin Smith


The common way is to define our own INLINE definition and have Configure
check for it, define it null if needed, and conditionally include it into
a file as extern if so.

Sounds like a job for... BrentDax++!
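
Roughly this shape, in other words (only a sketch; the probe name and
where the fallback lands are Configure's business):

  /* in a generated config header */
  #ifdef PARROT_HAS_INLINE          /* hypothetical Configure symbol */
  #  define INLINE inline           /* compiler understands an inline keyword */
  #else
  #  define INLINE                  /* expands to nothing: plain function */
  #endif

  /* usage in a source file */
  static INLINE int
  round_up(int n, int align)
  {
      return ((n + align - 1) / align) * align;
  }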

-Melvin Smith

IBM :: Atlanta Innovation Center
[EMAIL PROTECTED] :: 770-835-6984


   
   
Nicholas Clark <[EMAIL PROTECTED]> wrote on 05/31/2002 07:11 AM
(To: Robert Spier <[EMAIL PROTECTED]>; cc: Jerome Vouillon <[EMAIL PROTECTED]>,
Mike Lambert <[EMAIL PROTECTED]>, [EMAIL PROTECTED];
Subject: inline functions (was Re: [netlabs #629] [PATCH] Memory
manager/garbage collector -major revision)):



On Tue, May 28, 2002 at 07:54:49AM -0700, Robert Spier wrote:
> We've got enough complicated preprocessor issues right now - I'm
> not sure we want to add another one.  Defining perl5ish macros
> will cause too many troubles down the road.
>
> Or... since C99 supports C function inlining (iirc) - we could
> just rely on a C99 compiler

Many compilers have inline functions in C already. I doubt many are
reliably
C99 enough for us to use them.

[Hell, C89 is still causing some vendors problems, although there is
progress: Solaris 9 will have a conformant fflush()]

Is there any easy clean way we can write functions that will be inlined on
C compilers with inline, but will still work on other compilers
(possibly by (erk) a preprocessing stage to pull out all the inline
definitions to another file, and compile them as conventional functions) ?

That way, we'd get the speed hit we desired on most platforms, but the code
would still run everywhere.
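
Something like this, maybe (a sketch only, with made-up macro and file
names):

  /* inline_funcs.h: function bodies live here either way */
  #ifdef CAN_INLINE                     /* hypothetical Configure symbol */
  #  define MAYBE_INLINE static inline  /* every includer gets an inlinable copy */
  #else
  #  define MAYBE_INLINE extern         /* bodies compiled exactly once, below */
  #endif

  #if defined(CAN_INLINE) || defined(INLINE_FUNCS_C)
  MAYBE_INLINE int
  parrot_round_up(int n, int align)
  {
      return ((n + align - 1) / align) * align;
  }
  #else
  extern int parrot_round_up(int n, int align);
  #endif

  /* inline_funcs.c: built only when the compiler can't inline; it just
   * does
   *     #define INLINE_FUNCS_C
   *     #include "inline_funcs.h"
   * to get one ordinary extern definition of each function. */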

Nicholas Clark