Re: Bytecode metadata

2003-01-25 Thread Leopold Toetsch
Nicholas Clark wrote:


On Thu, Jan 23, 2003 at 02:48:38PM -0800, Brent Dax wrote:

	struct Chunk {
		opcode_t type;
		opcode_t version;
		opcode_t size;
		void data[];
	};



I agree with the roughly bit, but I'd suggest ensuring that you put
in enough bits to get data[] 64 bit aligned. 



If there's a directory of some sort, it should record the type ID and
the offset to the beginning of the chunk.  


Putting this together, and inserting an Id field above, would give 
alignment on a 64 bit boundary for data in PBC - assuming the strings, 
data, ... are also N*64 bit wide.
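
As a rough illustration of the alignment point (a sketch only, not the actual Parrot packfile layout): with four header fields in front of data[], the payload starts on a 64 bit boundary whether opcode_t is 32 or 64 bits wide. The struct name chunk_header, the position of the id field, the 32 bit opcode_t and the padded_words() helper are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a PBC file written with 32-bit opcodes. */
typedef int32_t opcode_t;

/* Hypothetical header: four fields, so data[] starts at byte 16. */
struct chunk_header {
    opcode_t type;
    opcode_t id;       /* the extra field suggested above */
    opcode_t version;
    opcode_t size;     /* number of opcode_t words in data[] */
    opcode_t data[];   /* C99 flexible array member */
};

/* Returns 1 if data[] begins on a 64-bit boundary within the chunk. */
static int data_is_64bit_aligned(void)
{
    return offsetof(struct chunk_header, data) % 8 == 0;
}

/* Rounds a word count up so the following chunk is 64-bit aligned too. */
static size_t padded_words(size_t words)
{
    size_t per_unit = 8 / sizeof(opcode_t);   /* words per 64 bits */
    return (words + per_unit - 1) / per_unit * per_unit;
}
```

With 32 bit opcodes a three-word payload would be padded to four words, which is the "N*64 bit wide" condition mentioned above.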


It might be useful for making portable fat bytecode.



As I stated, I changed all sizes/offsets to be opcode_t. Of course this 
breaks reading 32 bit PBC on machines with 64 bit opcode_t - but this 
was already broken before, e.g.:

   header->magic = PackFile_fetch_op(self, cursor++);

If we want this portable, it probably should look like

   header->magic = PackFile_fetch_op(self, &cursor);

where the _fetch_xx has to advance the cursor by the PBC defined wordsize.

A _fetch_cstring and a _fetch_n_opcodes would also be handy. And for the 
latter, if the packfile is mmap()ed, it shouldn't fetch anything, but 
just set up the code pointer, advance the cursor, and remember that the 
code_segment->code field had better not be freed at destroy time.
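
A minimal sketch of a fetch routine that advances the cursor by the word size recorded in the file rather than by the host's sizeof(opcode_t). It is not the real PackFile_fetch_op: the pbc_reader struct, the fetch_op name and its signature are assumptions, and byte order is assumed to match the host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader state; wordsize would come from the PBC header. */
struct pbc_reader {
    size_t wordsize;   /* 4 or 8 bytes per opcode in the file */
};

/*
 * Reads one opcode at *cursor and advances *cursor by the file's word
 * size, not by the host's sizeof(opcode_t). Byte order is assumed to
 * match the host; a real reader would also handle byte swapping.
 */
static int64_t fetch_op(const struct pbc_reader *r,
                        const unsigned char **cursor)
{
    int64_t value = 0;

    if (r->wordsize == 4) {
        int32_t v32;
        memcpy(&v32, *cursor, sizeof v32);
        value = v32;                 /* sign-extend to the host width */
    } else if (r->wordsize == 8) {
        memcpy(&value, *cursor, sizeof value);
    }
    *cursor += r->wordsize;
    return value;
}
```

Because the caller passes the address of its cursor, the same call works for 32 bit and 64 bit files without any cursor++ at the call site.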


I'm thinking that register usage information from imcc could be of use
to the JIT, as that would save it having to work out things again. So that
probably needs a segment.



Yep. imcc does the whole CFG and life analysis, which JIT is doing 
again. At least basic blocks and register usage could be passed. Though 
register life range in JIT is different and depends on $arch. Calling 
(JIT) external functions ends a register's life, so it must be saved 
before calling and restored after.


Also some way of storing a cryptographic signature in the file, so that you
could compile a parrot that automatically refuses to load code that isn't
signed by you.



The palladium parrot :)



Juergen Boemmels wrote:




It might be even possible to dump the jitted code.



If you are then able to get the same memory layout for a newly 
created interpreter, it might even run ;-)


So the JITted code contains lots of hard references to addresses in the
running interpreter? It's not just dependent on that particular binary's
layout?



JIT/i386 does call parrot functions directly, e.g. pmc_new_noinit or 
string_make, so these would need relocation - or, probably slightly 
slower but simpler to handle, a jump table. We (all JIT $arch?) have at 
least one register pointing to parrot data. Including a jump table there 
for the parrot functions that are used would do it.
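
The jump table idea can be sketched in plain C as below. This is only an illustration of the technique: interp_data, the slot names and the toy helpers are assumptions and stand in for the real interpreter structure and for functions such as pmc_new_noinit, whose actual signatures are different.

```c
#include <stddef.h>

/* Hypothetical helper signature; real Parrot functions take other arguments. */
typedef long (*helper_fn)(long);

enum helper_slot { SLOT_DOUBLE, SLOT_NEGATE, SLOT_COUNT };

/* Stand-in for the interpreter data that a JIT register points at. */
struct interp_data {
    helper_fn jump_table[SLOT_COUNT];
};

static long helper_double(long x) { return 2 * x; }
static long helper_negate(long x) { return -x; }

/* Filled in once per interpreter, so dumped code needs no relocation. */
static void init_jump_table(struct interp_data *interp)
{
    interp->jump_table[SLOT_DOUBLE] = helper_double;
    interp->jump_table[SLOT_NEGATE] = helper_negate;
}

/* What emitted code would do: an indirect call through a fixed slot. */
static long call_helper(const struct interp_data *interp,
                        enum helper_slot slot, long arg)
{
    return interp->jump_table[slot](arg);
}
```

The emitted code then only depends on slot numbers, at the cost of one extra indirection per call.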


I guess in future once the normal JIT works, and we've got the pigs flying
nicely then it would be possible to write a Not Just In Time compiler that
saves out assembly code and relocation instructions.

Bah. That's parrot -o foo.o foo.pmc isn't it?



*g*



Nicholas Clark



leo




Re: Arc: An Unfinished Dialect of Lisp

2003-01-25 Thread Andy Wardley
Adam Turoff wrote:
 The problem with cons/car/cdr is that they're fundamental operations.
 Graham *has* learned from perl, and is receptive to the idea that
 fundamental operators should be huffman encoded (lambda -> fn).  It
 would be easy to simply rename car/cdr to first/rest, but that loses
 the huffman nature of car/cdr.  

Good point, but I can't help thinking that list/head/tail or list/item/rest
(for example) would be preferable to cons/car/cdr.  More meaning at the cost
of a character or two.

I doubt there are many people who remember, ever knew or even care that 
car is Contents of the Address part of the Register and cdr is 
Contents of the Decrement part of the Register (yes, I had to look 
them up :-).  Even when you know what the acronyms stand for, they still
don't make a great deal of sense.

A




Re: Bytecode metadata

2003-01-25 Thread Nicholas Clark
On Sat, Jan 25, 2003 at 10:26:22AM +0100, Leopold Toetsch wrote:
 Nicholas Clark wrote:

 Also some way of storing a cryptographic signature in the file, so that you
 could compile a parrot that automatically refuses to load code that isn't
 signed by you.
 
 
 The palladium parrot :)

naa. I said signed by you, not signed by the RIAA^WMPAA^WMicrosoft

Nicholas Clark



Re: Bytecode metadata

2003-01-25 Thread Leopold Toetsch
Dan Sugalski wrote:


At 5:32 PM + 1/24/03, Dave Mitchell wrote:


I just wrote a quick C program that successfully mmap-ed in all 1639
files in my Linux box's /usr/share/man/man1 directory.



Linux is not the universe, though. 


I have changed it to mmap() the bytecode (other segments, which have a 
similar layout, i.e. size and opcode_t[size], will be mmapped too).

If mmap()ing the packfile fails, there is a fallback to reading it via IO.
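
For illustration, a mmap-with-fallback loader might look roughly like this. It is a sketch using only POSIX calls, not the code in packfile.c; the loaded_file struct and the load_file()/unload_file() names are made up for the example. The is_mmapped flag records which release function applies, which is the same concern as not freeing a mapped code segment at destroy time.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of how a file was loaded. */
struct loaded_file {
    void  *data;
    size_t size;
    int    is_mmapped;   /* 1: release with munmap(), 0: release with free() */
};

/* Returns 0 on success, -1 on failure (empty files are rejected). */
static int load_file(const char *path, struct loaded_file *out)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    out->size = (size_t)st.st_size;
    out->data = mmap(NULL, out->size, PROT_READ, MAP_SHARED, fd, 0);
    out->is_mmapped = (out->data != MAP_FAILED);

    if (!out->is_mmapped) {
        /* Fallback: read the whole file into heap memory. */
        size_t done = 0;
        out->data = malloc(out->size);
        while (out->data != NULL && done < out->size) {
            ssize_t n = read(fd, (char *)out->data + done, out->size - done);
            if (n <= 0) {
                free(out->data);
                out->data = NULL;
            } else {
                done += (size_t)n;
            }
        }
    }
    close(fd);   /* a mapping stays valid after the descriptor is closed */
    return out->data != NULL ? 0 : -1;
}

static void unload_file(struct loaded_file *f)
{
    if (f->is_mmapped)
        munmap(f->data, f->size);
    else
        free(f->data);
}
```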

leo



Re: Bytecode metadata

2003-01-25 Thread Leopold Toetsch
Nicholas Clark wrote:


On Sat, Jan 25, 2003 at 10:26:22AM +0100, Leopold Toetsch wrote:




The palladium parrot :)



naa. I said signed by you, not signed by the RIAA^WMPAA^WMicrosoft



Yes, of course. I would do this with a personalized version of 
fingerprint.c and generate a separate executable.


Nicholas Clark



leo







Re: Bytecode metadata

2003-01-25 Thread Sean O'Rourke
On Sat, 25 Jan 2003, Leopold Toetsch wrote:
 Dan Sugalski wrote:

  At 5:32 PM + 1/24/03, Dave Mitchell wrote:
 
  I just wrote a quick C program that successfully mmap-ed in all 1639
  files in my Linux box's /usr/share/man/man1 directory.
 
 
  Linux is not the universe, though.

How true.  On Solaris, for example, mmap's are aligned on 64k boundaries,
which leads to horrible virtual address space consumption when you map
lots of small things.  If we're mmap()ing things, we want to be sure
they're fairly large.

/s




Re: Bytecode metadata

2003-01-25 Thread Jason Gloudon
On Thu, Jan 23, 2003 at 08:39:21PM +, Dave Mitchell wrote:

 This means that a Perl server that relies on a lot of modules, and which
 forks for each connection (imagine a Perl-based web server), doesn't
 consume acres of swap space just to have an in-memory image per Perl
 process, of all the modules.

Are you sure the swap space allocation isn't mostly attributable to the poor
locality in the Perl process's data structures?

-- 
Jason



AUTOLOADED pre- and post- handler methods?

2003-01-25 Thread Dan Sugalski
Here's a question for the python/ruby folks.

I know (now) that python lets you have an interceptor method that 
gets called before a named method is called even. Does it allow this 
method to be generated by the generic fallback method?

In Perl terms, assume we have a method PRE that gets called before 
any method in a class is called, and AUTOLOAD which is called if you 
call a method on a class and that method doesn't exist. Does AUTOLOAD 
have to get called to check for PRE if PRE doesn't exist in a class?
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: AUTOLOADED pre- and post- handler methods?

2003-01-25 Thread Christopher Armstrong
Dan Sugalski writes:
 I know (now) that python lets you have an interceptor method that gets
 called before a named method is called even. Does it allow this method
 to be generated by the generic fallback method?

Python doesn't really have interceptor methods. In fact, there
aren't any magic methods that are called specifically on method
calling -- only attribute access (well, except for __call__, which
happens when you do obj(), but that's a separate step in the process).


__getattribute__(self, name) -- Always called on attribute access.

__getattr__(self, name) -- Called after looking up the attribute
                           fails by regular means.

__get__(self, obj, type=None) -- Called on an object when that object
                                 is accessed as an attribute of
                                 another object. f34r this. :-)

Any of which, of course, can return methods. :-) btw, remember how
Python creates `bound methods' when you access a method through an
instance? It's implemented with (the C equivalent of) __get__ (As of
Python 2.2). More information related to that (and other useful
information in general) is here:
http://www.python.org/2.2.2/descrintro.html.

Btw, these all have `set' and `del' friends, but they don't always
perfectly match up with the `get's... I'll get back to you on that.

-- 
 Twisted | Christopher Armstrong: International Man of Twistery
  Radix  |  Release Manager,  Twisted Project
-+ http://twistedmatrix.com/users/radix.twistd/



Re: Bytecode metadata

2003-01-25 Thread Dave Mitchell
On Sat, Jan 25, 2003 at 06:18:47AM -0800, Sean O'Rourke wrote:
 On Sat, 25 Jan 2003, Leopold Toetsch wrote:
  Dan Sugalski wrote:
 
   At 5:32 PM + 1/24/03, Dave Mitchell wrote:
  
   I just wrote a quick C program that successfully mmap-ed in all 1639
   files in my Linux box's /usr/share/man/man1 directory.
  
  
   Linux is not the universe, though.
 
 How true.  On Solaris, for example, mmap's are aligned on 64k boundaries,
 which leads to horrible virtual address space consumption when you map
 lots of small things.  If we're mmap()ing things, we want to be sure
 they're fairly large.

Okay, I just ran a program on a Solaris machine that mmaps in each
of 571 man files 20 times (a total of 11420 mmaps). The process size
was 181MB, but the total system swap available only decreased by 1.2MB
(since files mmapped in RO effectively don't consume swap).

I think Solaris and Linux can both cut this. If other OSes can't, then
we fall back to reading in the file when necessary.

-- 
Lady Nancy Astor: If you were my husband, I would flavour your coffee
with poison.
Churchill: Madam - if I were your husband, I would drink it.



Re: Bytecode metadata

2003-01-25 Thread Dave Mitchell
On Sat, Jan 25, 2003 at 10:04:37AM -0500, Jason Gloudon wrote:
 On Thu, Jan 23, 2003 at 08:39:21PM +, Dave Mitchell wrote:
 
  This means that a Perl server that relies on a lot of modules, and which
  forks for each connection (imagine a Perl-based web server), doesn't
  consume acres of swap space just to have an in-memory image per Perl
  process, of all the modules.
 
 Are you sure the swap space allocation isn't mostly attributable to the poor
 locality in the Perl process's data structures?

I was using swap space as a loose term to mean virtual memory consumption
- i.e. that resource which necessitates buying more RAM and/or swap disks.
The locality wasn't a problem.

-- 
A walk of a thousand miles begins with a single step...
then continues for another 1,999,999 or so.



Re: Bytecode metadata

2003-01-25 Thread Nicholas Clark
On Sat, Jan 25, 2003 at 11:43:40PM +, Dave Mitchell wrote:
 On Sat, Jan 25, 2003 at 06:18:47AM -0800, Sean O'Rourke wrote:
  On Sat, 25 Jan 2003, Leopold Toetsch wrote:
   Dan Sugalski wrote:
  
At 5:32 PM + 1/24/03, Dave Mitchell wrote:
   
I just wrote a quick C program that successfully mmap-ed in all 1639
files in my Linux box's /usr/share/man/man1 directory.
   
   
Linux is not the universe, though.

There's always NetBSD if Linux won't run on your hardware :-)
<ducks>

  How true.  On Solaris, for example, mmap's are aligned on 64k boundaries,
  which leads to horrible virtual address space consumption when you map
  lots of small things.  If we're mmap()ing things, we want to be sure
  they're fairly large.
 
 Okay, I just ran a program on a Solaris machine that mmaps in each
 of 571 man files 20 times (a total of 11420 mmaps). The process size
 was 181MB, but the total system swap available only decreased by 1.2MB
 (since files mmapped in RO effectively don't consume swap).

11420 simultaneous mmaps in the same process? (just checking that I
understand you)

 I think Solaris and Linux can both cut this. If other OSes can't, then
 we fall back to reading in the file when necessary.

Maybe I'm paranoid (or even plain wrong) but we (parrot) can handle it
if an mmap fails - we just automatically fall back to plain file loading.
Can dlopen() cope if an mmap fails? Or on a platform which can only
do a limited number of mmaps do we run the danger of exhausting them early
with all our bytecode segments, and then the first time someone attempts
a require POSIX; it fails because the perl6 DynaLoader can't dlopen
POSIX.so? (And by then we've done our could-have-been-plain-loaded
mmaps, so it's too late to adapt)

Nicholas Clark




Re: Bytecode metadata

2003-01-25 Thread Dave Mitchell
On Sun, Jan 26, 2003 at 12:40:19AM +, Nicholas Clark wrote:
 On Sat, Jan 25, 2003 at 11:43:40PM +, Dave Mitchell wrote:
  Okay, I just ran a program on a Solaris machine that mmaps in each
  of 571 man files 20 times (a total of 11420 mmaps). The process size
  was 181MB, but the total system swap available only decreased by 1.2MB
  (since files mmapped in RO effectively don't consume swap).
 
 11420 simultaneous mmaps in the same process? (just checking that I
 understand you)

yep, exactly that. Src code included below.

 Maybe I'm paranoid (or even plain wrong) but we (parrot) can handle it
 if an mmap fails - we just automatically fall back to plain file loading.
 Can dlopen() cope if an mmap fails? Or on a platform which can only
 do a limited number of mmaps do we run the danger of exhausting them early
 with all our bytecode segments, and then the first time someone attempts
 a require POSIX; it fails because the perl6 DynaLoader can't dlopen
 POSIX.so? (And by then we've done our could-have-been-plain-loaded
 mmaps, so it's too late to adapt)

If there's such a platform, then presumably we don't bother mmap at all
for that platform.


to run: cd to a man directory, then C</tmp/foo *>


#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int i, j;
    int fd;
    off_t size;
    void *p;
    struct stat st;

    /* map every file named on the command line, 20 times over */
    for (j = 0; j < 20; j++) {
        for (i = 1; i < argc; i++) {
            fd = open(argv[i], O_RDONLY);
            if (fd == -1) {
                perror("open"); exit(1);
            }
            if (fstat(fd, &st) == -1) {
                perror("fstat"); exit(1);
            }
            size = st.st_size;
            /* printf("%d %5ld %s\n", i, (long)size, argv[i]); */

            p = mmap(0, size, PROT_READ, MAP_SHARED, fd, 0);
            if (p == MAP_FAILED) {
                perror("mmap"); exit(1);
            }

            /* the mapping stays valid after the descriptor is closed */
            close(fd);
        }
        printf("done loop %d\n", j);
    }
    /* keep the process alive so its size can be inspected */
    sleep(1000);

    return 0;
}

-- 
But Sidley Park is already a picture, and a most amiable picture too.
The slopes are green and gentle. The trees are companionably grouped at
intervals that show them to advantage. The rill is a serpentine ribbon
unwound from the lake peaceably contained by meadows on which the right
amount of sheep are tastefully arranged. Lady Croom - Arcadia



Re: L2R/R2L syntax

2003-01-25 Thread Damian Conway
Michael Lazzaro wrote:


When I come home from work each day, I can see my dog eagerly waiting at 
the window, just black snout and frenetically wagging tail visible over 
the sill.

I often think Larry and Damian must feel that way about this group.  
Poor, comical beasts, but so eager and well-meaning.  We greet them so 
enthusiastically when they've arrived, it's hard for them to get too mad 
at us.  Even when they discover we've peed on the carpet while they've 
been gone, and they have an awful mess to clean up.

Whilst I won't speak for Larry in this, *I* certainly don't feel anything
like that. (And, in truth, I know Larry well enough to be sure that he doesn't
either.)

Many of the members of this forum are highly talented and insightful
contributors. Many of the ideas expressed here are worthy of serious
consideration.

For example: Angel Faus had a brilliant suggestion for cleaning up higher
order functions, which we instantly adopted. And Luke Palmer's
reinterpretation of junction semantics was clearly superior to my original,
and will almost certainly be used. I could dig up plenty of other examples of
similar contributions.

And, with very few exceptions, the rest of the contributors -- though their
ideas are not always feasible, elegant, practical, or sometimes even sane ;-)
-- do still contribute their time and energies just as generously and with
just as deep a desire to make Perl 6 as good as it can possibly be.

Yes, there is a lot of tail chasing on this group, and often it only ends when
Larry or I propose our own resolution. Yes, I sometimes choose to ignore a
thread I see as going nowhere. But without that tail-chasing and dead-ending
we mightn't see the underlying problem they're attempting to address in the
first place. And we'd have to explore all the non-optimal alternative
solutions ourselves.

These are all genuine contributions to the design of Perl 6, and command
nothing but my highest respect.

Damian





Re: TPF donations (was Re: L2R/R2L syntax [x-adr][x-bayes])

2003-01-25 Thread Damian Conway
David Storrs wrote:


Correct me if I'm wrong, but isn't the one thing that all those
projects have in common...well...Perl?  And isn't Larry the guy to
whom we owe the existence of Perl?  I'm not fortunate enough to be
using Perl in my job, but I'm still more than happy to pony up for a
donation, purely from gratitude. 

This is something along the lines of the applied research vs basic
research question.  What Larry is doing pretty much amounts to basic
research that will help all of these other projects in the long run.

This is my view as well, but I can understand that it may not be everyone's
(including the TPF's).

At the moment, TPF is preparing to set up an on-line questionnaire to
get feedback from the community on what its priorities should be.

Then we will all have our chance to have our say.

Damian





Re: L2R/R2L syntax

2003-01-25 Thread Damian Conway
Graham Barr wrote:


This is not a for or against, but there is something that has been
bugging me about this.

Currently in Perl5 it is possible to create a sub that has map/grep-like
syntax, take a look at List::Util

If the function form of map/grep were to be removed, which has been suggested,
and the ~ form maps to methods. How would you go about defining a utility
module similar to List::Util that uses the same syntax as map/grep but
without making the subs methods on the global ARRAY package ?


As far as I know Larry is not planning to remove the functional
forms of C<map>, C<grep>, etc.

Those forms may, it's true, become mere wrappers for the OO forms.
But I confidently expect they will still be available.

Damian





Re: An ignorant opinion from an amateur [was: Re: Civility, please]

2003-01-25 Thread Damian Conway
Sam Vilain wrote:


To me what's missing stands out like a sore thumb - making sure that a
package/class definition can express all the same primitive elements
used by the current emerged standard for modelling data sets - UML.


The design group is currently considering the entire issue of class metadata.
Jarkko has some very solid ideas on what's needed and how it would work.
I suspect we'll see a set of standardized properties that encode the requisite
information.

Damian





Re: Bytecode metadata

2003-01-25 Thread Sean O'Rourke
On Sat, 25 Jan 2003, Dave Mitchell wrote:
 On Sat, Jan 25, 2003 at 06:18:47AM -0800, Sean O'Rourke wrote:
  On Sat, 25 Jan 2003, Leopold Toetsch wrote:
   Dan Sugalski wrote:
  
At 5:32 PM + 1/24/03, Dave Mitchell wrote:
   
I just wrote a quick C program that successfully mmap-ed in all 1639
files in my Linux box's /usr/share/man/man1 directory.
   
   
Linux is not the universe, though.
 
  How true.  On Solaris, for example, mmap's are aligned on 64k boundaries,
  which leads to horrible virtual address space consumption when you map
  lots of small things.  If we're mmap()ing things, we want to be sure
  they're fairly large.

 Okay, I just ran a program on a Solaris machine that mmaps in each
 of 571 man files 20 times (a total of 11420 mmaps). The process size
 was 181MB, but the total system swap available only decreased by 1.2MB
 (since files mmapped in RO effectively don't consume swap).

The problem's actually _virtual_ memory use/fragmentation, not physical
memory or swap.  Say you map in 10k small files -- that's 640M of virtual
memory, just over a fourth of what's available.  Now let's say you're also
using mmap() in your webserver to send large (10MB) files quickly over the
network.  The small files, if they're long-lived, get scattered all over
VA-space, so there's a non-trivial chance that the OS won't be able to
find a 10MB chunk of free addresses at some point.
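
The arithmetic behind that estimate, as a small sketch: if every mapping occupies a whole number of 64k slots, 10,000 small files take 10,000 x 65,536 = 655,360,000 bytes of address space, roughly the 640M quoted above. The function names and the 3000 byte file size used in the note below are assumptions for illustration only.

```c
#include <stddef.h>

/* Rounds size up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) / align * align;
}

/*
 * Address space used by count mappings of file_size bytes each, if every
 * mapping occupies a whole number of align-sized slots.
 */
static unsigned long long va_consumed(unsigned long long count,
                                      size_t file_size, size_t align)
{
    return count * (unsigned long long)round_up(file_size, align);
}
```

For example, va_consumed(10000, 3000, 65536) gives the 64k-slot figure, while the same files placed on 4096 byte page boundaries would need 40,960,000 bytes.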

To see it, you might try changing your program to map and unmap a large
file periodically while mapping the man pages.  Then take a look at the
process's address space with /usr/proc/bin/pmap to see what the OS is
doing with the maps.

Weird, I know, but that's why it stuck in my mind.  You have to map quite
a few files to get this to happen, but it's a real possibility with a
32-bit address space and a long-running process that does many small
mmap()s and some large ones.

Anyways...

/s