Re: This week's summary

2003-07-03 Thread Alan Burlison
Dan Sugalski wrote:

The more I think about this the more I want to punt on the whole idea. 
Cross-platform async IO is just one big swamp.
Agreed.  Glug, glug, glug ;-)

--
Alan Burlison
--


Re: async i/o (was Re: This week's summary)

2003-07-03 Thread Alan Burlison
Uri Guttman wrote:

who here will be at oscon (or yapc::eu)? i would like to get a small BOF
going on this subject. i agree it is a morass but i have some ideas and
i know dan has plenty. but we had better learn to swim these swamps and
not get eaten by the gators. we can drain them, convert them to dry
deserts and make them safe for camel caravans. :)
I'll be at yapc::eu

--
Alan Burlison
--


Re: This week's summary

2003-07-02 Thread Alan Burlison
Dan Sugalski wrote:

Unfortunately given what the code does we can't use mutexes, since 
they're not interrupt-safe, which I'm not particularly happy about. The 
queues in question are thread-specific, though, which takes some of the 
danger out of things. I fully expect to have to do more work than just 
disabling optimizations to disable reordering to make this function 
right everywhere. (I know, I know, platform-independent interrupt code 
is just not doable, but...)
Hmm, I don't think the issue is that mutexes aren't interrupt-safe, I think 
the issue is that if an interrupt routine is reentered it may already be 
holding a mutex and therefore self-deadlock.  The Solaris attributes(5) 
manpage says this about Async-Signal-Safe routines (which I believe is what 
you are trying to do):

 Async-Signal-Safe
   Async-Signal-Safe refers to  particular  library  rou-
   tines that can be safely called from a signal handler.
   A thread that is executing an  Async-Signal-Safe  rou-
   tine will not deadlock with itself if interrupted by a
   signal. Signals are only a problem  for  MT-Safe  rou-
   tines that acquire locks.
   Signals  are  disabled  when  locks  are  acquired  in
   Async-Signal-Safe  routines.  This  prevents  a signal
   handler that might acquire the same  lock  from  being
   called.
Personally, I'd much prefer to use platform-provided interrupt-safe 
queueing mechanisms, and we will in those places where it's available. I 
know it *is* available on VMS, and *isn't* available on OS X and Linux. 
I'm also very painfully aware of some of the issues that need to be 
dealt with for processors with internal read and write reordering, which 
isn't anywhere near fun to deal with. (Well, OK, it is, but I'm weird 
that way)

What I'd really like is a nice, portable, insqti/remqhi implementation, 
but barring that (since I'm not going to get it) something as close as 
possible.
I suggest you disable signals during the queue operations, take out the 
lock, do the work, drop the lock then reenable signals.

--
Alan Burlison
--


Re: This week's summary

2003-06-30 Thread Alan Burlison
Piers Cawley wrote:

  Small Perl task for the interested
Want to get involved in the Parrot development process? Don't know much
about Virtual Machine design and implementation? Do know Perl? Dan has a
small but interesting task for you.
At present, Parrot gets built without any compiler level optimizations
turned on because files like tsq.c can't have any optimizations turned
on (tsq.c is the thread safe queue module, which is annoyingly
execution-order-dependent because it has to operate safely as interrupt
code potentially interrupting itself).
Dan would like a version of Configure.pl which can build a makefile (or
whatever build tool we end up using) with per-C-file compiler flags, and
it needs to be possible to override those flags, per file, by the
platform configuration module.
Hmm, I'm only a lurker, but that looks *very* suspect to me.  Some compilers 
may choose to reorder even without optimization turned on.  I'd say that it 
is a bug in Parrot if it requires optimization to be off for this code - how 
many different compilers have you tried?
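If ordering really must be pinned down, the usual trick is an explicit barrier rather than turning optimization off. A GCC/Clang-specific sketch (the macro name is invented; other compilers spell this differently, and reordering hardware needs a CPU barrier such as __sync_synchronize() as well):

```c
/* Forbid the compiler from moving loads/stores across this point,
 * independent of -O level (GCC/Clang inline-asm syntax). */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static volatile int data;
static volatile int ready;

void producer(int value)
{
    data = value;
    COMPILER_BARRIER();     /* 'data' must be written before 'ready' */
    ready = 1;
}

int consumer(void)
{
    while (!ready)
        ;                   /* spin until the producer raises the flag */
    COMPILER_BARRIER();
    return data;            /* sees the value written before the flag */
}
```

With the barriers in place the file can be compiled with full optimization like everything else.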

--
Alan Burlison
--


Re: This week's summary

2003-06-30 Thread Alan Burlison
Rafael Garcia-Suarez wrote:

Hmm, I'm only a lurker, but that looks *very* suspect to me.  Some compilers 
may choose to reorder even without optimization turned on.  I'd say that it 
is a bug in Parrot if it requires optimization to be off for this code - how 
many different compilers have you tried?
That doesn't make this per-C-file-cc-option-tweaking necessarily
unuseful. Perl 5 uses something similar, because the lexer is sometimes
miscompiled by some compilers at high optimization levels. Example:
see the hints files and look up XXX_cflags or toke_cflags in
hints/README.hints.
I'm not saying it isn't useful - per-compiler workarounds for brokenness are 
one thing, but the implication was that this tweakage was needed for *all* 
compilers for those particular files, which spells 'broken' in my book.  If 
the code makes assumptions about execution order without using the necessary 
mutexes/cvs to enforce these assumptions, it is very unlikely to work on 
large SMP machines, for example.

--
Alan Burlison
--


Re: Butt-ugliness reduction

2001-11-19 Thread Alan Burlison

[EMAIL PROTECTED] wrote:

  There is.  You can't necessarily convert on the fly - perl5 allows
  dual-typed SVs where the string and number aren't necessarily
  interchangeable versions of each other.
 
 Ahem, I was asking about int and num, not num and string :-)

Oops - so you were.  Soz!

-- 
Alan Burlison
--
$ head -1 /dev/bollocks
visioneer 24/365 niches



Re: Size of integer register vs sizeof(void *)

2001-11-19 Thread Alan Burlison

Segher Boessenkool wrote:

  Are there any cases where a void * cannot be placed into an integer
  register?  It seems like it shouldn't happen, especially since jump and
  jsr are supposed to take an integer register and they point to a
  host-machine-address...

Yes, all platforms that support LP64, i.e. virtually all 64-bit address
space platforms.

Sparc is just one example.

-- 
Alan Burlison
--
$ head -1 /dev/bollocks
refactor service-led meta-services, going forwards



Re: Butt-ugliness reduction

2001-11-17 Thread Alan Burlison

Dave Mitchell wrote:

 * Is there ever any need to for a PMC type which holds both an int and a
 num? In the Perl 5 case we were constrained by code that expected to always
 find an int (or a num) in a fixed slot in the SV; with PMCs, all access
 to these slots is via methods, so an int-num or num-int conversion can
 be done on the fly.

There is.  You can't necessarily convert on the fly - perl5 allows
dual-typed SVs where the string and number aren't necessarily
interchangeable versions of each other.  I use this ability, for example, to
map C #define constants to perl SVs that behave like the symbolic value in
a string context and the numeric value in a numeric context.  e.g. given
this in a header file:

#define FOO 2
#define BAR 4

and mapping those to similarly named perl variables

my $fb = $foo | $bar;

assigns 6 to $fb, but

print("$foo | $bar\n");

prints out "FOO | BAR"

This is a really useful feature because it means that you don't need huge
lookup tables to convert from the numeric to the string version of a
constant - it is both at the same time.
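The same trick can be modelled outside perl as a value that carries both representations at once, with context choosing which to use. A self-contained sketch (this is not perl5's actual SV layout; the type and names are invented):

```c
#include <stdio.h>

/* A value that is simultaneously a number and a string, like the
 * dual-typed SVs described above; context picks the representation. */
typedef struct {
    long        num;    /* numeric view, e.g. 2     */
    const char *str;    /* string view,  e.g. "FOO" */
} dualvar;

static const dualvar FOO = { 2, "FOO" };
static const dualvar BAR = { 4, "BAR" };

/* numeric context: combine the numbers */
long dual_or(dualvar a, dualvar b)
{
    return a.num | b.num;
}

/* string context: combine the names; returns the formatted length */
int dual_str(dualvar a, dualvar b, char *buf, size_t len)
{
    return snprintf(buf, len, "%s | %s", a.str, b.str);
}
```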

-- 
Alan Burlison
--
$ head -1 /dev/bollocks
immutably engage omnipresent server-centric drivers



Re: Revamping the build system

2001-11-14 Thread Alan Burlison

Paolo Molaro wrote:

  And if we have to use make, then we're back with the very problems of portably
  calling compilers and so on that this supposed new build system was meant to
  avoid.
 
 I'm going to bite and say the words (and get the flames).
 
 autoconf automake libtool

Has anyone said 'Jam' on this thread yet?  Jam is a platform-independent
make replacement, see http://www.perforce.com/jam/jam.html 

-- 
Alan Burlison
--
$ head -1 /dev/bollocks
build high-visibility pervasive knowledge capital



Re: SV: Parrot multithreading?

2001-09-28 Thread Alan Burlison


  or have entered a mutex,
 
 If they're holding a mutex over a function call without a
 _really_ good reason, it's their own fault.

Rubbish.  It is common to take out a lock in an outer function and then
call several other functions under the protection of the lock.

   The alternative is that _every_ function simply return
  a status, which
   is fundamentally expensive (your real retval has to be
  an out
   parameter, to start with).

Are we talking 'expensive in C' or 'expensive in parrot?'

 It is also slow, and speed is priority #1.

As far as I'm aware, trading correctness for speed is not an option.

-- 
Alan Burlison
--
$ head -1 /dev/bollocks
effectively incubate innovative network infrastructures



Re: SV: Parrot multithreading?

2001-09-28 Thread Alan Burlison

Benjamin Stuhl wrote:

 Again, having a GC makes things easier - we clean up
 anything we lost in the GC run. If they don't actually work
 (are there any platforms where they don't work?), we can
 always write our own ;-).

I eagerly await your design for a mutex and CV garbage collector.

-- 
Alan Burlison
--
$ head -1 /dev/bollocks
systematically coordinate e-business transactional integrity



Re: Stacks registers

2001-05-23 Thread Alan Burlison

Nick Ing-Simmons wrote:

 That comment reminds me of how the register file is implemented in
 a sun sparc. They have a large register file, but only some are accessible
 at any given time, say 16.
 
 32 IIRC but principle is correct.

8 global registers, 8 out registers, 8 local registers and 8 in registers. 
Some are set aside for special purposes, e.g. %r14 is stack pointer, %r15 is
called subroutine return addr etc.  Effectively you have 6 'in' and 6 'out'
registers.  Extra arguments above 6 are passed in the caller's stack frame.

 1. When you call deep enough to fall off the end of the large register
file an expensive system call is needed to save some registers
at the other end to memory and wrap, and then again when you
come back to the now-in-memory registers.

Not a system call but a trap - they aren't the same thing (pedant mode off
;-).  The register spill trap handler copies the relevant registers onto the
stack - each stack frame has space allocated for this.

Alan Burlison



Re: PDD: Conventions and Guidelines for Perl Source Code

2001-05-10 Thread Alan Burlison

Dave Mitchell wrote:

 <quote>
 All entities should be prefixed with the name of the subsystem they appear
 in, eg C<pmc_foo()>, C<struct io_bar>. They should be further prefixed
 with the word 'perl' if they have external visibility or linkage,
 </quote>

Duh!  Missed it.  Thanks.

Alan Burlison



Re: PDD: Conventions and Guidelines for Perl Source Code

2001-05-10 Thread Alan Burlison

Dave Mitchell wrote:

 My main objection to dSP et al is that it represents the innermost circle
 of the hell that is Perl 5 macros. Stuff like this is totally bemusing to
 the newcomer:
 
   dPOPPOPnnrl;
   if (right == 0.0) 
 
 I was just trying to think of ways of alerting people that Something Funny
 is Going On Here. Oh well, I surrender...

I strongly agree.  The current macro mayhem in perl is an utter abomination,
and drastically reduces the maintainability of the code.  I think the
performance argument is largely specious, and while abstraction is a
laudable aim, in the case of perl it has turned from abstraction into
obfuscation.

Alan Burlison



Re: PDD: Conventions and Guidelines for Perl Source Code

2001-05-08 Thread Alan Burlison

I see nothing about namespacing, e.g. Perl_

Alan Burlison



Re: C Garbage collector

2001-02-20 Thread Alan Burlison

Alan Burlison wrote:

 I've attached the HTML

Well it was there when I sent it... does this list strip attachments or
something?
 
Here it is as badly-formatted text - sorry!

Alan Burlison


Appendix A: How Sun WorkShop Memory Monitor Works 

Memory management in C/C++ is both time consuming and
error prone. Typically, about one-third of development time
is spent on memory management, and most commercially
available programs ship with memory leaks or premature
frees. Even worse, no matter how well written your own
code is, third-party libraries and DLLs used by your
program may leak, leaving you with no way to deliver
leak-free applications.

Imagine what would happen if memory just freed itself when
it was no longer in use. Programs that freed memory in the
wrong place would automatically be fixed, and new code
wouldn't need to free memory at all. Such automatic
garbage collection is the standard in Java and Smalltalk.
I'll lift the hood on the Sun WorkShop Memory Monitor
garbage collector, focusing on the details that make it
practical, transparent, and scalable.

Why Garbage Collection? 

The basic idea behind garbage collection is refreshingly logical. As a
program runs, the garbage collector
periodically scans the program's data marking everything that is in use.
It begins by scanning the
program's stack, registers, and the static segment. Every time it
encounters a pointer, it marks the object
that it points to. Then it scans the object for pointers to other heap
objects. Once it has marked all
reachable heap objects, it frees all the objects that haven't been
marked (see Figure 1).
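That mark phase and sweep phase can be sketched in a few lines of C (a toy over a fixed-size heap with an explicit root set; a real collector scans the stack, registers and static segment instead):

```c
#include <stddef.h>

/* Toy mark-and-sweep: objects hold at most two pointers each. */
#define NOBJ 8

typedef struct obj {
    struct obj *ref[2];   /* outgoing pointers */
    int marked;
    int live;             /* still allocated?  */
} obj_t;

static obj_t heap[NOBJ];

static void mark(obj_t *o)
{
    if (o == NULL || o->marked)
        return;
    o->marked = 1;        /* mark the object itself...             */
    mark(o->ref[0]);      /* ...then everything reachable from it, */
    mark(o->ref[1]);      /*    recursively                        */
}

void collect(obj_t **roots, int nroots)
{
    int i;
    for (i = 0; i < NOBJ; i++)
        heap[i].marked = 0;
    for (i = 0; i < nroots; i++)
        mark(roots[i]);         /* mark phase, from the roots  */
    for (i = 0; i < NOBJ; i++)
        if (!heap[i].marked)
            heap[i].live = 0;   /* sweep: reclaim the unmarked */
}
```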

The hard part of adapting garbage collection to C/C++ is identifying
pointers. Since object libraries and
DLLs often lack source, you can't examine the source for type
information, and you can't recompile.
Typically, you don't even have debugging information. The only remaining
option is to relink. This isn't as
bad as it sounds, since relinking lets you redefine whatever functions
you like.

What functions should you redefine? The natural candidates are
memory-management functions --
malloc(), free(), new, and delete. Redefining free() and delete to Null
operations protects the
programmer from premature frees. This leaves only the malloc() and new
allocators.

But how does this help you identify pointers? A solution to this problem
was first observed by Hans Boehm
and Mark Weiser. (For more information, see their article, "Garbage
Collection In an Uncooperative
Environment," Software Practice and Experience, September 1988, as well
as Boehm's "Advantages and
Disadvantages of Conservative Garbage Collection," at
ftp://parcftp.xerox.com/pub/gc/issues.html.)
Allocators keep track of what memory has been allocated. You can test
whether a data word points inside
an allocated object. If so, it's probably a pointer. Is this test always
right? No, but it's a start.

Sun WorkShop Memory Monitor implements a refined version of this
pointer-finding strategy through a
custom allocator that can efficiently report on the status of any
address. Once you can identify pointers,
you can garbage collect a program.
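The pointer test itself is simple once the allocator records every allocation (a sketch; the table, names and linear search are purely illustrative):

```c
#include <stddef.h>

/* Record every allocation, then treat any word that lands inside a
 * recorded [base, base+size) range as a probable pointer. */
#define MAXALLOC 64

static struct { char *base; size_t size; } allocs[MAXALLOC];
static int nallocs;

void note_alloc(void *base, size_t size)
{
    allocs[nallocs].base = base;
    allocs[nallocs].size = size;
    nallocs++;
}

/* Is 'word' plausibly a pointer into the allocated heap?  May yield
 * false positives (an integer that happens to look like an address),
 * which is why the scheme is called "conservative". */
int is_probable_pointer(void *word)
{
    char *p = word;
    int   i;

    for (i = 0; i < nallocs; i++)
        if (p >= allocs[i].base && p < allocs[i].base + allocs[i].size)
            return 1;
    return 0;
}
```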

We Interrupt this Program... 

Proper scheduling may be the most important factor in garbage collector
performance. The earliest
collectors only performed garbage collection when the computer ran out
of memory. When a collection
finally occurred, it analyzed the computer's entire address memory,
interrupting operation for long periods
of time. No wonder early garbage collectors were invariably associated
with annoying interruptions of
program operation.

Many current garbage collectors, especially Java collectors, rely
primarily on a low-priority background
thread to schedule garbage collection. This approach sounds good, but
leads to poor performance. The
low-priority background thread provides lots of garbage collection to
inactive programs that don't need it.
By the same token, computation-intensive programs require large amounts
of garbage collection, but don't
receive any because the background collection thread doesn't have a
chance to run since such a program
always demands CPU cycles. As a result, background collection threads
should, at best, supplement some
other primary collection strategy.

Sun WorkShop Memory Monitor decides when to collect garbage by watching
how much memory has been
allocated since the last collection. This way, programs that use lots of
dynamic memory allocation receive
the collection they need, while programs that don't allocate much
memory don't waste a lot of time doing
unnecessary garbage collection.
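That scheduling policy is easy to sketch: count bytes allocated and piggy-back a collection on the allocator once a threshold is crossed (the threshold and names are invented for illustration):

```c
#include <stdlib.h>

/* Collect after a fixed volume of allocation instead of from a
 * background thread: allocation-heavy programs collect often,
 * idle ones never. */
#define GC_THRESHOLD (1 << 20)    /* collect every ~1MB allocated */

static size_t bytes_since_gc;
static int    gc_runs;

static void run_gc(void)
{
    gc_runs++;                    /* a real mark/sweep would go here */
    bytes_since_gc = 0;
}

void *gc_malloc(size_t size)
{
    bytes_since_gc += size;
    if (bytes_since_gc >= GC_THRESHOLD)
        run_gc();                 /* piggy-back collection on allocation */
    return malloc(size);
}
```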

To further limit program interruptions, you can do only a small part of
the garbage collection each time. At
first glance, this seems impossible. The heap is constantly changing, so
how can you know that the
analysis from earlier partial collections is still correct?
Memory-management hardware in the CPU provides
some help. After analyzing a page of memory, simply "

C Garbage collector

2001-02-20 Thread Alan Burlison

Documentation excerpt:

With Sun WorkShop Memory Monitor, you can program without calling free()
or delete. Determining when to call free() or delete is difficult. 
Studies indicate that 30% to 40% of programmer time is spent on memory
management in C or C++ programs. Failing to release memory causes memory
leaks, which refer to the accumulation of unused memory in the program,
ultimately causing the program to monopolize or even run out of memory.
Releasing memory too early leaves loose pointers, which are pointers
that do not point to valid memory. Using loose pointers invariably
causes data corruption, segmentation faults, or general protection
violations. Sun WorkShop Memory Monitor automatically eliminates the
problems of freeing memory at the wrong time.

In the field, Sun WorkShop Memory Monitor automatically protects your
program against memory leaks and premature frees, even those in
third-party libraries. Simply linking with Sun WorkShop Memory Monitor
automatically eliminates any leaks. In deployment mode, Sun WorkShop
Memory Monitor's high-performance memory allocator makes your programs
run faster and consume less memory. 

Sounds like nirvana ;-)

Shame it only works with the Sun compilers.

However, there is an explanation of how it works that might be useful
when considering how to do this for perl6.  I've attached the HTML -
sorry about the broken links, but I don't think this is on any
externally-visible webpage.

Alan Burlison


Re: PDD 2, vtables

2001-02-18 Thread Alan Burlison

Dan Sugalski wrote:

 If PMC is a pointer to a structure, "new" will need to allocate memory for a
 new structure, and hence the value of mypmc will have to change.
 
 Nope. PMC structures will be parcelled out from arenas and not malloc'd,
 and they won't be freed and re-malloced much. If we're passing in a PMC
 pointer, we won't be reallocating the memory pointed to--rather we'll be
 reusing it.

So how do you get hold of a PMC from the arena in the first place?

Alan Burlison



Re: Generating Perl 6 source with Perl

2001-02-18 Thread Alan Burlison

Simon Cozens wrote:

 Larry has guaranteed that Perl 6 will be built "out of the same source tree"
 as Perl 5.

Whatever that means... i.e. not much.

 This is a major win for us in two areas. Firstly, we can reuse the
 information determined by Perl 5's Configure process to help make Perl 6
 portable: for instance, I expect we'll still be using the [UI](8|16|32|V)
 typedefs to guarantee integer sizes.

Yeuch.  So now we have to maintain two interdependent hairballs.  Is
that really progress?

 Secondly and more importantly, it guarantees that we've got a copy of Perl on
 hand before Perl 6 is built. This allows us to reduce the level of
 preprocessor muddling by effectively generating the C source to Perl 6 from
 templates and preprocessing. For instance, I expect to see a little macro
 language develop for specifying vtable methods, which, when preprocessed,
 would also generate the vtables and their corresponding enums. I'd also like
 to see Perl choose whether or not a function should invoke the native C
 library's implementation or define its own.
 
 What do people think?

I think macro languages suck, whether they are the C macro preprocessor,
or some fancy dohickey that we knock together.  I think that having to
have perl5 around to build perl6 also sucks.  For example, in our case
we build perl5 every night with the rest of Solaris.  It already takes
too long.  Adding a build of perl5 just to build perl6 - well, forget
it.

Alan Burlison



Re: PDD 2, vtables

2001-02-18 Thread Alan Burlison

Dan Sugalski wrote:

 Grab one via a utility function. getPMC() or something of the sort.

newPMC() ? ;-)

Alan Burlison



Re: Garbage collection (was Re: JWZ on s/Java/Perl/)

2001-02-15 Thread Alan Burlison

Branden wrote:

 Just set autoflush, if you're lazy...

And say goodbye to performance...

  The problem is
  that you can not only count on $fh's DESTROY being called at the end of
  the block, you often can't count on it ever happening.
 
 Anyway, the file would be flushed and closed...

That's not sufficient.  Without deterministic finalisation, what does
the folowing do?

  {
my $fh = IO::File->new("file");
print $fh "foo\n";
  }
  {
my $fh = IO::File->new("file");
print $fh "bar\n";
  }

At present "file" will contain "foo\nbar\n".  Without DF it could just
as well be "bar\nfoo\n".  Make no mistake, this is a major change to the
semantics of perl.

Alan Burlison



Re: Garbage collection (was Re: JWZ on s/Java/Perl/)

2001-02-15 Thread Alan Burlison

Hong Zhang wrote:

 This code should NEVER work, period. People will just ask for trouble
 with this kind of code.

Actually I meant to have specified ">>" as the mode, i.e. append, then
what I originally said holds true.  This behaviour is predictable and
dependable in the current perl implementation.  Without the ">>" the file
will contain just "bar\n".

The point is that we have a stated goal of preserving the existing
semantics, and of allowing existing perl5 code to continue to work. 
Despite what some people seem to think this is *not* a clean slate
situation.  We may well have to deliberately carry over questionable but
depended-upon behaviour into perl6.

my $fh = do { local *FH; *FH; }

for example, better continue to work.

Alan Burlison



Re: Garbage collection (was Re: JWZ on s/Java/Perl/)

2001-02-15 Thread Alan Burlison

Hong Zhang wrote:

 That was not what I meant. Your code already assume the existence of
 reference counting. It does not work well with any other kind of garbage
 collection. If you translate the same code into C without putting in
 the close(), the code will not work at all.

Wrong, it does *not* assume any such thing.  It assumes that when a
filehandle goes out of scope it is closed.  How that is achieved is a
detail of the implementation, and could be done in a number of ways.  It
could just as well be done by keeping the filehandle on a stack which
is cleared when the scope exits.  C++ does this for local variables
without requiring a refcount.  

 By the way, in order to use perl in real native thread systems, we have
 to use atomic operation for increment/decrement reference count. On most
 systems I have measured (pc and sparc), any atomic operation takes about
 0.1-0.3 micro second, and it will be even worse on large SMP machines.
 The latest garbage collection algorithms (parallel and concurrent) can
 handle large memory pretty well. The cost will be less DF.

I think you'll find that both GC *and* reference counting schemes will
require the heavy use of mutexes in an MT program.
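For reference, the per-operation atomic cost Hong measured looks like this in C11 (a sketch; the type and function names are invented):

```c
#include <stdatomic.h>

/* A thread-safe reference count needs one atomic operation per
 * increment/decrement -- the 0.1-0.3 microsecond cost quoted above
 * is paid on every one of these. */
typedef struct {
    atomic_int refcnt;
} refobj;

void obj_ref(refobj *o)
{
    atomic_fetch_add(&o->refcnt, 1);    /* atomic increment */
}

/* Atomic decrement; returns 1 when the last reference is dropped,
 * i.e. when the object may be destroyed. */
int obj_unref(refobj *o)
{
    return atomic_fetch_sub(&o->refcnt, 1) == 1;
}
```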

Alan Burlison



Re: GC: what is better, reuse or avoid cloning?

2001-02-10 Thread Alan Burlison

Branden wrote:

 Any suggestions?

Yes, but none of them polite.

You might do well to study the way perl5 handles these issues.

Alan Burlison



Re: yydestruct leaks

2001-02-07 Thread Alan Burlison

[EMAIL PROTECTED] wrote:

 Hmm, so this is (kind of) akin to the regcomp fix - it is the "new" stuff
 that is in yyl?val that should be free-d. And it is worse than that
 as yyl?val is just the topmost part of the parser state, so if I understand correctly
 it isn't only their current values, but also anything that is in the
 parser's stack (ysave-yyvs) at the time-of-death that needs to go.
 And all of those use the horrid yacc-ish YYSTYPE union, so we don't know
 what they are. Yacc did, it had a table which mapped tokens to types
 which it used to get union assigns right. But byacc does not put that info
 anywhere for run-time to use, so to get it right we would need to
 re-process perly.y and then look at the state stack as we popped it.

Yup - that's about the size of it.

 Yugh.
 
 The way I usually do this is make YYSTYPE an "object" or something
 like a Pascal variant record - which has a tag.

That was my idea - I just couldn't figure out any clean way of capturing
the type information.  If only byacc had a $$type variable as well as $$
etc...

 This would not be easy to fix for perl5.
 The best I can come up with is to make them all OP *, inventing
 special parse-time-only "ops" which can hold ival/pval/gvval values.
 
 Then yydestruct could just free the ops in yylval and yyvs[],
 freeing a gvalOP or pvalOP would do the right thing.
 
 Almost certainly far more than we want to do to the maint branch.

That seems workable, although as you say, far too radical for the maint
branch :-(

 The other way this mess is handled is to use a "slab allocator"
 e.g. GCC used an "obstack" - this allows all the memory allocated
 since one noted a "checkpoint" to be free-d.
 One could fake that idea by making malloc "plugable" and plugging
 it during parse to build a linked list or some such.

Well, that's kinda what we have with the scope stack, the problem is
that you don't know the type of the thing that needs freeing.

 The down side of that scheme is that auxiliary allocators tend to
 upset Purify-like tools almost as much as memory leaks do.

I've tried 2 approaches to this.  The first is to add "#ifdef PURIFY"
code to pp_ctl.c along the lines of the following:

S_doeval(...)
{
...
/* Flush any existing leaks to the log */
purify_new_leaks();
...
if (yyparse() == failed) {
...
/* Ignore any leaks */
purify_clear_leaks();
}
...
}

However I'm still suspicious of this because of the number of leaks that
only appear when S_doeval is somewhere in the stack trace.

The other approach is to postprocess the purify log and ignore anything
that has S_doeval or Perl_pp_entereval in the stack.  That's the
approach I'm currently using, but of course it ignores any real leaks
that coincidentally appear within an eval.  I think I'll try getting rid
of as many leaks as possible under this restricted regime - even with
this restriction, and with the bugs I've already fixed the test suite
contains 141 memory errors.

The truth of the matter is that I suspect eval and die will always leak
until it is re-architected in perl6 - whenever that might be.

Alan Burlison



Re: PDD 2, vtables

2001-02-06 Thread Alan Burlison

Branden wrote:

 Where can I find how Perl5's stack works (specially about parameter passing
 and returning from subs)?

Oh boy.  What a masochist.

;-)

Alan Burlison



Re: Does perl really need to use sigsetjmp? (18%performancehit)

2001-01-22 Thread Alan Burlison

Uri Guttman wrote:

 my scheme allows the user to do that. they just create a regular thread
 and request all their signal events in that thread. but my design also
 allows different threads to handle different signals (and multiple
 threads to handle the same one).

Hmm.  So how will you know the difference between a signal being
delivered once and handled twice and the same signal arriving twice?  I
can see that multiple threads all handling the same signal could be very
confusing.

 my main point there is that faking async i/o and sync signal delivery is
 much easier than faking threads. so we code all the internals to an
 event loop api and fake it underneath as needed on various
 platforms. much more on this soon.

As Jarkko would say:

Yes, yes yes.

:-)

Alan Burlison



Re: Does perl really need to use sigsetjmp? (18% performance hit)

2001-01-21 Thread Alan Burlison

[EMAIL PROTECTED] wrote:

 So the "gloss" is that (sig)longjmp() does not understand locks.
 I can understand that. If all goes according to plan perl will no longer
 longjmp() out of signal handlers. So the issue becomes other places
 where locks can be held. With the possible exception of stdio,  perl
 is in a position to "know" about those and undo them as part of
 its stack unwinding.

Not just stdio - the whole of libc.  malloc for example uses a mutex.

  The short answer is *never use them* in a multithreaded application.
 
 But the short answer (while it may suffice for perl6) is no use to me
 as a perl5 maintainer.

Well, we all have our particular crosses to bear ;-)

At the risk of being boring:  Threads in perl5 are irredeemably broken
and should not be used.

Alan Burlison



Re: Does perl really need to use sigsetjmp? (18% performancehit)

2001-01-21 Thread Alan Burlison

Dan Sugalski wrote:

 *) If, and *only* if, a variable is marked as shared, then its vtable
 function pointer is replaced with a threadsafe version. (Or, rather, a
 pointer to the 'use in case of threads' table--strictly speaking it doesn't
 have to do any locking, but that would be pretty stupid) Multiple threads
 can access the shared data (the thread that shared it and any threads it
 spawns). We do no user-level locking, and if the low-level locking that the
 vtable functions do to ensure we don't core dump is really expensive, well,
 too darned bad)

Hmm.  Might be easier to say that shared variables are visible to all
threads.  I'm not sure that a parent/child relationship makes much sense
between threads, and your proposal kinda implies that sort of
relationship.

 *) We *will* yank out any promises of what threads signals (or any async
 event) fire off in. I really, *really* want to toss the promise of being
 able to die from within a signal handler that catches a real signal, but I
 don't know that Larry will go for that. (Expecting sane shutdown when you
 essentially puke from within an interrupt handler's always struck me as a
 semi-odd thing, though I understand the utility to some extent)

If perl signal handlers are dispatched from the main loop at specified
cancellation-safe points then calling die from a perl-level signal
handler should be doable.  The underlying C level handler will just have
to be careful not to do anything that isn't async signal safe.  If perl
gets a proper event loop, aren't signals just another event?

As for continuing the current pretence that you can write a 'real'
signal handler in perl - well, even Larry can't make it so merely by
wishing it so.
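The safe split is for the C-level handler to do nothing but set a flag, with dispatch happening between opcodes (a sketch; the function names are invented):

```c
#include <signal.h>

/* The C-level handler does the only async-signal-safe thing -- set a
 * flag -- and the heavyweight perl-level handler runs later, at a
 * point where the interpreter's state is consistent. */
static volatile sig_atomic_t sig_pending;

void c_level_handler(int sig)
{
    (void)sig;
    sig_pending = 1;     /* nothing else is safe to do here */
}

/* Called between opcodes; returns 1 if a handler was dispatched. */
int check_pending_signals(void)
{
    if (!sig_pending)
        return 0;
    sig_pending = 0;
    /* ...dispatch the perl-level handler here; it may safely die... */
    return 1;
}
```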

 * We may put in a mechanism to direct an interrupt at a particular thread
 (and probably will have to to preserve the 'die from a signal handler'
 stuff, but that'll only work for perl-generated events and non-fatal
 synchronous signals (of which there are very few))

What about automatically having a dedicated signal handling thread for
any program that is user-level threaded?  

 I don't know what Larry will want from a language perspective, but this is
 what he'll get from the core engine. I think this is the smallest safe set
 of things we can do, and I'm reasonably sure that everything else anyone
 might want can be layered on top, albeit slowly in some cases.

A very sound strategy IMHO.

Alan Burlison



Re: Does perl really need to use sigsetjmp? (18%performancehit)

2001-01-21 Thread Alan Burlison

Dan Sugalski wrote:

 I'm torn between doing that and not, since firing up a second OS-level
 thread can add some overhead that we might not need with the "set a flag
 and check on op boundary" setup. I know that VMS won't kick over to the
 slower fully threadsafe version of its runtime libraries until you actually
 start a second thread, and I thought Solaris did something similar.

It's linking to libthread that makes it use the thread-safe versions of
stuff:

$ cat z.c
int main () { abort(); }

$ cc -o z z.c; ./z; pstack core
Abort(coredump)
core 'core' of 193353:  ./z
 ff3194d0 _kill(ff336000, 0, ff3385a8, 5, 229bc, ff29b784) + 8
 0001061c main (1, ffbef8ac, ffbef8b4, 20800, 0, 0) + 4
 000105f0 _start   (0, 0, 0, 0, 0, 0) + b8

$ cc -o z z.c -mt; ./z; pstack core
Abort(coredump)
core 'core' of 193574:  ./z
-----------------  lwp# 1 / thread# 1  -----------------
 ff369494 __sigprocmask (ff36bdb8, 0, 0, 20860, ff37e000, 0) + 8
 ff35da38 _sigon   (20860, ff3859a0, 6, ffbef6f4, 20860, 1) + d0
 ff360abc _thrp_kill (0, 1, 6, ff37e000, 1, ff33a480) + f8
 ff2ca840 raise(6, 0, 0, , ff33a3ec, ) + 40
 ff2b4c04 abort(ff336000, 0, ff3385a8, 5, 100d4, 0) + 100
 00010624 main (1, ffbef8ac, ffbef8b4, 20800, 0, 0) + 4
 000105f8 _start   (0, 0, 0, 0, 0, 0) + b8
-----------------  lwp# 2 / thread# 2  -----------------
 ff318690 _signotifywait (ff37e000, 145, 0, 0, 0, 0) + 8
 ff361930 thr_yield (0, 0, 0, 0, 0, 0) + 8c
-----------------  lwp# 3  -----------------
 ff316234 _door_return (3, ff37f6d8, ff37f6f0, 3, ff37e000, 1) + 10
 ff35a1cc _lwp_start (ff245d70, 0, 6000, ffbef19c, 0, 0) + 18
 ff361930 thr_yield (0, 0, 0, 0, 0, 0) + 8c
-----------------  thread# 3  -----------------
 ff35d6e4 _reap_wait (ff382a50, 20bf8, 0, ff37e000, 0, 0) + 38
 ff35d43c _reaper  (ff37ee80, ff3847b0, ff382a50, ff37ee58, 1, fe40)
+ 38
 ff36b4a0 _thread_start (0, 0, 0, 0, 0, 0) + 40

Alan Burlison



AIO and threads - my prejudices

2001-01-09 Thread Alan Burlison

Copied from p5p as it seemed kinda relevant.

Dan Sugalski wrote:

 Roll on perl6...
 
 Well, besides "Just don't *do* that," any thoughts on how to handle this
 properly in p6?

Hmm.  I've been half-following the async IO and signals thread in
perl6-internals.  The first thing I would say is that if you think there
are portability problems with threads and signals, wait until you get to
play with cross-platform AIO.  General received wisdom seems to be that
using multiple threads and synchronous IO is a much easier thing to get
working than trying to use the various difficult-to-use AIO APIs.

That leads nicely onto the next point - signals.  Again, the received
wisdom here is that if you are going to use threads and signals
together, you should have a dedicated signal handling thread which does
a sigwait and sets a flag when a signal occurs.  The flag would be
polled by the main loop and the signal handler called when it was safe
to do so.

Now I can see that at this point all you speed freaks want to jump on me
and tell me how horribly inefficient this additional test would be.  I
have only one thing to say to you - get real.  One conditional (even if
protected by a mutex) is neither here nor there when compared to the
number of instructions taken to implement a typical perl op, and you
wouldn't do it for every op dispatched anyway.  A sensible thing might
be to get the compiler to emit statement boundary markers into its
output, and use these as points at which you check for and dispatch
signals.

For example, on my platform (sparc), an empty for loop takes about 10
cycles per iteration, a function call to setjmp takes about 10 cycles
and a call to sigsetjmp takes more than 5300 cycles, due largely to the
fact that it requires a syscall and is therefore also an invitation to
the OS to reschedule your process.  If you want to worry about
something, worry about that 5300 cycles first.  Then worry even more
that setjmp isn't thread safe either.

The upshot of all this is I think perl6 should be mandatorily threaded,
with a mixture of internal threads (signal handler, async IO handler,
garbage collector, whatever), and application threads.  Each application
thread would be an entire instance of the interpreter - none of that
crazy mutex madness that doomed the attempt to thread perl5. 
Interpreter threads would touch at well-defined points only - the aim
being to avoid mutexes as far as possible, rather than to infect
everything with them.  In the degenerate case (existing unthreaded
scripts) there would be only one interpreter instance and therefore only
one application thread.

This has lots of advantages - no changes in behaviour when threads are
used as perl is always threaded, well-defined semantics of how multiple
interpreters coexist and cooperate, system housekeeping can be
modularised and done behind the scenes in dedicated threads etc etc.

The downside is that we would restrict perl6 to only those platforms that
support threads.  I'm not sure how much of a restriction this would
turn out to be in practice - how many current perl platforms don't have
threads?

As for AIO - my guess is that faking it up with threads is a much better
bet.  After all, what proportion of apps are MT vs AIO, and which is
most likely to be available, well tested and well supported?

Alan Burlison



Re: Markup wars (was Re: Proposal for groups)

2000-12-07 Thread Alan Burlison

Russ Allbery wrote:

 I've fiddled with this before and can do text to HTML; the rest is just a
 question of picking different backends and shouldn't be *too* hard.  All
 the heuristics for parsing text are inherently fragile, but if you follow
 a standard text formatting style, it works reasonably well.

Which is precisely the reason for suggesting XML - it doesn't rely on
'fragile heuristics' to get the parsing right.  POD suffers from the
same problem to some extent, and I really can't see how typing =head1 is
better than typing <head1> - well apart from being one character
shorter, that is.

However, having previously been told to shut up on this subject, I now
will.

Alan Burlison



Re: Proposal for groups

2000-12-05 Thread Alan Burlison

-- Adam Turoff wrote:

 Are you asking for a Design Document (tm) to be published/updated
 along with an Annotated Design Document (tm)?  Sounds like what Tim
 Bray did for the XML Spec at http://www.xml.com/axml/testaxml.htm.

Wow - I hadn't seen that - neat.  I expect this was generated by writing
a DTD for the spec and then transforming the document into the frameset
and html files.  That would be the obvious way to do it.  That way when
writing the document the commentary could be kept inline, which would
make it much easier for the author.

Alan Burlison



Re: Proposal for groups

2000-12-05 Thread Alan Burlison

Nathan Torkington wrote:

 Alan Burlison writes:
  seem a very optimal way to go about it.  How about a design document
  (format to be decided) and a 'design + commentary' document which is the
  design document with the condensed email discussion inserted into it as
  the commentary.  That way there is a design spec for the implementation,
 
 Cool.  You're volunteering to edit it?

Hah!  You don't ensnare me that easily, Mr. Torkington! ;-)

How about writing the documents in XML and having a 'perl specification'
DTD?  With a bit of careful thought we will be able to do all sorts of
interesting stuff - for example if we tag function definitions we can
start cross-checking other documents and even the code for consistency
with the spec.

Death to POD!

Alan Burlison



Re: Proposal for groups

2000-12-05 Thread Alan Burlison

Adam Turoff wrote:

 <xml-evangelist>
 Say What?
 </xml-evangelist>

Say XML - ex em ell :-)

 We need a better POD, not a cumbersome machine-to-machine interchange
 format for writing docs.

The main problem with POD is that we have to write the tools to do
anything with it.  Witness the endless hacking/cursing/hacking/cursing
cycle on the existing POD tools.  Do we really want to continue? 
Personally I think POD sucks bigtime, although I'm sure hordes of people
will now spring up to defend it.  I still think that with the correct
DTD writing the specs in XML would be doable.

Alan Burlison



Re: Proposal for groups

2000-12-05 Thread Alan Burlison

Simon Cozens wrote:

  I still think that with the correct
  DTD writing the specs in XML would be doable.
 
 DocBook strikes me as being made for this sort of thing.

Yak! no.  DocBook is for specifying published document layout and is
pretty huge - far too weighty for what we want.  I'm thinking more along
the lines of a DTD that is specifically couched in terms of what we are
documenting rather than how it should be rendered:

<function>
<declaration>
int myfunct(mystruct_t *msp);
</declaration>

<comment>
This function...
</comment>

<discussion href="http://some.mail.archive/..." />
</function>

This could be extended ad nauseam to include all sorts of attributes and
entities, e.g. module, file, algorithm, pre and post conditions, test
conditions etc.

I think probably the best thing is to draft an initial design doc in POE
(Plain Old English) and see if it makes any sense to tag the various
bits.

Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-10 Thread Alan Burlison

Chaim Frenkel wrote:

 Please elaborate.

How deep do you go?

$h{a}{b}{c}{d}{e}{f}

This is my last mail on this subject - it is a half-assed idea, and this
whole thread is becoming too tedious for words.  Actually, I'd extend
that to the whole p6 process.  In fact I think I'll just unsubscribe. 
It's doomed.

Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Jarkko Hietaniemi wrote:

 Multithreaded programming is hard and for a given program the only
 person truly knowing how to keep the data consistent and threads not
 strangling each other is the programmer.  Perl shouldn't try to be too
 helpful and get in the way.  Just give user the bare minimum, the
 basic synchronization primitives, and plenty of advice.

Amen.  I've been watching the various thread discussions with increasing
despair.  Most of the proposals have been so uninformed as to be
laughable.  I'm sorry if that puts some people's noses out of joint, but
it is true.  Doesn't it occur to people that if it was easy to add
automatic locking to a threaded language it would have been done long
ago?  Although I've seen some pretty whacky Perl6 RFCs, I've yet to see
one that says 'Perl6 should be a major Computer Science research
project'.

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

 The problem I have with this plan, is reconciling the fact that a
 database update does all of this and more. And how to do it is a known
 problem, its been developed over and over again.

I'm sorry, but you are wrong.  You are confusing transactions with
threading, and the two are fundamentally different.  Transactions are
just a way of saying 'I want to see all of these changes, or none of
them'.  You can do this even in a non-threaded environment by
serialising everything.  Deadlock avoidance in databases is difficult,
and Oracle for example 'resolves' a deadlock by picking one of the two
deadlocking transactions at random and forcibly aborting it.

 Perl has full control of its innards so up until any data leaves perl's
 control, perl should be able to restart any changes.
 
 Take a mark at some point, run through the code, if the changes take,
 we're ahead of the game. If something fails, back off to the checkpoint
 and try the code again.
 
 So any stretch of code with only operations on internal structures could
 be made eligable for retries.

Which will therefore be utterly useless.  And, how on earth will you
identify sections that "only operate on internal data"?

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

 UG i don't see how you can do atomic ops easily. assuming interpreter
 UG threads as the model, an interpreter could run in the middle of another
 UG and corrupt it. most perl ops do too much work for any easy way to make
 UG them atomic without explicit locks/mutexes. leave the locking to the
 UG coder and keep perl clean. in fact the whole concept of transactions in
 UG perl makes me queasy. leave that to the RDBMS and their ilk.
 
 If this is true, then give up on threads.
 
 Perl will have to do atomic operations, if for no other reason than to
 keep from core dumping and maintaining sane states.

I don't see that this is necessarily true.  The best suggestion I have
seen so far is to have each thread be effectively a separate instance of
the interpreter, with all variables being by default local to that
thread.  If inter-thread communication is required it would be done via
special 'shareable' variables, which are appropriately protected to
ensure all operations on them are atomic, and that concurrent access
doesn't cause corruption.  This avoids the locking penalty for 95% of
the cases where variables won't be shared.

Note however that it will *still* be necessary to provide primitive
locking operations, because code will inevitably require exclusive
access to more than one shared variable at the same time:

   push(@shared_names, "fred");
   $shared_name_count++;

Will need a lock around it for example.

Another good reason for having separate interpreter instances for each
thread is it will allow people to write non-threaded modules that can
still be safely used inside a threaded program.  Let's not forget that
the overwhelming bulk of CPAN modules will probably never be threaded. 
By loading the unthreaded module inside a 'wrapper' thread in the
program you can safely use an unthreaded module in a threaded program -
as far as the module is concerned, the fact that there are multiple
threads is invisible.  This will however require that different threads
are allowed to have different optrees - perhaps some sort of 'copy on
write' semantic should be used so that optrees can be shared cheaply for
the cases where no changes are made to it.

Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

 I'd like to make the easy things easy. By making _all_ shared variables
 require a user level lock makes the code cluttered. In some (I think)
 large percentage of cases, a single variable or queue will be use to
 communicate between threads. Why not make it easy for the programmer.

Because contrary to your assertion I fear it will be a special case that
will cover such a tiny percentage of useful threaded code as to make it
virtually useless.  In general any meaningful operation that needs to be
covered by a lock will involve the update of several pieces of state,
and implicit locking just won't work.  We are not talking syntactical
niceties here - the code plain won't work.

 It's these isolated "drop something in the mailbox" that a lock around
 the statement would make sense.

An exact definition of 'statement' would help.  Also, some means of
beaming into the skull of every perl6 developer exactly what does and
does not constitute a statement would be useful ;-)  It is all right
sweeping awkward details under the rug, but make the mound big enough
and everyone will trip over it.

 my $a :shared;
 $a += $b;

If you read my suggestion carefully, you would see that I explicitly
covered this case and said that the internal consistency of $a would
always be maintained (it would have to be otherwise the interpreter
would explode), so two threads both adding to a shared $a would result
in $a being updated appropriately - it is just that you wouldn't know
the order in which the two additions were made.

I think you are getting confused between the locking needed within the
interpreter to ensure that its internal state is always consistent and
sane, and the explicit application-level locking that will have to be in
multithreaded perl programs to make them function correctly. 
Interpreter consistency and application correctness are *not* the same
thing.

 my %h :shared;
 $h{$xyz} = $somevalue;
 
 my @queue :shared;
 push(@queue, $b);

Again, all of these would have to be OK in an interpreter that ensured
internal consistency.  The trouble is if you want to update $a, %h
and @queue in an atomic fashion - then the application programmer MUST
state his intent to the interpreter by providing explicit locking around
the 3 updates.

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

 AB I'm sorry, but you are wrong.  You are confusing transactions with
 AB threading, and the two are fundamentally different.  Transactions are
 AB just a way of saying 'I want to see all of these changes, or none of
 AB them'.  You can do this even in a non-threaded environment by
 AB serialising everything.  Deadlock avoidance in databases is difficult,
 AB and Oracle for example 'resolves' a deadlock by picking one of the two
 AB deadlocking transactions at random and forcibly aborting it.
 
 Actually, I wasn't. I was considering the locking/deadlock handling part
 of database engines. (Map row - variable.)

Locking, transactions and deadlock detection are all related, but aren't
the same thing.  Relational databases and procedural programming
languages aren't the same thing.  Beware of misleading comparisons.

 How on earth does a compiler recognize checkpoints (or whatever they
 are called) in an expression.

If you are talking about SQL it doesn't.  You have to explicitly say
where you want a transaction completed (COMMIT) or aborted (ROLLBACK). 
Rollback goes back to the point of the last COMMIT.

 I'm probably way off base, but this was what I had in mind.
 
 (I. == Internal)
 
 I.Object - A non-tied scalar or aggregate object
 I.Expression - An expression (no function calls) involving only I.Objects
 I.Operation - (non-io operators) operating on I.Expressions
 I.Function - A function that is made up of only I.Operations/I.Expressions
 
 I.Statement - A statment made up of only I.Functions, I.Operations and
 I.Expressions

And if the aggregate contains a tied scalar - what then?  The only way
of knowing this would be to check every item of an aggregate before
starting.  I think not.

 Because if we can recover, we can take locks in arbitrary order and simply
 retry on deadlock. A variable could put its prior value into an undo log
 for use in recovery.

Nope.  Which one of the competing transactions wins?  Do you want a
nondeterministic outcome?  Deadlocks are the bane of any DBAs life. 
They are exceedingly difficult to track down, and generally the first
course of the DBA is to go looking for the responsible programmer with a
baseball bat in one hand and a body bag in the other.  If you get a
deadlock it means your application is broken - it is trying to do two
things which are mutually inconsistent at the same time.  If you feel
that automatically resolving this class of problem is an appropriate
thing for perl to do, please submit an RFC entitled "Why perl6 should
automatically fix all the broken programs out there and how I suggest it
should be done".  Then you can sit back and wait for the phonecall from
Stockholm ;-)

-- 
Alan Burlison



Splitting core functions into multiple shared objects: A warning

2000-08-26 Thread Alan Burlison

Beware of dependencies between shared objects.  Let's assume 2 chunks of
core functionality are separated off into say A.so and B.so.  This will
work fine as long as there are no interdependencies between A.so and
B.so.  Let's however assume A.so needs to call something in B.so.  That
means before A.so is loaded B.so must already have been loaded for its
symbols to be available in order to be able to resolve the references to
them when A.so is opened.  Also, on Solaris at least special flags must
be specified to dlopen to make symbols in B.so visible to the
subsequently dlopen'd A.so.  The other way of doing this is to specify a
linker dependency between A.so and B.so, but that will effectively cause
B.so to be loaded as soon as A.so is loaded.  In both these scenarios,
both .so files will end up loaded anyway, so it really seems fruitless
to separate them.

The problem with this whole suggestion is that it will be very, very
easy to bring the whole edifice crashing down - one misplaced
cross-shared-object function call will result not in compile-time or
link-time errors, but in runtime linker errors.  In some cases I'm sure
it is possible to separate out some bits of the core so that they don't
depend on anything else, but I'm far from persuaded the overall benefit
will be worth the extra complications.

Alan Burlison



Re: RFC 146 (v1) Remove socket functions from core

2000-08-26 Thread Alan Burlison

[EMAIL PROTECTED] wrote:

 Dynamic loading can be noticeably slow if you are loading something
 via NFS. In addition the PIC code and jump tables used for dynamic
 linking result in a 10-15% slowdown in execution speed on SunOS and
 Solaris (at least in my experiments). Not what I'd call really slow, but
 we've complained vigorously about smaller slowdowns.

This is probably all true.  However what you are not taking into account
is the overall improvements in system performance due to the fact that
the shared libraries are only held in memory once (yes I know some bits
of a .so may not be shared, but most of it is).  Paging involves disks,
and they are orders of magnitude slower than the dynamic linking
overhead.  Repeat the exercise with a couple of hundred concurrent
copies of your test.  Drawing conclusions based on a single test can be
misleading.

-- 
Alan Burlison



Feature request: Relocatable perl

2000-08-02 Thread Alan Burlison

I don't think it is worth generating a RFE for this, but I'd like to see
the ability to relocate or install perl in some place other than the
initial install location without everything breaking.  This will require
cleverness in the manipulation of the search paths for both perl modules
and shared objects/DLLs.

Alan Burlison



Re: Feature request: Relocatable perl

2000-08-02 Thread Alan Burlison

Graham Barr wrote:

 It is not just libraries, but also the perl @INC that needs to
 be dynamic

Yes, but that seems a bit more tractable - surely we could fiddle with
@INC based on the location of the perl interpreter?

-- 
Alan Burlison
Solaris Kernel Development, Sun Microsystems



Re: inline mania

2000-08-01 Thread Alan Burlison

Brent Fulgham wrote:

  I think there is an undiscussed assumption about the implementation
  language in there somewhere...
 
 I think you may have missed the context of the message.  John was talking
 about creating his Alpha using various existing projects that had already
 been done in C++.

Why is he bothering?  A year to produce a prototype doesn't seem like a
useful way to expend effort on something that isn't actually perl6.

  We've been down that path already - Topaz.  With all due respect this is
  supposed to be a community rewrite.  Your proposal doesn't seem to be
  along those lines.
 
 With all due respect, I think you may be taking this out of context.  I
 don't believe John's intent was to hijack the process.  He was outling
 a theoretical schedule that could be used to provide a working
 Perl5 -> Perl6 migration path.

I'm not saying it was.  However I don't see how the proposal would aid
the migration - after all what he is writing will be neither perl5 nor
perl6.

Alan Burlison