Re: Event model for Perl...

2000-09-09 Thread Nick Ing-Simmons

Grant M. <[EMAIL PROTECTED]> writes:
>I am reading various discussions regarding threads, shared objects,
>transaction rollbacks, etc., and was wondering if anyone here had any
>thoughts on instituting an event model for Perl6? I can see an event model
>allowing for some interesting solutions to some of the problems that are
>currently being discussed.

Yes - Uri has started [EMAIL PROTECTED] to discuss that stuff.


>Grant M.
-- 
Nick Ing-Simmons




Re: RFC 178 (v2) Lightweight Threads

2000-09-09 Thread Nick Ing-Simmons

Chaim Frenkel <[EMAIL PROTECTED]> writes:
>
>NI> Indeed that is exactly how tied arrays work - they (automatically) add 
>NI> 'p' magic (internal tie) to their elements.
>
>Hmm, I always understood a tied array to be the _array_ not each individual
>element.

The perl-level tie is on the array. That adds C-level 'P' magic.
When you access an array element, the 'P' magic adds 'p' magic to
(a proxy for) the element. The 'p' magic invokes FETCH or STORE.

It does not have to be that way - but it is in perl5.
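
At the perl level the same chain looks like this - a minimal sketch using the
stock Tie::StdArray from the core Tie::Array module (the 'Watched' class name
is made up):

    package Watched;
    use Tie::Array;
    our @ISA = ('Tie::StdArray');
    sub FETCH { my ($self, $i) = @_; print "FETCH $i\n"; return $self->[$i] }
    sub STORE { my ($self, $i, $v) = @_; print "STORE $i\n"; $self->[$i] = $v }

    package main;
    tie my @a, 'Watched';   # the perl-level tie; internally this adds 'P' magic to @a
    $a[3] = 'x';            # the element access goes via 'p' magic and calls STORE
    print $a[3], "\n";      # likewise calls FETCH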

>
>NI> Tk apps do this all the time:
>
>NI>  $parent->Label(-textvariable => \$somehash{'Foo'});
>
>NI> The reference is just to get the actual element rather than a copy.
>NI> Tk then ties the actual element so it can see STORE ops and update 
>NI> the label.
>
>Would it be a loss to not allow the elements? The tie would then be
>to the aggregate.

Yes, it would be a loss. If only one element in a 1,000-entry array 
is being watched by a widget, that is a LOT of extra work checking 
accesses to the other 999 elements. You would also have to allow
1,000 ties on the SAME array so that 1,000 widgets could each watch 
one element. Or force higher-level code to implement element watching
by watching the array and then re-dispatching to the appropriate inner
"tie" - which is what we have now ;-)

>
>I might argue that under threading tieing to the aggregate may be 'more'
>correct for coherency (locking the aggregate before accessing.)

I don't disagree - the lock may well be best on the aggregate - but that does 
not mean the tie has to be. 

-- 
Nick Ing-Simmons




Re: A tentative list of vtable functions

2000-09-09 Thread Nick Ing-Simmons

Ken Fox <[EMAIL PROTECTED]> writes:
>Short
>circuiting should not be customizable by each type for example.

We are already having that argument^Wdiscussion elsewhere ;-)

But I agree variable vtables are not the place for that.

-- 
Nick Ing-Simmons




New Perl rewrite - embedded Perl

2000-09-09 Thread Matthew Gillman

Dear All

I wrote a large C++ program which used embedded Perl. Later, this was changed to 
embedded Python. The reasons for this included:

1) Python allows you to pass a pointer to an object from C/C++ to the embedded Python 
interpreter, whereas Perl makes you push and pop off the stack (as far as I am aware - 
but I'm open to correction :-)

2) Embedded Perl generates vast numbers of Purify errors and memory leaks, although 
this is not the case with embedded Python.

3) Speed (I guess partially as a consequence of (1))

Python has a very large C API, unlike Perl. Of course, the downside of this is 
that it is potentially more complicated. Also, in Python you have to keep careful 
track of the reference counts of your embedded Python objects, so that's hard too.

Basically, my comment is that a lot of commercial applications seem to be mixing and 
matching languages (like C++ and Perl), so it would be really great if issues such as 
Purify errors for embedded Perl were addressed (I realise that stand-alone Perl is 
well-Purify'd and tested). I should also stress that I have no particular axe to 
grind - I like both languages and have used both of them as appropriate.

Sorry if you're not the person I should have sent this to! But I'd welcome any 
comments.

Thanks.

Yours Sincerely

Matthew Gillman



Re: RFC 178 (v2) Lightweight Threads

2000-09-09 Thread Chaim Frenkel

> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:

DS> Right, but databases are all dealing with mainly disk access. A 1ms lock 
DS> operation's no big deal when it takes 100ms to fetch the data being locked. 
DS> A 1ms lock operation *is* a big deal when it takes 100ns to fetch the data 
DS> being locked...

Actually, even a database can't waste too much time on locks, not when there
may be thousands or millions of rows affected. But ...

DS> Correctness is what we define it as. I'm more worried about expense.

DS> Detecting deadlocks is expensive and it means rolling our own locking 
DS> protocols on most systems. You can't do it at all easily with PThreads 
DS> locks, unfortunately. Just detecting a lock that blocks doesn't cut it, 
DS> since that may well be legit, and doing a scan for circular locking issues 
DS> every time a lock blocks is expensive.

DS> Rollbacks are also expensive, and they can generate unbounded amounts of 
DS> temporary data, so they're also fraught with expense and peril.

Then all "we" are planning on delivering is correctness with a possiblity
of deadlocks with no notification.

Is deadlock detection really that expensive? The cost would be born by
the thread that will be going to sleep. Can't get lock, do the scan.

I really think we will have to do it. And we should come up with the
deadlock resolution. I don't think we will fly without it. We are going
to be deluged with reports of "my program hangs. Bug in locking."
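
FWIW the scan itself is simple if a blocked thread can be waiting on at most
one lock at a time - a sketch, with made-up table names (the real tables
would live inside the interpreter):

    my %holder;        # lock id   => thread id currently holding it
    my %waiting_for;   # thread id => lock id that thread is blocked on

    sub would_deadlock {
        my ($me, $lock) = @_;
        my %seen;
        while (defined $lock) {
            my $owner = $holder{$lock};        # who holds the lock I want?
            return 0 unless defined $owner;    # nobody - no deadlock
            return 1 if $owner == $me;         # chain leads back to me - deadlock
            return 0 if $seen{$owner}++;       # a cycle that doesn't involve me
            $lock = $waiting_for{$owner};      # follow the chain
        }
        return 0;                              # chain ends - safe to sleep
    }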


-- 
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 178 (v2) Lightweight Threads

2000-09-09 Thread Chaim Frenkel

> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

AB> Chaim Frenkel wrote:
>> No scanning. I was considering that all variables on a store would
>> safe store the previous value in a thread specific holding area[*]. Then
>> upon a deadlock/rollback, the changed values would be restored.
>> 
>> (This restoration should be valid, since the change could not have taken
>> place without an exclusive lock on the variable.)
>> 
>> Then the execution stack and program counter would be reset to the
>> checkpoint. And then restarted.

AB> Sigh.  Think about references.  No, think harder.  See?

No, I don't. If the references are to _variables_, what difference does
it make? If the reference is to a tied variable, so what - the tied
variable wasn't called. If it is to a call on a tied variable, then
this wouldn't apply.

Please elaborate.

(I don't think this is feasible for 6.0, and depending upon what actually
is available in the language and interpreter, who knows for 6.x)
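
For concreteness, here is the kind of thing I mean by the holding area - a
sketch with made-up names; the real thing would be per-thread and inside the
interpreter:

    my @journal;                         # this thread's holding area

    sub journaled_store {
        my ($ref, $new) = @_;
        push @journal, [ $ref, $$ref ];  # safe-store the previous value first
        $$ref = $new;                    # then do the store
    }

    sub rollback {
        while (my $entry = pop @journal) {
            my ($ref, $old) = @$entry;
            $$ref = $old;                # restore in reverse order
        }
    }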


-- 
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



RFCs for thread models

2000-09-09 Thread Steven W McDougall

RFC 178 proposes a shared data model for Perl6 threads. In a shared
data model 
- globals are shared unless localized
- file-scoped lexicals are shared unless the thread recompiles the
  file 
- block scoped lexicals may be shared by
  - passing a reference to them
  - closures
  - declaring one subroutine within the scope of another

In short, lots of stuff is shared, and just about everything can be
shared. 

To prevent the interpreter from crashing in a shared data model, every
access to a named variable must be protected by a mutex:

    lock(b.mutex)
    fetch(b)
    unlock(b.mutex)
    lock(a.mutex)
    store(a)
    unlock(a.mutex)
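
In perl5 terms, the explicit user-level analogue would look something like
this (a sketch only, using the old 5.005 Thread-style lock(), which holds a
variable's mutex until the enclosing block exits):

    use Thread;                  # 5.005-style shared-data threads, illustration only
    my ($a, $b) = (0, 42);
    my $tmp;
    { lock($b); $tmp = $b; }     # lock(b.mutex); fetch(b); unlock at block exit
    { lock($a); $a  = $tmp; }    # lock(a.mutex); store(a); unlock at block exit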

It has been argued on perl6-internals that 
- acquiring mutexes takes time
- most variables aren't shared
- we should optimize for the common case by requiring a :shared
  attribute on shared variables. 

Variables declared without a :shared attribute would be isolated: each
thread gets its own value for the variable. In this model, the user
incurs the cost of mutexes only for data that is actually shared
between threads.

This is a valid argument. However, an isolated data model has its own
costs, and we need to understand these, so that we can compare them to
the costs of a shared data model.

The first interesting question is: How does a thread get access to its
own value for a variable?

We can break the problem into two broad cases
- All threads execute the same op tree
- Each thread executes its own copy of the op tree

Let's look at these in detail

1. All threads execute the same op tree

Consider an op, like

    fetch(b)

If you actually compile a Perl program, like

    $a = $b

and then look at the op tree, you won't find the symbol "$b", or "b"
anywhere in it. The fetch() op does not have the name of the variable
$b; rather, it holds a pointer to the value for $b.

If each thread is to have its own value for $b, then the fetch() op
can't hold a pointer to *the* value. Instead, it must hold a pointer
to a map that indexes from thread ID to the value of $b for that
thread. Thread IDs tend to be sparse, so the map can't be implemented
as an array. It will have to be a hash, or a B*-tree, or a balanced
B-tree, or the like.

We can do this: we can build maps. But they take space to build, and
they take time to search, and we incur that space for every variable,
and we incur that time for every variable access.
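
To make the cost concrete, here is the lookup such a fetch() would have to
do, sketched in Perl for readability; the real map would be a C structure
hanging off the op, and current_thread_id() is a made-up stand-in:

    my %value_for;                 # thread ID => this thread's value of $b

    sub current_thread_id { 0 }    # stub standing in for the interpreter's thread ID

    sub fetch_b {
        my $tid = current_thread_id();
        $value_for{$tid} = undef unless exists $value_for{$tid};   # first touch creates it
        return $value_for{$tid};   # every access pays for this lookup
    }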


2. Each thread executes its own copy of the op tree
This breaks down further according to how much of the op tree we copy,
and when we copy it. Here are several possibilities

2.1 Copy everything at thread creation
This is simple and straightforward. We copy the op tree for every
subroutine in the entire program at thread creation. As we copy the
ops, we create new values for all the variables, and set the new ops
to point to the new values. Obviously, this takes space and time.

2.2 Copy subroutines on demand
We could defer copying subroutines until they are actually called by
the new thread. However, this leads to a problem analogous to the one
discussed in case 1 above. The entersub() op can no longer hold a
pointer to *the* coderef for the subroutine. Instead, it must hold a
pointer to a map that indexes from thread ID to the coderef.

The first time a thread calls a subroutine, it finds that there is no
entry for it in the map, makes a copy of the subroutine for itself,
and enters it into the map. Subsequent calls find the entry in the map
and call it immediately. All subroutine calls must search the map to
find the coderef.
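
Sketched the same way (clone_sub() and current_thread_id() are made-up
stand-ins for interpreter internals):

    my %code_for;                               # thread ID => this thread's copy of &foo

    sub current_thread_id { 0 }                 # stub, as in the earlier sketch
    sub clone_sub { my ($code) = @_; $code }    # stub: pretend to copy the op tree
    sub foo { "hello" }

    sub call_foo {
        my $tid = current_thread_id();
        $code_for{$tid} ||= clone_sub(\&foo);   # first call in this thread: copy and record
        return $code_for{$tid}->(@_);           # every later call: map lookup, then call
    }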

2.3 Copy just the subroutines we need at thread creation
We could do a control flow analysis to determine the collection of
subroutines that can be called by a thread, and copy just those
subroutines when the thread is created. In this implementation, there
is no thread ID map: the entersub() op holds a pointer to the coderef.

This trades a more complex implementation for greater run-time
efficiency. Constructs like &$foo() are likely to complicate control
flow analysis. We could probably punt on hard cases and make them go
through a thread ID map.


RFC 178 describes a shared data model, and there has been enough
discussion of it on perl6-internals that we have some understanding of
its performance characteristics. RFCs for other thread models would
allow us to discuss them in definite terms, and come to some
understanding of their performance characteristics, as well. This
would then be a basis for choosing one model over another. Any
volunteers?


- SWM



Re: RFCs for thread models

2000-09-09 Thread Chaim Frenkel

> "SWM" == Steven W McDougall <[EMAIL PROTECTED]> writes:

SWM> If you actually compile a Perl program, like

SWM>    $a = $b

SWM> and then look at the op tree, you won't find the symbol "$b", or "b"
SWM> anywhere in it. The fetch() op does not have the name of the variable
SWM> $b; rather, it holds a pointer to the value for $b.

Where did you get this idea from? P5 currently does many lookups for
names. All globals. Lexicals live elsewhere.

SWM> If each thread is to have its own value for $b, then the fetch() op
SWM> can't hold a pointer to *the* value. Instead, it must hold a pointer
SWM> to a map that indexes from thread ID to the value of $b for that
SWM> thread. Thread IDs tend to be sparse, so the map can't be implemented
SWM> as an array. It will have to be a hash, or a B*-tree, or a balanced
SWM> B-tree, or the like.

You are imagining an implementation and then arguing against it.
What about a simple block of reserved data per 'stack frame', where
$b becomes an offset into that area? And then there are all the
other offsets for variables that are in outer scopes.
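
(A toy illustration of what I mean, in Perl with made-up names - in practice
the pad would be a per-thread, per-frame C structure, and the compiler would
resolve $b to a fixed slot so no per-access lookup is needed:)

    use constant SLOT_A => 0;        # slot numbers resolved at compile time
    use constant SLOT_B => 1;
    my @pad = (undef, 42);           # this thread's frame: one slot each for $a and $b
    $pad[SLOT_A] = $pad[SLOT_B];     # "$a = $b" becomes two fixed offsets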

Here is my current 'guess'.

A single pointer to the thread interpreter's private data.
A thread stack (either machine or implemented)
A thread private area for evaled code op trees (and Damian specials :-)
A thread private file scope lexical area

The lexical variables would live on the stack in some frame, with outer
scope lexicals directly addressable (I don't recall all of the details,
but this is standard compiler stuff; I think the dragon book covers
this in detail).

The shared variables (e.g. main::*) would live in the well-protected
global area.

Now where
    sub recursive() { my $a :shared; ; return recursive() }
would put $a, or even which $a is meant, is left as an exercise
for someone brighter than me.


-- 
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFCs for thread models

2000-09-09 Thread Steven W McDougall

> SWM> If you actually compile a Perl program, like
> 
> SWM>  $a = $b
>   
> SWM> and then look at the op tree, you won't find the symbol "$b", or "b"
> SWM> anywhere in it. The fetch() op does not have the name of the variable
> SWM> $b; rather, it holds a pointer to the value for $b.
> 
> Where did you get this idea from? P5 currently does many lookups for
> names. All globals. Lexicals live elsewhere.

From perlmod.pod, Symbol Tables:

the following have the same effect, though the first is more
efficient because it does the symbol table lookups at compile
time:

    local *main::foo    = *main::bar;
    local $main::{foo}  = $main::{bar};

Perhaps I misinterpreted it.


> You are imagining an implementation and then arguing against it.

Yes.


> Here is my current 'guess'.

[...]

> Now where
>   sub recursive() { my $a :shared; ; return recursive() }
> would put $a, or even which $a is meant, is left as an exercise

My point is that we can't work with guesses and exercises.
We need a specific, detailed proposal that we can discuss and
evaluate. I'm hoping that someone will submit an RFC for one.


- SWM