Re: [Unicon-group] How to replace side effects efficiently?

Rett Williams Tue, 28 Oct 2003 05:18:28 -0800

I'm a professional with 37 years since my first course in computing, but not a language-syntax expert of the kind that Messrs. Wampler and Majorinc are. That said, I do know what I like. What I like most is explicit declaration in a form that is implicitly clear, which I consider the hardest of things to accomplish. In other words, what you read should be clear on its face, with as much local information as possible, all of it unambiguous.

I've had this discussion of reference versus local copy versus read-only reference with local copy for write... and on and on. I think that allowing references vs. local copies is as much a scope issue as anything else. I do not believe in globals, only a consistent scope leveling that makes items at a higher level appear as relatively global. That builds a consistent model that can be extended up and down as far as needed. The only thing that is absolutely clear is that large data structures, or even very large single pieces of data such as images, cannot be replicated endlessly through a set of function/procedure calls. No matter how many resources are available, the overhead and resource consumption of replication will not go away or become trivial. It may even be one of the dominant factors in program performance. I suspect that there are a huge number of functions/procedures that spend far longer collecting their copies of variables than they do performing actual work. Actually, I don't just suspect that. It's a fact. Other than usage marks or counts or other GC functionality, required in any case, references do not have that overhead. A reference is a compiler trick, with almost zero involvement at run time.
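
For what it is worth, Icon and Unicon already draw this line for structures: a list, table, set or record is handed to a procedure as a pointer to the same object, and copy() makes a one-level copy when a separate one is really wanted. A minimal sketch of the difference follows; bump() and the sample list are made up for illustration and are not taken from anyone's code in this thread.

```icon
# bump() is a hypothetical helper: it changes the first element of
# whatever list it is handed, without making a copy of that list.
procedure bump(L)
    L[1] +:= 1
    return
end

procedure main()
    local a, b
    a := [1, 2, 3]
    bump(a)                  # works on the caller's list: a[1] is now 2
    b := copy(a)             # explicit one-level copy of a
    bump(b)                  # changes only the copy: b[1] is now 3
    write(a[1], " ", b[1])   # writes: 2 3
end
```

The copy happens only where the code asks for it, which is the kind of explicitness I am arguing for.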

The second case for references as opposed to copies is the use of data passed by separately compiled code or through interfaces to stream data that must be handled in real time. One may take a very small window on such data, but if one starts passing copies of it around very far, resources will swiftly disappear, and the data flow will move on without response, or the data flow will be interrupted for processing. Neither is a useful way to handle the data.

I would say that a single data implementation with a sophisticated and controllable garbage collection algorithm is a lot more important than the philosophical issue over references. Rather than globals and parameter passing, I would rather have the ability to mark data with attributes like constant, persistent, read-only, all of which have one or another effect on the scope of the variable. If there are no globals, and no procedure/function receives any data that it is not passed, then the current globals will become parts of structures passed through the flow, most probably by reference. That also relieves us of the problem of variable masking, one of the most irritating and difficult problems for a programmer to recognize. The program may know which "a" it is referring to, but the intent of the programmer may not be followed.

I suspect that in the long run, almost all programs will become stateful and/or persistent, contrary to the public's general perception that processing stops and results are distributed when they close the program. That is seldom the case. MRU lists and "undo" are examples of stateful processing, and in many cases programs save state in ways that people take for granted without any real knowledge or concept of persistence and state. Treating data and programs as stateful and persistent is really what people would like to expect. "You mean that the program doesn't remember what I was doing last, or what this looked like before I wiped out this section?" Or, "I wrote a letter just like this to John a week ago... now where is that, and how do I get a similar one to Jim?"

Then, there is the program that is persistent, but almost cannot be stateful. That would be any program that constantly handles huge streams of possibly unrelated data. Can the program say, "There was a group of emails through here last week that looked just like that... let me see... where did I put those?" That program being one of the big backbone routers in the internet, Yo, Ho, Ho.

One last point before getting back to the local case. Data and programs in RAM may have different scopes or persistence. A pointer to data may be passed without any possible program return.

Well, back to Unicon/Icon... and this simpler case. Check the mixed-in comments below.

Steve Wampler wrote:

On Fri, 2003-10-24 at 16:32, Louis Kruger wrote:

Actually, I really meant "throws an exception", but that's a discussion for another day. ;)
Heh.  Definitely another day.  I like exceptions, but putting them
into Unicon in a clean fashion is a *big* task...

The problem with exceptions is quite often the reestablishment of effective scope. I do not believe that it is possible for automatic processes to properly handle exceptions. However, I do believe that exceptions are a necessary part of programming, especially in long running programs. You might want to peruse my comments on scope and globals and persistence to see how this might affect processes.

There's another possibility I thought of after sending my last message, and that is to have the compiler detect when a local variable has a reference operation applied to it while it's in scope, and allocate it on the heap in that case. It could then be thrown away by the GC when its last reference disappears. This allows you to return a reference to a variable from a function, and not worry about its safety, and also eliminates the performance penalty of checking every dereference.
Not quite.  There are a few 'gotchas' with this approach, though I think
it's the best proposal I've seen so far.
(1) Right now, you can't put scalar values into the heap (not enough
    room in the descriptor for the mark used for mark-and-reclaim GC).
    So, you have to embed it into an internal 'heap' structure [as
    statics are now].  Once you do that, have you really gained
    anything over putting things you want references to into Unicon
    structures?

I am in favor of the explicit passing of all variables, by structure or by single variable, with no globals.

Circumventing call by value is one use, but there are others. Being able to save and reuse lvalues from functions and generators would help work around the brilliant, but ugly hacks people have come up with. You can also imagine creating data structures of references.
Yes, that's true, but I wonder if it's really that important.  After
all, there are other ways to code those (admittedly less terse, but
perhaps clearer in meaning...).  That is, rather than trying to write
these as single expressions it might just be better to go ahead and
stick a few control structures in there.  After all, you can hide the
code in a procedure and never have to look at it again!
    procedure forceNonNegative(L, v)
         every L[i := 1 to *L] < 0 do L[i] := v
         return L
    end

I think that is a really bad suggestion. Code should be as verbose as it needs to be to be clear and unambiguous. I used to be a really sharp IBM 360/370 Assembler programmer, able to create some really trick code. I soon learned that writing such trick code made me its permanent guardian, and for good reason. No one else really wanted to figure out my tricks, no matter how well documented. With high level languages, we need to let the compiler/interpreter/? do the trick stuff. I suppose there is some excuse for trick code in really high use libraries that are tested to the max, but even there, subtle bugs abound, as is proved by the constant overflow bugs found in code on almost every OS in the world.

Well, I program 90% in Java, and I certainly would make use of references there. I cringe every time I have to create a special class or array just because I want to return more than one value from a function. Other examples that motivate having references in Icon don't apply to Java, because Java isn't as rich in other ways.

With Unicon's iterators, it can handle multiple values as cleanly as any language that I have seen, but having a function/procedure return multiple unrelated values is a violation of good programming practice and of good logic. If a variable is to be returned, only because it must be used in bookkeeping for the next pass of the function/procedure, then the variable should be marked as persistent within that function/procedure, not returned with the flow of logic unless it is of interest outside the routine. Now that I look, that is what is said in the comments that follow, so I am agreeing with the writer, here. Back to data and program objects.
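
In today's Icon/Unicon the nearest thing to that "persistent" marking is a static declaration with an initial clause. A small sketch, with nextId() as a hypothetical name invented purely for illustration:

```icon
# nextId() is a hypothetical example: its bookkeeping value lives
# inside the procedure instead of being returned or kept in a global.
procedure nextId()
    static counter           # keeps its value between calls
    initial counter := 0     # runs once, on the first call only
    counter +:= 1
    return counter
end

procedure main()
    every 1 to 3 do write(nextId())   # writes 1, 2, 3 on separate lines
end
```

The caller never sees counter, so nothing about it leaks into the flow of logic outside the routine.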

Interesting - I'm in pretty much the same boat - Java programming is
what pays my bills.  However, I've never found myself cursing Java
for that flaw (there are others...).  I wonder what the difference is?
It may be because I learned Icon before C (C was still hidden in Bell
Labs back then) and so by 'baby duck' tend not to think of things like
pointers and explicit references.  The (related) thing I curse Java
for is for not providing an explicit, clean syntax for classes without
methods - i.e. records [though I understand why they don't].

I guess I'll argue that functions that want to return more than
one value really want to return a structure - that these values *must*
be in some way related [or else computing them in a common function
is a mistake!] and should be related that way explicitly - and *not*
just for passing to a function, but throughout the program.  This
requires a shift in how one thinks about the problem being addressed.
Whether or not this shift is desirable is probably more personal taste
than anything else, of course.
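
As an illustration of that point, here is a minimal sketch in which two related results travel together as a record rather than as separate return values.  The names extremes and minmax are hypothetical, invented for this example only:

```icon
# A record that keeps the two related results together by name.
record extremes(lo, hi)

# minmax() is a hypothetical example: it returns the smallest and
# largest element of a list of numbers, and fails on an empty list.
procedure minmax(L)
    local r, x
    if *L = 0 then fail
    r := extremes(L[1], L[1])
    every x := !L do {
        if x < r.lo then r.lo := x
        if x > r.hi then r.hi := x
    }
    return r
end

procedure main()
    local r
    if r := minmax([3, -1, 7, 4]) then
        write(r.lo, " ", r.hi)       # writes: -1 7
end
```

The field names document the relationship at every point of use, which a bare pair of return values would not.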

Take for example, Kazimir's local_min() example.  Another approach is
to recognize that what's really wanted is a function that takes another
function f and three parameters x,y,z and computes the local minimum
based on that function and those parameters.  I see no reason to force
that local minimum back into the same variables as x,y,z, so why not:

   p := local_min(f, [x1,   x2,   x3])
   q := local_min(f, local_min(f, [p[1], p[3], x4]))
   r := local_min(f, [p[2], q[2], q[3]])
   write(q[1], r[1], r[2], r[3])

which, while not as terse as in Pascal, isn't that far off
and offers the advantage of making it easier to trace where
values are coming from!  [The duplicate 'local_min(x1,x3,x4)' in
Kazimir's example threw me for a moment...though I think it's
redundant, if local_min works as advertised.  I left it in the above
code.]  And it may well be advantageous later to have not destroyed the
original values.  (To be fair, I'd probably personally represent
the local minimum as a triple (record) rather than as a list, but that's
also personal taste.)

Maintenance of the original value should be a programmatic decision, not an enforced language decision. Either way, the handling should be explicit. The examples above give me headaches that I just should not have. If you are going to pass functions as first class values, you might just as well go to the full blown object encapsulation and orientation. Lua is a really fine language that very compactly and technically handles most of these issues without formally being an OOP language, but it can be very difficult to read. It does iteration, but not nearly so cleanly as Unicon. Now, I'm no fan of the more formal object methodologies, having obtained more than a small share of headaches trying to follow some of the involutions required to maintain "purity", but there are lessons to be learned from them. When looking at paradigms for original and copy data, the scatter/gather of FoxPro is one of the best.

Everett L.(Rett) Williams
[EMAIL PROTECTED]

