I'm a professional with 37 years since my first course in computing, but not the type of
language syntax expert that Messrs. Wampler and Majorinc are. That said, I do
know what I like. What I like most is explicit declaration in a form that is implicitly clear,
which I consider the hardest of things to accomplish. In other words, what you read
should be clear on its face, with as much local information as possible that is clear and
unambiguous.


I've had this discussion of reference versus local copy versus read-only reference with
local copy for write, and on and on. I think that allowing references vs. local copies
is as much a scope issue as anything else. I do not believe in globals, only a consistent
scope leveling that makes items at a higher level appear as relatively global. That builds
a consistent model that can be extended up and down as far as needed. The only thing
that is absolutely clear is that large data structures or even very large single pieces of
data such as images cannot be replicated endlessly through a set of function/procedure
calls. No matter how many resources are available, the overhead and resource
consumption of replication will not go away nor become trivial. It may even be one of
the dominant factors in program performance. I suspect that there are a huge number
of functions/procedures that spend far longer collecting their copies of variables than
they do performing actual work. Actually, I don't just suspect that. It's a fact. Other
than usage marks or counts or other GC functionality, required in any case, references
do not have that overhead. A reference is a compiler trick, with almost zero involvement
at run time.
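
To illustrate the copy-versus-reference point in Icon/Unicon terms, here is a minimal
sketch. It assumes the standard behavior that structures such as lists are passed by
reference and that copy() makes a one-level copy. The procedure name bump is made up
for the example.

    procedure bump(L)
        # L refers to the caller's list; nothing is copied on the call
        every !L +:= 1
        return
    end

    procedure main()
        local a, b
        a := [1, 2, 3]
        bump(a)                  # the caller's list is changed in place
        b := copy(a)             # an explicit copy, paid for only where wanted
        bump(b)                  # only the copy changes
        every writes(!a, " ")    # writes 2 3 4
        write()
        every writes(!b, " ")    # writes 3 4 5
        write()
    end

The cost of replication shows up only at the explicit copy(a); the calls themselves
pass a reference no matter how large the list is.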


The second case for references as opposed to copies is the use of data passed by
separately compiled code or through interfaces to stream data that must be handled in
real time. One may take a very small window on such data, but if one starts passing
copies of it around very far, resources will swiftly disappear, and the data flow will
move on without response, or the data flow will be interrupted for processing. Neither
is a useful way to handle the data.


I would say that a single data implementation with a sophisticated and controllable garbage
collection algorithm is a lot more important than the philosophical issue over references.
Rather than globals and parameter passing, I would rather have the ability to mark data
with attributes like constant, persistent, read-only, all of which have one or another effect
on the scope of the variable. If there are no globals, and no procedure/function receives
any data that it is not passed, then the current globals will become parts of structures
passed through the flow, most probably by reference. That also relieves us of the problem
of variable masking, one of the most irritating and difficult problems for a programmer to
recognize. The program may know which "a" that it is referring to, but the intent of the
programmer may not be followed.
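
A minimal sketch of the masking problem, assuming standard Icon/Unicon scoping, where a
local declaration hides a global of the same name. The names a and reset are made up
for the example.

    global a

    procedure reset()
        local a       # masks the global a for the rest of this procedure
        a := 0        # the programmer meant the global; only the local changes
        return
    end

    procedure main()
        a := 42       # no local declaration here, so this is the global a
        reset()
        write(a)      # writes 42, not 0
    end

The program is perfectly clear about which "a" it means in each place. The reader,
and quite possibly the writer, is not.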


I suspect that in the long run, almost all programs will become stateful and/or persistent, contrary
to the public's general perception that processing stops and results are distributed when
they close the program. That is seldom the case. MRU lists and "undo" are examples of stateful
processing, and in many cases, programs save state in ways that people simply presume happen,
without any real knowledge of or concept of persistence and state. Treating data and
programs as stateful and persistent is really what people would like to expect. "You mean
that the program doesn't remember what I was doing last or what this looked like before
I wiped out this section?" Or, "I wrote a letter just like this to John a week ago...now where
is that, and how do I get a similar one to Jim?"


Then, there is the program that is persistent, but almost cannot be stateful. That would be any
program that constantly handles huge streams of possibly unrelated data. Can the program say,
"There was a group of emails through here last week that looked just like that...let me see...
where did I put those?" That program being one of the big backbone routers in the internet.
Yo, Ho, Ho.


One last point before getting back to the local case. Data and programs in RAM may have
different scopes or persistence. A pointer to data may be passed without any possible program
return.


Well, back to Unicon/Icon...and this simpler case. Check the mixed-in comments below.

Steve Wampler wrote:

On Fri, 2003-10-24 at 16:32, Louis Kruger wrote:



Actually, I really meant "throws an exception", but that's a discussion for another day. ;)



Heh. Definitely another day. I like exceptions, but putting them into Unicon in a clean fashion is a *big* task...

The problem with exceptions is quite often the reestablishment of effective scope. I do not believe
that it is possible for automatic processes to properly handle exceptions. However, I do believe
that exceptions are a necessary part of programming, especially in long running programs. You
might want to peruse my comments on scope and globals and persistence to see how this might
affect processes.


There's another possibility I thought of after sending my last message,
and that is to have the compiler detect when a local variable has a reference operation applied to it while it's in scope, and allocate it on the heap in that case. It could then be thrown away by the GC when its last reference disappears. This allows you to return a reference to a variable from a function, and not worry about its safety, and also eliminates the performance penalty of checking every dereference.



Not quite. There are a few 'gotchas' with this approach, though I think it's the best proposal I've seen so far.

(1) Right now, you can't put scalar values into the heap (not enough
    room in the descriptor for the mark used for mark-and-reclaim GC).
    So, you have to embed it into an internal 'heap' structure [as
    statics are now].  Once you do that, have you really gained
    anything over putting things you want references to into Unicon
    structures?

I am in favor of the explicit passing of all variables, by structure or by single variable, with no
globals.


Circumventing call by value is one use, but there are others. Being able to save and reuse lvalues from functions and generators would help work around the brilliant, but ugly hacks people have come up with. You can also imagine creating data structures of references.



Yes, that's true, but I wonder if it's really that important. After all, there are other ways to code those (admittedly less terse, but perhaps clearer in meaning...). That is, rather than trying to write these as single expressions it might just be better to go ahead and stick a few control structures in there. After all, you can hide the code in a procedure and never have to look at it again!

    procedure forceNonNegative(L, v)
         local i
         # replace every negative element of L with v, in place
         every L[i := 1 to *L] < 0 do L[i] := v
         return L
    end

I think that is a really bad suggestion. Code should be as verbose as it needs to be to be clear
and unambiguous. I used to be a really sharp IBM 360/370 Assembler programmer, able to
create some really trick code. I soon learned that writing such trick code made me its permanent
guardian, and for good reason. No one else really wanted to figure out my tricks, no matter how
well documented. With high level languages, we need to let the compiler/interpreter/? do the trick
stuff. I suppose there is some excuse for trick code in really high use libraries that are tested to
the max, but even there, subtle bugs abound, as is proved by the constant overflow bugs found
in code on almost every OS in the world.


Well, I program 90% in Java, and I certainly would make use of references there. I cringe every time I have to create a special class or array just because I want to return more than one value from a function. Other examples that motivate having references in Icon don't apply to Java, because Java isn't as rich in other ways.





With Unicon's iterators, it can handle multiple values as cleanly as any language that I have seen,
but having a function/procedure return multiple unrelated values is a violation of good programming
practice and of good logic. If a variable is to be returned, only because it must be used in bookkeeping
for the next pass of the function/procedure, then the variable should be marked as persistent within
that function/procedure, not returned with the flow of logic unless it is of interest outside the routine.
Now that I look, that is what is said in the comments that follow, so I am agreeing with the writer, here.
Back to data and program objects.


Interesting - I'm in pretty much the same boat - Java programming is
what pays my bills.  However, I've never found myself cursing Java
for that flaw (there are others...).  I wonder what the difference is?
It may be because I learned Icon before C (C was still hidden in Bell
Labs back then) and so by 'baby duck' tend not to think of things like
pointers and explicit references.  The (related) thing I curse Java
for is for not providing an explicit, clean syntax for classes without
methods - i.e. records [though I understand why they don't].

I guess I'll argue that functions that want to return more than
one value really want to return a structure - that these values *must*
be in some way related [or else computing them in a common function
is a mistake!] and should be related that way explicitly - and *not*
just for passing to a function, but throughout the program.  This
requires a shift in how one thinks about the problem being addressed.
Whether or not this shift is desirable is probably more personal taste
than anything else, of course.

Take for example, Kazimir's local_min() example.  Another approach is
to recognize that what's really wanted is a function that takes another
function f and three parameters x,y,z and computes the local minimum
based on that function and those parameters.  I see no reason to force
that local minimum back into the same variables as x,y,z, so why not:

   p := local_min(f, [x1,   x2,   x3])
   q := local_min(f, local_min(f, [p[1], p[3], x4]))
   r := local_min(f, [p[2], q[2], q[3]])
   write(q[1], r[1], r[2], r[3])

which, while not as terse as in Pascal, isn't that far off
and offers the advantage of making it easier to trace where
values are coming from!  [The duplicate 'local_min(x1,x3,x4)' in
Kazimir's example threw me for a moment...though I think it's
redundant, if local_min works as advertised.  I left it in the above
code.]  And it may well be advantageous later to have not destroyed the
original values.  (To be fair, I'd probably personally represent
the local minimum as a triple (record) rather than as a list, but that's
also personal taste.)

Maintenance of the original value should be a programmatic decision, not an enforced language
decision. Either way, the handling should be explicit. The examples above give me headaches that
I just should not have. If you are going to pass functions as first class values, you might just as
well go to the full-blown object encapsulation and orientation. Lua is a really fine language that very
compactly and technically handles most of these issues without formally being an OOP language,
but it can be very difficult to read. It does iteration, but not nearly so cleanly as Unicon. Now, I'm
no fan of the more formal object methodologies, having obtained more than a small share of
headaches trying to follow some of the involutions required to maintain "purity", but there are
lessons to be learned from it. When looking at paradigms for original and copy data, the
scatter/gather of FoxPro is one of the best.


Everett L.(Rett) Williams
[EMAIL PROTECTED]



