Re: Status of 64 picoLisp

2008-10-16 Thread Eugene




Alex !

THANK YOU for sharing your design and rationale for the "picoLisp64".

After reading your example and explanations, I fully understand and
agree with your decision to use assembler and to not use LLVM.

By sharing your roadmap for the future, you have increased
confidence in the long-term viability of picoLisp as a
streamlined, efficient and very powerful platform for the delivery of
complex applications.  It's as if you've reawakened the dreams of
Symbolics / Genera.

It would appear that you could claim portability to any CPU.  After all,
porting should only require writing a translator module for the
different instruction set.  If anyone wants JVM or CLI, then that
translator is what would need rewriting.

As for the gcc.l, as.l, and the generic call to external libraries, I
have a suggestion:

(For what follows, I have to note that my comments pertain to
Linux/BSD.  My experience with Windows and Mac OS X is too limited to
be of any benefit.)

Provide a simple plug-in mechanism.  This should only require:

        1.  Shared libraries written in the normal manner
            (dlopen/dlsym/dlclose/dlerror) and adhering to a simple set of
            conventions for picoLisp compatibility.
        2.  A manifest file for each DL which lists the entry points,
            argument types and return value type.

A call to (plug-in namespace "module.manifest") would load the
plug-in's details into the appropriate spot in the namespace.  From
that point on, those external functions would be accessible as if they
were part of picoLisp.  IMHO, a plug-in approach is cleaner than an FFI
with the corresponding collection of glue functions.
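
A minimal sketch of the manifest idea, written in Python rather than
picoLisp. The manifest format (one "name: argtypes -> rettype" entry
per line) and the names parse_manifest and plug_in are hypothetical,
invented only for this illustration; the argtypes and restype
attributes are the usual ctypes way of declaring a foreign signature.

```python
import ctypes

# Hypothetical type names a manifest may use, mapped to ctypes types.
C_TYPES = {
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "double": ctypes.c_double,
    "string": ctypes.c_char_p,
    "void": None,
}

def parse_manifest(text):
    """Parse lines like 'strlen: string -> long' into (name, argtypes, restype)."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        name, signature = line.split(":", 1)
        args, ret = signature.split("->", 1)
        argtypes = [C_TYPES[a.strip()] for a in args.split(",")
                    if a.strip() not in ("", "void")]
        entries.append((name.strip(), argtypes, C_TYPES[ret.strip()]))
    return entries

def plug_in(namespace, library, manifest_text):
    """Bind every manifest entry found in 'library' into the dict 'namespace'."""
    for name, argtypes, restype in parse_manifest(manifest_text):
        function = getattr(library, name)      # a symbol lookup when 'library' is a CDLL
        function.argtypes = argtypes
        function.restype = restype
        namespace[name] = function
    return namespace
```

With a real shared library, 'library' would be a ctypes.CDLL object
opened on a platform-specific file name, and the manifest would list
the entry points that library exports.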

Cheers,
Eugene









-- 
UNSUBSCRIBE: mailto:[EMAIL PROTECTED]


Re: Status of 64 picoLisp

2008-10-16 Thread Alexander Burger
Hi Tomas,

> > Yes, the current version of "gcc.l" will not work any longer :-(
> 
> What is the reason for this not being possible?  I thought C and asm
> can be linked together (C is compiled to asm anyway).

On the instruction level this is correct, but the calling conventions of
the (assembly) functions in the PicoLisp kernel are different. It is no
longer possible to write a built-in function in C, as built-in functions
expect their argument - and return the result - in the E register (%rbx
on x86-64). And those functions need to be able to access the rest of
the kernel for type checking, evaluation, garbage collection etc. In
addition, such a function is required to start at an address which is a
multiple of 16 plus 2, which is also difficult to achieve in C.
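
The address requirement itself is simple arithmetic. A small Python
sketch, with helper names that are invented for the illustration:

```python
def next_entry_address(address, alignment=16, offset=2):
    """Return the smallest address >= 'address' that is a multiple of
    'alignment' plus 'offset'."""
    base = (address - offset + alignment - 1) // alignment * alignment
    return base + offset

def is_valid_entry(address):
    """True if 'address' is a multiple of 16 plus 2."""
    return address % 16 == 2
```

For example, next_entry_address(3) gives 18, and an address that is
already valid is returned unchanged.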

As I wrote to Konrad, it might be possible to "design a similar mechanism,
writing inline assembly instead of C, and calling 'as' instead of
'gcc'", or invent other useful things.

We should not worry about that too much now :-)

Cheers,
- Alex


Re: Status of 64 picoLisp

2008-10-16 Thread Tomas Hlavaty
Hi Alex,

> Yes, the current version of "gcc.l" will not work any longer :-(

What is the reason for this not being possible?  I thought C and asm
can be linked together (C is compiled to asm anyway).

Cheers,

Tomas


Re: Status of 64 picoLisp

2008-10-16 Thread Alexander Burger
On Thu, Oct 16, 2008 at 10:24:40PM +1100, konrad Zielinski wrote:
> And now back to questions about 64 bit picolisp:
> 
> Is switching to an assembler going to mean the demise of gcc.l ?
> Are we going to see inline picoLisp Assembler instead?

Yes, the current version of "gcc.l" will not work any longer :-(

It is possible to design a similar mechanism, writing inline assembly
instead of C, and calling 'as' instead of 'gcc'. But I'm not sure if
this is the right way to go.

In any case, I have a concept of a generic call to C functions in
arbitrary external libraries.

And I'm sure we will have plenty of other ideas when the time comes ;-)


> How is a new assembler based version of Picolisp going to affect 32
> bit platforms? I imagine they are going to be around for quite a
> number of decades yet?

Yes, I don't see 3.0 in production use in the near future, and I'll
support both versions as long as they are needed. I probably cannot
switch all my customers to another version anyway.

Cheers,
- Alex


Re: Status of 64 picoLisp

2008-10-16 Thread Alexander Burger
Hi Tomas,

> I was curious to try picolisp bignums and must say that for somebody
> doing anything serious, it is probably rather inefficient.  As a

I'm aware of that. The bignum implementation was not intended to be
particularly fast, or - to put it correctly - cannot be expected to be
very fast because numbers are implemented as linked lists instead of
arrays.

I believe the advantage of a single internal structure (the cell)
outweighs the speed disadvantage. Short numbers are sufficiently fast,
but when numbers grow to a length of hundreds of cells, the speed will
probably go down with the square of the length. For practical programs,
I never experienced any bottlenecks due to arithmetic processing.
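
To see where quadratic growth of this kind comes from, here is a rough
Python model. It is not PicoLisp's actual code: a number is modelled as
a chain of digits, least significant first, and schoolbook
multiplication touches every pair of digits. The 64-bit digit size and
all names are assumptions made for the sketch.

```python
BASE = 1 << 64  # assumed digit size: one 64-bit word per cell

def to_cells(n):
    """Split a non-negative int into a list of digits, least significant first."""
    cells = [n % BASE]
    n //= BASE
    while n:
        cells.append(n % BASE)
        n //= BASE
    return cells

def from_cells(cells):
    """Rebuild the int from its digit list."""
    n = 0
    for digit in reversed(cells):
        n = n * BASE + digit
    return n

def multiply(a, b):
    """Schoolbook multiplication; returns (product cells, digit products done)."""
    result = [0] * (len(a) + len(b))
    steps = 0
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            total = result[i + j] + x * y + carry
            result[i + j] = total % BASE
            carry = total // BASE
            steps += 1
        result[i + len(b)] += carry
    while len(result) > 1 and result[-1] == 0:
        result.pop()                      # strip leading zero digits
    return result, steps
```

The step count is len(a) * len(b), so doubling the length of both
operands quadruples the work.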

Raw arithmetic speed is not of such importance in Lisp as in other
languages, where most primitive operations (like array index
calculations) depend on it.


> Wouldn't it be better to use the gmp library for bignums if they are
> going to be supported?

For an application that needs the speed, why not? It is just a lot more
trouble to support and use it.


> What is the reason picolisp has bignums in the first place?  Do

Being "unlimited". One of the main goals of PicoLisp was that the
programmer should never have to think about size limits (see
"doc/ref.html#intro").


> you/somebody else use them for anything?  Wouldn't it be simpler and

I did need it for the RSA library "lib/rsa.l" when I used the Java
Applet API.

Anyway, on a 32-bit system, 10 digits for short numbers are not enough
for useful work.

> good enough on 64 bit systems not to have them at all?

Even on 64 bits I do not want to have to remember not to cross the
18-digit border. With scaled fixed-point arithmetic this can easily
happen.
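
A quick Python illustration of how a scaled fixed-point product crosses
that border. It assumes a 60-bit magnitude for short numbers (the figure
given for the 64-bit version further down) and a scale of six decimal
places, which is chosen only for the example.

```python
SHORT_BITS = 60          # magnitude bits of a short number in the 64-bit version
SCALE = 10 ** 6          # assumed fixed-point scale: six decimal places

def fits_short(n):
    """True if the magnitude of n fits into a short number."""
    return abs(n) < (1 << SHORT_BITS)

def fixed_mul(a, b):
    """Multiply two scaled integers; returns (result, raw intermediate product)."""
    raw = a * b
    return raw // SCALE, raw

value = 1_234_567_891            # 1234.567891 as a scaled integer
result, raw = fixed_mul(value, value)
```

Here both 'value' and 'result' fit into a short number, but the
intermediate product 'raw' has 19 digits and does not.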

In addition, the bignum structures are needed internally anyway, as the
names of symbols are technically also bignums.


In the 64 bit system, the CDR part of a number cell is now used as well,
which is more space-efficient. That is, a number is either a short number
(60 bits + sign), or a cell with 64 bits in the CAR and a number in the
CDR.
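
Read literally, that layout can be modelled as follows. This is a
Python sketch of the description only: the word order (least
significant word first) and the omission of the sign are assumptions
of the sketch, not something stated above.

```python
SHORT_LIMIT = 1 << 60    # short numbers hold 60 bits (sign ignored here)
WORD = 1 << 64           # a number cell holds 64 bits in its CAR

def encode(n):
    """Encode a non-negative int as a short number or a chain of cells.

    A cell is modelled as a (car, cdr) tuple whose cdr is again a number."""
    if n < SHORT_LIMIT:
        return n
    return (n % WORD, encode(n // WORD))

def decode(number):
    """Inverse of encode()."""
    if isinstance(number, tuple):
        car, cdr = number
        return car + WORD * decode(cdr)
    return number
```

In this model encode(5) stays the short number 5, while encode(1 << 64)
becomes the cell (0, 1).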


> What impact will the asm rewrite have on interfacing with foreign
> libraries?

It will still be possible to call external libraries. And also
necessary, as the Posix system interface is used just as before, for
I/O, memory and process management, networking etc. It is only the part
that was written in C until now that is replaced.

Cheers,
- Alex


Re: Status of 64 picoLisp

2008-10-16 Thread konrad Zielinski
Hi All,

It would appear that setting up a chroot is remarkably easy, well,
under Debian anyway; I can't speak for other distros as I haven't
tried. And it seems to work quite nicely too, even if there are
other ways to do it.

Anyway now that I have a chroot I can install a version of Firefox
with flash as well : )




Disclaimer: I have not examined the Low Level Virtual Machine in any
detail. The following is a general observation about VMs.
For all I know the people behind LLVM have done things better.

Some languages are simply sufficiently different that they do not work
well on some virtual machines. Attempting to implement Erlang on the
JVM comes to mind. The underlying primitives are different.

Often what "generic virtual machine" means is "good for any language
sufficiently close to C", i.e. statically typed etc. Lisp is not descended
from C and makes somewhat different assumptions about a lot of things,
so it may not be a good fit.



And now back to questions about 64 bit picolisp:

Is switching to an assembler going to mean the demise of gcc.l?
Are we going to see inline picoLisp Assembler instead?

How is a new assembler based version of Picolisp going to affect 32
bit platforms? I imagine they are going to be around for quite a
number of decades yet.

Regs

Konrad



Re: Status of 64 picoLisp

2008-10-16 Thread Tomas Hlavaty
Hi Alex,

thanks for the explanation.

I was curious to try picolisp bignums and must say that for somebody
doing anything serious, it is probably rather inefficient.  As a
benchmark, I tried the example from
http://paste.lisp.org/display/15116

(setq X 0)
(setq Y 1)
(for (N 2 (<= N 100) (inc N))
   (let Z (+ X Y)
      (setq X Y)
      (setq Y Z)))
(prinl Y)

Very rough results using picolisp native bignums:

(<= N 1)

$ time ~/picolisp/p gmp-test2.l -bye > gmp-test2.log

real  0m0.131s
user  0m0.124s
sys   0m0.008s

(<= N 10)

$ time ~/picolisp/p gmp-test2.l -bye > gmp-test2.log

real  0m10.190s
user  0m10.157s
sys   0m0.008s

(<= N 100)

$ time ~/picolisp/p gmp-test2.l -bye > gmp-test2.log
  C-c C-cKilled

real  17m58.856s
user  17m51.687s
sys   0m5.572s

(killed after 18 mins!)

The original C program:

$ time ./gmp > gmp.log

real  0m50.060s
user  0m50.059s
sys   0m0.004s

I wrote a simple FFI wrapper for the gmp library, with these results:

$ time ../../p gmp-test.l -bye > gmp-test.log

real  0m50.507s
user  0m50.239s
sys   0m0.248s

using the following code:

(setq X (mpz_new))
(setq Y (mpz_new))
(mpz_init X)
(mpz_init Y)
(mpz_set_ui X 0)
(mpz_set_ui Y 1)
(setq Z (mpz_new))
(for (N 2 (<= N 100) (inc N))
   (mpz_init Z)
   (mpz_add Z X Y)
   (mpz_set X Y)
   (mpz_set Y Z)
   (mpz_clear Z))
(mpz_print Y)
(prinl)

Wouldn't it be better to use the gmp library for bignums if they are
going to be supported?

What is the reason picolisp has bignums in the first place?  Do
you/somebody else use them for anything?  Wouldn't it be simpler and
good enough on 64 bit systems not to have them at all?

What impact will the asm rewrite have on interfacing with foreign
libraries?

Cheers,

Tomas


Re: Status of 64 picoLisp

2008-10-16 Thread Jakob
> Hi Jakob,
>
> while LLVM is surely an interesting project, I think it would by far be
> overkill for what we need here.

Indeed. What you are doing is very interesting by the way.

// Jakob




Re: Status of 64 picoLisp

2008-10-16 Thread Alexander Burger
Hi Tomas,

> I guess that miniPicoLisp is not 64 bit incarnation of future
> picoLisp-3 then?

Right, it is rather limited, and more of a design study.


> What are the reasons for a) complete rewrite and b) switching from C
> to asm?

As Konrad noted, there is - due to the doubled word size - an additional
tag bit available, which allows things I wanted but could not do on the
32bit platform. This results in a slightly different tag layout, with
relatively large consequences for the overall data structures.

This alone would not cause a "complete rewrite", though. The trigger was
the decision to switch back to assembly language. The first versions of
PicoLisp were written in various assembly languages, more than 20 years
ago, and in fact some things for such an interpreter are even easier to
do in assembly than in C.

I switched to C back then only for portability reasons, and now I hope
to solve that with the generic assembler described in another post.


Why is assembly better for writing a PicoLisp interpreter, and why are
some things even easier?

In C you have to go through a lot of trouble to directly manipulate the
stack, for example. As you don't know in advance how long an evaluated
list will be, you want to 'push' things on the stack in a loop. Or you
need to align the stack pointer to certain boundaries so that the
pointer tags reflect the desired data type. Unwinding the stack is
tedious, possible only with longjmp() and a lot of overhead. In assembly
this can be reduced to simple stack pointer arithmetic.

Or you don't have access to the CPU flags (the 'carry') for
multi-precision arithmetic.
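
For illustration, a Python model of multi-precision addition in which
the carry has to be computed by hand, much as it would be in C. With
direct access to the carry flag the same loop is typically one
add-with-carry instruction per word. The 64-bit word size and the
function name are assumptions of the sketch.

```python
MASK = (1 << 64) - 1     # one machine word

def add_words(a, b):
    """Add two equally long lists of 64-bit words, least significant first.

    Returns (sum words, final carry)."""
    result = []
    carry = 0
    for x, y in zip(a, b):
        total = x + y + carry
        result.append(total & MASK)   # low 64 bits stay in this word
        carry = total >> 64           # the bit the CPU would leave in the carry flag
    return result, carry
```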

Then I want to guarantee odd things, like that certain instructions
start at addresses in memory which are a multiple of 16 plus 2.

And assembly also has other advantages, like multiple arguments and
return values in registers, the possibility to directly return status
flag return values from functions, or have multiple function entry
points.

But the knock-out criterion for me was simply that 64bit C compilers do
not support a number type of 128 bits, as needed, for example, for the
result of 64bit multiplications. This is really stupid. They defined
'long' as 64 bits, and 'long long' too!
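
Without a 128-bit type, the full product has to be assembled from 32-bit
halves so that no intermediate value exceeds 64 bits. A Python sketch of
that workaround follows; Python integers are unbounded, so the masks
only mimic the word sizes, and the function name is invented for the
example.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def mul_64x64(a, b):
    """Multiply two unsigned 64-bit values using only 32-bit halves.

    Returns (high word, low word) of the 128-bit product."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    # Four partial products, each at most 64 bits wide.
    lo_lo = a_lo * b_lo
    hi_lo = a_hi * b_lo
    lo_hi = a_lo * b_hi
    hi_hi = a_hi * b_hi

    # Bits 32..63 of the result, plus whatever spills into the high word.
    middle = (lo_lo >> 32) + (hi_lo & MASK32) + (lo_hi & MASK32)
    low = ((middle & MASK32) << 32) | (lo_lo & MASK32)
    high = hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (middle >> 32)
    return high & MASK64, low
```

On x86-64 the hardware multiply already delivers both halves of the
product in a register pair, which is what the assembly version can
use directly.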

Cheers,
- Alex


Re: Status of 64 picoLisp

2008-10-16 Thread Alexander Burger
On Wed, Oct 15, 2008 at 10:27:59PM +0100, Tomas Hlavaty wrote:
> your 64 bit Linux fine.  You'll need to add the -m32 option to gcc.l

.. or download the latest "testing" version.


Re: Status of 64 picoLisp

2008-10-16 Thread Alexander Burger
Hi Jakob,

while LLVM is surely an interesting project, I think it would by far be
overkill for what we need here.

What I'm doing is extremely simple. I defined a (mostly) single-address
assembly language, with a rather basic set of instructions which I
believe can easily be mapped to differing architectures. At least, there
are no assumptions about arity, register sets or instruction formats.

And, most important, it is quite readable (as opposed to most current
assembly languages, including LLVM)!


Though I think it is still too early to publish, let me show you two
examples from the current sources ('car' and 'if').

   # (car 'lst) -> any
   (code 'doCarE_E 2)
      push X
      ldX E
      ldE (E CDR)  # Evaluate CADR
      ldE (E)
      eval
      num? E  # Check list
      jnz lstErrEX
      ldE (E)  # Take CAR
      pop X
      ret

On x86-64 this expands to:

      .balign 16
      nop
      nop
      .global doCarE_E
   doCarE_E:
      pushq %r13
      movq %rbx, %r13
      movq 8(%rbx), %rbx
      movq (%rbx), %rbx
      test $0x06, %bl
      jnz 2f
      test $0x08, %bl
      jnz 1f
      call evListE_E
      jmp 2f
   1:
      movq (%rbx), %rbx
   2:
      testb $0x06, %bl
      jnz lstErrEX
      movq (%rbx), %rbx
      popq %r13
      ret


'if' is a little longer:

   # (if 'any1 'any2 . prg) -> any
   (code 'doIfE_E 2)
      ldE (E CDR)  # Body
      push (E CDR)  # Push rest
      ldE (E)  # Get condition
      eval  # Eval condition
      nil? E
      if ne  # Non-NIL
         stE (AtSym)
         pop E  # Get rest
         ldE (E)  # Consequent
         eval/ret
      end
      xchX (S)  # Get rest in X
      ldX (X CDR)  # Else
      do
         ldE (X)
         ldX (X CDR)
         eval
         atom? X  # Atom?
      until nz  #  Yes
      pop X
      ret

with this on x86-64:

      .balign 16
      nop
      nop
      .global doIfE_E
   doIfE_E:
      movq 8(%rbx), %rbx
      pushq 8(%rbx)
      movq (%rbx), %rbx
      test $0x06, %bl
      jnz 2f
      test $0x08, %bl
      jnz 1f
      call evListE_E
      jmp 2f
   1:
      movq (%rbx), %rbx
   2:
      cmpq $Nil, %rbx
      jz .125
      movq %rbx, AtSym
      popq %rbx
      movq (%rbx), %rbx
      test $0x06, %bl
      jnz ret
      test $0x08, %bl
      jz evListE_E
      movq (%rbx), %rbx
      ret
   .125:
      xchg %r13, (%rsp)
      movq 8(%r13), %r13
   .126:
      movq (%r13), %rbx
      movq 8(%r13), %r13
      test $0x06, %bl
      jnz 2f
      test $0x08, %bl
      jnz 1f
      call evListE_E
      jmp 2f
   1:
      movq (%rbx), %rbx
   2:
      testb $0x0E, %r13b
      jz .126
      popq %r13
      ret
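
In the two expansions, 'eval', 'num?' and 'atom?' all come down to
testing the low bits of a pointer. A Python paraphrase follows, with the
masks taken from the listings. The reading of each case (number, symbol,
cell) is inferred from the listings and their comments, and the names
are invented for the sketch.

```python
NUM_MASK = 0x06    # 'num? E'  -> testb $0x06, %bl
SYM_MASK = 0x08    # second test inside the expanded 'eval'
ATOM_MASK = 0x0E   # 'atom? X' -> testb $0x0E, %r13b

def classify(word):
    """Classify a 64-bit item by its low tag bits, in the order 'eval' tests them."""
    if word & NUM_MASK:
        return "number"      # jnz 2f: evaluates to itself
    if word & SYM_MASK:
        return "symbol"      # jnz 1f: load the value with movq (%rbx), %rbx
    return "cell"            # fall through: call evListE_E

def is_atom(word):
    """True for numbers and symbols, i.e. anything that is not a cell."""
    return bool(word & ATOM_MASK)
```

Under this reading, an address that is a multiple of 16 plus 2 has bit
0x02 set and therefore falls into the first case.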


The mapping of the individual instructions is rather straightforward. On
the x86 architecture, most of them expand to a single target
instruction.
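
As a toy illustration of that mapping, here is a Python sketch that
translates a handful of the generic instructions used in 'doCarE_E'
into the x86-64 lines shown above. The register assignments (E to %rbx,
X to %r13, S to %rsp) and the CDR offset of 8 are read off the
listings; the translator itself is invented for the example and is not
how the real assembler is written.

```python
REGISTERS = {"E": "%rbx", "X": "%r13", "S": "%rsp"}   # as seen in the listings
OFFSETS = {"CDR": 8}                                   # 'ldE (E CDR)' -> 8(%rbx)

def operand(source):
    """Translate a source operand: 'E', '(E)' or '(E CDR)'."""
    if source.startswith("("):
        parts = source.strip("()").split()
        offset = OFFSETS[parts[1]] if len(parts) > 1 else ""
        return f"{offset}({REGISTERS[parts[0]]})"
    return REGISTERS[source]

def translate(line):
    """Translate one generic instruction into one x86-64 instruction."""
    line = line.split("#", 1)[0].strip()       # drop the comment
    mnemonic, _, argument = line.partition(" ")
    argument = argument.strip()
    if mnemonic == "ret":
        return "ret"
    if mnemonic in ("push", "pop"):
        return f"{mnemonic}q {operand(argument)}"
    if mnemonic.startswith("ld"):              # 'ldX E': destination is in the name
        return f"movq {operand(argument)}, {REGISTERS[mnemonic[2:]]}"
    raise ValueError(f"unsupported instruction: {line}")
```

Instructions such as 'eval' and 'num?' are left out on purpose, since
they expand to more than one target instruction.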

The machine register set is defined as:

   +---+---+---+---+---+---+---+---+
   |             A             | B |  \      [A]ccumulator
   +---+---+---+---+---+---+---+---+   D     [B]yte register
   |               C               |  /      [C]ount register
   +---+---+---+---+---+---+---+---+         [D]ouble register
   |               E               |         [E]xpression register
   +---+---+---+---+---+---+---+---+


   +---+---+---+---+---+---+---+---+
   |               X               |         [X] Index register
   +---+---+---+---+---+---+---+---+         [Y] Index register
   |               Y               |         [Z] Index register
   +---+---+---+---+---+---+---+---+
   |               Z               |
   +---+---+---+---+---+---+---+---+


   +---+---+---+---+---+---+---+---+
   |               L               |         [L]ink register
   +---+---+---+---+---+---+---+---+         [S]tack pointer
   |               S               |
   +---+---+---+---+---+---+---+---+


   +-------------------------------+
   |  [z]ero    [s]ign    [c]arry  |         [F]lags
   +-------------------------------+

   Source Addressing Modes:
      ldA 1234            # Immediate
      ldA R               # Register
      ldA Label           # Direct
      ldA (R)             # Indexed
      ldA (R 8)           # Indexed with offset
      ldA (R OFFS)
      ldA (Global)        # Indirect
      ldA (Global OFFS)   # Indirect with offset

   Destination Addressing Modes:
      stA (Global)        # Indirect
      stA (Global OFFS)   # Indirect with offset
      ldA R               # Register
      stA (R)             # Indexed
      stA (R 8)           # Indexed with offset
      stA (R OFFS)

   Target Addressing Modes:
      jmp 1234            # Absolute
      jmp Label
      jmp (R)             # Indexed
      jmp (Global)        # Indirect


The whole thing is so simple foremost because there is only a single word
size (i.e. 64 bit) for all instructions, with the exception of the 'B'
register for byte operations.

The instructions take the form of

   ldA something

instead of

   ld A, something

The reason for this is to have

Re: Status of 64 picoLisp

2008-10-16 Thread Jakob
> No matter how efficient or clever a virtual machine, it still
> requires additional steps in order to perform useful work.  So
> there are really three "efficient" approaches to consider:
>
>  1.   Accept that we have a ubiquitous x86(-64) mono-culture and primarily
> target that.


I use PPC and ARM quite often, we almost went with Coldfire once, I know
people working with MIPS and OpenRisc.


>  3.   Recognise that established compilers do the work of optimisation,
> etc very well.  In which case use the "universal" assembler as provided
> by the C compiler as our target.  Consider the speed at which compilers
> like tinyCC do their work and the high levels of optimisation provided by
> GNU C and Intel C compilers.


I probably should have mentioned that the LLVM project is not only a
virtual machine, but an optimizing compiler from "assembler" to native
machine code. Tinycc compiles fast, but does not produce very fast machine
code. (It also is mostly limited to x86.) GCC is huge.

>
>  It's interesting to note that Squeak (the latest incarnation of
> Smalltalk) was ported using a subset of Smalltalk called "Slang" in a
> similar manner to Alex's approach of using a generic assembler written in
> picoLisp.

And PyPy is Python ported to a subset of Python. Recently they have seen
the light and now have not only a C backend, but also an LLVM backend.


>  OOPSLA'96 - "Back to the Future"
>
>  This might be a superfluous comment, but I remember seeing somewhere an
> assembler notation which was in effect a form of s-expr.

It might not be s-expr, but LLVM assembler is very clean and is made to
ease analysis and optimisation. I recommend:
http://llvm.org/docs/LangRef.html it is a very fun read.


About the comment made by Konrad Zielinski earlier in the thread,
"I suspect this would be in contradiction to some of the stated goals of
PicoLisp", I can only say that there need not be a contradiction. The
generic assembler can output LLVM IR as well as x86_64 code and ARM. It is
"just" another target.


regards,
Jakob

