[EMAIL PROTECTED] wrote:
> 
> Hi, having some knowledge about the dark-side of software-protection
> I want to throw in some points (BTW: I support the idea of being able
> to 'hide' some parts of a Rebol project to create a commercial market
> for Rebol applications).
>

Hi, Robert!  I think you're right on target with that observation.
(As for what follows, let me state clearly that I have no privileged
inside knowledge of how REBOL is actually implemented.  ;-)

> 
> - if Rebol is holding a decoded plain-text version in memory,
> forget about it, it takes perhaps 60 seconds to find the decrypted
> stuff. This is quite likely, how else should the source command
> work?
>

It doesn't have to hold plain text for 'source to work.  All data
values have printable representations, and the printable
representation of a word (the word itself, not any value it may
refer to) is its name (as a string).  As evidence, let me offer:

    >> test: [{this} {is} {a} {test}]
    == ["this" "is" "a" "test"]
    >> source test
    test: ["this" "is" "a" "test"]
    >> othertest: "this is a very long string that should trip over
    the length heuristic which decides when to convert from quotes
    to braces"
    == {this is a very long string that should trip over the length
    heuristic which decides when to convert from quotes to braces}

Choice of string delimiters appears to be based on the length of the
string, not on which delimiter was used when the string was
initialized.

    >> body: [
        to-set-word to-string to-char 65 (2 + 2)
        'print to-word to-string to-char (5 * 13)
    ]
    == [to-set-word to-string to-char 65 (2 + 2)
    'print to-word to-string to-char (5 * 13)]

    >> body: reduce body
    == [A: 4 print A]

This leaves 'body referring to a block that had no plain-text
source code at all.  Ever.  Yet 'source can manufacture a printable
version of it (using 'mold).

    >> source body
    body: [A: 4 print A]
    >> do body
    4
    >> source source
    source: func [
        "Print the source code for a word"
        'word [word!]
    ][
        prin join "" [word ": "]
        if not value? word [print "undefined" exit]
        either any [native? get word op? get word action? get word] [
            print ["native" mold third get word]
        ] [print mold get word]
    ]
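
To make that concrete, here is a minimal sketch (the words 'blk and
'txt are just names I picked for illustration).  The block is built
from computed values, and its text form only comes into existence
when 'mold is asked for it:

    blk: reduce [to-set-word "A" 2 + 2]  ; built from values, no typed source
    txt: mold blk                        ; text created on demand: "[A: 4]"
    print txt

This shows only that a printable form can be generated from the
values; it says nothing definite about how REBOL stores them.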

> 
> - if Rebol is holding a byte-code representation (BTW: then Rebol
> should be compilable) I'm sure a reverse-engineering of the byte-code
> should be possible (I expect it to be very clear if it's done by
> Carl). But this won't be needed at all if the source command re-
> creates source-code. Allocate a binary block big enough to hold your
> decrypted byte-code, replace this block with the byte-code and
> execute source...
> 

No byte code is needed either.  Everything I've seen indicates
that REBOL represents blocks internally as linked data structures.
Consider this:

    >> test: func [a b /local c] [c: a * b  return c * c - a + b]
    >> test 1 2
    == 5
    >> test 2 3
    == 37
    >> test 3 4
    == 145
    >> body: second :test
    == [c: a * b return c * c - a + b]
    >> bb: next next next body
    == [b return c * c - a + b]
    >> change/only bb to-paren reduce [bb/1 to-word "+" 1]
    == [return c * c - a + b]
    >> bb: next body
    == [a * (b + 1) return c * c - a + b]
    >> change/only bb to-paren reduce [bb/1 to-word "+" 1]
    == [* (b + 1) return c * c - a + b]
    >> source test
    test: func [a b /local c][c: (a + 1) * (b + 1) return c * c - a + b]
    >> test 1 2
    == 37
    >> test 2 3
    == 145
    >> test 3 4
    == 401

The point is that we can manipulate the data structure that serves as
the body of 'test, just as we can any block, and then turn around and
"execute" it.  Now, if in-memory data structures really ARE made out
of pointers (especially if memory management can shuffle things around
during garbage collection) then reverse engineering source code from a
memory dump could be significantly harder than disassembling byte-code.
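
A small sketch of the same idea, using two words I made up ('a and
'b).  Both refer to one block, so a change made through either is
visible through the other.  This does not prove anything about the
internals, but it is consistent with words referring to shared
structures rather than holding their own text:

    a: [1 2 3]
    b: a             ; no copy is made; b refers to the same block
    append b 4       ; modify the block through b
    print same? a b  ; true
    print mold a     ; [1 2 3 4]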

>
> Perhaps a better idea then would be to pre-compile blocks of Rebol
> source into a binary format. This format can then be executed as is,
> be placed in the source-code. It could be blocked like:
> 
> hide
> [
>         ... your source to hide ...
> ]
> 
> If executed with do/transform %myscript.r a new output file will
> be written, which now contains the binary parts, perhaps like
> 
> hidden
> [
>         #{339392232A...}
> ]
> 
> Of course you can reverse engineer this too but no protection
> is safe...
>

But storing the decrypted representation reduces the safety level
SIGNIFICANTLY.

A key part of the proposal I made was that the decrypted block never
exists anywhere except in memory.  I propose NO way to save the
block itself, not even as the value of another word in the running
copy of REBOL.  That approach trades time for security, as
'do-crypt would have to decrypt the encrypted block every time it
executes.  However, with no means for storing the decrypted block,
the only way to get its decrypted content is to dump memory.

If the last thing that 'do-crypt does is destroy its private
references to the decrypted block (e.g., by setting them to 'none),
then there would be no way to examine a memory dump and find a
pointer trail from known data structures to the decrypted block's
content, unless the dump were taken DURING the execution of
'do-crypt.
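
As a rough sketch only, 'do-crypt might look something like the
following.  Note the assumptions: 'decompress merely stands in for a
real decryption routine (it offers no secrecy at all), and the words
'data, 'code and 'hidden are names chosen for this example.

    do-crypt: func [
        "Sketch: decode a hidden block, run it, drop the reference"
        data [binary!]
        /local code
    ][
        code: load decompress data  ; stand-in for real decryption
        do code                     ; decoded block exists only during this call
        code: none                  ; destroy the private reference
    ]

    hidden: compress "print 1 + 2"  ; stand-in for an encrypted script
    do-crypt hidden

A native implementation would be needed to keep the decoding step
itself out of view; this only illustrates the decode, run, discard
sequence.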

>
> IMO byte code is a good way to hide such stuff and should give a
> considerable protection level.
> 

I disagree.  Byte-code and conventional CPU object code both have
the property that opcode values don't change between uses.  But in
the case of dynamic pointer-based memory structures (especially in
the presence of garbage collection) you have the possibility that
a data structure may be "here" one moment and "there" the next.
It can be understood only by chasing down the pointer trail, and
once that's broken, all bets are off.

-jn-
