[REBOL] Re: Shamless request for improving function speed.

2002-03-28 Thread Joel Neely

Hi, Alan,

alan parman wrote:
 
 This group is really good at tweaking functions to make them
 faster.


While I have absolutely nothing to say on the subject of
cryptography ;-) your email reminded me of an interesting
question in REBOL data structure manipulation that I had
thought about some time ago.  So I dusted off a few ideas
that I had left on a mental shelf and did a little bit of
benchmarking...

The REBOL data structure manipulation question is this:

What is a fast way to iterate through two series
values in parallel, cycling the shorter one as
needed, and producing a result that is some function
of corresponding pairs of values, one from each
series?

Interestingly enough, the fastest way seems to be to make
a new series, rather than updating the longer one in place,
as shown by the following test cases:

CYCLES0 - uses FOREACH to iterate across the first (assumed
  to be the longer) argument, and explicitly plays
with the second series to cycle it through its positions.

CYCLES1 - uses FOREACH on the first/longer series, but uses
  explicit indexing on the second/shorter one.  The
mod-and-add-one trick cycles an integer counter through
the set of values from one to the modulus.

CYCLES2 - uses explicit indexing on both series arguments.

All of the above variations create a new series to hold the
result, so...

CYCLES3 - modifies the first/longer series in place, rather
  than taking up the storage for a modified copy.


8--begin code--
cycles0: func [b0 [block!] b1 [block!] /local r] [
r: make block! length? b0
foreach a b0 [
insert tail r a + b1/1
if empty? b1: next b1 [b1: head b1]
]
r
]

cycles1: func [b0 [block!] b1 [block!] /local j y r] [
r: make block! length? b0
y: length? b1
j: 1
foreach a b0 [
insert tail r a + b1/:j
j: j // y + 1
]
r
]

cycles2: func [b0 [block!] b1 [block!] /local i j x y r] [
r: make block! length? b0
x: length? b0
y: length? b1
i: j: 1
while [i = x] [
insert tail r b0/:i + b1/:j
i: i + 1
j: j // y + 1
]
r
]

cycles3: func [b0 [block!] b1 [block!] /local j y] [
y: length? b1
j: 1
forall b0 [
change b0 b0/1 + b1/:j
j: j // y + 1
]
b0
]
8---end code---

The timings went as follows (with line-wrapping for email
formatting):

8---begin transcript---
 x: make block! 50
   for i 0 49 1 [insert tail x i]
   print length? x
50

 y: make block! 10
   for i 0 9 1 [insert tail y i]
   print length? y
10

 t: now/time/precise z: cycles0 x y now/time/precise - t
== 0:00:30.65

 t: now/time/precise z: cycles1 x y now/time/precise - t
== 0:00:25.43

 t: now/time/precise z: cycles2 x y now/time/precise - t
== 0:00:33.28

 t: now/time/precise z: cycles3 x y now/time/precise - t
== 0:00:40.86

 copy/part z 50
== [0 2 4 6 8 10 12 14 16 18 10 12 14 16 18 20 22 24 26 28
20 22 24 26 28 30 32 34 36 38 30 32 34 36 38 40 42 44 46 48
40 42 44 46 ...

8end transcript

Note that the longer series was made long enough to really
require some time (on a slower box, at least...), and the
shorter series was made short enough that its manipulation
was stressed (and so that the result could be examined by
eye for correctness).

My conclusions (based on preliminary testing only, the usual
disclaimers about benchmarks apply...) are as follows:

-  Explicit indexing wins over series manipulation for the
   shorter series.

-  WHILE loses to FOREACH (duh.  included for completeness).

-  Preallocation is already known to be necessary for speed.

-  Constructing a new series wins over modifying the original
   series in place.

Of course, these tests were performed using blocks, so it is
only an assumption that the same performance trade-offs would
apply to other series types.

-jn-

-- 
; Joel Neely joeldotneelyatfedexdotcom
REBOL [] do [ do func [s] [ foreach [a b] s [prin b] ] sort/skip
do function [s] [t] [ t:  foreach [a b] s [repend t [b a]] t ] {
| e s m!zauafBpcvekexEohthjJakwLrngohOqrlryRnsctdtiub} 2 ]
-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-28 Thread rebol-list

HK Please do not post source code for cryptographic functions to the mailing
HK list. This list is hosted in the US and some subscribers are outside of
HK the US, so posting source code constitutes a violation of US export laws
HK on cryptography.

reading lines like this one always reminds me that for some people we
are second class persons living on this planet [just because our
ancestors were not killing aborigines in America or Australia?]

I now that you just had to write this mail as I wanted to write this
one...

Oldes
-
-[you never know who is observing you!]


-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-28 Thread alan parman

Re: Improving function speed (no code)

Thanks Ryan, and Ladislav, for your help with improving function speed.
After removing one 'append from ARCFOUR,
and changing the code such that the series is modified in place I increased speed by 
16-17%.
After replacing all the 'for in ARCFOUR with 'loop, I gained another 15% on top of 
that.

Overall function execution time was decreased by a solid 28% !

I used FX5's Profiler to do my time tests. Great Proggie!

Too bad I can't show you the improved code... ;^/

An interesting phenom: for iterations (k) over 1000, 'loop beats 'for always,
but under 1000 it's a toss-up which one is faster.
(I tried the following... 
cp-orig: func [string][for n 1 k 1 [xor #1 #a]]
cp-new: func [string][loop k [xor #1 #a]]
)
It does not seem to be related to some 1k (1024) barrier.


PS: I've just read Joel's excellent analysis 
and now I may have to try constructing a new series instead of mod-in-place!



I still can't get rid of one 'append though ...
...
s: j: i: 0 
state: make block! 256 
loop 256 [append state s s: s + 1]
...

-- 

___
Sign-up for your own FREE Personalized E-mail at Mail.com
http://www.mail.com/?sr=signup

Win the Ultimate Hawaiian Experience from Travelocity.
http://ad.doubleclick.net/clk;4018363;6991039;n?http://svc.travelocity.com/promos/winhawaii/

-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-28 Thread Joel Neely

Hi, Alan,

alan parman wrote:
 
 Re: Improving function speed (no code)
 
...

 I still can't get rid of one 'append though ...
 ...
 s: j: i: 0
 state: make block! 256
 loop 256 [append state s s: s + 1]
 ...
 

It *can* be sped up slightly (saving a millisecond a month
somewhere ;-)

b: make block! n  i: 0   t0a: now/time/precise
loop n [append b i i: i + 1] t0b: now/time/precise

b: make block! n  i: -1  t1a: now/time/precise
loop n [append b i: i + 1]   t1b: now/time/precise

b: make block! n  i: -1  t2a: now/time/precise
loop n [insert tail b i: i + 1]  t2b: now/time/precise

print [t0b - t0a t1b - t1a t2b - t2a]

The first version is similar to yours, the second eliminates
the doubled use of the counter, and the third uses the well-
known APPEND == INSERT TAIL optimization.  Timing the above
five times for N of one million yields these results:

0:00:04.687 0:00:04.507 0:00:02.113
0:00:04.707 0:00:04.547 0:00:02.143
0:00:04.677 0:00:04.526 0:00:02.113
0:00:04.687 0:00:04.536 0:00:02.113
0:00:04.656 0:00:04.586 0:00:02.113

Most of the savings, as expected, results from the second
optimization, but avoiding the double-dipping does have
a measurable value.

-jn-
-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-28 Thread Ryan Cole

Wow Joel, I was blown away when I seen 'foreach is cruise through that
so fast.  They must have improved it along the way.  Anyhow, here are
some optomized versions...

; Really fast!
hyper-cycles: func [b0 [block!] b1 [block!] /local j y r] [
r: copy b0
y: length? b1
j: 1
foreach a b0 [
r: change r a + pick b1 j
j: j // y + 1
]
head r
]

; Less memory consuming, yet still fast...
lean-cycles: func [b0 [block!] b1 [block!] /local j y] [
y: length? b1
j: 1
repeat B0-I length? b0 [
poke b0 B0-I b0/:B0-I + b1/:j
J: J // y + 1
]
b0
]

--Ryan

-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-28 Thread Andrew Martin

Alan wrote:
 I still can't get rid of one 'append though ...
 ...
 s: j: i: 0
 state: make block! 256
 loop 256 [append state s s: s + 1]

Try:
S: 0
State: make block! 256
loop length? State [
insert state S
S: S + 1
]
reverse State

I think that might be faster.

A better alternative is to pregenerate this block, save to a file and then
load it as needed. For example:
State: [
0 1 2 3 4 ; and so on.
]
save %State.r State
;...
State: load %State.r

Andrew Martin
ICQ: 26227169 http://valley.150m.com/
--



-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-27 Thread Ladislav Mecir

Hi Alan,

I noticed that your functions call the FOR function. The FOR function still
contains some bugs I reported pretty long ago and it is a very slow beast
compared to virtually any other cycle function. I suggest you to use WHILE,
UNTIL, REPEAT, LOOP, or my CFOR instead.

Cheers
L

-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-27 Thread Graham Chiu

On Wed, 27 Mar 2002 13:31:04 +0100
 Ladislav Mecir [EMAIL PROTECTED] wrote:

 I noticed that your functions call the FOR function. The
 FOR function still
 contains some bugs I reported pretty long ago and it is a
 very slow beast

Hi Ladislav,

There's a beta of the new rebol core out there on IOS.


--
Graham Chiu
-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-26 Thread Holger Kruse

On Tue, Mar 26, 2002 at 03:05:09PM -0500, alan parman wrote:
 This group is really good at tweaking functions to make them faster.
 This is a shameless request for help making the following functions _faster_.
 {not necessarily shorter, just faster  :)}
 
 Any help would be deeply appreciated.
 
 The second set (object) is a REBOL version of Ron Rivest's (formerly of RSA) RC4 
encryption algorithm, used in some secure internet transactions.

Please do not post source code for cryptographic functions to the mailing
list. This list is hosted in the US and some subscribers are outside of
the US, so posting source code constitutes a violation of US export laws
on cryptography.

-- 
Holger Kruse
[EMAIL PROTECTED]
-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-26 Thread Holger Kruse

On Tue, Mar 26, 2002 at 04:57:41PM -0800, Holger Kruse wrote:
 On Tue, Mar 26, 2002 at 03:05:09PM -0500, alan parman wrote:
  This group is really good at tweaking functions to make them faster.
  This is a shameless request for help making the following functions _faster_.
  {not necessarily shorter, just faster  :)}
  
  Any help would be deeply appreciated.
  
  The second set (object) is a REBOL version of Ron Rivest's (formerly of RSA) RC4 
encryption algorithm, used in some secure internet transactions.
 
 Please do not post source code for cryptographic functions to the mailing
 list. This list is hosted in the US and some subscribers are outside of
 the US, so posting source code constitutes a violation of US export laws
 on cryptography.

Btw, there is another problem, specifically with RC4: AFAIR RSADSI still
considers that algorithm to be their trade secret, and the name RC4 is
trademarked. Because of that this particular algorithm should not be discussed
publically.

You can, however discuss publically (just discuss, not post source code) a
different algorithm, called ARCFOUR, which is assumed to be compatible with
RC4. ARCFOUR is implemented by a lot of software packages (OpenSSL etc.) as a
compatible replacement for RC4 (e.g. for SSL). ARCFOUR is also supported by
REBOL/Command 2.0.

-- 
Holger Kruse
[EMAIL PROTECTED]
-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-26 Thread Ryan Cole

It looks pretty darn good to me.  Just a few minor things...

My tests have shown that while making a series to size is and then appending it is 
fast, but using 'poke or 'change and reusing the space is even faster--about double and
thats not counting the time saved by not having to allocate the memory.

It seems to take an amazingly minute amount of time in rebol to assign a word to a 
value.  It is therefore advantageous when using a value more than once, to go ahead and
assign it to a word.  This remains true even in cases of the most simple calculations 
(ie 1 + 1).

--Ryan

alan parman wrote:

 This group is really good at tweaking functions to make them faster.
 This is a shameless request for help making the following functions _faster_.
 {not necessarily shorter, just faster  :)}

-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-26 Thread alan parman

Holger,
Sorry, I should have been more careful and specified that the code is ARCFOUR, not 
RC4. (stands for _A_pparently RC4 (four)). (Isn't RSA's patent expired?)

And not to stir up trouble, but my understanding of the latest export laws is that 
source code is free speech and therefore has no export restrictions.

I will, however, abide by the rules of the List.

Thanks, Ryan, for pointing out that 'poke may be faster then 'append.  I see a couple 
of places where I can substitute 'poke for 'append, and will try them out for speed.
-- 

___
Sign-up for your own FREE Personalized E-mail at Mail.com
http://www.mail.com/?sr=signup

Win the Ultimate Hawaiian Experience from Travelocity.
http://ad.doubleclick.net/clk;4018363;6991039;n?http://svc.travelocity.com/promos/winhawaii/

-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.




[REBOL] Re: Shamless request for improving function speed.

2002-03-26 Thread Holger Kruse

On Tue, Mar 26, 2002 at 09:25:15PM -0500, alan parman wrote:
 Holger,
 Sorry, I should have been more careful and specified that the code is ARCFOUR, not 
RC4. (stands for _A_pparently RC4 (four)). (Isn't RSA's patent expired?)

It's not a patent, but a combination of trade secret and trademark. Unlike
patents, they do not expire.

 And not to stir up trouble, but my understanding of the latest export laws is that 
source code is free speech and therefore has no export restrictions.

AFAIK not universally or automatically. Pretty much anything that is
encryption-related still requires at least the filing of paperwork, in some
cases even a waiting period and/or a permit.

There is also the issue that some countries, like France, even put restrictions
on the use of encryption code, not just on their export, so publically
distributed encryption source code would have to come with a disclaimer :-).

It is probably best to simply keep encryption source code off this list...

-- 
Holger Kruse
[EMAIL PROTECTED]
-- 
To unsubscribe from this list, please send an email to
[EMAIL PROTECTED] with unsubscribe in the 
subject, without the quotes.