[REBOL] Find speed Re:(2)

2000-07-10 Thread brian . hawley

Alessandro Pini ([EMAIL PROTECTED]) wrote:
 - Open Your Mind -

Quoting from Brian Hawley's message (08-Jul-00 01:31:18).

b   This either means resorting the list after every
b insert, using an incremental insertion sort, or making
b sure the data is inserted in order.

Hmmm. :-
You mean something like this invented example?

probe data
   == [15 98 60 52 10 2]

insert/sorted/compare data 15 :method
   == [98 60 52 15 10 2]

I mentioned three methods. Since we were talking about
working with key/value pairs...

table: [1 "1" 2 "2" 5 "5"]
key: 4 value: "4"

Resorting the list after every insert:
  sort/skip append table reduce [key value] 2
Hey, at least the sort is in native code. Depending
on the method, it could be optimal when the table
is already mostly sorted (read: Not quicksort).

Incremental insertion sort (linear search):
  while [
any [(tail? table) (key = first table)]
] [table: skip table 2]
table: head insert table reduce [key value]
Binary search would be faster for large tables, but
it's a little more complicated.

Making sure the input is sorted in the first place:
(sort input here)
  append table reduce [key value]
Doesn't work very well when you can't sort the input
in the first place.

Does that help?

Brian Hawley




[REBOL] Find speed Re:(2)

2000-07-07 Thread Galt_Barber





Eric wrote:
==
Hashes seem to be particularly slow when searching for nonexistent key
values. For 100,000 searches:

  loop 10 [ if select z "1533695" [x: x + 1]] ; first key value,   0:02
  loop 10 [ if select z "501730" [x: x + 1]]  ; 5000th key value,  0:53
  loop 10 [ if select z "1533696" [x: x + 1]] ; non-key value, 1:41
=


Well, how would it compare to find?
It looks to me like it's doing a linear search like find.

plus have you plugged select back into Ladislav's code to see
if it was any faster really?

this ought to be brainless.  What are we doing wrong on these hashes?

-galt





[REBOL] Find speed Re:(2)

2000-07-06 Thread lmecir

Hi,

regarding memoizing. There is a class of functions, that must use
some kind of memoizing to work at all, I think. The number of the
possibilities for parameters does not matter, you can store only
important results (depending on what you consider important - eg.
expensive or recent results...)

Because I used a non-real-life example for testing, here is
something more adequate for memoizing purposes:

Block/find time: 14.5 count: 7771
Block/bfind time: 5.875 count: 7771
Hash/find time: 28.5 count: 7771

The source code:

REBOL[
Title: "Seconds"
Author: "Ladislav Mecir"
Email: [EMAIL PROTECTED]
Date: 19/5/1999
File: %seconds.r
]

seconds: function [
{Compute difference between dates in seconds}
a [date!] "first date"
b [date!] "second date"
] [diff] [
diff: b - a
diff: 24 * diff + b/time/hour - a/time/hour +
b/zone/hour - a/zone/hour
diff: 60 * diff + b/time/minute - a/time/minute
60 * diff + b/time/second - a/time/second
]


REBOL[
Title: "Time-block"
Author: "Ladislav Mecir"
E-mail: [EMAIL PROTECTED]
Date: 19/5/1999
File: %timblk.r
Purpose: {
Measure time of a block execution with given relative
accuracy.
This function can even time the execution of the empty
block.
Suggested values for accuracy are: 0.05 to 0.30.
Accuracy better than 0.05 may not be reachable,
accuracy worse than 0.30 may not be useful.
}
]

include %seconds.r

time-block: function [
"Time a block."
block [block!]
accuracy [decimal!]
/verbose
] [
guess count lower upper start finish result
] [
if verbose [print ["Timing a block:" block]]
guess: 0
count: 1
lower: 1 - :accuracy
upper: 1 + :accuracy
while [
start: now
loop :count :block
finish: now
result: (seconds start finish) / count
if verbose [
prin "Iterations: "
prin count
prin ". Time/iteration: "
prin result
prin " seconds.^/"
]
any [
result = 0
guess  (result * lower)
guess  (result * upper)
]
][
guess: result
count: count * 2
]
result
]

{
Example:

time-block [] 0,05
}

Rebol []

bfind: func [
blk [block!]
value
/local d u m
] [
d: 1
u: length? blk
while [(m: to integer! (d + u) / 2)  d] [
either (pick blk m) = value [d: m] [u: m]
]
either (pick blk u) = value [u] [d]
]

include %timblk.r

n: 8192 ; the test size
m: 10 * n ; random number range
prin "Block/find time: "
prin time-block [
random/seed 1
c: 0
blk: copy []
repeat i n [
d: random m
if not find blk d [
c: c + 1
append blk d
]
]
] 0.05
prin " count: "
print c
prin "Block/bfind time: "
prin time-block [
random/seed 1
c: 0
blk: copy []
repeat i n [
d: random m
either empty? blk [
c: c + 1
insert blk d
] [
bf: bfind blk d
if not (pick blk bf) = d [
c: c + 1
insert skip blk either (pick blk bf)  d [
bf - 1
] [
 bf
] d
]
]
]
] 0.05
prin " count: "
print c
prin "Hash/find time: "
prin time-block [
random/seed 1
c: 0
blk: make hash! []
repeat i n [
d: random m
if not find blk d [
c: c + 1
insert blk d
]
]
] 0.05
prin " count: "
print c




 Hi, Ladislav,

 if you are interested only in an associative array,
 then a properly implemented hash should beat the
 hell out of even a binary search on large data.

 I guess it depends somewhat on the quality of the
 hash function and other aspects of the implementation.

 So, if you can beat the Rebol hash, which is probably compiled
 c-code, with your binary search written in interpreted Rebol,
 then RT had better take a look at their Rebol hashes.

 Would you like to share some test source code with us
 so we can try to duplicate and analyze your results?

 Thanks!

 -galt

 p.s. Does memo-ize only make sense when there are a limited
 number of possibilities for the parameters?  If the space of
possible
 values is very large and requests could come in a sort of
 random or rarely repeating way then the memo-ize trick wouldn't
 do much good, would it?







[REBOL] Find speed Re:(2)

2000-07-05 Thread lmecir

Hi,

my experiment was raher related to Brian's Memoize, or other
simple associative structures. The result is, that for structures
containing more than 4 096 elements, the natives Find, Select,...
can successfully be replaced by some appropriately designed
approaches.

Regards
Ladislav