[REBOL] Interleaving of strings question Re:(2)

2000-09-26 Thread grantwparks

I think it is an interesting programming topic (and I
did not mean my earlier comments to be absolute - it
was directed at the seemingly unresolvable debates
about whether or not do such and such).  Most sorting
is done, it seems, on some sort of scalar level, and I
was interested in the idea of using each sort item as
a description of something else, in this case a point
on a plane.  (I'm not sure why your #4 comment means
my function is wrong.  Unless you mean that the
resulting sorted list is in order according to "its
distance from a point 0,0 that defines the most
'lower-left' point of an arbitrary plane".  I.E. it
only works in quadrant II of that old x-y graph. 
Please explain, if not.  Maybe what's more useful
would be the function sorting in terms of distance
from any point, where I think the deltas come back
in.)  To me it could relate to some sort of
abstraction on sort (using objects, blocks whatever)
that might be useful.  I guess I knew I was opening
myself up to a "look who's talking" attitude, but I
was thinking about a real-world problem in terms of
some abstraction and trying to really understand what
the original poster's interleave method was getting
at.  Often that leads to a cool generalization that
can be used elsewhere.  That's why I considered it to
be a programming topic.  You could argue it belonged
in a different list, but this is where the original
post was, and I'm trying to learn REBOL, so it's where
I posted.  Is that the same as a thread discussing why
MS is successful, or why they suck?  Again, on some
level you could argue it is.  Up to you, I guess.

 
--- [EMAIL PROTECTED] wrote:
 Hi,
 
 although your question has a little in common with
 Rebol, my answers are as
 follows:
 
 1. It would be more precise to interleave each bit -
 ie. each binary digit -
 as the smallest unit of information
 
 2. in that case the bit - sort yields:
  [0,0 2,2 1,5 5,1]
 
 3. as you pointed out, the result of the sort
 doesn't depend on the
 representation, but on the comparison function, any
 suitable representation
 can do
 
 4. your suggestion doesn't work, because you need
 close things in the terms
 of distance to remain close in the terms of sort.
 Your suggestion works only
 in the neighborhood of the 0,0 point
 
 Best regards
 Ladislav
 
 - Original Message -
 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Sent: Monday, September 25, 2000 1:04 PM
 Subject: [REBOL] Interleaving of strings question
 
 
  I was thinking about the earlier postings about
  sorting coordinates by interleaving digits of the
 2
  axis values.  The interleaving, it would seem, is
 to
  make the longitude as significant as the latitude,
 in
  terms of distance from other points.  Ignoring the
  need to have the lat/long pairs stored that way
 for
  later processing, and ignoring stated expected
 values
  for the coordinates I thought:
 
  1.  Wouldn't it be more precise to interleave each
  digit?
 
  2.  In terms of a regular x,y system, even that
 would
  be incorrect sometimes. E.G.: [ 0,0 1,5 2,2 5,1 ]
 (a
  sorted set of simplified coordinates) isn't
 correct.
  2,2 is closer to 0,0; 1,5 and 5,1 are the same
  distance from 0,0.
 
  3.  Could pairs or tuples be used?  Answer: pairs
  didn't sort that way and I have no reason to
 assume
  sort would think of tuples as coordinates.
 
  4.  Wouldn't it be the least convoluted to have a
  /compare function for sort that understood that
 all
  components of the coordinate were equally
 significant?
   That way there's no manipulating the coordinates,
 and
  it should be the most precise.
 
  Would a Pythagorean type thing work, because it
 looks
  like simply summing the absolute differences
 between
  each coordinate component isn't right:
 
  1,1 to 1,5 = 4  ((1-1)+(5-1))
  1,1 to 2,4 = 4  ((2-1)+(4-1))
 
  yet 2,4 is closer to 1,1 than is 1,5.  Until the
 real
  distance is needed, I don't think you'd have to
 take
  the square root of the sum (and how do you do that
 for
  3-dimensions, does it become cubed and cube root -
  aha)
 
  (ABS(x1-x2) to the nth) + (ABS(y1-y2) to the nth)
 ...
  + (ABS(n1-n2) to the nth)
 
 
  Comments?  Is this right?
 
  __
  Do You Yahoo!?
  Send instant messages  get email alerts with
 Yahoo! Messenger.
  http://im.yahoo.com/
 
 
 
 


__
Do You Yahoo!?
Send instant messages  get email alerts with Yahoo! Messenger.
http://im.yahoo.com/




[REBOL] Interleaving of strings question Re:

2000-09-25 Thread grantwparks

Since we want them sorted, we're talking about their
relative distance from an origin ( 0,0,... ).  So it
became (x1 to the nth + y1 to the nth)  (x2 to the
nth + y2 to the nth)...

If one of you great REBOLs would show how to do this
using all the niceties of series! traversal I would
appreciate it; I know my version is a pretty sad
procedural version:

coord-sort: func [ a b /local sum1 sum2 ] [
  sum1: 0
  sum2: 0

  dimension: length? a

  while [ not tail? a ] [
sum1: add sum1 power first a dimension
a: next a
  ]
  while [ not tail? b ] [
sum2: add sum2 power first b dimension
b: next b
  ]
  sum1  sum2
]

--- [EMAIL PROTECTED] wrote:
 
 (ABS(x1-x2) to the nth) + (ABS(y1-y2) to the nth)
 ...
 + (ABS(n1-n2) to the nth)
 


__
Do You Yahoo!?
Send instant messages  get email alerts with Yahoo! Messenger.
http://im.yahoo.com/




[REBOL] Interleaving of strings question Re:(2)

2000-09-25 Thread joel . neely

Hi, Grant!

1)  A couple of minor tweaks to  coord-sort  yields (*WARNING*
*UNTESTED CODE FOLLOWS* *TEST BEFORE USING*)

coord-sort: func [a b /local val dim sum] [
dim: length? a
sum: 0
foreach val a [sum: sum + power val dim]
foreach val b [sum: sum - power val dim]
sum  0
]

2)  Your generalization over dimensionality appeals to me as a
mathematician, but as a "software engineer" I am compelled
to point out that if you're dealing with a performance bottle-
neck, the following is faster:

coord-sort-2d: func [a b /local val dim sum] [
dim: length? a
sum: 0
foreach val a [sum: sum + (val * val)]
foreach val b [sum: sum - (val * val)]
sum  0
]

3)  Putting my mathematics hat back on, I am compelled to
point out that if these are really lat/lon coordinates,
then both the taxicab metric (absolute deltas) and squared
Euclidean metric (sum squared deltas) can give wrong answers
to the general case of "which points are nearest to a specific
target point".

Near the poles, lattitude deltas make MUCH more difference
than comparable longitude deltas.  For small deltas, a GREAT
improvement in accuracy results from weighting longitude
delta by  cos avg lat  in the sum.  Some additional fiddling
may be required due to the singularites at the poles, but
pole crossings involve large longitude deltas anyway...

4)  Putting my s/w eng'ing hat back on, I should add that the
performance of the whole search issue (assuming you might
need to search for a "nearest point" as in the previous
comment) is dominated by the number of points in the entire
search space.  If the space is large, with substantial
dispersion of points, consider using the following speedup trick:

The maximum error of the taxicab metric vs. the Euclidean
metric is the square root of two  (e.g., the point at (1,1)
has a taxicab metric of 2, instead of the Euclidean metric
of 1.141... -- it's too large by a factor of sqrt 2.  Ergo...

Make a prepass across the search space using the taxicab
metric, retaining only those points whose distance from the
target is at most 1.414... times the least distance-
from-target found thusfar.  Then make a second pass across
only the set collected in the prepass, using the correct-
but-more-expensive real metric.

Hope some of this is helpful!

-jn-

[EMAIL PROTECTED] wrote:
 
 Since we want them sorted, we're talking about their
 relative distance from an origin ( 0,0,... ).  So it
 became (x1 to the nth + y1 to the nth)  (x2 to the
 nth + y2 to the nth)...
 
 If one of you great REBOLs would show how to do this
 using all the niceties of series! traversal I would
 appreciate it; I know my version is a pretty sad
 procedural version:
 
 coord-sort: func [ a b /local sum1 sum2 ] [
   sum1: 0
   sum2: 0
 
   dimension: length? a
 
   while [ not tail? a ] [
 sum1: add sum1 power first a dimension
 a: next a
   ]
   while [ not tail? b ] [
 sum2: add sum2 power first b dimension
 b: next b
   ]
   sum1  sum2
 ]





[REBOL] Interleaving of strings question Re:(3)

2000-09-25 Thread grantwparks

Yow!  I guess it's a lot safer to play in the
theoretical space ;)  I was wondering tho' if you
imagine a 3-dimensional space, does changing the power
to 3 from 2 work for Pythagoras?

I recently talked to some GIS guys who pointed out the
inherent errors in GPS because the tectonic plates
move up to 51 cm / year, or something.  They got a big
kick out of that, but somehow I missed the punchline.

I did think, though, that this was a definitively more
accurate method than the original discussion (don't
know if you saw it) that was going to interleave the
long digits into the latitude...

--- [EMAIL PROTECTED] wrote:
 Hi, Grant!
 
 1)  A couple of minor tweaks to  coord-sort  yields
 (*WARNING*
 *UNTESTED CODE FOLLOWS* *TEST BEFORE USING*)
 
 coord-sort: func [a b /local val dim sum] [
 dim: length? a
 sum: 0
 foreach val a [sum: sum + power val dim]
 foreach val b [sum: sum - power val dim]
 sum  0
 ]
 
 2)  Your generalization over dimensionality appeals
 to me as a
 mathematician, but as a "software engineer" I am
 compelled
 to point out that if you're dealing with a
 performance bottle-
 neck, the following is faster:
 
 coord-sort-2d: func [a b /local val dim sum] [
 dim: length? a
 sum: 0
 foreach val a [sum: sum + (val * val)]
 foreach val b [sum: sum - (val * val)]
 sum  0
 ]
 
 3)  Putting my mathematics hat back on, I am
 compelled to
 point out that if these are really lat/lon
 coordinates,
 then both the taxicab metric (absolute deltas) and
 squared
 Euclidean metric (sum squared deltas) can give wrong
 answers
 to the general case of "which points are nearest to
 a specific
 target point".
 
 Near the poles, lattitude deltas make MUCH more
 difference
 than comparable longitude deltas.  For small deltas,
 a GREAT
 improvement in accuracy results from weighting
 longitude
 delta by  cos avg lat  in the sum.  Some additional
 fiddling
 may be required due to the singularites at the
 poles, but
 pole crossings involve large longitude deltas
 anyway...
 
 4)  Putting my s/w eng'ing hat back on, I should add
 that the
 performance of the whole search issue (assuming
 you might
 need to search for a "nearest point" as in the
 previous
 comment) is dominated by the number of points in the
 entire
 search space.  If the space is large, with
 substantial
 dispersion of points, consider using the following
 speedup trick:
 
 The maximum error of the taxicab metric vs. the
 Euclidean
 metric is the square root of two  (e.g., the
 point at (1,1)
 has a taxicab metric of 2, instead of the
 Euclidean metric
 of 1.141... -- it's too large by a factor of
 sqrt 2.  Ergo...
 
 Make a prepass across the search space using the
 taxicab
 metric, retaining only those points whose
 distance from the
 target is at most 1.414... times the least
 distance-
 from-target found thusfar.  Then make a second
 pass across
 only the set collected in the prepass, using the
 correct-
 but-more-expensive real metric.
 
 Hope some of this is helpful!
 
 -jn-
 
 [EMAIL PROTECTED] wrote:
  
  Since we want them sorted, we're talking about
 their
  relative distance from an origin ( 0,0,... ).  So
 it
  became (x1 to the nth + y1 to the nth)  (x2 to
 the
  nth + y2 to the nth)...
  
  If one of you great REBOLs would show how to do
 this
  using all the niceties of series! traversal I
 would
  appreciate it; I know my version is a pretty sad
  procedural version:
  
  coord-sort: func [ a b /local sum1 sum2 ] [
sum1: 0
sum2: 0
  
dimension: length? a
  
while [ not tail? a ] [
  sum1: add sum1 power first a dimension
  a: next a
]
while [ not tail? b ] [
  sum2: add sum2 power first b dimension
  b: next b
]
sum1  sum2
  ]
 
 


__
Do You Yahoo!?
Send instant messages  get email alerts with Yahoo! Messenger.
http://im.yahoo.com/




[REBOL] Interleaving of strings question Re:(4)

2000-09-25 Thread joel . neely

[EMAIL PROTECTED] wrote:
 
 Yow!  I guess it's a lot safer to play in the
 theoretical space ;)  I was wondering tho' if you
 imagine a 3-dimensional space, does changing the power
 to 3 from 2 work for Pythagoras?
 

Yes, if your 3-space is Euclidean.


 I recently talked to some GIS guys who pointed out the
 inherent errors in GPS because the tectonic plates
 move up to 51 cm / year, or something.  They got a big
 kick out of that, but somehow I missed the punchline.
 

ROTFL!  You should respond, "Yeah, but they more than
compensated for that when they turned of SA!"


 I did think, though, that this was a definitively more
 accurate method than the original discussion (don't
 know if you saw it) that was going to interleave the
 long digits into the latitude...
 

Well, I saw it, but didn't understand what problem was
being solved (and still don't).

Is the task is to:

1)  take a given target location and return the point
in the database that is geographically nearest;

2)  take a given target location and return a list of
all points from the database that are "near enough";

3)  sort a collection of points and present a human-
readable display based on nearest-to-farthest; or

4)  something else?

In addition, I'm unclear as to whether:

1)  the target is an arbitrary point or is constrained to
be one of the points in the database;

2)  how many points are in the database;

3)  what resolution is required; and

4)  the dispersion of the points in the database:

4a)  how wide an area is covered, and
4b)  how "clumpy" the distribution is.

All of these would influence my choice of strategy.  For
example, if the task is (1) or (2), and the points are
fairly uniformly distributed over a large range of lat/lon
values, I'd consider a two-level strategy of partitioning
the data into coarse-grained lat/lon cells (one-time,
front-end work), then searching only the cell that should
contain the target point (or nearby cells as needed for
"near enough" or "empty cell" constraints).

For example -- assuming two-degree cells for the sake of
illustration -- to locate the point 37.4968N x 89.9935W,
look first in the cell containing 36-37N x 88-89W.  As
needed, expand the search into the neighboring cells:

..........
..........

..38-39N36-37N34-35N..
..90-91W90-91W90-91W..

..38-39N36-37N34-35N..
..88-89W88-89W88-89W..

..38-39N36-37N34-35N..
..86-87W86-87W86-87W..

..........
..........

This strategy lets you "zoom in" quickly to the area
of relevance, then do the more complex calculations
only for a small subset of the space.


Another strategy would be to pre-compute the true
distance-from-origin for each point (storing that value
with each point), then use that as a figure of merit
for limiting the search space.  Points near the target
will obviously have a similar distance-from-origin to
the target's value.  Of course, the converse is not
true (90Nx0W is the same distance from 0Nx0W as 0Wx90N),
but the point of this trick is just to reduce the nunber
of points that must be individually evaluated.


Working with real data on real computers with real
memory and performance limits can DEFINITELY get
hairy!

Have fun!

-jn-




[REBOL] Interleaving of strings question Re:

2000-09-25 Thread lmecir

Hi,

although your question has a little in common with Rebol, my answers are as
follows:

1. It would be more precise to interleave each bit - ie. each binary digit -
as the smallest unit of information

2. in that case the bit - sort yields:
 [0,0 2,2 1,5 5,1]

3. as you pointed out, the result of the sort doesn't depend on the
representation, but on the comparison function, any suitable representation
can do

4. your suggestion doesn't work, because you need close things in the terms
of distance to remain close in the terms of sort. Your suggestion works only
in the neighborhood of the 0,0 point

Best regards
Ladislav

- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, September 25, 2000 1:04 PM
Subject: [REBOL] Interleaving of strings question


 I was thinking about the earlier postings about
 sorting coordinates by interleaving digits of the 2
 axis values.  The interleaving, it would seem, is to
 make the longitude as significant as the latitude, in
 terms of distance from other points.  Ignoring the
 need to have the lat/long pairs stored that way for
 later processing, and ignoring stated expected values
 for the coordinates I thought:

 1.  Wouldn't it be more precise to interleave each
 digit?

 2.  In terms of a regular x,y system, even that would
 be incorrect sometimes. E.G.: [ 0,0 1,5 2,2 5,1 ] (a
 sorted set of simplified coordinates) isn't correct.
 2,2 is closer to 0,0; 1,5 and 5,1 are the same
 distance from 0,0.

 3.  Could pairs or tuples be used?  Answer: pairs
 didn't sort that way and I have no reason to assume
 sort would think of tuples as coordinates.

 4.  Wouldn't it be the least convoluted to have a
 /compare function for sort that understood that all
 components of the coordinate were equally significant?
  That way there's no manipulating the coordinates, and
 it should be the most precise.

 Would a Pythagorean type thing work, because it looks
 like simply summing the absolute differences between
 each coordinate component isn't right:

 1,1 to 1,5 = 4  ((1-1)+(5-1))
 1,1 to 2,4 = 4  ((2-1)+(4-1))

 yet 2,4 is closer to 1,1 than is 1,5.  Until the real
 distance is needed, I don't think you'd have to take
 the square root of the sum (and how do you do that for
 3-dimensions, does it become cubed and cube root -
 aha)

 (ABS(x1-x2) to the nth) + (ABS(y1-y2) to the nth) ...
 + (ABS(n1-n2) to the nth)


 Comments?  Is this right?

 __
 Do You Yahoo!?
 Send instant messages  get email alerts with Yahoo! Messenger.
 http://im.yahoo.com/







[REBOL] Interleaving of strings question Re:(5) (UnRebolish commentary)

2000-09-25 Thread webdev



In light of the recent posts... 
Yes this has nothing to do with Rebol in the 
strictest sense. 
Mea Culpa.

Regarding this whole interleaving 
debate might I submit that I never intended 
to seed a thread around the validity of interleaving as an optimal method of 
performing complex and or seamless geospatial calculations. But while I'm here 
and the thread is active I'll mention that I just needed some Rebol code to 
explore the concept on a practical and or intellectual level. The method I 
described is  only one variation on this theme, as some of you 
illustratedbyrecasting it in different forms. What I described was 
actually a bastard child from a specific method involving binary interleaving 
along with some other features that I could not possibly explain off the top of 
my head without the original source material. I truncated the method description 
intentionally for the sake of being able to pose my real request for Rebol 
parsing methods. And yes the method does break down across boundaries and goes 
for a tilt in a big way at the poles. In the absence of higher level support the 
conceptfails in a variety of ways that make it a poor contender for 
afully functioningGIS. Do not attempt to do anything more serious 
than a napkin sketch of this method without the supervision of an adult or 
perhaps an attending physician unless you wish to go numerically mad with 
something akin to a division by zero. 

Thanks for the feedback, it was most enlightening 
and informative.

Sincerely,
Jim