It's actually quite easy to handle trees efficiently in J with one of the
simplest of data structures: a vector of integers.  Here's something we'll
be talking about at NYCJUG today that demonstrates how useful this
representation is, with the example of a directory structure.  These are
quite old ideas from APL.


   dirtree=. _1 0 1 0 3 1  NB. Parent index for each node; _1 for no parent (the root)
   dirnms=. 'c:';'n0';'n00';'n1';'n10';'n01'  NB. Corresponding directory names


   whChild=: [: I. [ e. ]  NB. x whChild y: indices of nodes in tree x whose parent is in y
   whRoot=: [: I. _1 = ]   NB. Indices of any root nodes
   dirtree whChild whRoot dirtree  NB. Indices of the children of the root
1 3
   nextLevel=: [ whChild ]
   dirtree nextLevel whRoot dirtree  NB. Same result: nextLevel just packages whChild
1 3
   dirtree nextLevel dirtree nextLevel whRoot dirtree  NB. Two levels down from the root
2 4 5

   dirnms{~whRoot dirtree  NB. Show names of each of the above
+--+
|c:|
+--+
   dirnms{~dirtree nextLevel whRoot dirtree
+--+--+
|n0|n1|
+--+--+
   dirnms{~dirtree nextLevel dirtree nextLevel whRoot dirtree
+---+---+---+
|n00|n10|n01|
+---+---+---+
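As an aside, going back up the tree is just repeated indexing into the same
vector.  Here's a small verb along those lines (a sketch, with an invented
name "ancestors", not part of the session above) that iterates the parent
lookup until it hits _1:

```j
NB. Root-to-node index path: iterate the parent lookup until we reach _1,
NB. then drop the _1 and reverse.  ({&dirtree binds dirtree's current value.)
ancestors=: 3 : '|. _1 -.~ {&dirtree^:(_1 ~: ])^:a: y'

   ancestors 4              NB. index path from the root down to n10
0 3 4
   dirnms {~ ancestors 4    NB. the corresponding names
+--+--+---+
|c:|n1|n10|
+--+--+---+
```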


   newtree=. _1 0 0 1 1 2 2
   newnms=. 'new0';'new01';'new02';'new010';'new011';'new020';'new021'
   #&>newtree;<newnms  NB. Sanity check: the counts match
7 7

   graftRoot=: 3 : 0
'ot jn nt'=. y  NB. old tree ; join-node index ; new tree
NB. Shift the new tree's parent indices past the end of the old tree,
NB. then point the new tree's root(s) at the join node jn.
ot, (_1=nt)} (nt+#ot) ,: jn
)
   graftRoot dirtree;(dirnms i. <'n0');newtree
_1 0 1 0 3 1 1 6 6 7 7 8 8

   dirtree=. graftRoot dirtree;(dirnms i. <'n0');newtree
   dirnms=. dirnms,newnms
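Collecting a whole subtree is the same idea pointed downward: apply whChild
repeatedly and accumulate.  This verb is another sketch in the same style,
not part of the session above:

```j
NB. x subtree y: node(s) y and all their descendants in parent vector x,
NB. gathered a whole level at a time with whChild.
subtree=: 4 : 0
 acc=. lvl=. y
 while. # lvl=. x whChild lvl do. acc=. acc, lvl end.
 acc
)

   dirtree subtree dirnms i. <'n0'   NB. everything now under n0
1 2 5 6 7 8 9 10 11 12
```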


   dirnms{~dirtree nextLevel whRoot dirtree
+--+--+
|n0|n1|
+--+--+
   dirnms{~dirtree nextLevel dirtree nextLevel whRoot dirtree
+---+---+---+----+
|n00|n10|n01|new0|
+---+---+---+----+
   dirnms{~dirtree nextLevel^:2 ] whRoot dirtree  NB. Same thing with the power conjunction
+---+---+---+----+
|n00|n10|n01|new0|
+---+---+---+----+
   dirnms{~dirtree nextLevel^:3 ] whRoot dirtree
+-----+-----+
|new01|new02|
+-----+-----+
   dirnms{~dirtree nextLevel^:4 ] whRoot dirtree
+------+------+------+------+
|new010|new011|new020|new021|
+------+------+------+------+
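Each of these steps handles an entire level at once, which is the
array-friendly way to work.  One last sketch in the same vein (again an
illustration, not part of the session above): counting levels this way gives
the depth of the tree:

```j
NB. x levels y: number of levels in tree x reachable from nodes y,
NB. stepping down a whole level at a time with nextLevel.
levels=: 4 : 0
 n=. 0
 while. # y do. n=. n+1 [ y=. x nextLevel y end.
 n
)

   dirtree levels whRoot dirtree   NB. depth of the grafted tree
5
```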



On Thu, Sep 13, 2012 at 7:28 PM, Raul Miller <rauldmil...@gmail.com> wrote:

> Sure...
>
> J is designed for working with regular data structures.  Trees -- or
> at least binary trees -- are irregular. Specifically, in the context
> of the KD tree, to properly interpret the significance of a child
> node, you must include information which is not represented at the
> node, but which must be extracted during the traversal from the root.
>
> That said, yes... the data structures that work in J, for algorithmic
> cases which are traditionally addressed by trees, tend to have
> tree-like characteristics. They can often be thought of as variants on
> the tree structures, if you are in the right frame of mind.
>
> The perspective offered in Henry Rich's introduction to his "J for C
> Programmers" book
> (http://www.jsoftware.com/help/jforc/introduction.htm) might be
> relevant here.
>
> Put differently: J's operations have significant O(1) interpreter
> overhead. However, when an operation can be applied to a large
> sequence of data this cost will tend to be much smaller than the O(n)
> [or whatever] cost of applying that primitive to that large chunk of
> data.  In other words, large, regular operations tend to be
> (relatively speaking) much cheaper in J than small, step-by-step
> operations. So when you can use them, it's usually a good idea to do
> so.
>
> (As a result, successful J programmers tend to focus on large-scale
> algorithmic issues.)
>
> I hope this made sense,
>
> --
> Raul
>
> On Thu, Sep 13, 2012 at 6:53 PM, slocklin <scott_lock...@yahoo.com> wrote:
> >
> > Raul:
> >
> > Can you expand on why trees are bad in J? Coming from a C and Lisp
> > mentality, tree search wins a lot of problems. kd-trees break down in
> higher
> > dimensions (aka, for things like image databases, where I guess locality
> > hashing will beat a brute force search), but for modest dimensionality
> > (~10), they seem to do well, with cache misses and everything.
> > Your technique works, if I understand it correctly, and I'm guessing it
> > doesn't keep extra copies of the data around to fill up the memory,
> making
> > it effectively a sort of tree.
> >
> > It is going to take me a while to digest the code you and Marshall wrote,
> > thank you both very kindly for taking the time!
> >
> > -Scott
> >
> >
> > Raul Miller-4 wrote:
> >>
> >> Tree structures -- especially binary tree structures -- are frequently
> >> a bad choice in J.
> >>
> >> Personally, if I were working with spatial data in J, I'd probably
> >> start with something like this:
> >>
> >> NB. example arbitrary set of points:
> >> points=: ?.20 3$0
> >> prepad=: $&__@,~@{:@$, ], $&_@,~@{:@$
> >> searchable=: /:~@:((|."0 1) ,:^:(2 - #@$))"0 2~ i.@{:@$
> >> SP=: searchable prepad points
> >>
> >> In other words: make n copies of the list of points (where n is the
> >> number of dimensions) and sort each list so that it can be binary
> >> searched along one of those dimensions.
> >>
> >> Of course, it's not that simple.  To find the nearest point, you need
> >> to search a volume which includes your candidate points.  So, in the
> >> worst case (most points equidistant from the point you are checking)
> >> you wind up searching the entire data structure.
> >>
> >> I've assumed, here, that once we've done a binary search to find a
> >> close match to the point we are looking for that it's worth searching
> >> the n nearest points in each direction from that match (where n is the
> >> number of dimensions).  So this means an initial search of n*(1+n)
> >> points (n lists of candidates each containing 1+n points).  We need to
> >> extend our search if any of the 2n points bounding each of the n lists
> >> of candidates has a coordinate distance from the point we are looking
> >> for which is closer than our closest candidate.
> >>
> >> This might sound significantly less efficient than a kd-tree, but we
> >> need to keep in mind that the kd-tree (or r* tree) has significant
> >> costs from cache misses and from pipeline busting.
> >>
> >> That said, there are a few other tricks that might be needed.  For
> >> example, if you are incrementally building the structure, adding to
> >> sorted lists is inefficient.  You can work around this issue by
> >> maintaining multiple instances of the logical structure and searching
> >> both (or all) of them and then picking the best result (for example).
> >> Then occasionally you find some time to merge them together.
> >>
> >> That said, I have not had time to draft up a real sample.  (The
> >> project I am currently working on releases this weekend, and in my
> >> "spare" time I'm taking a class, which leaves me without much time for
> >> fun stuff like this.)
> >>
> >> I hope this helps,
> >>
> >> --
> >> Raul
> >> ----------------------------------------------------------------------
> >> For information about J forums see http://www.jsoftware.com/forums.htm
> >>
> >>
> >
> > --
> > View this message in context:
> http://old.nabble.com/Spatial-trees-tp34422178s24193p34429335.html
> > Sent from the J Programming mailing list archive at Nabble.com.
> >
>



-- 
Devon McCormick, CFA
^me^ at acm.org is my preferred e-mail
