On Oct 17, 2007, at 11:57 AM, Dave Rafkind wrote:
One benefit of Erlang that I can see it the ease with which it supports "parallel" (multicore or distributed) computation. If tracking servers and associated datastructures can be "sharded" (partitioned) along some kind of optimized geographical boundaries then won't that be a win for most simple tracking purposes? Admittedly you won't get much obvious help for global queries but for simple things like geofencing and so on it seems like a nice solution.
What I meant was that the limitations on parallelization and distribution in this particular environment is not really a function of the language; it may be a good solution to A problem but does not really address THE problem(s). You are not suggesting the implementation of anything many people have not implemented without Erlang. Call it a pre-mature optimization, though I can see where it could have some utility.
Using other Erlang apps as an example (Yaws, Ejabberd, RabbitMQ), 1000 updates per server per second seems entirely reasonable, and maybe 10k/server/sec is an upper bound. Is that within the performance bounds of current real-world tracking systems?
Update rates are a function of the lock graphs for a given data structure and the kind of guarantees one needs to make. You cannot make a meaningful inferences about update scalability based on the characteristics of unrelated data structures; it is a limitation of algorithms and data structures, not the programming language.
Also, I'm not sure I see the difficulty in tracking points vs lines vs polygons. Certainly time-swept data is tricky (ie keep track of the last 3 positions) but if you live in the present things like augmented R trees ( http://www.cs.purdue.edu/research/ technical_reports/2005/TR%2005-020.pdf) seem adequate (although I have not tested them myself)
If you restrict yourself to tracking points, you can make assumptions that allow significant data structure optimization. Polygons are much more difficult to generalize in a scalable way because there is no guarantee that a natural boundary exists with which to nicely partition any arbitrary set of polygons. Handling this case well is difficult.
The numerous R-tree variants, like the one above, do not substantially improve R-trees in the general case. Usually it is a case of trading one of the numerous pathological cases for another -- useful if you can tailor it to a specific app -- or making one pathological case less pathological on average. For every R-tree variant, you can find a real-world data set that essentially breaks it and several others that have mediocre performance on it. Again, one of the values of having so many R-tree variants is that you can select the one that best matches the characteristics of your data set.
Cheers, J. Andrew Rogers _______________________________________________ Geowanking mailing list [email protected] http://lists.burri.to/mailman/listinfo/geowanking
