Pete wrote:
Interesting, and it kinda makes sense.  For elements, there's no
positioning required like with lines/words/items, just a case of cycling
through the keys - which is what "repeat for each line <x> in the keys of
<array>" does, I suppose.
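
For reference, that pattern spelled out - a minimal sketch, with tArray
and tTotal as hypothetical names:

   -- Cycle through the keys; no positioning needed:
   put 0 into tTotal
   repeat for each line tKey in the keys of tArray
      add tArray[tKey] to tTotal
   end repeat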

As with most things in computing, the truly optimal solution comes with a lot of "depends": total data size, size of elements, distance from the start of a chunk to the value being obtained, how deeply nested the array keys are - all those and more play a role in total performance, which can sometimes yield unexpected results.

One challenge with arrays is their use in CGIs, where total throughput is unusually critical since the app is born, lives, and dies all in the space of satisfying a single request from the user.

The problem with arrays in that context is that they don't exist when the routine begins: the engine itself needs to be loaded, and the array built in memory, before any lookup can happen.

Arrays offer blinding speed for random access, but they can do this only because they rely on in-memory hash structures, leaving us with the question: how do we load the array from a cold start?

One can use custom properties, or arrayEncode/arrayDecode, or split/combine, but all of them are only slightly optimized versions of what you'd need to do if you had to script it yourself using "repeat for each line..." and stuffing the array elements sequentially.
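
For example, a cold-start load might look something like this - a sketch
only, with hypothetical file paths:

   -- Restore an array saved earlier with arrayEncode;
   -- binfile: is needed since the encoded data is binary:
   put URL "binfile:/var/www/data/cache.lcarray" into tBlob
   put arrayDecode(tBlob) into tArray

   -- Or rebuild it from a tab-delimited text file; split
   -- does in one statement what the manual loop would do:
   put URL "file:/var/www/data/cache.txt" into tData
   split tData by return and tab
   -- tData is now an array keyed by the first item of each line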

So oddly enough, if the context of use requires that you take into account the loading of the array, total throughput will often be substantially slower than scooping up a delimited file and using chunk expressions on it.
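
For comparison, the chunk-expression version skips the array entirely - a
sketch, with a hypothetical file path and lookup key:

   -- Scoop up the file and scan it with chunk expressions;
   -- no decode or array-building step required:
   put URL "file:/var/www/data/cache.txt" into tData
   set the itemDelimiter to tab
   repeat for each line tLine in tData
      if item 1 of tLine is "targetKey" then
         put item 2 of tLine into tValue
         exit repeat
      end if
   end repeat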

Even outside of a total-throughput context, I've seen other cases where arrays can be slower than "repeat for each", such as deeply-nested arrays (say, four levels deep). In such cases, while each traversal of the hash used to identify the location of the element value is pretty darn fast, you'll have to do four traversals of each hash to get at each element, and that can add up.

Moreover, arrays can impact memory in ways that chunks don't, because in a world where we don't yet have structs (see <http://quality.runrev.com/show_bug.cgi?id=8304>), key labels are stored again for every element. With a tab-delimited list the non-data overhead is one char per field, but with arrays it's the length of the key for every field, which can double the size of the data in memory if the keys are as long as the data.
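
A back-of-the-envelope example with made-up sizes shows the scale of it:

   10,000 records x 10 fields, 10-char values:
     tab-delimited:  10,000 x 10 x (10 + 1)  = ~1.1 MB
     10-char keys:   10,000 x 10 x (10 + 10) = ~2.0 MB (plus hash overhead)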

So alas, as you folks have done here, often the only way to know for sure which solution is optimal is to test it.

If you find yourself doing this sort of thing often, I've put together a few tips on benchmarking performance in this LiveCode Journal article:

<http://livecodejournal.com/tutorials/benchmarking-revtalk.html>
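
If you just want the core pattern, a minimal sketch using the
milliseconds looks like this:

   -- Repeat the code under test enough times for the
   -- difference to rise above timer resolution:
   put 10000 into tIterations
   put the milliseconds into tStart
   repeat tIterations times
      -- ...code under test goes here...
   end repeat
   put the milliseconds - tStart into tElapsed
   put tElapsed / tIterations into tPerCall  -- avg ms per call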

--
 Richard Gaskin
 Fourth World
 LiveCode training and consulting: http://www.fourthworld.com
 Webzine for LiveCode developers: http://www.LiveCodeJournal.com
 LiveCode Journal blog: http://LiveCodejournal.com/blog.irv
