Tom Longridge wrote:
My current Python project involves lots repeatating code blocks,
mainly centred around a binary string of data. It's a genetic
algorithm in which there are lots of strings (the chromosomes) which
get mixed, mutated and compared a lot.
Given Python's great list processing abilities and the relative
inefficiencies some string operations, I was considering using a list
of True and False values rather than a binary string.
I somehow doubt there would be a clear-cut answer to this, but from
this description, does anyone have any reason to think that one way
would be much more efficient than the other? (I realise the best way
would be to do both and `timeit` to see which is faster, but it's a
sizeable program and if anybody considers it a no-brainer I'd much
rather know now!)
Any advice would be gladly recieved.
It depends more on the operations you are performing, and
more importantly *which of those you have measured and
found to be slow*, than on anything else.
If, for example, you've got a particular complex set of
slicing, dicing, and mutating operations going on, then
that might say use one type of data structure.
If, on the other hand, the evaluation of the fitness
function is what is taking most of the time, then you
should focus on what that algorithm does and needs
and, after profiling (to get real data rather than
shots-in-the-dark), you can pick an appropriate data
structure for optimizing those operations.
Strings seem to be what people pick, by default, without
much thought, but I doubt they're the right thing
for the majority of GA work...
-Peter
--
http://mail.python.org/mailman/listinfo/python-list