It's quite easy to speed it up, actually. Let's take a look at your transform iterator:

    proc flip(s: seq[string]): seq[string] =
      result = s # copy
      result[0] = s[^1]
      result[^1] = s[0]

    proc transpose(s: seq[string]): seq[string] =
      result = s # copy
      for i in 0 .. s.high:
        for j in 0 .. s.high:
          result[j][i] = s[i][j]

    iterator transform(s: seq[string]): seq[string] =
      for a in [s, s.flip]: # copy x 3
        let
          b = a.transpose # copy
          c = b.flip # copy
          d = c.transpose # copy
        # x2 (the loop body runs twice)
        yield a; yield b; yield c; yield d # possibly copy x 4
    # total of: copy x 9, possibly copy x 17
Well, you take a sequence BY VALUE, yet you never operate in place; you allocate new sequences all the time. And all the strings inside them, I'd add. The bare minimum would be to get rid of the copies you don't need:

    proc flip(s: var seq[string]) =
      swap s[0], s[^1] # in-place

    proc transpose(s: var seq[string]) =
      for i in 0 .. s.high:
        for j in i+1 .. s.high:
          swap s[j][i], s[i][j] # in-place

    iterator transform(s: var seq[string]): var seq[string] =
      var tmp = s # copy
      tmp.flip # in-place
      template yieldAll(c) =
        yield c; c.transpose # in-place
        yield c; c.flip # in-place
        yield c; c.transpose # in-place
        yield c # in-place
      yieldAll(s) # in-place
      s = tmp # copy
      yieldAll(s) # in-place

Now a new seq is allocated (and filled, i.e. deep-copied) only twice: at var tmp = s and at s = tmp. You could further reduce that to one by using shallowCopy in the latter case.

Then, if you truly want to use only one seq the whole time, you can check whether the following would be faster:

    proc flip2(s: var seq[string]) =
      for i in 0 .. s.high:
        swap s[i][0], s[i][^1] # in-place

    iterator transform(s: var seq[string]): var seq[string] =
      template yieldAll(c) =
        yield c; c.transpose # in-place
        yield c; c.flip # in-place
        yield c; c.transpose # in-place
        yield c # in-place
      yieldAll(s) # in-place
      s.flip2 # in-place, undoes the column swap left over from yieldAll
      s.flip # in-place
      yieldAll(s) # in-place

It may be tempting to use a 2D array (or a seq mocking one), as transpose and reading would benefit from memory locality (but not THAT much, I guess). On the other hand, flip (but not flip2!) benefits from memory scattering, so it's not that obvious; but it would still be faster than flip2 thanks to locality (e.g. you can use copyMem instead of a swap in a loop). Still, I think it might be nice to give a 2D array a try.

@Araq Are these seqs copied in yield? It would be nice if not, but the value semantics (which seem to break on let instead of var, so maybe here too?) make me doubt...
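As for the 2D array idea, here is a minimal sketch of what "a seq mocking one" could look like, assuming square input. The names Grid, initGrid and row are hypothetical, made up for illustration, and I haven't benchmarked this against the seq[string] versions. The row swap uses a plain loop; replacing it with copyMem through a temporary buffer would be a further step.

```nim
# Hypothetical flat n x n grid: one contiguous buffer instead of seq[string].
type
  Grid = object
    n: int            # side length
    cells: seq[char]  # row-major storage, n * n entries

proc initGrid(lines: openArray[string]): Grid =
  ## Copies square input into the flat buffer (assumes every line has lines.len chars).
  result.n = lines.len
  result.cells = newSeq[char](lines.len * lines.len)
  for i, line in lines:
    assert line.len == lines.len
    for j, ch in line:
      result.cells[i * result.n + j] = ch

proc flip(g: var Grid) =
  ## Swaps the first and the last row in place.
  let last = (g.n - 1) * g.n  # offset of the last row
  for j in 0 ..< g.n:
    swap g.cells[j], g.cells[last + j]

proc transpose(g: var Grid) =
  ## Swaps each element above the diagonal with its mirror, in place.
  for i in 0 ..< g.n:
    for j in i + 1 ..< g.n:
      swap g.cells[i * g.n + j], g.cells[j * g.n + i]

proc row(g: Grid, i: int): string =
  ## Returns row i as a string, for reading or printing.
  result = newString(g.n)
  for j in 0 ..< g.n:
    result[j] = g.cells[i * g.n + j]
```

With this layout both operations touch a single allocation, so the iterator could yield the same var Grid after each in-place step, just like the var seq[string] version does.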