Re: [Chicken-users] how to speed up my text filter?

John Cowan Wed, 25 Sep 2013 06:58:13 -0700

Peter Bex scripsit:

> Also if the lines are very very long, you may want to avoid splitting
> it into several substrings beforehand and keeping them around.


In general, you should never split every line in the file unless you
know you have to, as it involves copying all the characters in the file
in a slow conditional loop.  Any time you can postpone splitting, you
should.

> Instead you could search for the next #\tab occurrence using
> string-index, and keep around the previous position, extracting only
> the substring currently under scrutiny.

Yes, that's a big win.
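
As a rough illustration of the scan-then-extract idea, here is a minimal
sketch. The helper name nth-tab-field is hypothetical, and the import form
assumes CHICKEN 5 with the srfi-13 egg installed; on CHICKEN 4 the
equivalent would be (use srfi-13). It walks the line with string-index and
allocates only the one substring that is actually wanted:

```scheme
;; Assumption: CHICKEN 5 with the srfi-13 egg; on CHICKEN 4 use (use srfi-13).
(import srfi-13)

;; Hypothetical helper: return field N (0-based) of LINE, where fields are
;; separated by #\tab, or #f if the line has fewer than N+1 fields.
;; Only the requested field is copied; the skipped fields are never allocated.
(define (nth-tab-field line n)
  (let loop ((start 0) (k 0))
    ;; Position of the next tab at or after START, or #f if there is none.
    (let ((end (string-index line #\tab start)))
      (cond ((= k n)
             (substring line start (or end (string-length line))))
            (end (loop (+ end 1) (+ k 1)))
            (else #f)))))
```

For example, (nth-tab-field "a\tbb\tccc" 1) returns "bb", and asking for a
field past the end of the line returns #f.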

> I think srfi-13's kmp-search stuff is intended for exactly this, but I
> never was able to grok how to use it.

Searching for a single character is a degenerate case for KMP, so it's
not worth doing.  It wins big when you are searching for the same
multi-character search string, especially if it is long, over many
target strings.
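
To make the "same search string, many targets" point concrete, here is a
self-contained sketch of KMP in plain Scheme. It deliberately does not use
the srfi-13 kmp procedures; the names make-failure-table, kmp-find and
count-matching-lines are made up for this illustration. The failure table
is built once per pattern and then reused for every target string:

```scheme
;; Build the KMP failure table for PAT: entry i is the length of the longest
;; proper prefix of PAT[0..i] that is also a suffix of it.
(define (make-failure-table pat)
  (let* ((m (string-length pat))
         (fail (make-vector m 0)))
    (let loop ((i 1) (k 0))
      (cond ((>= i m) fail)
            ((char=? (string-ref pat i) (string-ref pat k))
             (vector-set! fail i (+ k 1))
             (loop (+ i 1) (+ k 1)))
            ((> k 0) (loop i (vector-ref fail (- k 1))))
            (else (loop (+ i 1) 0))))))   ; entry i stays 0

;; Return the index of the first occurrence of PAT in TEXT, or #f.
;; FAIL must be the table built from PAT by make-failure-table.
(define (kmp-find pat fail text)
  (let ((m (string-length pat))
        (n (string-length text)))
    (if (= m 0)
        0
        (let loop ((i 0) (k 0))   ; i indexes TEXT, k = chars matched so far
          (cond ((>= i n) #f)
                ((char=? (string-ref text i) (string-ref pat k))
                 (if (= (+ k 1) m)
                     (- i k)      ; match ends at i, so it starts at i - k
                     (loop (+ i 1) (+ k 1))))
                ((> k 0) (loop i (vector-ref fail (- k 1))))
                (else (loop (+ i 1) 0)))))))

;; Count the strings in LINES that contain PAT, building the table only once.
(define (count-matching-lines pat lines)
  (let ((fail (make-failure-table pat)))
    (let loop ((ls lines) (count 0))
      (cond ((null? ls) count)
            ((kmp-find pat fail (car ls)) (loop (cdr ls) (+ count 1)))
            (else (loop (cdr ls) count))))))
```

The text index i never moves backwards, which is where the win over a naive
search comes from for long patterns. For a one-character pattern the table
is a single zero and the loop degenerates into a plain character scan.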

Xin Zheng scripsit:

> Thank you guys for your help. I learned a lot. One more question is
> -- as to split a line with many fields, would it be more efficient if
> there were another "string-split" which outputs a vector?

Not really.  The cost, as I mentioned above, is in allocating and
copying all the substrings, not in the data structure that captures
them.

-- 
Possession is said to be nine points of the law,                John Cowan
but that's not saying how many points the law might have.       [email protected]
        --Thomas A. Cowan (law professor and my father)

