On 29 May 2015 at 23:22, Lex Trotman <ele...@gmail.com> wrote:
>>
>>> Does it have any major drawbacks? I read  it has to some kind
>>> "prefix table" prior to running the search, but I  guess that's
>>> negligible for all reasonable search terms?
>>
>> I think KMP does not have drawbacks.   The prefix table is of length m+1,
>> m is the length of the string P for which you wish to find all occurrences
>> in some string T.    For example, P is "hello", m == 5, and you build a
>> prefix table of length m+1 == 6.    Then T may be any long string, for the
>> long string you do not need any additional memory.   You only need the extra
>> m+1 size_t locations for the prefix table, m is length of P, the string you
>> wish to search.
>
> Its drawback is it is more complicated, and as I said you have to
> benchmark to see if its worth it, things that were worthwhile back
> when KMP was invented may not be worthwhile today.
>

I might add that the search time may be irrelevant compared to the
time to display unless you have a huge buffer and a very small window.

>>
>>
>> Have a great day,
>> Marius Ioan Buzea
>>
>>
>>
>>
>>
>>
>> --------------------------------------------
>> On Fri, 5/29/15, Thomas Martitz <ku...@rockbox.org> wrote:
>>
>>  Subject: Re: [Geany-Devel] pull request on GitHub, to add 
>> GeanyHighlightSelectedWords, into Geany Plugins
>>  To: devel@lists.geany.org
>>  Date: Friday, May 29, 2015, 1:49 PM
>>
>>  Am 29.05.2015 um 12:44
>>  schrieb marius buzea:
>>  > Hello,
>>  >
>>  >
>>  >
>>  With KMP it is possible to search all occurrences of a m
>>  length string, into a n length string,
>>  >
>>  using O(m+n) machine operations.    Next page:
>>  >          http://www.inf.fh-flensburg.de/lang/algorithmen/pattern/kmpen.htm
>>  > describes the algorithm.
>>  >
>>  >
>>  >
>>  > The KMP works well
>>  with the utf-8 encoding of unicode.    One property of
>>  utf8 is that
>>  > the encoding one unicode
>>  symbol is not a substring of another utf8 substring.
>>  This
>>  > property allows to take the utf-8
>>  encoding of the string you wish to search, and to
>>  > find this utf8 encoding string, in the
>>  utf8 encoding of the text string.       Geany
>>  uses
>>  > scintilla, and scintilla uses utf8
>>  to encode the document it displays, and scintilla has
>>  > a command that gives the raw utf8 byte
>>  array for a [start, end) range.      So, KMP
>>  > gives great speed for searching all
>>  occurrences, and may be used with the underlying
>>  > text representation of scintilla used by
>>  geany.     The utf-8 encoding of a unicode
>>  > string of length n, is less than 6n, each
>>  utf8 encoding is at most 6 bytes.
>>  >
>>  >
>>  >
>>  >
>>  I also think that including this functionality/feature into
>>  Geany core would be a good choice.
>>  > It
>>  would be a small tradeoff between keeping the core small,
>>  and adding this new functionality,
>>  > but
>>  this is your choice.
>>  >
>>  >
>>  >
>>  >
>>  If you wish to extend automark, then this is good choice
>>  too.   If you wish, and if it helps,
>>  > please reuse any part of the
>>  implementation provided here:
>>  >    
>> http://sourceforge.net/p/geanyhighlightselectedword/code/HEAD/tree/trunk/GeanyHighlightSelectedWord/GeanyHighlightSelectedWord.c
>>  > If needed, I would help.
>>  >
>>  >
>>  >
>>  What should I do next?     Should I not do the
>>  pull request for GeanyHighlightSelectedWord?
>>  > It is okay with me.
>>  GeanyHighlightSelectedWord would then be still available at
>>  sourceforge until
>>  > Geany provides this
>>  functionality from its core, or from automark.
>>  >
>>
>>
>>  I wonder if this algorithm should be applied to
>>  all searches, and thus
>>  be integrated into
>>  scintilla. Does it have any major drawbacks? I read
>>  it has to some kind "prefix table"
>>  prior to running the search, but I
>>  guess
>>  that's negligible for all reasonable search terms?
>>
>>  Best regards
>>  _______________________________________________
>>  Devel mailing list
>>  Devel@lists.geany.org
>>  https://lists.geany.org/cgi-bin/mailman/listinfo/devel
>>
>> _______________________________________________
>> Devel mailing list
>> Devel@lists.geany.org
>> https://lists.geany.org/cgi-bin/mailman/listinfo/devel
_______________________________________________
Devel mailing list
Devel@lists.geany.org
https://lists.geany.org/cgi-bin/mailman/listinfo/devel

Reply via email to