On 4/16/2015 10:43 AM, Sarang Shrivastava wrote:
Yes, indeed, CRM114 has many criteria for categorizing data, and it can do so via a host of methods, including regexes, approximate regexes, a Hidden Markov Model, Orthogonal Sparse Bigrams, WINNOW, Correlation, KNN/Hyperspace, or Bit Entropy.

We can take ideas from them and develop our own plugin that can compete with CRM114. After all, there is no place like home. I look forward to working on these, provided my proposal is accepted.
I look forward to that as well. Many of these algorithms are outside my area of expertise, but testing and tweaking them in real-world environments to gauge efficacy is something I can do very well.

A thought to consider: does using custom plugins hinder the performance of SA in terms of speed? No doubt CRM114 is good at classifying spam and ham, but does it hamper speed at all?

What do you guys say about including these in SA itself, if possible?

The plugin engine in SA is refined and stable. A plugin that is not enabled has no effect on performance that I can think of in any way, shape or form.

Beyond that, if a plugin is enabled, the performance of the plugin is important. Something like what you are discussing is of particular interest to the higher volume users who will be the most sensitive to performance. But if it's an effective tool to classify emails, people might be more accommodating.

But performance and scalability are among the reasons I've pointed you towards Redis as a backend. Now, the algorithms' data store might need more than a hash store can provide. TxRep, for example, relies on SQL, and I can't think of a way to make it more Redis-compatible.

Regards,
KAM
