Maybe he's referring to "tokenLower.substring(0, end)".
This appears twice in your code and could be moved to a local variable.
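To illustrate the suggestion, here is a minimal, untested-against-JAMES sketch of one pass of the loop body for a single value of `end`. The class name `SubstringHoistSketch`, the helper `addVariants`, and the locals `lowerPrefix` and `capitalized` are made up for illustration; only `tokens`, `header`, `token`, `tokenLower` and `end` come from the commit.

```java
import java.util.ArrayList;
import java.util.List;

class SubstringHoistSketch {

    // Hypothetical helper: mirrors the two "if" blocks quoted in the commit,
    // for one value of end. The surrounding do/while loop is not reproduced.
    static void addVariants(List<String> tokens, String header, String token, int end) {
        String tokenLower = token.toLowerCase();
        // Computed once and reused instead of calling substring(0, end) repeatedly.
        String lowerPrefix = tokenLower.substring(0, end);

        if (!token.substring(0, end).equals(lowerPrefix)) {
            tokens.add(header + lowerPrefix);
            if (header.length() > 0) {
                tokens.add(lowerPrefix);
            }
        }
        if (end > 1 && token.charAt(0) >= 'A' && token.charAt(0) <= 'Z') {
            // Same value as token.charAt(0) + tokenLower.substring(1, end).
            String capitalized = token.charAt(0) + lowerPrefix.substring(1);
            tokens.add(header + capitalized);
            if (header.length() > 0) {
                tokens.add(capitalized);
            }
        }
    }

    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>();
        addVariants(tokens, "Subject*", "FREE", 4);
        System.out.println(tokens);
    }
}
```

The behaviour should be unchanged; the only difference is that the lowercase prefix is built once per iteration rather than on every use.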
Stefano
Vincenzo Gianferrari Pini wrote:
Bernd,
I don't understand what you mean by "duplicated substrings".
If you mean the substrings added to the tokens ArrayList, only the most
significant one (the one with the highest "probability strength") is
later kept by the calling method (getTokenProbabilityStrengths). This is
how it is expected to work.
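As a rough illustration of "only the most significant one is kept", the sketch below picks the strongest variant from a list. It is not the actual getTokenProbabilityStrengths code: the class `StrongestVariantSketch`, the methods `strength` and `strongest`, the definition of strength as distance from 0.5, and the neutral default of 0.5 for unseen tokens are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StrongestVariantSketch {

    // Assumption: "strength" is how far a probability lies from the neutral 0.5.
    static double strength(double probability) {
        return Math.abs(0.5 - probability);
    }

    // Hypothetical helper: returns the variant with the highest strength.
    // Variants missing from the corpus are treated as neutral (assumption).
    static String strongest(List<String> variants, Map<String, Double> corpus) {
        String best = null;
        double bestStrength = -1.0;
        for (String variant : variants) {
            double s = strength(corpus.getOrDefault(variant, 0.5));
            if (s > bestStrength) {
                bestStrength = s;
                best = variant;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> corpus = new HashMap<>();
        corpus.put("free", 0.9);
        corpus.put("Free", 0.6);
        corpus.put("Subject*free", 0.99);
        List<String> variants =
                Arrays.asList("Subject*free", "free", "Subject*Free", "Free");
        System.out.println(strongest(variants, corpus));
    }
}
```

Under these assumptions, adding several variants of the same token costs some extra lookups, but only one of them ends up influencing the result.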
If you have seen something else, please let me know.
Vincenzo
Bernd Fondermann wrote:
Vincenzo,
Do you intend to also eliminate the duplicated substrings, or would that
not significantly lower memory/CPU load?
Bernd
On 8/30/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
Modified:
james/server/trunk/src/java/org/apache/james/util/BayesianAnalyzer.java
do {
+ if (!token.substring(0, end).equals(tokenLower.substring(0, end))) {
+ tokens.add(header + tokenLower.substring(0, end));
if (header.length() > 0) {
+ tokens.add(tokenLower.substring(0, end));
}
}
if (end > 1 && token.charAt(0) >= 'A' && token.charAt(0) <= 'Z') {
end).toLowerCase());
+ tokens.add(header + token.charAt(0) + tokenLower.substring(1, end));
if (header.length() > 0) {
+ tokens.add(token.charAt(0) + tokenLower.substring(1, end));