On 18 Dec 2007, at 13:26, Grant Ingersoll wrote:
I might be missing something here, but why do you clone?
Because the Token is changing and I am not saving all Tokens, just
the ones changed.
Aha!
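To spell out why a sink has to clone: when the producer keeps overwriting one reusable token, anything that stores the bare reference ends up holding the last value only. Below is a minimal sketch of that effect. MiniToken and ReuseDemo are hypothetical stand-ins made up for illustration, not Lucene's Token or TeeTokenFilter.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
    // Hypothetical stand-in for a reusable token; not Lucene's Token.
    static class MiniToken implements Cloneable {
        String term;

        @Override
        public MiniToken clone() {
            try {
                return (MiniToken) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen, the class is Cloneable
            }
        }
    }

    // Saves every token coming from a producer that overwrites one shared instance.
    public static List<String> collect(String[] terms, boolean cloneBeforeSaving) {
        MiniToken reusable = new MiniToken();
        List<MiniToken> saved = new ArrayList<>();
        for (String term : terms) {
            reusable.term = term; // the producer mutates the same object each time
            saved.add(cloneBeforeSaving ? reusable.clone() : reusable);
        }
        List<String> out = new ArrayList<>();
        for (MiniToken t : saved) {
            out.add(t.term);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] terms = {"a", "b", "c"};
        System.out.println(collect(terms, false)); // [c, c, c]
        System.out.println(collect(terms, true));  // [a, b, c]
    }
}
```

Without the clone, every saved entry is the same object and shows the last term; with it, each saved entry keeps its own state.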
The first thing to note is that TeeTokenFilter (TTF) is much
_slower_ in the case that all tokens are siphoned off (X = 1).
I added a line to the benchmark so that the "Two fields" case does the same
work as the "Tee fields" case.
next = new Token();
result = new ModuloTokenFilter(new StandardFilter(
    new StandardTokenizer(new StringReader(buffer.toString()))), modCounts[j]);
while ((next = result.next(next)) != null) {
+ next.clone(); // simulate what sink does
tfPos += next.getPositionIncrement();
}
}
long finish = System.currentTimeMillis();
System.out.println("ModCount: " + modCounts[j] + " Two fields took "
    + (finish - start) + " ms");
This has some effect, but much less than tweaking JVM settings:
MacBook running
-server -Xms64M -Xmx256M
-----Tokens: 50000-----
ModCount: 1 Two fields took 1943 ms
ModCount: 1 Tee fields took 1172 ms
ModCount: 2 Two fields took 833 ms
ModCount: 2 Tee fields took 759 ms
ModCount: 5 Two fields took 632 ms
ModCount: 5 Tee fields took 473 ms
-client -Xmx256M
-----Tokens: 50000-----
ModCount: 1 Two fields took 2025 ms
ModCount: 1 Tee fields took 2537 ms
ModCount: 2 Two fields took 1535 ms
ModCount: 2 Tee fields took 1479 ms
ModCount: 5 Two fields took 1314 ms
ModCount: 5 Tee fields took 1088 ms
Then I cut down the time spent even more by rewriting Token#clone:
-server -Xms64M -Xmx256M
-----Tokens: 50000-----
ModCount: 1 Two fields took 1263 ms
ModCount: 1 Tee fields took 854 ms
ModCount: 2 Two fields took 692 ms
ModCount: 2 Tee fields took 562 ms
ModCount: 5 Two fields took 799 ms
ModCount: 5 Tee fields took 432 ms
-client -Xmx256M
ModCount: 1 Two fields took 1600 ms
ModCount: 1 Tee fields took 1981 ms
ModCount: 2 Two fields took 1363 ms
ModCount: 2 Tee fields took 1244 ms
ModCount: 5 Two fields took 1252 ms
ModCount: 5 Tee fields took 896 ms
// public Object clone() {
// try {
// Token t = (Token)super.clone();
// if (termBuffer != null) {
// t.termBuffer = null;
// t.setTermBuffer(termBuffer, 0, termLength);
// }
// if (payload != null) {
// t.setPayload((Payload) payload.clone());
// }
// return t;
// } catch (CloneNotSupportedException e) {
// throw new RuntimeException(e); // shouldn't happen
// }
// }
public Object clone() {
Token t = new Token();
t.setStartOffset(startOffset);
t.setEndOffset(endOffset);
t.setType(type);
t.setPositionIncrement(positionIncrement); // carry over the position increment too
if (termBuffer != null) {
t.termBuffer = null;
t.setTermBuffer(termBuffer, 0, termLength);
}
if (payload != null) {
t.setPayload((Payload) payload.clone());
}
return t;
}
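For comparison, here is a self-contained sketch of the two copy strategies on a made-up class. MiniToken, CloneDemo and the method names are assumptions for illustration only, and the sketch says nothing about speed; it only checks that both variants yield the same state and an independent buffer. One thing it makes visible: the field-by-field variant has to name every field explicitly, whereas Object.clone() picks them all up.

```java
import java.util.Arrays;

public class CloneDemo {
    // Hypothetical stand-in with a few token-like fields; not Lucene's Token.
    static class MiniToken implements Cloneable {
        char[] termBuffer;
        int termLength;
        int startOffset;
        int endOffset;
        int positionIncrement = 1;

        // Variant 1: Object.clone() copies all fields shallowly, then the buffer is deep-copied.
        MiniToken cloneViaSuper() {
            try {
                MiniToken t = (MiniToken) super.clone();
                if (termBuffer != null) {
                    t.termBuffer = Arrays.copyOf(termBuffer, termLength);
                }
                return t;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen, the class is Cloneable
            }
        }

        // Variant 2: fresh instance, every field copied by hand.
        MiniToken cloneViaNew() {
            MiniToken t = new MiniToken();
            t.startOffset = startOffset;
            t.endOffset = endOffset;
            t.positionIncrement = positionIncrement;
            t.termLength = termLength;
            if (termBuffer != null) {
                t.termBuffer = Arrays.copyOf(termBuffer, termLength);
            }
            return t;
        }
    }

    static MiniToken sample() {
        MiniToken t = new MiniToken();
        t.termBuffer = "lucene".toCharArray();
        t.termLength = 6;
        t.startOffset = 10;
        t.endOffset = 16;
        t.positionIncrement = 3;
        return t;
    }

    // True if both variants produce copies with identical state.
    public static boolean copiesAgree() {
        MiniToken a = sample().cloneViaSuper();
        MiniToken b = sample().cloneViaNew();
        return Arrays.equals(a.termBuffer, b.termBuffer)
            && a.termLength == b.termLength
            && a.startOffset == b.startOffset
            && a.endOffset == b.endOffset
            && a.positionIncrement == b.positionIncrement;
    }

    // True if changing the original's buffer leaves the copy untouched.
    public static boolean copyIsIndependent() {
        MiniToken original = sample();
        MiniToken copy = original.cloneViaNew();
        original.termBuffer[0] = 'X';
        return copy.termBuffer[0] == 'l';
    }

    public static void main(String[] args) {
        System.out.println("copies agree: " + copiesAgree());
        System.out.println("copy independent: " + copyIsIndependent());
    }
}
```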
Is my test/thinking valid on this idea?
Yes, I think so.
--
karl