On 18 Dec 2007, at 13:26, Grant Ingersoll wrote:
I might be missing something here, but why do you clone?

Because the Token is changing and I am not saving all Tokens, just the ones changed.

Aha!
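
To make the reason for the clone concrete, here is a minimal sketch. It assumes a producer that keeps overwriting one reusable token instance, as the reusable next(Token) call in the benchmark suggests. ReusableToken and CloneDemo are hypothetical stand-ins for illustration, not Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a reusable token; not Lucene's Token class.
class ReusableToken implements Cloneable {
    String term;

    @Override
    public ReusableToken clone() {
        try {
            return (ReusableToken) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class implements Cloneable
        }
    }
}

class CloneDemo {
    // Overwrites one token instance per word and saves every keepEvery-th one.
    static List<String> collect(String[] words, int keepEvery, boolean cloneBeforeSaving) {
        List<ReusableToken> saved = new ArrayList<>();
        ReusableToken reusable = new ReusableToken();
        for (int i = 0; i < words.length; i++) {
            reusable.term = words[i]; // the producer overwrites the same instance
            if (i % keepEvery == 0) {
                saved.add(cloneBeforeSaving ? reusable.clone() : reusable);
            }
        }
        List<String> terms = new ArrayList<>();
        for (ReusableToken t : saved) {
            terms.add(t.term);
        }
        return terms;
    }

    public static void main(String[] args) {
        String[] words = {"a", "b", "c", "d"};
        System.out.println(collect(words, 2, true));  // [a, c]
        System.out.println(collect(words, 2, false)); // [d, d]
    }
}
```

Without the clone, every saved reference points at the same instance, so all of them end up showing whatever was written last.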

The first thing to note is that TeeTokenFilter (TTF) is much _slower_ in the case that all tokens are siphoned off (X = 1).


I added a line to the benchmark so that the "Two fields" case does the same work as the "Tee fields" case.

         next = new Token();
         result = new ModuloTokenFilter(new StandardFilter(new StandardTokenizer(new StringReader(buffer.toString()))), modCounts[j]);
         while ((next = result.next(next)) != null) {
+           next.clone(); // simulate what sink does
           tfPos += next.getPositionIncrement();
         }
       }
       long finish = System.currentTimeMillis();
       System.out.println("ModCount: " + modCounts[j] + " Two fields took " + (finish - start) + " ms");
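
For readers without the benchmark at hand, the two cases being compared can be sketched roughly as follows. This is my reading of the setup, not the benchmark itself: "Two fields" analyzes the text once per field, while "Tee" analyzes once and copies every modCount-th token into a sink for the second field. TeeSketch and its methods are hypothetical, and a plain whitespace split stands in for the real analysis chain.

```java
import java.util.ArrayList;
import java.util.List;

class TeeSketch {
    static int tokenizeCalls = 0; // counts how often the (expensive) analysis runs

    // Stand-in for an analysis chain: splits on whitespace.
    static List<String> tokenize(String text) {
        tokenizeCalls++;
        List<String> out = new ArrayList<>();
        for (String s : text.trim().split("\\s+")) {
            out.add(s);
        }
        return out;
    }

    // "Two fields": analyze the text once per field; field 2 keeps every modCount-th token.
    static List<String> twoFields(String text, int modCount) {
        tokenize(text); // field 1 consumes all tokens
        List<String> second = new ArrayList<>();
        int i = 0;
        for (String tok : tokenize(text)) { // field 2 re-analyzes the same text
            if (i++ % modCount == 0) {
                second.add(tok);
            }
        }
        return second;
    }

    // "Tee": analyze once; copy every modCount-th token into a sink for field 2.
    static List<String> tee(String text, int modCount) {
        List<String> sink = new ArrayList<>();
        int i = 0;
        for (String tok : tokenize(text)) {
            if (i++ % modCount == 0) {
                sink.add(new String(tok)); // the copy stands in for Token.clone()
            }
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(twoFields("a b c d e f", 2)); // [a, c, e]
        System.out.println(tee("a b c d e f", 2));       // [a, c, e]
        System.out.println(tokenizeCalls);               // 3 (two for twoFields, one for tee)
    }
}
```

The trade-off is one extra analysis pass against one copy per siphoned token, which is why the cost of clone() matters most when modCount is 1.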

This has some effect, but much less than tweaking JVM settings:

MacBook running

-server -Xms64M -Xmx256M

-----Tokens: 50000-----
ModCount: 1 Two fields took 1943 ms
ModCount: 1 Tee fields took 1172 ms
ModCount: 2 Two fields took 833 ms
ModCount: 2 Tee fields took 759 ms
ModCount: 5 Two fields took 632 ms
ModCount: 5 Tee fields took 473 ms

-client -Xmx256M

-----Tokens: 50000-----
ModCount: 1 Two fields took 2025 ms
ModCount: 1 Tee fields took 2537 ms
ModCount: 2 Two fields took 1535 ms
ModCount: 2 Tee fields took 1479 ms
ModCount: 5 Two fields took 1314 ms
ModCount: 5 Tee fields took 1088 ms

Then I cut down the time spent even more by rewriting Token#clone:

-server -Xms64M -Xmx256M

-----Tokens: 50000-----
ModCount: 1 Two fields took 1263 ms
ModCount: 1 Tee fields took 854 ms
ModCount: 2 Two fields took 692 ms
ModCount: 2 Tee fields took 562 ms
ModCount: 5 Two fields took 799 ms
ModCount: 5 Tee fields took 432 ms

-client -Xmx256M

ModCount: 1 Two fields took 1600 ms
ModCount: 1 Tee fields took 1981 ms
ModCount: 2 Two fields took 1363 ms
ModCount: 2 Tee fields took 1244 ms
ModCount: 5 Two fields took 1252 ms
ModCount: 5 Tee fields took 896 ms


//  public Object clone() {
//    try {
//      Token t = (Token)super.clone();
//      if (termBuffer != null) {
//        t.termBuffer = null;
//        t.setTermBuffer(termBuffer, 0, termLength);
//      }
//      if (payload != null) {
//        t.setPayload((Payload) payload.clone());
//      }
//      return t;
//    } catch (CloneNotSupportedException e) {
//      throw new RuntimeException(e);  // shouldn't happen
//    }
//  }

  public Object clone() {
    Token t = new Token();
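    t.setStartOffset(startOffset);
    t.setEndOffset(endOffset);
    t.setType(type);
    // new Token() starts with the default increment, so copy it explicitly as super.clone() did
    t.setPositionIncrement(positionIncrement);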
    if (termBuffer != null) {
      t.termBuffer = null;
      t.setTermBuffer(termBuffer, 0, termLength);
    }
    if (payload != null) {
      t.setPayload((Payload) payload.clone());
    }
    return t;
  }
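
One thing to watch with a constructor-based copy is that every field has to be copied by hand, whereas super.clone() copies them all. The sketch below compares the two styles on a simplified token. MiniToken and its fields are hypothetical and only loosely mirror the Token discussed here.

```java
import java.util.Arrays;

// Hypothetical, simplified token; not Lucene's Token class.
class MiniToken implements Cloneable {
    char[] buffer;
    int length;
    int startOffset;
    int endOffset;
    int positionIncrement = 1;

    void setTerm(char[] src, int offset, int len) {
        buffer = Arrays.copyOfRange(src, offset, offset + len);
        length = len;
    }

    // Variant 1: Object.clone() copies every field; the array is then copied so it is not shared.
    MiniToken cloneViaSuper() {
        try {
            MiniToken t = (MiniToken) super.clone();
            if (buffer != null) {
                t.buffer = Arrays.copyOf(buffer, length);
            }
            return t;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class implements Cloneable
        }
    }

    // Variant 2: construct a fresh instance and copy each field by hand.
    MiniToken cloneViaConstructor() {
        MiniToken t = new MiniToken();
        t.startOffset = startOffset;
        t.endOffset = endOffset;
        t.positionIncrement = positionIncrement; // easy to forget; super.clone() copies it for free
        if (buffer != null) {
            t.setTerm(buffer, 0, length);
        }
        return t;
    }
}

class MiniTokenDemo {
    public static void main(String[] args) {
        MiniToken original = new MiniToken();
        original.setTerm("lucene".toCharArray(), 0, 6);
        original.startOffset = 10;
        original.endOffset = 16;
        original.positionIncrement = 3;

        MiniToken a = original.cloneViaSuper();
        MiniToken b = original.cloneViaConstructor();
        original.buffer[0] = 'L'; // mutate the original; the copies must not change

        System.out.println(new String(a.buffer, 0, a.length) + " " + a.positionIncrement); // lucene 3
        System.out.println(new String(b.buffer, 0, b.length) + " " + b.positionIncrement); // lucene 3
    }
}
```

Both variants give an independent copy here. The constructor version also fixes the concrete type to the base class, which would matter if the token were ever subclassed.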

Is my test/thinking valid on this idea?

Yes, I think so.


--
karl


