On Mon, 27 May 2002, Denis Benoit wrote:

> 1. In the generated page, there is a lot of consecutive:
> 
>       out.write("some string");
>       out.write("another string");
>       and so on.
> 
>    Why don't we merge all these consecutive strings together?
> 
>       out.write("some string\nanother string");

Since you mention this - String->byte conversion usually has a very
large performance penalty. 

With Coyote we do get the same optimizations that 3.3 is using for
reducing the GC involved with the conversion, but it remains 
an expensive operation.

If I remember correctly, it is now possible to use both the writer and
the output stream ( and if it's not, we can call some tomcat-specific
methods - even write directly to the OutputBuffer and avoid another copy ).

The only problem with using byte[] is that for some pages, it is possible
to have requests in multiple encodings, so we may have 2-3 different
byte[] representations for the same page.

Most current browsers use Unicode, and we could at least support this
case.

I think cutting out the String->byte conversion ( and possibly
one or two memcpy calls for the data ) would be quite significant.
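
To make the idea concrete, here is a rough sketch of what caching the
encoded form of a static fragment could look like. StaticChunk and its
methods are made-up names for illustration, not anything in Jasper or
Coyote, and the sketch assumes a modern JDK. It encodes each fragment
once per charset and reuses the byte[] on later requests:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Jasper or Coyote: holds one static
// fragment of a page and caches its encoded form per response charset.
final class StaticChunk {
    private final String text;
    private final Map<Charset, byte[]> encoded = new ConcurrentHashMap<>();

    StaticChunk(String text) {
        this.text = text;
    }

    // Encode once per charset; later calls return the cached array.
    // Callers must treat the returned array as read-only.
    byte[] bytesFor(Charset charset) {
        return encoded.computeIfAbsent(charset, cs -> text.getBytes(cs));
    }

    // Write the pre-encoded bytes straight to the stream, skipping the
    // per-request String->byte conversion.
    void writeTo(OutputStream out, Charset charset) throws IOException {
        out.write(bytesFor(charset));
    }

    // Number of distinct encodings currently cached for this chunk.
    int cachedEncodings() {
        return encoded.size();
    }
}
```

A page requested in two encodings ends up with two cached arrays per
fragment, which is the 2-3 representations cost mentioned above.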

I'm very interested in this subject - chunks of byte[] can
be further optimized at the connector level, by either NIO or
the web server, and can avoid sending the data over ajp. Not easy,
but it has huge potential.


> 2. This one has nothing to do with the size, it's just something that I think
>    we should plan for: tag reuse.  Some of the pages that have a lot of tags,
>    do so because they have them in an HTML table.  A "big" page can reference 
>    80 or so tags, but these tags can represent only four or five distinct
>    types.  It is not so difficult to find 80 tags in a page, but it would be
>    difficult to find one with 80 _distinct_ tag classes!  Most of these tags
>    could be reused, that is we often call:

This is already implemented in jasper ( the tomcat3.3 version ). I don't
think it'll reduce the method size ( the code is more complex ), but
it'll improve performance ( even with the best GC, memory allocation
and collection are never free ).

Unfortunately some pages will fail, due to bugs in tags that don't 
clean up properly. 

>   So, the specs seem to imply that tag reuse is allowed.

Yes, and it works pretty well - it's just that it'll expose several bugs
in tags ( not a direct problem for jasper, but for the users who'll
have a hard time figuring out what's happening ).
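
For anyone wondering what such a bug looks like, here is a small
self-contained sketch. ListTag and TagPool are invented for the example
and do not use the real javax.servlet.jsp.tagext classes or the actual
jasper pooling code. The cleansUp flag just switches between a tag that
resets its per-invocation state and one that forgets to:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical tag handler (a stand-in for javax.servlet.jsp.tagext.Tag,
// which is not used here). It collects items and renders them as a list.
class ListTag {
    private final List<String> items = new ArrayList<>();
    private final boolean cleansUp;

    ListTag(boolean cleansUp) {
        this.cleansUp = cleansUp;
    }

    void addItem(String item) {
        items.add(item);
    }

    String doEndTag() {
        String html = "<ul><li>" + String.join("</li><li>", items) + "</li></ul>";
        if (cleansUp) {
            items.clear();   // a well-behaved tag resets per-invocation state
        }
        return html;         // a buggy tag leaves the old items behind
    }
}

// Hypothetical minimal pool: hands back a previously used handler if
// one is available, otherwise creates a new one.
class TagPool {
    private final Deque<ListTag> free = new ArrayDeque<>();
    private final boolean cleansUp;

    TagPool(boolean cleansUp) {
        this.cleansUp = cleansUp;
    }

    ListTag get() {
        ListTag tag = free.poll();
        return (tag != null) ? tag : new ListTag(cleansUp);
    }

    void release(ListTag tag) {
        free.push(tag);
    }
}
```

Without reuse both variants behave the same, since every invocation gets
a fresh instance. With the pool, the second use of the tag that does not
clean up still carries the items from the first use.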

> optimize them.  Most tags won't see a big performance boost from reuse, but some
> tags can be pretty hefty, and for them, tag reuse can be a big factor.

Some tests showed quite significant boosts. Some extreme cases showed a
slowdown ( in pages with many distinct tags, the overhead is bigger and
there is no gain ). That was tested long ago; check the archives.



> Now, there were two approach to the "64K problem".  Kin-Man proposed creating
> methods for each custom tags, and Costin proposed a state machine.  I tend to

It was just an idea, not a real proposal. 

There are a few important things that happen if we reduce the size of the
class ( either by using a state machine for some of the logic, or by moving
more static content into data files - the largefile option ). First, we
reduce the footprint of the tomcat process, and we can make better use of
the memory ( the static data can be cached or manipulated much more easily
than a .class, where we have no control except the hope that it'll be
garbage collected ).
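
As a rough illustration of the second variant, here is what a page with
its static text moved out of the method body might look like.
ExternalizedPage is a made-up class, not what jasper generates, and the
table is passed in directly rather than read from a data file so that
the sketch runs on its own:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical shape of a generated page whose static text lives in a
// data table instead of string constants compiled into the method body.
class ExternalizedPage {
    // In a real setup this table would be loaded from a data file next
    // to the class; here it is built from the constructor argument.
    private final byte[][] chunks;

    ExternalizedPage(String[] staticText) {
        chunks = new byte[staticText.length][];
        for (int i = 0; i < staticText.length; i++) {
            chunks[i] = staticText[i].getBytes(StandardCharsets.UTF_8);
        }
    }

    // The service method only refers to chunks by index, so its size
    // does not grow with the amount of static text in the page.
    void service(ByteArrayOutputStream out, String userName) {
        out.write(chunks[0], 0, chunks[0].length);
        byte[] dynamic = userName.getBytes(StandardCharsets.UTF_8);
        out.write(dynamic, 0, dynamic.length);
        out.write(chunks[1], 0, chunks[1].length);
    }
}
```

Since the table is plain data, the container could drop it and reload it
from the file when memory is tight, which is not possible with constants
baked into a loaded class.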

The current mechanism doesn't scale well for a large number of pages
( 100,000s ? ) with uniform access. ( Right now we care more
about how a page scales under increased load, not how the system scales
as you add more documents. )

 
Costin

