Senaka Fernando wrote:
Hi Manjula, Thilina and others,

Yep, I think I hold exactly the same viewpoint as Thilina when it comes
to handling attachment data, at least for the chunking part. I think I
didn't get Thilina right in his first e-mail.

However, a file per MIME part may not always be optimal. I'd say rather
that each file should have a fixed maximum size, and if that is exceeded,
perhaps you can divide it in two. Also, a user should always be given the
option to choose between Thilina's method and this method through
axis2.xml (or services.xml). Thus, a user can fine-tune memory use.

When it comes to base64-encoded binary data, you can use a mechanism where
the buffer always has a size that is a multiple of 4; when you flush, you
decode the buffer and copy the result to the file. Caching should then
look essentially the same to the user.
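To make the idea concrete, here is a minimal sketch (the function names and the standalone decoder are mine, not anything in Axis2/C): base64 maps 3 bytes to 4 characters, so a buffer whose length is a multiple of 4 can always be decoded in full at flush time without splitting a 4-character quantum across flushes.

```c
#include <string.h>

/* Map a base64 character to its 6-bit value; -1 for '=' padding/invalid. */
static int b64_val(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode len base64 chars (len a multiple of 4, e.g. a full cache buffer
 * being flushed) into out; returns the number of bytes written. */
static size_t b64_decode_flush(const char *in, size_t len, unsigned char *out)
{
    size_t i, o = 0;
    for (i = 0; i + 3 < len; i += 4) {
        int v0 = b64_val(in[i]),     v1 = b64_val(in[i + 1]);
        int v2 = b64_val(in[i + 2]), v3 = b64_val(in[i + 3]);
        out[o++] = (unsigned char)((v0 << 2) | (v1 >> 4));
        if (v2 >= 0) out[o++] = (unsigned char)(((v1 & 0x0F) << 4) | (v2 >> 2));
        if (v3 >= 0) out[o++] = (unsigned char)(((v2 & 0x03) << 6) | v3);
    }
    return o;
}
```

Because the buffer length is a multiple of 4, padding ('=') can only ever appear in the very last quantum of the stream, so every intermediate flush decodes cleanly.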

OK, so Manjula, you mean the case where the MIME boundary appears
partially in the first read and partially in the second?

Well, the following is probably the best solution.

You allocate a buffer twice the size of the MIME boundary marker, and in
your very first read, you fill it completely (two boundary lengths). Then
you search the buffer for the MIME boundary. Next you do a memmove() to
shift everything from the midpoint of the buffer to the end down to the
beginning. After that, you read another half buffer (which, again, is the
size of the MIME boundary marker) into the region from the midpoint to
the end, and search again. You iterate this procedure until a read
returns less than half the size of the buffer.

If you are interested further in this mechanism: I used this approach for
resending binary data with TCPMon. You may check that as well.

Also, strstr() has issues when there is a '\0' in the middle of the data.
So you will have to use a temporary search pointer and advance it through
the buffer. Before calling strstr(), check whether strlen(temp) is
greater than or equal to the length of the MIME boundary marker. If it is
greater, you only need to search once; if it is equal, you need to search
exactly twice. If it is less, you advance temp past the '\0' (by
strlen(temp) + 1) and repeat until you cross the midpoint. This makes the
search more efficient.
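A sketch of that pointer-advancing workaround (my own illustration with made-up names; it advances temp by strlen(temp) + 1 so it steps past each '\0' rather than spinning on it):

```c
#include <string.h>

/* Scan a buffer that may contain embedded '\0' bytes by restarting
 * strstr() after each NUL. Precondition: buf[buf_len] == '\0', i.e. the
 * caller allocates one spare terminating byte past the data.
 * Returns the offset of marker in buf, or -1 if absent. */
static long search_past_nuls(const char *buf, size_t buf_len, const char *marker)
{
    const char *temp = buf;
    const char *end = buf + buf_len;
    size_t mlen = strlen(marker);

    while (temp < end) {
        size_t run = strlen(temp);   /* length of this NUL-free run */
        if (run >= mlen) {           /* run long enough to hold the marker */
            const char *hit = strstr(temp, marker);
            if (hit)
                return (long)(hit - buf);
        }
        temp += run + 1;             /* step past the '\0' and continue */
    }
    return -1;
}
```

Note this is safe only because a MIME boundary marker itself never contains '\0', so a match can never straddle one of the embedded NULs.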

If you want to make the search even more efficient, you can make the
buffer size one less than the size of the MIME boundary marker, so that
when you hit the equal scenario, you only have to search once.

What I've relied on here is that strstr() and strlen() behave the same
way within a given implementation. On Windows, if strlen() is
multibyte-aware, so is strstr(). So, no worries.

We already have an efficient parsing mechanism, tested and proven to work with 1.3. Why on earth are we discussing this over and over again?

Does caching get affected by the MIME parser logic? IMHO, no. They are two separate concerns, so why are we wasting time discussing parsing while the problem at hand is not parsing but caching?

Writing the partially parsed buffer to a file was a solution to caching. Do we have any other alternatives? If so, in short, what are they?

Samisa...


