Hi Samisa, Senaka and Thilina,

These are my view points on caching and how it can be done in Axis2/C.

For me the main purpose of caching is to support attachments of any
size, limited only by the size of the folder in which the attachment is
stored.

In the current implementation (Axis2/C 1.3) we read the whole message
and then parse it. Before 1.3 the same thing was done (i.e., we always
parsed fully read buffers, never partially read ones) in a very
inefficient manner.

Now to implement caching we need to change the current logic. In simple
steps these are the things we need to do.

1. First parse the part containing the SOAP envelope.
2. Then read up to some threshold and search for the MIME boundary.
3. If it is not found, move one half of the buffer to the file, append
an equal amount of content read from the stream, and parse again.

Step 3 is needed because, in the case of multiple attachments, MIME
boundaries can appear in the middle of the message.

We need the above steps to fully support caching when there are
multiple attachments.

If we assume there is only one attachment, then once the content
exceeds a certain threshold we can write the rest to the file without
parsing. Otherwise we need step 3, and that requires changing the
current logic.

What I described above is option 1.
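The sliding-window search in step 3 can be sketched roughly as follows.
This is a minimal illustration, not the Axis2/C API: the function names,
the fixed window, and the half-window flush are my assumptions. The key
point is that a boundary straddling the flush point is never lost,
because the whole second half of the window is retained (which requires
the boundary to be no longer than half the window).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the offset of boundary in buf[0..len), or -1 if not found. */
static long find_boundary(const char *buf, size_t len,
                          const char *boundary, size_t blen)
{
    if (blen == 0 || len < blen)
        return -1;
    for (size_t i = 0; i + blen <= len; i++)
        if (memcmp(buf + i, boundary, blen) == 0)
            return (long)i;
    return -1;
}

/* Cache stream bytes up to (not including) the boundary. Returns the
 * number of bytes written to cache, or -1 on EOF before the boundary
 * appears. Keeping the whole second half of the window guarantees that
 * a boundary straddling the refill point is still found, provided
 * blen <= window / 2. */
static long cache_until_boundary(FILE *stream, FILE *cache,
                                 const char *boundary, size_t window)
{
    size_t blen = strlen(boundary);
    if (blen == 0 || blen > window / 2)
        return -1;                      /* boundary must fit in kept half */

    char *buf = malloc(window);
    if (!buf)
        return -1;

    size_t fill = fread(buf, 1, window, stream);
    long cached = 0;

    for (;;) {
        long pos = find_boundary(buf, fill, boundary, blen);
        if (pos >= 0) {                 /* boundary found: flush prefix */
            fwrite(buf, 1, (size_t)pos, cache);
            cached += pos;
            free(buf);
            return cached;
        }
        if (fill < window) {            /* stream ended, boundary missing */
            free(buf);
            return -1;
        }
        /* Step 3: flush only the first half; a boundary starting in the
         * second half survives into the next search. */
        size_t half = window / 2;
        fwrite(buf, 1, half, cache);
        cached += half;
        memmove(buf, buf + half, fill - half);
        fill = (fill - half) + fread(buf + (fill - half), 1, half, stream);
    }
}
```

With this scheme the memory used is bounded by the window size rather
than the attachment size, which is exactly what option 1 is after.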

Option 2 is, as I suggested previously, to keep the current logic and,
after parsing, write any attachment that exceeds the limit to a file.
Please read the first mail in this thread for more on this. But this
does not achieve the main purpose of caching, as I mentioned at the
beginning.

Thanks,
-Manjula.   





On Tue, 2008-03-18 at 03:47 +0530, Samisa Abeysinghe wrote:
> Senaka Fernando wrote:
> >> Manjula Peiris wrote:
> >>     
> >>> On Sun, 2008-03-16 at 16:26 +0530, Samisa Abeysinghe wrote:
> >>>
> >>>
> >>>       
> >>>> We have an efficient parsing mechanism already, tested and proven to
> >>>> work, with 1.3. Why on earth are we discussing this over and over
> >>>> again?
> >>>>
> >>>> Does caching get affected by the mime parser logic? IMHO no. They are
> >>>> two separate concerns, so why are we wasting time discussing parsing
> >>>> while the problem at hand is not parsing but caching?
> >>>>
> >>>>         
> >>> No, the current implementation starts parsing only after reading the
> >>> whole stream. Because of that the parsing is simple and efficient, and
> >>> it works well even for considerably large attachments (e.g., 100MB).
> >>> The only problem it has is that the attachment size is limited by the
> >>> available system memory.
> >>>
> >>>       
> >> Still, my argument on the separation of concerns on caching vs. parsing
> >> holds.
> >> It is a question about what takes precedence over the other. If the
> >> attachment is too large, we need to interleave the concepts, where you
> >> read a considerable amount that is ideal in size in terms of caching,
> >> parse it for MIME, and then cache it and move on.
> >>     
> >
> > Parsing will always be choice No. 1. We cache only if we can't handle it.
> >
> > However, the real issue is how are we going to implement "parse it for
> > MIME, and then cache it and move on". I still think that it is better to
> > stick to Thilina's viewpoint in having each attachment cached as a
> > separate file. And, each attachment should be cached, even if it is small
> > or large, when the content-length exceeds the threshold. This is because
> > many small attachments == one big attachment. Thus, I'm still on the
> > parse_1st->cache_1st->parse_2nd->cache_2nd->... approach. I don't think
> > that a cache all at once will give us desirable results.
> >   
> 
> I do not think you understand what I am talking about. Separate
> attachments do need to go to separate files. There is no question about
> that. The question here is not about multiple attachments. The
> question is about "very large attachment".
> 
> Samisa...
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 


