RE: Which API should I use for a web app?

Andreas Probst Thu, 01 Apr 2004 12:32:59 -0800

Hi Ritu,

I'm afraid I can't share my code as I work in a closed source 
project...


Please see my answers inline:

On 1 Apr 2004 at 13:02, Ritu Kedia wrote:

> Hi Andreas,
> 
> Could you please share the performance enhancements that you mentioned
> below? Are those enhancements generic enough to be incorporated in Slide
> 2.0?

What I did is not generic enough to be incorporated into the 
Slide product. Besides, I had to modify a pre-beta version that 
still uses the old database schema.

> 
> Also it would be great if you could provide more details of your
> configuration in which you have upwards of 150,000 documents?

This is hard to answer. I test on a virtual Windows server 
running on a normal P4 workstation. Windows has 700 MB RAM. I 
think Tomcat has 256 or 500 MB RAM, but I'm not sure at the 
moment. Slide has to share it with two other complex web apps. 
The database runs on MS SQL 2000.

> More specifically I am looking at the following:
> Which Slide APIs are you using (Server, Client, WVCM)?

HTTP PUT from the client side.

> What is the end-to-end configuration? (e.g. DesktopClient --> WebServer
> --> J2EE App Server --> Servlet --> EJB --> SlideServerAPI --> J2EEStore for
> Metadata + FileStore for Content)

Java-Client during tests ---WebDAV--> Apache/Tomcat --> 
RDBMS/Custom Content Repository.

> Is slide authorization turned on or off?
on, but currently I'm just using a single user. So far it is 
only planned as a single-user scenario. The single user is 
another software system.

> What store are you using for Metadata?
custom jdbc

> What store are you using for Content?
custom document management system

> What is the average size of the documents?
for tests I used 10 bytes, but in real life it will be normal 
business documents.

> What is the average depth of top most collection under /files?
/files/test1, /files/test2 and so on

> What is the peak load (no. of concurrent users)?
So far I have only run single-threaded, single-client tests to 
make the performance easier to measure. It's so slow that there 
is no need to put more load onto it.

> Does your client do download of complete top level folders (recursive
> download of all files in all sub-folders)? if yes, approximately how long
> does it take for a top level folder with 1000 documents each being of approx
> .5 MB?

As our performance problem has been putting and deleting MANY 
documents so far, I tested just put and delete. 
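
For context, the put/delete cost comes from how child entries are 
stored (point 2 of the quoted message further down). The following is 
a minimal sketch of that difference, not the actual closed-source 
change and not Slide's API: ChildTable, storeWithFullRewrite and 
rowOperations are hypothetical names, and an in-memory set stands in 
for the children table. Only addChild and removeChild echo the method 
names suggested in the thread.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class ChildUpdateDemo {
    public static void main(String[] args) {
        ChildTable rewriteAll = new ChildTable();
        ChildTable incremental = new ChildTable();
        for (int i = 0; i < 1000; i++) {
            rewriteAll.storeWithFullRewrite("doc" + i);
            incremental.addChild("doc" + i);
        }
        System.out.println("full rewrite row operations: " + rewriteAll.rowOperations());
        System.out.println("incremental row operations:  " + incremental.rowOperations());
    }
}

// Hypothetical in-memory stand-in for the child rows of one collection.
class ChildTable {
    private final Set<String> rows = new LinkedHashSet<>();
    private long rowOperations; // counts simulated row deletes and inserts

    // Behaviour described in the thread: delete every existing child row,
    // then save all of them again together with the new child.
    void storeWithFullRewrite(String newChild) {
        Set<String> all = new LinkedHashSet<>(rows);
        all.add(newChild);
        rowOperations += rows.size(); // one delete per old row
        rows.clear();
        rows.addAll(all);
        rowOperations += all.size();  // one insert per row, old and new
    }

    // Suggested alternative: touch only the row that changes.
    void addChild(String child) {
        if (rows.add(child)) {
            rowOperations++;
        }
    }

    void removeChild(String child) {
        if (rows.remove(child)) {
            rowOperations++;
        }
    }

    long rowOperations() {
        return rowOperations;
    }

    int size() {
        return rows.size();
    }
}
```

In this toy model the full rewrite grows quadratically with the 
number of children in a collection, while the incremental variant 
stays linear. Real database timings will of course differ.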

With the change to load children only on demand, PUT became a 
lot faster. However, browsing a collection won't be faster, 
because the children have to be loaded then. I think the size of 
the children does not matter. Most of the run time is spent 
reading the information from the children table and 
instantiating a SubjectNode for each child. Of course size does 
matter when downloading the files.
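
A minimal sketch of the load-on-demand idea, under stated 
assumptions: LazyCollectionNode is a hypothetical stand-in and not 
Slide's SubjectNode, and the Function passed to it is a hypothetical 
replacement for the "pointer to the right NodeStore" mentioned in the 
quoted message below. It only illustrates why creating the node gets 
cheap while browsing still pays for the lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class LazyChildrenDemo {
    public static void main(String[] args) {
        AtomicInteger lookups = new AtomicInteger();
        // Fake store that counts how often the "children table" is read.
        Function<String, List<String>> fakeStore = uri -> {
            lookups.incrementAndGet();
            return Arrays.asList(uri + "/doc1", uri + "/doc2");
        };

        LazyCollectionNode node = new LazyCollectionNode("/files/test1", fakeStore);
        System.out.println("lookups after creating the node: " + lookups.get());
        System.out.println("children: " + node.getChildren());
        node.getChildren(); // second call is served from the cached list
        System.out.println("lookups after browsing twice: " + lookups.get());
    }
}

// Hypothetical collection node that defers reading its children.
class LazyCollectionNode {
    private final String uri;
    private final Function<String, List<String>> childLookup;
    private List<String> children; // stays null until first requested

    LazyCollectionNode(String uri, Function<String, List<String>> childLookup) {
        this.uri = uri;
        this.childLookup = childLookup;
    }

    String getUri() {
        return uri;
    }

    // Reads the children from the store on first access only.
    synchronized List<String> getChildren() {
        if (children == null) {
            children = new ArrayList<>(childLookup.apply(uri));
        }
        return Collections.unmodifiableList(children);
    }

    synchronized boolean childrenLoaded() {
        return children != null;
    }
}
```

A PUT that only needs the collection node itself never triggers the 
lookup in this sketch, whereas a listing triggers it once.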

I hope this helps. Unfortunately I cannot give more information 
about the project.

Andreas
> 
> 
> Does Slide have test scripts for load tests? Or has anyone written their own
> test scripts to do the same?
> 
> Regards,
> Ritu
> 
> 
> 
> > -----Original Message-----
> > From: Andreas Probst [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, March 30, 2004 11:11 PM
> > To: Slide Users Mailing List
> > Subject: RE: Which API should I use for a web app?
> > 
> > 
> > On 30 Mar 2004 at 16:26, [EMAIL PROTECTED] wrote:
> > 
> > > >>If you use the slide API for storing data from your app, take into
> > > >>account that it is really complicated to store content in a way that
> > > >>you can use the versioning stuff, because all of the versioning is done
> > > >>in the webdav layer.
> > > 
> > > What do you mean? The slide API can manage NodeRevisionDescriptors and
> > > NodeRevisionDescriptor.
> > > Is that not what you expect to do? Or are you speaking about other
> > > features?
> > > 
> > > >>For fast content retrieval in the same vm, slide 
> > > >>API might be a good choice.
> > > 
> > > I'm using the Slide API from a Jetspeed service and it works fine. We have
> > > roughly 20,000 documents and no problem at all.
> > > Maybe, if we have more and more documents, this solution will not be
> > > scalable. So, the next plan is to access different "external" repositories
> > > via the webdav client.
> > 
> > Hi Christophe,
> > 
> > there are major performance issues in the Slide kernel and 
> > database layer.
> > 
> > 1. A collection SubjectNode always knows about all its children. 
> > As collections grow, the time to retrieve a collection 
> > SubjectNode increases. In addition, all child SubjectNodes 
> > are instantiated to prepare the binding information.
> > 
> > A solution would be to load the information about the children 
> > only on demand. To do this, the SubjectNode needs a pointer to 
> > the right NodeStore, which in turn needs some new methods. I 
> > implemented this by subclassing SubjectNode, which was made 
> > more complicated than necessary by some private members and 
> > methods in ObjectNode :-( Of course some WebDAV methods need 
> > to be adapted to use the custom SubjectNode.
> > 
> > 2. When adding a new child to a collection resource, all the old 
> > child entries of the collection resource are deleted, just to be 
> > saved again afterwards together with the information about the 
> > new child. The same is true for removing children.
> > 
> > A solution would be to extend the NodeStore interface with 
> > methods such as addChild and removeChild. Of course 
> > StructureImpl needs to be adapted too.
> > 
> > Having made these two enhancements, I can tell you the 
> > performance has increased dramatically, especially with many 
> > documents (>1000). Nevertheless, the old database schema with 
> > "slow" datatypes (CLOB), which I had to use when making the 
> > changes, prevents using Slide with a really large number of 
> > documents: on a server with 150,000 documents (70,000 in one 
> > collection) a put of a new document still takes a few seconds, 
> > and unfortunately that time rises with every new document. 
> > Maybe the new database schema is much better in this regard, 
> > but the two problems above remain.
> > 
> > Regards,
> > 
> > Andreas
> > 
> > > 
> > > Christophe
> > > 
> > > 
> > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > > 
