Hey guys,

Thank you so much for your discussion on this. I agree with Bartosz that we probably don't want too many files under one directory for editing purposes. In some cases, though, it will be difficult to control the number of files users create under a directory; or, as Reinier mentioned, a process drops files into the repository that only the frontend looks up.
We've decided to split the files into subdirectories, so we didn't pursue the performance tuning any further. Here are some of our findings, in case anyone is interested:

1. We are running Hippo CMS with an Oracle backend on Red Hat. We start to get long waiting times in the CMS's explorer when opening a directory with about 1000 files (probably ~1 min).
2. It took about 2 hours to upload 30000 small XML files to the repository.
3. To clean out a big directory, the easiest way is to filesync it against an empty local directory. Deleting files using a WebDAV client like Konqueror doesn't seem to clear out the version_content table.

Regards,
Jun

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bart van der Schans
Sent: Tuesday, November 11, 2008 1:46 AM
To: Hippo CMS development public mailinglist
Subject: Re: [HippoCMS-dev] too many files in one directory & performancetuning

On 11-11-2008 10:35, Reinier van den Born wrote:
> Hmmmm,
>
> Bartosz Oudekerk wrote:
> > [EMAIL PROTECTED] wrote:
> >> Hi everyone,
> >>
> >> I found this article by Jasha about importing large number of files to
> >> hippo cms.
> >> http://blogs.hippo.nl/jasha/2008/07/importing_lots_of_data_into_hi.html
> >> In the last paragraph, Jasha mentioned that Slide doesn't like too many
> >> files in one directory.
> >> Does anyone know what this threshold is? What's your experience with
> >> large amount of files in one directory with hippo?
> >
> > I don't think there's a fixed threshold, it's simply a question of the
> > more you put in, the slower it will get.
> >
> >> We tried around 10000 files in one dir, which took Hippo Repository
> >> almost 5 mins for a complete file listing (with default flatfile
> >> backend). So here is the second question: what can we do to tune the
> >> performance? I am going to try it with db backend, as well as increasing
> >> the memory. What are the other things that I can try?
> >
> > First of all, a 1000 files in one directory, will be unmanageable for
> > your editors, try finding one specific file in such a listing. And even
> > if you could tune it to be faster, what would be acceptable? twice as
> > fast? You'll get much more performance gain, by simply putting less
> > files in a folder.
>
> Who says editors need to manage those files?
> In one of my Hippo installations files are put in the repository by an
> external application.
> Nobody but the frontend looks at them.
> It would be very inconvenient to have to spread them over
> subdirectories, to say the least.
> Luckily for now the number of files is in the hundreds, so I need not
> worry. Yet.
>
> Question is, of course, what is Slide's problem?
> Getting a folder listing should in principle be linear in the number of
> entries.
> If it sorts the entries this would get worse, but sorting of 10000
> entries should not be a real problem nowadays.
> Is this an intrinsic problem of WebDav, or is it the implementation in
> slide?
>
> Would a workaround using a DASL help? That gets entries from an index,
> which, I assume, is much faster.
> For a frontend that could easily be implemented.

Using a DASL can help. Getting the listing will indeed get too slow to
be usable if you add tens of thousands of files in a folder. But keep in
mind that if you use any (WebDAV) application with the repository other
than your home-built one, chances are high that it will do a document
listing.

How many documents you can have in one folder also depends on your
backend. For example, filesystem performance on Windows is terrible if
you have a lot of documents in one folder. Database backends don't have
this extra overhead. Imho you'll just have to run some tests to find out
what is still usable for your application.

Regards,
Bart

--
Hippo B.V. - Amsterdam
Oosteinde 11, 1017 WT, Amsterdam, +31(0)20-5224466

Hippo USA Inc. - San Francisco
101 H Street, Suite Q, Petaluma CA, 94952-3329, +1 (707) 773-4646
-----------------------------------------------------------------
http://www.onehippo.com - [EMAIL PROTECTED]
-----------------------------------------------------------------

********************************************
Hippocms-dev: Hippo CMS development public mailinglist

Searchable archives can be found at:
MarkMail: http://hippocms-dev.markmail.org
Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
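
For anyone who ends up splitting files over subdirectories, as discussed in this thread, here is a minimal sketch of one way an importing process could pick the subdirectory. This is not how Hippo CMS or Slide does anything internally; the function names, the `/content/import` base path, and the file names are all made up for illustration. It takes a few hex characters of a hash of the file name as the folder name, so the same file always lands in the same folder and the frontend can compute the path without asking for a listing.

```python
import hashlib
from collections import Counter
from pathlib import PurePosixPath


def sharded_path(base_dir: str, filename: str, width: int = 2) -> str:
    """Return base_dir/<bucket>/filename, where <bucket> is the first
    `width` hex characters of the SHA-256 hash of the file name.

    Hypothetical helper: with width=2 there are at most 256 buckets.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    bucket = digest[:width]
    return str(PurePosixPath(base_dir) / bucket / filename)


def bucket_sizes(filenames, width: int = 2) -> Counter:
    """Count how many of the given file names fall into each bucket."""
    return Counter(
        hashlib.sha256(name.encode("utf-8")).hexdigest()[:width]
        for name in filenames
    )


if __name__ == "__main__":
    # 30000 made-up file names, the same order of magnitude as in the thread.
    names = [f"article-{i:05d}.xml" for i in range(30000)]
    sizes = bucket_sizes(names)
    print(len(sizes), "subdirectories, largest holds", max(sizes.values()), "files")
    print(sharded_path("/content/import", names[0]))
```

With 256 buckets, 30000 files average out to about 117 per folder, well below the ~1000 where the CMS explorer started to feel slow in our setup. The folder names are meaningless to editors, so this only makes sense for the machine-generated case Reinier described.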
