On 05/30/2013 02:51 PM, Ehresman,David E. wrote:
> We use Docave. IBM used to resale this as TSM for Sharepoint. They
> are no longer doing that but you can still get the same functionality
> directly from Docave.

M.o.l.a.s.s.e.s.

Seriously. When it takes hours and hours and hours AND HOURS, that's not a bug, that's how it works. As far as I can tell, it goes like this:

They started with a big website. To be efficient, they wanted it inside of a database, so they put it in a database.

They wanted to keep lots and lots of documents in the website (database), so they put blobs in the database.

Lots of blobs are hard to organize, so they wanted a familiar organizational schema. People are familiar with a filesystem, so they made it look sort of like a filesystem.

Sharepoint is well optimized for seamless integration with the Windows desktop, and that mode of use is explicitly encouraged in lots of places, so e.g. your project notes and stuff all go in "website" directories. Including rudimentary version control. So the blob count in the database skyrockets.

Access control metadata ramifies, so you've got a metadata vocabulary that is more complex and nuanced than just the NTFS universe: your access control includes opinions about the outside world, too. (Mostly "NO!", but you've got to write that down.)

...

All of this may be semi-obvious, but the point of restating it is to bring to the forefront of your mind that, from the back-end view, Sharepoint has a great deal in common with a complex, customized, locally-hacked-upon filesystem implementation.

So, what DocAve does is walk that filesystem and do an incremental. But instead of reading something optimized, down to the block level, for rapid access, it's doing a series of database accesses. Think about how long ls -l would take if every file metadata read required two or three index scans and their indicated block retrievals. Ick.

...

Our Sharepoint guy, Joe Gasper (Hi, Joe!) thinks I'm not talking too much bullshit with this description, and he adds that exporting the BLOBs out of the database is a critical performance enhancement for DR, and substantial for normal access, too.
Well tuned, he thinks a 95% reduction in the SQL database size is readily achievable. At that rate, you can fit the remaining 5% on SSD...

- Allen S. Rout
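
P.S. For anyone who wants to see the "ls -l by way of index lookups" effect in miniature, here is a toy Python sketch. To be clear about the assumptions: the tables (docs, versions, acls) and the three-lookups-per-item figure are made up for illustration. This is not Sharepoint's real schema and not what DocAve literally does. It only shows that the query count tracks the number of items visited, not the number that changed.

```python
import sqlite3


def build_store(n_docs):
    """Create a toy in-memory 'content database' (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE docs (id INTEGER PRIMARY KEY, path TEXT, body BLOB);
        CREATE TABLE versions (doc_id INTEGER, modified INTEGER);
        CREATE TABLE acls (doc_id INTEGER, rule TEXT);
        CREATE INDEX versions_by_doc ON versions (doc_id);
        CREATE INDEX acls_by_doc ON acls (doc_id);
    """)
    for i in range(n_docs):
        db.execute("INSERT INTO docs VALUES (?, ?, ?)",
                   (i, f"/site/notes/{i}.doc", b"x" * 100))
        # The fake modification "time" is just the document number.
        db.execute("INSERT INTO versions VALUES (?, ?)", (i, i))
        db.execute("INSERT INTO acls VALUES (?, ?)", (i, "deny-outside"))
    return db


def incremental_walk(db, since):
    """Visit every document, as a file-level incremental would.

    Returns (changed_paths, number_of_queries_issued).
    """
    queries = 1  # the initial listing of all document ids
    ids = [row[0] for row in db.execute("SELECT id FROM docs ORDER BY id")]
    changed = []
    for doc_id in ids:
        # Three separate lookups per item: name, newest version, ACL.
        path = db.execute("SELECT path FROM docs WHERE id = ?",
                          (doc_id,)).fetchone()[0]
        modified = db.execute(
            "SELECT MAX(modified) FROM versions WHERE doc_id = ?",
            (doc_id,)).fetchone()[0]
        db.execute("SELECT rule FROM acls WHERE doc_id = ?",
                   (doc_id,)).fetchall()
        queries += 3
        if modified > since:
            changed.append(path)
    return changed, queries


changed, queries = incremental_walk(build_store(1000), since=989)
print(len(changed), "changed;", queries, "queries")
```

With 1000 items and 10 of them changed, the walk issues 3001 queries to find those 10. Scale the item count up to a real site and each of those queries up to a real index scan, and you get the hours and hours.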