On 19 Feb 2014, at 10:44 pm, Jim Fulton <j...@zope.com> wrote:

> On Tue, Feb 18, 2014 at 8:59 PM, Dylan Jay <d...@pretaweb.com> wrote:
>> Hi,
>>
>> I'm seeing a ZCatalog reindex of a large number of objects take a long
>> time while only using 10% CPU. I'm not sure yet if this is due to the size
>> of the objects and therefore the network is saturated, or the ZEO file reads
>> aren't fast enough.
>
> How heavily loaded is your storage server, especially %CPU of the
> server process?

No, not heavily loaded.

> Also, are the ZODB object or client caches big enough for the job?

I'm not sure the caches would ever be big enough, since it's iterating over 1.7M objects.

>> However looking at the protocol I didn't see a way for code such as the
>> ZCatalog to give a hint to ZEO as to what they wanted to load next so the
>> time is taken by network delays rather than either ZEO or app. Is that the
>> case?
>
> It is the case that a ZEO client does one read at a time and that
> there's no easy way to pre-load objects.
>
>> I'm guessing if it is, it's a fundamental design problem that can't be fixed
>> :(
>
> I don't think there's a *fundamental* problem. There are three
> issues. The hardest to solve isn't at the storage level. I'll mention
> the 2 easiest problems first:
>
> 1. The ZEO client implementation only allows one outstanding request at a
>    time, even on a client with multiple threads. This is merely a clumsy
>    implementation.
>
>    The protocol easily allows for multiple outstanding reads!
>
> 2. The storage API doesn't provide a way to read multiple objects at once, or
>    to otherwise hint that additional objects will be loaded.
>
> Both of these are fairly straightforward to fix. It's just a matter of time.
> :)
>
> 3. You have to be able to predict what data are going to be needed.
>
>    This IMO is rather hard, at least at a general level. It's what's left
>    me somewhat under-motivated to address the first 2 problems.
>
>    We really should address problems 1 and 2 to make it possible
>    for people to experiment with approaches to problem 3.

Yeah, I figured it might be hard to predict. In this case it's catalog indexing, so I was wondering if something could be done with __iter__ on a BTree? It's a reasonably good guess that you could start preloading more of those objects once the first few are loaded.
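
To make the preloading idea a bit more concrete, here's a rough sketch of iterating with a window of reads in flight. None of this is ZEO or BTrees API: SlowStorage, iter_serial and iter_prefetch are names I made up, the network round trip is simulated with a sleep, and threads merely stand in for the multiple outstanding reads the protocol would allow. It also assumes the oids are known ahead of time, which is exactly the prediction problem (3).

```python
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class SlowStorage:
    """Hypothetical stand-in for a ZEO client: every load pays a round trip."""

    def __init__(self, records, latency=0.005):
        self._records = records
        self._latency = latency

    def load(self, oid):
        time.sleep(self._latency)  # simulated network delay, not real I/O
        return self._records[oid]


def iter_serial(storage, oids):
    """One outstanding read at a time: the next load starts only after the previous returns."""
    for oid in oids:
        yield storage.load(oid)


def iter_prefetch(storage, oids, window=8):
    """Keep up to `window` loads in flight, yielding results in the original order."""
    pending = deque()  # futures in submission order
    with ThreadPoolExecutor(max_workers=window) as pool:
        for oid in oids:
            pending.append(pool.submit(storage.load, oid))
            if len(pending) >= window:
                # Oldest request first, so iteration order is preserved.
                yield pending.popleft().result()
        # Drain whatever is still in flight once the oids run out.
        while pending:
            yield pending.popleft().result()


if __name__ == "__main__":
    records = {oid: "object-%d" % oid for oid in range(100)}
    storage = SlowStorage(records)
    for fn in (iter_serial, iter_prefetch):
        start = time.perf_counter()
        list(fn(storage, range(100)))
        print(fn.__name__, "%.2fs" % (time.perf_counter() - start))
```

The serial loop pays the simulated latency once per object, while the windowed version overlaps up to `window` of them. Whether a BTree's __iter__ could supply the upcoming oids cheaply enough to feed something like this is the open question.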
> Jim
>
> --
> Jim Fulton
> http://www.linkedin.com/in/jimfulton