At 21:28 2002-08-10 -0400, Casey Duncan said:
>On Saturday 10 August 2002 11:25 am, Johan Carlsson [Torped] wrote:
> > Now that I understand how the data tuples are copied to the brain
> > I'm not at all sure adding a filter when copying the tuple will optimize
> > things, because of the overhead in the filter process.
>
>This occurs lazily so the savings would be heavily dependent on the
>application. For most web apps presenting small batches of records, the
>savings in limiting columns returned would be pretty minimal.

But there must be some thought behind implementing Record.pyd in C, though
of course I suppose Record.pyd was first used for ZSQL?

An easy filter would be to let __record_schema__ control which columns to
save. As it works today, __record_schema__ must map to a sequence of indexes
starting at 0, so I can't specify indexes into the tuple like this:

    __record_schema__ = {'hey': 12, 'dude': 22}

Maybe this is "easy" to change in Record.pyd, or I just implement it in a
special brain base class?

After revisiting Record.c I realized that the tuple from the catalog's
self.data is stored as a tuple (or as a C array, I suppose?) in a Record, or
as attributes, depending on what you provide to the constructor.
I suppose copying data to a C array is much faster than creating attributes
on each brain, but if the array is large and the number of attributes that
need to be set is small it might be the other way around. I have no idea
where they would break even.
Maybe I will just settle for having two different brain base classes and use
the one that suits the current need.

>The general usage is to put a minimal set of columns in metadata, only enough
>to create a results page and load the objects in cases where either large,
>dynamic or otherwise arbitrary data elements are needed.

Yes, and that is somewhat restricting. My current applications use several
different catalogs to get the width of the meta_data down.
The downside of this approach is that I end up with a lot of catalogs, which
means many times more management work, e.g. I must reindex all the catalogs
instead of just one.

My primary goals are:
1. Get a general ZCatalog that can be used for all ZCatalog requirements
   (not only site searches).
2. Implement features that remove the need for an external RDBMS (for
   instance, report generation is hard with ZCatalogs because of the lack
   of grouping/statistics).
3. Make ZCatalogs easier to manage. For instance, the need to update index
   and meta_data definitions every time you change your application's data
   structure is annoying, especially at development time.
   Objects could tell the ZCatalog which meta_data and indexes they want,
   removing the need to add them manually. Of course you will need to
   clean up the ZCatalog from time to time.

> > (The way that I "solved" the group/calc part of my "project", I don't think
> > it will lead to memory bloat. I'm going to implement a LazyGroupMap
> > which takes an extra parameter (a list of IISets). Each brain created
> > in the LazyMap will have methods for calculations directly on the self.data
> > in the Catalog. The data itself will not be stored.
> > There will most probably be a precalculate method that calculates all
> > variables that are applicable and caches the result.)
>
>Sounds like a pretty good solution. However, I would be hesitant in creating
>direct dependencies on the internal Catalog data structures if you can help
>it (sometimes you can't though).

I could "soften" the dependency by providing the catalog with an interface
for calculations, giving the brain a reference to the catalog itself, and
using the interface on that reference.

> > One way to reduce memory consumption in wide Catalogs would be
> > to have LazyBrains (vertical laziness; there might be reasons
> > why that would be a bad idea, which I'm not aware of)
>
>That would pretty much require a rewrite of the Catalog as the data
>structures would need to be completely different. It would introduce
>significant database overhead since each metadata field would need to be
>loaded individually. I think that would negate whatever performance benefit
>metadata might have over simply loading the objects.

I'm not sure that it would be necessary to change the data structure; the
brain could use the same method as the LazyMap uses to load the data.
But a LazyBrain would need to save all applicable data at once to be
efficient. The difference would be that the brain will not fetch any data
before the first attribute has been accessed. When the first one is accessed,
all applicable data will be copied to the attributes according to
__record_schema__.
This would probably not be more efficient for regular use of brains, but
calculated group brains wouldn't need to store the data at all if they only
used calculated fields.

> > Another way would be to have multiple data attributes in the Catalog, like
> > tables, and to join the tuples from them with a "from table1, table2"
> > statement.
> > In this way it would be possible to control the width of the brains.
> > It would also be possible for the object indexing itself to tell the
> > catalog in which "tables" it should store meta data.
>
>Yes, this would be better. You could have different sets of metadata for each
>catalog record. You would select which one you wanted at query time.

Yeah, I like it as well. It would also require a more SQL-like query
interface.

> > There have been some proposals (ObjectHub et al) which I read some
> > time ago. I didn't feel then that they were what I was looking for.
> > Please tell me if there have been any proposals or discussions regarding
> > this.
>
>I don't think so. If you feel strongly about this, write up a proposal and
>provide some use cases for discussion.

Yes, but first implementation :-) I'm very XP in that aspect.
I find code easier to communicate than specifications :-)
Or at least Python code; I don't find C code easier to communicate.

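In that spirit, here is a rough sketch of the lazy-brain idea in plain Python.
It is only an illustration under stated assumptions, not how Record.c or the
Catalog work today: LazyBrain, HeyDudeBrain, fetch_row and catalog_data are
hypothetical names, and letting __record_schema__ map names to arbitrary
tuple indexes is exactly the change discussed above, not current behaviour.

```python
class LazyBrain:
    """Hypothetical brain that copies metadata on first attribute access."""

    # Assumption: attribute name -> index into the catalog's data tuple.
    # The indexes do not have to start at 0 or be contiguous here.
    __record_schema__ = {}

    def __init__(self, rid, fetch_row):
        # Keep only the record id and a loader; no metadata is copied yet.
        self._rid = rid
        self._fetch_row = fetch_row

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails, so it runs
        # at most once per brain for the schema columns.
        schema = type(self).__record_schema__
        if name not in schema:
            raise AttributeError(name)
        row = self._fetch_row(self._rid)
        # Copy every applicable column in one go, as the schema dictates.
        for attr, index in schema.items():
            self.__dict__[attr] = row[index]
        return self.__dict__[name]


class HeyDudeBrain(LazyBrain):
    # Only two columns out of a wider tuple are exposed.
    __record_schema__ = {'hey': 1, 'dude': 3}


# Stand-in for the Catalog's self.data (hypothetical sample data).
catalog_data = {7: ('a', 'b', 'c', 'd')}
fetch_calls = []


def fetch_row(rid):
    # Record each load so the laziness can be observed.
    fetch_calls.append(rid)
    return catalog_data[rid]


brain = HeyDudeBrain(7, fetch_row)   # nothing has been fetched at this point
print(brain.hey, brain.dude)         # prints "b d" after a single fetch
```

A group brain that only uses calculated fields would never trigger
__getattr__ for a schema column, so it would never copy the tuple at all.
Whether this beats the C-array copy in Record.c is something I would have to
measure.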
Cheers,
Johan Carlsson

--
Torped Strategi och Kommunikation AB
Johan Carlsson
[EMAIL PROTECTED]

Mail:
Birkagatan 9
SE-113 36 Stockholm
Sweden

Visit:
Västmannagatan 67, Stockholm, Sweden

Phone +46-(0)8-32 31 23
Fax +46-(0)8-32 31 83
Mobile +46-(0)70-558 25 24
http://www.torped.se
http://www.easypublisher.com