Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB
On Mon, 27 Aug 2012 14:37:37 +0200, Vincent Pelletier vinc...@nexedi.com wrote: Under the hood, it relies on simple features of SQL databases To make things maybe a bit clearer, from the feedback I get: you can forget about the presence of SQL. NEO's usage of SQL is about as relational as a handful of Python dicts. Except that with dicts there is no way to load only part of a pickled dict, or do range searches (ZODB's BTrees are much better in this regard), or write them to disk atomically without having to implement that level of atomicity ourselves. Ideally, NEO would use something like libhail, or maybe something even simpler like Kyoto Cabinet (except that we need composed keys, and Kyoto Cabinet B-trees have, AFAIK, no such notion). SQL as a data definition language was simply too convenient during development (need a new column? Easy, even with a 40GB table), and it stuck; we have yet to hit a drawback significant enough to justify implementing a new storage backend. As a side effect, SQL allows gathering statistics over the data contained in a database very efficiently: number of current objects, number of revisions per object, number of transactions, when transactions occurred in the base's history, average object size, largest object, you name it. -- Vincent Pelletier ERP5 - open source ERP/CRM for flexible enterprises ___ Zope-Dev maillist - Zope-Dev@zope.org https://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - https://mail.zope.org/mailman/listinfo/zope-announce https://mail.zope.org/mailman/listinfo/zope )
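The point about composed keys and range searches can be sketched with the standard library's sqlite3 module. This is a hypothetical illustration, not NEO's actual schema: an object table keyed by a composed (oid, tid) primary key supports "latest revision at or before a given transaction", a lookup a flat key/value store cannot do directly.

```python
import sqlite3

# Hypothetical sketch of a revision table with a composed (oid, tid) key,
# as a NEO-like backend might use it. Table and column names are invented
# for illustration and do not match NEO's real schema.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE obj (
        oid  INTEGER NOT NULL,
        tid  INTEGER NOT NULL,
        data BLOB NOT NULL,
        PRIMARY KEY (oid, tid)
    )
""")

# Two revisions of object 1 and one revision of object 2.
rows = [(1, 10, b"rev-a"), (1, 20, b"rev-b"), (2, 15, b"other")]
db.executemany("INSERT INTO obj VALUES (?, ?, ?)", rows)

# Range search on the composed key: the most recent revision of oid 1
# at or before tid 20.
row = db.execute(
    "SELECT tid, data FROM obj WHERE oid = ? AND tid <= ? "
    "ORDER BY tid DESC LIMIT 1", (1, 20)
).fetchone()
print(row)  # (20, b'rev-b')
```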
Re: [Zope-dev] [ZODB-Dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB
On Tue, 28 Aug 2012 16:31:20 +0200, Martijn Pieters m...@zopatista.com wrote: Anything else different? Did you make any performance comparisons between RelStorage and NEO? I believe the main difference compared to all other ZODB storage implementations is the finer-grained locking scheme: in all storage implementations I know, there is a database-level lock during the entire second phase of 2PC, whereas in NEO transactions are serialised only when they alter a common set of objects. This removes one argument in favour of splitting databases (i.e., using mountpoints): evading the tpc_vote..tpc_finish database-level locking. Also, NEO distributes objects over several servers (i.e., some or all servers might not contain the whole database) for load-balancing/parallelism purposes. This is not possible if one relies on relational database replication alone. I forgot to mention in the original mail that NEO does all conflict resolution on the client side rather than the server side. The same happens in RelStorage, but this is different from ZEO. Resolving conflicts on the client side makes it easier to get the setup right: with ZEO you will get more conflicts than normal if the server cannot load some class which implements conflict resolution, and this might go unnoticed until someone worries about a performance drop. With client-side resolution, if you don't see broken objects, conflict resolution for those classes works. Some comments on some points you mentioned: * NEO supports MySQL and SQLite; RelStorage supports MySQL, PostgreSQL and Oracle. It should be rather easy to adapt NEO to more back-ends. We (Nexedi) are not interested in proprietary software, so we will probably not implement Oracle support ourselves. For PostgreSQL, it's just that we do not have a setup at hand nor the experience to implement a client properly. I expect that it would take no more than a week to get PostgreSQL support implemented by someone used to it and knowing Python, but new to NEO.
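The conflict-resolution mechanism discussed above follows ZODB's `_p_resolveConflict(oldState, savedState, newState)` protocol. Here is a minimal sketch of a counter that merges two concurrent increments relative to the common ancestor state; the `Length` class name is hypothetical (modeled loosely on BTrees.Length-style counters), and plain dicts stand in for the pickled states a real persistent object would receive.

```python
# Illustrative sketch of ZODB-style conflict resolution, not a real
# persistent class: states are plain dicts standing in for the pickled
# __getstate__() dicts the ZODB would pass in.
class Length:
    def __init__(self, value=0):
        self.value = value

    def _p_resolveConflict(self, old, saved, new):
        # Merge two concurrent updates relative to their common ancestor:
        # keep both deltas instead of raising a ConflictError.
        resolved = dict(new)
        resolved["value"] = saved["value"] + new["value"] - old["value"]
        return resolved

# Two transactions both started from value=5; one committed value=7,
# the other tries to commit value=6. The merged result keeps both
# increments: 5 + 2 + 1.
c = Length()
merged = c._p_resolveConflict({"value": 5}, {"value": 7}, {"value": 6})
print(merged["value"])  # 8
```

With client-side resolution, this method runs in the client process, so a failure to load the class is immediately visible as a broken object rather than a silent extra conflict on the server.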
Just to demonstrate that NEO really does not rely on fancy features of SQL servers, you may dig into older revisions in NEO's git repository. You can find a btree.py[1] test storage, based on ZODB's BTrees. It was just a toy, without persistence support (I initially intended to provide it, but never finished it) and hence limited by the available amount of RAM. But it was otherwise a fully functional NEO storage backend. I think it took me a week-end to put it together, while discovering the ZODB BTrees API and adapting NEO's storage backend API along the way (this was the first non-MySQL backend ever implemented, so the API was a bit too ad-hoc at that time). SQLite was chosen as a way to get rid of the need to set up a stand-alone SQL server in addition to the NEO storage process. We are not sure yet how well our database schema holds when there are several (10+) GB of data in each storage node. * RelStorage can act as a BlobStorage, NEO cannot. I would like to stress that this has nothing to do with design; it's just not implemented. We do not wish to rely on filesystem-level sharing, so we are considering something along the lines of a FUSE-based filesystem to share blob storage, which could then abstract the blobs being distributed over several servers. This is just the general idea; we don't have much experience with blob handling ourselves (which is why we preferred to leave it aside rather than provide an unrealistic, and hence unusable, implementation). [1] http://git.erp5.org/gitweb/neoppod.git/blob/75d83690bd4a34cfe5ed83c949e4a32c7dec7c82:/neo/storage/database/btree.py Regards, -- Vincent Pelletier
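The spirit of that btree.py toy can be sketched with nothing but the standard library: an in-memory, non-persistent store keeping per-oid sorted revision lists and answering "most recent revision before this tid" lookups. The class and method names here are invented for illustration and do not match NEO's backend API.

```python
import bisect

# Toy in-memory object store in the spirit of the btree.py test backend:
# no persistence, limited by RAM, but supporting revision-aware lookups.
class MemoryStore:
    def __init__(self):
        # oid -> ([tid, ...], [data, ...]), both lists kept sorted by tid
        self._revisions = {}

    def store(self, oid, tid, data):
        tids, datas = self._revisions.setdefault(oid, ([], []))
        i = bisect.bisect_left(tids, tid)
        tids.insert(i, tid)
        datas.insert(i, data)

    def load_before(self, oid, tid):
        # Return the most recent (tid, data) strictly before `tid`, or None.
        tids, datas = self._revisions.get(oid, ([], []))
        i = bisect.bisect_left(tids, tid)
        if i == 0:
            return None
        return tids[i - 1], datas[i - 1]

s = MemoryStore()
s.store(1, 10, b"a")
s.store(1, 20, b"b")
print(s.load_before(1, 20))  # (10, b'a')
print(s.load_before(1, 21))  # (20, b'b')
```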
[Zope-dev] [announce] NEO 1.0 - scalable and redundant storage for ZODB
Hi, We've just tagged the 1.0 NEO release. NEO aims at being a replacement for use cases where ZEO is used, but with better scalability (by allowing the data of a single database to be distributed over several machines, and by removing database-level locking) and with failure resilience (by mirroring database content among machines). Under the hood, it relies on simple features of SQL databases (safe on-disk data structures, efficient memory usage, efficient indexes). Release highlights: - production-ready! - asynchronous replication across clusters, for inter-datacenter redundancy - there will be no further expensive data schema changes within the 1.x branch as there were in the 0.x branch - replication performance is significantly increased - general implementation performance improved - several bugfixes What you need to know if you are used to ZODB features: - The Blob API is not implemented yet. - pack's GC phase will not be implemented in NEO; it relies on zc.zodbdgc for this: http://pypi.python.org/pypi/zc.zodbdgc For more details, look at the README and CHANGES included with the sources: http://git.erp5.org/gitweb/neoppod.git/blob/HEAD:/README http://git.erp5.org/gitweb/neoppod.git/blob/HEAD:/CHANGES NEO is published on PyPI as neoppod: http://pypi.python.org/pypi/neoppod Regards, -- Vincent Pelletier
Re: [Zope-dev] Zope 4 sprint 27-31 August, Lille (FR)
On Mon, 16 Jul 2012 19:12:16 +0200, Vincent Pelletier vinc...@nexedi.com wrote: A wiki page[1] has been set up with a list of subscribers, a topic list, and general information about transport and accommodation. Something I forgot: to subscribe, you should either update the wiki page or mail me directly. -- Vincent Pelletier
[Zope-dev] Zope 4 sprint 27-31 August, Lille (FR)
Hi. Nexedi would like to host a Zope 4 sprint in Lille, France, from the 27th to the 31st of August. A wiki page[1] has been set up with a list of subscribers, a topic list, and general information about transport and accommodation. A mailing list[2] is available to help coordinate participants (deciding on meeting points...). If you would like to attend the sprint but cannot afford the trip to Lille, please let me know. [1] http://www.erp5.org/Zope4Sprint2012 [2] https://mail.tiolive.com/mailman/listinfo/zope4sprint2012 -- Vincent Pelletier
Re: [Zope-dev] [ZODB-Dev] Bug (?) in zope/publisher/publish.py:unwrapMethod
On Tuesday 25 January 2011 19:08:11, Tres Seaver wrote: The Zope2-specific version of 'mapply()' (in 'ZPublisher.mapply') is the right place to fix this issue, if it is to be fixed. Thanks for the info. P.S. This issue is off-topic for the ZODB list: I have cross-posted to 'zope-dev': please follow up there. Whoops, lazy typing and wrong mail client completion. I indeed intended it for zope-dev. For some reason, I didn't see your mail on zope-dev (I checked the archives too, but they might be lagging). -- Vincent Pelletier
Re: [Zope-dev] [ZODB-Dev] Bug (?) in zope/publisher/publish.py:unwrapMethod
On Wednesday 26 January 2011 08:54:02, Vincent Pelletier wrote: For some reason, I didn't see your mail on zope-dev As this mail reached the list, I think Tres' mail got caught by some filter. The original mail was: In publish.py[1], unwrapMethod tries to find what can be used to publish an object. On a site, someone created a very badly named func_code external method in a place accessible by acquisition from every page on the site (this is bad by itself, and I corrected it already). This caused unwrapMethod to think it could use any object directly for publishing, because of: elif getattr(unwrapped, 'func_code', None) is not None: break while unwrapped is still in an acquisition context. Shouldn't the checks be done on objects unwrapped from their acquisition context instead, to prevent such a mistake from having such a wide impact? I have the intuition that this could even be a security problem, allowing an unexpected object to be called instead of another, but I cannot come up with an example. Do you think there is anything to fix in zope.publisher[2]? If so, I'll open a bug. [1] http://svn.zope.org/zope.publisher/trunk/src/zope/publisher/publish.py?view=markup [2] following Tres' answer, make this Zope2's mapply Regards, -- Vincent Pelletier
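The failure mode above can be demonstrated without Zope itself. Below is a minimal stand-in for implicit acquisition (the classes are invented for illustration): a `getattr(obj, 'func_code', None)` check succeeds not because the object is callable code, but because an unrelated object named func_code is reachable through the acquisition chain, while the same check on the unwrapped object (aq_base in Zope terms) is not fooled.

```python
# Crude simulation of Zope 2 implicit acquisition: attribute lookups
# that miss on the object fall back to the parent container.
class AcqWrapper:
    def __init__(self, obj, parent):
        self._obj, self._parent = obj, parent

    def __getattr__(self, name):
        # Only called for attributes missing on the wrapper itself.
        try:
            return getattr(self._obj, name)
        except AttributeError:
            return getattr(self._parent, name)

class Folder:
    pass

class Page:
    pass

root = Folder()
root.func_code = object()   # a badly named site object, as in the report
page = AcqWrapper(Page(), root)

# The publisher-style check is fooled by acquisition:
print(getattr(page, "func_code", None) is not None)       # True
# Checking the unwrapped object is not:
print(getattr(page._obj, "func_code", None) is not None)  # False
```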
Re: [Zope-dev] NEO High Performance Distributed Fault Tolerant ZODB Storage
if a node starts flapping, and could be made configurable. Storage comes back: when a storage comes back to life (after a power failure or whatever), it asks other storage nodes for the transactions it missed, and replicates them. That's unfortunate. Why not a less restrictive license? Because we have several FSF members in Nexedi, so we use the GPL for all our products (with very few exceptions, notably monkey-patches reusing code under different licenses). These seem to be very high level. You provide a link to the source, which is rather low level. Anything in between? You mean, a presentation of NEO (maybe what I wrote above would fit)? Or on the topic of writing scalable Zope applications? Regards, -- Vincent Pelletier
Re: [Zope-dev] NEO High Performance Distributed Fault Tolerant ZODB Storage
On Wednesday 31 March 2010 18:32:31, you wrote: A few questions that you may want to add in a FAQ. We started that page and will publish it very soon, based on most points you raised. Other pages are also being worked on, such as an overview of a simple NEO cluster. - Why not include ZODB 3.10/Python 2.6 as a goal of the project? - I understand that *today* the technologies use Python 2.4, but ZODB 3.10/Plone 4/Zope 2.12 use Python 2.6 We do indeed aim at supporting more recent versions of Python and Zope. Actually, your remark made us realise that our functional tests are currently (accidentally) running in a mixed 2.4/2.5 Python environment: the test process is started explicitly with 2.4, and the forked processes (master, storage and admin nodes) run on the default Python, which is 2.5 (as of Debian stable). The standard Zope version at Nexedi is 2.8, which explains why we want to support it. We will switch to 2.12, as we have had ERP5 unit tests running on 2.12 for some weeks[1] now. NEO will move to 2.12 at the same time or earlier. - Maybe explain the goal of the project more clearly: NEO provides distributed, redundant and transactional storage designed for petabytes of persistent (Python?) objects. Thanks, updated. - A buildout for NEO would lower the bar for evaluation This is on our roadmap (...to be published along with the FAQ), but priority currently goes to 2 developments which might/will break compatibility: pack support (it required an undo rework which was recently integrated; pack itself needs more unit testing prior to integration) and multi-export support (aka ZODB mountpoints, also in need of more testing before integration). [1] http://mail.nexedi.com/pipermail/erp5-report/ (_z212 in subject) -- Vincent Pelletier
[Zope-dev] NEO High Performance Distributed Fault Tolerant ZODB Storage
Hi, I would like to present the NEOPPOD project, which aims at improving ZODB storage scalability. The implementation is in rather good shape, although it fails a few ZODB tests at the moment (they are currently being worked on). Scalability is achieved by distributing data over multiple servers (replication and load balancing) with the ability to extend/reduce the cluster online. Its code is available under the GPL; more information can be found on the project website[1]. One nice aspect is that the underlying protocol is being analysed with model-checking tools based on Petri nets by a team of post-docs, PhD students and researchers. An article should appear in PETRI NETS 2010 [2]. We hope that NEO will be usable for production systems 12 months from now, and will notify the community the day we think it is, i.e. after using it ourselves. Meanwhile, it can be interesting for research and fun. Contributions are very welcome (extending portability beyond Linux 2.6, replacing the MySQL daemon dependency with a lighter embeddable transactional storage, etc). For now, it has been manually confirmed to run Plone and ERP5. Of course, to get the best out of NEO, a Zope application (or ZODB-based application) needs to be designed in a way that takes advantage of back-end parallelism (much in the same way that a single-process application cannot take advantage of SMP). We wrote a presentation[3] from our experience improving ERP5 scalability testing, which might be an interesting read for people developing on the Zope framework. It describes the most common mistakes we corrected, and the tools we developed to further extend scalability at various levels (NEO being the latest). And for most production systems, ZEO is really great: we have, for example, used ZEO for more than 2 years now to operate a Central Bank ERP with 300 concurrent users, and ZEO never crashed.
[1] http://www.neoppod.org/ [2] http://acsd-petrinets2010.di.uminho.pt/ [3] http://www.myerp5.com/kb/enterprise-High.Performance.Zope/view Note: If this mail would fit another list better, please advise. Regards, -- Vincent Pelletier
[Zope-dev] Dangerous shutdown procedure behaviour
Hi. I think I discovered a dangerous code behaviour at Zope shutdown. I've had a strange problem on a site where persistent objects are created from data inserted in an SQL table. Upon object creation, the SQL table is updated to mark the line as imported. Such an import got triggered just before a shutdown. After a restart and another import, documents were created twice. What I believe happened (though I could not find any hard evidence of it) is that Zope blindly exited while the working thread was running, and in the worst possible method: tpc_finish. The ZODB was already committed, but MySQL was not. So MySQL rolled back the changes, and the lines were back in a ready-to-import state, and were imported again at the next import attempt. Reading the shutdown code, I discovered 2 distinct timeout mechanisms (note: having just one is enough to trigger the problem): - Lifetime.py: iterating through asyncore sockets, it alerts servers that it will shut down soon. If they hold the veto for too long, the veto is ignored and shutdown continues. The default timeout is 20 seconds, meaning there is at most one minute from the first shutdown notice to the effective process exit (taking all running threads down). When invoking zopectl stop, it runs a fast shutdown, which means the timeout is shortened to 1 second, so the total maximum shutdown time is 3 seconds. This timeout can be worked around by just writing blocking shutdown methods and not using the veto system. - zdaemon/zdrun.py: if the instance being shut down still responds after 10 seconds, it will be sent a SIGKILL. This cannot be worked around without changing code in zdrun.py or not executing it at all (no idea if there is any alternative). I could easily reproduce the problem by writing a simple connection manager which calls time.sleep(3600) in its _finish method and defining a sortKey method to make it commit after another connection manager.
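The race described above can be sketched in pure Python. The classes below are a simplified stand-in for the transaction machinery, not its real API: resources are committed in sortKey() order, tpc_finish is called on each in turn, and a kill between the ZODB's tpc_finish and the SQL connection's leaves the first durable while the second rolls back.

```python
# Simplified stand-in for two-phase commit ordering; names are
# illustrative, not the real transaction-package API.
class Resource:
    def __init__(self, key, log):
        self._key, self._log = key, log

    def sortKey(self):
        return self._key

    def tpc_vote(self):
        self._log.append(("vote", self._key))

    def tpc_finish(self):
        self._log.append(("finish", self._key))

def commit(resources, die_after=None):
    # Resources are committed in sortKey() order, as in the transaction
    # machinery; tpc_finish runs one resource at a time.
    resources = sorted(resources, key=lambda r: r.sortKey())
    for r in resources:
        r.tpc_vote()
    for r in resources:
        r.tpc_finish()
        if r.sortKey() == die_after:
            raise SystemExit("killed mid-commit")  # simulated SIGKILL

log = []
try:
    # "a-zodb" sorts first, so it finishes first; the kill lands before
    # "b-sql" gets its tpc_finish -- the inconsistency from the report.
    commit([Resource("b-sql", log), Resource("a-zodb", log)],
           die_after="a-zodb")
except SystemExit:
    pass
print(log)
# [('vote', 'a-zodb'), ('vote', 'b-sql'), ('finish', 'a-zodb')]
```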
I could not find a trace of any mechanism preventing a commit from happening when a shutdown is in progress, and I don't think there should be one: considering that some storages might be accessed through a network, latency can become a problem, so tpc_finish can take time to complete; just checking that there is no pending shutdown before entering this function would not solve the problem. I suggest removing all those timeouts. If a user wants a Zope to shut down for a reason serious enough to send it a SIGKILL or cause immediate Python thread termination, it's his responsibility. But I think the regular shutdown mechanism must not do that. Also, the same problem can happen with zopectl fg, since Zope does not go through any shutdown sequence as far as I can tell (it just dies). -- Vincent Pelletier