[ZODB-Dev] zodbpickle claim (but OS X is not unix)

2013-07-02 Thread Christian Tismer

Hi guys,

I very much appreciate the creation of zodbpickle, as it really solves
the problem of pickling compatibility.

What I do not like is if a package makes a claim in its doap record,
is uploaded on PyPI, and the claim is not the reality.

I installed zodbpickle happily on Python 3.3 and assumed that it would
work under 2.7.5 as well, but it does not!

After a second reading, I retracted my complaint, having realized that the
classifiers don't include the OS X operating system. So here I apologize,
but beg for a fix.



classifiers=[
'Development Status :: 4 - Beta',
'License :: OSI Approved :: Zope Public License',
'License :: OSI Approved :: Python Software Foundation License',
'Programming Language :: Python',
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.6',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.2',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: Implementation :: CPython',
'Framework :: ZODB',
'Topic :: Database',
'Topic :: Software Development :: Libraries :: Python Modules',
'Operating System :: Microsoft :: Windows',
'Operating System :: Unix',
],



So on OS X Mountain Lion, I get with "$ pip install zodbpickle":

cc -fno-strict-aliasing -fno-common -dynamic -I/usr/local/include 
-I/usr/local/opt/sqlite/include -DNDEBUG -g -fwrapv -O3 -Wall 
-Wstrict-prototypes 
-I/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/include/python2.7 
-c src/zodbpickle/_pickle_27.c -o 
build/temp.macosx-10.8-x86_64-2.7/src/zodbpickle/_pickle_27.o


src/zodbpickle/_pickle_27.c:6254:13: error: void function 'init_pickle' should not return a value [-Wreturn-type]
            return -1;
            ^      ~~
src/zodbpickle/_pickle_27.c:6259:13: error: void function 'init_pickle' should not return a value [-Wreturn-type]
            return -1;
            ^      ~~

2 errors generated.

error: command 'cc' failed with exit status 1


I would appreciate if that ridiculous bug could be removed.
It is a very easy fix and I would actually like to do it.

And while we are at it: How about completion of the module, to let it
define things like DEFAULT_PROTOCOL ?

cheers - Chris

--
Christian Tismer :^)   <mailto:tis...@stackless.com>
Software Consulting  : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 :*Starship* http://starship.python.net/
14482 Potsdam: PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776  fax +49 (30) 700143-0023
PGP 0x57F3BF04   9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
  whom do you want to sponsor today?   http://www.stackless.com/

___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] zodbpickle claim (but OS X is not unix)

2013-07-02 Thread Christian Tismer

On 03.07.13 00:41, Christian Tismer wrote:

sorry, I hit the send button while refining my text.
Here it goes:


So on OS X Mountain Lion, I get with "$ pip install zodbpickle":

cc -fno-strict-aliasing -fno-common -dynamic -I/usr/local/include 
-I/usr/local/opt/sqlite/include -DNDEBUG -g -fwrapv -O3 -Wall 
-Wstrict-prototypes 
-I/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/include/python2.7 
-c src/zodbpickle/_pickle_27.c -o 
build/temp.macosx-10.8-x86_64-2.7/src/zodbpickle/_pickle_27.o


src/zodbpickle/_pickle_27.c:6254:13: error: void function 'init_pickle' should not return a value [-Wreturn-type]
            return -1;
            ^      ~~
src/zodbpickle/_pickle_27.c:6259:13: error: void function 'init_pickle' should not return a value [-Wreturn-type]
            return -1;
            ^      ~~

2 errors generated.

error: command 'cc' failed with exit status 1



I would appreciate if that simple-to-fix bug could be removed,
and I would be happy to help with this.

And while we are at it: How about completion of the module, to let it
define standard things like DEFAULT_PROTOCOL ?
Or is there a reason to avoid this (because Python2 doesn't have it)?
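
For readers following along, here is a minimal sketch of what the missing
constant means in practice. It uses only the standard-library pickle module,
not zodbpickle, and the fallback value 2 is an assumption (protocol 2 is the
highest protocol that Python 2's pickle understands):

```python
import pickle

# pickle.DEFAULT_PROTOCOL exists on Python 3 but not on Python 2,
# so look it up defensively. The fallback of 2 is an assumption:
# it is the highest protocol that Python 2's pickle understands.
DEFAULT_PROTOCOL = getattr(pickle, "DEFAULT_PROTOCOL", 2)

# Round-trip a small object with the chosen protocol.
data = pickle.dumps({"answer": 42}, protocol=DEFAULT_PROTOCOL)
restored = pickle.loads(data)
```

A module that wants to look the same on both Python lines could export such
a name itself, which is roughly what the question above is asking for.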

Please don't get me wrong, I really like that module and want it to set 
the standard.


Cheers -- chris



Re: [ZODB-Dev] zodbpickle claim (but OS X is not unix)

2013-07-03 Thread Christian Tismer
Hi Stephan,

fine with me. I have the patch working. I will complete the 3.3 module, update
the doap record to 0.5.1 and submit a pull request this evening.

All the best - chris


Sent from my Ei4Steve

On Jul 3, 2013, at 2:09, Stephan Richter wrote:

> Hi Christian,
> 
> On Wednesday, July 03, 2013 01:01:14 AM Christian Tismer wrote:
>> I would appreciate if that simple-to-fix bug could be removed,
>> and I would be happy to help with this.
> 
> I am not a C expert, so I cannot comment.
> 
>> And while we are at it: How about completion of the module, to let it
>> define standard things like DEFAULT_PROTOCOL ?
>> Or is there a reason to avoid this (because Python2 doesn't have it)?
> 
> We might have simply forgotten it. Also, we started with Python 3.2's
> version, if I remember correctly; maybe it was missing there as well?
> 
>> Please don't get me wrong, I really like that module and want it to set the
>> standard.
> 
> Hey, not at all. You can clone the git repo, make the fix and create a pull 
> request. We (as in the Zope devs) have been pretty good about merging in pull 
> requests after quick reviews.
> 
> I really want zodbpickle to be rock-solid as well, since we need it for
> the Python 3 ports to go ahead.
> 
> Regards,
> Stephan
> -- 
> Entrepreneur and Software Geek
> Google me. "Zope Stephan Richter"


[ZODB-Dev] zodb-dev mail problem

2013-07-03 Thread Christian Tismer

Hi friends,

I just saw about 16 messages on this list without headers.
The messages are between my message from 6:08 pm today
and Jeremy's message  "cache gc api"  from 10:03 pm today.

Looking at the headers:
All headers are stripped away, the messages start with a "From" that
is un-escaped.

I remember such problems from 1998, but not since then.
So I doubt this is a mailman bug, either.

Was this list maybe "repaired" with a quick hack that confuses
certain parsers?

I tried to find the bug on my side, but it really seems to come from
zope itself.

So if my theory is right, can someone please inject the messages again
but in a way that preserves the headers?

Thanks & cheers -- chris



Re: [ZODB-Dev] zodb-dev mail problem

2013-07-03 Thread Christian Tismer

Oops,

has anyone else seen this?
I just saw the header of Jeremy's message:


 From jer...@zope.com Fri Nov 02 11:53:54 2001
Received: from [208.184.249.90] (helo=yyz.digicool.com)
by mail.python.org with esmtp (Exim 3.21 #1)


So if nobody saw these messages, then it might be a problem with the latest
Thunderbird update that found old, malformed messages from my archive.
... yes!

So then please forget what I said. I'm happy that this was just old stuff.

Cheers - Chris








Re: [ZODB-Dev] zodbpickle claim (but OS X is not unix)

2013-07-03 Thread Christian Tismer

Hi Stephan,

there is now a pull request, completely tested on OS X,
ready to be applied with no other work involved.

It would be nice if that version could be uploaded, soon, so
that I can close this issue and move on. ;-)

Thanks guys, and keep up the good work

  -- Chris







Re: [ZODB-Dev] zodbpickle claim (but OS X is not unix)

2013-07-05 Thread Christian Tismer

On 04.07.13 17:52, Stephan Richter wrote:

> On Thursday, July 04, 2013 05:47:39 AM Christian Tismer wrote:
>> there is now a pull request, completely tested on OS X,
>> ready to be applied with no other work involved.
>
> Yeah, I saw that. thanks.
>
>> It would be nice if that version could be uploaded, soon, so
>> that I can close this issue and move on. ;-)
>
> Yeah, I just want Tres and/or Jim to weigh in, since they did the last
> iterations on this code. And you know, we have Independence Day today, so this
> week might be a little bit tough. ;-)
>
>> Thanks guys, and keep up the good work
>
> Thank you for using/testing zodbpickle and sending us the patch. BTW, are you
> using zodbpickle by itself to create Py2/3 compatible code?



Hi Stephan,

I am reviving Durus right now as my "super pickle for the pocket",
and don't want to stay stuck on protocol 2, incompatibility with the
Python version, etc.
Personally, I have moved my projects to Py3.3, but a database is
a different thing that should really not suffer from that.

After some hacking, I realized that the problem is not so trivial, and
fortunately found zodbpickle.

So I thought that is the way to go, contribute a bit and use it.

I'm working on BTree forests for a versioned, read-only database,
and those versions come every two weeks, but I want to keep them all
in the same database without keeping redundant data.

That got me to the forest idea.

First thing I was looking into was the B+Tree impl. of Zodb, but that
was too much for me to change just for a prototype, because of all
the optimized C code.
And also the bucket pointers of B+Tree are disturbing a bit, because
every bucket/subtree can be part of many trees, so I have to think
how that should be.

But in the end I agree that Zodb is the real thing, and I will eventually
move there, when my forests prove useful and working.

Oh, back on the question:
Yes! We will use zodbpickle for all persistence stuff.

And I want python.org to incorporate these patches, because I think
that would help everyone. Why don't they want that small change?

cheers - Chris



Re: [ZODB-Dev] zodbpickle claim (but OS X is not unix)

2013-07-09 Thread Christian Tismer

On 7/4/13 9:40 PM, Tres Seaver wrote:

> On 07/04/2013 11:52 AM, Stephan Richter wrote:
>> On Thursday, July 04, 2013 05:47:39 AM Christian Tismer wrote:
>>> there is now a pull request, completely tested on OS X, ready to be
>>> applied with no other work involved.
>>
>> Yeah, I saw that. thanks.
>>
>>> It would be nice if that version could be uploaded, soon, so that I
>>> can close this issue and move on. ;-)
>>
>> Yeah, I just want Tres and/or Jim to weigh in, since they did the last
>> iterations on this code. And you know, we have Independence Day
>> today, so this week might be a little bit tough. ;-)
>
> I've done some review on Christian's PR: he is preparing some additional
> changes.



Can somebody please have a look, again?
I think it is now very complete, everything updated.

In addition, it adds two little convenience modules, 'fastpickle' and
'slowpickle', which I will use for my projects.
Let me know what you think, please.

ciao - chris



[ZODB-Dev] zodbpickle: need a new tag (0.5.1)

2013-07-16 Thread Christian Tismer

Hi Stephan,

thanks a lot that you merged my work!

Now there is a little bit missing:
For some reason, the 0.5.1 tag was not merged from the clone.
All files there, but the tag is still 0.5.0.

I would like to push it on PyPI, but for that I need a new tag.

Can you please add a new tag, like 0.5.1 ?

Thanks & cheers -- chris



[ZODB-Dev] zodbpickle: need a new tag

2013-07-16 Thread Christian Tismer

Hi Stephan,

thanks a lot that you merged my work!

Now there is a little bit missing:
For some reason, the 0.5.1 tag was not merged from the clone.
All files there, but the tag is still 0.5.0.

I would like to push it on PyPI, but for that I need a new tag.

Can you please add a new tag, like 0.5.1 ?

Thanks & cheers -- chris



[ZODB-Dev] zc.zlibstorage missing from zodb package

2013-07-20 Thread Christian Tismer

Hi friends,

I'm trying to work with ZODB. (!)

After a couple of weeks of Durus development, I am
spoiled by simplicity.

Actually, I'm annoyed by Durus' inability to accept patches,
so I'm considering putting my efforts into ZODB.

On the other hand, ZODB tries to become small and non-intrusive,
but looking at its imports, this is still not a small package, and I'm
annoyed by this package as well.

- missing

   the zc.zlibstorage module is missing, IMHO.

   besides that, zc.zlibstorage has not been maintained for quite a while
   and imports ZOPE3.

- bugs

   installing ZODB on OS X still gives complaints after Marius' latest
   patch, which did not cover it all. It works, so this is a minor issue.

- discussion

   zc.zlibstorage requires a wrapper to add it to filestorage.
   I would rather see this as an option, with a simple boolean flag to
   switch it on and off.
   The module is way too simple to justify all this extra config
   complication.

- proposal:
   let me integrate that with ZODB and add a config option, instead of
   a wrapper.
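
To make the two designs concrete, here is a minimal sketch using only the
standard-library zlib module. All class and method names (DictStorage,
CompressingWrapper, FlagStorage, store, load) are hypothetical stand-ins
invented for illustration; they are not the ZODB or zc.zlibstorage API:

```python
import zlib


class DictStorage:
    """Hypothetical stand-in for a storage: maps object ids to bytes."""

    def __init__(self):
        self._records = {}

    def store(self, oid, data):
        self._records[oid] = data

    def load(self, oid):
        return self._records[oid]


class CompressingWrapper:
    """Wrapper style: compression lives outside the base storage."""

    def __init__(self, base):
        self.base = base

    def store(self, oid, data):
        self.base.store(oid, zlib.compress(data))

    def load(self, oid):
        return zlib.decompress(self.base.load(oid))


class FlagStorage(DictStorage):
    """Flag style: the storage itself switches compression on or off."""

    def __init__(self, compress=False):
        super().__init__()
        self.compress = compress

    def store(self, oid, data):
        super().store(oid, zlib.compress(data) if self.compress else data)

    def load(self, oid):
        data = super().load(oid)
        return zlib.decompress(data) if self.compress else data


payload = b"x" * 1000

wrapped = CompressingWrapper(DictStorage())
wrapped.store(1, payload)

flagged = FlagStorage(compress=True)
flagged.store(1, payload)
```

Both variants round-trip the payload. The wrapper keeps the base storage
unaware of compression and can be stacked with other wrappers, while the
flag is simpler to configure but ties the feature to one storage class.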


* this is just the beginning of a series of proposals to ZODB.
  I would love to use it if it was as small as it claims to be.
  Or tries to be. There are serious flaws that void this nice attempt.

At the moment, I'm considering re-packaging everything into a really
isolated, single package. This is the major reason why I worked on Durus.
My reason to go back to ZODB is the better code.
But be warned, there are bugs in the BTrees package, which I will report
next time.


Meant in a friendly, collaborative sense -- Chris



[ZODB-Dev] BTrees package problems

2013-07-20 Thread Christian Tismer

The BTrees package is an attempt to isolate certain things from ZODB.

While I appreciate the general intent, I cannot see the advantage at
this point:

- BTrees can be imported alone, yes. But it has the extensions prepared
   with special ZODB slots, which makes this very questionable.

- BTrees furthermore claims the global name BTrees for itself, although it
   is not a general BTree package, but for ZODB BTrees only.

- BTrees has a serious bug, see the following example:


>>> from BTrees import OOBTree as BT
>>> t = BT.BTree()
>>> for num in range(100):
...   k = str(num)
...   t[k] = k
...
>>> t._firstbucket._next = None
>>> len(t)
Bus error: 10
(tmp)minimax:doc tismer$


So either the _next attribute was meant to be read-only and that was
omitted, or a check of its validity is missing.

Actually, I would like to add a callable-check instead, to allow for more
flexible derivatives.
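
As an illustration of the "validity check" option, here is a small
pure-Python sketch. The Bucket class and chain_length helper are
hypothetical and only mimic the idea of a chained bucket; they are not the
C types from the BTrees package, and the real fix would have to live in the
C setter:

```python
class Bucket:
    """Hypothetical pure-Python bucket; not the BTrees C type."""

    def __init__(self, keys):
        self.keys = list(keys)
        self._next_bucket = None

    @property
    def _next(self):
        return self._next_bucket

    @_next.setter
    def _next(self, value):
        # Validity check: accept only another Bucket, or None to mark
        # the end of the chain, instead of storing arbitrary objects.
        if value is not None and not isinstance(value, Bucket):
            raise TypeError("_next must be a Bucket or None")
        self._next_bucket = value


def chain_length(first):
    """Count keys by walking the bucket chain from the first bucket."""
    total, bucket = 0, first
    while bucket is not None:
        total += len(bucket.keys)
        bucket = bucket._next
    return total
```

With a check like this, a bad assignment fails with a TypeError at the point
of assignment rather than crashing later when the chain is walked.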

* this was my second little rant about ZODB. Not finished as it seems.

please, see this again as my kraut way of showing interest in improving
very good things.

cheers -- chris



[ZODB-Dev] BTrees and ZODB simplicity

2013-07-20 Thread Christian Tismer

Third rant, dear Zope-Friends (and I mean it as friends!).

In an attempt to make the ZODB a small, independent package, ZODB
has been split into many modules.

I appreciate that, while I think it partially has the opposite effect:

- splitting BTrees apart is a good idea per se.
   But the way it is, it adds more namespace pollution than benefits:

   To make sense of BTrees, you need the ZODB, and only the ZODB!
   So, why should BTrees then be a top-level module at all?

   This does not feel natural; it feels like pretending to be something
   that it is not.

I think:

 - BTrees should either be a ZODB sub-package in its current state,

 - or a real stand-alone package with some way of adding persistence as
   an option.

* there is a conclusion following as well.

Thanks for your attention this far ;-)

cheers - chris



[ZODB-Dev] make ZODB as small and compact as expected

2013-07-20 Thread Christian Tismer

This is my last emission for tonight.

I would be using ZODB as a nice little package if it was one.

There should be nothing else but

ZODB.

Instead, there is

BTrees
persistent
transaction
zc.lockfile
zc.zlibstorage
ZConfig
zdaemon
ZEO
ZODB
ZODB3   (zlibstorage)
zope.interface

and what I might have forgotten.

Exception:
There is also
zodbpickle
which I think is very useful and general-purpose, and I want to keep it;
I will also try to push it into standard CPython.

So, while all the packages are not really large, there are too many
namespaces touched, and things like "Zope Enterprise Objects" are not meant
to be here as open source pretending modules which the user never asked for.

I think these things could be re-packed into a common namespace
and be made simpler. Even zope.interface could be removed from
this intended-to-be user-friendly simple package.

So while the amount of code is astonishingly small, the amount of
abstraction layering tells the reader that this was never really meant to
be small.

And this makes average, simple-minded users like me shy away and go
back to simpler modules like Durus.

But the latter has serious other pitfalls, which made me want to re-package
ZODB into a small, pretty, tool-ish, versatile thing for the pocket.

Actually I'm trying to re-map ZOPE to the simplistic Durus interface,
without its short-comings and lack of support.
I think a successfully down-scaled, isolated package with ZODB's
great implementation, but a more user-oriented interface would
help ZODB a lot to get widely accepted and incorporated into very
many projects.
Right now people are just too concerned about implicit complication which
actually does not exist.

I volunteer to start such a project. Proposing the name "david", as opposed
to "goliath".

cheers -- chris



Re: [ZODB-Dev] zc.zlibstorage missing from zodb package

2013-07-22 Thread Christian Tismer

On 22.07.13 11:54, Adam GROSZER wrote:

> On 07/21/2013 05:09 AM, Christian Tismer wrote:
>> - discussion
>>
>> zc.zlibstorage requires a wrapper to add it to filestorage.
>> I consider this an option, instead, and a simple boolean flag to
>> switch it on and off.
>> The module is way too simple to add all this config extra
>> complication to even think of it.
>
> IMHO the wrapper architecture is a good thing, you can do some handy
> things as:
>
> https://pypi.python.org/pypi/cipher.encryptingstorage



I agree (and congrats for that module!).

Just the compression felt so little (because I come from Durus).
My other comment holds:
Why is zlibstorage not on zopefoundation?

cheers - chris



Re: [ZODB-Dev] BTrees and ZODB simplicity

2013-07-22 Thread Christian Tismer

On 22.07.13 13:13, Jim Fulton wrote:

> On Sat, Jul 20, 2013 at 11:43 PM, Christian Tismer wrote:
>> Third rant, dear Zope-Friends (and I mean it as friends!).
>>
>> In an attempt to make the ZODB a small, independent package, ZODB
>> has been split into many modules.
>
> Maybe not as many as you think:
> persistent, transaction, ZEO, ZODB and BTrees.
>
> 5
>
>> I appreciate that, while I think it partially has the opposite effect:
>>
>> - splitting BTrees apart is a good idea per se.
>> But the way as it is, it adds more Namespace-pollution than benefits:
>>
>> To make sense of BTrees, you need the ZODB, and only the ZODB!
>> So, why should then BTrees be a top-level module at all?
>>
>> This does not feel natural, but eavesdropping, pretending as something
>> that is untrue.
>>
>> I think:
>>
>>   - BTrees should either be a ZODB sub-package in its current state,
>>
>>   - or a real stand-alone package with some way of adding persistence as
>>     an option.
>
> I don't agree that because a package depends on ZODB
> it should be in ZODB.  There are lots of packages that depend
> on ZODB.


This is generally true. In the case of BTrees, I think the ZODB
is nothing without BTrees, and BTrees make no sense without
a storage and carry those _p_ which are not optional.

BTrees would make more sense as a standalone package if the persistence
model were pluggable. But that is also theoretical because I don't see
right now how to split that further with all the C code.

That made me think it belongs to ZODB, what else could it support,
and who would ever install ZODB without it.


> I agree with your sentiments about namespace pollution.
> You and I may be the only ones that care though .3 ;).



Yay, actually I care mainly because just trying 'pip install ZODB'
spreads out n folders in my site-packages, and 'pip uninstall ZODB'
leaves n-1 of them behind, whose names I have to pick out by hand.
That's why I want things nicely grouped ;-)
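
A rough way to see which distributions would be left behind is to ask the
installed metadata what a package declares as requirements. This sketch uses
the standard-library importlib.metadata module (Python 3.8 and later); the
helper name declared_requirements is made up for illustration, and the
output depends entirely on what is installed locally:

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings a distribution declares.

    Returns an empty list when the distribution is not installed or
    declares no requirements.
    """
    try:
        return list(metadata.requires(dist_name) or [])
    except metadata.PackageNotFoundError:
        return []


# Each requirement listed here is installed as its own distribution,
# so uninstalling ZODB alone does not remove it.
for requirement in declared_requirements("ZODB"):
    print(requirement)
```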

cheers - chris



Re: [ZODB-Dev] BTrees package problems

2013-07-22 Thread Christian Tismer

On 22.07.13 16:38, Patrick Strawderman wrote:

> On Jul 20, 2013, at 11:27 PM, Christian Tismer wrote:
>> - BTrees has a serious bug, see the following example:
>>
>> >>> from BTrees import OOBTree as BT
>> >>> t = BT.BTree()
>> >>> for num in range(100):
>> ...   k = str(num)
>> ...   t[k] = k
>> ...
>> >>> t._firstbucket._next = None
>> >>> len(t)
>> Bus error: 10
>> (tmp)minimax:doc tismer$
>>
>> So there is either an omission to make t._next() read-only, or a check
>> of its validity is missing.
>
> Maybe you could open an issue on Github?


Yes I can do that (and fix it).

I was just telling it here because I'd like to know how it is meant to
be.

- should the attributes be exposed at all? (I guess yes)

- are they meant to be writable? (probably not, although that is handy :)

I would actually like to be able to derive from Bucket and implement
copy-on-write semantics for FrozenBTree (not yet existing) without
re-coding much in C; this was the reason why I played around here.

For that purpose (sharing buckets) I need a way to make the _next
pointers indirect.
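
To illustrate what sharing buckets between versions could look like, here
is a toy sketch. It is an assumption-laden stand-in, not the FrozenBTree
design: the Snapshot class is hypothetical, buckets are plain tuples of
(key, value) pairs, and the _next chain is replaced by a per-version tuple
of buckets, which is one possible way to make the "next" relation indirect:

```python
class Snapshot:
    """Hypothetical read-only version built from shared, immutable buckets."""

    def __init__(self, buckets):
        # Each bucket is a tuple of (key, value) pairs. The order of
        # buckets in this tuple plays the role of the _next chain.
        self.buckets = tuple(buckets)

    def get(self, key):
        for bucket in self.buckets:
            for k, v in bucket:
                if k == key:
                    return v
        raise KeyError(key)

    def with_item(self, index, key, value):
        """Return a new snapshot; only the bucket at `index` is copied."""
        changed = dict(self.buckets[index])
        changed[key] = value
        new_bucket = tuple(sorted(changed.items()))
        buckets = self.buckets[:index] + (new_bucket,) + self.buckets[index + 1:]
        return Snapshot(buckets)


v1 = Snapshot([(("a", 1), ("b", 2)), (("m", 3),), (("x", 4),)])
v2 = v1.with_item(1, "n", 5)
```

Here v2 shares its first and last bucket objects with v1, and only the
middle bucket is new, so keeping both versions costs one extra bucket.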

cheers - chris



Re: [ZODB-Dev] make ZODB as small and compact as expected

2013-07-22 Thread Christian Tismer

On 22.07.13 15:15, Stephan Richter wrote:

> On Sunday, July 21, 2013 06:12:34 AM Christian Tismer wrote:
>> BTrees
>
> I agree, this could be part of ZODB and it would be fine.
>
> ...
>
>> ZODB3   (zlibstorage)
>
> Well, this package is deprecated. It is available for backward-compatibility.



Yes, but I meant zlibstorage, which pulls ZODB3 in.
I would like to put that as an optional package, but in ZODB (4)



Re: [ZODB-Dev] make ZODB as small and compact as expected

2013-07-22 Thread Christian Tismer

On 22.07.13 18:01, Tres Seaver wrote:


On 07/22/2013 09:15 AM, Stephan Richter wrote:

BTrees

I agree, this could be part of ZODB and it would be fine.

Splitting out BTrees was a conscious decision to serve two goals:

- Allow evolving it (in particular, the work to port it to Py3k / PyPy)
   without stalling on the larger ZODB project.  For ongoing work, it is
   useful to be able to release a fix for a BTrees-only bug without needing
   to release ZODB.

- Allow projects which use BTrees (as base classes or attributes) to be
   tested without needing to install all of ZODB.

I consider both of those concerns still important, and so am -1 on
re-absorbing BTrees into ZODB.



Yes, I understand this intention and see no problem:
Just the namespace might be ZODB.BTrees, which would not change
the split. They would still live alone, as separate projects.

This is just plugged in, like zlibstorage (if it were not ZODB3 ;-) )

Minor point, anyway ;-)



Re: [ZODB-Dev] BTrees package problems

2013-07-22 Thread Christian Tismer

On 22.07.13 13:08, Jim Fulton wrote:

On Sat, Jul 20, 2013 at 11:27 PM, Christian Tismer  wrote:

The BTrees package is an attempt to isolate certain things from ZODB.

While I appreciate the general intent, I cannot see the advantage at
this point:

- BTrees can be imported alone, yes. But it has the extensions prepared
with special ZODB slots, which makes this very questionable.

- BTrees furthermore claims the global name BTrees for itself, although it
is not a general BTree package, but for ZODB BTrees only.

Yeah, I worried about this when we broke it out.

OTOH, there isn't much concern with namespace
pollution in the Python community. :/


- BTrees has a serious bug, see the following example:


>>> from BTrees import OOBTree as BT
>>> t = BT.BTree()
>>> for num in range(100):
...   k = str(num)
...   t[k] = k
...
>>> t._firstbucket._next = None
>>> len(t)
Bus error: 10
(tmp)minimax:doc tismer$

Ouch.


So there is either an omission to make t._next() read-only, or a check
of its validity is missing.

Yup.  OTOH, you're the first person to encounter this
after many years, so while this is bad, and needs to be
fixed, I'm not sure how serious it is as a practical matter.


Actually, I would like to add a callable-check instead, to allow for more
flexible derivatives.

I don't understand this.


Simple: I am writing BTree forests for versioned, read-only databases.

For that, I need a way to create a version of Bucket that allows
overriding the _next field, maybe with a callable.
Otherwise all the buckets are chained together and I have no way
to let frozen BTrees share buckets.

When I played with the structure, I was happy/astonished to see the
_next field being writable and thought it was intended to be so.
It was not, in the end ;-)

cheers - Chris



Re: [ZODB-Dev] BTrees package problems

2013-07-23 Thread Christian Tismer

Hey Jim,

On 23.07.13 19:18, Jim Fulton wrote:

On Mon, Jul 22, 2013 at 9:06 PM, Christian Tismer  wrote:
...

Actually, I would like to add a callable-check instead, to allow for more
flexible derivatives.

I don't understand this.


Simple: I am writing BTree forests for versioned, read-only databases.

For that, I need a way to create a version of Bucket that allows
overriding the _next field, maybe with a callable.
Otherwise all the buckets are chained together and I have no way
to let frozen BTrees share buckets.

In retrospect, it might make more sense to do the chaining a level up.
Buckets themselves don't care about chaining. The tree wants buckets
to be chained to support iteration.  I'm not really sure if that helps your
use case.
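
The idea of "chaining a level up" can be sketched in plain Python. This is only an illustration under loud assumptions: Bucket and FrozenTree below are hypothetical classes, not the BTrees package API, and the real C implementation is organized differently. The point is that when the tree owns the bucket order, two read-only versions can share untouched buckets.

```python
# Hypothetical sketch: these classes are NOT the BTrees package API.

class Bucket:
    """A sorted run of (key, value) pairs with no link to a neighbour."""

    def __init__(self, items):
        self.items = tuple(sorted(items))


class FrozenTree:
    """A read-only tree version that owns the bucket order itself."""

    def __init__(self, buckets):
        # The chain lives here, not inside the buckets.
        self.buckets = tuple(buckets)

    def __iter__(self):
        for bucket in self.buckets:
            for key, _value in bucket.items:
                yield key

    def __len__(self):
        return sum(len(bucket.items) for bucket in self.buckets)

    def replace_bucket(self, index, new_bucket):
        """Return a new version that shares every other bucket."""
        buckets = list(self.buckets)
        buckets[index] = new_bucket
        return FrozenTree(buckets)


b0 = Bucket([("a", 1), ("b", 2)])
b1 = Bucket([("c", 3), ("d", 4)])
v1 = FrozenTree([b0, b1])
v2 = v1.replace_bucket(1, Bucket([("c", 3), ("d", 40), ("e", 5)]))
```

Here v1 and v2 iterate independently while the first bucket is the same object in both, which is the sharing that a per-bucket _next pointer would prevent.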


Yes I know.
I was thinking of a minimal-intrusive, minimal-overhead way to get it
without forking/re-writing, but I'm not settled, yet.


When I played with the structure, I was happy/astonished to see the _next
field being writable and thought it was intended to be so.
It was not, in the end ;-)

It's clearly a bug.  The code has a comment right above the attribute definition
stating that it's (supposed to be) read only, but the implementation makes
them writable.

There doesn't seem to be anything that depends on writing this attribute.
I verified this by adding a fix and running the tests (in 3.10).
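
For illustration only, this is how a read-only attribute looks in pure Python. It is a sketch, not the actual fix: the real Bucket is a C type, and the class below is a hypothetical stand-in that only shows the intended behaviour (reading works, assignment is rejected instead of corrupting the chain).

```python
class Bucket:
    """Hypothetical pure-Python stand-in, not the C Bucket type."""

    def __init__(self, keys, next_bucket=None):
        self.keys = list(keys)
        self.__next = next_bucket  # name-mangled private slot

    @property
    def _next(self):
        # Readable from outside, but there is no setter.
        return self.__next


tail = Bucket(["c", "d"])
head = Bucket(["a", "b"], tail)

try:
    head._next = None          # the assignment that crashed the C version
except AttributeError:
    outcome = "rejected"
else:
    outcome = "accepted"
```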


I know that it is a serious bug (by definition, since it causes a bus error)
but it also is not an urgent bug (because it needed me to find it at all).

Actually, I have a tendency to find them; the first time I look intensively
into a project that I like, this happens almost all the time.

Do you know Mr. Adrian Monk from that wonderful Monk series?

On software development, it seems to be just me. ::

"""It's a gift, and a curse."""


For what you're trying to do, I suspect you want to fork BTrees, or start
over.



Starting over/forking is the easy but heavy way.
Before I do that, I will analyze everything and find out if it makes more
sense to share the existing code, which is (after my intense investigation
and analysis) very good and a highly optimized implementation.

It is my goal to

- either add to this quasi-perfect thing in a way that the overhead
   cycles are below 1.8 percent, or

- find out that the benefit of a patched solution is too low to justify a
   patch and do a re-write fork.

What I'm after is a way to override the implementation by user code.
I did not yet check if this is implemented already, in the Python way of
sub-classing built-ins.



Re: [ZODB-Dev] make ZODB as small and compact as expected

2013-07-29 Thread Christian Tismer

On 29.07.13 13:22, Lennart Regebro wrote:

On Mon, Jul 22, 2013 at 8:44 PM, Christian Tismer  wrote:

Yes, I understand this intention and see no problem:
Just the namespace might be ZODB.BTrees, which would not change
the split. They would still live alone, as separate projects.

This seems to be a complete red herring. What difference does the
namespace make?

 from BTrees.IIBTree import IIBTree

or

 from ZODB.BTrees.IIBTree import IIBTree

That difference is completely insignificant in all ways. But making
that change would break all BTree usage in existence. I see no benefit
in that change at all.


Interesting that nobody sees the problem.
If you are always living in Zope world, your claim is understandable ;-)

Here is the sketch of an example::

from durus.btree import BTree
from BTrees.IIBTree import IIBTree
# ...
# now transform the one into the other...

Here we have no problem because of two facts:

- durus is packaged well,
- BTrees luckily has the plural.

There are other packages floating around, and I just like to work
with many things installed together.

But I agree it would be a bit harder to move it. I was confused and thought
ZODB was already a namespace package.

So let's cook the herring and have it for dinner -- Chris



Re: [ZODB-Dev] zodbpickle: need a new tag

2013-08-16 Thread Christian Tismer

Howdy,

fixed, see https://github.com/zopefoundation/zodbpickle/pull/8

A wrong import made it through merging.
This fixes #6 and #7 .

cheers - chris

On 14.08.13 20:47, Tres Seaver wrote:


On 08/14/2013 02:14 PM, Tres Seaver wrote:

On 07/16/2013 03:33 PM, Christian Tismer wrote:

Hi Stephan,
thanks a lot that you merged my work!
Now there is a little bit missing: For some reason, the 0.5.1 tag
was not merged from the clone. All files are there, but the tag is
still 0.5.0.
I would like to push it on PyPI, but for that I need a new tag.
Can you please add a new tag, like 0.5.1 ?
Thanks & cheers -- chris

Guys, the 0.5.1 release / trunk has test failures under Py3k:

https://github.com/zopefoundation/zodbpickle/issues/6

In addition to the problem just reported ("Unpickler instance has no
attribute 'noload'").  In both cases, the 0.5.0 tag / release does not
have the problem.  Can you have a look?


Here is the issue for missing 'noload':

   https://github.com/zopefoundation/zodbpickle/issues/7


Tres.
- -- 
===

Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"http://palladion.com





[ZODB-Dev] polite advice request

2013-08-16 Thread Christian Tismer

Hi Jim et al.!

I am struggling with a weird data base, and my goal is to show off how
great this works with (zodb|durus, the latter already failed pretty much).

Just to give you an impression of the size of the problem:

There are about 25 tables, each with currently 450,000 records.
After all the changes since 20120101, there were 700,000 records involved
and morphed for each table.

These records have some relevant data, but extend to something like 95
additional columns which are pretty cumbersome.

This database is pretty huge and contains lots of irrelevant data.

When I create the full database in a naive, dumb style (create everything
as tuples), this crap becomes huge and nearly intractable in Python.

I managed to build some versions, but see further:

In addition to the 25-table snapshot, this database mutates every 2 weeks!
Most of the time, there are a few thousand updates.
But sometimes, the whole database changes, because they decided to
remove and add some columns, which creates a huge update that changes
almost everything.

I am trying to cope with that in a better way.
I examined lots of approaches to cope with such structures and tried some
things with btree forests.

After all, it turned out that structural changes of the database (2 columns
removed, 5 inserted) result in huge updates with no real effect.

Question:
Did you have that problem, and can you give me some advice?
I was thinking to switch the database to a column-oriented layout, since
this way I could probably get rid of big deltas which just re-arrange very
many columns.

But the overhead for doing this seems to be huge, again.

Do you have a good implementation of a column store?
I would like to implement a database that tracks everything, but is able
to cope with such massive but simple changes.

In effect, I don't want to keep all the modified records, but have some
function that creates the currently relevant tuples on-demand.
Even that seems difficult. And the whole problem is quite trivial, it
just suffers from Python's idea to create so very many objects.
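
A minimal sketch of such an on-demand function, assuming a full snapshot as version 0 and one small delta per later version. All names and the sample rows are made up for illustration; plain dicts stand in for whatever persistent mapping would really be used.

```python
# Hypothetical data: version 0 is a full snapshot, later versions are deltas.
base = {1: ("aspirin", 4.95), 2: ("ibuprofen", 6.10)}

# Each delta maps a primary key to a new tuple, or to None for a deletion.
deltas = [
    {2: ("ibuprofen", 5.80)},              # version 1: price change
    {1: None, 3: ("paracetamol", 2.50)},   # version 2: one removal, one new row
]

_MISSING = object()


def record_at(key, version):
    """Return the tuple for `key` as of `version`, or None if absent."""
    # Walk backwards from the requested version to the newest change.
    for delta in reversed(deltas[:version]):
        found = delta.get(key, _MISSING)
        if found is not _MISSING:
            return found
    return base.get(key)
```

Only changed rows are stored per version; unchanged rows are resolved from older deltas or from the base snapshot at lookup time, at the cost of a walk that grows with the number of versions.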



So my question, again:

- you have 25 tables

- tables are huge (500,000 to 1,000,000 records)

- highly redundant (very many things could be resolved by a function
  with special cases)


- a new version comes every two weeks

- I need to be able to inquire every version

How would you treat this?

What would you actually store?

Would you generate a full DB every 2 weeks, or would you (as I do) try to
find a structure that knows about the differences?

Is Python still the way to go, or should I stop this and use something like
PostgreSQL? (And I doubt that this would give a benefit, actually).

Would you implement a column store, and how would you do that?


Right now, everything gets too large, and I'm quite desperate.
Therefore, I'm asking the master, which you definitely are!

cheers -- Chris



Re: [ZODB-Dev] polite advice request

2013-08-18 Thread Christian Tismer

On 18.08.13 17:09, Jim Fulton wrote:

On Fri, Aug 16, 2013 at 11:49 PM, Christian Tismer  wrote:



Explaining very concisely, now.

I don't think I/we understand your problem well enough to answer. If 
data has a very low shelf life, then replacing it frequently might 
make sense. If the schema changes that frequently, I'd ask why. If this
is a data analysis application, you might be better served by tools 
designed for that.

Is Python still the way to go, or should I stop this and use something like
PostgreSQL? (And I doubt that this would give a benefit, actually).

Ditto,


Would you implement a column store, and how would you do that?

Ditto.


Right now, everything gets too large, and I'm quite desperate. Therefore,
I'm asking the master, which you definitely are!

"large" can mean many things. The examples you give don't
seem very large in terms of storage, at least not for ZODB.

Beyond that there are lots of dimensions of scale that ZODB
doesn't handle well (e.g. large transaction rates, very
high availability).

It's really hard to make specific recommendations without
knowing more about the problem. (And it's likely that someone
wouldn't be able to spend the time necessary to learn more
about the problem without a stake in it. IOW, don't assume I'll
read a much longer post getting into details. :)



Ok, just the sketch of it to make things clearer, don't waste time on this ;-)


We get a medication prescription database in a certain serialized format
which is standard in Germany for all pharmacy support companies.

This database comes in ~25 files == tables in a zip file every two weeks.
The DB is actually a structured set of SQL tables with references et al.

I actually did not want to change the design and simply created the table
structure that they have, using ZODB, with tables as btrees that contain
tuples for the records, so this is basically the SQL model, mimicked in Zodb.


What is boring is the fact that the database gets incremental updates
all the time: changed prices, packing info, etc.
We need to cope with millions of recipes that come from certain dates
and therefore need to inquire different versions of the database.

I just hate the huge redundancy that these database versions would have
and tried to find a way to put this all into a single Zodb with a way to
time-travel to every version.

The weird thing is that the DB also changes its structure over time:

- new fields are added, old fields dropped.

That's the reason why I thought to store the tables by column, with each
column being a BTree of its own. Is that feasible at all?

Of the 25 tables, there are 4 quite large, like
4 tables x 500,000 rows x 100 columns,
== 200,000,000 cells in one database.

With a btree bucket size of ~60, this gives ~ 3,333,333 buckets.
With multiple versions, this will be even more.
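
The arithmetic above, written out as a quick check. The bucket size of 60 is the rough figure from the text; interior tree nodes are ignored, so the real object count would be somewhat higher.

```python
import math

tables = 4
rows = 500_000
columns = 100
bucket_size = 60  # approximate entries per bucket, as assumed in the text

cells = tables * rows * columns

# One BTree per column: every column holds `rows` entries.
buckets_per_column = math.ceil(rows / bucket_size)
buckets_by_column = tables * columns * buckets_per_column

# Rough figure from the text: all cells divided by the bucket size.
buckets_rough = cells // bucket_size

print(cells, buckets_by_column, buckets_rough)
```

Both ways of counting land at about 3.3 million buckets for a single version.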

-- Can Zodb handle so many objects and still open the db fast?
-- Or will the huge index kill performance?

That's all I'm asking before doing another experiment ;-)

but don't waste time, just telling you the story -- chris



Re: [ZODB-Dev] polite advice request

2013-08-18 Thread Christian Tismer

Hi Claudiu,

On 18.08.13 20:07, Claudiu Saftoiu wrote:

I wonder, if you have a problem which an SQL database would be so good for that 
you're mimicking an SQL database with zodb, why not just use an SQL database? It 
doesn't sound like you'll gain much from being able to persist objects which is 
one of the main reasons to use an object database...


This is because I hate to create DB servers in the first place, lose all
the flexibility of Python, create import scripts which deal with the
limitations of the RDBMS, ...

Of course, it probably makes sense to switch to an SQL database, in the end.
I just wanted to keep things in Python as long as possible, to explore the
data and not having to understand the relations in the first place.

I need to squeeze and treat and brush the data, before I use something else.
This is pretty much like switching from Python to C - it is the very last
thing that I want to do, because Python -> SQLDB is like Python -> C:

You are carving things into stone, get lots of constraints and lose
flexibility.


In this case I was a bit over the top, but I'm already quite pleased with
today's approach: 25 btrees of namedtuple records are very nice to explore.
Utilizing a tuple cache (also as zodb/durus), I can create and save the
database in 20 minutes, resulting in a compressed size of 300 MB.
Quite a starter...

cheers - chris




Re: [ZODB-Dev] polite advice request

2013-08-18 Thread Christian Tismer

Ah, thanks mabe ;-)

On 18.08.13 19:56, Jim Fulton wrote:

On Sun, Aug 18, 2013 at 1:40 PM, [mabe]  wrote:


He meant prescription.

In German, Rezept is the word for both prescription and recipe (like in
cooking). Easy to confuse for us Germans in English :)

Great.  Now I don't know what he meant by prescription. :) Does it
matter?  Might it as easily be foos and bars?

Christian,

Are you saying that you might need to access items
from an old database that aren't in the current snapshot?


Yes, prescription, sorry.
Yes, we need to look into different versions of the
continuously updated database. The way I did it now, this creates
a slightly different, read-only database every two weeks. Not that big a
deal: after I built the first DB today, we can probably live with < 300 MB
of database for each version (using zlibstorage).
It is just my optimizer brain, and the fact that the whole history of the
stuff since 2012-01-01 fits into 125 MB of ZIP files, as delta-updates.

There must be a solution that utilizes this incremental update stuff nicely.
I wanted to use a versioned variant of btree, until I found out that even
the table layout changed a bit three times, which creates a huge update.

cheers - chris

p.s.:
I needed to patch zlibstorage for Python 3.
Where can I put a pull request?


Jim



On 08/18/2013 06:34 PM, Jim Fulton wrote:

On Sun, Aug 18, 2013 at 12:17 PM, Christian Tismer
 wrote:

We need to cope with millions of recipes that come from certain
dates and therefore need to inquire different versions of the
database.

I don't understand this. What's a "recipe"?  Why do you need to
consider old versions of the database?




Re: [ZODB-Dev] polite advice request

2013-08-18 Thread Christian Tismer

Leo,
You seem to understand me perfectly!
Have we met before?

ciao - chris

 On 19.08.13 00:31, Leonardo Rochael Almeida wrote:

AFAICT, Christian's problem is that an SQL database would not be such a
good fit, due to the "time travel" requirement. IIUC, he has to look
up records as they were in the past, including whatever fields they
had in the past, even if they're no longer part of the schema for the
current data. To do this in SQL:

  * EITHER he creates a new SQL database (or a new set of tables on the
same database) for each new revision of the incoming information, and
each database (or set of tables) would then be free to have its own
schema, but there would be lots of duplication in the data that hasn't
changed between databases,

  * OR he'll have to do a time-travel superstructure on his one
database [1], adding time range columns to each table indicating the
validity of each record, and having lots of duplicated records
differing only in the time-range and a few fields. Not to mention the
fact that the schema for these tables would contain the union of all
columns that were valid at any one point in time, and lots of NULLS in
these columns.

[1] http://en.wikipedia.org/wiki/Temporal_database
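
To make the second option concrete, here is a small sketch of a time-range lookup. The row layout, the dates and the prices are invented for illustration; in a real temporal table the same test would be a WHERE clause on the validity columns.

```python
from collections import namedtuple

# Hypothetical row layout: valid_to is exclusive, None means "still valid".
Row = namedtuple("Row", ["key", "price", "valid_from", "valid_to"])

history = [
    Row(1, 4.95, "2012-01-01", "2012-06-01"),
    Row(1, 5.20, "2012-06-01", None),
    Row(2, 6.10, "2012-01-01", None),
]


def as_of(key, date):
    """Return the row for `key` that was valid on `date` (ISO string)."""
    for row in history:
        if row.key != key:
            continue
        # ISO dates compare correctly as plain strings.
        if row.valid_from <= date and (row.valid_to is None or date < row.valid_to):
            return row
    return None
```

Note how key 1 already needs two nearly identical rows for a single price change, which is the duplication described above.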

So, I believe Christian is considering the flexible (or rather,
non-existent) schema of ZODB (and perhaps the built-in time-travel
capabilities) as a pretty good fit to his problem.

But he seems to be worried about data volume and its impact on
performance. He's also wondering how to best design the storage of
this data on ZODB taking into account the fact that the schema changes
frequently.

If (as he indicates) he stores the data as tuples in BTrees (one BTree
per "table", keyed by the primary key of the original table), he'll be
forced to rewrite all the tuples of each BTree (table) that changes
schema, which could mean almost as much duplication as the "one SQL
Database per revision" case.

On the other hand, he seems to speculate that perhaps he could store
one BTree per table COLUMN (per revision?), keyed by the primary key
of the original table. This way, each new incoming data revision would
only need to touch the data that actually changed, and schema changes
would mean the deletion or addition of entire BTrees, w/o having to
touch the unchanged data.
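
The per-column layout can be sketched like this. Plain dicts stand in for the BTrees, and the column names and values are made up; it is an assumption-laden illustration of the idea, not a tested ZODB design.

```python
# Plain dicts stand in for BTrees here; one mapping per column, keyed by
# the primary key of the original table.
table = {
    "name": {1: "aspirin", 2: "ibuprofen"},
    "price": {1: 4.95, 2: 6.10},
    "old_code": {1: "A1", 2: "B2"},
}


def row(table, key):
    """Assemble one record on demand from the per-column mappings."""
    return {col: values[key] for col, values in table.items() if key in values}


# A data update touches a single cell in a single column.
table["price"][2] = 5.80

# A schema change drops or adds whole columns and leaves the rest alone.
del table["old_code"]
table["pack_size"] = {1: 20, 2: 50}
```

After these changes the "name" column is untouched, so a versioned store would not have to rewrite it.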

Cheers,

Leo


On Sun, Aug 18, 2013 at 3:07 PM, Claudiu Saftoiu  wrote:

I wonder, if you have a problem which an SQL database would be so good for that 
youre mimicking an SQL database with zodb, why not just use an SQL database? It 
doesn't sound like you'll gain much from being able to persist objects which is 
one of the main reasons to use an object database...


On Aug 18, 2013, at 12:17 PM, Christian Tismer  wrote:


On 18.08.13 17:09, Jim Fulton wrote:

On Fri, Aug 16, 2013 at 11:49 PM, Christian Tismer  wrote:


Explaining very concisely, now.


I don't think I/we understand your problem well enough to answer. If data has a 
very low shelf life, then replacing it frequently might make sense. If the 
schema changes that frequently, I'd as why. If this is a data analysis 
application, you might be better served by tools designed for that.

Is Python still the way to go, or should I stop this and use something like
PostgreSQL? (And I doubt that this would give a benefit, actually).

Ditto,


Would you implement a column store, and how would you do that?

Ditto.


Right now, everything gets too large, and I'm quite desperate. Therefore,
I'm
asking the master, which you definately are!

"large" can mean many things. The examples you give don't
seem very large in terms of storage, at least not for ZODB.

Beyond that there are lots of dimensions of scale that ZODB
doesn't handle well (e.g. large transaction rates, very
high availability).

It's really hard to make specific recommendations without
knowing more about the problem. (And it's likely that someone
wouldn't be able to spend the time necessary to learn more
about the problem without a stake in it. IOW, don't assume I'll
read a much longer post getting into details. :)


Ok, just the sketch of it to make things clearer, don't waste time on this ;-)

We get a medication prescription database in a certain serialized format
which is standard in Germany for all pharmacy support companies.

This database comes in ~25 files == tables in a zip file every two weeks.
The DB is actually a structured set of SQL tables with references et al.

I actually did not want to change the design and simply created the table
structure that they have, using ZODB, with tables as btrees that contain
tuples for the records, so this is basically the SQL model, mimicked in Zodb.

What is boring is the fact, that the database gets incremental updates all the 
time,
changed prices, packing info, etc.
We need to cope with millions of recipes that come from certain dates
and therefore need to inquire different versions of the database.

Re: [ZODB-Dev] polite advice request

2013-08-18 Thread Christian Tismer

On 18.08.13 18:34, Jim Fulton wrote:

On Sun, Aug 18, 2013 at 12:17 PM, Christian Tismer wrote:
...

We get a medication prescription database in a certain serialized format
which is standard in Germany for all pharmacy support companies.

This database comes in ~25 files == tables in a zip file every two weeks.
The DB is actually a structured set of SQL tables with references et al.

So you get an entire database snapshot every 2 weeks?


I actually did not want to change the design and simply created the table
structure that they have, using ZODB, with tables as btrees that contain
tuples for the records, so this is basically the SQL model, mimicked in
Zodb.

OK.  I don't see what advantage you hope to get from ZODB.


I want its flexibility. I need Python and ZODB to transform the data tables
before I understand them. I use Python to stress, query, and validate my
implementation and their data structures before I trust it and maybe turn it
(painfully) into an SQL db. Maybe not at all, as I learn from playing with ZODB.


Have you ever tried to "play" with an SQL DB?
This is very painful and boring to set up and get right.
I only do that after I have studied the data with Python.
In this case, simply looking at huge pickled dicts did not scale, because of
too much data. That was the reason to dive into ZODB. With success.




What is boring is the fact, that the database gets incremental updates all
the time,
changed prices, packing info, etc.

Are these just data updates? Or schema updates too?


At first I was told that there are data updates only. Then, thanks to the
validation analysis I run during parsing, I found out that there were
structural schema changes as well. Some were just relaxed or tightened
constraints, but there were three major changes lately that involved whole
tables by inserting and removing columns.
The whole catastrophe, so to say.

As always, when the customer swears "this will never happen", you should be
prepared to implement exactly that impossible case. :-)
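
The kind of check that surfaces inserted and removed columns can be sketched roughly as follows. This is only an illustration, not the validation code used in the thread; the function name and the column lists are hypothetical.

```python
def diff_columns(old_columns, new_columns):
    """Return (added, removed) column names between two snapshot schemas.

    Order is preserved so the report follows the order of the delivery.
    """
    old_set, new_set = set(old_columns), set(new_columns)
    added = [c for c in new_columns if c not in old_set]
    removed = [c for c in old_columns if c not in new_set]
    return added, removed


# Hypothetical column lists for one table in two consecutive deliveries.
previous = ["id", "name", "price", "pack_size"]
current = ["id", "name", "price", "unit", "vat_code"]

added, removed = diff_columns(previous, current)
print("added:", added)
print("removed:", removed)
```

Running such a comparison for every table on each delivery turns a silent schema change into an explicit report before the data is imported.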




We need to cope with millions of recipes that come from certain dates
and therefore need to inquire different versions of the database.

I don't understand this. What's a "recipe"?  Why do you need to
consider old versions of the database?



Not recipes, but prescriptions. (Unfortunately, these are the same word in
German.)

We get millions of these every month and have to use the right data from the
DB version which was active at that time when the prescription was issued.

That made me want to create a "time machine" interface to the DB without the
need to have several GB of that crap as slightly different variations of
basically the same stuff.
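
A minimal sketch of the lookup such a "time machine" needs, assuming each delivery becomes valid on a known date and stays valid until the next one arrives. The class name, method names, and snapshot identifiers are hypothetical and not part of ZODB.

```python
import bisect
from datetime import date


class VersionIndex:
    """Map a date to the database snapshot that was active on that day."""

    def __init__(self):
        self._dates = []  # sorted valid-from dates
        self._names = []  # snapshot identifiers, parallel to _dates

    def add(self, valid_from, name):
        # Keep both lists sorted by date so lookups can use bisect.
        pos = bisect.bisect_left(self._dates, valid_from)
        self._dates.insert(pos, valid_from)
        self._names.insert(pos, name)

    def active_on(self, day):
        # The rightmost snapshot whose valid-from date is <= day.
        pos = bisect.bisect_right(self._dates, day)
        if pos == 0:
            raise LookupError("no snapshot active on %s" % day)
        return self._names[pos - 1]


index = VersionIndex()
index.add(date(2013, 8, 1), "snap-2013-08-01")
index.add(date(2013, 7, 15), "snap-2013-07-15")

# A prescription issued on 20 July resolves to the mid-July snapshot.
print(index.active_on(date(2013, 7, 20)))
```

Each prescription then only needs its issue date to find the snapshot it has to be checked against.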

Made some promising experiments today with column btrees.
ZODB is performing well with 100 million buckets!

cheers - Chris

--
Christian Tismer :^)   <mailto:tis...@stackless.com>
Software Consulting  : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 :*Starship* http://starship.python.net/
14482 Potsdam: PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776  fax +49 (30) 700143-0023
PGP 0x57F3BF04   9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
  whom do you want to sponsor today?   http://www.stackless.com/

___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] polite advice request

2013-08-19 Thread Christian Tismer

Issue resolved; see the end of this message.

On 19.08.13 00:39, Alan Runyan wrote:

Would you implement a column store, and how would you do that?

Ditto.

So many Dittos, it sounds like a Rush Limbaugh talk show :)


"large" can mean many things. The examples you give don't
seem very large in terms of storage, at least not for ZODB.

One app we have is 26,344,368 objects.
ZODB is the least of its concerns.


It's really hard to make specific recommendations without
knowing more about the problem. (And it's likely that someone
wouldn't be able to spend the time necessary to learn more
about the problem without a stake in it. IOW, don't assume I'll
read a much longer post getting into details. :)

This is fair.  ZODB is intimately tied to the application design so
it is a bit difficult for someone to qualify what they are doing
without having to explain the application design.

This sucks from a newbie's point of view, but it's reality.

I just wrote up some thoughts on ZODB.
Might be useful for others - doubtful - but maybe.

https://docs.google.com/document/d/12RGOTSMrl0CttkCZJ5rp-TSaakAY2Pn4VnWhVMcFMQw/edit?usp=sharing

Anyway.  Tismer if you write up more thoughts; I will read them.



Hey, nice write-up, thanks a lot!

On 19.08.13 09:33, Dylan Jay wrote:

In some ways the ZODB is less flexible. It requires you to understand more
about how you will access the data before you import it than an SQL database
does. This is because the data structure defines how you can query it in a
ZODB.
For example, if you need multiple indexes to your data, then to make it
efficient you might choose a different data structure, whereas in SQL you can
add indexes after the fact. Whichever way you go, however, you are always
better off thinking about how you will access your data first. For example,
when you reimport the data, do you need to do a lookup on each item to see if
it's there and merge, or will you just delete the lot and start from scratch?

Having said this, you might look at a project like souper that tries to support 
tabular type data without having to think too much about the data structures.


I looked a bit into souper, maybe I'll try.

Right now I'm happy with this very dumb brute-force solution:

I turned all the 25 tables into a column-store, very simple implementation
with no keys, nothing.
I just took the original table data, sorted it by primary key, and then
built a persistent list for each column.

This unoptimized solution has very little overhead. The primary key can be
searched by bisect, which is right now all we need.
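
The layout described above can be sketched like this. It is a simplified illustration, not the actual implementation: plain Python lists stand in for the persistent lists, and the class name, column names, and sample rows are hypothetical.

```python
import bisect


class ColumnTable:
    """One list per column, all ordered by the primary key column."""

    def __init__(self, column_names, rows, key="id"):
        key_pos = column_names.index(key)
        # Sort the records once by primary key, then split them by column.
        ordered = sorted(rows, key=lambda row: row[key_pos])
        self.key = key
        self.columns = {
            name: [row[i] for row in ordered]
            for i, name in enumerate(column_names)
        }

    def lookup(self, key_value, wanted):
        """Return the requested column values for one primary key."""
        keys = self.columns[self.key]
        pos = bisect.bisect_left(keys, key_value)
        if pos == len(keys) or keys[pos] != key_value:
            raise KeyError(key_value)
        # Only the requested columns are touched.
        return tuple(self.columns[name][pos] for name in wanted)


table = ColumnTable(
    ["id", "name", "price"],
    [(3, "C", 300), (1, "A", 100), (2, "B", 200)],
)
print(table.lookup(2, ["name", "price"]))
```

Because a lookup only reads the key column and the columns asked for, a store built this way would only need to load those columns; how much that saves in practice depends on how the lists are persisted.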

I used ZlibStorage, and the stunning effect:

The database is now 44.5 MB, it loads the few columns that we need
in a fraction of a second, and the original serialization format
took 44.4 MB as a ZIP file. :-D

So the former bloat of almost a GB is gone, versions are cheap, and I don't
try to do further reduction of size or calculate deltas between versions,
but happily use the small, absolute column store databases
which I calculate every two weeks, together with an index database.

cheers - chris



[ZODB-Dev] https://github.com/zopefoundation/zc.beforestorage

2013-09-27 Thread Christian Tismer

Hi,

I found myself subscribed to zc.beforestorage today.

If I'm not mistaken, versions are no longer supported in 4.0, or
is this still a supported approach?

I think history should not depend on whether pack() has been run,
but on an explicit snapshot feature that puts a set of objects
into some history object.
Has that been discussed, and can someone please point me at it?

cheers - chris



Re: [ZODB-Dev] https://github.com/zopefoundation/zc.beforestorage

2013-09-28 Thread Christian Tismer

Hi Jim,

On 27.09.13 20:12, Jim Fulton wrote:

...
"versions" were removed in 3.9.


Ok, that was my newbie-question, and the answer.
I was unsure if beforestorage was still working and
had to sort out that the removed "versions" are a different thing.

thanks & cheers - chris
