I have two questions about Metakit for y'all.

First the background: I'm writing ETL software in Python. This means I am interfacing with databases, but also storing ordered, serialized objects on disk. I like both BSDDB and Metakit as libraries for object serialization because both allow the data to be stored either purely in RAM or on disk, transparently. Metakit seems a lot easier to use since it's more aware of Python data types than BSDDB is.

I have some questions about Metakit, though. First, can it handle large files? It's quite common to have to process tables that are 10+ GB in size when dealing with data warehouses. Are there any limits on the number of rows, file size, and so on?

Second, is there support for arbitrary types? Currently the only type I need that is supported by neither Metakit nor BSDDB is the decimal data type. Decimal, numeric, and money column types are common in BI data, and I don't really want to convert them to floats, especially if I'm doing arithmetic on them. Are there any plans to add a decimal data type to Metakit?
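To show why I'm avoiding floats, here's a quick illustration (plain stdlib Python, nothing Metakit-specific):

```python
from decimal import Decimal

# Binary floats can't represent most decimal fractions exactly,
# so money arithmetic drifts:
assert 0.10 + 0.20 != 0.30

# The decimal module keeps the arithmetic exact:
assert Decimal('0.10') + Decimal('0.20') == Decimal('0.30')

# Summing a column of prices stays exact, too:
prices = [Decimal('19.99')] * 3
assert sum(prices) == Decimal('59.97')
```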

Also on the topic of data types, how does Metakit deal with unicode strings? I don't remember seeing types for those.
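As a stopgap I've considered just encoding to UTF-8 before storing, since (if my understanding is right, and assuming the underlying field holds raw byte strings) UTF-8 happens to be order-preserving: comparing the encoded bytes memcmp-style gives the same order as comparing code points. A sketch:

```python
# Assumption: the underlying field stores raw byte strings and
# compares them memcmp-style. UTF-8 preserves code point order,
# so sorting the encoded bytes matches sorting the text itself.
words = ["apple", "éclair", "zebra", "Ångström"]
encoded = sorted(w.encode("utf-8") for w in words)
decoded = [b.decode("utf-8") for b in encoded]
assert decoded == sorted(words)  # byte order == code point order
```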

In BSDDB, everything has to be serialized into binary strings. This means atypical data types like decimals and Unicode strings must be serialized into binary strings that compare properly under a low-level memcmp operation. I think I can do that for the decimal type, but I have no idea how to do it for Unicode strings. If Metakit can handle these types and create large files, it will definitely be my choice.
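For the decimal case, the rough plan I have in mind (just a sketch with an assumed fixed precision, not tied to either library's API) is to scale to a fixed number of places, bias so negatives sort below positives, and pack big-endian so that byte order matches numeric order:

```python
import struct
from decimal import Decimal

SCALE = 10 ** 4    # assume 4 fractional digits is enough (my assumption)
OFFSET = 1 << 63   # bias into unsigned range so negatives sort first

def encode_decimal(d: Decimal) -> bytes:
    """Pack a Decimal as 8 big-endian bytes that memcmp-sort numerically."""
    n = int((d * SCALE).to_integral_value()) + OFFSET
    return struct.pack('>Q', n)  # big-endian: byte order == numeric order

def decode_decimal(b: bytes) -> Decimal:
    return Decimal(struct.unpack('>Q', b)[0] - OFFSET) / SCALE

vals = [Decimal('-1.5'), Decimal('0'), Decimal('2.25'), Decimal('-0.0001')]
keys = sorted(encode_decimal(v) for v in vals)  # plain byte-string sort
assert [decode_decimal(k) for k in keys] == sorted(vals)
```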

Brian
_____________________________________________
Metakit mailing list  -  [email protected]
http://www.equi4.com/mailman/listinfo/metakit
