I have a few questions about Metakit for y'all.
First the background: I'm writing ETL software in Python. This means I
am interfacing with databases, but also storing serialized objects on
disk with ordering. I like both BSDDB and Metakit as libraries for
object serialization because both allow the data to be stored either
purely in RAM or on disk, transparently. Metakit seems a lot easier to
use since it's more aware of Python data types than BSDDB.
I have some questions about Metakit though. First, can it handle large
files? It's quite common to have to process tables that are 10+ GB in
size when dealing with data warehouses. Are there any limits on number
of rows, etc?
Second, is there support for arbitrary types? Currently the only type I
need that's not supported by either Metakit or BSDDB is the decimal
data type. Decimal, numeric, and money columns are common in BI data,
and I don't really want to convert them to floats, especially if I'm
doing arithmetic on them. Are there any plans to add a decimal data
type to Metakit?
Also on the topic of data types, how does Metakit deal with Unicode
strings? I don't remember seeing a type for those.
In BSDDB, everything has to be serialized into binary strings. This
means atypical data types like decimals and Unicode strings must be
serialized into binary strings that compare properly under a low-level
memcmp operation. I think I can do that for the decimal type, but I
have no idea how to do it for Unicode strings. If Metakit can handle
these types and create large files, it will definitely be my choice.
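For the decimal case, something like the following is what I have in
mind — just a sketch, assuming a fixed scale of four fractional digits
and values that fit in a signed 64-bit range (the scale and field width
are my own choices, not anything BSDDB requires):

```python
# Sketch: encode Decimal values as fixed-width big-endian byte strings
# whose memcmp order matches numeric order. Assumes a fixed scale of
# four fractional digits and a signed 64-bit value range.
import struct
from decimal import Decimal

SCALE = 10 ** 4   # assumed fixed number of fractional digits
BIAS = 1 << 63    # shift the signed range into unsigned so bytes sort

def encode_decimal(d):
    scaled = int(d * SCALE)                   # fixed-point integer
    return struct.pack('>Q', scaled + BIAS)   # big-endian, sign-biased

def decode_decimal(b):
    (scaled,) = struct.unpack('>Q', b)
    return Decimal(scaled - BIAS) / SCALE

vals = [Decimal('-1.25'), Decimal('3.5'), Decimal('0')]
# byte-wise (memcmp) order matches numeric order
assert sorted(vals) == sorted(vals, key=encode_decimal)
```

(For Unicode, I believe encoding to UTF-8 preserves code-point order
under memcmp, so that might be enough for the string side.)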
Brian
_____________________________________________
Metakit mailing list - [email protected]
http://www.equi4.com/mailman/listinfo/metakit