Another option is Dask (https://docs.dask.org/en/latest/). I've used Dask's `map_partitions` to bulk-convert a column of SMILES strings into various computed properties. You could then write the results to a CSV file (or another format) for bulk import into a database.
-- Peter

On Mon, Jan 21, 2019 at 1:45 AM Markus Sitzmann <[email protected]> wrote:

> > SQLalchemy creates a fairly specific ecosystem that you have to buy
> > into for it to make sense. When you don't have objects, only a table
> > of properties, OR mapper is just bloat.
>
> There is no need for objects with SQLAlchemy, SQLAlchemy's Core and its
> expression language is pretty excellent without objects ...
>
> >With parallel processing your bottleneck is going to be database
> >inserts. One option is write out CSV file(s) from each thread/job,
> >concatenate them in the final node, and then bulk-import into the
> >database: typically CSV (or other such format) bulk import is orders
> >of magnitude faster than inserting one SQL statement at a time.
>
> ... and bulk-inserts of Python data types into the database.
>
> Markus
>
> On Sun, Jan 20, 2019 at 9:17 PM Dmitri Maziuk via Rdkit-discuss <
> [email protected]> wrote:
>
>> On Sun, 20 Jan 2019 12:03:50 +0100
>> Shojiro Shibayama <[email protected]> wrote:
>>
>> > ... I guess SQLalchemy
>> > in python might be good, but I'm not sure. Hope that you'll find out
>> > a good library of SQL OR mapper for python.
>>
>> SQLalchemy creates a fairly specific ecosystem that you have to buy
>> into for it to make sense. When you don't have objects, only a table
>> of properties, OR mapper is just bloat.
>>
>> With parallel processing your bottleneck is going to be database
>> inserts. One option is write out CSV file(s) from each thread/job,
>> concatenate them in the final node, and then bulk-import into the
>> database: typically CSV (or other such format) bulk import is orders
>> of magnitude faster than inserting one SQL statement at a time.
>>
>> --
>> Dmitri Maziuk <[email protected]>
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

