Re: [Rdkit-discuss] Question
[redirecting to list since this may be of general interest]

Yes, I generally store molecules in databases in blob columns containing the pickles. The primary reason for this is that one can then skip all the work of parsing the molecule, perceiving the chemistry, etc.

I don't have a good general answer for how long pickles are; it really depends on the molecules. One example I have handy is a sqlite database containing the PubChem screening deck. The molecules are stored as follows:

    sqlite> .schema
    CREATE TABLE molecules (compound_id varchar not null unique,molpkl blob);
    sqlite> select count(*) from molecules;
    214178

    % ls -l Compounds.sqlt
    -rw-r--r-- 1 landrgr1 staff 167240704 Nov 22 07:28 Compounds.sqlt

There is, no doubt, some overhead associated with the sqlite data, but this gives a rough estimate.

-greg

On Thu, Apr 30, 2009 at 10:55 AM, Evgueni Kolossov ekolos...@gmail.com wrote:
and what's the length of Pickles?

2009/4/30 Evgueni Kolossov ekolos...@gmail.com:
Greg, In this case you are probably storing the pickles in the database together with the fingerprints. Am I right?
Regards, Evgueni

2009/4/30 Greg Landrum greg.land...@gmail.com:
nope... the transformation is a lossy one

On Thu, Apr 30, 2009 at 9:56 AM, Evgueni Kolossov ekolos...@gmail.com wrote:
Hi Greg, Another probably stupid question - is it possible to re-create an ROMol from fingerprints?
Regards, Evgueni

--
Dr. Evgueni Kolossov (PhD) ekolos...@gmail.com
Tel. +44(0)1628 627168 Mob. +44(0)7812070446
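For readers who want to try the blob-column approach described in this message, here is a minimal Python sketch using only the standard-library sqlite3 module and the schema quoted above. The helper names (open_db, store_pickle, load_pickle, average_pickle_size) are hypothetical and not from the thread. The blobs are treated as opaque bytes; with RDKit's Python wrappers they would typically come from mol.ToBinary() and be turned back into a molecule with Chem.Mol(data), which is an assumption about the reader's setup rather than something shown in the thread.

```python
import sqlite3

# Same table layout as the .schema output quoted in the message.
SCHEMA = (
    "CREATE TABLE molecules "
    "(compound_id varchar not null unique, molpkl blob)"
)


def open_db(path=":memory:"):
    """Open a database (in-memory by default) and create the table."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_pickle(conn, compound_id, pkl):
    """Insert one molecule pickle (raw bytes) under its compound id."""
    conn.execute(
        "INSERT INTO molecules (compound_id, molpkl) VALUES (?, ?)",
        (compound_id, sqlite3.Binary(pkl)),
    )
    conn.commit()


def load_pickle(conn, compound_id):
    """Return the stored bytes for compound_id, or None if it is absent."""
    row = conn.execute(
        "SELECT molpkl FROM molecules WHERE compound_id = ?",
        (compound_id,),
    ).fetchone()
    return None if row is None else bytes(row[0])


def average_pickle_size(conn):
    """Average blob length in bytes, excluding any sqlite file overhead."""
    (avg,) = conn.execute(
        "SELECT avg(length(molpkl)) FROM molecules"
    ).fetchone()
    return avg
```

Measuring avg(length(molpkl)) inside the database gives the pickle size itself, whereas dividing the file size by the row count also includes the compound_id column and sqlite's own overhead.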
Re: [Rdkit-discuss] Question
Thanks Greg,

Unfortunately I didn't quite get it - do you mean the size of your example is 167240704 bytes?

Regards, Evgueni

2009/4/30 Greg Landrum greg.land...@gmail.com:
% ls -l Compounds.sqlt
-rw-r--r-- 1 landrgr1 staff 167240704 Nov 22 07:28 Compounds.sqlt
Re: [Rdkit-discuss] Question
Yes, the database containing the 214K molecules is 167 MB.

On Thu, Apr 30, 2009 at 7:55 PM, Evgueni Kolossov ekolos...@gmail.com wrote:
Unfortunately I didn't quite get it - do you mean the size of your example is 167240704 bytes?
Re: [Rdkit-discuss] Question
There really isn't a maximum. It depends on the number of atoms, number of bonds, and number of conformers.

On Thu, Apr 30, 2009 at 9:09 PM, Evgueni Kolossov ekolos...@gmail.com wrote:
Ok, so the average size is 781 bytes. What's the maximum size one molecule can be, in theory?
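The 781-byte average quoted above follows from dividing the file size by the row count reported earlier in the thread. A small Python sketch of that arithmetic is below; the constant and function names are hypothetical. Since the file also holds the compound_id column and sqlite's own bookkeeping, the result is best read as a rough upper estimate of the average pickle size, not an exact figure.

```python
# Figures taken from the ls -l and count(*) output earlier in the thread.
DB_SIZE_BYTES = 167_240_704
N_MOLECULES = 214_178


def average_bytes_per_molecule(total_bytes, n_rows):
    """File size divided by row count: includes ids and sqlite overhead."""
    return total_bytes / n_rows


avg = average_bytes_per_molecule(DB_SIZE_BYTES, N_MOLECULES)
print(f"about {round(avg)} bytes per molecule")
```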