Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-03 Thread Andrew Dalke
On Dec 2, 2016, at 5:46 PM, Brian Kelley wrote:
> I hacked a version of RDKit's smiles parser to compute heavy atom count,
> perhaps some version of this could be used to check smiles validity without
> making the actual molecule.

FWIW, here's my regex code for it, which makes the assumption tha
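
The regex code referred to in this message is cut off in the archive listing, so it is not reproduced here. As a rough illustration of the general idea (screening a word before building a molecule), here is a minimal sketch using only the Python standard library. The token pattern, the function name `looks_like_smiles`, and the rules it applies are assumptions made for this example; they are not the code from the thread and cover only a simplified subset of SMILES syntax.

```python
import re

# Hypothetical, simplified token pattern (an assumption for illustration,
# not the regex from the thread). Bracket-atom contents are not validated.
_TOKEN = re.compile(r"""
    \[[^\[\]]+\]      # bracket atom such as [Na+] or [C@@H]
  | Cl | Br           # two-letter organic-subset atoms
  | [BCNOPSFI]        # one-letter organic-subset atoms
  | [bcnops]          # aromatic atoms
  | %\d\d             # two-digit ring closure
  | \d                # one-digit ring closure
  | [-=\#$:/\\.]      # bond symbols and the dot disconnect
  | [()]              # branch open / close
""", re.VERBOSE)


def looks_like_smiles(word):
    """Cheap syntactic screen: True if `word` tokenizes completely and its
    branches and ring closures are balanced. It does no chemistry checks."""
    pos = 0
    depth = 0            # current branch nesting depth
    open_rings = set()   # ring-closure labels opened but not yet closed
    seen_atom = False
    while pos < len(word):
        match = _TOKEN.match(word, pos)
        if match is None:
            return False             # character outside the token set
        tok = match.group()
        pos = match.end()
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False         # close without a matching open
        elif tok[0] == "%" or tok.isdigit():
            label = int(tok.lstrip("%"))
            # A label opens a ring on first sight and closes it on the next.
            if label in open_rings:
                open_rings.remove(label)
            else:
                open_rings.add(label)
        elif tok[0] == "[" or tok[0].isalpha():
            seen_atom = True
    return seen_atom and depth == 0 and not open_rings
```

A screen like this rejects unpaired ring closures such as `C1CCC3`, but it still passes ordinary words made only of element symbols (for example `SIN`), so it is at best a pre-filter in front of a real parser.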

Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-03 Thread Andrew Dalke
On Dec 3, 2016, at 3:02 PM, Brian Kelley wrote:
> If I had to pick, I would just use the normal MolFromSmiles, if you don't
> expect many actual smiles strings in your corpus, it's plenty fast.

I didn't follow from your timings what you used to see if something was a SMILES candidate? Was it wo

Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-03 Thread Brian Kelley
Note: I turned logging off, otherwise a lot of time was spent spewing to stderr:

    from rdkit import Chem, rdBase
    rdBase.DisableLog("rdApp.*")

On Sat, Dec 3, 2016 at 9:02 AM, Brian Kelley wrote:
> Here are some numbers from my laptop for parsing:
>
> Normal Smiles parser:
> =
> P

Re: [Rdkit-discuss] Extracting SMILES from text

2016-12-03 Thread Brian Kelley
Here are some numbers from my laptop for parsing:

Normal Smiles parser:
    Proper Smiles: 11K/s
    Non Smiles words: 94K/s

Don't make molecules (n.b. accepts some 'bad' smiles like C1CCC3):
    Proper Smiles: 110K/s
    Non Smiles words: 130K/s

If I had to pick, I would just
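
The message does not show how these rates were measured. One way to get comparable words-per-second figures is sketched below; the helper name `words_per_second` and its `repeat` parameter are assumptions for this example, not something taken from the thread.

```python
import time


def words_per_second(parse, words, repeat=3):
    """Call `parse` on every word and return the best observed rate.

    `parse` is any callable taking one string; its return value is ignored.
    The fastest of `repeat` passes is used to reduce timing noise.
    """
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        for word in words:
            parse(word)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very small inputs.
    return len(words) / best if best > 0 else float("inf")
```

With RDKit installed, passing `Chem.MolFromSmiles` as `parse` (after `rdBase.DisableLog("rdApp.*")`, as in the follow-up message) would time the normal parser, once on a list of real SMILES and once on a list of ordinary words. The figures will depend on the machine and the corpus.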

Re: [Rdkit-discuss] Hankering after faster builds

2016-12-03 Thread Greg Landrum
Lest my previous reply be misinterpreted: I agree that it would be great if the builds were quicker - I end up needing to do a large number of them before each release - but I don't see much that can really be done other than removing a bunch of functionality.

On Sat, Dec 3, 2016 at 8:42 AM, Gian

Re: [Rdkit-discuss] Hankering after faster builds

2016-12-03 Thread Greg Landrum
Builds do take a while, but there is *no way* they should be taking 2 hours unless they are running on extremely overloaded hardware. The Travis builds, which include running all the tests, typically take less than 40 minutes. If, for some reason, you do still need to deal with this, I would guess