On 09/11/2010 13:53, Andrew Dalke wrote: > I am also willing to write tests which cover these.
On Nov 10, 2010, at 11:22 AM, Chris Morley wrote: > Even better! I spent a lot of time on trains yesterday, and test cases were the right mental level for me. Here's what I developed: http://dalkescientific.com/test_python.py They are meant so that people have a way to figure out how to call different functions, and they are meant to be fast, so they can be used as real unit tests. (The 66 test cases / ~400 tests) run in about 1.5 seconds on my laptop.) I don't know how you all want that hooked into the existing test framework, but I figured it was best to start with test cases. It also looks like I should get commit privs to OpenBabel. I noticed a number of minor problems when developing the tests. These are all marked with an "XXX" in the above code, although there are a few places marked with XXX which are either TODOs or places where I didn't understand the result when I wrote it but I believe I do now. 1) If I do something like mol = parse_smiles("CCO") mol.SetTitle("#1") conv.WriteFile(mol, tempdir("blah.sdf")) I get the message Warning in WriteMolecule No 2D or 3D coordinates exist. Any stereochemical information will be lost. To generate 2D or 3D coordinates use --gen2D or --gen3d. Since I'm not (directly) calling WriteMolecule and since my test cases don't support the --gen2D or --gen3d flags, this warning isn't that helpful in a library function. 2) ob.OBFingerprint.List("fingerprints") This writes to an output stream, and I can't figure out how to generate a string-based output stream in Python. However, this is not that important because there are other ways to access the same data. It does suggest that List() is not a useful library function. 3) Why does the fingerprint GetBit require a fingerprinter instance? That is, I have to go through a bound method x = ob.OBFingerprint.FindFingerprint("FP2") self.assertEquals(x.GetBit(v, 0), True) instead of using a static/global function self.assertEquals(ob.OBFingerprint.GetBit(v, 0), True) 4) Here's a test case I didn't expect. It seems that GetUMapList interacts with the results of GetMapList. def test_basic_match_behavior_which_I_did_not_expect(self): mol = parse_smiles("c1ccccc1O") pat = parse_smarts("c1ccccc1") pat.Match(mol) self.assertEquals(pat.NumMatches(), 12) results = pat.GetUMapList() self.assertEquals(pat.NumMatches(), 1) results = pat.GetMapList() # I really expected these to be 12. # It appears the UMapList does an in-place trim. # XXX Is that the right/expected behavior? self.assertEquals(pat.NumMatches(), 1) self.assertEquals(len(results), 1) pat.Match(mol) # Here they are 12 results = pat.GetMapList() self.assertEquals(pat.NumMatches(), 12) results = pat.GetUMapList() self.assertEquals(pat.NumMatches(), 1) self.assertEquals(len(results), 1) 5) The BeginMList/EndMList seems broken in Python. That is, I couldn't figure how how to use them. 6) I included tests for iterating over atoms and bonds in a molecule, and for atoms and bonds connected to an atom. I expected I could also iterate over the atoms in a bond, but there is no OBBondAtomIter . While not important, it appeals to my sense of symmetry and it's very simple to write. 7) Why can't I do mol.AddBond(C, N, 3) where C and N are atoms? Instead, I need to do mol.AddBond(C.GetIdx(), N.GetIdx(), 3) 8) Why are there two ways to add an atom? These are: atom = mol.NewAtom() -and- atom = OBAtom() mol.AddAtom(atom) Is one of these forms preferred? 9) How does OpenBabel define "implicit hydrogen"? It's different than my definition, which in [NH3].C is the explicitly stated implicit h-count in [NH3] and the inferred H-count in C (which is 4). OpenBabel treats the H3 as explicit. It appears that OpenBabel doesn't have an implicit h-count field and instead tracks the valence, then calculates the h-count based on the valence. 10) Shouldn't the molecular formula for "[NH4+]" include the total formal charge? 11) The formal results for c1ccccc1O.[NH4+] are very strange Take a look at this test self.assertEquals(mol.GetSpacedFormula(), "C 6 H 10 N 1 O 1 ") self.assertEquals(mol.GetSpacedFormula(1), "C 6 H 10 N O ") Why is there a double space in "N 1" and "O 1"? Why is there a space after the last "1"? And just how useful is the "include implicit hydrogens" flag to generate self.assertEquals(mol.GetSpacedFormula(0, ' ', 0), "C 6 H 4 N 1 O 1 ") ? 12) I don't think atom.GetCoordinate() is accessible from Python except through ctypes (Python's foreign function interface) 13) Why does GetAtom() start from 1 while GetBond() start from 0? (Perhaps this is in the documentation somewhere, but I was on a train without access to the internet and Google) mol = parse_smiles("C#N") C = mol.GetAtom(1) N = mol.GetAtom(2) # XXX Why do bonds starts from 0 and not 1 self.assertEquals(mol.GetBond(1), None) 14) The documentation for OBBond.GetNbrAtom needs to warn that if the atom isn't one of the bond atoms then the GetBeginAtom() will be returned. (This documentation for GetNbrAtomIdx already has a warning to that effect.) 15) In an OBRing, if there are multiple non-C atoms, which one is the root? And since ring.GetRootAtom() returns an integer, shouldn't it be named GetRootAtomIdx()? 16) Why does GetAngle() return degrees instead of radians? I see the warning to that effect, but it was rather unexpected to see. 17) If you have something like "CNOS" then C.IsConnected(C) is true. I did not expect that. This is because IsConnected checks to see if the target is any of the atoms of either side of the bonds that are connected to C. And of course, C is always one of the atoms connected to a bond of C. This has a downstream consequence in that C.IsOneFour(C) == True If I read the code right, the C in "CN" will also be in a 1-4 relationship with itself. 18) There are some minor problems with the spectrophoretest.cpp test01. (I didn't review all of the tests.) In OB_REQUIRE( (r[13] > 3.470) && (r[13] < 3.472) ); OB_REQUIRE( (r[14] > 6.698) && (r[14] < 6.800) ); OB_REQUIRE( (r[15] > 9.485) && (r[15] < 9.487) ); the line with r[14] should have 6.700 as the upper bound. and OB_REQUIRE( (r[18] > 4.899) && (r[18] < 4.902) ); should be slightly smaller in the upper bound OB_REQUIRE( (r[18] > 4.899) && (r[18] < 4.901) ); 19) I see that MMFF94's ValidateGradients() dumps output to stdout. The other ValidateGradients I tested (gaff and ghemical) do not do that, and the output probably isn't appropriate for a library. Then again, the API docs say that it's meant for debugging. Should the debugging API be published for general use? 20) OBAtom.HtoMethtyl() can also dump code to stderr v3: < -0.28, 0, 0 > v4: < -0.28, 0, 0 > This is quite clearly a bit of debugging code which wasn't removed: cerr << "v3: " << v3 << " v4: " << v4 << endl; Cheers! Andrew da...@dalkescientific.com ------------------------------------------------------------------------------ Centralized Desktop Delivery: Dell and VMware Reference Architecture Simplifying enterprise desktop deployment and management using Dell EqualLogic storage and VMware View: A highly scalable, end-to-end client virtualization framework. Read more! http://p.sf.net/sfu/dell-eql-dev2dev _______________________________________________ OpenBabel-discuss mailing list OpenBabel-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openbabel-discuss