Re: [OpenBabel-Devel] Algorithm for FF atomtypes

2019-10-19 Thread Geoffrey Hutchison
I'd have to hunt through the code more carefully, but in formats like Gaussian, 
we know that all atoms (including hydrogens) are explicit.

Unfortunately, that's not true for PDB: in the vast majority of cases, 
hydrogens are omitted and implicit.

So, for example in src/formats/pdbformat.cpp (master):
> FOR_ATOMS_OF_MOL(matom, mol)
>   OBAtomAssignTypicalImplicitHydrogens(&*matom);

Now maybe it's useful to have a PDB read option that says "I know I have 
explicit hydrogens"?

-Geoff

___
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel


Re: [OpenBabel-Devel] Algorithm for FF atomtypes

2019-10-19 Thread David van der Spoel

On 2019-10-18 at 23:04, Geoffrey Hutchison wrote:

>> How about Gaussian log files? It works with those as well.
>> Somehow the bond orders there are derived correctly as well, although no
>> such information is extracted from the log files.
>
> Are you saying that PDB is giving you different information than the Gaussian 
> files? If so, I'd be curious to see some files.
>
> -Geoff


I've uploaded a file to my work webserver
http://folding.bmc.uu.se/diethyl-sulfate-3-oep.log.gz

Using the master branch, freshly updated and built:

% obenergy -ff GAFF diethyl-sulfate-3-oep.log.gz

3   os  NO
4   s6  NO
5   os  NO
6   o   NO
7   o   NO


Now, using pdb:
% obabel -ig09 diethyl-sulfate-3-oep.log.gz -opdb -O test.pdb
1 molecule converted
% obenergy -ff GAFF test.pdb

3   os  NO
4   s6  NO
5   os  NO
6   oh  NO
7   oh  NO

(Note that no hydrogens are added, but the atom types of atoms 6 and 7 
change from o to oh.)
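
The two listings above can be compared mechanically. What follows is a minimal Python sketch, not part of Open Babel. It assumes each row of the atom-type listing has the form "index type flag", as in the excerpts above, and the helper names parse_types and diff_types are hypothetical.

```python
def parse_types(listing):
    """Map atom index -> atom type from rows of the form 'index type flag'."""
    types = {}
    for line in listing.splitlines():
        fields = line.split()
        # Skip anything that does not look like an 'index type flag' row.
        if len(fields) == 3 and fields[0].isdigit():
            types[int(fields[0])] = fields[1]
    return types


def diff_types(a, b):
    """Return {index: (type_a, type_b)} for atoms typed differently."""
    shared = sorted(a.keys() & b.keys())
    return {i: (a[i], b[i]) for i in shared if a[i] != b[i]}


# Excerpts copied from the two obenergy runs quoted above.
FROM_LOG = """
3   os  NO
4   s6  NO
5   os  NO
6   o   NO
7   o   NO
"""
FROM_PDB = """
3   os  NO
4   s6  NO
5   os  NO
6   oh  NO
7   oh  NO
"""

changed = diff_types(parse_types(FROM_LOG), parse_types(FROM_PDB))
print(changed)  # {6: ('o', 'oh'), 7: ('o', 'oh')}
```

Only the two terminal oxygens on sulfur are typed differently; everything else in the excerpt is identical.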

Possibly related to this: conversion to SDF seems to have changed for 
this compound between version 2.4.1 and master (3.0). The S=O double 
bonds have disappeared, and there is another change in the sixth column 
after the element symbol:


% diff ds2.4.1.sdf ds3.0dev.sdf
10,11c10,11
<     0.0001    1.2905   -1.2428 O   0  0  0  0  0  0  0  0  0  0  0  0
<    -0.0002    1.2905    1.2428 O   0  0  0  0  0  0  0  0  0  0  0  0
---
>     0.0001    1.2905   -1.2428 O   0  0  0  0  0  1  0  0  0  0  0  0
>    -0.0002    1.2905    1.2428 O   0  0  0  0  0  1  0  0  0  0  0  0
29c29
<   4  7  2  0  0  0  0
---
>   4  7  1  0  0  0  0
32c32
<   6  4  2  0  0  0  0
---
>   6  4  1  0  0  0  0

To make things more complicated, the charges generated are different for 
the two SDF files as well...
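
For reading the diff above: in the V2000 molfile layout the fields after the element symbol are, in order, mass difference, charge, atom stereo parity, hydrogen count, stereo care box and valence, so the column that changed from 0 to 1 should be the valence field. The Python sketch below pulls that field and the bond orders out of the diffed lines. It assumes standard fixed-width V2000 columns, and the function names are hypothetical.

```python
def parse_atom_line(line):
    """Return (symbol, valence field) from a fixed-width V2000 atom line."""
    symbol = line[31:34].strip()   # columns 32-34: element symbol
    valence = int(line[48:51])     # sixth field after the symbol (vvv)
    return symbol, valence


def parse_bond_line(line):
    """Return ((atom_a, atom_b), order) from a V2000 bond line."""
    a, b, order = int(line[0:3]), int(line[3:6]), int(line[6:9])
    # Sort the pair so that '6 4' and '4 6' refer to the same bond.
    return tuple(sorted((a, b))), order


# Lines taken from the diff above (2.4.1 output first, 3.0 output second).
OLD_ATOM = "    0.0001    1.2905   -1.2428 O   0  0  0  0  0  0  0  0  0  0  0  0"
NEW_ATOM = "    0.0001    1.2905   -1.2428 O   0  0  0  0  0  1  0  0  0  0  0  0"
OLD_BONDS = ["  4  7  2  0  0  0  0", "  6  4  2  0  0  0  0"]
NEW_BONDS = ["  4  7  1  0  0  0  0", "  6  4  1  0  0  0  0"]

old = dict(parse_bond_line(b) for b in OLD_BONDS)
new = dict(parse_bond_line(b) for b in NEW_BONDS)
changes = {pair: (old[pair], new[pair]) for pair in old if old[pair] != new[pair]}

print(parse_atom_line(OLD_ATOM), parse_atom_line(NEW_ATOM))
print(changes)
```

Read this way, the 3.0 output has both S-O bonds as single bonds and marks each of those oxygens with an explicit valence of 1.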


Sorry for the long message!
--
David van der Spoel, Ph.D., Professor of Biology
Head of Department, Cell & Molecular Biology, Uppsala University.
Box 596, SE-75124 Uppsala, Sweden. Phone: +46184714205.
http://www.icm.uu.se




Re: [OpenBabel-Devel] Algorithm for FF atomtypes

2019-10-18 Thread Geoffrey Hutchison
> obenergy -ff GAFF file.pdb
> I get different bonds than when I run
> obenergy -ff GAFF file.sdf
> 
> Since the atomtyping is done based on SMARTS I wonder whether different 
> algorithms are used for generating SMARTS depending on the input file?

Different file types may or may not present full bonding information, esp. XYZ 
or PDB.

Something like SDF is great because it provides bonds, valence, etc. (i.e., the 
full valence structure of the molecule).

With PDB... well, we have full bonding information for residues, but otherwise 
it depends on what's available in the CONECT records (if they exist). Normally, 
we have to at least do bond order perception.
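
As a quick check of how much a given PDB file actually provides, the following Python sketch counts atoms, hydrogens and CONECT records. It is an illustration only and does not use Open Babel. It assumes the standard PDB column layout (element symbol in columns 77-78) and falls back to a rough guess from the atom name when the element column is missing. The helper names summarize_pdb and pdb_atom are hypothetical.

```python
def summarize_pdb(text):
    """Count atoms, hydrogens and CONECT records in PDB-format text."""
    atoms = hydrogens = conect = 0
    for line in text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            atoms += 1
            element = line[76:78].strip().upper()
            if not element:
                # Heuristic fallback: first letter of the atom name,
                # ignoring leading digits (e.g. '1HB' -> 'H').
                name = line[12:16].strip()
                element = name.lstrip("0123456789")[:1].upper()
            if element in ("H", "D"):
                hydrogens += 1
        elif record == "CONECT":
            conect += 1
    return {"atoms": atoms, "hydrogens": hydrogens, "conect_records": conect}


def pdb_atom(serial, name, element, x, y, z):
    """Hypothetical helper: format one HETATM record in PDB columns."""
    return ("HETATM%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            "          %2s" % (serial, name, "LIG", "A", 1, x, y, z,
                               1.0, 0.0, element))


# Made-up three-atom fragment, only to exercise the function.
sample = "\n".join([
    pdb_atom(1, "S1", "S", 0.0, 0.0, 0.0),
    pdb_atom(2, "O1", "O", 1.43, 0.0, 0.0),
    pdb_atom(3, "O2", "O", -1.43, 0.0, 0.0),
    "CONECT    1    2    3",
    "END",
])
info = summarize_pdb(sample)
print(info)  # {'atoms': 3, 'hydrogens': 0, 'conect_records': 1}
```

A file that reports zero hydrogens or zero CONECT records leaves more to be perceived on reading.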

I'm pretty sure you asked over the summer - my group is working on some 
ML-based methods for improved bond order assignments and we'd be happy to 
collaborate. Obviously, once it's in a good state, we'll put everything up, 
including the training and test sets (i.e., you can add problematic molecules 
to improve the training).

Hope that helps,
-Geoff

---
Prof. Geoffrey Hutchison
Department of Chemistry
University of Pittsburgh
tel: (412) 648-0492
email: geo...@pitt.edu
twitter: @ghutchis
web: https://hutchison.chem.pitt.edu/



Re: [OpenBabel-Devel] Algorithm for FF atomtypes

2019-10-18 Thread Noel O'Boyle
Does your pdb file have hydrogens? If not, even more guessing. Gaussian
files have all the atoms at least.

On Fri, 18 Oct 2019, 19:54 David van der Spoel wrote:

> On 2019-10-18 at 20:37, Noel O'Boyle wrote:
> > With pdb files we have to guess the bond orders. With sdf files, we
> > don't. There shouldn't be any other difference.
>
> How about Gaussian log files? It works with those as well.
> Somehow the bond orders there are derived correctly as well, although no
> such information is extracted from the log files.
> >
> > On Fri, 18 Oct 2019, 18:52 David van der Spoel wrote:
> >
> > Hi,
> >
> > it seems that the algorithm for determining bonds and atomtypes
> > depends on the input file, is that correct?
> > When I run
> >
> > obenergy -ff GAFF file.pdb
> > I get different bonds than when I run
> > obenergy -ff GAFF file.sdf
> >
> > Since the atomtyping is done based on SMARTS I wonder whether
> > different algorithms are used for generating SMARTS depending on the
> > input file?
> >
> > Alternatively, tips for debugging this, where to look in the code,
> > would be appreciated.
> >
> > Cheers,


Re: [OpenBabel-Devel] Algorithm for FF atomtypes

2019-10-18 Thread David van der Spoel

On 2019-10-18 at 20:37, Noel O'Boyle wrote:
> With pdb files we have to guess the bond orders. With sdf files, we 
> don't. There shouldn't be any other difference.

How about Gaussian log files? It works with those as well.
Somehow the bond orders there are derived correctly as well, although no 
such information is extracted from the log files.

> On Fri, 18 Oct 2019, 18:52 David van der Spoel wrote:
>
> > Hi,
> >
> > it seems that the algorithm for determining bonds and atomtypes depends
> > on the input file, is that correct?
> > When I run
> >
> > obenergy -ff GAFF file.pdb
> > I get different bonds than when I run
> > obenergy -ff GAFF file.sdf
> >
> > Since the atomtyping is done based on SMARTS I wonder whether different
> > algorithms are used for generating SMARTS depending on the input file?
> >
> > Alternatively, tips for debugging this, where to look in the code,
> > would be appreciated.
> >
> > Cheers,
--
David van der Spoel, Ph.D., Professor of Biology
Head of Department, Cell & Molecular Biology, Uppsala University.
Box 596, SE-75124 Uppsala, Sweden. Phone: +46184714205.
http://www.icm.uu.se




Re: [OpenBabel-Devel] Algorithm for FF atomtypes

2019-10-18 Thread Noel O'Boyle
With pdb files we have to guess the bond orders. With sdf files, we don't.
There shouldn't be any other difference.

On Fri, 18 Oct 2019, 18:52 David van der Spoel wrote:

> Hi,
>
> it seems that the algorithm for determining bonds and atomtypes depends
> on the input file, is that correct?
> When I run
>
> obenergy -ff GAFF file.pdb
> I get different bonds than when I run
> obenergy -ff GAFF file.sdf
>
> Since the atomtyping is done based on smarts I wonder whether different
> algorithms are used for generating smarts depending on the input file?
>
> Alternatively, tips for debugging this, where to look in the code, would
> be appreciated.
>
> Cheers,


Re: [OpenBabel-Devel] Algorithm for FF atomtypes

2019-10-18 Thread Stefano Forli
David,

have you tried to see if other bond-typed formats, like Mol2, give the same or 
different results?

I suspect the bond typing to play a role in this issue. In my experience with 
SMARTS and PDBs, I noticed that very often bond recognition failed, sometimes 
due to tiny deviations from ideal geometries, and sometimes for unknown 
reasons.
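
To illustrate how a small geometric deviation can flip a perceived bond, here is a minimal Python sketch of a generic distance-based connectivity test: two atoms count as bonded when their distance is at most the sum of their covalent radii plus a tolerance. This is not claimed to be the exact procedure Open Babel uses. The radii are approximate literature values, the default tolerance of 0.45 Å is an assumption, and the coordinates are made up.

```python
import math

# Approximate covalent radii in angstroms (illustrative values only).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}


def perceive_bonds(atoms, tolerance=0.45):
    """Return index pairs (i, j) whose distance is within the bonding cutoff.

    atoms is a list of (element, (x, y, z)) tuples; tolerance is the slack
    in angstroms added to the sum of the two covalent radii.
    """
    bonds = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (elem_i, pos_i), (elem_j, pos_j) = atoms[i], atoms[j]
            cutoff = COVALENT_RADII[elem_i] + COVALENT_RADII[elem_j] + tolerance
            if math.dist(pos_i, pos_j) <= cutoff:
                bonds.append((i, j))
    return bonds


# Made-up geometry: one oxygen at a normal S-O distance, one stretched
# to 2.20 A, just beyond the default cutoff of 1.05 + 0.66 + 0.45 = 2.16 A.
atoms = [
    ("S", (0.0, 0.0, 0.0)),
    ("O", (1.43, 0.0, 0.0)),
    ("O", (0.0, 2.2, 0.0)),
]

print(perceive_bonds(atoms))                 # [(0, 1)]
print(perceive_bonds(atoms, tolerance=0.5))  # [(0, 1), (0, 2)]
```

A shift of a few hundredths of an angstrom in the cutoff is enough to gain or lose a bond here, and any SMARTS-based typing downstream sees a different molecule.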

If your 'file.sdf' was generated from 'file.pdb', could it be that obenergy 
does not perform exactly the same automated perception that obabel does when 
converting the files?

My 2 cents,

S

On 10/18/19 10:26 AM, David van der Spoel wrote:
> Hi,
> 
> it seems that the algorithm for determining bonds and atomtypes depends on 
> the input file, is that correct?
> When I run
> 
> obenergy -ff GAFF file.pdb
> I get different bonds than when I run
> obenergy -ff GAFF file.sdf
> 
> Since the atomtyping is done based on SMARTS I wonder whether different 
> algorithms are used for generating SMARTS depending on the input file?
> 
> Alternatively, tips for debugging this, where to look in the code, would be 
> appreciated.
> 
> Cheers,

-- 

  Stefano Forli, PhD

  Assistant Professor
  Center for Computational Structural Biology

  Dept. of Integrative Structural
  and Computational Biology, MB-112A
  The Scripps Research Institute
  10550  North Torrey Pines Road
  La Jolla,  CA 92037-1000,  USA.

 tel: +1 (858)784-2055
 fax: +1 (858)784-2860
 email: fo...@scripps.edu
 http://www.scripps.edu/~forli/
