Dear CDK users,
It seems to be impossible to get anything other than NaN values for
the following descriptors:
Wgamma1.unity = NaN
Wgamma2.unity = NaN
Wgamma3.unity = NaN
WG.unity = NaN
I've tested with the latest CDK version. Rajarshi's CDKDesc GUI
gives similar results with a previous CDK version.
Below is a small snippet that calculates the WHIM descriptors and
PSA as a quick test. Running it on a simple SDF file containing
cyclohexane, pyrrole and 3H-indole, I get the problem described
above, plus NaN for all values for the pyrrole molecule.
I don't know if it's a known issue, a new bug, or my own ignorance of
how to use these descriptors correctly.
Also, I used the CDK to load several million molecules from
existing chemical providers; during the process, several atom types
were not recognized. If Egon or anyone else is interested in viewing
these compounds, just send me an email.
Cheers :)
Vincent.
=========== SNIPPET ==========
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.openscience.cdk.DefaultChemObjectBuilder;
import org.openscience.cdk.aromaticity.CDKHueckelAromaticityDetector;
import org.openscience.cdk.atomtype.CDKAtomTypeMatcher;
import org.openscience.cdk.graph.ConnectivityChecker;
import org.openscience.cdk.interfaces.IAtom;
import org.openscience.cdk.interfaces.IAtomType;
import org.openscience.cdk.interfaces.IMolecule;
import org.openscience.cdk.interfaces.IMoleculeSet;
import org.openscience.cdk.interfaces.IPseudoAtom;
import org.openscience.cdk.io.iterator.IteratingMDLReader;
import org.openscience.cdk.qsar.DescriptorValue;
import org.openscience.cdk.qsar.descriptors.molecular.TPSADescriptor;
import org.openscience.cdk.qsar.descriptors.molecular.WHIMDescriptor;
import org.openscience.cdk.tools.CDKHydrogenAdder;
import org.openscience.cdk.tools.manipulator.AtomContainerManipulator;
import org.openscience.cdk.tools.manipulator.AtomTypeManipulator;
/**
 * @author vince
 */
public class CdkTest {

    public static void test(String fileName) throws Exception {
        // Open SD file
        FileInputStream ins = new FileInputStream(fileName);
        IteratingMDLReader reader = new IteratingMDLReader(ins,
                DefaultChemObjectBuilder.getInstance());

        // Load all molecules in memory & clean them
        List<IMolecule> mols = new ArrayList<IMolecule>();
        IMolecule mol;
        while (reader.hasNext()) {
            mol = (IMolecule) reader.next();
            try {
                mol = cleanMolecule(mol, true, true, false);
                mols.add(mol);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        reader.close();

        // Calculate descriptors & print results
        for (int i = 0; i < mols.size(); i++) {
            System.out.println("\n=== Molecule " + i + ": \n ");
            DescriptorValue dv = new WHIMDescriptor().calculate(mols.get(i));
            DescriptorValue psa = new TPSADescriptor().calculate(mols.get(i));
            String[] vals = dv.getValue().toString().split(",");
            String[] names = dv.getNames();
            for (int j = 0; j < vals.length; j++) {
                System.out.println(names[j] + " = " + vals[j]);
            }
            System.out.println(psa.getNames()[0] + " = "
                    + psa.getValue().toString());
        }
    }
    /**
     * Clean an input molecule, including configuration of atom types and
     * aromaticity detection. Additional actions are available; see the
     * parameters.
     *
     * @param m               The molecule to clean
     * @param keepLargestFrag If true, remove any disconnected fragments
     *                        and keep only the largest one
     * @param explicitH       If true, add explicit hydrogens. If false,
     *                        only implicit hydrogens are added
     * @param forceExotic     If true, do not throw an exception for
     *                        unrecognized atom types
     * @return The cleaned molecule
     * @throws Exception if an atom type is not recognized and
     *                   forceExotic is false
     */
    public static IMolecule cleanMolecule(IMolecule m,
                                          boolean keepLargestFrag,
                                          boolean explicitH,
                                          boolean forceExotic)
            throws Exception {
        // Check for salts and such, if asked -> simply keep the largest
        // fragment
        if (keepLargestFrag) {
            if (!ConnectivityChecker.isConnected(m)) {
                IMoleculeSet fragments =
                        ConnectivityChecker.partitionIntoMolecules(m);
                int maxID = 0;
                int maxVal = Integer.MIN_VALUE;
                int atomCount = -1;
                for (int i = 0; i < fragments.getMoleculeCount(); i++) {
                    atomCount = fragments.getMolecule(i).getAtomCount();
                    if (atomCount > maxVal) {
                        maxID = i;
                        maxVal = atomCount;
                    }
                }
                m = fragments.getMolecule(maxID);
            }
        }
        // Configure the molecule atom types & add implicit hydrogens
        // 1. The fastest way (fastest = less code), but we don't control
        //    everything, namely exotic atom types
        // AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(m);
        //
        // CDKHydrogenAdder hAdder =
        //         CDKHydrogenAdder.getInstance(m.getBuilder());
        // hAdder.addImplicitHydrogens(m);

        // 2. The custom way: more code, but more control over atom typing
        CDKAtomTypeMatcher matcher =
                CDKAtomTypeMatcher.getInstance(m.getBuilder());
        CDKHydrogenAdder hAdder =
                CDKHydrogenAdder.getInstance(m.getBuilder());
        // Assign atom types for all atoms
        for (IAtom atom : m.atoms()) {
            if (!(atom instanceof IPseudoAtom)) {
                IAtomType matched = matcher.findMatchingAtomType(m, atom);
                if (matched != null) {
                    AtomTypeManipulator.configure(atom, matched);
                    hAdder.addImplicitHydrogens(m, atom);
                } else {
                    // Here the CDK doesn't know the atom type...
                    if (!forceExotic) {
                        throw new Exception("Unknown atom type "
                                + atom.getSymbol());
                    }
                }
            }
        }
        // Detect aromaticity
        CDKHueckelAromaticityDetector.detectAromaticity(m);
        // Add explicit hydrogens, if asked
        if (explicitH) {
            AtomContainerManipulator.convertImplicitToExplicitHydrogens(m);
            // Perceive atom types again to assign the hydrogens' atom types
            AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(m);
        }
        return m;
    }
    public static void main(String[] args) {
        String file = (args.length > 0 && args[0] != null)
                ? args[0] : "dummy.sdf";
        try {
            test(file);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
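
To see at a glance which descriptors come back as NaN, without reading
the whole printout, the names/values arrays built in test() could be
passed to a small helper along these lines. This is only a sketch in
plain Java: the NanReport class and its findNaN and isNaN methods are
hypothetical names, not part of the CDK, and the name "Example.ok" and
the values in main are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the CDK.
final class NanReport {

    /** Returns the names whose matching value string parses to NaN. */
    static List<String> findNaN(String[] names, String[] values) {
        List<String> result = new ArrayList<String>();
        // Guard against arrays of different lengths
        int n = Math.min(names.length, values.length);
        for (int i = 0; i < n; i++) {
            if (isNaN(values[i])) {
                result.add(names[i]);
            }
        }
        return result;
    }

    /** True if the text parses as a double and that double is NaN. */
    static boolean isNaN(String text) {
        try {
            return Double.isNaN(Double.parseDouble(text.trim()));
        } catch (NumberFormatException e) {
            // Not a number at all: treat it as "not NaN" here
            return false;
        }
    }

    public static void main(String[] args) {
        // Made-up values, split the same way as in test()
        String[] names = {"Example.ok", "Wgamma1.unity", "WG.unity"};
        String[] values = "0.5,NaN,NaN".split(",");
        System.out.println(findNaN(names, values));
        // prints [Wgamma1.unity, WG.unity]
    }
}
```

In test(), the call would be findNaN(names, vals), placed right after
the two arrays are filled.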