Dear Helpdesk,

I was using CDK (version 2.7) to generate FCFP4 and FCFP6 fingerprints for the compounds butyramide (https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL1231396.sdf) and ethanol (https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL545.sdf) from their MolFiles, which I got from ChEMBL.

I was using the following code (shown here for the FCFP4 run on ethanol):

---------------------------------------------------------------------------------------------------------------
package ecfp;

import java.io.FileInputStream;
import java.io.IOException;

import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.fingerprint.CircularFingerprinter;
import org.openscience.cdk.fingerprint.ICountFingerprint;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.interfaces.IChemObjectBuilder;
import org.openscience.cdk.io.MDLV2000Reader;
import org.openscience.cdk.silent.SilentChemObjectBuilder;

public class Main {

    public static void main(String[] args) throws CDKException, IOException {
        String filename = "C:\\Users\\NGWO0001\\Downloads\\CHEMBL545.sdf.txt";
        IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();

        // Read the single molecule from the MolFile; the reader is closed automatically.
        IAtomContainer mol;
        try (MDLV2000Reader reader = new MDLV2000Reader(new FileInputStream(filename))) {
            mol = reader.read(bldr.newAtomContainer());
        }

        CircularFingerprinter fingerprinter0 =
                new CircularFingerprinter(CircularFingerprinter.CLASS_FCFP4);

        System.out.println("FCFP4 Ethanol:");
        ICountFingerprint result0 = fingerprinter0.getCountFingerprint(mol);

        // Print one "hash count" pair per populated bin.
        for (int k = 0, n = result0.numOfPopulatedbins(); k < n; ++k) {
            System.out.printf("%d %d%n", result0.getHash(k), result0.getCount(k));
        }
    }
}
---------------------------------------------------------------------------------------------------------------

The results I got were (one "hash count" pair per line):

FCFP4 Butyramide:
-1393198889 1
-1212393386 1
-1131767167 2
0 4
2 1
3 1
425233353 1
785469695 1
824716024 1
994111779 1
1429107614 1

FCFP6 Butyramide:
-1393198889 1
-1212393386 1
-1131767167 2
0 4
2 1
3 1
425233353 1
785469695 1
824716024 1
994111779 1
1429107614 1

FCFP4 Ethanol:
-1212393386 1
0 2
3 1
629394235 1
824716024 1

FCFP6 Ethanol:
-1212393386 1
0 2
3 1
629394235 1
824716024 1

I think these results may not be right, since I thought that fingerprints are supposed to be a series of hashes and so they ought to be a series of fixed-length integers.
However, as you can see in the results I got, for example in the FCFP6 for ethanol, one value is 10 digits long while the others are either single digits or 9 digits long.

Can you please tell me what I am doing wrong?

Thank you in advance for your assistance and time.

Best regards,
Woon Yee
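
P.S. To make the question concrete, here is a minimal sketch (plain Java, no CDK) of what I mean by "fixed-length". It assumes the values printed by getHash are ordinary 32-bit Java ints, and prints a few of the ethanol values above as zero-padded 8-digit hexadecimal. The class name HashWidthDemo and the helper toFixedHex are my own, purely for illustration.

```java
public class HashWidthDemo {

    // Assumption: each hash is a plain 32-bit int, so it always fits in
    // exactly eight hexadecimal digits when zero-padded.
    static String toFixedHex(int hash) {
        return String.format("%08x", hash);
    }

    public static void main(String[] args) {
        // Values copied from the ethanol output above.
        int[] hashes = {-1212393386, 0, 3, 629394235, 824716024};
        for (int h : hashes) {
            // Decimal form on the left, fixed-width hex form on the right.
            System.out.println(h + " -> " + toFixedHex(h));
        }
    }
}
```

If that assumption holds, every value has the same width in this form, even though the decimal printout varies in length.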
_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user