Dear Helpdesk,

I was using CDK (version 2.7) to generate FCFP4 and FCFP6 fingerprints for the 
compounds butyramide 
(https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL1231396.sdf) 
and ethanol (https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL545.sdf) from 
their MolFiles, which I got from ChEMBL. I was using the following code with 
CDK:
---------------------------------------------------------------------------------------------------------------
package ecfp;

import java.io.FileInputStream;
import java.io.IOException;

import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.fingerprint.CircularFingerprinter;
import org.openscience.cdk.fingerprint.ICountFingerprint;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.interfaces.IChemObjectBuilder;
import org.openscience.cdk.io.MDLV2000Reader;
import org.openscience.cdk.silent.SilentChemObjectBuilder;

public class main {
    public static void main(String[] args) throws CDKException, IOException {
        // Path to the MolFile downloaded from ChEMBL (ethanol in this run).
        String filename = "C:\\Users\\NGWO0001\\Downloads\\CHEMBL545.sdf.txt";
        IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();

        // try-with-resources closes the reader even if reading fails.
        IAtomContainer mol;
        try (MDLV2000Reader reader = new MDLV2000Reader(new FileInputStream(filename))) {
            mol = reader.read(bldr.newAtomContainer());
        }

        // Use CLASS_FCFP6 here instead to generate the FCFP6 fingerprint.
        CircularFingerprinter fingerprinter0 =
            new CircularFingerprinter(CircularFingerprinter.CLASS_FCFP4);

        System.out.println("FCFP4 Ethanol:");
        ICountFingerprint result0 = fingerprinter0.getCountFingerprint(mol);
        // Print each populated bin as "<hash> <count>".
        for (int k = 0, n = result0.numOfPopulatedbins(); k < n; ++k) {
            System.out.printf("%d %d%n", result0.getHash(k), result0.getCount(k));
        }
    }
}
---------------------------------------------------------------------------------------------------------------

The results I got were:

FCFP4 Butyramide:
-1393198889 1
-1212393386 1
-1131767167 2
0 4
2 1
3 1
425233353 1
785469695 1
824716024 1
994111779 1
1429107614 1

FCFP6 Butyramide:
-1393198889 1
-1212393386 1
-1131767167 2
0 4
2 1
3 1
425233353 1
785469695 1
824716024 1
994111779 1
1429107614 1

FCFP4 Ethanol:
-1212393386 1
0 2
3 1
629394235 1
824716024 1

FCFP6 Ethanol:
-1212393386 1
0 2
3 1
629394235 1
824716024 1

I think these results may not be right, since I thought that a fingerprint is 
supposed to be a series of hashes and so ought to be a series of fixed-length 
integers. However, as you can see in the results I got, the values differ in 
length. In the FCFP6 for ethanol, for example, one value is 10 digits long, 
two are 9 digits long, and the others are single digits.
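
To make the digit-length observation concrete, here is a minimal sketch in plain Java (no CDK involved) that takes the five values printed for ethanol and shows each one both in decimal and as a zero-padded 8-digit hexadecimal string. It assumes the values are ordinary 32-bit `int`s, which is what the `getHash` return type suggests. The class name `HashWidthDemo` and the two helper methods are hypothetical names used only for this illustration.

```java
public class HashWidthDemo {

    /** Formats a 32-bit value as exactly eight hexadecimal digits. */
    static String toFixedHex(int hash) {
        // %x treats a negative int as its unsigned two's-complement form.
        return String.format("%08x", hash);
    }

    /** Counts the decimal digits of the value, ignoring any minus sign. */
    static int decimalDigits(int hash) {
        String s = Integer.toString(hash);
        return hash < 0 ? s.length() - 1 : s.length();
    }

    public static void main(String[] args) {
        // The five values printed for ethanol in the results above.
        int[] hashes = {-1212393386, 0, 3, 629394235, 824716024};
        for (int h : hashes) {
            System.out.printf("%12d  digits=%2d  hex=%s%n",
                    h, decimalDigits(h), toFixedHex(h));
        }
    }
}
```

Under that assumption every value occupies the same 32 bits, and the differing digit counts come from printing in decimal without padding. Whether the small values (0, 2, 3) are expected output for FCFP is a separate question that this sketch does not address.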

Can you please tell me what I am doing wrong?

Thanking you in advance for your assistance and time.

Best regards,
Woon Yee
_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user
