Hi Egon,

I have tried the getFingerprint() method, but the output is still not a series
of fixed-length integers. The integers still differ in length, for example,
9 digits versus 10 digits.

Hi John,
I have modified the code from ans4 += result0.getHash(k); to ans4 += 
Integer.toHexString(result0.getHash(k));.
Below is my result for Ethanol’s FCFP4:

FCFP4 Ethanol:
b7bc5856 1
0 2
3 1
2583cb3b 1
31282af8 1

After padding:
FCFP4 Ethanol:
b7bc5856 1
00000000 2
00000003 1
2583cb3b 1
31282af8 1
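
The conversion and the padding can also be done in one call. Below is a minimal
sketch, assuming each hash is the int returned by getHash(k); the class name
PaddedHashDemo, the helper toPaddedHex, and the hard-coded arrays (copied from
the ethanol FCFP4 output in this thread) are only for illustration and are not
part of CDK.

```java
class PaddedHashDemo {
    // Formats a 32-bit hash as exactly eight lowercase hex digits.
    // %x treats a negative int as its unsigned 32-bit value, and the
    // 0 flag with width 8 pads short values with leading zeros.
    static String toPaddedHex(int hash) {
        return String.format("%08x", hash);
    }

    public static void main(String[] args) {
        // Hashes and counts taken from the ethanol FCFP4 output in this thread.
        int[] hashes = {-1212393386, 0, 3, 629394235, 824716024};
        int[] counts = {1, 2, 1, 1, 1};
        for (int k = 0; k < hashes.length; k++) {
            System.out.println(toPaddedHex(hashes[k]) + " " + counts[k]);
        }
    }
}
```

In the original loop this would amount to replacing the Integer.toHexString
call with String.format("%08x", result0.getHash(k)).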

Can you please tell me whether I understood your previous email correctly?

Thank you both very much for your attention to this matter.

Best Regards,
Woon Yee.

From: John Mayfield <john.wilkinson...@gmail.com>
Sent: Monday, 25 July, 2022 6:33 PM
To: Egon Willighagen <egon.willigha...@gmail.com>
Cc: #NG WOON YEE# <ngwo0...@e.ntu.edu.sg>; cdk-user@lists.sourceforge.net;
Chong Kim San Allen <kimsanallen.ch...@ntu.edu.sg>
Subject: Re: [Cdk-user] Generation of FCFP Fingerprint

Hi Woon Yee,

The method is correct; you can emit them as hexadecimal and pad with zeros.

John

On Mon, 25 Jul 2022 at 10:11, Egon Willighagen 
<egon.willigha...@gmail.com> wrote:

Dear Woon Yee,

you can use the getFingerprint() method instead.
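
One reading of this suggestion is to use a folded, fixed-length bit
fingerprint rather than the raw hash list. The sketch below is not CDK's
implementation; it only illustrates, assuming simple modulo folding, how
32-bit hashes can be mapped onto a fixed number of bits. The names
FoldedFingerprintDemo and fold and the length of 1024 are hypothetical
choices, and the hashes are the ethanol FCFP4 values quoted in this thread.

```java
import java.util.BitSet;

class FoldedFingerprintDemo {
    // Folds 32-bit hashes into a bit set with a fixed number of positions.
    static BitSet fold(int[] hashes, int length) {
        BitSet bits = new BitSet(length);
        for (int hash : hashes) {
            // Read the hash as unsigned so negative values give a valid index.
            long unsigned = Integer.toUnsignedLong(hash);
            bits.set((int) (unsigned % length));
        }
        return bits;
    }

    public static void main(String[] args) {
        // Ethanol FCFP4 hashes as reported in this thread.
        int[] ethanolFcfp4 = {-1212393386, 0, 3, 629394235, 824716024};
        BitSet bits = fold(ethanolFcfp4, 1024);
        System.out.println(bits.cardinality() + " bits set: " + bits);
    }
}
```

Every molecule then yields a vector of the same length, at the cost of losing
the counts and of possible collisions when two hashes fold onto the same bit.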

Egon

On Mon, 25 Jul 2022 at 10:59, #NG WOON YEE# via Cdk-user 
<cdk-user@lists.sourceforge.net> wrote:
Dear Helpdesk,

I am using CDK (version 2.7) to generate FCFP4 and FCFP6 fingerprints for the
compounds butyramide
(https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL1231396.sdf) and ethanol
(https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL545.sdf) from their
MolFiles, which I obtained from ChEMBL. I used the following code:
---------------------------------------------------------------------------------------------------------------
package ecfp;

import java.io.FileInputStream;
import java.io.IOException;

import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.fingerprint.CircularFingerprinter;
import org.openscience.cdk.fingerprint.ICountFingerprint;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.interfaces.IChemObjectBuilder;
import org.openscience.cdk.io.MDLV2000Reader;
import org.openscience.cdk.silent.SilentChemObjectBuilder;

public class main {
    public static void main(String[] args) throws CDKException, IOException {
        // Read the ethanol MolFile downloaded from ChEMBL.
        String filename = "C:\\Users\\NGWO0001\\Downloads\\CHEMBL545.sdf.txt";
        FileInputStream in = new FileInputStream(filename);
        MDLV2000Reader reader = new MDLV2000Reader(in);
        IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
        IAtomContainer mol = reader.read(bldr.newAtomContainer());

        CircularFingerprinter fingerprinter0 =
                new CircularFingerprinter(CircularFingerprinter.CLASS_FCFP4);

        // Print each populated bin as "<hash> <count>".
        System.out.println("FCFP4 Ethanol:");
        ICountFingerprint result0 = fingerprinter0.getCountFingerprint(mol);
        for (int k = 0, n = result0.numOfPopulatedbins(); k < n; ++k) {
            String ans4 = "";
            ans4 += result0.getHash(k);
            ans4 += " " + result0.getCount(k);
            System.out.printf("%s\n", ans4);
        }

        reader.close();
    }
}
---------------------------------------------------------------------------------------------------------------

The results I got were:

FCFP4 Butyramide:
-1393198889 1
-1212393386 1
-1131767167 2
0 4
2 1
3 1
425233353 1
785469695 1
824716024 1
994111779 1
1429107614 1

FCFP6 Butyramide:
-1393198889 1
-1212393386 1
-1131767167 2
0 4
2 1
3 1
425233353 1
785469695 1
824716024 1
994111779 1
1429107614 1

FCFP4 Ethanol:
-1212393386 1
0 2
3 1
629394235 1
824716024 1

FCFP6 Ethanol:
-1212393386 1
0 2
3 1
629394235 1
824716024 1

I think these results may not be right, since I thought that fingerprints are 
supposed to be a series of hashes and so ought to be a series of fixed-length 
integers. However, as you can see in the results I got, for example the FCFP6 
for ethanol, one value is 10 digits long while the others are a single digit 
or 9 digits long.

Can you please tell me what I am doing wrong?

Thanking you in advance for your assistance and time.

Best regards,
Woon Yee
_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net<mailto:Cdk-user@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/cdk-user


--
----
Super happy with this new eLife paper describing an Open Science project where 
we discuss 260 thousand natural products and where they came from, all 700 
thousand pairs linked to their primary literature: "The LOTUS initiative for 
open knowledge management in natural products research", 
https://doi.org/10.7554/elife.70780

-----
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Twitter/Mastodon: @egonwillighagen<https://twitter.com/egonwillighagen> / 
@egonw<https://scholar.social/@egonw>
Homepage: http://egonw.github.io/
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: 0000-0001-7542-0286<http://orcid.org/0000-0001-7542-0286>
ImpactStory: https://impactstory.org/u/egonwillighagen