Yep, that’s correct: they are just integers, and there are different ways to encode them.
- John

> On 25 Jul 2022, at 17:42, #NG WOON YEE# <ngwo0...@e.ntu.edu.sg> wrote:
>
> Hi Egon,
>
> I have tried the getFingerprint() method, but its output is not a series of
> fixed-length integers either. The integers still have different lengths,
> for example 9 digits vs 10 digits.
>
> Hi John,
>
> I have modified the code from ans4 += result0.getHash(k); to
> ans4 += Integer.toHexString(result0.getHash(k));.
> Below is my result for Ethanol’s FCFP4:
>
> FCFP4 Ethanol:
> b7bc5856 1
> 0 2
> 3 1
> 2583cb3b 1
> 31282af8 1
>
> After padding:
> FCFP4 Ethanol:
> b7bc5856 1
> 00000000 2
> 00000003 1
> 2583cb3b 1
> 31282af8 1
>
> Can you please tell me whether I understood your previous email correctly?
>
> Thank you both very much for your attention to this matter.
>
> Best Regards,
> Woon Yee.
>
> From: John Mayfield
> Sent: Monday, 25 July, 2022 6:33 PM
> To: Egon Willighagen
> Cc: #NG WOON YEE#; cdk-user@lists.sourceforge.net; Chong Kim San Allen
> Subject: Re: [Cdk-user] Generation of FCFP Fingerprint
>
> Hi Woon Yee,
>
> The method is correct; you can emit them as hexadecimal and pad with 0.
>
> John
>
> On Mon, 25 Jul 2022 at 10:11, Egon Willighagen <egon.willigha...@gmail.com> wrote:
>
> Dear Woon Yee,
>
> you can use the getFingerprint() method instead.
>
> Egon
>
> On Mon, 25 Jul 2022 at 10:59, #NG WOON YEE# via Cdk-user <cdk-user@lists.sourceforge.net> wrote:
>
> Dear Helpdesk,
>
> I was using CDK (version 2.7) to generate FCFP4 and FCFP6 for the compounds
> butyramide (https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL1231396.sdf)
> and ethanol (https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL545.sdf)
> from their MolFiles, which I got from ChEMBL. I was using the following
> code with CDK:
> ----------------------------------------------------------------------
> package ecfp;
> import java.io.*;
> import com.opencsv.CSVReader;
> import com.opencsv.CSVReaderBuilder;
> import com.opencsv.CSVWriter;
> import com.opencsv.exceptions.CsvException;
> import java.util.Arrays;
> import java.util.List;
> import java.io.FileInputStream;
> import java.io.IOException;
>
> import org.openscience.cdk.exception.CDKException;
> import org.openscience.cdk.fingerprint.CircularFingerprinter;
> import org.openscience.cdk.fingerprint.ExtendedFingerprinter;
> import org.openscience.cdk.fingerprint.ICountFingerprint;
> import org.openscience.cdk.interfaces.IAtomContainer;
> import org.openscience.cdk.interfaces.IChemObjectBuilder;
> import org.openscience.cdk.io.MDLV2000Reader;
> import org.openscience.cdk.silent.SilentChemObjectBuilder;
>
> public class main{
>     public static void main(String[] args) throws CDKException, IOException {
>         String filename = "C:\\Users\\NGWO0001\\Downloads\\CHEMBL545.sdf.txt";
>         FileInputStream in = new FileInputStream(filename);
>         MDLV2000Reader reader = new MDLV2000Reader(in);
>         IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
>         IAtomContainer mol = reader.read(bldr.newAtomContainer());
>
>         CircularFingerprinter fingerprinter0 = new CircularFingerprinter(
>                 CircularFingerprinter.CLASS_FCFP4
>         );
>
>         System.out.println("FCFP4 Ethanol:");
>         ICountFingerprint result0 = fingerprinter0.getCountFingerprint(mol);
>         for (int k=0, n = result0.numOfPopulatedbins(); k < n; ++k) {
>             String ans4 = "";
>             ans4 += result0.getHash(k);
>             ans4 += " " + result0.getCount(k);
>             System.out.printf("%s\n",ans4);
>         }
>
>         reader.close();
>     }
> }
> ----------------------------------------------------------------------
>
> The results I got were:
>
> FCFP4 Butyramide:
> -1393198889 1
> -1212393386 1
> -1131767167 2
> 0 4
> 2 1
> 3 1
> 425233353 1
> 785469695 1
> 824716024 1
> 994111779 1
> 1429107614 1
>
> FCFP6 Butyramide:
> -1393198889 1
> -1212393386 1
> -1131767167 2
> 0 4
> 2 1
> 3 1
> 425233353 1
> 785469695 1
> 824716024 1
> 994111779 1
> 1429107614 1
>
> FCFP4 Ethanol:
> -1212393386 1
> 0 2
> 3 1
> 629394235 1
> 824716024 1
>
> FCFP6 Ethanol:
> -1212393386 1
> 0 2
> 3 1
> 629394235 1
> 824716024 1
>
> I think these results may not be right, since I thought that fingerprints are
> supposed to be a series of hashes and so ought to be a series of
> fixed-length integers. However, as you can see in the results I got, for
> example the FCFP6 for ethanol, one value is 10 digits long while others are
> single digits or 9 digits long.
>
> Can you please tell me what I am doing wrong?
>
> Thanking you in advance for your assistance and time.
>
> Best regards,
> Woon Yee
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
> --
> ----
> Super happy with this new eLife paper describing an Open Science project
> where we discuss 260 thousand natural products and where they came from, all
> 700 thousand pairs linked to their primary literature: "The LOTUS initiative
> for open knowledge management in natural products research",
> https://doi.org/10.7554/elife.70780
>
> -----
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Twitter/Mastodon: @egonwillighagen / @egonw
> Homepage: http://egonw.github.io/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: 0000-0001-7542-0286
> ImpactStory: https://impactstory.org/u/egonwillighagen
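To illustrate the encoding point made in this thread, here is a minimal sketch of rendering the 32-bit hash values as fixed-width strings. It is not part of CDK: the class name HashHex and both helper methods are hypothetical, and only the standard String.format and Integer.toUnsignedLong calls are used. The sample inputs are the ethanol FCFP4 hashes quoted in the thread.

```java
// Illustrative helper, not part of CDK; class and method names are hypothetical.
final class HashHex {

    // Render a 32-bit hash as exactly 8 lowercase hex digits, zero-padded on the left.
    // For an int argument, %x prints the unsigned two's-complement value,
    // so negative hashes do not get a minus sign.
    static String toFixedHex(int hash) {
        return String.format("%08x", hash);
    }

    // Alternative encoding: treat the hash as unsigned and zero-pad to 10 decimal
    // digits (the largest unsigned 32-bit value, 4294967295, has 10 digits).
    static String toFixedUnsignedDecimal(int hash) {
        return String.format("%010d", Integer.toUnsignedLong(hash));
    }

    public static void main(String[] args) {
        // Ethanol FCFP4 hashes as printed by getHash(k) in the thread.
        int[] hashes = {-1212393386, 0, 3, 629394235, 824716024};
        for (int h : hashes) {
            System.out.println(toFixedHex(h) + "  " + toFixedUnsignedDecimal(h));
        }
        // First column printed: b7bc5856, 00000000, 00000003, 2583cb3b, 31282af8
    }
}
```

Either form gives every hash the same width; the underlying integers, and therefore the fingerprint, are unchanged. Using String.format("%08x", ...) in place of Integer.toHexString(...) avoids a separate padding step.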