Do you have a mol2 file or a SMILES file? It's not clear. Mol2 support
isn't great in the CDK, mainly because it's more of a compchem/modelling format
than a cheminformatics one; cheminformatics primarily uses SMILES or MOLfile.
Presuming you know how to read a file line by line, here is an example
starting from SMILES:
> IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
> // load from SMILES and compute the ECFP (circular) fingerprint
> IFingerprinter fpr = new CircularFingerprinter();
> SmilesParser smipar = new SmilesParser(bldr);
> List<String> smiles = Arrays.asList("Clc1ccccc1",
>                                     "Fc1ccccc1",
>                                     "Ic1ccccc1",
>                                     "Clc1ncccc1");
> List<BitSet> fps = new ArrayList<>();
> for (String smi : smiles) {
>     IAtomContainer mol = smipar.parseSmiles(smi);
>     fps.add(fpr.getBitFingerprint(mol).asBitSet());
> }
> // print N^2 comparison table
> for (int j = 0; j < fps.size(); j++)
>     System.out.print("," + smiles.get(j));
> System.out.print('\n');
> for (int i = 0; i < fps.size(); i++) {
>     System.out.print(smiles.get(i));
>     for (int j = 0; j < fps.size(); j++) {
>         System.out.printf(",%.3f", Tanimoto.calculate(fps.get(i),
>                                                       fps.get(j)));
>     }
>     System.out.print('\n');
> }
,Clc1ccccc1,Fc1ccccc1,Ic1ccccc1,Clc1ncccc1
Clc1ccccc1,1.000,0.368,0.368,0.292
Fc1ccccc1,0.368,1.000,0.368,0.192
Ic1ccccc1,0.368,0.368,1.000,0.192
Clc1ncccc1,0.292,0.192,0.192,1.000
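
For reference, each number in the table is the count of bits set in both
fingerprints divided by the count of bits set in either. A minimal sketch of
that calculation on plain java.util.BitSet follows. The class name
TanimotoSketch and the choice to return 0.0 for two empty fingerprints are
assumptions made for illustration, not CDK behaviour; in real code use
Tanimoto.calculate as in the example above.

```java
import java.util.BitSet;
import java.util.Locale;

public class TanimotoSketch {

    /** Tanimoto coefficient: bits set in both / bits set in either. */
    static double tanimoto(BitSet a, BitSet b) {
        // clone first because BitSet.and/or modify the receiver in place
        BitSet both = (BitSet) a.clone();
        both.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        int union = either.cardinality();
        // assumption: two empty fingerprints are scored 0.0 to avoid 0/0
        return union == 0 ? 0.0 : (double) both.cardinality() / union;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(1); a.set(3); a.set(5);
        BitSet b = new BitSet();
        b.set(3); b.set(5); b.set(8); b.set(9);
        // 2 shared bits / 5 distinct bits, prints 0.400
        System.out.printf(Locale.ROOT, "%.3f%n", tanimoto(a, b));
    }
}
```

The matrix is symmetric and the diagonal is always 1.000, so only the pairs
with i < j actually need to be computed.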
There are far more efficient ways of doing this, and for a large comparison
table you should use ChemFP: https://chemfp.com/.
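
In case the line-by-line reading step is the sticking point, here is a rough
sketch using only the JDK. It assumes the file has one SMILES per line,
optionally followed by whitespace and a title; I have not seen your file, so
that layout is a guess. The names SmilesLines and readSmiles are made up for
this example. Each returned string can then be passed to
smipar.parseSmiles(...) as in the loop above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SmilesLines {

    /** Returns the first whitespace-separated field of every non-blank line. */
    static List<String> readSmiles(Path path) throws IOException {
        List<String> smiles = new ArrayList<>();
        try (BufferedReader rdr = Files.newBufferedReader(path)) {
            String line;
            while ((line = rdr.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty())
                    continue; // skip blank lines
                // assumption: anything after the first whitespace is a title
                smiles.add(trimmed.split("\\s+", 2)[0]);
            }
        }
        return smiles;
    }

    public static void main(String[] args) throws IOException {
        // hypothetical demo input written to a temp file
        Path tmp = Files.createTempFile("demo", ".smi");
        Files.write(tmp, Arrays.asList("Clc1ccccc1 chlorobenzene",
                                       "",
                                       "Fc1ccccc1"));
        System.out.println(readSmiles(tmp)); // [Clc1ccccc1, Fc1ccccc1]
        Files.delete(tmp);
    }
}
```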
On Wed, 24 Mar 2021 at 06:42, Stesycki, Manuel <[email protected]>
wrote:
> Good morning,
>
> Use this class for Tanimoto calculations:
> org.openscience.cdk.similarity.Tanimoto (see doc:
> http://cdk.github.io/cdk/latest/docs/api/index.html)
>
> You could do something like this to calculate your Tanimoto score:
>
> public static double cdkCalculateTanimotoCoef(IFingerprinter fingerprinter,
>         IAtomContainer acReference, IAtomContainer acStructure) {
>     double ret = 0.0;
>     try {
>         IBitFingerprint fpReference = fingerprinter.getBitFingerprint(acReference);
>         // Tanimoto score
>         IBitFingerprint fpStructure = fingerprinter.getBitFingerprint(acStructure);
>         ret = Tanimoto.calculate(fpReference, fpStructure);
>     } catch (Exception ex) {
>         //...
>     }
>     return ret;
> }
>
>
>
> Best regards,
> Manuel Stesycki
>
> IT
> 0208 / 306-2146
> Physikbau, Büro 117
> [email protected]
>
> Max-Planck-Institut für Kohlenforschung
> Kaiser-Wilhelm-Platz 1
> D-45470 Mülheim an der Ruhr
> http://www.kofo.mpg.de/de
>
> On 24.03.2021 at 04:55, Sub Jae Shin <[email protected]> wrote:
>
> To CDK developers.
>
> Hello, I'm trying to compute drug-drug similarity using the Tanimoto score.
>
> I'm new to CDK and Java, so I'm stuck on turning a SMILES file into the
> arguments that the Tanimoto calculate method expects.
>
> package com.company;
> import org.openscience.cdk.ChemFile;
> import org.openscience.cdk.exception.CDKException;
> import org.openscience.cdk.interfaces.IChemFile;
> import org.openscience.cdk.io.SMILESReader;
> import java.io.*;
>
> public class Main {
>
>     public static void main(String[] args) {
>         try {
>             InputStream mol2DataStream = new
>                 FileInputStream("../data/drugbank/structure.smiles");
>             SMILESReader reader = new SMILESReader(mol2DataStream);
>             IChemFile file = reader.read(new ChemFile());
>         } catch (FileNotFoundException | CDKException e) {
>             System.out.println(e.getMessage());
>         }
>     }
> }
>
> Sincerely
> Seopjae Shin.
>
>
> _______________________________________________
> Cdk-user mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/cdk-user