Thanks for the tip!

On Sun, Mar 28, 2021 at 8:41 PM John Mayfield <john.wilkinson...@gmail.com>
wrote:

> You should use the *CircularFingerprinter* for similarity.
>
> On Sun, 28 Mar 2021 at 08:39, Sub Jae Shin <cnb.mons...@gmail.com> wrote:
>
>> To John Mayfield
>>
>> Hi, I found the DrugBank ID among the properties returned by the
>> AtomContainer's getProperties method (under the "DATABASE_ID" key), so I
>> can now tell which atom container corresponds to which drug.
>>
>> I think this achieves my goal of computing drug-drug similarity.
>>
>> package com.company;
>> import org.openscience.cdk.ChemFile;
>> import org.openscience.cdk.exception.CDKException;
>> import org.openscience.cdk.fingerprint.Fingerprinter;
>> import org.openscience.cdk.fingerprint.IBitFingerprint;
>> import org.openscience.cdk.fingerprint.IFingerprinter;
>> import org.openscience.cdk.interfaces.IAtomContainer;
>> import org.openscience.cdk.interfaces.IChemFile;
>> import org.openscience.cdk.io.MDLV2000Reader;
>> import org.openscience.cdk.similarity.Tanimoto;
>> import org.openscience.cdk.tools.manipulator.ChemFileManipulator;
>>
>> import java.io.*;
>> import java.util.ArrayList;
>> import java.util.List;
>> import java.util.Map;
>>
>> public class Main {
>>
>>     public static void main(String[] args) {
>>         try {
>>
>>             InputStream structures =
>>                     new FileInputStream("../data/drugbank/structures.sdf");
>>             MDLV2000Reader reader = new MDLV2000Reader(structures);
>>             IChemFile file = reader.read(new ChemFile());
>>             // Each record's DrugBank ID is read below from its "DATABASE_ID" property.
>>
>>             Fingerprinter finger = new Fingerprinter();
>>             List<IAtomContainer> AtomData = 
>> ChemFileManipulator.getAllAtomContainers(file);
>>             int count = AtomData.size();
>>             ArrayList<ArrayList> df = new ArrayList<>();
>>
>>             for(int i = 0; i < count; ++i) {
>>                 ArrayList<Object> list = new ArrayList<>();
>>                 IAtomContainer acReference = AtomData.get(i);
>>                 Map refProperties = acReference.getProperties();
>>                 list.add(refProperties.get("DATABASE_ID"));
>>                 for(int j = 0; j < count; ++j) {
>>                     IAtomContainer acStructure = AtomData.get(j);
>>                     Map structProperties = acStructure.getProperties();
>>                     System.out.println("REF DATABASE_ID : " +
>>                             refProperties.get("DATABASE_ID") +
>>                             " - COMP DATABASE_ID : " +
>>                             structProperties.get("DATABASE_ID") +
>>                             " similarity is now calculating....");
>>                     double similarity = cdkCalculateTanimotoCoef(finger, 
>> acReference, acStructure);
>>                     list.add(similarity);
>>                 }
>>                 df.add(list);
>>             }
>>             // try-with-resources closes the writer even if a write fails
>>             try (FileWriter resultCsv =
>>                     new FileWriter("../data/drugbank/drug_drug_sim.csv")) {
>>                 for (ArrayList a : df) {
>>                     StringBuilder row = new StringBuilder();
>>                     for (int i = 0; i < a.size(); ++i) {
>>                         if (i > 0) {
>>                             row.append(',');
>>                         }
>>                         // String.valueOf tolerates a missing (null) DATABASE_ID
>>                         row.append(String.valueOf(a.get(i)));
>>                     }
>>                     row.append('\n');
>>                     resultCsv.write(row.toString());
>>                 }
>>             }
>>
>>             //System.out.println(acReference.toString());
>>
>>
>>         } catch (FileNotFoundException | CDKException e) {
>>             System.out.println(e.getMessage());
>>         } catch (IOException e) {
>>             e.printStackTrace();
>>         }
>>     }
>>
>>     public static double cdkCalculateTanimotoCoef(IFingerprinter 
>> fingerprinter, IAtomContainer acReference, IAtomContainer acStructure ) {
>>
>>         double ret = 0.0;
>>
>>         try {
>>
>>             IBitFingerprint fpReference = 
>> fingerprinter.getBitFingerprint(acReference);
>>
>>             //Tanimoto-score
>>             IBitFingerprint fpStructure = 
>> fingerprinter.getBitFingerprint(acStructure);
>>             ret = Tanimoto.calculate(fpReference, fpStructure);
>>
>>         } catch (Exception ex) {
>>             //...
>>         }
>>
>>         return ret;
>>     }
>> }
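
A note on the loop above: cdkCalculateTanimotoCoef fingerprints both molecules on every call, so each structure is fingerprinted again for every pair it takes part in. A rough sketch of computing each fingerprint once and reusing it follows. The class name FingerprintOnce, the method precompute, and the toy character-based "fingerprint" are all made up for illustration; with CDK the function passed in would presumably wrap something like fingerprinter.getBitFingerprint(mol).asBitSet().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.function.Function;

public class FingerprintOnce {

    /** Applies the (possibly expensive) fingerprint function once per item. */
    public static <T> List<BitSet> precompute(List<T> items,
                                              Function<T, BitSet> fingerprint) {
        List<BitSet> fps = new ArrayList<>(items.size());
        for (T item : items) {
            fps.add(fingerprint.apply(item));
        }
        return fps;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Toy stand-in for a real fingerprinter: one bit per distinct character.
        Function<String, BitSet> toy = s -> {
            calls[0]++;
            BitSet bits = new BitSet();
            s.chars().forEach(c -> bits.set(c));
            return bits;
        };
        List<String> items = Arrays.asList("Clc1ccccc1", "Fc1ccccc1", "Ic1ccccc1");
        List<BitSet> fps = precompute(items, toy);

        // The pairwise loop now only reads cached fingerprints.
        int pairs = 0;
        for (BitSet a : fps) {
            for (BitSet b : fps) {
                pairs++; // a similarity call on (a, b) would go here
            }
        }
        System.out.println(calls[0] + " fingerprint calls for " + pairs + " comparisons");
    }
}
```

With three items this prints "3 fingerprint calls for 9 comparisons"; the original loop would have made 18 fingerprint calls for the same table.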
>>
>>
>> I hope the output of this code matches my goal.
>>
>> Thank you all, as always, CDK developers.
>>
>> Sincerely
>> Seopjae Shin
>>
>>
>> On Fri, Mar 26, 2021 at 6:36 PM John Mayfield <
>> john.wilkinson...@gmail.com> wrote:
>>
>>> Do you have a mol2 file or a SMILES file? It's not clear. Mol2 support
>>> isn't great in the CDK, mainly because it's more a compchem/modelling format
>>> than a cheminformatics one; cheminformatics primarily uses SMILES or MOLfile.
>>>
>>> Presuming you know how to read line by line from a file, here is an example
>>> from SMILES:
>>>
>>>> IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
>>>> // load from SMILES and compute the ECFP (circular) fingerprint
>>>> IFingerprinter fpr = new CircularFingerprinter();
>>>> SmilesParser smipar = new SmilesParser(bldr);
>>>> List<String> smiles = Arrays.asList("Clc1ccccc1",
>>>>         "Fc1ccccc1",
>>>>         "Ic1ccccc1",
>>>>         "Clc1ncccc1");
>>>> List<BitSet> fps = new ArrayList<>();
>>>> for (String smi : smiles) {
>>>>     IAtomContainer mol = smipar.parseSmiles(smi);
>>>>     fps.add(fpr.getBitFingerprint(mol).asBitSet());
>>>> }
>>>> // print N^2 comparison table
>>>> for (int j = 0; j < fps.size(); j++)
>>>>     System.out.print("," + smiles.get(j));
>>>> System.out.print('\n');
>>>> for (int i = 0; i < fps.size(); i++) {
>>>>     System.out.print(smiles.get(i));
>>>>     for (int j = 0; j < fps.size(); j++) {
>>>>         System.out.printf(",%.3f", Tanimoto.calculate(fps.get(i),
>>>> fps.get(j)));
>>>>     }
>>>>     System.out.print('\n');
>>>> }
>>>
>>>
>>> ,Clc1ccccc1,Fc1ccccc1,Ic1ccccc1,Clc1ncccc1
>>> Clc1ccccc1,1.000,0.368,0.368,0.292
>>> Fc1ccccc1,0.368,1.000,0.368,0.192
>>> Ic1ccccc1,0.368,0.368,1.000,0.192
>>> Clc1ncccc1,0.292,0.192,0.192,1.000
>>>
>>> There are far more efficient ways of doing this, and for a large comparison
>>> table use ChemFP: https://chemfp.com/.
>>>
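
One of the simpler savings, as far as I understand it: Tanimoto is symmetric, so each pair only needs to be scored once and the result mirrored across the diagonal. A minimal sketch under that assumption follows. The names SymmetricSimilarity and matrix are hypothetical, the tanimoto helper is a plain java.util.BitSet stand-in rather than CDK's Tanimoto class, and returning 0.0 for two empty fingerprints is my own choice.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class SymmetricSimilarity {

    // Stand-in for a library call: shared bits divided by bits set in either.
    static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        int common = both.cardinality();
        int union = a.cardinality() + b.cardinality() - common;
        return union == 0 ? 0.0 : (double) common / union;
    }

    /** Fills an N x N table with N*(N+1)/2 comparisons instead of N*N. */
    public static double[][] matrix(List<BitSet> fps) {
        int n = fps.size();
        double[][] sim = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double s = tanimoto(fps.get(i), fps.get(j));
                sim[i][j] = s;
                sim[j][i] = s; // mirror into the lower triangle
            }
        }
        return sim;
    }

    public static void main(String[] args) {
        List<BitSet> fps = new ArrayList<>();
        fps.add(BitSet.valueOf(new long[] {0b0111L})); // bits 0, 1, 2
        fps.add(BitSet.valueOf(new long[] {0b1110L})); // bits 1, 2, 3
        fps.add(BitSet.valueOf(new long[] {0b1000L})); // bit 3
        for (double[] row : matrix(fps)) {
            StringBuilder line = new StringBuilder();
            for (double v : row) {
                line.append(String.format("%.3f ", v));
            }
            System.out.println(line.toString().trim());
        }
    }
}
```

This roughly halves the work but is still quadratic, so it does not replace a dedicated tool for large tables.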
>>> On Wed, 24 Mar 2021 at 06:42, Stesycki, Manuel <
>>> stesy...@mpi-muelheim.mpg.de> wrote:
>>>
>>>> Good morning,
>>>>
>>>> Use this class for Tanimoto calculations:
>>>>  org.openscience.cdk.similarity.Tanimoto (see doc:
>>>> http://cdk.github.io/cdk/latest/docs/api/index.html)
>>>>
>>>> You could do something like this to calculate your Tanimoto score:
>>>>
>>>> public static double cdkCalculateTanimotoCoef(IFingerprinter
>>>> fingerprinter, IAtomContainer acReference, IAtomContainer acStructure ) {
>>>>
>>>>         double ret = 0.0;
>>>>
>>>>         try {
>>>>
>>>>             IBitFingerprint fpReference =
>>>> fingerprinter.getBitFingerprint(acReference);
>>>>
>>>>             //Tanimoto-score
>>>>             IBitFingerprint fpStructure =
>>>> fingerprinter.getBitFingerprint(acStructure);
>>>>             ret = Tanimoto.calculate(fpReference, fpStructure);
>>>>
>>>>         } catch (Exception ex) {
>>>>             //...
>>>>         }
>>>>
>>>>         return ret;
>>>>     }
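
For anyone unsure what the score itself means: for two bit fingerprints, the Tanimoto coefficient is the number of bits set in both divided by the number of bits set in either. A small sketch using only java.util.BitSet follows. It is not CDK's code; the class name TanimotoSketch is made up, and returning 0.0 when both fingerprints are empty is an assumption of this sketch.

```java
import java.util.BitSet;

public class TanimotoSketch {

    /** Tanimoto coefficient: |A and B| / |A or B| over the set bits. */
    public static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone(); // clone so the inputs are not modified
        both.and(b);
        int common = both.cardinality();
        int union = a.cardinality() + b.cardinality() - common;
        // Assumption: two empty fingerprints score 0.0 here.
        return union == 0 ? 0.0 : (double) common / union;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(0);
        a.set(1);
        a.set(2);
        BitSet b = new BitSet();
        b.set(1);
        b.set(2);
        b.set(3);
        // 2 shared bits out of 4 distinct bits
        System.out.println(tanimoto(a, b)); // prints 0.5
    }
}
```

Note that the method quoted above returns 0.0 when an exception is swallowed, which cannot be told apart from a genuine score of zero in the output.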
>>>>
>>>>
>>>>
>>>> Best regards,
>>>>    Manuel Stesycki
>>>>
>>>> IT
>>>>    0208 / 306-2146
>>>>    Physics building, office 117
>>>>    stesy...@mpi-muelheim.mpg.de
>>>>
>>>> Max-Planck-Institut für Kohlenforschung
>>>>    Kaiser-Wilhelm-Platz 1
>>>>    D-45470 Mülheim an der Ruhr
>>>>    http://www.kofo.mpg.de/de
>>>>
>>>> On 24.03.2021 at 04:55, Sub Jae Shin <cnb.mons...@gmail.com> wrote:
>>>>
>>>> To CDK developers.
>>>>
>>>> Hello, I'm trying to compute drug-drug similarity using the Tanimoto score.
>>>>
>>>> I'm a beginner with CDK and Java, and I'm stuck on turning a SMILES file
>>>> into the arguments that the Tanimoto class's calculate method expects.
>>>>
>>>> package com.company;
>>>> import org.openscience.cdk.ChemFile;
>>>> import org.openscience.cdk.exception.CDKException;
>>>> import org.openscience.cdk.interfaces.IChemFile;
>>>> import org.openscience.cdk.io.SMILESReader;
>>>> import java.io.*;
>>>>
>>>> public class Main {
>>>>
>>>>     public static void main(String[] args) {
>>>>         try {
>>>>
>>>>             InputStream mol2DataStream = new 
>>>> FileInputStream("../data/drugbank/structure.smiles");
>>>>             SMILESReader reader = new SMILESReader(mol2DataStream);
>>>>             IChemFile file = reader.read(new ChemFile());
>>>>
>>>>         } catch (FileNotFoundException | CDKException e) {
>>>>             System.out.println(e.getMessage());
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> Sincerely
>>>> Seopjae Shin.
>>>>
>>>>
>>>> _______________________________________________
>>>> Cdk-user mailing list
>>>> Cdk-user@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>>
>>>>
>>>