Dear Thomas,

You can get the SMILES of the substructures extracted by the `GetMorganFingerprint` function as follows. You can then append any label to the SMILES string, but not real numbers.
```python
from rdkit import Chem
from rdkit.Chem import AllChem  # required for GetMorganFingerprint

mol = Chem.MolFromSmiles('Cc1ncccc1')
info = {}
AllChem.GetMorganFingerprint(mol, radius=2, bitInfo=info)
# each bitInfo value is a tuple of (atom_id, radius) pairs; reverse the first pair
radius, atom_id = list(info.values())[0][0][::-1]
env = Chem.FindAtomEnvironmentOfRadiusN(mol, radius, atom_id)
sub_struct = Chem.PathToSubmol(mol, env)
type(sub_struct)  #=> rdkit.Chem.rdchem.Mol
Chem.MolToSmiles(sub_struct)  #=> 'ccc'
```

Best,

On Fri, 22 Nov 2019 at 23:40, Thomas Evangelidis <teva...@gmail.com> wrote:

> Greetings,
>
> Could someone please clarify how I can pass atomic partial charges to the
> ECFP fingerprint generator along with the default atomic properties that it
> considers? Can I pass the real charge values, or do I have to group them
> into bins and pass the bin identifier? I found a function in the utilsFP.py
> file which generates invariants as follows:
>
> def generateAtomInvariant(mol):
>     """
>     >>> generateAtomInvariant(Chem.MolFromSmiles("Cc1ncccc1"))
>     [341294046, 3184205312, 522345510, 1545984525, 1545984525, 1545984525, 1545984525]
>     """
>     num_atoms = mol.GetNumAtoms()
>     invariants = [0]*num_atoms
>     for i,a in enumerate(mol.GetAtoms()):
>         descriptors=[]
>         descriptors.append(a.GetAtomicNum())
>         descriptors.append(a.GetTotalDegree())
>         descriptors.append(a.GetTotalNumHs())
>         descriptors.append(a.IsInRing())
>         descriptors.append(a.GetIsAromatic())
>         invariants[i]=hash(tuple(descriptors))& 0xffffffff
>     return invariants
>
> And then generate the fingerprint like this:
>
> fp = AllChem.GetMorganFingerprint(mol, radius=3, invariants=generateAtomInvariant(mol))
>
> Would it suffice to add just this extra line to the generateAtomInvariant() function?
>
> descriptors.append(a.GetFormalCharge())
>
> I thank you in advance.
> Thomas
>
> --
> ======================================================================
> Dr. Thomas Evangelidis
> Research Scientist
>
> IOCB - Institute of Organic Chemistry and Biochemistry of the Czech
> Academy of Sciences <https://www.uochb.cz/web/structure/31.html?lang=en>,
> Prague, Czech Republic
> &
> CEITEC - Central European Institute of Technology <https://www.ceitec.eu/>,
> Brno, Czech Republic
>
> email: teva...@gmail.com, Twitter: tevangelidis
> <https://twitter.com/tevangelidis>, LinkedIn: Thomas Evangelidis
> <https://www.linkedin.com/in/thomas-evangelidis-495b45125/>
>
> website: https://sites.google.com/site/thomasevangelidishomepage/
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

--
----
The University of Tokyo
2nd year Ph.D. candidate
Shojiro Shibayama
----
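For the binning idea raised in the quoted question, here is a minimal sketch rather than a definitive recipe. It assumes Gasteiger charges (via `AllChem.ComputeGasteigerCharges`) as the source of partial charges and an arbitrary bin width of 0.1; the helper names `charge_bin` and `generate_atom_invariants_with_charge` are hypothetical and not part of RDKit or utilsFP.py. The real-valued charge is turned into an integer bin index before hashing, so that small numerical differences do not produce different invariants.

```python
import math

from rdkit import Chem
from rdkit.Chem import AllChem


def charge_bin(charge, width=0.1):
    """Map a real-valued charge to an integer bin index (hypothetical helper)."""
    if math.isnan(charge) or math.isinf(charge):
        # Gasteiger can fail for some atoms; put those in one sentinel bin.
        return 9999
    return int(round(charge / width))


def generate_atom_invariants_with_charge(mol, width=0.1):
    """Same descriptors as the quoted generateAtomInvariant, plus a charge bin."""
    # Assumption: Gasteiger charges stand in for "atomic partial charges".
    AllChem.ComputeGasteigerCharges(mol)
    invariants = []
    for atom in mol.GetAtoms():
        q = atom.GetDoubleProp('_GasteigerCharge')
        descriptors = (
            atom.GetAtomicNum(),
            atom.GetTotalDegree(),
            atom.GetTotalNumHs(),
            atom.IsInRing(),
            atom.GetIsAromatic(),
            charge_bin(q, width),  # integer label, not the raw float
        )
        # Keep the hash within the unsigned 32-bit range, as in the quoted code.
        invariants.append(hash(descriptors) & 0xffffffff)
    return invariants


mol = Chem.MolFromSmiles('Cc1ncccc1')
invariants = generate_atom_invariants_with_charge(mol)
fp = AllChem.GetMorganFingerprint(mol, 3, invariants=invariants)
```

The bin width controls how coarsely charges are grouped: a wider bin makes atoms with similar charges share an invariant, while a narrower one separates them, so it is worth trying a few values on your own data.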