Hi Mike,

I put together a gist that might help:
https://gist.github.com/ptosco/7bbad9e6441724e9638bc4093f48e31b

This is basically a modification of the MACCSkeys._pyGenMACCSKeys() RDKit Python function, combined with a function I wrote some time ago to count non-overlapping matches in a molecule (largely untested).
Please do check that the results make sense - I haven't tested this code on any molecule other than the acetone/benzene mixture in the example.

Cheers,
p.

On Tue, Sep 8, 2020 at 10:18 PM Mike Mazanetz <mi...@novadatasolutions.co.uk> wrote:

> Hi,
>
> On second thoughts… The KNIME node does a lot of double counting for the
> RDKit Substructure Counter, so it’s not a useful tool for counting MACCS
> keys.
>
> Anyone got any better ideas?
>
> Cheers,
> mike
>
> *From:* Mike Mazanetz <mi...@novadatasolutions.co.uk>
> *Sent:* 08 September 2020 18:42
> *To:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] MACCS keys
>
> Hi folks,
>
> I found that I can always use the KNIME nodes to count these, so no need
> to reply.
>
> Best,
> mike
>
> *From:* Mike Mazanetz <mi...@novadatasolutions.co.uk>
> *Sent:* 08 September 2020 13:30
> *To:* rdkit-discuss@lists.sourceforge.net
> *Subject:* [Rdkit-discuss] MACCS keys
>
> Hello Forum,
>
> Does anyone know whether it’s possible to obtain not just the fingerprint
> keys for MACCS (binary values) but the number of occurrences of the keys,
> particularly these details:
>
> Thanks,
> mike
>
> 1: #isotopes
> 2: #atoms with atomic number > 103
> 3: #group IVA, VA and VIA periods 4-6
> 4: #Actinides
> 5: #group IIIB, IVB elements
> 6: #Lanthanides
> 7: #group VB, VIB, VIIB elements
> 8: #heteroatoms in 4-membered rings
> 9: #group VIIIB elements
> 10: #alkaline earth elements
> 11: #atoms in 4 ring
> 12: #group IB, IIB elements
> 13: #N connected to 1 O and 2 C
> 14: #S atoms in S-S groups
> 15: #C connected to 3 O
> 16: #heteroatoms in 3-membered rings
> 17: #C in CC triple bonds
> 18: #group IIIA elements
> 19: #atoms in 7 ring
> 20: #silicon atoms
> 21: #C = bonded to C and 3 heavy atoms
> 22: #atoms in 3 ring
> 23: #C bonded 1 N and 2 O
> 24: #O-N single bonds
> 25: #C bonded to at least 3 N atoms
> 26: #C in 3 ring bonds and a double bond
> 27: #iodine atoms
> 28: #XCH2X, where X<>C
> 29: #phosphorus atoms
> 30: #non-C Q4 bonded to >= 3 C
> 31: #halogens connected to non carbons
> 32: #S bonded to an N and a C
> 33: #S atoms bonded to N
> 34: #CH2= units
> 35: #alkali (group IA) elements
> 36: #S atoms in rings
> 37: #C bonded to >= 1 O & >=2 N
> 38: #C bonded >= 2 N and 1 C
> 39: #S atoms bonded to 3 O
> 40: #S single bonded to OQ2
> 41: #N in C#N
> 42: #fluorine atoms
> 43: #X-H heteroatoms 2 bonds from another
> 44: #other elements
> 45: #N atoms adjacent to -C=C
> 46: #bromine atoms
> 47: #S two bonds from an N
> 48: #non C bonded to >= 3 O
> 49: #charged atoms
> 50: #C in C=C bonded to >= 3 C
> 51: #S bonded to a C and an O
> 52: #N bonded to N
> 53: #QH 4 bonds from another QH
> 54: #QH 3 bonds from another QH
> 55: #S bonded to >=2 O
> 56: #N bonded to >= 2O and >= 1 C
> 57: #O in rings
> 58: #S bonded to >=2 non-carbon atoms
> 59: #non-aromatic S-[a]
> 60: #[S+]-[O-]
> 61: #SQ3
> 62: #non-ring bonds that connect rings
> 63: #N atoms in double bonds with O
> 64: #non-ring S attached to a ring
> 65: #N in aromatic bonds with C
> 66: #CX4 bonded to >=3 carbons
> 67: #S attached to heteroatoms
> 68: #QH bonded to another QH
> 69: #QH bonded to another Q
> 70: #N bonded to two non-C heavy atoms
> 71: #N bonded to O
> 72: #O separated by 3 bonds
> 73: #S in double/charge separated bonds
> 74: #dimethyl substituted atoms
> 75: #N non-ring bonded to a ring
> 76: #C in C=C bonded to >= 3 heavy atoms
> 77: #N separated by 2 bonds
> 78: #N double bonded to C
> 79: #N separated by 3 bonds
> 80: #N separated by 4 bonds
> 81: #S attached to Q >= 3 atoms
> 82: #heteroatoms attached to a CH2
> 83: #heteroatoms in 5 ring
> 84: #NH2 groups
> 85: #N bonded to >= 3 C
> 86: #CH2 or CH3 separated by non-C
> 87: #halogens bonded to any ring
> 88: #sulfurs
> 89: #O separated by 4 bonds
> 90: #het. 3 bonds from a CH2
> 91: #het. 4 bonds from a CH2
> 92: #C bonded to >=1 N, >=1 C & >= 1 O
> 93: #methylated heteroatoms
> 94: #N bonded to non C
> 95: #O 3 bonds from an N
> 96: #atoms in 5-rings
> 97: #O 4 bonds from an N
> 98: #het. in 6-ring
> 99: #C in C=C
> 100: #N attached to CH2
> 101: #atoms in 8-ring or higher
> 102: #O bonded to non C heavy atoms
> 103: #chlorine atoms
> 104: #hets. 2 bonds from a CH2
> 105: #hets. ring bonded to a 3-ring bond X
> 106: #X bonded to >= 3 non-C
> 107: #XQ>3 bonded to at least 1 halogen
> 108: #CH3 4 bonds from a CH2
> 109: #O attached to CH2
> 110: #O 1 C from an N
> 111: #N 2 bonds from a CH2
> 112: #atoms with coordination number >= 4
> 113: #O in non-aromatic bonds to an [a]
> 114: #CH3 attached to CH2
> 115: #CH3 2 bonds from a CH2
> 116: #CH3 3 bonds from a CH2
> 117: #N 2 bonds from an O
> 118: (key(147)-1 if key(147)>1; else 0)
> 119: #N in double bonds
> 120: (key(137)-1 if key(137)>1; else 0)
> 121: #N in rings
> 122: #N with coordination number >=3
> 123: #O separated by 1 C
> 124: #het-het bonds
> 125: Is # AROMATIC RING > 1?
> 126: #non-ring O bonded to 2 heavy atoms
> 127: (key(143)-1 if key(143)>1; else 0)
> 128: #CH2s separated by 4 bonds
> 129: #CH2s separated by 3 bonds
> 130: (key(124)-1 if key(124)>1; else 0)
> 131: (# het atoms with H)
> 132: #O 2 bonds from CH2
> 133: #N non-ring bonded to a ring
> 134: #halogens
> 135: #N in a non-aromatic bond with [a]
> 136: Bit: is there more than 1 O=
> 137: Total # ring HETEROCYCLE atoms
> 138: (key(153)-1 if key(153)>1; else 0)
> 139: #OH groups
> 140: (key(164)-3 if key(164)>3; else 0)
> 141: (key(160)-2 if key(160)>2; else 0)
> 142: (key(161)-2 if key(161)>1; else 0)
> 143: #non ring O connected to a ring
> 144: #atoms separated by (!:):(!:)
> 145: #6M RING > 1
> 146: Key(164)-2 if key(164)>2; else 0
> 147: #CH2 attached to CH2
> 148: #non-C with coordination number >=3
> 149: (key(160)-1 if key(160)>1; else 0)
> 150: #X separated by (!r)-r-(!r)
> 151: #NH
> 152: #C bonded to >=2 C and 1 O
> 153: #non-carbons attached to CH2
> 154: #O in C=O
> 155: #non-ring CH2
> 156: #XN where coord. # of X>=3
> 157: #O in C-O single bonds
> 158: #N in C-N single bonds
> 159: Key(164)-1 if key(164)>1; else 0
> 160: #CH3 groups
> 161: #N
> 162: #aromatics
> 163: #atoms in 6 rings
> 164: #oxygens
> 165: #ring atoms
> 166: Is there more than 1 fragment?
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
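
The reply in this thread mentions counting non-overlapping matches of a pattern in a molecule. The following is only a rough sketch of that idea and is not the code from the gist: the function names count_non_overlapping and count_smarts are hypothetical, and the greedy selection is an assumption that may not give the largest possible set of non-overlapping matches. The first function works on plain tuples of atom indices, such as those returned by RDKit's Mol.GetSubstructMatches(); the second needs RDKit installed.

```python
def count_non_overlapping(matches):
    """Greedily count matches that share no atom index with an already accepted match.

    `matches` is an iterable of atom-index tuples. This is a greedy heuristic,
    so the result depends on the order in which the matches are given.
    """
    used = set()   # atom indices already claimed by an accepted match
    count = 0
    for match in matches:
        atoms = set(match)
        if used.isdisjoint(atoms):
            used |= atoms
            count += 1
    return count


def count_smarts(smiles, smarts):
    """Count non-overlapping occurrences of a SMARTS pattern (hypothetical helper, needs RDKit)."""
    from rdkit import Chem  # imported lazily so the helper above works without RDKit

    mol = Chem.MolFromSmiles(smiles)
    patt = Chem.MolFromSmarts(smarts)
    if mol is None or patt is None:
        raise ValueError("could not parse SMILES or SMARTS")
    return count_non_overlapping(mol.GetSubstructMatches(patt))


# Pure-Python check with hand-written atom-index tuples:
# (0, 1) is accepted, (1, 2) overlaps on atom 1, (2, 3) is accepted.
print(count_non_overlapping([(0, 1), (1, 2), (2, 3)]))  # 2
```

With RDKit available, a call such as count_smarts("CC(C)=O.c1ccccc1", "[CH3]") would apply the same counting to an acetone/benzene mixture. A per-key count would repeat this for each key's SMARTS pattern; keys without a SMARTS definition would need separate handling.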