Hi Mike,

I put together a gist that might help:

https://gist.github.com/ptosco/7bbad9e6441724e9638bc4093f48e31b

This is basically a modification of the MACCSkeys._pyGenMACCSKeys() RDKit
Python function, combined with a function I wrote some time ago to count
non-overlapping matches in a molecule (largely untested).
Please do check that the results make sense - I haven't tested this code on
any molecule other than the acetone/benzene mixture in the example.
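
To give an idea of what counting non-overlapping matches means, here is a
minimal sketch. It is not the code from the gist: count_non_overlapping is a
hypothetical helper name, and the match tuples are hard-coded to stand in for
the atom-index tuples you would get from mol.GetSubstructMatches(patt) in
RDKit. It is a simple greedy pass, so it depends on the order of the matches
and is not guaranteed to find the largest possible non-overlapping set.

```python
def count_non_overlapping(matches):
    """Greedily count matches that share no atom with an accepted match.

    `matches` is an iterable of atom-index tuples, e.g. the output of
    RDKit's mol.GetSubstructMatches(patt). Hypothetical helper, for
    illustration only.
    """
    used = set()   # atom indices already claimed by an accepted match
    count = 0
    for match in matches:
        # Accept the match only if none of its atoms has been used yet.
        if used.isdisjoint(match):
            used.update(match)
            count += 1
    return count


# Assumed example: the six aromatic bonds of a benzene ring, written as
# pairs of atom indices. Adjacent bonds share an atom, so only every
# other bond is accepted.
benzene_bond_matches = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
print(count_non_overlapping(benzene_bond_matches))
```

With overlapping matches allowed the ring would count six times; counting
only disjoint matches avoids that kind of double counting.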

Cheers,
p.

On Tue, Sep 8, 2020 at 10:18 PM Mike Mazanetz <mi...@novadatasolutions.co.uk>
wrote:

> Hi,
>
>
>
> On second thoughts… The KNIME node does a lot of double counting for the
> RDKit Substructure Counter, so it’s not a useful tool for counting MACCS
> keys.
>
>
>
> Anyone got any better ideas?
>
>
>
> Cheers,
>
> mike
>
>
>
> *From:* Mike Mazanetz <mi...@novadatasolutions.co.uk>
> *Sent:* 08 September 2020 18:42
> *To:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] MACCS keys
>
>
>
> Hi folks,
>
>
>
> I found that I can always use the KNIME nodes to count these, so no need
> to reply.
>
>
>
> Best,
>
> mike
>
>
>
> *From:* Mike Mazanetz <mi...@novadatasolutions.co.uk>
> *Sent:* 08 September 2020 13:30
> *To:* rdkit-discuss@lists.sourceforge.net
> *Subject:* [Rdkit-discuss] MACCS keys
>
>
>
> Hello Forum,
>
> Does anyone know whether it’s possible to obtain not just the binary
> fingerprint keys for MACCS but also the number of occurrences of each key,
> particularly for these definitions:
>
>
>
> Thanks,
>
> mike
>
>
>
> 1: #isotopes
> 2: #atoms with atomic number > 103
> 3: #group IVA, VA and VIA periods 4-6
> 4: #Actinides
> 5: #group IIIB, IVB elements
> 6: #Lanthanides
> 7: #group VB, VIB, VIIB elements
> 8: #heteroatoms in 4-membered rings
> 9: #group VIIIB elements
> 10: #alkaline earth elements
> 11: #atoms in 4 ring
> 12: #group IB, IIB elements
> 13: #N connected to 1 O and 2 C
> 14: #S atoms in S-S groups
> 15: #C connected to 3 O
> 16: #heteroatoms in 3-membered rings
> 17: #C in CC triple bonds
> 18: #group IIIA elements
> 19: #atoms in 7 ring
> 20: #silicon atoms
> 21: #C = bonded to C and 3 heavy atoms
> 22: #atoms in 3 ring
> 23: #C bonded 1 N and 2 O
> 24: #O-N single bonds
> 25: #C bonded to at least 3 N atoms
> 26: #C in 3 ring bonds and a double bond
> 27: #iodine atoms
> 28: #XCH2X, where X<>C
> 29: #phosphorus atoms
> 30: #non-C Q4 bonded to >= 3 C
> 31: #halogens connected to non carbons
> 32: #S bonded to an N and a C
> 33: #S atoms bonded to N
> 34: #CH2= units
> 35: #alkali (group IA ) elements
> 36: #S atoms in rings
> 37: #C bonded to >= 1 O & >=2 N
> 38: #C bonded >= 2 N and 1 C
> 39: #S atoms bonded to 3 O
> 40: #S single bonded to OQ2
> 41: #N in C#N
> 42: #fluorine atoms
> 43: #X-H heteroatoms 2 bonds from another
> 44: #other elements
> 45: #N atoms adjacent to -C=C
> 46: #bromine atoms
> 47: #S two bonds from an N
> 48: #non C bonded to >= 3 O
> 49: #charged atoms
> 50: #C in C=C bonded to >= 3 C
> 51: #S bonded to a C and an O
> 52: #N bonded to N
> 53: #QH 4 bonds from another QH
> 54: #QH 3 bonds from another QH
> 55: #S bonded to >=2 O
> 56: #N bonded to >= 2O and >= 1 C
> 57: #O in rings
> 58: #S bonded to >=2 non-carbon atoms
> 59: #non-aromatic S-[a]
> 60: #[S+]-[O-]
> 61: #SQ3
> 62: #non-ring bonds that connect rings
> 63: #N atoms in double bonds with O
> 64: #non-ring S attached to a ring
> 65: #N in aromatic bonds with C
> 66: #CX4 bonded to >=3 carbons
> 67: #S attached to heteroatoms
> 68: #QH bonded to another QH
> 69: #QH bonded to another Q
> 70: #N bonded to two non-C heavy atoms
> 71: #N bonded to O
> 72: #O separated by 3 bonds
> 73: #S in double/charge separated bonds
> 74: #dimethyl substituted atoms
> 75: #N non-ring bonded to a ring
> 76: #C in C=C bonded to >= 3 heavy atoms
> 77: #N separated by 2 bonds
> 78: #N double bonded to C
> 79: #N separated by 3 bonds
> 80: #N separated by 4 bonds
> 81: #S attached to Q >= 3 atoms
> 82: #heteroatoms attached to a CH2
> 83: #heteroatoms in 5 ring
> 84: #NH2 groups
> 85: #N bonded to >= 3 C
> 86: #CH2 or CH3 separated by non-C
> 87: #halogens bonded to any ring
> 88: #sulfurs
> 89: #O separated by 4 bonds
> 90: #het. 3 bonds from a CH2
> 91: #het. 4 bonds from a CH2
> 92: #C bonded to >=1 N, >=1 C & >= 1 O
> 93: #methylated heteroatoms
> 94: #N bonded to non C
> 95: #O 3 bonds from an N
> 96: #atoms in 5-rings
> 97: #O 4 bonds from an N
> 98: #het. in 6-ring
> 99: #C in C=C
> 100: #N attached to CH2
> 101: #atoms in 8-ring or higher
> 102: #O bonded to non C heavy atoms
> 103: #chlorine atoms
> 104: #hets. 2 bonds from a CH2
> 105: #hets. ring bonded to a 3-ring bond X
> 106: #X bonded to >= 3 non-C
> 107: #XQ>3 bonded to at least 1 halogen
> 108: #CH3 4 bonds from a CH2
> 109: #O attached to CH2
> 110: #O 1 C from an N
> 111: #N 2 bonds from a CH2
> 112: #atoms with coordination number >= 4
> 113: #O in non-aromatic bonds to an [a]
> 114: #CH3 attached to CH2
> 115: #CH3 2 bonds from a CH2
> 116: #CH3 3 bonds from a CH2
> 117: #N 2 bonds from an O
> 118: (key(147)-1 if key(147)>1; else 0)
> 119: #N in double bonds
> 120: (key(137)-1 if key(137)>1; else 0)
> 121: #N in rings
> 122: #N with coordination number >=3
> 123: #O separated by 1 C
> 124: #het-het bonds
> 125: Is # AROMATIC RING > 1?
> 126: #non-ring O bonded to 2 heavy atoms
> 127: (key(143)-1 if key(143)>1; else 0)
> 128: #CH2s separated by 4 bonds
> 129: #CH2s separated by 3 bonds
> 130: (key(124)-1 if key(124)>1; else 0)
> 131: (# het atoms with H)
> 132: #O 2 bonds from CH2
> 133: #N non-ring bonded to a ring
> 134: #halogens
> 135: #N in a non-aromatic bond with [a]
> 136: Bit: is there more than 1 O=
> 137: Total # ring HETEROCYCLE atoms
> 138: (key(153)-1 if key(153)>1; else 0)
> 139: #OH groups
> 140: (key(164)-3 if key(164)>3; else 0)
> 141: (key(160)-2 if key(160)>2; else 0)
> 142: (key(161)-2 if key(161)>1; else 0)
> 143: #non ring O connected to a ring
> 144: #atoms separated by (!:):(!:)
> 145: #6M RING > 1
> 146: Key(164)-2 if key(164)>2; else 0
> 147: #CH2 attached to CH2
> 148: #non-C with coordination number >=3
> 149: (key(160)-1 if key(160)>1; else 0)
> 150: #X separated by (!r)-r-(!r)
> 151: #NH
> 152: #C bonded to >=2 C and 1 O
> 153: #non-carbons attached to CH2
> 154: #O in C=O
> 155: #non-ring CH2
> 156: #XN where coord. # of X>=3
> 157: #O in C-O single bonds
> 158: #N in C-N single bonds
> 159: Key(164)-1 if key(164)>1; else 0
> 160: #CH3 groups
> 161: #N
> 162: #aromatics
> 163: #atoms in 6 rings
> 164: #oxygens
> 165: #ring atoms
> 166: Is there more than 1 fragment?
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>