Hi,

It's not quite clear exactly what you want to do. Do you want to pull the matched substructure out of a molecule? This often isn't needed, since you either already have the pattern as is, or you only need the atom/bond indexes for downstream processing.
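To illustrate working from indexes alone: once the matched atom indexes are known, the bonds inside the match and the bonds that cross its boundary (the "cut" positions) can be found in a single pass over the bond list, with no fragmentation step. The sketch below is plain Java and does not use the CDK; the class name, the method names and the {begin, end} index-pair representation of bonds are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java sketch (no CDK): each bond is a {beginIndex, endIndex} pair.
// All names here are hypothetical and only for illustration.
class BondPartitionSketch {

    /** Bonds whose two atoms are both in the matched set (kept as is). */
    static List<int[]> innerBonds(int[][] bonds, Set<Integer> matched) {
        List<int[]> result = new ArrayList<>();
        for (int[] bond : bonds) {
            if (matched.contains(bond[0]) && matched.contains(bond[1]))
                result.add(bond);
        }
        return result;
    }

    /** Bonds with exactly one atom in the matched set, i.e. the cut points. */
    static List<int[]> cutBonds(int[][] bonds, Set<Integer> matched) {
        List<int[]> result = new ArrayList<>();
        for (int[] bond : bonds) {
            if (matched.contains(bond[0]) != matched.contains(bond[1]))
                result.add(bond);
        }
        return result;
    }

    public static void main(String[] args) {
        // A five-atom chain 0-1-2-3-4; suppose the pattern matched atoms 1, 2 and 3.
        int[][] bonds = {{0, 1}, {1, 2}, {2, 3}, {3, 4}};
        Set<Integer> matched = new HashSet<>(Arrays.asList(1, 2, 3));

        for (int[] bond : innerBonds(bonds, matched))
            System.out.println("keep " + bond[0] + "-" + bond[1]);
        for (int[] bond : cutBonds(bonds, matched))
            System.out.println("cut  " + bond[0] + "-" + bond[1]);
    }
}
```

All cut positions come out of the same pass, so there is nothing to do step by step and nothing to reconstruct afterwards.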
Anyway, if you really want to do that, you don't need to fragment the molecule; just add the matched atoms/bonds to another atom container:

    IAtomContainer mol = ...;
    SmartsPattern pat = ...;
    IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
    for (Map<IChemObject, IChemObject> map : pat.matchAll(mol).toAtomBondMap()) {
        IAtomContainer subgraph = bldr.newAtomContainer();
        // the map's values are the matched atoms/bonds of mol
        for (IChemObject cobj : map.values()) {
            if (cobj instanceof IAtom)
                subgraph.addAtom((IAtom) cobj);
            else if (cobj instanceof IBond)
                subgraph.addBond((IBond) cobj);
        }
    }

You can do it more like you described, but it's less efficient, since removals are slower and you would need to work out how to handle more than one pattern, etc. Note that you can't copy the molecule first, as the copy would then have different object references.

    IAtomContainer mol = ...;
    SmartsPattern pat = ...;
    for (Map<IChemObject, IChemObject> mapping : pat.matchAll(mol).toAtomBondMap()) {
        Set<IAtom> atomsToDelete = new HashSet<>();
        for (IAtom atom : mol.atoms()) {
            // atoms of mol are the map's values (the keys belong to the query)
            if (!mapping.containsValue(atom))
                atomsToDelete.add(atom);
        }
        // note: bonds take care of themselves
        for (IAtom atom : atomsToDelete)
            mol.removeAtom(atom);
    }

On Sat, 20 Nov 2021 at 12:46, biotech7 <biote...@163.com> wrote:

> hi, everyone!
> one molecule has a ring system (isolated rings or fused rings). firstly, by
> using *RingSearch()* to find rings. secondly, locate functional groups
> linked to the rings as the final target submolecule. to reach this goal, by
> utilizing *findSubstructure()* pattern (plus other algorithms) to search
> and locate all linked functional groups' positions. when this
> goal is accomplished, break all bonds at these positions and acquire the
> target substructure.
> *take a detailed example:*
> i want to split this molecule (*UniversalSmiles*)
> CCCOC(=O)C1=C(C=C(C(=C1)S(=O)(=O)NC(=O)NC2=NC(=NC(=N2)OC)C)OCCCl)N(=O)=O
> at atom positions: 30-31, 1-2, 16-17 to get this target substructure
> CCOC(=O)C1=C(C=C(C(=C1)S(=O)(=O)N)OC)N(=O)=O
>
> currently, after getting all the positions, i use the
> *FragmentUtils.splitMolecule()* (in a protected class) method to split
> molecules. but this strategy only supports step by step splitting and
> requires reconstructing the structure as the final target substructure.
>
> the question is: is there an algorithm (or a strategy) to split the
> molecule at all the positions (30-31, 1-2, 16-17) only once to fully get the
> substructure (CCOC(=O)C1=C(C=C(C(=C1)S(=O)(=O)N)OC)N(=O)=O) without
> reconstruction?
> this issue has trapped me for many days.
>
> Regards!
>
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user