Hi,

It's not quite clear what you want to do. Do you want to pull the matched
substructure out of a molecule? This often isn't needed, as you either
already have the pattern as is, or you just need the atom/bond indexes for
downstream processing.

Anyway, if you really want to do that, you don't need to fragment the
molecule; just add the matched atoms/bonds to another atom container:

IAtomContainer mol = ...;
SmartsPattern pat = ...;
IChemObjectBuilder bldr = SilentChemObjectBuilder.getInstance();
// each map is one match: query atom/bond (key) -> molecule atom/bond (value)
for (Map<IChemObject, IChemObject> map : pat.matchAll(mol).toAtomBondMap()) {
    IAtomContainer subgraph = bldr.newAtomContainer();
    for (IChemObject cobj : map.values()) {
        if (cobj instanceof IAtom)
            subgraph.addAtom((IAtom) cobj);
        else if (cobj instanceof IBond)
            subgraph.addBond((IBond) cobj);
    }
    // subgraph now holds the atoms/bonds of mol covered by this match
}

You can do it more like you described, but it's less efficient since
removals are slower, and you would need to work out how to handle more
than one pattern, etc. Note that you can't copy the molecule first, as the
copy would then have different object references from those in the mapping:

IAtomContainer mol = ...;
SmartsPattern pat = ...;
for (Map<IChemObject, IChemObject> mapping : pat.matchAll(mol).toAtomBondMap()) {
    // the map's values are the matched atoms/bonds of mol (keys are the query's)
    Set<IChemObject> matched = new HashSet<>(mapping.values());
    Set<IAtom> atomsToDelete = new HashSet<>();
    for (IAtom atom : mol.atoms()) {
        if (!matched.contains(atom))
            atomsToDelete.add(atom);
    }
    // note: bonds take care of themselves
    for (IAtom atom : atomsToDelete)
        mol.removeAtom(atom);
}
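
To illustrate the point about object references, here is a minimal
plain-Java sketch that does not use CDK at all. The Atom class, the
unmatched() and deepCopy() helpers, and the class name are all hypothetical
stand-ins, and the sketch assumes atoms are compared by reference. The set
of matched objects only recognises the original atoms, so looking up the
atoms of a copy finds nothing and every atom of the copy would be deleted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReferenceIdentityDemo {

    // Hypothetical stand-in for an atom. It does not override equals/hashCode,
    // so two atoms are equal only if they are the same object.
    static final class Atom {
        final String symbol;
        Atom(String symbol) { this.symbol = symbol; }
    }

    // Returns the atoms from 'atoms' that are not in the 'matched' set.
    static List<Atom> unmatched(List<Atom> atoms, Set<Atom> matched) {
        List<Atom> result = new ArrayList<>();
        for (Atom atom : atoms) {
            if (!matched.contains(atom))
                result.add(atom);
        }
        return result;
    }

    // Copies the list by creating a new Atom object for every entry.
    static List<Atom> deepCopy(List<Atom> atoms) {
        List<Atom> copy = new ArrayList<>();
        for (Atom atom : atoms)
            copy.add(new Atom(atom.symbol));
        return copy;
    }

    public static void main(String[] args) {
        List<Atom> mol = Arrays.asList(new Atom("C"), new Atom("C"), new Atom("O"));
        // pretend a pattern matched the last two atoms of the original
        Set<Atom> matched = new HashSet<>(mol.subList(1, 3));

        // original atoms: only the first atom is unmatched
        System.out.println(unmatched(mol, matched).size());           // 1
        // copied atoms: none are found in the set, so all look unmatched
        System.out.println(unmatched(deepCopy(mol), matched).size()); // 3
    }
}
```

So if you do need to keep the input molecule intact, the first approach
(adding the matched atoms/bonds to a new container) is the simpler route.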

On Sat, 20 Nov 2021 at 12:46, biotech7 <biote...@163.com> wrote:

> hi,everyone!
> one molecule has ring system(isolated rings or fused rings). firstly, by
> using *RingSearch() *to find rings. secondly, locate functional groups
> linked to the rings as the final target submolecule. to reach this goal, by
> utilizing * findSubstructure()* pattern (plus other algorithms) to search
> and locate all linked functional groups' positions. when this
> goal accomplished, break all bonds at these positions and acquire target
> substructure.
> *take a detailed example :*
> i want to split this molecule( *UniversalSmiles*)
> *CCCOC(=O)C1=C(C=C(C(=C1)S(=O)(=O)NC(=O)NC2=NC(=NC(=N2)OC)C)OCCCl)N(=O)=O*
> at atom positions: 30-31, 1-2,16-17 to get this target substructure
> *CCOC(=O)C1=C(C=C(C(=C1)S(=O)(=O)N)OC)N(=O)=O*
>
> currently, after getting all the positions, i use
> *FragmentUtils.splitMolecule()*(in a protected class) method to split
> molecules. but this strategy only supports step by step splitting and
> requires reconstructing the structure as the final target substructure.
>
> the question is : is there an algorithm(or a strategy) to split the
> molecule at all the positions(30-31, 1-2,16-17) only once to fully get the
> substructure(*CCOC(=O)C1=C(C=C(C(=C1)S(=O)(=O)N)OC)N(=O)=O*) without
> reconstruction?
> this issue has trapped me for many days.
>
> Regards!
>
>
>
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user
>
