Trouble is, you're mixing chemical operations and lexical ones. It might be handy if this 'just worked' but in practice it's not going to produce valid SMILES without more work.
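To illustrate the lexical side of the problem, here is a minimal sketch of a check that flags a ring-closure label written directly after a branch, as in '[C@]([*:9])1(C)CCO1' from the example quoted below. It assumes that this is the form under discussion, and that the grammar expects ring-closure labels before any branches on an atom. The function name and the regular expressions are hypothetical and are not part of RDKit. This is a plain string scan, not a SMILES parser.

```python
import re

# Bracket atoms such as [C@] or [*:9] contain digits that are not ring
# closures, so they are masked before scanning.
_BRACKET_ATOM = re.compile(r"\[[^\]]*\]")

# A closing parenthesis followed (optionally via a bond symbol) by a
# ring-closure label: a single digit or %nn.
_RING_AFTER_BRANCH = re.compile(r"\)[-=#$:/\\]?(?:\d|%\d\d)")


def ring_closure_follows_branch(smiles: str) -> bool:
    """Return True if a ring-closure label appears directly after a branch.

    Hypothetical helper: a lexical heuristic only, it does not validate
    the rest of the SMILES string.
    """
    masked = _BRACKET_ATOM.sub("X", smiles)
    return _RING_AFTER_BRANCH.search(masked) is not None


for s in ("F9.[C@]91(C)CCO1", "[C@]([*:9])1(C)CCO1", "C[C@]19CC8O1"):
    print(s, ring_closure_follows_branch(s))
```

A check like this only detects the pattern. It does nothing to repair the string or to preserve stereochemistry, which is where the extra work comes in.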
I've written code in the past to do this kind of thing for virtual library building. It used dummy atoms to mark link positions in the fragments, and Perl code to transform between the dummy atoms and bond-closure numbers, giving text strings which could be assembled into valid dot-disconnected SMILES. This required additional lexical transformations to keep the SMILES valid depending on where the dummy atom was, and to make sure that stereochemistry worked properly. If you want to do this kind of thing, I don't think you can expect to avoid these additional lexical operations.

I don't think it's reasonable to expect that invalid SMILES strings should be coerced into giving a particular result for convenience when 1) they're invalid, and 2) the current behaviour is actually a reasonable interpretation of the order of connections in the SMILES (even though they are invalid).

I don't think the current RDKit interpretation of these SMILES should change, though it might be useful if it could issue a warning that SMILES of this type are not correct.

Best regards,
Chris

On 9 November 2017 at 15:09, Brian Cole wrote:
> Here's an example of why this is useful at maintaining molecular
> fragmentation inside your molecular representation:
>
> >>> from rdkit import Chem
> >>> smiles = 'F9.[C@]91(C)CCO1'
> >>> fluorine, core = smiles.split('.')
> >>> fluorine
> 'F9'
> >>> fragment = core.replace('9', '([*:9])')
> >>> fragment
> '[C@]([*:9])1(C)CCO1'
> >>> mol = Chem.RWMol(Chem.MolFromSmiles(fragment))  ### RDKit is flipping the stereo on me here even the order of the bonds has not changed
> >>> idx = mol.AddAtom(Chem.Atom(0))
> >>> mol.AddBond(idx, 4, Chem.rdchem.BondType.SINGLE)
> 7
> >>> mol.GetAtomWithIdx(idx).SetIntProp("molAtomMapNumber", 8)
> >>> new_core = Chem.MolToSmiles(mol, True)
> >>> new_core = new_core.replace('([*:9])', '9').replace('([*:8])', '8')
> >>> new_core
> 'C[C@]19CC8O1'
> >>> analog_smiles = 'Cl8.' + fluorine + '.' + new_core
> >>> analog_smiles
> 'Cl8.F9.C[C@]19CC8O1'
> >>> analog = Chem.MolFromSmiles(analog_smiles)
> >>> analog.HasSubstructMatch(Chem.MolFromSmiles(smiles), useChirality=True)  # Uh oh! My original molecule didn't match
> False
> >>> analog.HasSubstructMatch(Chem.MolFromSmiles(smiles.replace('@', '@@')), useChirality=True)  # flipping the stereo of the original causes it to match again
> True
>
> On Thu, Nov 9, 2017 at 4:41 AM, Andrew Dalke wrote:
>>
>> On Nov 9, 2017, at 08:13, Greg Landrum wrote:
>> > As was discussed in the comments of
>> > https://github.com/rdkit/rdkit/issues/786, I think it's pretty gross that
>> > the second syntax is even legal. But that's a side point.
>>
>> To belabor that point. Neither Daylight SMILES nor OpenSMILES accept it,
>> which are the only two explicit sources of "legal" that people use.
>>
>> "allowed" might be a better term.
>>
>> Andrew

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

