[Rdkit-discuss] The Chlorine molfile question

2016-01-20 Thread Peter Shenkin
It seems to me that what we are talking about now has (or should have!) more to do with the interpretation of the terrible old PDB file format than about any software convention. It seems to me that software that must read this format should turn the contents into something generally chemically ac

Re: [Rdkit-discuss] MCS module - bonding and hybridization in substructure search

2015-11-15 Thread Peter Shenkin
Say, Greg, If you understand Janusz's request, could you perhaps explain it in other words? I don't quite follow it, despite having read the two emails. I'm getting the sense that he wants to make sure that SP2 nitrogens match only SP2 nitrogens (for example). Is this right? I know OpenEye has an

Re: [Rdkit-discuss] cis/trans directional bond and smiles strings in python

2015-10-14 Thread Peter Shenkin
FWIW, this makes sense to me. To the extent that RDKit can recognize an invalid SMARTS or SMILES and throw an exception for it, the user is protected from some classes of error.

On Wed, Oct 14, 2015 at 10:39 AM, Rocco Moretti wrote:
>
> Would raising an error (or warning) be appropriate here? The
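For context on the "throw an exception" point: in the Python API, `Chem.MolFromSmiles` signals an unparseable string by returning `None` (and logging a message) rather than raising. A minimal sketch of a stricter wrapper follows. The name `parse_smiles_strict` is hypothetical and not part of RDKit, and the wrapper only catches input RDKit already rejects; it does not add any new validity checks for directional bonds.

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit is third-party; without it only the helper is defined
    Chem = None


def parse_smiles_strict(smiles):
    """Parse a SMILES string, raising ValueError instead of returning None.

    Hypothetical helper for illustration; not an RDKit function.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")
    return mol


if Chem is not None:
    # A well-formed SMILES with directional (cis/trans) bonds parses normally.
    print(parse_smiles_strict("F/C=C/F").GetNumAtoms())
    try:
        parse_smiles_strict("C1CC")  # ring-closure digit is never closed
    except ValueError as err:
        print(err)
```

With a wrapper like this, a bad string stops the calling code at the point of parsing instead of surfacing later as an attribute error on `None`.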

Re: [Rdkit-discuss] Generation of stereo-isomers

2015-09-24 Thread Peter Shenkin
Umm... would that be all stereoisomers or all realizable stereoisomers? For example, consider two bridgeheads in a norbornane-type compound. In this case, only a particular enantiomeric pair would be realizable, and not all four diastereomers. -
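The distinction between "all" and "realizable" stereoisomers can be tried out with the `EnumerateStereoisomers` module that current RDKit releases provide; its `tryEmbedding` option keeps only isomers for which a 3D embedding succeeds. The sketch below is an illustration under that assumption, not necessarily what was available or proposed at the time of this thread. The helper name `enumerate_isomer_smiles` is hypothetical, and camphor is used here as a stand-in for a norbornane-type compound with two bridgehead centers.

```python
try:
    from rdkit import Chem
    from rdkit.Chem.EnumerateStereoisomers import (
        EnumerateStereoisomers,
        StereoEnumerationOptions,
    )
except ImportError:  # RDKit is third-party; without it only the helper is defined
    Chem = None


def enumerate_isomer_smiles(smiles, realizable_only=False):
    """Return sorted isomeric SMILES for the stereoisomers of `smiles`.

    Hypothetical helper. With realizable_only=True, RDKit attempts a 3D
    embedding of each candidate and drops those that fail to embed.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    opts = StereoEnumerationOptions(tryEmbedding=realizable_only, unique=True)
    return sorted(
        Chem.MolToSmiles(iso, isomericSmiles=True)
        for iso in EnumerateStereoisomers(mol, options=opts)
    )


if Chem is not None:
    camphor = "CC1(C)C2CCC1(C)C(=O)C2"  # two bridgehead stereocenters
    print(len(enumerate_isomer_smiles(camphor)))
    print(len(enumerate_isomer_smiles(camphor, realizable_only=True)))
```

Comparing the two counts for a bridged bicyclic shows whether the embedding filter removes the geometrically impossible bridgehead combinations in the installed RDKit version.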

Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-20 Thread Peter Shenkin
"My initial answer, and I would love input on this, is that three-coordinate N should always have stereochemistry removed."

Umm... even if it's a bridgehead?

-P.

On Thu, Aug 20, 2015 at 10:30 AM, Greg Landrum wrote:
> This isn't a simple one, so it may take a bit to get to an answer that's
> c

Re: [Rdkit-discuss] Stereochemistry - Differences between RDKit & Indigo

2015-08-19 Thread Peter Shenkin
Maybe when you have a toolkit as blazingly fast as RDKit it captures the chirality of N center before it has time to interconvert

-P.

On Wed, Aug 19, 2015 at 10:17 PM, John M wrote:
> More odd is the carbon stereocentre with two methyls...
>
> Generally trivalent nitrogens are not considere

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Peter Shenkin
. I'm happy to find that the prevailing view is in agreement with my opinion that these specific cases are bugs. (Happy only because that means they'll likely be fixed at some point!)

-P.

On Wed, Jun 17, 2015 at 1:34 PM, Dimitri Maziuk wrote:
> On 06/17/2015 08:36 AM, Pet

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-17 Thread Peter Shenkin
ion ;-)
>
> On Wed, Jun 17, 2015 at 8:22 AM, Peter Shenkin wrote:
>
>> Hi, Greg,
>>
>> Within the SMILES framework, it seems to me that if you allow the atoms
>> to be aromatic, then these are two Kekule structures of the same aromatic
>> system, and however

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-16 Thread Peter Shenkin
Hi, Greg, Within the SMILES framework, it seems to me that if you allow the atoms to be aromatic, then these are two Kekule structures of the same aromatic system, and however you do the canonicalization, they ought to canonicalize to the same structure, which the two examples did not do. I don't

Re: [Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-16 Thread Peter Shenkin
Thanks, Andrew...

"BTW, to help it out, you can ask RDKit to include all of the bond information, as otherwise it will use the "single-or-aromatic" notation."

That's a nice feature.

"I don't know how it is that RDKit adds a double bond to the second cubane, given only aromatic carbons and single
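The feature quoted above is presumably the `allBondsExplicit` argument of `Chem.MolToSmiles`; that mapping is an assumption, since the snippet does not name the option. A minimal sketch, with the hypothetical helper name `explicit_bond_smiles`, shows how it makes the single-versus-aromatic distinction visible in the output:

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit is third-party; without it only the helper is defined
    Chem = None


def explicit_bond_smiles(smiles):
    """Return canonical SMILES with every bond symbol written out.

    Hypothetical helper; relies on MolToSmiles(allBondsExplicit=True).
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol, allBondsExplicit=True)


if Chem is not None:
    # Biphenyl: aromatic ring bonds appear as ':', the inter-ring bond as '-'.
    print(explicit_bond_smiles("c1ccccc1-c1ccccc1"))
```

Writing the bonds out this way removes the ambiguity of an unmarked bond between two aromatic atoms, which is what makes it useful when debugging cases like the ones in this thread.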

[Rdkit-discuss] Two SMILES that (I think) should canonicalize to the same thing, but don't

2015-06-16 Thread Peter Shenkin
[N-]=[N+]=NC(=O)N1C(=O)N([N+]([O-])=O)C2(C13C4=C56)C4=C5C2=C36
[N-]=[N+]=NC(=O)N(C(=O)N1[N+]([O-])=O)C(c23)(c4c56)C16c3c5c24

RDKit canonicalizes the two to the following, respectively:

[N-]=[N+]=NC(=O)N1C(=O)N([N+](=O)[O-])C23c4c5c2c2c-5c4C213
[N-]=[N+]=NC(=O)N1C(=O)N([N+](=O)[O-])C23c4c5c6c(c2c4
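A check of this kind can be scripted by canonicalizing both inputs and comparing the resulting strings. The sketch below uses the hypothetical helper name `same_canonical` and demonstrates it on Kekule and aromatic forms of benzene as a simple stand-in; it does not reproduce the molecule from this post, whose behavior may differ between RDKit versions.

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit is third-party; without it only the helper is defined
    Chem = None


def canonical(smiles):
    """Return RDKit's canonical SMILES, raising ValueError on parse failure."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol)


def same_canonical(smiles_a, smiles_b):
    """True if both inputs canonicalize to the identical string.

    Hypothetical helper for illustration; not an RDKit function.
    """
    return canonical(smiles_a) == canonical(smiles_b)


if Chem is not None:
    # Two Kekule structures and the aromatic form of benzene should agree.
    print(same_canonical("C1=CC=CC=C1", "C1C=CC=CC=1"))
    print(same_canonical("C1=CC=CC=C1", "c1ccccc1"))
```

Running the same comparison on the two input SMILES in this post is the quickest way to see whether a given RDKit release still shows the discrepancy.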

Re: [Rdkit-discuss] SMILES: Why are rings consisting of wildcards assumed to be aromatic?

2015-06-16 Thread Peter Shenkin
think I have found an example of two equivalent SMILES for a real molecule (no wildcards) that canonicalize differently in RDKit. I'll start a separate thread for this.

-P.

On Tue, Jun 16, 2015 at 12:36 AM, Greg Landrum wrote:
> On Mon, Jun 15, 2015 at 6:11 PM, Peter Shenkin wrote: ..

Re: [Rdkit-discuss] SMILES: Why are rings consisting of wildcards assumed to be aromatic?

2015-06-15 Thread Peter Shenkin
By the way, lest I appear ungrateful, I'd like to thank Greg/RDKit for making the inter-ring bonds in biphenylene single, rather than aromatic, in the unique SMILES. This is something that several other kits of my acquaintance get wrong -P.

Re: [Rdkit-discuss] SMILES: Why are rings consisting of wildcards assumed to be aromatic?

2015-06-15 Thread Peter Shenkin
On Mon, Jun 15, 2015 at 9:54 AM, Greg Landrum wrote:
>
> On Thu, Jun 11, 2015 at 5:54 PM, Peter Shenkin wrote:
>
>> If I canonicalize *1**1 in RDKit, I get [*]1:[*]:[*]:1.
>>
>> I expected [*]1[*][*]1.
>>
> ...
>
> This is certainly a bug and I'

[Rdkit-discuss] SMILES: Why are rings consisting of wildcards assumed to be aromatic?

2015-06-11 Thread Peter Shenkin
If I canonicalize *1**1 in RDKit, I get [*]1:[*]:[*]:1. I expected [*]1[*][*]1. I can think of no reason that the wildcard type in this context should be assumed to be aromatic. Indeed, ** is canonicalized as [*][*], demonstrating that RDKit does not in general require wildcards to be aromatic.
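The observation can be reproduced with a few lines of Python. The sketch below (helper name `wildcard_report` is hypothetical) prints the canonical SMILES and the per-bond aromaticity flags for the two inputs mentioned; no expected output is shown because the result depends on the RDKit version, and the behavior described in this post was later acknowledged as a bug.

```python
try:
    from rdkit import Chem
except ImportError:  # RDKit is third-party; without it only the helper is defined
    Chem = None


def wildcard_report(smiles):
    """Return (canonical SMILES, list of per-bond aromatic flags).

    Hypothetical helper for illustration; not an RDKit function.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    flags = [bond.GetIsAromatic() for bond in mol.GetBonds()]
    return Chem.MolToSmiles(mol), flags


if Chem is not None:
    for smi in ("*1**1", "**"):
        canon, flags = wildcard_report(smi)
        print(smi, "->", canon, "aromatic bonds:", flags)
```

If any flag for `*1**1` is `True`, the installed version still perceives the wildcard ring as aromatic; all `False` means the ring bonds are being kept as written.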