Hi Greg,
Thanks for the explanation; we had just about come to that conclusion.
Within our database we have tried to combine the 'mol' column with the
smiles column, which I think is where we went wrong.
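
For anyone following the thread, a minimal sketch of the canonical SMILES
approach Greg describes below might look like the following. It assumes the
mols table and m column from the example further down; the smiles column and
the constraint_unique_smiles name are hypothetical, and the explicit ::text
cast on the output of mol_to_smiles() is an assumption about what the
cartridge needs in order to store the result in a text column.

```sql
-- Replace the uniqueness constraint on the mol column, whose equality
-- test ignores stereochemistry, with one on a canonical SMILES column.
ALTER TABLE mols DROP CONSTRAINT constraint_unique_m;

-- Hypothetical column holding the canonical SMILES for each molecule.
ALTER TABLE mols ADD COLUMN smiles text;
UPDATE mols SET smiles = mol_to_smiles(m)::text;

-- This step fails if two existing rows already share a canonical SMILES.
ALTER TABLE mols ADD CONSTRAINT constraint_unique_smiles UNIQUE (smiles);

-- Both enantiomers and the undefined form are written to m and smiles.
INSERT INTO mols (m, smiles)
  VALUES ('CCC[C@@H](C)O', mol_to_smiles('CCC[C@@H](C)O'::mol)::text);
INSERT INTO mols (m, smiles)
  VALUES ('CCC[C@H](C)O', mol_to_smiles('CCC[C@H](C)O'::mol)::text);
INSERT INTO mols (m, smiles)
  VALUES ('CCCC(C)O', mol_to_smiles('CCCC(C)O'::mol)::text);
```

Since the canonical SMILES written by mol_to_smiles() should keep the
stereo markers, the three strings ought to differ and all three inserts
should be accepted. We have not verified this against every cartridge
version.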
Cheers,
Dave
On Tue, Oct 7, 2014 at 4:57 PM, Greg Landrum <[email protected]> wrote:
> Hi Dave,
>
> This is a bug/feature (depending on how you look at it; I think it's a bug)
> of the molecule-molecule equality function, which does not take
> stereochemistry into account.
>
> If you want a uniqueness constraint, I would suggest using a canonical
> SMILES column (i.e. by using mol_to_smiles() on the molecule column).
>
> -greg
>
>
> On Tue, Oct 7, 2014 at 2:51 PM, Dave Wood <[email protected]> wrote:
>
>> Dear All,
>>
>> A colleague and I have set up a structure database using the RDKit
>> Postgres cartridge and we are getting a bit stuck on how to handle
>> stereochemistry when adding new structures.
>>
>> Below is a test protocol to reproduce our problem.
>>
>> *Use case:*
>>
>> Each structure record should be unique taking into account any specified
>> stereochemical information.
>>
>> Therefore, for the simple case of a compound with one stereocentre we
>> should be able to create 3 records: both the R and S enantiomers, and the
>> racemic/undefined case.
>>
>> *Problem:*
>>
>> Even when setting rdkit.do_chiral_sss=true it is only possible to create
>> a record for the racemic/undefined structure if neither of the enantiomers
>> is present.
>>
>> *Example to reproduce the behaviour:*
>>
>> Set up the emolecules db* as per this section of the RDKit tutorial (
>> http://www.rdkit.org/docs/Cartridge.html#creating-a-database-from-a-file)
>> and add a unique constraint in psql:
>>
>> ALTER TABLE mols ADD CONSTRAINT constraint_unique_m UNIQUE (m);
>>
>> Then try:
>>
>> SET rdkit.do_chiral_sss=true;
>> INSERT INTO mols (m) VALUES ('CCC(C)O');
>> INSERT INTO mols (m) VALUES ('CC[C@@H](C)O');
>> INSERT INTO mols (m) VALUES ('CC[C@H](C)O');
>>
>> All 3 queries execute correctly, but:
>>
>> INSERT INTO mols (m) VALUES ('CCC[C@@H](C)O');
>> INSERT INTO mols (m) VALUES ('CCC[C@H](C)O');
>> INSERT INTO mols (m) VALUES ('CCCC(C)O'); -- this query fails
>>
>> ERROR: duplicate key value violates unique constraint
>> "constraint_unique_m"
>> DETAIL: Key (m)=(CCCC(C)O) already exists.
>>
>>
>> So the question is whether this behaviour is by design (in order to
>> fulfil some other requirement maybe), or whether it's an issue with the
>> implementation?
>>
>>
>> * The test db was created using the first 1000 entries from the latest
>> emolecules smiles file.
>>
>> Many thanks in advance for your help with this.
>>
>> Dave
>>
>>
>> ------------------------------------------------------------------------------
>> Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
>> Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
>> Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
>> Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Rdkit-discuss mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>
--
*David Wood, PhD*
*Discovery Team Leader*
*Molplex Pharmaceuticals*
*The Biohub at Alderley Park*
*Macclesfield*
*Cheshire*
*SK10 4TG*
*01625 238702*
*[email protected] <[email protected]>*