On Mon, Sep 22, 2014 at 3:49 PM, JP <[email protected]> wrote:
>
> Ola RDKitters,
>
> I have a molecule in postgresql, and I would like to calculate the
> overall formal charge of the molecule as separate + and - counts.
>
> I currently came up with (warning: hack ahead!)
>
> substruct_count(rdkitmol, mol_from_smarts('[-]'), true) +
> (substruct_count(rdkitmol, mol_from_smarts('[--]'), true) * 2) +
> (substruct_count(rdkitmol, mol_from_smarts('[---]'), true) * 3) as neg
>
> But this being RDKit, there probably is a better way (and what about my
> [U+4] ?).
>

There's not really a good way that I can think of to do this.

If you were interested in the number of non-zero charged atoms, you could
do:

chembl_19=# select substruct_count('[Na+].CCC.[Cl-]',mol_from_smarts('[!+0]'),true);
 substruct_count
-----------------
               2
(1 row)

but that doesn't seem to be what you're looking for.


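There are also some string-manipulation games you could play on SMILES
columns. They are still ugly, but would probably be more efficient.
Here are a few queries that might be useful there. Each one counts the
rows whose SMILES contains at least one match for the given charge
pattern (regexp_replace without the 'g' flag removes only the first
match, so the length difference is non-zero exactly when a match exists):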


chembl_19=# select count(*) from compound_structures where
            length(canonical_smiles)-length(regexp_replace(canonical_smiles,'\+\]','')) > 0;
 count
--------
 110832
(1 row)

Time: 2188.448 ms

chembl_19=# select count(*) from compound_structures where
            length(canonical_smiles)-length(regexp_replace(canonical_smiles,'\+2\]','')) > 0;
 count
-------
   274
(1 row)

Time: 2257.251 ms

chembl_19=# select count(*) from compound_structures where
            length(canonical_smiles)-length(regexp_replace(canonical_smiles,'\-\]','')) > 0;
 count
-------
 96973
(1 row)

Time: 2199.070 ms
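
If pulling the SMILES out of the database is an option, the same
string-manipulation idea can be pushed a bit further to get the separate
+ and - totals asked for above, including cases like [U+4]. What follows
is only a rough sketch in plain Python (no RDKit involved): the function
name charge_counts and both regular expressions are made up for this
illustration, and it assumes the input is a valid SMILES string (not
SMARTS), where + and - inside square brackets can only be a charge.

```python
import re

# Hypothetical helpers for illustration; not part of RDKit or the cartridge.
BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")   # contents of each [...] atom
CHARGE = re.compile(r"(\++|-+)(\d*)")        # '+', '++', '+2', '-', '--', '-3', ...


def charge_counts(smiles):
    """Return (total positive charge, total negative charge) for a SMILES string.

    Assumes valid SMILES, where charges only ever appear inside bracket atoms.
    """
    pos = neg = 0
    for atom in BRACKET_ATOM.findall(smiles):
        match = CHARGE.search(atom)
        if match is None:
            continue  # bracket atom without a charge, e.g. [C@@H] or [13CH4]
        signs, digits = match.groups()
        # '+4' style uses the digits; '++' style uses the number of sign characters.
        magnitude = int(digits) if digits else len(signs)
        if signs[0] == "+":
            pos += magnitude
        else:
            neg += magnitude
    return pos, neg


print(charge_counts("[Na+].CCC.[Cl-]"))  # (1, 1)
print(charge_counts("[U+4]"))            # (4, 0)
```

Unlike the count(*) queries, this sums the charge magnitudes per molecule
rather than just flagging whether a pattern occurs, so it would be slower
on a whole table and is meant for post-processing outside the database.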

-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
