Hi Rocco,

The Python locale module lets you do this sort of normalization on
strings, e.g.:

    >>> import locale
    >>> locale.getlocale()
    ('en_US', 'UTF-8')
    >>> locale.setlocale(locale.LC_ALL, "it_IT")
    'it_IT'
    >>> locale.delocalize("1,222")
    '1.222'


But this requires you to know which locale the values were originally written in.
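
If the locale is not known for each value (as in the mixed files you describe), one possible workaround is a locale-independent heuristic. Below is a minimal sketch, not anything RDKit or the locale module provides: `normalize_decimal` is a hypothetical helper name, and it assumes the values never contain thousands separators, so a single "," or "." between digits can only be the decimal separator.

```python
import re

# Hypothetical helper, not part of RDKit or the locale module.
# Assumption: values contain no thousands separators, so a single "," or "."
# between digits is always the decimal separator.
_NUMBER = re.compile(r"[+-]?\d+([.,]\d+)?([eE][+-]?\d+)?")


def normalize_decimal(value):
    """Replace a decimal comma with a point; leave non-numeric text unchanged."""
    text = value.strip()
    if _NUMBER.fullmatch(text):
        return text.replace(",", ".")
    # Not a plain number (e.g. a name or a comma-separated list): keep as-is.
    return value
```

You could apply this to each property string (for example, the value returned by mol.GetProp(name)) before converting it with float(). Strings with more than one separator, such as "1,222,333", are left untouched because they are ambiguous under the assumption above.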


HTH, cheers

p.


On Thu, Sep 29, 2022 at 8:16 PM Rocco Moretti <rmoretti...@gmail.com> wrote:

> Hello,
>
> I have a number of SDFs of molecules with associated data blocks. (That
> is, the `>` section that comes after `M END` and before `$$$$`.)
>
> The problem I have is that these SDFs were generated in different
> countries, and have different locales -- most notably, some of them use "."
> as the decimal separator for real-valued properties and some use ",".  To
> make things even more fun, some use a mix of both, depending on who
> calculated which properties where.
>
> Is there any facility in RDKit for reading in such locale-varying SDF
> files and normalizing them?
>
> Thanks,
> Rocco
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>