Re: [Rdkit-discuss] Error reading SDF dataset from BindingDB

2023-07-25 Thread Ling Chan
Interesting. Thanks Rafael. So it's a bug in the BindingDB files. Perhaps you
should let them know too.

Ling


Rafael L via Rdkit-discuss wrote on Tuesday, July 25, 2023 at 12:53 AM:

> Hi, I'm just creating this thread to get the problem and the solution
> indexed by Google
>
> I downloaded several SDF datasets from BindingDB and got errors like this
> one when using Chem.SDMolSupplier:
>
> ERROR: Cannot convert 1. to unsigned int
>
> After some digging I found [
> https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/CAB3Bi0AYKAYOzMUumk6BscipsujkmG_-uuho3%3Dsf9mNyfoBAwA%40mail.gmail.com/#msg32641808],
> and it turns out the headers (before every mol) should have three lines.
> The BindingDB files only had two.
>
> In these files, each mol+properties block was separated by four dollar
> sign symbols ($$$$). My solution was to add a blank line after each $$$$
> by using Notepad++ Find and replace: $$$$ by $$$$ + (new line).
>
> --
> *Rafael da Fonseca Lameiro*
> PhD Student - Medicinal and Biological Chemistry Group (NEQUIMED)
> São Carlos Institute of Chemistry - University of São Paulo - Brazil
> [image: orcid logo 16px] https://orcid.org/-0003-4466-2682
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


[Rdkit-discuss] Error reading SDF dataset from BindingDB

2023-07-25 Thread Rafael L via Rdkit-discuss
Hi, I'm just creating this thread to get the problem and the solution
indexed by Google

I downloaded several SDF datasets from BindingDB and got errors like this
one when using Chem.SDMolSupplier:

ERROR: Cannot convert 1. to unsigned int

After some digging I found [
https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/CAB3Bi0AYKAYOzMUumk6BscipsujkmG_-uuho3%3Dsf9mNyfoBAwA%40mail.gmail.com/#msg32641808],
and it turns out the headers (before every mol) should have three lines.
The BindingDB files only had two.

In these files, each mol+properties block was separated by four dollar sign
symbols ($$$$). My solution was to add a blank line after each $$$$ by using
Notepad++ Find and replace: $$$$ by $$$$ + (new line).
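
For anyone who would rather script this than use an editor, the same
find-and-replace can be sketched in Python. This is only a minimal sketch,
not a tested fix for every BindingDB download: the function names and the
file names in the usage note are hypothetical, and it assumes each record
delimiter sits on its own line as four dollar signs. As with the editor
replacement, the first record in the file is not preceded by a delimiter,
so it is left unchanged and may need checking by hand.

```python
from pathlib import Path


def add_blank_line_after_delimiters(sdf_text):
    """Return sdf_text with an empty line added after every '$$$$' line.

    Hypothetical helper mirroring the Notepad++ replacement described above.
    """
    fixed = []
    for line in sdf_text.splitlines():
        fixed.append(line)
        # A record delimiter is a line consisting only of four dollar signs.
        if line.strip() == "$$$$":
            fixed.append("")  # the extra blank line for the next header
    return "\n".join(fixed) + "\n"


def fix_sdf_file(src, dst):
    """Read src, apply the replacement, and write the result to dst."""
    text = Path(src).read_text(encoding="utf-8", errors="replace")
    Path(dst).write_text(add_blank_line_after_delimiters(text), encoding="utf-8")
```

Usage would look like fix_sdf_file("bindingdb.sdf", "bindingdb_fixed.sdf"),
after which the fixed file can be passed to Chem.SDMolSupplier as before.
Writing to a new file rather than overwriting keeps the original download
around in case the result needs to be compared.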

--
*Rafael da Fonseca Lameiro*
PhD Student - Medicinal and Biological Chemistry Group (NEQUIMED)
São Carlos Institute of Chemistry - University of São Paulo - Brazil
[image: orcid logo 16px] https://orcid.org/-0003-4466-2682