Thanks Paolo, that's fantastic.
The first option was what I needed.
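For the archive, here is a minimal sketch of a text-mode variant of that first option. It assumes the file is UTF-8, so gzip can decode the lines itself, and it assumes a final record with no "$$$$" terminator should still be returned. The name iter_sd_records and the in-memory sample data are made up for illustration and are not part of RDKit.

```python
import gzip
import io


def iter_sd_records(text_hnd):
    """Yield each SD record, including its "$$$$" line, as one string."""
    lines = []
    for line in text_hnd:
        lines.append(line)
        if line.startswith("$$$$"):
            yield "".join(lines)
            lines = []
    # Assumption: leftover non-blank text without a "$$$$" line is a record too.
    leftover = "".join(lines)
    if leftover.strip():
        yield leftover


# Demo on an in-memory gzip stream. The two "records" are placeholders,
# not valid molfiles; a real file would be opened with
# gzip.open("yourfile.sdf.gz", "rt", encoding="utf-8").
raw = "first\n>  <ID>\n1\n\n$$$$\nsecond\n$$$$\n"
gz_bytes = gzip.compress(raw.encode("utf-8"))
with gzip.open(io.BytesIO(gz_bytes), "rt", encoding="utf-8") as hnd:
    records = list(iter_sd_records(hnd))
print(len(records))
```

Each string it yields can be passed to SDMolSupplier.SetData() exactly as in the quoted code, so the original text and the mol (with its properties) stay together.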
Tim
On Thu, Nov 4, 2021 at 4:36 PM Paolo Tosco <[email protected]>
wrote:
> Hi Tim,
>
> if you need access to the original text, you'll have to do the chunking
> yourself, e.g.:
>
> import gzip
> from rdkit import Chem
>
> def molgen(hnd):
>     # Accumulate lines until the "$$$$" record terminator, then yield the record
>     mol_text_tmp = ""
>     while True:
>         line = hnd.readline()
>         if not line:
>             return
>         line = line.decode("utf-8")
>         mol_text_tmp += line
>         if line.startswith("$$$$"):
>             mol_text = mol_text_tmp
>             mol_text_tmp = ""
>             yield mol_text
>
> with gzip.open("yourfile.sdf.gz", "rb") as gzip_hnd:
>     for mol_text in molgen(gzip_hnd):
>         print(mol_text)
>         # Parse the single record, including its SD properties
>         suppl = Chem.SDMolSupplier()
>         suppl.SetData(mol_text)
>         mol = next(suppl)
>         print(mol.GetNumAtoms())
>         print("------------------")
>
> If you are happy with the RDKit-generated text, you can combine the
> ForwardSDMolSupplier with the SDWriter:
>
> import gzip
> from io import StringIO
> from rdkit import Chem
>
> with gzip.open("yourfile.sdf.gz", "rb") as gzip_hnd:
>     with Chem.ForwardSDMolSupplier(gzip_hnd) as suppl:
>         for mol in suppl:
>             # Write the molecule back out to an in-memory SD record
>             buf = StringIO()
>             with Chem.SDWriter(buf) as w:
>                 w.write(mol)
>             print(buf.getvalue())
>             print(mol.GetNumAtoms())
>             print("------------------")
>
> Cheers,
> p.
>
> On Thu, Nov 4, 2021 at 5:09 PM Tim Dudgeon <[email protected]> wrote:
>
>> I need to access the text of each record of an SDF, as well as
>> create a mol instance.
>> I was successfully doing this using SDMolSupplier.GetItemText().
>> Then I needed to switch to handling gzipped SD files, and SDMolSupplier
>> can only take a file name in its constructor.
>> ForwardSDMolSupplier can handle a gzip file-like instance, but doesn't
>> have the GetItemText() function.
>> Reading the file records as text is easy enough, but I can't figure out
>> how to get the SD file properties (Chem.MolFromMolBlock() does not handle
>> the properties).
>>
>> Seems like there should be an easy way to handle this that I'm not seeing!
>>
>> Tim
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>