Re: [Rdkit-discuss] SDF properties in case of error

2015-05-04 Thread Dimitri Maziuk
On 2015-05-03 15:06, Michael Reutlinger wrote: Well... I think my proposal should enable us to put more strict, robust QC in place, but I guess you are missing this point. My definition of strict and robust is: if the input is bad, what comes out is an out-of-band error signal. Such that

Re: [Rdkit-discuss] SDF properties in case of error

2015-05-03 Thread Markus Sitzmann
No, cutting out a chunk of lines from a file might be simple, but it can become an expensive operation if you want to deal with thousands of files and millions of records. That is one of the reasons why I (unfortunately) couldn't consider rdkit any further for one of my projects a few years ago. So, I

Re: [Rdkit-discuss] SDF properties in case of error

2015-05-02 Thread Greg Landrum
Hi Michael, What you request is certainly possible, but it is a pretty fundamental change in the way the supplier (and mol file parser) works, so it would need some thought. One concern that immediately occurs to me is that you will not be able to tell which molecules from the input file were

Re: [Rdkit-discuss] SDF properties in case of error

2015-05-02 Thread Michael Reutlinger
Hi Greg, thanks for your answer. I agree that the lighter-weight solution is certainly also a possibility and would clearly solve my (and possibly others') problem. Maybe a suppl.GetLastItemError() would then also be handy to get the error messages that usually are only visible in the log. But

Re: [Rdkit-discuss] SDF properties in case of error

2015-05-01 Thread Dimitri Maziuk
On 04/30/2015 05:01 PM, Michael Reutlinger wrote: However, in some cases this does not help. E.g. when an unknown atom (most of the time this is X) is found in the MolBlock, the import fails with a Post-condition Violation and None is yielded. This is fine to detect the problem BUT it is

[Rdkit-discuss] SDF properties in case of error

2015-04-30 Thread Michael Reutlinger
Hi all, I am currently working on a program which needs to process libraries of large SDF files. One requirement is to always produce a valid output including the molecule title/name or a specified property for referencing. By specifying sanitize=False with ForwardSDMolSupplier and using
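
The requirement in the original post (always recovering a title or a referencing property, even for a record the parser rejects) can be illustrated without the toolkit by reading the raw record text. The following is only a minimal sketch in plain Python, not RDKit's API and not the solution discussed in the thread: the function names iter_sdf_records and title_and_properties and the SAMPLE data are hypothetical, and it assumes records are terminated by a "$$$$" line and data items use the "> <NAME>" header form.

```python
import io
from typing import Dict, Iterator, List, TextIO, Tuple


def iter_sdf_records(handle: TextIO) -> Iterator[List[str]]:
    """Yield each SDF record as a list of lines, split on the '$$$$' delimiter."""
    record: List[str] = []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.strip() == "$$$$":
            yield record
            record = []
        else:
            record.append(line)
    # Keep a trailing record that lacks a closing delimiter.
    if any(item.strip() for item in record):
        yield record


def title_and_properties(record: List[str]) -> Tuple[str, Dict[str, str]]:
    """Return the title (first line) and the '> <NAME>' data items of one record."""
    title = record[0].strip() if record else ""
    props: Dict[str, str] = {}
    i = 0
    while i < len(record):
        line = record[i]
        if line.startswith(">") and "<" in line and ">" in line[line.index("<"):]:
            start = line.index("<") + 1
            name = line[start:line.index(">", start)]
            i += 1
            values: List[str] = []
            # A data item runs until the next blank line.
            while i < len(record) and record[i].strip():
                values.append(record[i])
                i += 1
            props[name] = "\n".join(values)
        else:
            i += 1
    return title, props


# Hypothetical two-record input; the first record contains an unknown atom "X".
SAMPLE = """mol_1
  hypothetical

  1  0  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 X   0  0  0  0  0  0  0  0  0  0  0  0
M  END
> <ID>
cmpd-001

$$$$
mol_2
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
$$$$
"""

for rec in iter_sdf_records(io.StringIO(SAMPLE)):
    print(title_and_properties(rec))
```

Because this never interprets the atom block, it cannot fail on an unknown atom type; the trade-off is that it reads each record a second time, which is the cost raised elsewhere in the thread for very large libraries.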