On 16/09/2011 14:32, Noel O'Boyle wrote:
> Hi Chris,
>
> Earlier in the year there were a couple of suggestions for conversion
> features; one for splitting sdf files into chunks, and the other for
> using the filename as a descriptor. What do you think about these?
>
> http://forums.openbabel.org/Split-SDF-file-in-chunks-td3570274.html
> http://forums.openbabel.org/Add-filename-to-title-during-conversion-tc3579062.html

I agree that these would be useful. In June I started drafting a reply 
to the first one but never sent it. Here it is.

----
There is obviously a need for file splitting functionality, so last year 
I started to develop an option to do this in OpenBabel. This is not 
ready for release, largely because its scope was too ambitious. It was 
intended:
  - to work transparently with all types of multi-object files, not just
    sdf;
  - to copy any headers and footers into the child files (essential for
    XML formats and sometimes valuable for others);
  - to convert the format on the fly, if requested (it is much faster
    if there is no conversion);
  - and to work with very large files.

Identifying the headers and footers in a general way proved difficult, 
so perhaps I should resurrect the code with this feature restricted to 
XML files.
----
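
For the simplest case in the draft above (plain SDF, no format conversion, no headers or footers to copy), the splitting can be sketched in a few lines of Python. This is only an illustration and not the OpenBabel code: the function names and the "<stem>_<n>.sdf" naming scheme are made up for the example. It relies on each SDF record ending with a "$$$$" line, and it reads line by line so that a very large file is never held in memory.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_sdf_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield one SDF record at a time, including its '$$$$' terminator line."""
    record: List[str] = []
    for line in lines:
        record.append(line)
        if line.rstrip() == "$$$$":
            yield "".join(record)
            record = []
    if any(part.strip() for part in record):
        # Trailing record without a terminator: pass it through unchanged.
        yield "".join(record)


def iter_chunks(records: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Group records into lists of at most chunk_size records."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    it = iter(records)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def split_sdf(path: str, chunk_size: int, stem: str) -> List[str]:
    """Write chunk files named '<stem>_<n>.sdf' (hypothetical scheme)."""
    names: List[str] = []
    # newline="" keeps the original line endings untouched.
    with open(path, "r", newline="") as src:
        chunks = iter_chunks(iter_sdf_records(src), chunk_size)
        for n, chunk in enumerate(chunks, start=1):
            name = f"{stem}_{n}.sdf"
            with open(name, "w", newline="") as dst:
                dst.writelines(chunk)
            names.append(name)
    return names
```

Calling split_sdf("big.sdf", 1000, "part") would write part_1.sdf, part_2.sdf and so on, with at most 1000 molecules in each. The general problem in the draft (other formats, headers and footers, conversion on the fly) is not addressed by this sketch.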

A second type of SDF splitter would extend the -m option (one file per 
molecule) to name the files in a more useful and flexible way. This is 
written, but for full functionality it requires OBConversion to make 
the output filename available, which it does not do at present. 
Unfortunately that change breaks binary compatibility, so I was waiting 
until v2.4.0.
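
As an illustration of what "a more useful name" for per-molecule files might mean, here is a small Python sketch that derives each filename from the molecule title. It is an assumption about one possible naming rule, not the scheme implemented in OpenBabel; the helper names, the "mol<index>" fallback and the "_2", "_3" suffixes for duplicates are all hypothetical. It uses the fact that the title of an SDF record is its first line.

```python
import re
from typing import Set


def record_title(record: str) -> str:
    """Return the title of an SDF record: the first line of its molfile block."""
    first_line = record.split("\n", 1)[0]
    return first_line.strip()


def unique_filename(title: str, index: int, used: Set[str], ext: str = "sdf") -> str:
    """Build a filesystem-friendly, not-yet-used filename from a title.

    'used' holds lower-cased names handed out so far, so that names differing
    only in case do not clash on case-insensitive filesystems.
    """
    # Replace every run of characters outside a conservative safe set.
    base = re.sub(r"[^A-Za-z0-9._-]+", "_", title).strip("._")
    if not base:
        # Hypothetical fallback for untitled molecules.
        base = f"mol{index}"
    candidate = f"{base}.{ext}"
    n = 1
    while candidate.lower() in used:
        n += 1
        candidate = f"{base}_{n}.{ext}"
    used.add(candidate.lower())
    return candidate
```

With an initially empty set, two molecules both titled "aspirin" would be written to aspirin.sdf and aspirin_2.sdf, and an untitled third molecule to mol3.sdf.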

Adding a descriptor with the input file name sounds easy, but 
descriptors are about the properties of molecules and do not have access 
to OBConversion, which holds the filename. There are ways round this, 
the simplest being an op, --appendfilename. It sort of breaks the 
pattern but I'll have a go. Incidentally, in some formats the filename 
is used as the title if the molecule has none. In the past, I've found 
the lack of control over this annoying.
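
To make the intended effect concrete, here is a Python sketch that appends the input file name to the title line of every record in an SDF file. It works on the text directly and does not use OpenBabel, so it says nothing about how an --appendfilename op would be written; the function name and the "sep" parameter are made up for the example. An empty title is replaced by the file name alone.

```python
import os
from typing import Iterable, Iterator


def append_filename_to_titles(
    lines: Iterable[str], input_path: str, sep: str = " "
) -> Iterator[str]:
    """Yield the lines of an SDF file with the file name added to each title.

    Assumes no blank lines follow the final '$$$$'; such a line would be
    mistaken for the title of another record.
    """
    name = os.path.basename(input_path)
    at_title = True  # the first line of the file is the first record's title
    for line in lines:
        if at_title:
            body = line.rstrip("\r\n")
            ending = line[len(body):]  # preserve the original line ending
            title = body.strip()
            new_title = f"{title}{sep}{name}" if title else name
            yield new_title + ending
            at_title = False
        else:
            yield line
            if line.rstrip() == "$$$$":
                # The line after a record terminator is the next title.
                at_title = True
```

For a file /data/set1.sdf whose first molecule is titled "aspirin", the first title becomes "aspirin set1.sdf".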

Chris



_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel