Sounds like you are on top of this. :-)

While on the subject of things that must wait until OB 2.4 (I think),
I was talking to Andrew Dalke recently and he mentioned it would be
useful to have the fingerprint formats report the number of bits. This
is important for certain types of similarity calculations (not
Tanimoto).
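
To illustrate why the bit count matters, here is a minimal Python sketch. It is not Open Babel code: the function names, the set-of-on-bits representation and the 1024/2048 lengths are assumptions made for the example. Tanimoto only needs the on-bits of the two fingerprints, whereas a measure that also counts shared off-bits, such as the simple matching coefficient, cannot be computed without the total number of bits.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto from sets of on-bit indices; the fingerprint length is not needed."""
    union = len(a | b)
    # Two empty fingerprints: returning 1.0 is a convention chosen for this sketch.
    return len(a & b) / union if union else 1.0


def simple_matching(a: set[int], b: set[int], nbits: int) -> float:
    """Fraction of bit positions where both fingerprints agree (both on or both off)."""
    if nbits <= 0:
        raise ValueError("nbits must be positive")
    differing = len(a ^ b)  # positions set in exactly one fingerprint
    return (nbits - differing) / nbits


# Hypothetical fingerprints, given as the indices of their on-bits.
fp1 = {1, 5, 9, 200}
fp2 = {1, 5, 77}

print(tanimoto(fp1, fp2))               # independent of the fingerprint length
print(simple_matching(fp1, fp2, 1024))  # changes with the declared length
print(simple_matching(fp1, fp2, 2048))
```

The same pair of fingerprints gives a different simple matching score for each declared length, which is why a format would have to report the number of bits.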

- Noel

On 16 September 2011 19:25, Chris Morley <c.mor...@gaseq.co.uk> wrote:
> On 16/09/2011 14:32, Noel O'Boyle wrote:
>>
>> Hi Chris,
>>
>> Earlier in the year there were a couple of suggestions for conversion
>> features; one for splitting sdf files into chunks, and the other for
>> using the filename as a descriptor. What do you think about these?
>>
>> http://forums.openbabel.org/Split-SDF-file-in-chunks-td3570274.html
>>
>> http://forums.openbabel.org/Add-filename-to-title-during-conversion-tc3579062.html
>
> I agree that these would be useful. In June I started to draft a reply on
> the first one, which I didn't send. But here it is.
>
> ----
> There is obviously a need for file splitting functionality, so last year I
> started to develop an option to do this in OpenBabel. This is not ready for
> release, largely because its scope was too ambitious. It was intended:
>  - to work transparently with all types of multi-object files, not just sdf;
>  - to copy any headers and footers into the child files (essential for XML
> formats and sometimes valuable for others);
>  - to convert the format on the fly, if requested (it is much faster if
> there is no conversion);
>  - and to work with very large files.
>
> Identifying the headers and footers in a general way was difficult, and maybe
> I should resurrect the code with this feature restricted to XML files.
> ----
>
> A second type of SDF splitter would extend the -m option (one file per
> molecule) so that the files can be named in a more useful and flexible way.
> This is written, but full functionality requires OBConversion to make the
> output filename available, which it does not do at present.
> Unfortunately this breaks binary compatibility and so I was waiting until
> v2.4.0.
>
> Adding a descriptor with the input file name sounds easy, but descriptors
> are about the properties of molecules and do not have access to OBConversion
> which holds the filename. There are ways round this, the simplest being an
> op --appendfilename. It sort of breaks the pattern but I'll have a go.
> Incidentally, in some formats if there is no molecule title the filename is
> used instead. In the past, I've found the lack of control over this
> annoying.
>
> Chris
>
>
>
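
As a rough illustration of the simplest case in the quoted message (SD files only, no header or footer handling, no format conversion), chunked splitting can be sketched in plain Python. This is not the unreleased Open Babel option described above; the function names and the `<stem>_<n>.sdf` naming scheme are hypothetical. The file is read one line at a time and only a single record is held in memory, so it should cope with very large inputs.

```python
from pathlib import Path


def iter_sdf_records(lines):
    """Yield one SD record at a time; each record ends with a '$$$$' line."""
    record = []
    for line in lines:
        record.append(line)
        if line.rstrip("\r\n") == "$$$$":
            yield "".join(record)
            record = []
    # Keep an unterminated final record, but drop trailing blank lines.
    if any(part.strip() for part in record):
        yield "".join(record)


def split_sdf(src, records_per_chunk, out_dir):
    """Write src as files of at most records_per_chunk molecules each.

    Returns the list of chunk paths in the order they were written.
    """
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    src_path = Path(src)
    out_paths = []
    out = None
    try:
        # newline="" keeps the original line endings untouched.
        with src_path.open("r", newline="") as fin:
            for i, record in enumerate(iter_sdf_records(fin)):
                if i % records_per_chunk == 0:
                    if out is not None:
                        out.close()
                    # Hypothetical naming scheme: <stem>_<chunk number>.sdf
                    name = f"{src_path.stem}_{len(out_paths) + 1}.sdf"
                    path = Path(out_dir) / name
                    out = path.open("w", newline="")
                    out_paths.append(path)
                out.write(record)
    finally:
        if out is not None:
            out.close()
    return out_paths
```

For example, `split_sdf("mols.sdf", 1000, "chunks")` would write `chunks/mols_1.sdf`, `chunks/mols_2.sdf` and so on, assuming the `chunks` directory already exists. Records are copied byte for byte, which matches the observation above that splitting is much faster when no conversion is involved.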

_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel
