On 10/08/2013 11:35 AM, Geoffrey Hutchison wrote:

> I'm willing to hear suggestions as far as SMILES and InChI limits,
> but I don't even know if a 1000-atom SMILES would be particularly useful.
> That, plus the potential for overwhelming the computer (i.e., the
> computer would "hang" while it computes the SMILES and InChI) were the
> reason for these particular limits.

Finding duplicates is a basic database normalization task, and with
molecules you have to start with graph-based searches. Canonical SMILES
and InChI strings are supposed to be useful for that. I don't know if
they actually will be in our case, but unless I try I'll never know.
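
As a rough illustration of the duplicate search described above, here is a
Python sketch that groups records by a canonical string. The names
group_duplicates and toy_key are hypothetical. In practice the key function
would wrap whatever produces the canonical SMILES or InChI, which is not
shown here; toy_key is only a stand-in so the example runs on its own.

```python
from collections import defaultdict


def group_duplicates(records, canonical_key):
    """Group record ids whose structures map to the same canonical string.

    records: iterable of (record_id, structure) pairs.
    canonical_key: callable returning a canonical string for a structure
        (assumption: e.g. a wrapper that yields canonical SMILES or InChI).
    Returns only the groups that contain more than one record.
    """
    groups = defaultdict(list)
    for record_id, structure in records:
        groups[canonical_key(structure)].append(record_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def toy_key(smiles):
    # Stand-in for a real canonicalizer, NOT chemically meaningful:
    # it only makes differently ordered strings compare equal for the demo.
    return "".join(sorted(smiles))


if __name__ == "__main__":
    demo = [("a", "CCO"), ("b", "OCC"), ("c", "CCN")]
    print(group_duplicates(demo, toy_key))
```

Grouping on a string key keeps the comparison step cheap; the expensive part
is computing the canonical string once per record, which is what can be
farmed out to a cluster.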

Our off-the-wall DB averages are 96 residues/entity and 46 atoms/residue
-- IRL the ones we can work with will have fewer atoms, but some (many?)
will have longer sequences.

So that's 4K+ atoms on average. The part where the computer "hangs" is,
well, yeah, what'd you expect. I've an entire compute cluster to ship this
to, and I don't care if it takes a week to calculate: the source sequences
won't change while it's crunching.
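
For the record, the back-of-the-envelope arithmetic behind the "4K+" figure
looks like this. Multiplying the two averages only gives an estimate of the
per-entity average, and the variable names are just for illustration.

```python
# Averages quoted above.
residues_per_entity = 96
atoms_per_residue = 46

# Rough estimate of atoms per entity (product of the two averages).
atoms_per_entity = residues_per_entity * atoms_per_residue
print(atoms_per_entity)  # 4416
```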

It may well be worth keeping the limit if the job's started from the GUI
-- or at least popping up a dialog box saying "are you sure you wanna do
this?". And even then I'd say the issue is memory usage: these days you
probably have at least 3 more cores for other stuff, so as long as OB
doesn't gobble up all the RAM it shouldn't "hang".
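
One way the "limit only in the GUI" idea could look, as a minimal Python
sketch. This is not how Open Babel or any GUI actually does it: should_run,
its parameters, and the 1000-atom value in the demo (borrowed from the quoted
message) are all assumptions for illustration.

```python
def should_run(atom_count, limit, interactive, confirm=None):
    """Decide whether to start an expensive SMILES/InChI calculation.

    atom_count: number of atoms in the molecule.
    limit: size above which an interactive user is asked first (hypothetical).
    interactive: True when started from a GUI, False for batch jobs.
    confirm: callable taking a prompt string and returning True/False;
        only consulted for oversized interactive jobs.
    """
    if atom_count <= limit:
        return True   # small enough, just run
    if not interactive:
        return True   # batch job: no limit, let it crunch
    if confirm is None:
        return False  # nobody to ask, so keep the limit
    return bool(confirm(f"{atom_count} atoms: are you sure you want to do this?"))


if __name__ == "__main__":
    # Batch job on a ~4K-atom entity: runs without asking.
    print(should_run(4416, limit=1000, interactive=False))
    # GUI job of the same size: runs only if the user says yes.
    print(should_run(4416, limit=1000, interactive=True, confirm=lambda msg: False))
```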

my $.02
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
