[Open Babel] running Open Babel in parallel / distributed mode

2014-02-02 Thread Francois Berenger
Hello,

I do this almost every day, so I thought I should share it with this list.

In case you need to execute many Open Babel commands
and don't want to wait, you can execute them in parallel
on a multi-core computer.
Of course, the commands should be independent, for example
processing different datasets.

Let's say the commands are in a file called for_par.sh.
I developed a tool called PAR years ago that can do this:

par -i for_par.sh -v -o log

It will use all cores of the computer, display a completion
percentage and store all output messages in the file log.

If your user account can connect to several computers, e.g. via
SSH, then you can even run the commands in a distributed manner.
I use it daily on Linux, and I know some people have used it on
Mac OS X as well.

The project is here:

https://savannah.nongnu.org/projects/par

The paper is freely available here:

http://bioinformatics.oxfordjournals.org/content/26/22/2918.long

-- 
Best regards,
Francois Berenger.

--
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121051231&iu=/4140/ostg.clktrk
___
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss


Re: [Open Babel] running Open Babel in parallel / distributed mode

2014-02-03 Thread Maciek Wójcikowski
You can also use xargs.


Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl




Re: [Open Babel] running Open Babel in parallel / distributed mode

2014-02-03 Thread Igor Filippov
How is it different from GNU parallel?
http://www.gnu.org/software/bash/manual/html_node/GNU-Parallel.html

Igor




Re: [Open Babel] running Open Babel in parallel / distributed mode

2014-02-03 Thread Francois Berenger
On 02/04/2014 12:14 AM, Maciek Wójcikowski wrote:
> You can also use xargs.

Yes, xargs with the -P option, but the command lines are not trivial then.
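For illustration, here is a minimal sketch of the xargs route. It assumes an xargs that supports both -P and -0 (the GNU and BSD versions do) and a commands file holding one independent command per line. The function name run_commands_parallel and the default of 4 jobs are made up for this example.

```sh
# Run every line of a commands file as its own shell command,
# keeping at most JOBS of them running at the same time.
# Usage: run_commands_parallel COMMANDS_FILE [JOBS]
run_commands_parallel() {
    cmds_file=$1
    jobs=${2:-4}    # hypothetical default; set it to your number of cores
    # Turn newlines into NULs so that xargs takes each whole line as a
    # single argument (spaces and quotes included), then hand that line
    # to "sh -c" for execution.
    tr '\n' '\0' < "$cmds_file" | xargs -0 -n 1 -P "$jobs" sh -c
}
```

With the file from the first message, this would be called as: run_commands_parallel for_par.sh 8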

> 2014-02-03 16:10 GMT+01:00 Igor Filippov:
>
> How is it different from GNU parallel?
> http://www.gnu.org/software/bash/manual/html_node/GNU-Parallel.html

It should be quite similar in functionality.


-- 
Best regards,
Francois Berenger.



Re: [Open Babel] running Open Babel in parallel / distributed mode

2014-02-04 Thread Noel O'Boyle
It would be nice to see some explicit examples of how Open Babel might
be used in this way, using one or all of these tools.

- Noel



Re: [Open Babel] running Open Babel in parallel / distributed mode

2014-02-04 Thread Francois Berenger
On 2/4/14, 6:28 PM, Noel O'Boyle wrote:
> It would be nice to see some explicit examples of how Open Babel might
> be used in this way, using one or all of these tools.

Let's say you have a file commands.sh:
---
obabel some_file1 -Osome_other_file1.other_format
...
obabel some_file2 -Osome_other_file2.other_format
---

With PAR, you run them like this in parallel:

par -i commands.sh -v -o log # all your cores will be used by default

To check the logs (useful if you are being really careful about
what you are doing, e.g. when preparing datasets for scientific use),
I recommend:

sort -u log | less
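A commands file like the one above does not have to be written by hand. The following is a rough sketch of how it could be generated, assuming all input files sit in one directory, share one extension, and have no spaces in their names. The function name make_commands is hypothetical; the "obabel INPUT -OOUTPUT" form is the one used in the example above.

```sh
# Print one obabel conversion command per input file.
# Usage: make_commands INPUT_DIR INPUT_EXT OUTPUT_EXT > commands.sh
make_commands() {
    in_dir=$1
    in_ext=$2
    out_ext=$3
    for f in "$in_dir"/*."$in_ext"; do
        [ -e "$f" ] || continue    # the glob matched nothing
        # ${f%.*} drops the last extension: dir/a.smi becomes dir/a
        printf 'obabel %s -O%s\n' "$f" "${f%.*}.$out_ext"
    done
}
```

For example, make_commands datasets smi sdf > commands.sh would write one line per .smi file found in a directory called datasets.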


Re: [Open Babel] running Open Babel in parallel / distributed mode

2014-02-04 Thread Maciek Wójcikowski
Francois, could you please elaborate on the "log" argument? That might
be the only difference here. With either GNU parallel or xargs, standard
output and standard error are "FIFO", which makes them useless.
Other than that I can't see any advantage, since the parallel line would be:

cat commands.sh | parallel -P 8 > log 2> error.log
(or 2>&1 if someone prefers a joint log)
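One way around mixed-up output, sketched here under assumptions rather than taken from any of the tools discussed, is to give every command its own log file. The sketch assumes an xargs with -P and -0, a log directory name without quotes or other shell metacharacters, and one command per line. The name run_with_logs is made up.

```sh
# Run each non-empty line of a commands file in parallel and store the
# stdout and stderr of the command on line N in LOG_DIR/N.log.
# Usage: run_with_logs COMMANDS_FILE LOG_DIR [JOBS]
run_with_logs() {
    cmds_file=$1
    log_dir=$2
    jobs=${3:-4}    # hypothetical default number of parallel jobs
    mkdir -p "$log_dir"
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
        [ -n "$line" ] || continue    # skip empty lines
        # Emit one NUL-terminated wrapper script per command; the braces
        # group the command so both output streams go to its own log.
        printf '{ %s\n} > "%s/%d.log" 2>&1\0' "$line" "$log_dir" "$n"
    done < "$cmds_file" | xargs -0 -n 1 -P "$jobs" sh -c
}
```

Called as run_with_logs commands.sh logs 8, this would leave logs/1.log, logs/2.log and so on, which can then be inspected one command at a time.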


Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl



Re: [Open Babel] running Open Babel in parallel / distributed mode

2014-02-04 Thread David Hall
An example for gnu parallel

$ parallel -L1000 -k obabel -:{} -osdf --gen2D < gdb11_size09.smi > gdb11_size09.sdf

I've taken the gdb11 molecules of size 9, which is 444,313 molecules.
With -L1000, parallel reads 1000 lines at a time and passes them to
Open Babel for conversion to SDF.
With the "-k" option, the output SDF keeps the same order as the
input smi file.

Timing this, it takes 1m54s on my computer.

The serial version
$  obabel gdb11_size09.smi -O gdb11_size09.sdf --gen2D
takes  13m11s

These timings are on a six-core processor with hyperthreading, which gives
12 virtual cores, so maybe the hyperthreading accounts for the speedup being
a bit over 6x (about 6.9x). All of this works without splitting the input
into multiple files or writing a script with multiple commands.
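After a parallel run like this, it may be worth checking that no molecule was lost. A minimal sketch, assuming one SMILES per non-empty input line and relying on the fact that every SDF record ends with a line starting with "$$$$"; the function name check_conversion is made up, and a mismatch can also simply mean that some input lines failed to convert.

```sh
# Compare the number of input SMILES lines with the number of SDF records.
# Usage: check_conversion INPUT.smi OUTPUT.sdf
check_conversion() {
    # grep -c exits non-zero when the count is 0, hence the "|| true"
    n_in=$(grep -c . "$1" || true)              # non-empty input lines
    n_out=$(grep -c '^\$\$\$\$' "$2" || true)   # SDF record terminators
    if [ "$n_in" -eq "$n_out" ]; then
        echo "OK: $n_in molecules"
    else
        echo "MISMATCH: $n_in in, $n_out out" >&2
        return 1
    fi
}
```

For the run above this would be: check_conversion gdb11_size09.smi gdb11_size09.sdf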

-David





Re: [Open Babel] running Open Babel in parallel / distributed mode

2014-02-04 Thread Francois Berenger
On 02/04/2014 08:09 PM, Maciek Wójcikowski wrote:
> Francois, could you please elaborate on the "log" argument? Because,
> that might be the only difference here. Using either gnu parallel or
> xargs standard output and error are "FIFO", which makes them useless.
> Other than that I can't see any advantage, since parallel line would be:

Here are the most useful PAR options:

# par
-i or -c is mandatory
Usage: parallel.py [options] {-i | -c} ...
Execute commands in a parallel and/or distributed way.

Options:
  -h, --help            show this help message and exit
  -c SERVER_NAME, --client=SERVER_NAME
                        read commands from a server instead of a file
                        (incompatible with -i)
  -i COMMANDS_FILE, --input=COMMANDS_FILE
                        /dev/stdin for example (incompatible with -c)
  -o OUTPUT_FILE, --output=OUTPUT_FILE
                        log to a file instead of stdout
  -s, --server          accept remote workers
  -v, --verbose         enable progress bar
  -w NB_LOCAL_WORKERS, --workers=NB_LOCAL_WORKERS
                        number of local worker threads, must be >= 0, default
                        is number of detected cores, very probably 0 if your
                        OS is not Linux

The biggest difference I see is that parallel is a GNU project
and is packaged for several Linux distributions; so it can
be installed automatically. PAR could probably be packaged
as an easy_install package for Python.

It would be nice to see a benchmark of PAR versus GNU parallel.

Since I am the author of PAR, I have not switched to other software,
because I can add features to mine as I wish. :)

> cat commands.sh | parallel -P 8 > log 2> error.log
> (or 2>&1 if someone prefers joint log)