Re: [Cocci] [PATCH] coccinelle: add pycocci wrapper for multithreaded support

2014-08-28 Thread Luis R. Rodriguez
On Thu, Aug 28, 2014 at 08:15:16PM +0200, Julia Lawall wrote:
> 
> 
> On Thu, 28 Aug 2014, Luis R. Rodriguez wrote:
> 
> > On Thu, Apr 10, 2014 at 09:32:34PM +0200, Julia Lawall wrote:
> > > 
> > > 
> > > On Thu, 10 Apr 2014, Luis R. Rodriguez wrote:
> > > 
> > > > On Thu, Apr 10, 2014 at 07:51:29PM +0200, Johannes Berg wrote:
> > > > > On Thu, 2014-04-10 at 10:48 -0700, Luis R. Rodriguez wrote:
> > > > > 
> > > > > > You just pass it a cocci file and a target dir, and in git environments
> > > > > > you always want --in-place enabled. Experiments profiling random
> > > > > > cocci files with the Linux kernel show that using just the number of
> > > > > > CPUs doesn't scale well, given that lots of buckets of files don't
> > > > > > require work; as such this uses 10 * the number of CPUs for its number
> > > > > > of threads. For work that defines more general rules 3 * the number of
> > > > > > CPUs works better, but for smaller cocci files 10 * the number of CPUs
> > > > > > performs best right now. To experiment more with what's going on with
> > > > > > the multithreading one can run htop while kicking off a cocci task on
> > > > > > the kernel; we want to keep these CPUs as busy as possible.
> > > > > 
> > > > > That's not really a good benchmark; you want to actually check how
> > > > > quickly it finishes ... If you have some IO issues then just keeping the
> > > > > CPUs busy trying to do IO won't help at all.
> > > > 
> > > > I checked the profile results; the reason the jobs finish early is that
> > > > some threads had little or no work. That's why I increased the number of
> > > > threads, depending on the context (long or short cocci expected, in
> > > > backports at least: the long being all cocci files in one, the short
> > > > being the --test-cocci flag to gentree.py). This wrapper uses the short
> > > > assumption with 10 * num_cpus.
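The sizing heuristic described above can be sketched in Python (a hypothetical helper; pycocci itself may structure this differently):

```python
import multiprocessing

def worker_count(short_cocci=True):
    # Oversubscribe threads: many buckets of files need no work, so their
    # threads exit almost immediately and the CPUs would otherwise idle.
    # 10 * CPUs suits small cocci files; 3 * CPUs suits broader rule sets.
    ncpus = multiprocessing.cpu_count()
    return (10 if short_cocci else 3) * ncpus
```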
> > > > 
> > > > > > Since it's just a helper I toss it into the python directory but
> > > > > > don't install it. The hope is that we can evolve it there instead of
> > > > > > carrying this helper within backports.
> > > > > 
> > > > > If there's a plan to make coccinelle itself multi-threaded, what's the
> > > > > point?
> > > > 
> > > > To be clear, Coccinelle *has* a form of multithreaded support, but it
> > > > requires manually spawning the jobs, passing each new process the total
> > > > job count and the job number that the process represents. There are
> > > > plans to consider reworking things to handle all this internally, but
> > > > as I discussed with Julia the changes required would mean some
> > > > structural changes, and as such we need to live with this for a bit
> > > > longer. I need to use Coccinelle daily now, so I figured I'd punt this
> > > > out there in case others might make use of it.
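The manual spawning described here can be sketched as building one spatch command per job slot. This is only an illustration: the flag names (-sp_file, -in_place, -dir, -max, -index) follow older spatch releases and are an assumption; check spatch --help for your version.

```python
def spatch_commands(cocci_file, target_dir, jobs):
    # Each spawned spatch handles the slice of files whose position modulo
    # -max matches its -index; together the slices cover the whole tree.
    return [
        ["spatch", "-sp_file", cocci_file, "-in_place",
         "-dir", target_dir, "-max", str(jobs), "-index", str(i)]
        for i in range(jobs)
    ]
```

A wrapper like pycocci would launch all of these processes at once and wait for them to finish.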
> > > 
> > > I agree with Luis.  Multithreading inside Coccinelle is currently a
> > > priority task, but not the highest priority one.
> > 
> > Folks, anyone object to merging pycocci in the meantime? I keep using it
> > outside of backports and it does what I think most kernel developers
> > expect. This would be until we get proper parallelism support in place.
> 
> Merge away...

Pushed.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Cocci] [PATCH] coccinelle: add pycocci wrapper for multithreaded support

2014-08-28 Thread Julia Lawall


On Thu, 28 Aug 2014, Luis R. Rodriguez wrote:

> On Thu, Apr 10, 2014 at 09:32:34PM +0200, Julia Lawall wrote:
> > 
> > 
> > On Thu, 10 Apr 2014, Luis R. Rodriguez wrote:
> > 
> > > On Thu, Apr 10, 2014 at 07:51:29PM +0200, Johannes Berg wrote:
> > > > On Thu, 2014-04-10 at 10:48 -0700, Luis R. Rodriguez wrote:
> > > > 
> > > > > You just pass it a cocci file and a target dir, and in git environments
> > > > > you always want --in-place enabled. Experiments profiling random
> > > > > cocci files with the Linux kernel show that using just the number of
> > > > > CPUs doesn't scale well, given that lots of buckets of files don't
> > > > > require work; as such this uses 10 * the number of CPUs for its number
> > > > > of threads. For work that defines more general rules 3 * the number of
> > > > > CPUs works better, but for smaller cocci files 10 * the number of CPUs
> > > > > performs best right now. To experiment more with what's going on with
> > > > > the multithreading one can run htop while kicking off a cocci task on
> > > > > the kernel; we want to keep these CPUs as busy as possible.
> > > > 
> > > > That's not really a good benchmark; you want to actually check how
> > > > quickly it finishes ... If you have some IO issues then just keeping the
> > > > CPUs busy trying to do IO won't help at all.
> > > 
> > > I checked the profile results; the reason the jobs finish early is that
> > > some threads had little or no work. That's why I increased the number of
> > > threads, depending on the context (long or short cocci expected, in
> > > backports at least: the long being all cocci files in one, the short
> > > being the --test-cocci flag to gentree.py). This wrapper uses the short
> > > assumption with 10 * num_cpus.
> > > 
> > > > > Since it's just a helper I toss it into the python directory but
> > > > > don't install it. The hope is that we can evolve it there instead of
> > > > > carrying this helper within backports.
> > > > 
> > > > If there's a plan to make coccinelle itself multi-threaded, what's the
> > > > point?
> > > 
> > > To be clear, Coccinelle *has* a form of multithreaded support, but it
> > > requires manually spawning the jobs, passing each new process the total
> > > job count and the job number that the process represents. There are
> > > plans to consider reworking things to handle all this internally, but
> > > as I discussed with Julia the changes required would mean some
> > > structural changes, and as such we need to live with this for a bit
> > > longer. I need to use Coccinelle daily now, so I figured I'd punt this
> > > out there in case others might make use of it.
> > 
> > I agree with Luis.  Multithreading inside Coccinelle is currently a
> > priority task, but not the highest priority one.
> 
> Folks, anyone object to merging pycocci in the meantime? I keep using it
> outside of backports and it does what I think most kernel developers
> expect. This would be until we get proper parallelism support in place.

Merge away...

julia


Re: [Cocci] [PATCH] coccinelle: add pycocci wrapper for multithreaded support

2014-08-27 Thread Luis R. Rodriguez
On Thu, Apr 10, 2014 at 09:32:34PM +0200, Julia Lawall wrote:
> 
> 
> On Thu, 10 Apr 2014, Luis R. Rodriguez wrote:
> 
> > On Thu, Apr 10, 2014 at 07:51:29PM +0200, Johannes Berg wrote:
> > > On Thu, 2014-04-10 at 10:48 -0700, Luis R. Rodriguez wrote:
> > > 
> > > > You just pass it a cocci file and a target dir, and in git environments
> > > > you always want --in-place enabled. Experiments profiling random
> > > > cocci files with the Linux kernel show that using just the number of
> > > > CPUs doesn't scale well, given that lots of buckets of files don't
> > > > require work; as such this uses 10 * the number of CPUs for its number
> > > > of threads. For work that defines more general rules 3 * the number of
> > > > CPUs works better, but for smaller cocci files 10 * the number of CPUs
> > > > performs best right now. To experiment more with what's going on with
> > > > the multithreading one can run htop while kicking off a cocci task on
> > > > the kernel; we want to keep these CPUs as busy as possible.
> > > 
> > > That's not really a good benchmark; you want to actually check how
> > > quickly it finishes ... If you have some IO issues then just keeping the
> > > CPUs busy trying to do IO won't help at all.
> > 
> > I checked the profile results; the reason the jobs finish early is that
> > some threads had little or no work. That's why I increased the number of
> > threads, depending on the context (long or short cocci expected, in
> > backports at least: the long being all cocci files in one, the short
> > being the --test-cocci flag to gentree.py). This wrapper uses the short
> > assumption with 10 * num_cpus.
> > 
> > > > Since it's just a helper I toss it into the python directory but
> > > > don't install it. The hope is that we can evolve it there instead of
> > > > carrying this helper within backports.
> > > 
> > > If there's a plan to make coccinelle itself multi-threaded, what's the
> > > point?
> > 
> > To be clear, Coccinelle *has* a form of multithreaded support, but it
> > requires manually spawning the jobs, passing each new process the total
> > job count and the job number that the process represents. There are
> > plans to consider reworking things to handle all this internally, but
> > as I discussed with Julia the changes required would mean some
> > structural changes, and as such we need to live with this for a bit
> > longer. I need to use Coccinelle daily now, so I figured I'd punt this
> > out there in case others might make use of it.
> 
> I agree with Luis.  Multithreading inside Coccinelle is currently a
> priority task, but not the highest priority one.

Folks, anyone object to merging pycocci in the meantime? I keep using it
outside of backports and it does what I think most kernel developers
expect. This would be until we get proper parallelism support in place.

  Luis


Re: [Cocci] [PATCH] coccinelle: add pycocci wrapper for multithreaded support

2014-04-11 Thread Luis R. Rodriguez
On Fri, Apr 11, 2014 at 08:01:04AM +0200, Julia Lawall wrote:
> 
> 
> On Fri, 11 Apr 2014, SF Markus Elfring wrote:
> 
> > > I checked the profile results; the reason the jobs finish early is that
> > > some threads had little or no work.
> > 
> > Could you find out during the data processing which parts or files
> > result in a special application behaviour you would like to point out here?
> 
> I don't understand the question at all, but since the various files have 
> different properties, it is hard to determine automatically in advance how 
> much work Coccinelle will need to do on each one.

For the person who might work on enhancing multithreading support, I'd wonder
if there could be gains from first putting in an effort to evaluate which
files have at least one rule hit, adding those files to an activity file list
to later be spread between the threads. As you note, though, it is hard to
determine this in advance given that each rule can express any change.

I think one small change which could help, and likely not incur a drastic
immediate change to the architecture, could be to not have threads take a
files / jobs sized list of files, but instead just take, say:

work_task_n = (files / jobs) / 10

The list of files needing work could then be kept on a list protected
against concurrent access by the threads, and each thread would only exit
once all the files have been worked on. This would make it possible to keep
only num_cpus threads, as each CPU would indeed stay busy all the time.
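The shared-list idea above can be sketched as follows (a minimal sketch, not actual pycocci or Coccinelle code; handle stands in for running a semantic patch on one file):

```python
import multiprocessing
import queue
import threading

def process_all(files, handle):
    # Keep the pending files on one thread-safe queue instead of handing
    # each thread a private slice up front; a thread only exits once the
    # queue is drained, so no CPU idles while work remains elsewhere.
    work = queue.Queue()
    for f in files:
        work.put(f)

    def worker():
        while True:
            try:
                f = work.get_nowait()
            except queue.Empty:
                return  # every file has been claimed
            handle(f)

    threads = [threading.Thread(target=worker)
               for _ in range(multiprocessing.cpu_count())]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With this shape the thread count can stay at num_cpus, since a thread that finishes one file immediately picks up another rather than dying with an empty private backlog.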

BTW, is the patch Acked-by you, Julia? Can we commit it? :)

  Luis




Re: [Cocci] [PATCH] coccinelle: add pycocci wrapper for multithreaded support

2014-04-11 Thread SF Markus Elfring
>> Could you find out during the data processing which parts or files
>> result in a special application behaviour you would like to point out here?
> I don't understand the question at all, but since the various files have 
> different properties, it is hard to determine automatically in advance how 
> much work Coccinelle will need to do on each one.

It was reported that system utilisation did not match expectations. I am
curious whether any more details or patterns can be determined for the
observed situation.

Do the involved files also need another look?
- semantic patches
- source directories

Regards,
Markus


Re: [Cocci] [PATCH] coccinelle: add pycocci wrapper for multithreaded support

2014-04-11 Thread Julia Lawall


On Fri, 11 Apr 2014, SF Markus Elfring wrote:

> > I checked the profile results; the reason the jobs finish early is that
> > some threads had little or no work.
> 
> Could you find out during the data processing which parts or files
> result in a special application behaviour you would like to point out here?

I don't understand the question at all, but since the various files have 
different properties, it is hard to determine automatically in advance how 
much work Coccinelle will need to do on each one.

julia


Re: [Cocci] [PATCH] coccinelle: add pycocci wrapper for multithreaded support

2014-04-10 Thread Julia Lawall


On Thu, 10 Apr 2014, Luis R. Rodriguez wrote:

> On Thu, Apr 10, 2014 at 07:51:29PM +0200, Johannes Berg wrote:
> > On Thu, 2014-04-10 at 10:48 -0700, Luis R. Rodriguez wrote:
> > 
> > > You just pass it a cocci file and a target dir, and in git environments
> > > you always want --in-place enabled. Experiments profiling random
> > > cocci files with the Linux kernel show that using just the number of
> > > CPUs doesn't scale well, given that lots of buckets of files don't
> > > require work; as such this uses 10 * the number of CPUs for its number
> > > of threads. For work that defines more general rules 3 * the number of
> > > CPUs works better, but for smaller cocci files 10 * the number of CPUs
> > > performs best right now. To experiment more with what's going on with
> > > the multithreading one can run htop while kicking off a cocci task on
> > > the kernel; we want to keep these CPUs as busy as possible.
> > 
> > That's not really a good benchmark; you want to actually check how
> > quickly it finishes ... If you have some IO issues then just keeping the
> > CPUs busy trying to do IO won't help at all.
> 
> I checked the profile results; the reason the jobs finish early is that
> some threads had little or no work. That's why I increased the number of
> threads, depending on the context (long or short cocci expected, in
> backports at least: the long being all cocci files in one, the short
> being the --test-cocci flag to gentree.py). This wrapper uses the short
> assumption with 10 * num_cpus.
> 
> > > Since it's just a helper I toss it into the python directory but
> > > don't install it. The hope is that we can evolve it there instead of
> > > carrying this helper within backports.
> > 
> > If there's a plan to make coccinelle itself multi-threaded, what's the
> > point?
> 
> To be clear, Coccinelle *has* a form of multithreaded support, but it
> requires manually spawning the jobs, passing each new process the total
> job count and the job number that the process represents. There are
> plans to consider reworking things to handle all this internally, but
> as I discussed with Julia the changes required would mean some
> structural changes, and as such we need to live with this for a bit
> longer. I need to use Coccinelle daily now, so I figured I'd punt this
> out there in case others might make use of it.

I agree with Luis.  Multithreading inside Coccinelle is currently a
priority task, but not the highest priority one.

julia

