Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-26 Thread Junio C Hamano
Johannes Schindelin  writes:

>> > The corresponding times for me were:
>> >
>> >  (master)   (with the series)
>> >real0m9.760s  real  0m5.744s
>> >user0m0.531s  user  0m0.656s
>> >sys 0m5.726s  sys   0m3.520s
>> >
>> > So, yes, a noticeable improvement! :)
>> 
>> Same here, on Windows built with the old msysgit environment:
>> 
>> (master) (with the series)
>> real0m3.147s  real0m1.947s
>> user0m0.016s  user0m0.000s
>> sys 0m0.015s  sys 0m0.031s
>> 
>> Although I tested 'git am patches/*' where the patches/* are the result of
>> 'git-format-patch v2.6.1..github/jc/mailinfo-lib'.
>
> 2.548s vs 2.068s here.
>
> Ciao,
> Johannes

Thanks.  Let's start thinking about moving it forward.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-26 Thread Johannes Schindelin
Hi,

On Wed, 21 Oct 2015, Johannes Sixt wrote:

> Am 21.10.2015 um 17:51 schrieb Ramsay Jones:
> > On 20/10/15 22:24, Junio C Hamano wrote:
> > > Junio C Hamano  writes:
> > > some numbers on my desktop (Dell T3500 2.66GHz Xeon X5650 with 12GB,
> > > running Ubuntu),
> >
> > I suspect that I haven't tested exactly the same version as you, but I had
> > a quick look at testing this on Cygwin today. I have included a complete
> > transcript (below), so you can see what I did wrong! :-P
> >
> > >
> > > Between 'master' and the version with this series (on 'jch'),
> > > applying this 34-patch series itself on top of 'master' using "git
> > > am", best of 5 numbers for running:
> > >
> > >  time git am mbox >/dev/null
> > >
> > > are
> > >
> > >(master) (with the series)
> > >  real0m0.648sreal0m0.537s
> > >  user0m0.358suser0m0.338s
> > >  sys 0m0.172ssys 0m0.154s
> > >
> >
> > The corresponding times for me were:
> >
> >  (master)   (with the series)
> >real 0m9.760s  real  0m5.744s
> >user 0m0.531s  user  0m0.656s
> >sys  0m5.726s  sys   0m3.520s
> >
> > So, yes, a noticeable improvement! :)
> 
> Same here, on Windows built with the old msysgit environment:
> 
> (master) (with the series)
> real0m3.147s  real0m1.947s
> user0m0.016s  user0m0.000s
> sys 0m0.015s  sys 0m0.031s
> 
> Although I tested 'git am patches/*' where the patches/* are the result of
> 'git-format-patch v2.6.1..github/jc/mailinfo-lib'.

2.548s vs 2.068s here.

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-21 Thread Ramsay Jones


On 20/10/15 22:24, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
>> During the discussion on the recent "git am" regression, I noticed
>> that the command reimplemented in C spawns one "mailsplit" and then
>> spawns "mailinfo" followed by "apply --index" to commit the changes
>> described in each message.  As there are platforms where spawning
>> subprocess via run_command() interface is heavy-weight, something
>> that is conceptually very simple like "mailinfo" is better called
>> directly inside the process---something that is lightweight and
>> frequently used is where the overhead of run_command() would be felt
>> most.
> 
> Although I still haven't seen any offer to help from those who work
> on the platforms that may benefit from this series the most, I have
> some numbers on my desktop (Dell T3500 2.66GHz Xeon X5650 with 12GB,
> running Ubuntu), where the cost of spawning is not as costly as
> elsewhere, making this series less pressing.

I suspect that I haven't tested exactly the same version as you, but I had
a quick look at testing this on Cygwin today. I have included a complete
transcript (below), so you can see what I did wrong! :-P

> 
> Between 'master' and the version with this series (on 'jch'),
> applying this 34-patch series itself on top of 'master' using "git
> am", best of 5 numbers for running:
> 
> time git am mbox >/dev/null
> 
> are
> 
>   (master) (with the series)
> real0m0.648sreal0m0.537s
> user0m0.358suser0m0.338s
> sys 0m0.172ssys 0m0.154s
> 

The corresponding times for me were:

(master)   (with the series)
  real  0m9.760s  real  0m5.744s
  user  0m0.531s  user  0m0.656s
  sys   0m5.726s  sys   0m3.520s

So, yes, a noticeable improvement! :)

HTH

ATB,
Ramsay Jones

  $ uname -a
  CYGWIN_NT-6.3 satellite 2.2.1(0.289/5/3) 2015-08-20 11:42 x86_64 Cygwin
  $ pwd
  /home/ramsay/git
  $ git log --decorate --oneline -1
  74301d6 (HEAD -> master, origin/master, origin/HEAD) Sync with maint
  $ ./git version
  git version 2.6.2.280.g74301d6
  $ git format-patch --stdout 2a5ce7c^..896df93 >mailinfo.mbox
  $ git format-patch --stdout a4106a8^..559e247 >>mailinfo.mbox
  $ git checkout -b master-mailinfo master
  Switched to a new branch 'master-mailinfo'
  $ time ./git am mailinfo.mbox
  Applying: mailinfo: remove a no-op call convert_to_utf8(it, "")
  Applying: mailinfo: fold decode_header_bq() into decode_header()
  Applying: mailinfo: fix an off-by-one error in the boundary stack
  Applying: mailinfo: explicitly close file handle to the patch output
  Applying: mailinfo: move handle_boundary() lower
  Applying: mailinfo: get rid of function-local static states
  Applying: mailinfo: do not let handle_body() touch global "line" directly
  Applying: mailinfo: do not let handle_boundary() touch global "line" directly
  Applying: mailinfo: do not let find_boundary() touch global "line" directly
  Applying: mailinfo: move global "line" into mailinfo() function
  Applying: mailinfo: introduce "struct mailinfo" to hold globals
  Applying: mailinfo: move keep_subject & keep_non_patch_bracket to struct 
mailinfo
  Applying: mailinfo: move global "FILE *fin, *fout" to struct mailinfo
  Applying: mailinfo: move filter/header stage to struct mailinfo
  Applying: mailinfo: move patch_lines to struct mailinfo
  Applying: mailinfo: move add_message_id and message_id to struct mailinfo
  Applying: mailinfo: move use_scissors and use_inbody_headers to struct 
mailinfo
  Applying: mailinfo: move metainfo_charset to struct mailinfo
  Applying: mailinfo: move check for metainfo_charset to convert_to_utf8()
  Applying: mailinfo: move transfer_encoding to struct mailinfo
  Applying: mailinfo: move charset to struct mailinfo
  Applying: mailinfo: move cmitmsg and patchfile to struct mailinfo
  Applying: mailinfo: move [ps]_hdr_data to struct mailinfo
  Applying: mailinfo: move content/content_top to struct mailinfo
  Applying: mailinfo: handle_commit_msg() shouldn't be called after finding 
patchbreak
  Applying: mailinfo: keep the parsed log message in a strbuf
  Applying: mailinfo: move read_one_header_line() closer to its callers
  Applying: mailinfo: move check_header() after the helpers it uses
  Applying: mailinfo: move cleanup_space() before its users
  Applying: mailinfo: move definition of MAX_HDR_PARSED closer to its use
  Applying: mailinfo: libify
  Applying: mailinfo: handle charset conversion errors in the caller
  Applying: mailinfo: remove calls to exit() and die() deep in the callchain
  Applying: am: make direct call to mailinfo
  Applying: mailinfo: plug strbuf leak during continuation line handling
  
  real  0m9.760s
  user  0m0.531s
  sys   0m5.726s
  $ 
  
  $ make clean >/dev/null 2>&1
  $ make >out.mi 2>&1
  $ ./git version
  git version 2.6.2.315.g1e9f6ff
  $ git describe
  v2.6.2-315-g1e9f6ff
  $ git checkout -b 

Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-21 Thread Johannes Sixt

Am 21.10.2015 um 17:51 schrieb Ramsay Jones:

On 20/10/15 22:24, Junio C Hamano wrote:

Junio C Hamano  writes:
some numbers on my desktop (Dell T3500 2.66GHz Xeon X5650 with 12GB,
running Ubuntu),


I suspect that I haven't tested exactly the same version as you, but I had
a quick look at testing this on Cygwin today. I have included a complete
transcript (below), so you can see what I did wrong! :-P



Between 'master' and the version with this series (on 'jch'),
applying this 34-patch series itself on top of 'master' using "git
am", best of 5 numbers for running:

 time git am mbox >/dev/null

are

   (master) (with the series)
 real0m0.648sreal0m0.537s
 user0m0.358suser0m0.338s
 sys 0m0.172ssys 0m0.154s



The corresponding times for me were:

 (master)   (with the series)
   real 0m9.760s  real  0m5.744s
   user 0m0.531s  user  0m0.656s
   sys  0m5.726s  sys   0m3.520s

So, yes, a noticeable improvement! :)


Same here, on Windows built with the old msysgit environment:

(master) (with the series)
real0m3.147s  real0m1.947s
user0m0.016s  user0m0.000s
sys 0m0.015s  sys 0m0.031s

Although I tested 'git am patches/*' where the patches/* are the result 
of 'git-format-patch v2.6.1..github/jc/mailinfo-lib'.


-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-21 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 21.10.2015 um 17:51 schrieb Ramsay Jones:
>> On 20/10/15 22:24, Junio C Hamano wrote:
>>>
>>>  time git am mbox >/dev/null
>>>
>>> are
>>>
>>>(master) (with the series)
>>>  real0m0.648sreal0m0.537s
>>>  user0m0.358suser0m0.338s
>>>  sys 0m0.172ssys 0m0.154s
>>>
>>
>> The corresponding times for me were:
>>
>>  (master)   (with the series)
>>real  0m9.760s  real  0m5.744s
>>user  0m0.531s  user  0m0.656s
>>sys   0m5.726s  sys   0m3.520s
>>
>> So, yes, a noticeable improvement! :)
>
> Same here, on Windows built with the old msysgit environment:
>
> (master) (with the series)
> real0m3.147s  real0m1.947s
> user0m0.016s  user0m0.000s
> sys 0m0.015s  sys 0m0.031s
>
> Although I tested 'git am patches/*' where the patches/* are the
> result of 'git-format-patch v2.6.1..github/jc/mailinfo-lib'.

Thanks, both.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-20 Thread Junio C Hamano
Junio C Hamano  writes:

> During the discussion on the recent "git am" regression, I noticed
> that the command reimplemented in C spawns one "mailsplit" and then
> spawns "mailinfo" followed by "apply --index" to commit the changes
> described in each message.  As there are platforms where spawning
> subprocess via run_command() interface is heavy-weight, something
> that is conceptually very simple like "mailinfo" is better called
> directly inside the process---something that is lightweight and
> frequently used is where the overhead of run_command() would be felt
> most.

Although I still haven't seen any offer to help from those who work
on the platforms that may benefit from this series the most, I have
some numbers on my desktop (Dell T3500 2.66GHz Xeon X5650 with 12GB,
running Ubuntu), where the cost of spawning is not as costly as
elsewhere, making this series less pressing.

Between 'master' and the version with this series (on 'jch'),
applying this 34-patch series itself on top of 'master' using "git
am", best of 5 numbers for running:

time git am mbox >/dev/null

are

  (master) (with the series)
real0m0.648sreal0m0.537s
user0m0.358suser0m0.338s
sys 0m0.172ssys 0m0.154s

I haven't re-read the series to catch leaks yet, so in that sense
the series is still not ready to be merged to 'next' (of course,
help with fresh set of eyes is very much appreciated here).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-20 Thread Stefan Beller
On Tue, Oct 20, 2015 at 2:24 PM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> During the discussion on the recent "git am" regression, I noticed
>> that the command reimplemented in C spawns one "mailsplit" and then
>> spawns "mailinfo" followed by "apply --index" to commit the changes
>> described in each message.  As there are platforms where spawning
>> subprocess via run_command() interface is heavy-weight, something
>> that is conceptually very simple like "mailinfo" is better called
>> directly inside the process---something that is lightweight and
>> frequently used is where the overhead of run_command() would be felt
>> most.
>
> Although I still haven't seen any offer to help from those who work
> on the platforms that may benefit from this series the most, I have
> some numbers on my desktop (Dell T3500 2.66GHz Xeon X5650 with 12GB,
> running Ubuntu), where the cost of spawning is not as costly as
> elsewhere, making this series less pressing.

As far as I understand, this only helps for mailing list workflows, which
in my limited view of the world is only found in established infrastructure
projects, who tend to be maintained by people who run some kind of
ab-nomination of unix.
("Are there people in Windows land who heavily use the email workflow?")

>
> Between 'master' and the version with this series (on 'jch'),
> applying this 34-patch series itself on top of 'master' using "git
> am", best of 5 numbers for running:
>
> time git am mbox >/dev/null
>
> are
>
>   (master) (with the series)
> real0m0.648sreal0m0.537s
> user0m0.358suser0m0.338s
> sys 0m0.172ssys 0m0.154s
>
> I haven't re-read the series to catch leaks yet, so in that sense
> the series is still not ready to be merged to 'next' (of course,
> help with fresh set of eyes is very much appreciated here).

I'll take another look after sending out my series.

> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-20 Thread Junio C Hamano
Stefan Beller  writes:

> As far as I understand, this only helps for mailing list workflows, which
> in my limited view of the world is only found in established infrastructure
> projects, who tend to be maintained by people who run some kind of
> ab-nomination of unix.
> ("Are there people in Windows land who heavily use the email workflow?")

Forgot that "am" is a workhorse behind "rebase"?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-20 Thread Stefan Beller
On Tue, Oct 20, 2015 at 3:06 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> As far as I understand, this only helps for mailing list workflows, which
>> in my limited view of the world is only found in established infrastructure
>> projects, who tend to be maintained by people who run some kind of
>> ab-nomination of unix.
>> ("Are there people in Windows land who heavily use the email workflow?")
>
> Forgot that "am" is a workhorse behind "rebase"?
>

Yes, I was deceived by the name of the series ("it must be mail related and only
mail related" was my thinking).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 00/34] libify mailinfo and call it directly from am

2015-10-19 Thread Junio C Hamano
During the discussion on the recent "git am" regression, I noticed
that the command reimplemented in C spawns one "mailsplit" and then
spawns "mailinfo" followed by "apply --index" to commit the changes
described in each message.  As there are platforms where spawning
subprocess via run_command() interface is heavy-weight, something
that is conceptually very simple like "mailinfo" is better called
directly inside the process---something that is lightweight and
frequently used is where the overhead of run_command() would be felt
most.

I think this round is ready for 'next'.  Relative to the previous
round, the changes are:

 * Editorial fixes on log messages.

 * The previous round leaked some fields in struct mailinfo upon
   completion (of course, inherited from the original that let the
   system clean them up upon process termination).  clear_mailinfo()
   has been enhanced to clear them.

 * The step to remove the global "line" variable has been split into
   multiple steps.

 * The step to move metainfo_charset to the struct has been split
   into two.

And here are the patches.

  mailinfo: remove a no-op call convert_to_utf8(it, "")
  mailinfo: fold decode_header_bq() into decode_header()
  mailinfo: fix an off-by-one error in the boundary stack
  mailinfo: explicitly close file handle to the patch output
  mailinfo: move handle_boundary() lower
  mailinfo: get rid of function-local static states

Mostly unchanged other than editorial fixes on their log messages.

  mailinfo: do not let handle_body() touch global "line" directly
  mailinfo: do not let handle_boundary() touch global "line" directly
  mailinfo: do not let find_boundary() touch global "line" directly
  mailinfo: move global "line" into mailinfo() function

After Stefan's review comments, I wanted to be really sure that this
conversion was correct.  Blindingly replacing the reference to the
global with a pointer passed from the callern is not sufficient to
ensure that the change is a no-op; the patches needed to show that
the pointer passed from the caller is always the global "line".  For
that, the patch [v2 7/31] needed to be split further into 3 patches
to convert three functions, each of which always is called with the
pointer that points at the global "line", in separate steps.

  mailinfo: introduce "struct mailinfo" to hold globals

The previous round was lazy and did not introduce clear_mailinfo()
until later, leaking the two strbuf moved into the struct.  The
original was leaking the global anyway, so it is not a big deal, but
this round adds corresponding clean-up as the patches move global
variables to the struct, which should make it harder to miss
forgotten clean-up.

  mailinfo: move keep_subject & keep_non_patch_bracket to struct mailinfo
  mailinfo: move global "FILE *fin, *fout" to struct mailinfo
  mailinfo: move filter/header stage to struct mailinfo
  mailinfo: move patch_lines to struct mailinfo
  mailinfo: move add_message_id and message_id to struct mailinfo
  mailinfo: move use_scissors and use_inbody_headers to struct mailinfo
  mailinfo: move metainfo_charset to struct mailinfo

Mostly unchanged, except for the clean-up in clear_mailinfo().

  mailinfo: move check for metainfo_charset to convert_to_utf8()

Eric's review noticed that the previous round conflated this step
in the previous step.  Separated the change to its own commit.

  mailinfo: move transfer_encoding to struct mailinfo
  mailinfo: move charset to struct mailinfo
  mailinfo: move cmitmsg and patchfile to struct mailinfo
  mailinfo: move [ps]_hdr_data to struct mailinfo
  mailinfo: move content/content_top to struct mailinfo

Mostly unchanged, except for the clean-up in clear_mailinfo().

  mailinfo: handle_commit_msg() shouldn't be called after finding patchbreak
  mailinfo: keep the parsed log message in a strbuf

These two were placed earlier in the series in the previous round,
but are not about moving globals to struct.  This round finishes
moving the globals to struct mailinfo before doing these two steps.

  mailinfo: move read_one_header_line() closer to its callers
  mailinfo: move check_header() after the helpers it uses
  mailinfo: move cleanup_space() before its users
  mailinfo: move definition of MAX_HDR_PARSED closer to its use

Mostly unchanged.  These are pure shuffling of functions and
variables to lose forward declarations and to allow easier reading
by grouping related things together.

  mailinfo: libify

Mostly pure code movement that is unchanged since v2.

  mailinfo: handle charset conversion errors in the caller
  mailinfo: remove calls to exit() and die() deep in the callchain
  am: make direct call to mailinfo

 Makefile |1 +
 builtin/am.c |   42 +-
 builtin/mailinfo.c   | 1137 ++
 builtin/mailinfo.c => mailinfo.c |  799 +--
 mailinfo.h   |   41 ++
 5 files changed, 505