Re: cassandra-stress HexStrings generator

2018-12-13 Thread Benedict Elliott Smith
I’m honestly not sure.  The code has changed since I last worked on it, which 
was years ago.  I suspect the profile mode has entirely supplanted the prior 
modes, and that these older modes supported the HexStrings generator.

Perhaps somebody else can help answer this question.


> On 13 Dec 2018, at 17:37, Saleil Bhat (BLOOMBERG/ 731 LEX) 
>  wrote:
> 
> Ah ok thanks. This brings up another question: how did the HexStrings 
> generator code path even get called? 
> 
> 
> 
> When I saw these results, I was using the following test table: 
>  CREATE TABLE testtable (
>  partition_key text,
>  clustering_column text, 
>  value text,
>  PRIMARY KEY (partition_key, clustering_column)
>  )
> 
> 
> From StressProfile.java, any column of type TEXT should use the Strings 
> generator. 
> However, my data looks suspiciously like the HexStrings generator was being 
> used instead. 
> 
> 
> First, the generated strings included control characters like SUB (\x1A), BEL 
> (\x07), etc. However, the Strings generator code looks like it forces the 
> characters to be in the printing characters range. 
> Second, the result I documented previously (that the characters are normally 
> distributed, but the strings are not), matches the implementation of 
> HexStrings. 
> 
> 
> 
> Do you know why this might be the case?
> 
> Thanks, 
> -Saleil 
> 
> 
> From: bened...@apache.org At: 12/12/18 18:09:14To:  Saleil Bhat (BLOOMBERG/ 
> 731 LEX ) ,  dev@cassandra.apache.org
> Subject: Re: cassandra-stress HexStrings generator
> 
> Yes, I’m pretty sure you understood correctly (I wrote most of this, but it’s 
> been a long time so I cannot remember much for certain).  
> 
> It should be implemented like the Strings generator.  It looks like both 
> HexStrings and HexBytes are incorrect, and have been for a long time.
> 
> 
>> On 12 Dec 2018, at 22:27, Saleil Bhat (BLOOMBERG/ 731 LEX) 
>  wrote:
>> 
>> Hi, 
>> 
>> I have a question about the behavior of the HexStrings value generator in 
>> the 
> cassandra-stress tool, particularly concerning its population/identity 
> distribution.  
>> 
>> 
>> Per the discussion in JIRA item CASSANDRA-6146 concerning the stress YAML 
> profile, the population field in a columnspec “represents the total unique 
> population distribution of that column across rows.”
>> 
>> 
>> I interpreted this to mean that if I specify some distribution 'F' for a 
> column, then the probability of occurrence for each potential value of that 
> column is given by 'F'. 
>> 
>> So, for example, if I provided the following columnspec for a text column: 
>> name: fake_column 
>>  size: fixed(32) 
>>population: gaussian(1..100)  
>> and then generated a large amount of data according to this specification, 
>> I would expect there to be 100 distinct values for ‘fake_column’, and that a 
> histogram of the frequency of occurrence of each value would be roughly 
> bell-shaped. 
>> 
>> 
>> 
>> However, the current implementation of the HexStrings generator deviates 
>> from 
> this expectation. In the current implementation, each CHARACTER in the string 
> is drawn from F, rather than the string as a whole. Therefore, if you plot 
> the 
> histogram of frequency of occurrence for each character, you get a 
> bell-shaped 
> curve, but the distribution of the occurrences of whole strings (the actual 
> columns) is something else. 
>> 
>> 
>> My question is, is this the desired behavior for string columns? Was my 
> expectation/interpretation incorrect? If so, can anyone give some insight as 
> to 
> why strings are designed to behave this way and what the use case is for this 
> behavior? 
>> 
>> Thanks, 
>> -Saleil 
> 
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Revisit the proposal to use github PR

2018-12-13 Thread Ariel Weisberg
Hi,

Sorry I missed that point. I agree github PRs are not useful for merging.

What I do is force push the feature/bug fix branches (which is fine, github 
remembers the old versions in the PR) with everything updated and ready to 
merge, and then push those branches from my local repo to the apache repo with 
--atomic.

Ariel

On Thu, Dec 13, 2018, at 1:00 PM, Jason Brown wrote:
> To clarify my position: Github PRs are great for *reviewing* code, and the
> commentary is much easier to follow imo. But for *merging* code, esp into
> our multi-branch strategy, PRs don't fit well, unless there's some
> technique I and perhaps others are unaware of.
> 
> On Thu, Dec 13, 2018 at 9:47 AM Ariel Weisberg  wrote:
> 
> > Hi,
> >
> > I'm not clear on what github makes worse. It preserves more history then
> > the JIRA approach. When people invitably force push their branches you
> > can't tell from the link to a branch on JIRA. Github preserves the comments
> > and force push history so you know what version of the code each comment
> > applied to. Github also tracks when requests for changes are acknowledged
> > and resolved.  I have had to make the same change request many times and
> > keep track independently whether it was resolved. This has also resulting
> > in mistakes getting merged when I missed a comment that was ignored.
> >
> > Now that github can CC JIRA that also CCs to commits@. It's better then
> > JIRA comments because each comment includes a small diff of the code the
> > comment applies to. To do that in JIRA I have to manually link to the code
> > the PR and most people don't do that for every comment so some of them are
> > inscrutable after the fact. Also manually created links sometimes refer to
> > references that disappear or get force pushed. It's a bit tricky to get
> > right.
> >
> > To me arguing against leveraging a better code review workflow (whether
> > github or some other tool) is like arguing against using source control
> > tools. Sure the filesystem and home grown scripts can be used to work
> > around lack of source control, but why would you?
> >
> > I see two complaints so far. One is that github PRs encourage nitpicking.
> > I don't see tool based solution to that off the cuff. Another is that
> > github doesn't by default CC JIRA. Maybe we can just refuse to accept
> > improperly formatted PRs and look into auto rejecting ones that don't refer
> > to a ticket?
> >
> > Ariel
> >
> > On Thu, Dec 13, 2018, at 12:20 PM, Aleksey Yeschenko wrote:
> > > There are some nice benefits to GH PRs, one of them is that we could
> > > eventually set up CircleCI hooks that would explicitly prevent commits
> > > that don’t pass the tests.
> > >
> > > But handling multiple branches would indeed be annoying. Would have to
> > > either submit 1 PR per branch - which is both tedious and non-atomic -
> > > or do a mixed approach, with a PR for the oldest branch, then a manual
> > > merge upwards. The latter would be kinda meh, especially when commits
> > > for different branches diverge.
> > >
> > > For me personally, the current setup works quite well, and I mostly
> > > share Sylvain’s opinion above, for the same reasons listed.
> > >
> > > —
> > > AY
> > >
> > > > On 13 Dec 2018, at 08:15, Sylvain Lebresne  wrote:
> > > >
> > > > Fwiw, I personally find it very useful to have all discussion, review
> > > > comments included, in the same place (namely JIRA, since for better or
> > > > worse, that's what we use for tracking tickets). Typically, that means
> > > > everything gets consistently pushed to the  commits@ mailing list,
> > which I
> > > > find extremely convenient to keep track of things. I also have a theory
> > > > that the inline-comments type of review github PR give you is very
> > > > convenient for nitpicks, shallow or spur-of-the-moment comments, but
> > > > doesn't help that much for deeper reviews, and that it thus to favor
> > the
> > > > former kind of review.
> > > >
> > > > Additionally, and to Benedict's point, I happen to have first hand
> > > > experience with a PR-based process for a multi-branch workflow very
> > similar
> > > > to the one of this project, and suffice to say that I hate it with a
> > > > passion.
> > > >
> > > > Anyway, very much personal opinion here.
> > > > --
> > > > Sylvain
> > > >
> > > >
> > > > On Thu, Dec 13, 2018 at 2:13 AM dinesh.jo...@yahoo.com.INVALID
> > > >  wrote:
> > > >
> > > >> I've been already using github PRs for some time now. Once you
> > specify the
> > > >> ticket number, the comments and discussion are persisted in Apache
> > Jira as
> > > >> work log so it can be audited if desired. However, committers usually
> > > >> squash and commit the changes once the PR is approved. We don't use
> > the
> > > >> merge feature in github. I don't believe github we can merge the
> > commit
> > > >> into multiple branches through the UI. We would need to merge it into
> > one
> > > >> branch and then manually merge that commit into 

Re: Revisit the proposal to use github PR

2018-12-13 Thread Jason Brown
To clarify my position: Github PRs are great for *reviewing* code, and the
commentary is much easier to follow imo. But for *merging* code, esp into
our multi-branch strategy, PRs don't fit well, unless there's some
technique I and perhaps others are unaware of.

On Thu, Dec 13, 2018 at 9:47 AM Ariel Weisberg  wrote:

> Hi,
>
> I'm not clear on what github makes worse. It preserves more history then
> the JIRA approach. When people invitably force push their branches you
> can't tell from the link to a branch on JIRA. Github preserves the comments
> and force push history so you know what version of the code each comment
> applied to. Github also tracks when requests for changes are acknowledged
> and resolved.  I have had to make the same change request many times and
> keep track independently whether it was resolved. This has also resulting
> in mistakes getting merged when I missed a comment that was ignored.
>
> Now that github can CC JIRA that also CCs to commits@. It's better then
> JIRA comments because each comment includes a small diff of the code the
> comment applies to. To do that in JIRA I have to manually link to the code
> the PR and most people don't do that for every comment so some of them are
> inscrutable after the fact. Also manually created links sometimes refer to
> references that disappear or get force pushed. It's a bit tricky to get
> right.
>
> To me arguing against leveraging a better code review workflow (whether
> github or some other tool) is like arguing against using source control
> tools. Sure the filesystem and home grown scripts can be used to work
> around lack of source control, but why would you?
>
> I see two complaints so far. One is that github PRs encourage nitpicking.
> I don't see tool based solution to that off the cuff. Another is that
> github doesn't by default CC JIRA. Maybe we can just refuse to accept
> improperly formatted PRs and look into auto rejecting ones that don't refer
> to a ticket?
>
> Ariel
>
> On Thu, Dec 13, 2018, at 12:20 PM, Aleksey Yeschenko wrote:
> > There are some nice benefits to GH PRs, one of them is that we could
> > eventually set up CircleCI hooks that would explicitly prevent commits
> > that don’t pass the tests.
> >
> > But handling multiple branches would indeed be annoying. Would have to
> > either submit 1 PR per branch - which is both tedious and non-atomic -
> > or do a mixed approach, with a PR for the oldest branch, then a manual
> > merge upwards. The latter would be kinda meh, especially when commits
> > for different branches diverge.
> >
> > For me personally, the current setup works quite well, and I mostly
> > share Sylvain’s opinion above, for the same reasons listed.
> >
> > —
> > AY
> >
> > > On 13 Dec 2018, at 08:15, Sylvain Lebresne  wrote:
> > >
> > > Fwiw, I personally find it very useful to have all discussion, review
> > > comments included, in the same place (namely JIRA, since for better or
> > > worse, that's what we use for tracking tickets). Typically, that means
> > > everything gets consistently pushed to the  commits@ mailing list,
> which I
> > > find extremely convenient to keep track of things. I also have a theory
> > > that the inline-comments type of review github PR give you is very
> > > convenient for nitpicks, shallow or spur-of-the-moment comments, but
> > > doesn't help that much for deeper reviews, and that it thus to favor
> the
> > > former kind of review.
> > >
> > > Additionally, and to Benedict's point, I happen to have first hand
> > > experience with a PR-based process for a multi-branch workflow very
> similar
> > > to the one of this project, and suffice to say that I hate it with a
> > > passion.
> > >
> > > Anyway, very much personal opinion here.
> > > --
> > > Sylvain
> > >
> > >
> > > On Thu, Dec 13, 2018 at 2:13 AM dinesh.jo...@yahoo.com.INVALID
> > >  wrote:
> > >
> > >> I've been already using github PRs for some time now. Once you
> specify the
> > >> ticket number, the comments and discussion are persisted in Apache
> Jira as
> > >> work log so it can be audited if desired. However, committers usually
> > >> squash and commit the changes once the PR is approved. We don't use
> the
> > >> merge feature in github. I don't believe github we can merge the
> commit
> > >> into multiple branches through the UI. We would need to merge it into
> one
> > >> branch and then manually merge that commit into other branches. The
> big
> > >> upside of using github PRs is that it makes collaborating a lot
> easier.
> > >> Downside is that it makes it very difficult to follow along the
> progress in
> > >> Apache Jira. The messages that github posts back include huge diffs
> and are
> > >> aweful.
> > >> Dinesh
> > >>
> > >>On Thursday, December 13, 2018, 1:10:12 AM GMT+5:30, Benedict
> Elliott
> > >> Smith  wrote:
> > >>
> > >> Perhaps somebody could summarise the tradeoffs?  I’m a little
> concerned
> > >> about how it would work for our multi-branch workflow.  Would we 

Re: Revisit the proposal to use github PR

2018-12-13 Thread Ariel Weisberg
Hi,

I'm not clear on what github makes worse. It preserves more history then the 
JIRA approach. When people invitably force push their branches you can't tell 
from the link to a branch on JIRA. Github preserves the comments and force push 
history so you know what version of the code each comment applied to. Github 
also tracks when requests for changes are acknowledged and resolved.  I have 
had to make the same change request many times and keep track independently 
whether it was resolved. This has also resulting in mistakes getting merged 
when I missed a comment that was ignored.

Now that github can CC JIRA that also CCs to commits@. It's better then JIRA 
comments because each comment includes a small diff of the code the comment 
applies to. To do that in JIRA I have to manually link to the code the PR and 
most people don't do that for every comment so some of them are inscrutable 
after the fact. Also manually created links sometimes refer to references that 
disappear or get force pushed. It's a bit tricky to get right.

To me arguing against leveraging a better code review workflow (whether github 
or some other tool) is like arguing against using source control tools. Sure 
the filesystem and home grown scripts can be used to work around lack of source 
control, but why would you?

I see two complaints so far. One is that github PRs encourage nitpicking. I 
don't see tool based solution to that off the cuff. Another is that github 
doesn't by default CC JIRA. Maybe we can just refuse to accept improperly 
formatted PRs and look into auto rejecting ones that don't refer to a ticket?

Ariel

On Thu, Dec 13, 2018, at 12:20 PM, Aleksey Yeschenko wrote:
> There are some nice benefits to GH PRs, one of them is that we could 
> eventually set up CircleCI hooks that would explicitly prevent commits 
> that don’t pass the tests.
> 
> But handling multiple branches would indeed be annoying. Would have to 
> either submit 1 PR per branch - which is both tedious and non-atomic - 
> or do a mixed approach, with a PR for the oldest branch, then a manual 
> merge upwards. The latter would be kinda meh, especially when commits 
> for different branches diverge.
> 
> For me personally, the current setup works quite well, and I mostly 
> share Sylvain’s opinion above, for the same reasons listed.
> 
> —
> AY
> 
> > On 13 Dec 2018, at 08:15, Sylvain Lebresne  wrote:
> > 
> > Fwiw, I personally find it very useful to have all discussion, review
> > comments included, in the same place (namely JIRA, since for better or
> > worse, that's what we use for tracking tickets). Typically, that means
> > everything gets consistently pushed to the  commits@ mailing list, which I
> > find extremely convenient to keep track of things. I also have a theory
> > that the inline-comments type of review github PR give you is very
> > convenient for nitpicks, shallow or spur-of-the-moment comments, but
> > doesn't help that much for deeper reviews, and that it thus to favor the
> > former kind of review.
> > 
> > Additionally, and to Benedict's point, I happen to have first hand
> > experience with a PR-based process for a multi-branch workflow very similar
> > to the one of this project, and suffice to say that I hate it with a
> > passion.
> > 
> > Anyway, very much personal opinion here.
> > --
> > Sylvain
> > 
> > 
> > On Thu, Dec 13, 2018 at 2:13 AM dinesh.jo...@yahoo.com.INVALID
> >  wrote:
> > 
> >> I've been already using github PRs for some time now. Once you specify the
> >> ticket number, the comments and discussion are persisted in Apache Jira as
> >> work log so it can be audited if desired. However, committers usually
> >> squash and commit the changes once the PR is approved. We don't use the
> >> merge feature in github. I don't believe github we can merge the commit
> >> into multiple branches through the UI. We would need to merge it into one
> >> branch and then manually merge that commit into other branches. The big
> >> upside of using github PRs is that it makes collaborating a lot easier.
> >> Downside is that it makes it very difficult to follow along the progress in
> >> Apache Jira. The messages that github posts back include huge diffs and are
> >> aweful.
> >> Dinesh
> >> 
> >>On Thursday, December 13, 2018, 1:10:12 AM GMT+5:30, Benedict Elliott
> >> Smith  wrote:
> >> 
> >> Perhaps somebody could summarise the tradeoffs?  I’m a little concerned
> >> about how it would work for our multi-branch workflow.  Would we open
> >> multiple PRs?
> >> 
> >> Could we easily link with external CircleCI?
> >> 
> >> It occurs to me, in JIRA proposal mode, that an extra required field for a
> >> permalink to GitHub for the patch would save a lot of time I spend hunting
> >> for a branch in the comments.
> >> 
> >> 
> >> 
> >> 
> >>> On 12 Dec 2018, at 19:20, jay.zhu...@yahoo.com.INVALID wrote:
> >>> 
> >>> It was discussed 1 year's ago:
> >> https://www.mail-archive.com/dev@cassandra.apache.org/msg11810.html

Re: Revisit the proposal to use github PR

2018-12-13 Thread Jason Brown
I'm with Sylvain and Aleksey. Our multi-branch strategy does not work well
(if at all) with Github PR-style workflows (and, yes, I and most c*
maintainers have played this game). The cassandra-dtests repo is different
as we don't really branch there, so might be fine for Github PRs - except
for the merge commit noise, and that we prefer squashed patches for commit.

We could perhaps be a bit more explicit about "how to contribute" in the
docs [1], wrt branches as well as Github PRs (that we don't do them), but
it's mostly not unreasonable.

Thus, I'm in favor of the current process.

[1] http://cassandra.apache.org/doc/latest/development/patches.html


On Thu, Dec 13, 2018 at 9:20 AM Aleksey Yeschenko 
wrote:

> There are some nice benefits to GH PRs, one of them is that we could
> eventually set up CircleCI hooks that would explicitly prevent commits that
> don’t pass the tests.
>
> But handling multiple branches would indeed be annoying. Would have to
> either submit 1 PR per branch - which is both tedious and non-atomic - or
> do a mixed approach, with a PR for the oldest branch, then a manual merge
> upwards. The latter would be kinda meh, especially when commits for
> different branches diverge.
>
> For me personally, the current setup works quite well, and I mostly share
> Sylvain’s opinion above, for the same reasons listed.
>
> —
> AY
>
> > On 13 Dec 2018, at 08:15, Sylvain Lebresne  wrote:
> >
> > Fwiw, I personally find it very useful to have all discussion, review
> > comments included, in the same place (namely JIRA, since for better or
> > worse, that's what we use for tracking tickets). Typically, that means
> > everything gets consistently pushed to the  commits@ mailing list,
> which I
> > find extremely convenient to keep track of things. I also have a theory
> > that the inline-comments type of review github PR give you is very
> > convenient for nitpicks, shallow or spur-of-the-moment comments, but
> > doesn't help that much for deeper reviews, and that it thus to favor the
> > former kind of review.
> >
> > Additionally, and to Benedict's point, I happen to have first hand
> > experience with a PR-based process for a multi-branch workflow very
> similar
> > to the one of this project, and suffice to say that I hate it with a
> > passion.
> >
> > Anyway, very much personal opinion here.
> > --
> > Sylvain
> >
> >
> > On Thu, Dec 13, 2018 at 2:13 AM dinesh.jo...@yahoo.com.INVALID
> >  wrote:
> >
> >> I've been already using github PRs for some time now. Once you specify
> the
> >> ticket number, the comments and discussion are persisted in Apache Jira
> as
> >> work log so it can be audited if desired. However, committers usually
> >> squash and commit the changes once the PR is approved. We don't use the
> >> merge feature in github. I don't believe github we can merge the commit
> >> into multiple branches through the UI. We would need to merge it into
> one
> >> branch and then manually merge that commit into other branches. The big
> >> upside of using github PRs is that it makes collaborating a lot easier.
> >> Downside is that it makes it very difficult to follow along the
> progress in
> >> Apache Jira. The messages that github posts back include huge diffs and
> are
> >> aweful.
> >> Dinesh
> >>
> >>On Thursday, December 13, 2018, 1:10:12 AM GMT+5:30, Benedict Elliott
> >> Smith  wrote:
> >>
> >> Perhaps somebody could summarise the tradeoffs?  I’m a little concerned
> >> about how it would work for our multi-branch workflow.  Would we open
> >> multiple PRs?
> >>
> >> Could we easily link with external CircleCI?
> >>
> >> It occurs to me, in JIRA proposal mode, that an extra required field
> for a
> >> permalink to GitHub for the patch would save a lot of time I spend
> hunting
> >> for a branch in the comments.
> >>
> >>
> >>
> >>
> >>> On 12 Dec 2018, at 19:20, jay.zhu...@yahoo.com.INVALID wrote:
> >>>
> >>> It was discussed 1 year's ago:
> >> https://www.mail-archive.com/dev@cassandra.apache.org/msg11810.html
> >>> As all Apache projects are moving to gitbox:
> >> https://reference.apache.org/committer/github, should we revisit that
> and
> >> change our review/commit process to use github PR?A good example is
> >> Spark:"Changes to Spark source code are proposed, reviewed and committed
> >> via Github pull requests" (https://spark.apache.org/contributing.html).
> >>> /jay
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: cassandra-stress HexStrings generator

2018-12-13 Thread Saleil Bhat (BLOOMBERG/ 731 LEX)
Ah ok thanks. This brings up another question: how did the HexStrings generator 
code path even get called? 



When I saw these results, I was using the following test table: 
  CREATE TABLE testtable (
  partition_key text,
  clustering_column text, 
  value text,
  PRIMARY KEY (partition_key, clustering_column)
  )


From StressProfile.java, any column of type TEXT should use the Strings 
generator. 
However, my data looks suspiciously like the HexStrings generator 
was being used instead. 


First, the generated strings included control characters like SUB (\x1A), BEL 
(\x07), etc. However, the Strings generator code looks like it forces the 
characters to be in the printing characters range. 
Second, the result I documented previously (that the characters are normally 
distributed, but the strings are not), matches the implementation of 
HexStrings. 



Do you know why this might be the case?

Thanks, 
-Saleil 


From: bened...@apache.org At: 12/12/18 18:09:14To:  Saleil Bhat (BLOOMBERG/ 731 
LEX ) ,  dev@cassandra.apache.org
Subject: Re: cassandra-stress HexStrings generator

Yes, I’m pretty sure you understood correctly (I wrote most of this, but it’s 
been a long time so I cannot remember much for certain).  

It should be implemented like the Strings generator.  It looks like both 
HexStrings and HexBytes are incorrect, and have been for a long time.


> On 12 Dec 2018, at 22:27, Saleil Bhat (BLOOMBERG/ 731 LEX) 
 wrote:
> 
> Hi, 
> 
> I have a question about the behavior of the HexStrings value generator in the 
cassandra-stress tool, particularly concerning its population/identity 
distribution.  
> 
> 
> Per the discussion in JIRA item CASSANDRA-6146 concerning the stress YAML 
profile, the population field in a columnspec “represents the total unique 
population distribution of that column across rows.”
> 
> 
> I interpreted this to mean that if I specify some distribution 'F' for a 
column, then the probability of occurrence for each potential value of that 
column is given by 'F'. 
> 
> So, for example, if I provided the following columnspec for a text column: 
>  name: fake_column 
>   size: fixed(32) 
> population: gaussian(1..100)  
> and then generated a large amount of data according to this specification, 
> I would expect there to be 100 distinct values for ‘fake_column’, and that a 
histogram of the frequency of occurrence of each value would be roughly 
bell-shaped. 
> 
> 
> 
> However, the current implementation of the HexStrings generator deviates from 
this expectation. In the current implementation, each CHARACTER in the string 
is drawn from F, rather than the string as a whole. Therefore, if you plot the 
histogram of frequency of occurrence for each character, you get a bell-shaped 
curve, but the distribution of the occurrences of whole strings (the actual 
columns) is something else. 
> 
> 
> My question is, is this the desired behavior for string columns? Was my 
expectation/interpretation incorrect? If so, can anyone give some insight as to 
why strings are designed to behave this way and what the use case is for this 
behavior? 
> 
> Thanks, 
> -Saleil 




Re: Revisit the proposal to use github PR

2018-12-13 Thread Aleksey Yeschenko
There are some nice benefits to GH PRs, one of them is that we could eventually 
set up CircleCI hooks that would explicitly prevent commits that don’t pass the 
tests.

But handling multiple branches would indeed be annoying. Would have to either 
submit 1 PR per branch - which is both tedious and non-atomic - or do a mixed 
approach, with a PR for the oldest branch, then a manual merge upwards. The 
latter would be kinda meh, especially when commits for different branches 
diverge.

For me personally, the current setup works quite well, and I mostly share 
Sylvain’s opinion above, for the same reasons listed.

—
AY

> On 13 Dec 2018, at 08:15, Sylvain Lebresne  wrote:
> 
> Fwiw, I personally find it very useful to have all discussion, review
> comments included, in the same place (namely JIRA, since for better or
> worse, that's what we use for tracking tickets). Typically, that means
> everything gets consistently pushed to the  commits@ mailing list, which I
> find extremely convenient to keep track of things. I also have a theory
> that the inline-comments type of review github PR give you is very
> convenient for nitpicks, shallow or spur-of-the-moment comments, but
> doesn't help that much for deeper reviews, and that it thus to favor the
> former kind of review.
> 
> Additionally, and to Benedict's point, I happen to have first hand
> experience with a PR-based process for a multi-branch workflow very similar
> to the one of this project, and suffice to say that I hate it with a
> passion.
> 
> Anyway, very much personal opinion here.
> --
> Sylvain
> 
> 
> On Thu, Dec 13, 2018 at 2:13 AM dinesh.jo...@yahoo.com.INVALID
>  wrote:
> 
>> I've been already using github PRs for some time now. Once you specify the
>> ticket number, the comments and discussion are persisted in Apache Jira as
>> work log so it can be audited if desired. However, committers usually
>> squash and commit the changes once the PR is approved. We don't use the
>> merge feature in github. I don't believe github we can merge the commit
>> into multiple branches through the UI. We would need to merge it into one
>> branch and then manually merge that commit into other branches. The big
>> upside of using github PRs is that it makes collaborating a lot easier.
>> Downside is that it makes it very difficult to follow along the progress in
>> Apache Jira. The messages that github posts back include huge diffs and are
>> aweful.
>> Dinesh
>> 
>>On Thursday, December 13, 2018, 1:10:12 AM GMT+5:30, Benedict Elliott
>> Smith  wrote:
>> 
>> Perhaps somebody could summarise the tradeoffs?  I’m a little concerned
>> about how it would work for our multi-branch workflow.  Would we open
>> multiple PRs?
>> 
>> Could we easily link with external CircleCI?
>> 
>> It occurs to me, in JIRA proposal mode, that an extra required field for a
>> permalink to GitHub for the patch would save a lot of time I spend hunting
>> for a branch in the comments.
>> 
>> 
>> 
>> 
>>> On 12 Dec 2018, at 19:20, jay.zhu...@yahoo.com.INVALID wrote:
>>> 
>>> It was discussed 1 year's ago:
>> https://www.mail-archive.com/dev@cassandra.apache.org/msg11810.html
>>> As all Apache projects are moving to gitbox:
>> https://reference.apache.org/committer/github, should we revisit that and
>> change our review/commit process to use github PR?A good example is
>> Spark:"Changes to Spark source code are proposed, reviewed and committed
>> via Github pull requests" (https://spark.apache.org/contributing.html).
>>> /jay
>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Revisit the proposal to use github PR

2018-12-13 Thread Sylvain Lebresne
Fwiw, I personally find it very useful to have all discussion, review
comments included, in the same place (namely JIRA, since for better or
worse, that's what we use for tracking tickets). Typically, that means
everything gets consistently pushed to the  commits@ mailing list, which I
find extremely convenient to keep track of things. I also have a theory
that the inline-comments type of review github PR give you is very
convenient for nitpicks, shallow or spur-of-the-moment comments, but
doesn't help that much for deeper reviews, and that it thus to favor the
former kind of review.

Additionally, and to Benedict's point, I happen to have first hand
experience with a PR-based process for a multi-branch workflow very similar
to the one of this project, and suffice to say that I hate it with a
passion.

Anyway, very much personal opinion here.
--
Sylvain


On Thu, Dec 13, 2018 at 2:13 AM dinesh.jo...@yahoo.com.INVALID
 wrote:

> I've been already using github PRs for some time now. Once you specify the
> ticket number, the comments and discussion are persisted in Apache Jira as
> work log so it can be audited if desired. However, committers usually
> squash and commit the changes once the PR is approved. We don't use the
> merge feature in github. I don't believe github we can merge the commit
> into multiple branches through the UI. We would need to merge it into one
> branch and then manually merge that commit into other branches. The big
> upside of using github PRs is that it makes collaborating a lot easier.
> Downside is that it makes it very difficult to follow along the progress in
> Apache Jira. The messages that github posts back include huge diffs and are
> aweful.
> Dinesh
>
> On Thursday, December 13, 2018, 1:10:12 AM GMT+5:30, Benedict Elliott
> Smith  wrote:
>
>  Perhaps somebody could summarise the tradeoffs?  I’m a little concerned
> about how it would work for our multi-branch workflow.  Would we open
> multiple PRs?
>
> Could we easily link with external CircleCI?
>
> It occurs to me, in JIRA proposal mode, that an extra required field for a
> permalink to GitHub for the patch would save a lot of time I spend hunting
> for a branch in the comments.
>
>
>
>
> > On 12 Dec 2018, at 19:20, jay.zhu...@yahoo.com.INVALID wrote:
> >
> > It was discussed 1 year's ago:
> https://www.mail-archive.com/dev@cassandra.apache.org/msg11810.html
> > As all Apache projects are moving to gitbox:
> https://reference.apache.org/committer/github, should we revisit that and
> change our review/commit process to use github PR?A good example is
> Spark:"Changes to Spark source code are proposed, reviewed and committed
> via Github pull requests" (https://spark.apache.org/contributing.html).
> > /jay
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>