Ignoring everything else in this thread to put a sharper point on one issue. In the PR, multiple people argued that it's not a blocker on the grounds that it was also a bug/dropped feature in the previous release (note one comment was phrased slightly differently, stating it was not a regression, which I read as meaning not a regression from the previous feature release). My thought on this is: if multiple people think this, others may as well, so I think we need a discuss thread on it.
My reason for disagreeing with that is that it specifically goes against our documented versioning policy. The JIRA claims we essentially broke proper support for Hive UDAFs. We specifically state in our docs that we support Hive UDAFs; I consider that an API, and our versioning docs say we won't break API compatibility in feature releases. It shouldn't matter whether that break happened 1 feature release ago or 10; until we do a major release, we shouldn't break or drop that compatibility.
So we should not be using that as a reason to decide whether a JIRA is a blocker or not.
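(To make concrete which API is at stake, here is a minimal sketch of the Hive UDAF path in Scala; the UDAF class, function, and table names are hypothetical:)

    import org.apache.spark.sql.SparkSession

    // Hive support must be enabled for Hive UDAF classes to resolve.
    val spark = SparkSession.builder()
      .enableHiveSupport()
      .getOrCreate()

    // Register a Hive UDAF by its implementing class (names hypothetical).
    spark.sql("CREATE TEMPORARY FUNCTION my_udaf AS 'com.example.hive.MyUDAF'")

    // Use it like any built-in aggregate.
    spark.sql("SELECT key, my_udaf(value) FROM my_table GROUP BY key").show()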

Tom 
 
On Thu, Oct 25, 2018 at 9:39 AM, Sean Owen <sro...@gmail.com> wrote:

What does "PMC members aren't saying it's a blocker for reasons other than the actual impact the JIRA has" mean that isn't already widely agreed? Likewise "Committers and PMC members should not be saying it's not a blocker because they personally or their company don't care about this feature or API". It sounds like insinuation, and I'd rather make it explicit -- call out the bad actions -- or keep it to observable technical issues.
Likewise one could say there's a problem just because A thinks X should be a 
blocker and B disagrees. I see no bad faith, process problem, or obvious 
errors. Do you? I see disagreement, and it's tempting to suspect motives. I 
have seen what I think are actual bad-faith decisions in the past in this 
project, too. I don't see it here though and want to stick to 'now'.

(Aside: the implication is that those representing vendors are steam-rolling a 
release. Actually, the cynical incentives cut the other way here. Blessing the 
latest changes as OSS Apache Spark is predominantly beneficial to users of OSS, 
not distros. In fact, it forces distros to make changes. And broadly, vendors 
have much more accountability for quality of releases, because they're paid to.)

I'm still not sure what, specifically, the objection here is. I understand a lot is in flight and nobody agrees with every decision made, but, what else is new? Concretely: the release is held again to fix a few issues, in the end. For the map_filter issue, that seems like the right call, and there are a few other important issues that could be quickly fixed too. All is well there, yes?
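(For anyone without context: map_filter is one of the new higher-order SQL functions in the 2.4 line. A minimal sketch of its intended behavior, assuming a Spark 2.4 session named spark:)

    // map_filter keeps the entries of a map that satisfy a predicate.
    // map(1, 0, 2, 3) builds {1 -> 0, 2 -> 3}; the lambda keeps entries
    // whose key is greater than the value, so only 1 -> 0 survives.
    spark.sql("SELECT map_filter(map(1, 0, 2, 3), (k, v) -> k > v)").show()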
This has surfaced some implicit reasoning about releases that we could make explicit, like:
(Sure, if you want to write down things like "release blockers should be decided in the interests of the project by the PMC," OK.)
We have a time-based release schedule, so time matters. There is an opportunity 
cost to not releasing. The bar for blockers goes up over time.
Not all regressions are blockers. Would you hold a release over a trivial regression? But then which must, or should, block? There's no objective answer, but a reasonable rule is: non-trivial regressions from minor release x.y to x.{y+1} block releases. Regressions from x.{y-1} to x.{y+1} should, but don't necessarily, block the release. (Concretely: a bug introduced between 2.3 and 2.4 must block 2.4.0; one introduced back in 2.2 should, but need not.) We try hard to avoid regressions in x.y.0 releases because these are generally consumed by aggressive upgraders, who are on x.{y-1}.z now. If a bug already exists in x.{y-1}, they're either not affected or have worked around it. The cautious upgrader goes from maybe x.{y-2}.z to x.y.1 later. They're affected, but not before, maybe, a maintenance release. A crude argument, and it's not an argument that regressions are OK. It's an argument that 'old' regressions matter less. And maybe it's reasonable to draw the "must" vs. "should" line between them.



On Thu, Oct 25, 2018 at 8:51 AM Tom Graves <tgraves...@yahoo.com> wrote:

So just to clarify a few things in case people didn't read the entire thread in the PR: the discussion is about what the criteria for a blocker are, and really my concern is what people are using as criteria for not marking a JIRA as a blocker.
The only documented reason to mark a JIRA as a blocker is for correctness issues: http://spark.apache.org/contributing.html. And really, I think that is about initially marking it as a blocker to bring attention to it. The final decision as to whether something is a blocker is up to the PMC, who vote on whether a release passes. I think it would be impossible to properly define what a blocker is with strict rules.
Personally, from this thread I would like to make sure committers and PMC members aren't saying it's a blocker for reasons other than the actual impact the JIRA has, and if it's at all in question, it should be brought to the PMC's attention for a vote. I agree with others that if it's during an RC, it should be discussed on the RC thread.
A few specific things that were said that I disagree with are:
 - It's not a blocker because it was also an issue in the last release (meaning feature release), i.e. the bug was introduced in 2.2 and now we are doing 2.4, so it's automatically not a blocker. This to me is just wrong. Lots of things are not found immediately, or aren't reported immediately. Now, I do believe the timeframe it's been in there affects the decision on the impact, but making the decision on this alone is too strict.
 - Committers and PMC members should not be saying it's not a blocker because they personally or their company don't care about this feature or API, or stating that the Spark project as a whole doesn't care about this feature, unless that was specifically voted on at the project level. They need to follow the API compatibility we have documented. This is really a broader issue than just marking a JIRA; it goes to anything checked in, and perhaps needs to be a separate thread.

For the verbiage of what a regression is, it seems like that should be defined by our versioning documents. They state what we do in maintenance, feature, and major releases (http://spark.apache.org/versioning-policy.html); if it's not defined there, we probably need to clarify. There was a good example we might want to clarify, about things like Scala or Java compatibility in feature releases.
Obviously this is my opinion, and it's here for everyone to discuss and come to a consensus on.
Tom


  
  
