Re: Spark Improvement Proposals

2017-03-13 Thread Sean Owen
Responding to your request for a vote, I meant that this isn't required per se and the consensus here was not to vote on it. Hence the jokes about meta-voting protocol. In that sense nothing new happened process-wise, nothing against ASF norms, if that's your concern. I think it's just an agreed

Re: Spark Improvement Proposals

2017-03-13 Thread Tom Graves
Another thing I think you should send out is when exactly this takes effect. Is it any major new feature without a pull request? Is it anything major starting with the 2.3 release? Tom On Monday, March 13, 2017 1:08 PM, Tom Graves wrote: I'm not

Re: Spark Improvement Proposals

2017-03-13 Thread Tom Graves
I'm not sure how you can say it's not a new process. If that is the case, why do we need a page documenting it? As a developer, if I want to put up a major improvement I now have to follow the SPIP process, whereas before I didn't; that certainly seems like a new process. As a PMC member I now have the

Re: Spark Improvement Proposals

2017-03-13 Thread Sean Owen
It's not a new process, in that it doesn't entail anything not already in http://apache.org/foundation/voting.html . We're just deciding to call a VOTE for this type of code modification. To your point -- yes, it's been around a long time with no further comment, and I called several times for

Re: Spark Improvement Proposals

2017-03-13 Thread Tom Graves
It seems like if you are adding responsibilities you should do a vote. SPIPs require votes from PMC members, so you are now putting more responsibility on them. It feels like we should have an official vote to make sure they (PMC members) agree with that and to make sure everyone pays

Re: Spark Improvement Proposals

2017-03-13 Thread Sean Owen
This ended up proceeding as a normal doc change, instead of precipitating a meta-vote. However, the text that's on the web site now can certainly be further amended if anyone wants to propose a change from here. On Mon, Mar 13, 2017 at 1:50 PM Tom Graves wrote: > I think a

Re: Spark Improvement Proposals

2017-03-13 Thread Tom Graves
I think a vote here would be good. I think most of the discussion was done by 4 or 5 people and it's a long thread. If nothing else it summarizes everything and gets people's attention to the change. Tom On Thursday, March 9, 2017 10:55 AM, Sean Owen wrote: I think a

Re: Spark Improvement Proposals

2017-03-10 Thread Reynold Xin
We can just start using spip label and link to it. On Fri, Mar 10, 2017 at 9:18 AM, Cody Koeninger wrote: > So to be clear, if I translate that google doc to markup and submit a > PR, you will merge it? > > If we're just using "spip" label, that's probably fine, but we

Re: Spark Improvement Proposals

2017-03-10 Thread Cody Koeninger
Can someone with filter share permissions make a filter for open SPIPs and one for closed SPIPs and share them? e.g. project = SPARK AND status in (Open, Reopened, "In Progress") AND labels=SPIP ORDER BY createdDate DESC and another with the closed-status equivalent. I just made an open ticket
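
The two filters asked for here differ only in their status clause, so both JQL strings can be generated from one template. A minimal Python sketch follows; the `spip_jql` helper name and the closed statuses (`Resolved`, `Closed`) are assumptions made for illustration, not something settled in this thread.

```python
def spip_jql(statuses, label="SPIP", project="SPARK"):
    """Build a JQL string selecting labelled issues in the given statuses."""
    # Quote status names that contain spaces, e.g. "In Progress".
    quoted = [f'"{s}"' if " " in s else s for s in statuses]
    return (
        f"project = {project} AND status in ({', '.join(quoted)}) "
        f"AND labels={label} ORDER BY createdDate DESC"
    )


# Statuses taken from the query quoted in the message.
OPEN_SPIPS = spip_jql(["Open", "Reopened", "In Progress"])
# Assumed "closed equivalent"; adjust to the project's actual JIRA workflow.
CLOSED_SPIPS = spip_jql(["Resolved", "Closed"])

print(OPEN_SPIPS)
print(CLOSED_SPIPS)
```

Either string can be pasted into JIRA's advanced search and saved as a shared filter by someone with the needed permissions.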

Re: Spark Improvement Proposals

2017-03-10 Thread Cody Koeninger
So to be clear, if I translate that google doc to markup and submit a PR, you will merge it? If we're just using "spip" label, that's probably fine, but we still need shared filters for open and closed SPIPs so the page can link to them. I do not believe I have jira permissions to share filters,

Re: Spark Improvement Proposals

2017-03-10 Thread Cody Koeninger
I think it ought to be its own page, linked from the more / community menu dropdowns. We also need the jira tag, and for the page to clearly link to filters that show proposed / completed SPIPs On Fri, Mar 10, 2017 at 3:39 AM, Sean Owen wrote: > Alrighty, if nobody is

Re: Spark Improvement Proposals

2017-03-10 Thread Sean Owen
Alrighty, if nobody is objecting, and nobody calls for a VOTE, then, let's say this document is the SPIP 1.0 process. I think the next step is just to translate the text to some suitable location. I suggest adding it to https://github.com/apache/spark-website/blob/asf-site/contributing.md On

Re: Spark Improvement Proposals

2017-03-09 Thread Koert Kuipers
gonna end up with a stackoverflow on recursive votes here On Thu, Mar 9, 2017 at 1:17 PM, Mark Hamstra wrote: > -0 on voting on whether we need a vote. > > On Thu, Mar 9, 2017 at 9:00 AM, Reynold Xin wrote: > >> I'm fine without a vote. (are we

Re: Spark Improvement Proposals

2017-03-09 Thread Mark Hamstra
-0 on voting on whether we need a vote. On Thu, Mar 9, 2017 at 9:00 AM, Reynold Xin wrote: > I'm fine without a vote. (are we voting on whether we need a vote?) > > > On Thu, Mar 9, 2017 at 8:55 AM, Sean Owen wrote: > >> I think a VOTE is over-thinking

Re: Spark Improvement Proposals

2017-03-09 Thread vaquar khan
Many of us have issues with the "shepherd role"; I think we should go with a vote. Regards, Vaquar khan On Thu, Mar 9, 2017 at 11:00 AM, Reynold Xin wrote: > I'm fine without a vote. (are we voting on whether we need a vote?) > > > On Thu, Mar 9, 2017 at 8:55 AM, Sean Owen

Re: Spark Improvement Proposals

2017-03-09 Thread Reynold Xin
I'm fine without a vote. (are we voting on whether we need a vote?) On Thu, Mar 9, 2017 at 8:55 AM, Sean Owen wrote: > I think a VOTE is over-thinking it, and is rarely used, but, can't hurt. > Nah, anyone can call a vote. This really isn't that formal. We just want to >

Re: Spark Improvement Proposals

2017-03-09 Thread Sean Owen
I think a VOTE is over-thinking it, and is rarely used, but, can't hurt. Nah, anyone can call a vote. This really isn't that formal. We just want to declare and document consensus. I think SPIP is just a remix of existing process anyway, and don't think it will actually do much anyway, which is

Re: Spark Improvement Proposals

2017-03-09 Thread Cody Koeninger
I started this idea as a fork with a merge-able change to docs. Reynold moved it to his google doc, and has suggested during this email thread that a vote should occur. If a vote needs to occur, I can't see anything on http://apache.org/foundation/voting.html suggesting that I can call for a vote,

Re: Spark Improvement Proposals

2017-03-07 Thread Sean Owen
Do we need a VOTE? Heck, I think anyone can call one, anyway. Pre-flight vote check: anyone have objections to the text as-is? See https://docs.google.com/document/d/1-Zdi_W-wtuxS9hTK0P9qb2x-nRanvXmnZ7SUi4qMljg/edit# If so let's hash out specific suggested changes. If not, then I think the next

Re: Spark Improvement Proposals

2017-03-07 Thread Cody Koeninger
osed to be a way to make people design in public and a way to force attention to a particular change, then, this doesn't do that by itself. Therefore I don't want to

Re: Spark Improvement Proposals

2017-02-27 Thread Sean Owen
To me, no new process is being invented here, on purpose, and so we should just rely on whatever governs any large JIRA or vote, because SPIPs are really just guidance for making a big JIRA. http://apache.org/foundation/voting.html suggests that PMC members have the binding votes in general, and

Re: Spark Improvement Proposals

2017-02-27 Thread Ryan Blue
document. Still, a fine step IMHO. On Thu, Feb 16, 2017 at 4:22 PM Reynold Xin <r...@databricks.com> wrote:

Re: Spark Improvement Proposals

2017-02-24 Thread Joseph Bradley
On Wed, Feb 15, 2017 at 2:53 AM, Cody Koeninger <c...@koeninger.org> wrote: Thanks for doing that.

Re: Spark Improvement Proposals

2017-02-24 Thread Cody Koeninger
e are at least 4 different Apache voting processes, "typical Apache vote process" isn't meaningful to me. I think the intention is that in order to pass, it needs at least 3

Re: Spark Improvement Proposals

2017-02-17 Thread vaquar khan
" isn't meaningful to me. I think the intention is that in order to pass, it needs at least 3 +1 votes from PMC members *and no -1 votes from PMC members*. But the document doesn't expli
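
Read literally, the interpretation quoted in this message (at least three +1 votes from PMC members and no -1 votes from PMC members) can be written down as a small check. This is only a sketch of that one reading, which the message itself notes the draft document does not spell out; the `Vote` type and the `spip_passes` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0 or -1
    is_pmc: bool   # only PMC votes are counted under this reading


def spip_passes(votes, min_plus_ones=3):
    """Return True if there are enough PMC +1 votes and no PMC -1 vote."""
    pmc_values = [v.value for v in votes if v.is_pmc]
    return pmc_values.count(1) >= min_plus_ones and -1 not in pmc_values
```

Under this reading a single -1 from a PMC member blocks the proposal, while votes from non-PMC participants do not affect the outcome either way.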

Re: Spark Improvement Proposals

2017-02-16 Thread Ryan Blue
ding a shepherd, but that's different. Other than that, LGTM. On Mon, Feb 13, 2017 at 9:02 AM, Reynold Xin <r...@databricks.com> wrote: Here's a new

Re: Spark Improvement Proposals

2017-02-16 Thread Sam Elamin
Other than that, LGTM. On Mon, Feb 13, 2017 at 9:02 AM, Reynold Xin <r...@databricks.com> wrote: Here's a new draft that incorporated most of the feedback:

Re: Spark Improvement Proposals

2017-02-16 Thread Cody Koeninger
that incorporated most of the feedback: https://docs.google.com/document/d/1-Zdi_W-wtuxS9hTK0P9qb2x-nRanvXmnZ7SUi4qMljg/edit# I added a specific role for SPIP Author and another one for SPIP Shepherd.

Re: Spark Improvement Proposals

2017-02-16 Thread Sean Owen
specific role for SPIP Author and another one for SPIP Shepherd. > > On Sat, Feb 11, 2017 at 6:13 PM, Xiao Li <gatorsm...@gmail.com> wrote: > > During the summit, I also had a lot of discussions over similar topics > with multiple Committers and active users. I heard many fanta

Re: Spark Improvement Proposals

2017-02-16 Thread Ryan Blue
specific role for SPIP Author and another one for SPIP Shepherd. On Sat, Feb 11, 2017 at 6:13 PM, Xiao Li <gatorsm...@gmail.com> wrote: During the summit, I also had a lot of discussions over similar topics

Re: Spark Improvement Proposals

2017-02-16 Thread Reynold Xin
the summit, I also had a lot of discussions over similar topics with multiple Committers and active users. I heard many fantastic ideas. I believe Spark improvement proposals are good channels to collect the requirements/designs.

Re: Spark Improvement Proposals

2017-02-14 Thread Cody Koeninger
gatorsm...@gmail.com> wrote: During the summit, I also had a lot of discussions over similar topics with multiple Committers and active users. I heard many fantastic ideas. I believe Spark improvement proposals are good channels to collect the requirem

Re: Spark Improvement Proposals

2017-02-13 Thread Reynold Xin
wrote: > During the summit, I also had a lot of discussions over similar topics > with multiple Committers and active users. I heard many fantastic ideas. I > believe Spark improvement proposals are good channels to collect the > requirements/designs. > > > IMO, we also need to

Re: Spark Improvement Proposals

2017-02-11 Thread Xiao Li
During the summit, I also had a lot of discussions over similar topics with multiple Committers and active users. I heard many fantastic ideas. I believe Spark improvement proposals are good channels to collect the requirements/designs. IMO, we also need to consider the priority when working

Re: Spark Improvement Proposals

2017-02-11 Thread Cody Koeninger
At the spark summit this week, everyone from PMC members to users I had never met before were asking me about the Spark improvement proposals idea. It's clear that it's a real community need. But it's been almost half a year, and nothing visible has been done. Reynold, are you going to do

Re: Spark Improvement Proposals

2017-01-11 Thread Reynold Xin
two emails that go to dev@. While I was editing this, I thought we really needed a

Re: Spark Improvement Proposals

2017-01-05 Thread Tim Hunter
Most things looked OK to me too, although I do plan to take a closer look after Nov 1st when we cut the release branch for 2.1.

Re: Spark Improvement Proposals

2017-01-03 Thread Cody Koeninger
s not explicitly called, that voting would happen by e-mail? A template for the proposal document (instead of just a bullet nice) would also

Re: Spark Improvement Proposals

2017-01-03 Thread Imran Rashid
somewhat matches the proposed format. So if anyone wants to try out the process... On Mon, Oct 31, 2016 at 1

Re: Spark Improvement Proposals

2017-01-03 Thread Joseph Bradley
in moving forward with this?

Re: Spark Improvement Proposals

2016-11-08 Thread Ryan Blue
The idea with benchmarks was to show two things: - why some people are doing bad PR for Spark

Re: Spark Improvement Proposals

2016-11-08 Thread Cody Koeninger
- why some people are doing bad PR for Spark - how - in easy way - we can change it and show that Spark is still on the top

Re: Spark Improvement Proposals

2016-11-07 Thread Reynold Xin
n Spark :) On the Spark main page there is still chart "Spark vs Hadoop". It is important to show that framework is not the same Spark with other A

Re: Spark Improvement Proposals

2016-11-07 Thread Reynold Xin
faster than other frameworks. About real-time streaming, I think it would be just good to see it in Spark. I very like current Spark model, but m

Re: Odp.: Spark Improvement Proposals

2016-11-07 Thread Cody Koeninger
e" - community should listen also them and try to help them. With SIPs it would be easier, I've just posted this example as "thing that may be changed with SIP".

Re: Odp.: Spark Improvement Proposals

2016-11-07 Thread Reynold Xin
a lot of algorithms inside - let's make easy API, but with strong background (articles, benchmarks, descriptions, etc) that shows that Spark is still modern framework. Maybe now my intent

Re: Python Spark Improvements (forked from Spark Improvement Proposals)

2016-11-01 Thread Holden Karau
think this is something that we want some more community iteration on maybe? -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Python-Spark-Improvements-

Re: Odp.: Spark Improvement Proposals

2016-11-01 Thread Reynold Xin
were already mentioned and I agree with them, my mail was just to show some aspects from my side, so from the side of developer and person who is trying to help others with Spark (via StackOverflow or other ways)

Re: Python Spark Improvements (forked from Spark Improvement Proposals)

2016-10-31 Thread Holden Karau
package so that people can try to use it. I think this is something that we want some more community iteration on maybe? -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Python-Spark-Improvements-fo

Re: Python Spark Improvements (forked from Spark Improvement Proposals)

2016-10-31 Thread mariusvniekerk
-Improvement-Proposals-tp19422p19670.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com. - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: Odp.: Spark Improvement Proposals

2016-10-31 Thread Marcelo Vanzin
my intention will be clearer :) As I said organizational ideas were already mentioned and I agree with them, my mail was just to show some aspects from my side, so from the side of developer and person who is trying to help others with Spark (via StackOverflow or other wa

Re: Odp.: Spark Improvement Proposals

2016-10-31 Thread Ryan Blue
erson who is trying to help others with Spark (via StackOverflow or other ways) Pozdrawiam / Best regards, Tomasz From: Cody Koeninger <c...@koeninger.org> W

Re: Odp.: Spark Improvement Proposals

2016-10-31 Thread Cody Koeninger
h them, my mail was just to show some aspects from my side, so from the side of developer and person who is trying to help others with Spark (via StackOverflow or other ways) Pozdrawiam / Best regards, Tomasz ________ From: Cody K

Odp.: Spark Improvement Proposals

2016-10-17 Thread Tomasz Gawęda
ys) Pozdrawiam / Best regards, Tomasz From: Cody Koeninger <c...@koeninger.org> Sent: 17 October 2016 16:46 To: Debasish Das Cc: Tomasz Gawęda; dev@spark.apache.org Subject: Re: Spark Improvement Proposals I think narrowly focusing on Flink or benchmarks is missing

Re: Spark Improvement Proposals

2016-10-17 Thread Cody Koeninger
I think narrowly focusing on Flink or benchmarks is missing my point. My point is evolve or die. Spark's governance and organization is hampering its ability to evolve technologically, and it needs to change. On Sun, Oct 16, 2016 at 9:21 PM, Debasish Das wrote: >

Re: Spark Improvement Proposals(Internet mail)

2016-10-17 Thread 黄明
can be. --- Sincerely Andy Original message From: Debasish Das<debasish.da...@gmail.com> To: Tomasz Gawęda<tomasz.gaw...@outlook.com> Cc: dev@spark.apache.org<dev@spark.apache.org>; Cody Koeninger<c...@koeninger.org> Sent: Monday, October 17, 2016 10:21 Subject: Re: Spark Improvement Proposals

Re: Spark Improvement Proposals

2016-10-16 Thread Debasish Das
Thanks Cody for bringing up a valid point...I picked up Spark in 2014 as soon as I looked into it since compared to writing Java map-reduce and Cascading code, Spark made writing distributed code fun...But now as we went deeper with Spark and real-time streaming use-case gets more prominent, I

Re: Spark Improvement Proposals

2016-10-16 Thread Tomasz Gawęda
Hi everyone, I'm quite late with my answer, but I think my suggestions may help a little bit. :) Many technical and organizational topics were mentioned, but I want to focus on these negative posts about Spark and about "haters". I really like Spark. Ease of use, speed, very good community -

Re: Python Spark Improvements (forked from Spark Improvement Proposals)

2016-10-14 Thread mariusvniekerk
ogress bars, tab completion for spark configuration properties, easier loading of scala objects via py4j. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Python-Spark-Improvements-forked-from-Spark-Improvement-Proposals-tp19422p19449.html Sent from the

Re: Python Spark Improvements (forked from Spark Improvement Proposals)

2016-10-13 Thread Holden Karau
lto:ml-node+[hidden email]] *Sent:* Thursday, October 13, 2016 3:51 AM *To:* Mendelson, Assaf *Subject:* Re: Python Spark Improvements (forked from Spark Improvement

RE: Python Spark Improvements (forked from Spark Improvement Proposals)

2016-10-13 Thread assaf.mendelson
te. From: msukmanowsky [via Apache Spark Developers List] [mailto:ml-node+s1001551n19426...@n3.nabble.com] Sent: Thursday, October 13, 2016 3:51 AM To: Mendelson, Assaf Subject: Re: Python Spark Improvements (forked from Spark Improvement Proposals) As very heavy Spark users at Parse.ly, I just wanted to

Re: Spark Improvement Proposals

2016-10-12 Thread kant kodali
problem is that writing a good document takes time. This way we can leverage non committers to do some of this work (it is just another way to

Re: Python Spark Improvements (forked from Spark Improvement Proposals)

2016-10-12 Thread msukmanowsky
similar concepts but incompatible implementations. We're big fans of PySpark and are happy to provide feedback and contribute wherever we can. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Python-Spark-Improvements-forked-from-Spark-Improvement-Propo

Re: Spark Improvement Proposals

2016-10-11 Thread Ryan Blue
contribute). As for strategy, in many cases implementation strategy can affect

Re: Improving governance / committers (split from Spark Improvement Proposals thread)

2016-10-10 Thread Holden Karau
conversation applies <http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-Improvement-Proposals-tp19268p19284.html> - I do very much "have a horse in the race" so I will avoid proposing new criteria. Working on Spark is a core part of what I do most days, and once my day job wi

Re: Spark Improvement Proposals

2016-10-10 Thread Cody Koeninger
strategy, we group by the time to achieve a sliding window. This is definitely an implementation decision and not a goal. However, I can think of several

Re: Spark Improvement Proposals

2016-10-10 Thread Mark Hamstra
inside their calculation buffer. For example, let’s say we want to return a set of all distinct values. One way to implement this would be to make the set into a map and have the

Re: Spark Improvement Proposals

2016-10-10 Thread Mark Hamstra
I'm not a fan of the SEP acronym. Besides its prior established meaning of "Somebody else's problem", there are other inappropriate or offensive connotations such as this Australian slang that often gets shortened to just "sep": http://www.urbandictionary.com/define.php?term=Seppo On Sun, Oct 9,

Re: Spark Improvement Proposals

2016-10-10 Thread Cody Koeninger
- For user-facing stuff, I think you need a section on API. Virtually all other *IPs I

Re: Spark Improvement Proposals

2016-10-10 Thread Cody Koeninger
One way to implement this would be to make the set into a map and have the value contain the last time seen. Multiplying it across the groupby would cost a

Re: Spark Improvement Proposals

2016-10-10 Thread Cody Koeninger
full design document." Is this unclear? Design docs can be worked on obviously, but that's not what I'm concerned with here.

Re: Spark Improvement Proposals

2016-10-10 Thread Cody Koeninger
hat these cases are rare enough so that the strategy is still good enough but how would we know it without user feedback? I believe this example is exactly what Cody was talking about. Since many times implementation str

Re: Spark Improvement Proposals

2016-10-10 Thread Ryan Blue
ible goals, it is often much harder to figure out that the goals are unfeasible without fine tuning. Assaf. From: Cody Koeninger-2 [via Apache Spark Developers List] [mailto:ml-node+[hid

Re: Spark Improvement Proposals

2016-10-10 Thread Cody Koeninger
Assaf. From: Cody Koeninger-2 [via Apache Spark Developers List] [mailto:ml-node+[hidden email]] Sent: Monday, October 10, 2016 2:25 AM To: Mendelson, Assaf Subject: Re: Spark Improvement Proposals Only committers should formally

Re: Spark Improvement Proposals

2016-10-09 Thread Cody Koeninger
ect a strategy before you've really begun designing and implementing something? What if you discover that the strategy is actually better when you start doing stuff? At a super high level, it depends o

Re: Spark Improvement Proposals

2016-10-09 Thread Nicholas Chammas
On Sun, Oct 9, 2016 at 5:19 PM Cody Koeninger wrote: > Regarding name, if the SIP overlap is a concern, we can pick a different > name. > > My tongue in cheek suggestion would be > > Spark Lightweight Improvement process (SPARKLI) > If others share my minor concern about the

Re: Spark Improvement Proposals

2016-10-09 Thread Matei Zaharia
d design docs (just a more visible design doc for bigger changes). I looked at Kafka's KIPs, and they actually seem to be more like design docs. This can work too but it does require more work from the proposer and it can lead to the same problems

Re: Spark Improvement Proposals

2016-10-09 Thread Cody Koeninger
rategy is actually better when you start doing stuff? At a super high level, it depends on whether you want the SIPs to be PRDs for getting some quick feedback on the goals of a feature before it is

Re: Spark Improvement Proposals

2016-10-09 Thread Cody Koeninger
designed, or something more like full-fledged design docs (just a more visible design doc for bigger changes). I looked at Kafka's KIPs, and they actually seem to be more like design docs. This can work too but it does require more work from the

Re: Spark Improvement Proposals

2016-10-09 Thread Ofir Manor
looked at Kafka's KIPs, and they actually seem to be more like design docs. This can work too but it does require more work from the proposer and it can lead to the same problems you mentioned with people already having a design and implementation in mind.

Re: Spark Improvement Proposals

2016-10-09 Thread Matei Zaharia
dding a step for user feedback earlier? Or are you just trying to make design docs for key features more visible (and their approval more formal)? BTW note that in either case, I'd like to have a template for design docs too, which should also include goals. I

Re: Spark Improvement Proposals

2016-10-09 Thread Cody Koeninger
problems you mentioned with people already having a design and implementation in mind. Basically, the question is, are you trying to iterate faster on design by adding a step for user feedback earlier? Or are you just trying to make design docs for

Re: Spark Improvement Proposals

2016-10-09 Thread Cody Koeninger
ures more visible (and their approval more formal)? > > BTW note that in either case, I'd like to have a template for design docs > too, which should also include goals. I think that would've avoided some of > the issues you brought up. > > Matei > > On Oct 9, 2016, at 10

Re: Spark Improvement Proposals

2016-10-09 Thread Cody Koeninger
osely. Dunno if we want to follow a similar pattern for Spark, since the project’s needs are different. But the Python community has used PEPs to help organize and steer development since 2000; there are plenty of examples there we can probably take inspiration

Re: Spark Improvement Proposals

2016-10-09 Thread Nicholas Chammas
ganize and steer development since 2000; there are plenty of > examples there we can probably take inspiration from. > > By the way, can we call these things something other than Spark > Improvement Proposals? The acronym, SIP, conflicts with Scala SIPs > <http://docs.scala-lang.org/s

Re: Spark Improvement Proposals

2016-10-09 Thread Matei Zaharia
different. But the Python community has used PEPs to help organize > and steer development since 2000; there are plenty of examples there we can > probably take inspiration from. > > By the way, can we call these things something other than Spark Improvement > Proposals? The acro

Re: Spark Improvement Proposals

2016-10-09 Thread Nicholas Chammas
t since 2000; there are plenty of examples there we can probably take inspiration from. By the way, can we call these things something other than Spark Improvement Proposals? The acronym, SIP, conflicts with Scala SIPs <http://docs.scala-lang.org/sips/index.html>. Since the Scala and Spark communities hav

Re: Spark Improvement Proposals

2016-10-09 Thread Cody Koeninger
Here's my specific proposal (meta-proposal?) Spark Improvement Proposals (SIP) Background: The current problem is that design and implementation of large features are often done in private, before soliciting user feedback. When feedback is solicited, it is often as to detailed design

Re: Improving governance / committers (split from Spark Improvement Proposals thread)

2016-10-08 Thread Cody Koeninger
It's not about technical design disagreement as to matters of taste, it's about familiarity with the domain. To make an analogy, it's as if a committer in MLlib was firmly intent on, I dunno, treating a collection of categorical variables as if it were an ordered range of continuous variables.

Re: Spark Improvement Proposals

2016-10-08 Thread vaquar khan
+1 for SIP labels; waiting for Reynold's detailed proposal. Regards, Vaquar khan On 8 Oct 2016 16:22, "Matei Zaharia" wrote: > Sounds good. Just to comment on the compatibility part: > > > I meant changing public user interfaces. I think the first design is > >

Re: Spark Improvement Proposals

2016-10-08 Thread Matei Zaharia
Sounds good. Just to comment on the compatibility part: > I meant changing public user interfaces. I think the first design is > unlikely to be right, because it's done at a time when you have the > least information. As a user, I find it considerably more frustrating > to be unable to use a

Re: Improving governance / committers (split from Spark Improvement Proposals thread)

2016-10-08 Thread Matei Zaharia
This makes a lot of sense; just to comment on a few things: > - More committers > Just looking at the ratio of committers to open tickets, or committers > to contributors, I don't think you have enough human power. > I realize this is a touchy issue. I don't have dog in this fight, > because I'm

Re: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-08 Thread Matei Zaharia
From: Nicholas Chammas [via Apache Spark Developers List] [mailto:ml-node+[hidden email]] Sent: Saturday, October 08, 2016 12:42 AM

Re: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-08 Thread Cody Koeninger
rs List] [mailto:ml-node+[hidden email]] *Sent:* Saturday, October 08, 2016 12:42 AM *To:* Mendelson, Assaf *Subject:* Re: Improving volunteer management / JIRAs (split

Re: Spark Improvement Proposals

2016-10-07 Thread Reynold Xin
Alright, looks like there is quite a bit of support. We should wait to hear from more people too. To push this forward, Cody and I will be working together in the next couple of weeks to come up with a concrete, detailed proposal on what this entails, and then we can discuss this the specific

Re: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-07 Thread Nicholas Chammas
Ah yes, on a given JIRA issue the number of watchers is often a better indicator of community interest than votes. But yeah, it could be any metric or formula we want, as long as it yielded a "reasonable" bar to cross for unsolicited contributions to get committer review--or at the very least a

Re: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-07 Thread Cody Koeninger
I really like the idea of using jira votes (and/or watchers?) as a filter! On Fri, Oct 7, 2016 at 4:41 PM, Nicholas Chammas wrote: > I agree with Cody and others that we need some automation — or at least an > adjusted process — to help us manage organic contributions

Re: Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-07 Thread Nicholas Chammas
I agree with Cody and others that we need some automation — or at least an adjusted process — to help us manage organic contributions better. The objections about automated closing being potentially abrasive are understood, but I wouldn’t accept that as a defeat for automation. Instead, it seems

Re: Spark Improvement Proposals

2016-10-07 Thread Cody Koeninger
Yeah, in case it wasn't clear, I was talking about SIPs for major user-facing or cross-cutting changes, not minor feature adds. On Fri, Oct 7, 2016 at 3:58 PM, Stavros Kontopoulos < stavros.kontopou...@lightbend.com> wrote: > +1 to the SIP label as long as it does not slow down things and it

Improving volunteer management / JIRAs (split from Spark Improvement Proposals thread)

2016-10-07 Thread Cody Koeninger
Matei asked: > I agree about empowering people interested here to contribute, but I'm > wondering, do you think there are technical things that people don't want to > work on, or is it a matter of what there's been time to do? It's a matter of mismanagement and miscommunication. The
