Re: Design docs: consolidation and discoverability

2015-04-27 Thread Nicholas Chammas
I like the idea of having design docs be kept up to date and tracked in
git.

If the Apache repo isn't a good fit, perhaps we can have a separate repo
just for design docs? Maybe something like
github.com/spark-docs/spark-docs/?

If there's other stuff we want to track but haven't, perhaps we can
generalize the purpose of the repo a bit and rename it accordingly (e.g.
spark-misc/spark-misc).

Nick

On Mon, Apr 27, 2015 at 1:21 PM Sandy Ryza sandy.r...@cloudera.com wrote:

 My only issue with Google Docs is that they're mutable, so it's difficult
 to follow a design's history through its revisions and link up JIRA
 comments with the relevant version.

 -Sandy


Re: Design docs: consolidation and discoverability

2015-04-27 Thread Punyashloka Biswal
Nick, I like your idea of keeping it in a separate git repository. It seems
to combine the advantages of the present Google Docs approach with the
crisper history, discoverability, and text format simplicity of GitHub
wikis.

Punya
On Mon, Apr 27, 2015 at 1:30 PM Nicholas Chammas nicholas.cham...@gmail.com
wrote:

 I like the idea of having design docs be kept up to date and tracked in
 git.

 If the Apache repo isn't a good fit, perhaps we can have a separate repo
 just for design docs? Maybe something like
 github.com/spark-docs/spark-docs/?

 If there's other stuff we want to track but haven't, perhaps we can
 generalize the purpose of the repo a bit and rename it accordingly (e.g.
 spark-misc/spark-misc).

 Nick


Re: Design docs: consolidation and discoverability

2015-04-27 Thread Sandy Ryza
My only issue with Google Docs is that they're mutable, so it's difficult
to follow a design's history through its revisions and link up JIRA
comments with the relevant version.

-Sandy

On Mon, Apr 27, 2015 at 7:54 AM, Steve Loughran ste...@hortonworks.com
wrote:


 One thing to consider is that while docs as PDFs in JIRAs do document the
 original proposal, that's not the place to keep living specifications. That
 stuff needs to live in SCM, in a format which can be easily maintained, can
 generate readable documents, and, in an unrealistically ideal world, even
 be used by machines to validate compliance with the design. Test suites
 tend to be the implicit machine-readable part of the specification, though
 they aren't usually viewed as such.

 PDFs of Word docs in JIRAs are not the place for ongoing work, even if the
 early drafts can contain them. Given it's just as easy to point to markdown
 docs in GitHub by commit ID, that could be an alternative way to publish
 docs, with the document itself being viewed as one of the deliverables.
 When the time comes to update a document, then it's there in the source tree
 to edit.

 If there's a flaw here, it's that design docs are just that: the design. The
 implementation may not match, and ongoing work will certainly diverge. If the
 design docs aren't kept in sync, then they can mislead people. Accordingly,
 once the design docs are incorporated into the source tree, keeping them in
 sync with changes has to be viewed as being as essential as keeping tests up
 to date.


Re: Design docs: consolidation and discoverability

2015-04-27 Thread Punyashloka Biswal
GitHub's wiki is just another Git repo. If we use a separate repo, it's
probably easiest to use the wiki Git repo rather than the primary Git
repo.

Punya

On Mon, Apr 27, 2015 at 1:50 PM Nicholas Chammas nicholas.cham...@gmail.com
wrote:

 Oh, a GitHub wiki (which is separate from having docs in a repo) is yet
 another approach we could take, though if we want to do that on the main
 Spark repo we'd need permission from Apache, which may be tough to get...


Re: Design docs: consolidation and discoverability

2015-04-26 Thread Patrick Wendell
I actually don't totally see why we can't use Google Docs provided it
is clearly discoverable from the JIRA. It was my understanding that
many projects do this. Maybe not (?).

If it's a matter of maintaining a public record on ASF infrastructure,
perhaps we can just automate that: when an issue is closed, we capture the
doc content and attach it to the JIRA as a PDF.

My sense is that in general the ASF infrastructure policy is becoming
more and more lenient with regard to using third-party services,
provided they are broadly accessible (such as a public Google Doc) and
can be definitively archived on ASF-controlled storage.

- Patrick

On Fri, Apr 24, 2015 at 4:57 PM, Sean Owen so...@cloudera.com wrote:
 I know I recently used Google Docs from a JIRA, so am guilty as
 charged. I don't think there are a lot of design docs in general, but
 the ones I've seen have simply pushed docs to a JIRA. (I did the same,
 mirroring PDFs of the Google Doc.) I don't think this is hard to
 follow.

 I think you can do what you like: make a JIRA and attach files. Make a
 WIP PR and attach your notes. Make a Google Doc if you're feeling
 transgressive.

 I don't see much of a problem to solve here. In practice there are
 plenty of workable options, all of which are mainstream, and so I do
 not see an argument that somehow this is solved by letting people make
 wikis.


Re: Design docs: consolidation and discoverability

2015-04-24 Thread Sean Owen
That would require giving wiki access to everyone or manually adding people
any time they make a doc.

I don't see how this helps though. They're still docs on the internet and
they're still linked from the central project JIRA, which is what you
should follow.
 On Apr 24, 2015 8:14 AM, Punyashloka Biswal punya.bis...@gmail.com
wrote:

 Dear Spark devs,

 Right now, design docs are stored on Google Docs and linked from tickets.
 For someone new to the project, it's hard to figure out what subjects are
 being discussed, what organization to follow for new feature proposals,
 etc.

 Would it make sense to consolidate future design docs in either a
 designated area on the Apache Confluence Wiki, or on GitHub's Wiki pages?
 If people have a strong preference to keep the design docs on Google Docs,
 then could we have a top-level page on the Confluence wiki that lists all
 active and archived design docs?

 Punya



Re: Design docs: consolidation and discoverability

2015-04-24 Thread Cody Koeninger
My 2 cents - I'd rather see design docs in GitHub pull requests (using
plain text / markdown).  That doesn't require changing access or adding
people, and GitHub PRs already allow for conversation / email notifications.

Conversation is already split between JIRA and GitHub PRs.  Having a third
stream of conversation in Google Docs just leads to things being ignored.

On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen so...@cloudera.com wrote:

 That would require giving wiki access to everyone or manually adding people
 any time they make a doc.

 I don't see how this helps though. They're still docs on the internet and
 they're still linked from the central project JIRA, which is what you
 should follow.



Re: Design docs: consolidation and discoverability

2015-04-24 Thread Reynold Xin
I'd love to see more design discussions consolidated in a single place as
well. That said, there are many practical challenges to overcome. Some of
them are out of our control:

1. For large features, it is fairly common to open a PR for discussion,
close the PR taking some feedback into account, and reopen another one. You
sort of lose the discussions that way.

2. With the way Jenkins is set up currently, Jenkins testing introduces a
lot of noise to GitHub pull requests, making it hard to differentiate
legitimate comments from noise. This is unfortunately due to the fact that
ASF won't allow our Jenkins bot to have API privileges to post messages.

3. The Apache Way is that all development discussions need to happen on ASF
property, i.e. dev lists and JIRA. As a result, technically we are not
allowed to have development discussions on GitHub.


On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger c...@koeninger.org wrote:

 My 2 cents - I'd rather see design docs in github pull requests (using
 plain text / markdown).  That doesn't require changing access or adding
 people, and github PRs already allow for conversation / email
 notifications.

 Conversation is already split between jira and github PRs.  Having a third
 stream of conversation in Google Docs just leads to things being ignored.




Re: Design docs: consolidation and discoverability

2015-04-24 Thread Punyashloka Biswal
Okay, I can understand wanting to keep Git history clean, and avoid
bottlenecking on committers. Is it reasonable to establish a convention of
having a label, component or (best of all) an issue type for issues that
are associated with design docs? For example, if we used the existing
Brainstorming issue type, and people put their design doc in the
description of the ticket, it would be relatively easy to figure out what
designs are in progress.

Given the push-back against design docs in Git or on the wiki and the
strong preference for keeping docs on ASF property, I'm a bit surprised
that all the existing design docs are on Google Docs. Perhaps Apache should
consider opening up parts of the wiki to a larger group, to better serve
this use case.

Punya

On Fri, Apr 24, 2015 at 5:01 PM Patrick Wendell pwend...@gmail.com wrote:

 Using our ASF git repository as a working area for design docs, it
 seems potentially concerning to me. It's difficult process wise
 because all commits need to go through committers and also, we'd
 pollute our git history a lot with random incremental design updates.

 The git history is used a lot by downstream packagers, us during our
 QA process, etc... we really try to keep it oriented around code
 patches:

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog

 Committing a polished design doc along with a feature, maybe that's
 something we could consider. But I still think JIRA is the best
 location for these docs, consistent with what most other ASF projects
 do that I know.

 On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger c...@koeninger.org
 wrote:
  Why can't pull requests be used for design docs in Git if people who
  aren't committers want to contribute changes (as opposed to just
  comments)?
 
  On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen so...@cloudera.com wrote:
 
  Only catch there is it requires commit access to the repo. We need a
  way for people who aren't committers to write and collaborate (for
  point #1)
 
  On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
  punya.bis...@gmail.com wrote:
   Sandy, doesn't keeping (in-progress) design docs in Git satisfy the history
   requirement? Referring back to my Gradle example, it seems that
   https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
   is a really good way to see why the design doc evolved the way it did. When
   keeping the doc in Jira (presumably as an attachment) it's not easy to see
   what changed between successive versions of the doc.
  
   Punya
  
   On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza sandy.r...@cloudera.com
  wrote:
  
   I think there are maybe two separate things we're talking about?
  
   1. Design discussions and in-progress design docs.
  
   My two cents are that JIRA is the best place for this.  It allows tracking
   the progression of a design across multiple PRs and contributors.  A piece
   of useful feedback that I've gotten in the past is to make design docs
   immutable.  When updating them in response to feedback, post a new version
   rather than editing the existing one.  This enables tracking the history of
   a design and makes it possible to read comments about previous designs in
   context.  Otherwise it's really difficult to understand why particular
   approaches were chosen or abandoned.
  
   2. Completed design docs for features that we've implemented.
  
   Perhaps less essential to project progress, but it would be really lovely
   to have a central repository to all the projects design doc.  If anyone
   wants to step up to maintain it, it would be cool to have a wiki page with
   links to all the final design docs posted on JIRA.
  
 



Re: Design docs: consolidation and discoverability

2015-04-24 Thread Sean Owen
I know I recently used Google Docs from a JIRA, so am guilty as
charged. I don't think there are a lot of design docs in general, but
the ones I've seen have simply pushed docs to a JIRA. (I did the same,
mirroring PDFs of the Google Doc.) I don't think this is hard to
follow.

I think you can do what you like: make a JIRA and attach files. Make a
WIP PR and attach your notes. Make a Google Doc if you're feeling
transgressive.

I don't see much of a problem to solve here. In practice there are
plenty of workable options, all of which are mainstream, and so I do
not see an argument that somehow this is solved by letting people make
wikis.

On Fri, Apr 24, 2015 at 7:42 PM, Punyashloka Biswal
punya.bis...@gmail.com wrote:
 Okay, I can understand wanting to keep Git history clean, and avoid
 bottlenecking on committers. Is it reasonable to establish a convention of
 having a label, component or (best of all) an issue type for issues that are
 associated with design docs? For example, if we used the existing
 Brainstorming issue type, and people put their design doc in the
 description of the ticket, it would be relatively easy to figure out what
 designs are in progress.

 Given the push-back against design docs in Git or on the wiki and the strong
 preference for keeping docs on ASF property, I'm a bit surprised that all
 the existing design docs are on Google Docs. Perhaps Apache should consider
 opening up parts of the wiki to a larger group, to better serve this use
 case.

 Punya

 On Fri, Apr 24, 2015 at 5:01 PM Patrick Wendell pwend...@gmail.com wrote:

 Using our ASF git repository as a working area for design docs, it
 seems potentially concerning to me. It's difficult process wise
 because all commits need to go through committers and also, we'd
 pollute our git history a lot with random incremental design updates.

 The git history is used a lot by downstream packagers, us during our
 QA process, etc... we really try to keep it oriented around code
 patches:

 https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog

 Committing a polished design doc along with a feature, maybe that's
 something we could consider. But I still think JIRA is the best
 location for these docs, consistent with what most other ASF projects
 do that I know.

 On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger c...@koeninger.org
 wrote:
  Why can't pull requests be used for design docs in Git if people who
  aren't
  committers want to contribute changes (as opposed to just comments)?
 
  On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen so...@cloudera.com wrote:
 
  Only catch there is it requires commit access to the repo. We need a
  way for people who aren't committers to write and collaborate (for
  point #1)
 
  On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
  punya.bis...@gmail.com wrote:
   Sandy, doesn't keeping (in-progress) design docs in Git satisfy the
  history
   requirement? Referring back to my Gradle example, it seems that
  
 
  https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
   is a really good way to see why the design doc evolved the way it
   did.
  When
   keeping the doc in Jira (presumably as an attachment) it's not easy
   to
  see
   what changed between successive versions of the doc.
  
   Punya
  
   On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza sandy.r...@cloudera.com
  wrote:
  
   I think there are maybe two separate things we're talking about?
  
   1. Design discussions and in-progress design docs.
  
   My two cents are that JIRA is the best place for this.  It allows
  tracking
   the progression of a design across multiple PRs and contributors.  A
  piece
   of useful feedback that I've gotten in the past is to make design
   docs
   immutable.  When updating them in response to feedback, post a new
  version
   rather than editing the existing one.  This enables tracking the
  history of
   a design and makes it possible to read comments about previous
   designs
  in
   context.  Otherwise it's really difficult to understand why
   particular
   approaches were chosen or abandoned.
  
   2. Completed design docs for features that we've implemented.
  
   Perhaps less essential to project progress, but it would be really
  lovely
   to have a central repository to all the projects design doc.  If
   anyone
   wants to step up to maintain it, it would be cool to have a wiki
   page
  with
   links to all the final design docs posted on JIRA.
  
 

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: Design docs: consolidation and discoverability

2015-04-24 Thread Sean Owen
I think it's OK to have design discussions on github, as emails go to
ASF lists. After all, loads of PR discussions happen there. It's easy
for anyone to follow.

I also would rather just discuss on Github, except for all that noise.

It's not great to put discussions in something like Google Docs
actually; the resulting doc needs to be pasted back to JIRA promptly
if so. I suppose it's still better than a private conversation or not
talking at all, but the principle is that one should be able to access
any substantive decision or conversation by being tuned in to only the
project systems of record -- mailing list, JIRA.



On Fri, Apr 24, 2015 at 2:30 PM, Reynold Xin r...@databricks.com wrote:
 I'd love to see more design discussions consolidated in a single place as
 well. That said, there are many practical challenges to overcome. Some of
 them are out of our control:

 1. For large features, it is fairly common to open a PR for discussion,
 close the PR taking some feedback into account, and reopen another one. You
 sort of lose the discussions that way.

 2. With the way Jenkins is setup currently, Jenkins testing introduces a lot
 of noise to GitHub pull requests, making it hard to differentiate legitimate
 comments from noise. This is unfortunately due to the fact that ASF won't
 allow our Jenkins bot to have API privilege to post messages.

 3. The Apache Way is that all development discussions need to happen on ASF
 property, i.e. dev lists and JIRA. As a result, technically we are not
 allowed to have development discussions on GitHub.


 On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger c...@koeninger.org wrote:

 My 2 cents - I'd rather see design docs in github pull requests (using
 plain text / markdown).  That doesn't require changing access or adding
 people, and github PRs already allow for conversation / email
 notifications.

 Conversation is already split between jira and github PRs.  Having a third
 stream of conversation in Google Docs just leads to things being ignored.

 On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen so...@cloudera.com wrote:

  That would require giving wiki access to everyone or manually adding
  people
  any time they make a doc.
 
  I don't see how this helps though. They're still docs on the internet
  and
  they're still linked from the central project JIRA, which is what you
  should follow.
   On Apr 24, 2015 8:14 AM, Punyashloka Biswal punya.bis...@gmail.com
  wrote:
 
   Dear Spark devs,
  
   Right now, design docs are stored on Google docs and linked from
   tickets.
   For someone new to the project, it's hard to figure out what subjects
   are
   being discussed, what organization to follow for new feature
   proposals,
   etc.
  
   Would it make sense to consolidate future design docs in either a
   designated area on the Apache Confluence Wiki, or on GitHub's Wiki
   pages?
   If people have a strong preference to keep the design docs on Google
  Docs,
   then could we have a top-level page on the confluence wiki that lists
   all
   active and archived design docs?
  
   Punya
  
 






Re: Design docs: consolidation and discoverability

2015-04-24 Thread Punyashloka Biswal
The Gradle dev team keep their design documents *checked into* their Git
repository -- see
https://github.com/gradle/gradle/blob/master/design-docs/build-comparison.md
for example. The advantages I see to their approach are:

   - design docs stay on ASF property (since Github is synced to the
   Apache-run Git repository)
   - design docs have a lifetime across PRs, but can still be modified and
   commented on through the mechanism of PRs
   - keeping a central location helps people to find good role models and
   converge on conventions

Sean, I find it hard to use the central Jira as a jumping-off point for
understanding ongoing design work because a tiny fraction of the tickets
actually relate to design docs, and it's not easy from the outside to
figure out which ones are relevant.

Punya

On Fri, Apr 24, 2015 at 2:49 PM Sean Owen so...@cloudera.com wrote:

 I think it's OK to have design discussions on github, as emails go to
 ASF lists. After all, loads of PR discussions happen there. It's easy
 for anyone to follow.

 I also would rather just discuss on Github, except for all that noise.

 It's not great to put discussions in something like Google Docs
 actually; the resulting doc needs to be pasted back to JIRA promptly
 if so. I suppose it's still better than a private conversation or not
 talking at all, but the principle is that one should be able to access
 any substantive decision or conversation by being tuned in to only the
 project systems of record -- mailing list, JIRA.



 On Fri, Apr 24, 2015 at 2:30 PM, Reynold Xin r...@databricks.com wrote:
  I'd love to see more design discussions consolidated in a single place as
  well. That said, there are many practical challenges to overcome. Some of
  them are out of our control:
 
  1. For large features, it is fairly common to open a PR for discussion,
  close the PR taking some feedback into account, and reopen another one.
 You
  sort of lose the discussions that way.
 
  2. With the way Jenkins is setup currently, Jenkins testing introduces a
 lot
  of noise to GitHub pull requests, making it hard to differentiate
 legitimate
  comments from noise. This is unfortunately due to the fact that ASF won't
  allow our Jenkins bot to have API privilege to post messages.
 
  3. The Apache Way is that all development discussions need to happen on
 ASF
  property, i.e. dev lists and JIRA. As a result, technically we are not
  allowed to have development discussions on GitHub.
 
 
  On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger c...@koeninger.org
 wrote:
 
  My 2 cents - I'd rather see design docs in github pull requests (using
  plain text / markdown).  That doesn't require changing access or adding
  people, and github PRs already allow for conversation / email
  notifications.
 
  Conversation is already split between jira and github PRs.  Having a
 third
  stream of conversation in Google Docs just leads to things being
 ignored.
 
  On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen so...@cloudera.com wrote:
 
   That would require giving wiki access to everyone or manually adding
   people
   any time they make a doc.
  
   I don't see how this helps though. They're still docs on the internet
   and
   they're still linked from the central project JIRA, which is what you
   should follow.
On Apr 24, 2015 8:14 AM, Punyashloka Biswal 
 punya.bis...@gmail.com
   wrote:
  
Dear Spark devs,
   
Right now, design docs are stored on Google docs and linked from
tickets.
For someone new to the project, it's hard to figure out what
 subjects
are
being discussed, what organization to follow for new feature
proposals,
etc.
   
Would it make sense to consolidate future design docs in either a
designated area on the Apache Confluence Wiki, or on GitHub's Wiki
pages?
If people have a strong preference to keep the design docs on Google
   Docs,
then could we have a top-level page on the confluence wiki that
 lists
all
active and archived design docs?
   
Punya
   
  
 
 



Re: Design docs: consolidation and discoverability

2015-04-24 Thread Patrick Wendell
Using our ASF git repository as a working area for design docs seems
potentially concerning to me. It's difficult process-wise because all
commits need to go through committers, and we'd also pollute our git
history with a lot of random incremental design updates.

The git history is used a lot by downstream packagers, by us during our
QA process, etc. We really try to keep it oriented around code
patches:

https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog

Committing a polished design doc along with a feature is maybe
something we could consider. But I still think JIRA is the best
location for these docs, consistent with what most other ASF projects
that I know of do.

On Fri, Apr 24, 2015 at 1:19 PM, Cody Koeninger c...@koeninger.org wrote:
 Why can't pull requests be used for design docs in Git if people who aren't
 committers want to contribute changes (as opposed to just comments)?

 On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen so...@cloudera.com wrote:

 Only catch there is it requires commit access to the repo. We need a
 way for people who aren't committers to write and collaborate (for
 point #1)

 On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
 punya.bis...@gmail.com wrote:
  Sandy, doesn't keeping (in-progress) design docs in Git satisfy the
 history
  requirement? Referring back to my Gradle example, it seems that
 
 https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
  is a really good way to see why the design doc evolved the way it did.
 When
  keeping the doc in Jira (presumably as an attachment) it's not easy to
 see
  what changed between successive versions of the doc.
 
  Punya
 
  On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza sandy.r...@cloudera.com
 wrote:
 
  I think there are maybe two separate things we're talking about?
 
  1. Design discussions and in-progress design docs.
 
  My two cents are that JIRA is the best place for this.  It allows
 tracking
  the progression of a design across multiple PRs and contributors.  A
 piece
  of useful feedback that I've gotten in the past is to make design docs
  immutable.  When updating them in response to feedback, post a new
 version
  rather than editing the existing one.  This enables tracking the
 history of
  a design and makes it possible to read comments about previous designs
 in
  context.  Otherwise it's really difficult to understand why particular
  approaches were chosen or abandoned.
 
  2. Completed design docs for features that we've implemented.
 
  Perhaps less essential to project progress, but it would be really
 lovely
  to have a central repository to all the projects design doc.  If anyone
  wants to step up to maintain it, it would be cool to have a wiki page
 with
  links to all the final design docs posted on JIRA.
 





Re: Design docs: consolidation and discoverability

2015-04-24 Thread Sandy Ryza
I think there are maybe two separate things we're talking about?

1. Design discussions and in-progress design docs.

My two cents are that JIRA is the best place for this.  It allows tracking
the progression of a design across multiple PRs and contributors.  A piece
of useful feedback that I've gotten in the past is to make design docs
immutable.  When updating them in response to feedback, post a new version
rather than editing the existing one.  This enables tracking the history of
a design and makes it possible to read comments about previous designs in
context.  Otherwise it's really difficult to understand why particular
approaches were chosen or abandoned.

2. Completed design docs for features that we've implemented.

Perhaps less essential to project progress, but it would be really lovely
to have a central repository for all the project's design docs.  If anyone
wants to step up to maintain it, it would be cool to have a wiki page with
links to all the final design docs posted on JIRA.

-Sandy

On Fri, Apr 24, 2015 at 12:01 PM, Punyashloka Biswal punya.bis...@gmail.com
 wrote:

 The Gradle dev team keep their design documents  *checked into* their Git
 repository -- see

 https://github.com/gradle/gradle/blob/master/design-docs/build-comparison.md
 for example. The advantages I see to their approach are:

- design docs stay on ASF property (since Github is synced to the
Apache-run Git repository)
- design docs have a lifetime across PRs, but can still be modified and
commented on through the mechanism of PRs
- keeping a central location helps people to find good role models and
converge on conventions

 Sean, I find it hard to use the central Jira as a jumping-off point for
 understanding ongoing design work because a tiny fraction of the tickets
 actually relate to design docs, and it's not easy from the outside to
 figure out which ones are relevant.

 Punya

 On Fri, Apr 24, 2015 at 2:49 PM Sean Owen so...@cloudera.com wrote:

  I think it's OK to have design discussions on github, as emails go to
  ASF lists. After all, loads of PR discussions happen there. It's easy
  for anyone to follow.
 
  I also would rather just discuss on Github, except for all that noise.
 
  It's not great to put discussions in something like Google Docs
  actually; the resulting doc needs to be pasted back to JIRA promptly
  if so. I suppose it's still better than a private conversation or not
  talking at all, but the principle is that one should be able to access
  any substantive decision or conversation by being tuned in to only the
  project systems of record -- mailing list, JIRA.
 
 
 
  On Fri, Apr 24, 2015 at 2:30 PM, Reynold Xin r...@databricks.com
 wrote:
   I'd love to see more design discussions consolidated in a single place
 as
   well. That said, there are many practical challenges to overcome. Some
 of
   them are out of our control:
  
   1. For large features, it is fairly common to open a PR for discussion,
   close the PR taking some feedback into account, and reopen another one.
  You
   sort of lose the discussions that way.
  
   2. With the way Jenkins is setup currently, Jenkins testing introduces
 a
  lot
   of noise to GitHub pull requests, making it hard to differentiate
  legitimate
   comments from noise. This is unfortunately due to the fact that ASF
 won't
   allow our Jenkins bot to have API privilege to post messages.
  
   3. The Apache Way is that all development discussions need to happen on
  ASF
   property, i.e. dev lists and JIRA. As a result, technically we are not
   allowed to have development discussions on GitHub.
  
  
   On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger c...@koeninger.org
  wrote:
  
   My 2 cents - I'd rather see design docs in github pull requests (using
   plain text / markdown).  That doesn't require changing access or
 adding
   people, and github PRs already allow for conversation / email
   notifications.
  
   Conversation is already split between jira and github PRs.  Having a
  third
   stream of conversation in Google Docs just leads to things being
  ignored.
  
   On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen so...@cloudera.com
 wrote:
  
That would require giving wiki access to everyone or manually adding
people
any time they make a doc.
   
I don't see how this helps though. They're still docs on the
 internet
and
they're still linked from the central project JIRA, which is what
 you
should follow.
 On Apr 24, 2015 8:14 AM, Punyashloka Biswal 
  punya.bis...@gmail.com
wrote:
   
 Dear Spark devs,

 Right now, design docs are stored on Google docs and linked from
 tickets.
 For someone new to the project, it's hard to figure out what
  subjects
 are
 being discussed, what organization to follow for new feature
 proposals,
 etc.

 Would it make sense to consolidate future design docs in either a
 designated area on the Apache Confluence 

Re: Design docs: consolidation and discoverability

2015-04-24 Thread Punyashloka Biswal
Sandy, doesn't keeping (in-progress) design docs in Git satisfy the history
requirement? Referring back to my Gradle example, it seems that
https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
is a really good way to see why the design doc evolved the way it did. When
keeping the doc in Jira (presumably as an attachment) it's not easy to see
what changed between successive versions of the doc.

Punya

On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza sandy.r...@cloudera.com wrote:

 I think there are maybe two separate things we're talking about?

 1. Design discussions and in-progress design docs.

 My two cents are that JIRA is the best place for this.  It allows tracking
 the progression of a design across multiple PRs and contributors.  A piece
 of useful feedback that I've gotten in the past is to make design docs
 immutable.  When updating them in response to feedback, post a new version
 rather than editing the existing one.  This enables tracking the history of
 a design and makes it possible to read comments about previous designs in
 context.  Otherwise it's really difficult to understand why particular
 approaches were chosen or abandoned.

 2. Completed design docs for features that we've implemented.

 Perhaps less essential to project progress, but it would be really lovely
 to have a central repository to all the projects design doc.  If anyone
 wants to step up to maintain it, it would be cool to have a wiki page with
 links to all the final design docs posted on JIRA.

 -Sandy

 On Fri, Apr 24, 2015 at 12:01 PM, Punyashloka Biswal 
 punya.bis...@gmail.com wrote:

 The Gradle dev team keep their design documents *checked into* their Git
 repository -- see
 https://github.com/gradle/gradle/blob/master/design-docs/build-comparison.md
 for example. The advantages I see to their approach are:

 - design docs stay on ASF property (since Github is synced to the
 Apache-run Git repository)
 - design docs have a lifetime across PRs, but can still be modified and
 commented on through the mechanism of PRs
 - keeping a central location helps people to find good role models and
 converge on conventions

 Sean, I find it hard to use the central Jira as a jumping-off point for
 understanding ongoing design work because a tiny fraction of the tickets
 actually relate to design docs, and it's not easy from the outside to
 figure out which ones are relevant.

 Punya

 On Fri, Apr 24, 2015 at 2:49 PM Sean Owen so...@cloudera.com wrote:

  I think it's OK to have design discussions on github, as emails go to
  ASF lists. After all, loads of PR discussions happen there. It's easy
  for anyone to follow.
 
  I also would rather just discuss on Github, except for all that noise.
 
  It's not great to put discussions in something like Google Docs
  actually; the resulting doc needs to be pasted back to JIRA promptly
  if so. I suppose it's still better than a private conversation or not
  talking at all, but the principle is that one should be able to access
  any substantive decision or conversation by being tuned in to only the
  project systems of record -- mailing list, JIRA.
 
 
 
  On Fri, Apr 24, 2015 at 2:30 PM, Reynold Xin r...@databricks.com
 wrote:
   I'd love to see more design discussions consolidated in a single
 place as
   well. That said, there are many practical challenges to overcome.
 Some of
   them are out of our control:
  
   1. For large features, it is fairly common to open a PR for
 discussion,
   close the PR taking some feedback into account, and reopen another
 one.
  You
   sort of lose the discussions that way.
  
   2. With the way Jenkins is setup currently, Jenkins testing
 introduces a
  lot
   of noise to GitHub pull requests, making it hard to differentiate
  legitimate
   comments from noise. This is unfortunately due to the fact that ASF
 won't
   allow our Jenkins bot to have API privilege to post messages.
  
   3. The Apache Way is that all development discussions need to happen
 on
  ASF
   property, i.e. dev lists and JIRA. As a result, technically we are not
   allowed to have development discussions on GitHub.
  
  
   On Fri, Apr 24, 2015 at 7:09 AM, Cody Koeninger c...@koeninger.org
  wrote:
  
   My 2 cents - I'd rather see design docs in github pull requests
 (using
   plain text / markdown).  That doesn't require changing access or
 adding
   people, and github PRs already allow for conversation / email
   notifications.
  
   Conversation is already split between jira and github PRs.  Having a
  third
   stream of conversation in Google Docs just leads to things being
  ignored.
  
   On Fri, Apr 24, 2015 at 7:21 AM, Sean Owen so...@cloudera.com
 wrote:
  
That would require giving wiki access to everyone or manually
 adding
people
any time they make a doc.
   
I don't see how this helps though. They're still docs on the
 internet
and
they're still linked from the central project JIRA, 

Re: Design docs: consolidation and discoverability

2015-04-24 Thread Sean Owen
The only catch there is that it requires commit access to the repo. We need a
way for people who aren't committers to write and collaborate (for
point #1).

On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
punya.bis...@gmail.com wrote:
 Sandy, doesn't keeping (in-progress) design docs in Git satisfy the history
 requirement? Referring back to my Gradle example, it seems that
 https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
 is a really good way to see why the design doc evolved the way it did. When
 keeping the doc in Jira (presumably as an attachment) it's not easy to see
 what changed between successive versions of the doc.

 Punya

 On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza sandy.r...@cloudera.com wrote:

 I think there are maybe two separate things we're talking about?

 1. Design discussions and in-progress design docs.

 My two cents are that JIRA is the best place for this.  It allows tracking
 the progression of a design across multiple PRs and contributors.  A piece
 of useful feedback that I've gotten in the past is to make design docs
 immutable.  When updating them in response to feedback, post a new version
 rather than editing the existing one.  This enables tracking the history of
 a design and makes it possible to read comments about previous designs in
 context.  Otherwise it's really difficult to understand why particular
 approaches were chosen or abandoned.

 2. Completed design docs for features that we've implemented.

 Perhaps less essential to project progress, but it would be really lovely
 to have a central repository to all the projects design doc.  If anyone
 wants to step up to maintain it, it would be cool to have a wiki page with
 links to all the final design docs posted on JIRA.





Re: Design docs: consolidation and discoverability

2015-04-24 Thread Cody Koeninger
Why can't pull requests be used for design docs in Git if people who aren't
committers want to contribute changes (as opposed to just comments)?

On Fri, Apr 24, 2015 at 2:57 PM, Sean Owen so...@cloudera.com wrote:

 Only catch there is it requires commit access to the repo. We need a
 way for people who aren't committers to write and collaborate (for
 point #1)

 On Fri, Apr 24, 2015 at 3:56 PM, Punyashloka Biswal
 punya.bis...@gmail.com wrote:
  Sandy, doesn't keeping (in-progress) design docs in Git satisfy the
 history
  requirement? Referring back to my Gradle example, it seems that
 
 https://github.com/gradle/gradle/commits/master/design-docs/build-comparison.md
  is a really good way to see why the design doc evolved the way it did.
 When
  keeping the doc in Jira (presumably as an attachment) it's not easy to
 see
  what changed between successive versions of the doc.
 
  Punya
 
  On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza sandy.r...@cloudera.com
 wrote:
 
  I think there are maybe two separate things we're talking about?
 
  1. Design discussions and in-progress design docs.
 
  My two cents are that JIRA is the best place for this.  It allows
 tracking
  the progression of a design across multiple PRs and contributors.  A
 piece
  of useful feedback that I've gotten in the past is to make design docs
  immutable.  When updating them in response to feedback, post a new
 version
  rather than editing the existing one.  This enables tracking the
 history of
  a design and makes it possible to read comments about previous designs
 in
  context.  Otherwise it's really difficult to understand why particular
  approaches were chosen or abandoned.
 
  2. Completed design docs for features that we've implemented.
 
  Perhaps less essential to project progress, but it would be really
 lovely
  to have a central repository to all the projects design doc.  If anyone
  wants to step up to maintain it, it would be cool to have a wiki page
 with
  links to all the final design docs posted on JIRA.