Hear, hear! :)  Completely agree with you - here are the latest updates
to Proposed Community Mailing Lists / StackOverflow Changes
<https://docs.google.com/document/d/1N0pKatcM15cqBPqFWCqIy6jdgNzIoacZlYDCjufBh2s/edit#>.
Keep them coming, though at this point I'd like to limit new verbiage to
keep it from getting too long and hence not being read.  Modifications and
suggestions are absolutely welcome - just asking that we don't make it too
much longer.  Thanks!


On Wed, Nov 9, 2016 at 5:36 AM Gerard Maas <gerard.m...@gmail.com> wrote:

> Great discussion. Glad to see it happening, and lucky to have spotted it
> on the mailing list given its high volume.
>
> I had this same conversation with Patrick Wendell a few Spark Summits ago.
> At the time, SO was not even listed as a resource and the idea was to make
> it the primary "go-to" place for questions.
>
> Having contributed to both the list (in its early days) and SO, the
> biggest hurdle IMO is how to deal with lazy people. These days, at SO, I
> spend more time leaving comments than answering in an attempt to moderate
> the requirement of "show some effort" and clarify unclear questions.
>
> It's my impression that the mailing list is much more friendly with "plz
> send me da code" folk and indeed would answer questions that would
> otherwise get down-voted or closed at SO. That also shows in the high email
> volume, which at the same time lowers its value for many of us who get
> overwhelmed. It's hard to separate authentic efforts in getting started,
> which deserve help and encouragement, from "work dumpers" who abuse
> resources to get their thing done and need moderating. Also, beginner
> questions always repeat
> and a mailing list has no features to help with that.
>
> The model I had imagined roughly follows the "Odersky scale":
>  - Users new to the technology and basic "how to" questions belong in
> Stack Overflow. => The search and de-duplication features should help in
> getting an answer if already present, reducing the load.
>  - Advanced discussions and troubleshooting belong in users@
>  - Library bugs, new features and improvements belong in dev@
>
> Of course, there's no hard line between these levels and it would require
> contributor discretion aided by some routing procedure:
>
> - Spark documentation should establish Stack Overflow as the main go-to
> resource.
> - Contributors on the list should kindly redirect "intro level
> questions" to Stack Overflow.
> - SO contributors should redirect potential bugs and questions deserving a
> deeper discussion to @users or @dev as needed
> - @users -> @dev as today
> - Cross-posting SO + @users should be discouraged. The idea is to create
> efficient channels.
>
> A good resource on how and where to ask questions would be a great routing
> channel between the levels above.
> I'm willing to help with moderation efforts on "Spark Overflow" :-) to get
> this going.
>
> The Spark community has always been very welcoming and that spirit should
> be preserved. We just need to channel the efforts in a more efficient way.
>
> my 2c,
>
> Gerard.
>
>
> On Mon, Nov 7, 2016 at 11:24 PM, Maciej Szymkiewicz <
> mszymkiew...@gmail.com> wrote:
>
> Just a couple of random thoughts regarding Stack Overflow...
>
>    - If we are thinking about shifting focus towards SO, all attempts at
>    micromanaging should be discarded right at the beginning. Especially things
>    like meta tags, which are discouraged and "burninated" (
>    https://meta.stackoverflow.com/tags/burninate-request/info), or
>    thread bumping. Depending on the context, these will be unmanageable, go
>    against community guidelines, or simply be obsolete.
>    - Lack of expertise is unlikely to be an issue. Even now there is a number
>    of advanced Spark users on SO. Of course the more the merrier.
>
> Things that can be easily improved:
>
>    - Identifying, improving and promoting canonical questions and
>    answers. This means closing duplicates, suggesting edits to improve existing
>    answers, and providing alternative solutions. This can also be used to identify
>    gaps in the documentation.
>    - Providing a set of clear posting guidelines to reduce effort
>    required to identify the problem (think about
>    http://stackoverflow.com/q/5963269 a.k.a How to make a great R
>    reproducible example?)
>    - Helping users decide if a question is a good fit for SO (see below).
>    API questions are a great fit, debugging problems like "my cluster is slow"
>    are not.
>    - Actively cleaning (closing, deleting) off-topic and low quality
>    questions. The less junk to sieve through, the better the chance of good
>    questions being answered.
>    - Repurposing and actively moderating SO docs (
>    https://stackoverflow.com/documentation/apache-spark/topics). Right
>    now most of the stuff that goes there is useless, duplicated,
>    plagiarized, or borderline spam.
>    - Encouraging the community to monitor featured (
>    https://stackoverflow.com/questions/tagged/apache-spark?sort=featured)
>    and active & upvoted & unanswered (
>    https://stackoverflow.com/unanswered/tagged/apache-spark) questions.
>    - Implementing some procedure to identify questions which are likely
>    to be bugs or material for feature requests. Personally I am quite often
>    tempted to simply send a link to the dev list, but I don't think it is really
>    acceptable.
>    - Animating a Spark-related chat room. I tried this a couple of times
>    but to no avail. Without a certain critical mass of users it just won't
>    work.
>
>
>
> On 11/07/2016 07:32 AM, Reynold Xin wrote:
>
> This is an excellent point. If we do go ahead and feature SO as a way for
> users to ask questions more prominently, as someone who knows SO very well,
> would you be willing to help write a short guideline (ideally the shorter
> the better, which makes it hard) to direct what goes to user@ and what
> goes to SO?
>
>
> Sure, I'll be happy to help if I can.
>
>
>
>
> On Sun, Nov 6, 2016 at 9:54 PM, Maciej Szymkiewicz
> <mszymkiew...@gmail.com> wrote:
>
> Damn, I always thought that the mailing list is only for nice and welcoming
> people and there is nothing to do for me here >:)
>
> To be serious though, there are many questions on the users list which
> would fit just fine on SO, but that is not true in general. There are dozens
> of questions which are too broad, opinion-based, ask for external resources
> and so on. If you want to direct users to SO you have to help them to
> decide if it is the right channel. Otherwise it will just create a really
> bad experience for both those seeking help and active answerers. The former
> will be downvoted and bashed; the latter will have to deal with handling all
> the junk and the number of active Spark users with moderation privileges is
> really low (with only Massg and me being able to directly close duplicates).
>
> Believe me, I've seen this before.
> On 11/07/2016 05:08 AM, Reynold Xin wrote:
>
> You have substantially underestimated how opinionated people can be on
> mailing lists too :)
>
> On Sunday, November 6, 2016, Maciej Szymkiewicz <mszymkiew...@gmail.com>
> wrote:
>
> You have to remember that the Stack Overflow crowd (like me) is highly
> opinionated, so many questions, which could be just fine on the mailing
> list, will be quickly downvoted and / or closed as off-topic. Just
> saying...
>
> --
> Best,
> Maciej
>
>
> On 11/07/2016 04:03 AM, Reynold Xin wrote:
>
> OK I've checked on the ASF member list (which is private so there is no
> public archive).
>
> It is not against any ASF rule to recommend StackOverflow as a place for
> users to ask questions. I don't think we can or should delete the existing
> user@spark list either, but we can certainly make SO more visible than it
> is.
>
>
>
> On Wed, Nov 2, 2016 at 10:21 AM, Reynold Xin <r...@databricks.com> wrote:
>
> Actually after talking with more ASF members, I believe the only policy is
> that development decisions have to be made and announced on ASF properties
> (dev list or jira), but user questions don't have to.
>
> I'm going to double check this. If it is true, I would actually recommend
> moving the Q&A part of the user list entirely over to Stack Overflow, or
> at least making that the recommended way rather than the existing user list
> which is not very scalable.
>
>
> On Wednesday, November 2, 2016, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
> We’ve discussed several times upgrading our communication tools, as far
> back as 2014 and maybe even before that too. The bottom line is that we
> can’t due to ASF rules requiring the use of ASF-managed mailing lists.
>
> For some history, see this discussion:
>
>    - https://mail-archives.apache.org/mod_mbox/spark-user/201412.mbox/%3CCAOhmDzfL2COdysV8r5hZN8f=NqXM=f=oy5no2dhwj_kveop...@mail.gmail.com%3E
>    - https://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CCAOhmDzec1JdsXQq3dDwAv7eLnzRidSkrsKKG0xKw=tktxy_...@mail.gmail.com%3E
>
> (It’s ironic that it’s difficult to follow the past discussion on why we
> can’t change our official communication tools due to those very tools…)
>
> Nick
>
> On Wed, Nov 2, 2016 at 12:24 PM Ricardo Almeida <
> ricardo.alme...@actnowib.com> wrote:
>
> I feel Assaf's point is quite relevant if we want to move this project
> forward from the Spark user perspective (as I do). In fact, we're still
> using 20th century tools (mailing lists) with some add-ons (like Stack
> Overflow).
>
> As usual, Sean and Cody's contributions are very to the point.
> I feel it is indeed a matter of culture (hard to enforce) and tools
> (much easier). Isn't it?
>
> On 2 November 2016 at 16:36, Cody Koeninger <c...@koeninger.org> wrote:
>
> So, concrete things people could do:
>
> - users could tag subject lines appropriately to the component they're
> asking about
>
> - contributors could monitor user@ for tags relating to components
> they've worked on.
> I'd be surprised if my miss rate for any mailing list questions
> well-labeled as Kafka was higher than 5%
>
> - committers could be more aggressive about soliciting and merging PRs
> to improve documentation.
> It's a lot easier to answer even poorly-asked questions with a link to
> relevant docs.
>
> On Wed, Nov 2, 2016 at 7:39 AM, Sean Owen <so...@cloudera.com> wrote:
> > There's already reviews@ and issues@. dev@ is for project development
> > itself and I think is OK. You're suggesting splitting up user@ and I
> > sympathize with the motivation. Experience tells me that we'll have a
> > beginner@ that's then totally ignored, and people will quickly learn to
> > post to advanced@ to get attention, and we'll be back where we started.
> > Putting it in JIRA doesn't help. I don't think this is a problem that is
> > merely down to lack of process. It actually requires cultivating a
> > culture change on the community list.
> >
> > On Wed, Nov 2, 2016 at 12:11 PM Mendelson, Assaf
> > <assaf.mendel...@rsa.com> wrote:
> >>
> >> What I am suggesting is basically to fix that.
> >>
> >> For example, we might say that mailing list A is only for voting,
> >> mailing list B is only for PRs, and have something like Stack Overflow
> >> for developer questions (I would even go as far as to have beginner,
> >> intermediate and advanced mailing lists for users and beginner/advanced
> >> for dev).
> >>
> >>
> >>
> >> This can easily be done using stack overflow tags, however, that would
> >> probably be harder to manage.
> >>
> >> Maybe use special JIRA tags and manage it in JIRA?
> >>
> >>
> >>
> >> Anyway, as I said, the main issue is not user questions (except maybe
> >> advanced ones) but rather dev questions. It is so easy to get lost in
> >> the chatter that it makes it very hard for people to learn Spark
> >> internals…
> >>
> >> Assaf.
> >>
> >>
> >>
> >> From: Sean Owen [mailto:so...@cloudera.com]
> >> Sent: Wednesday, November 02, 2016 2:07 PM
> >> To: Mendelson, Assaf; dev@spark.apache.org
> >> Subject: Re: Handling questions in the mailing lists
> >>
> >>
> >>
> >> I think that unfortunately mailing lists don't scale well. This one has
> >> thousands of subscribers with different interests and levels of
> >> experience. For any given person, most messages will be irrelevant. I
> >> also find that a lot of questions on user@ are not well-asked, aren't
> >> an SSCCE (http://sscce.org/), and aren't something most people are going
> >> to bother replying to even if they could answer. I almost entirely
> >> ignore user@ because there are higher-priority channels like PRs to deal
> >> with, that already have hundreds of messages per day. This is why little
> >> of it gets an answer -- too noisy.
> >>
> >>
> >>
> >> We have to have official mailing lists, in any event, to have some
> >> official channel for things like votes and announcements. It's not
> >> wrong to ask questions on user@ of course, but a lot of the questions I
> >> see could have been answered with research of existing docs or looking
> >> at the code. I think that given the scale of the list, it's not wrong to
> >> assert that this is sort of a prerequisite for asking thousands of
> >> people to answer one's question. But we can't enforce that.
> >>
> >>
> >>
> >> The situation will get better to the extent people ask better questions,
> >> help other people ask better questions, and answer good questions. I'd
> >> encourage anyone feeling this way to try to help along those dimensions.
> >>
> >> On Wed, Nov 2, 2016 at 11:32 AM assaf.mendelson
> >> <assaf.mendel...@rsa.com> wrote:
> >>
> >> Hi,
> >>
> >> I know this is a little off topic but I wanted to raise an issue about
> >> handling questions in the mailing list (this is true both for the user
> >> mailing list and the dev but since there are other options such as stack
> >> overflow for user questions, this is more problematic in dev).
> >>
> >> Let’s say I ask a question (as I recently did). Unfortunately this was
> >> during Spark Summit in Europe so probably people were busy. In any case
> >> no one answered.
> >>
> >> The problem is that if no one answers very soon, the question will
> >> almost certainly remain unanswered because new messages will simply
> >> drown it.
> >>
> >>
> >>
> >> This is a common issue not just for questions but for any comment or
> >> idea which is not immediately picked up.
> >>
> >>
> >>
> >> I believe we should have a method of handling this.
> >>
> >> Generally, I would say these types of things belong in Stack Overflow;
> >> after all, the way it is built is perfect for this. More seasoned Spark
> >> contributors and committers can periodically check out unanswered
> >> questions and answer them.
> >>
> >> The problem is that Stack Overflow (as well as other targets such as the
> >> Databricks forums) tends to have a more user-based orientation. This
> >> means that any Spark-internal question will almost certainly remain
> >> unanswered.
> >>
> >>
> >>
> >> I was wondering if we could come up with a solution for this.
> >>
> >>
> >>
> >> Assaf.
> >>
> >>
> >>
> >>
> >>
> >> ________________________________
> >>
> >> View this message in context: Handling questions in the mailing lists
> >> Sent from the Apache Spark Developers List mailing list archive at
> >> Nabble.com.
>
>
>
>
>
>
>
>
> --
> Maciej Szymkiewicz
>
>
>
