Re: [build system] IMPORTANT! northern california fire danger, potential power outage(s)

2019-10-09 Thread Wenchen Fan
Thanks for the updates!

On Thu, Oct 10, 2019 at 5:34 AM Shane Knapp  wrote:

> quick update:
>
> campus is losing power @ 8pm.  this is after we were told 4am, 8am,
> noon, and 2-4pm.  :)
>
> PG&E expects to start bringing alameda county back online at noon
> tomorrow, but i believe that target is fluid and will take longer than
> expected.
>
> this means that the earliest that we can bring the build system back
> up is friday, but there's a much greater than non-zero chance of this
> not happening until monday morning.  i will be leaving town for the
> weekend friday afternoon, which means i won't be physically present to
> turn on all of our servers in the colo (~80 servers including
> jenkins) until monday.
>
> more updates as they come.  thanks for your patience!
>
> On Tue, Oct 8, 2019 at 7:32 PM Shane Knapp  wrote:
> >
> > jenkins is going down now.
> >
> > On Tue, Oct 8, 2019 at 4:21 PM Shane Knapp  wrote:
> > >
> > > quick update:
> > >
> > > we are definitely going to have our power shut off starting early
> > > tomorrow morning (by 4am PDT oct 9th), and expect at least 48 hours
> > > before it is restored.
> > >
> > > i will be shutting jenkins down some time this evening, and will
> > > update everyone here when i get more information.
> > >
> > > full service will be restored (i HOPE) by friday morning.
> > >
> > > shane (who doesn't ever want to check this list's archives and count
> > > how many times we've had power issues)
> > >
> > > On Tue, Oct 8, 2019 at 12:50 PM Shane Knapp  wrote:
> > > >
> > > > here in the lovely bay area, we are currently experiencing some
> > > > absolutely lovely weather:  temps around 20C, light winds, and not a
> > > > drop of moisture anywhere.
> > > >
> > > > this means that wildfire season is here, and our utilities company
> > > > (PG&E) is very concerned about fires like last year's Camp Fire
> > > > (https://en.wikipedia.org/wiki/Camp_Fire_(2018)), the 2018 fires
> > > > (https://en.wikipedia.org/wiki/2018_California_wildfires) and 2017
> > > > fires (https://en.wikipedia.org/wiki/2017_California_wildfires).
> > > >
> > > > because conditions are absolutely perfect for wildfires, we may lose
> > > > power here in berkeley tomorrow and thursday.
> > > >
> > > > there will be little to no notice of when this might happen, and if it
> > > > does that means that jenkins will most definitely go down.
> > > >
> > > > i will continue to keep a close eye on this and give updates as they
> > > > happen.  sadly, the pg&e website is down because they apparently
> > > > didn't think that they needed load balancers.  :\
> > > >
> > > > shane
> > > > --
> > > > Shane Knapp
> > > > UC Berkeley EECS Research / RISELab Staff Technical Lead
> > > > https://rise.cs.berkeley.edu
> > >
> > >
> > >
> > > --
> > > Shane Knapp
> > > UC Berkeley EECS Research / RISELab Staff Technical Lead
> > > https://rise.cs.berkeley.edu
> >
> >
> >
> > --
> > Shane Knapp
> > UC Berkeley EECS Research / RISELab Staff Technical Lead
> > https://rise.cs.berkeley.edu
>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [build system] IMPORTANT! northern california fire danger, potential power outage(s)

2019-10-09 Thread Shane Knapp
quick update:

campus is losing power @ 8pm.  this is after we were told 4am, 8am,
noon, and 2-4pm.  :)

PG&E expects to start bringing alameda county back online at noon
tomorrow, but i believe that target is fluid and will take longer than
expected.

this means that the earliest that we can bring the build system back
up is friday, but there's a much greater than non-zero chance of this
not happening until monday morning.  i will be leaving town for the
weekend friday afternoon, which means i won't be physically present to
turn on all of our servers in the colo (~80 servers including
jenkins) until monday.

more updates as they come.  thanks for your patience!

On Tue, Oct 8, 2019 at 7:32 PM Shane Knapp  wrote:
>
> jenkins is going down now.
>
> On Tue, Oct 8, 2019 at 4:21 PM Shane Knapp  wrote:
> >
> > quick update:
> >
> > we are definitely going to have our power shut off starting early
> > tomorrow morning (by 4am PDT oct 9th), and expect at least 48 hours
> > before it is restored.
> >
> > i will be shutting jenkins down some time this evening, and will
> > update everyone here when i get more information.
> >
> > full service will be restored (i HOPE) by friday morning.
> >
> > shane (who doesn't ever want to check this list's archives and count
> > how many times we've had power issues)
> >
> > On Tue, Oct 8, 2019 at 12:50 PM Shane Knapp  wrote:
> > >
> > > here in the lovely bay area, we are currently experiencing some
> > > absolutely lovely weather:  temps around 20C, light winds, and not a
> > > drop of moisture anywhere.
> > >
> > > this means that wildfire season is here, and our utilities company
> > > (PG&E) is very concerned about fires like last year's Camp Fire
> > > (https://en.wikipedia.org/wiki/Camp_Fire_(2018)), the 2018 fires
> > > (https://en.wikipedia.org/wiki/2018_California_wildfires) and 2017
> > > fires (https://en.wikipedia.org/wiki/2017_California_wildfires).
> > >
> > > because conditions are absolutely perfect for wildfires, we may lose
> > > power here in berkeley tomorrow and thursday.
> > >
> > > there will be little to no notice of when this might happen, and if it
> > > does that means that jenkins will most definitely go down.
> > >
> > > i will continue to keep a close eye on this and give updates as they
> > > happen.  sadly, the pg&e website is down because they apparently
> > > didn't think that they needed load balancers.  :\
> > >
> > > shane
> > > --
> > > Shane Knapp
> > > UC Berkeley EECS Research / RISELab Staff Technical Lead
> > > https://rise.cs.berkeley.edu
> >
> >
> >
> > --
> > Shane Knapp
> > UC Berkeley EECS Research / RISELab Staff Technical Lead
> > https://rise.cs.berkeley.edu
>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu



-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu




Re: Spark 3.0 preview release feature list and major changes

2019-10-09 Thread Xiao Li
SPARK-29345  Add an API
that allows a user to define and observe arbitrary metrics on streaming
queries

Let us add this too.

Cheers,

Xiao

On Tue, Oct 8, 2019 at 10:31 PM Wenchen Fan  wrote:

> Regarding DS v2, I'd like to remove
> SPARK-26785  data
> source v2 API refactor: streaming write
> SPARK-26956  remove
> streaming output mode from data source v2 APIs
>
> and put the umbrella ticket instead
> SPARK-25390  data
> source V2 API refactoring
>
> Thanks,
> Wenchen
>
> On Wed, Oct 9, 2019 at 1:19 PM Dongjoon Hyun 
> wrote:
>
>> Thank you for the preparation of 3.0-preview, Xingbo!
>>
>> Bests,
>> Dongjoon.
>>
>> On Tue, Oct 8, 2019 at 2:32 PM Xingbo Jiang 
>> wrote:
>>
>>>> What's the process to propose a feature to be included in the final
>>>> Spark 3.0 release?
>>>
>>> I don't know whether there exists any specific process here, normally
>>> you just merge the feature into Spark master before release code freeze,
>>> and then the feature would probably be included in the release. The code
>>> freeze date for Spark 3.0 has not been decided yet, though.
>>>
>>> On Tue, Oct 8, 2019 at 2:14 PM Li Jin  wrote:
>>>
>>>> Thanks for the summary!
>>>>
>>>> I have a question that is semi-related: what's the process to propose
>>>> a feature to be included in the final Spark 3.0 release?
>>>>
>>>> In particular, I am interested in
>>>> https://issues.apache.org/jira/browse/SPARK-28006.  I am happy to do
>>>> the work so want to make sure I don't miss the "cut" date.
>>>>
>>>> On Tue, Oct 8, 2019 at 4:53 PM Xingbo Jiang  wrote:
>>>>
> Hi all,
>
> Thanks for all the feedback. Here is the updated feature list:
>
> SPARK-11215 
> Multiple columns support added to various Transformers: StringIndexer
>
> SPARK-11150 
> Implement Dynamic Partition Pruning
>
> SPARK-13677 
> Support Tree-Based Feature Transformation
>
> SPARK-16692  Add
> MultilabelClassificationEvaluator
>
> SPARK-19591  Add
> sample weights to decision trees
>
> SPARK-19712 
> Pushing Left Semi and Left Anti joins through Project, Aggregate, Window,
> Union etc.
>
> SPARK-19827  R API
> for Power Iteration Clustering
>
> SPARK-20286 
> Improve logic for timing out executors in dynamic allocation
>
> SPARK-20636 
> Eliminate unnecessary shuffle with adjacent Window expressions
>
> SPARK-22148 
> Acquire new executors to avoid hang because of blacklisting
>
> SPARK-22796 
> Multiple columns support added to various Transformers: PySpark
> QuantileDiscretizer
>
> SPARK-23128  A new
> approach to do adaptive execution in Spark SQL
>
> SPARK-23155  Apply
> custom log URL pattern for executor log URLs in SHS
>
> SPARK-23539  Add
> support for Kafka headers
>
> SPARK-23674  Add
> Spark ML Listener for Tracking ML Pipeline Status
>
> SPARK-23710 
> Upgrade the built-in Hive to 2.3.5 for hadoop-3.2
>
> SPARK-24333  Add
> fit with validation set to Gradient Boosted Trees: Python API
>
> SPARK-24417  Build
> and Run Spark on JDK11
>
> SPARK-24615 
> Accelerator-aware task scheduling for Spark
>
> SPARK-24920  Allow
> sharing Netty's memory pool allocators
>
> SPARK-25250  Fix
> race condition with tasks running when new attempt for same stage is
> created leads to other task in the next attempt running on the same
> partition id retry multiple times
>
> SPARK-25341 
> Support rolling back a shuffle map stage and re-generate the shuffle files
>
> SPARK-25348 

Re: Auto-closing PRs when there are no feedback or response from its author

2019-10-09 Thread Hyukjin Kwon
Yes, the problem was that it is difficult to automate. I think this has
been discussed twice(?) on the mailing list;
however, it ended up with nothing being done because it was difficult to
automate.

I think that PRs, unlike JIRAs, have more cases
that need manual judgement.

As an example, while JIRAs are generally easy to keep updated, I think
it might not be fair to ask an author to keep updating a PR and resolving
its conflicts while waiting indefinitely for review. For some large PRs,
it's kind of painful to always keep them updated.
It might be more reasonable for a PR to be updated on request, when a
committer has some time to review.

> If there's little overhead to adoption, cool, though I doubt people
> will consistently use a new tag.

Yea, this is a good point. But in fact the standard for when to use it is
quite simple: in a PR, leave a comment or review and add the tag.
In the case of readthedocs, they seem to add the tag whenever they leave a
comment or respond.

In fact, I myself am not sure how useful it would be, but to me it
looked worth trying. I remember we tried
such bots before and dropped them when they turned out not to be very useful in practice.



On Wed, Oct 9, 2019 at 11:26 AM Sean Owen  wrote:

> I'm generally all for closing pretty old PRs. They can be reopened
> easily. Closing a PR (a particular proposal for how to resolve an
> issue) is less drastic than closing a JIRA (a description of an
> issue). Closing them just delivers the reality, that nobody is going
> to otherwise revisit it, and can actually prompt a few contributors to
> update or revisit their proposal.
>
> I wouldn't necessarily want to adopt new process or tools though. Is
> it not sufficient to auto-close PRs that have a merge conflict and
> haven't been updated in months? or just haven't been updated in a
> year? Those are probably manual-ish processes, but, don't need to
> happen more than a couple times a year.
>
> If there's little overhead to adoption, cool, though I doubt people
> will consistently use a new tag. I'd prefer any process or tool that
> implements the above.
>
>
> On Tue, Oct 8, 2019 at 8:19 PM Hyukjin Kwon  wrote:
> >
> > Hi all,
> >
> > I think we talked about this before. Roughly speaking, there are two
> > cases of PRs:
> >   1. PRs waiting for review and 2. PRs waiting for the author's reaction
> > For the first case, we might not have to take any action other than
> > wait for review.
> > However, we can ping and/or take an action for the second case.
> >
> > I noticed that Read the Docs
> > (https://github.com/readthedocs/readthedocs.org/blob/master/.github/no-response.yml)
> > uses a bot, integrated as a GitHub app, that does exactly what we want
> > (see https://github.com/probot/no-response).
> >
> > 1. Maintainers (committers) can add a tag to a PR (e.g.,
> > need-more-information)
> > 2. If the PR author responds with a comment or update, the bot removes
> > the tag
> > 3. If the PR author does not respond, the bot closes the PR after
> > waiting for the configured number of days.
> >
> > We already have a kind of simple mechanism for windowing the number of
> > JIRAs. I think it's time to have such a mechanism for GitHub PRs as well.
> >
> > Although this repo doesn't look popular or widely used, it seems to
> > match exactly what we want, and it is less aggressive since this mechanism
> > only works when maintainers (committers) add a tag to a PR.
> >
> > WDYT guys?
> >
> > I cc'ed a few people who I think were in similar past discussions.
> >
>
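
For readers skimming the thread, the three-step tag / respond / close flow quoted above can be sketched roughly as follows. This is only an illustration of the decision logic as described in the email, not the actual probot/no-response implementation: the `PullRequest` class, `tag_pr`, `sweep`, and the 14-day value are all hypothetical names and assumptions (the thread only says "the configured number of days").

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

# Tag name taken from the example in the thread.
TAG = "need-more-information"
# Assumed value; the thread only mentions "the configured number of days".
DAYS_UNTIL_CLOSE = 14


@dataclass
class PullRequest:
    """Hypothetical, minimal stand-in for a PR; not a real GitHub API object."""
    tags: set = field(default_factory=set)
    tagged_at: Optional[datetime] = None
    last_author_activity: Optional[datetime] = None
    closed: bool = False


def tag_pr(pr: PullRequest, now: datetime) -> None:
    """Step 1: a maintainer (committer) adds the tag to the PR."""
    pr.tags.add(TAG)
    pr.tagged_at = now


def sweep(pr: PullRequest, now: datetime) -> str:
    """One periodic pass of the bot over a single PR."""
    # PRs without the tag are never touched, so the bot only acts
    # after a maintainer has explicitly opted a PR in.
    if pr.closed or TAG not in pr.tags:
        return "ignored"
    # Step 2: the author commented or updated after tagging -> remove the tag.
    if pr.last_author_activity and pr.last_author_activity > pr.tagged_at:
        pr.tags.discard(TAG)
        pr.tagged_at = None
        return "tag removed"
    # Step 3: no response within the configured window -> close the PR.
    if now - pr.tagged_at >= timedelta(days=DAYS_UNTIL_CLOSE):
        pr.closed = True
        return "closed"
    return "waiting"
```

Because `sweep` ignores untagged PRs, a PR that is merely waiting for review (the first case in the email) is never closed by this logic; only PRs a committer has tagged can time out.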