Re: need assistance debugging a strange build failure...

2018-11-06 Thread shane knapp
btw, this is a compilation error in the SBT build that only shows up on the ubuntu workers. On Mon, Nov 5, 2018 at 5:07 PM shane knapp wrote: > the maven build is quite happy: > > https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.7-ubuntu-testing/ > >

Re: need assistance debugging a strange build failure...

2018-11-05 Thread shane knapp
you tried Maven instead of SBT? This looks like a Java dependency > problem, e.g. a wrong version of Avro is picked. > > On Tue, Nov 6, 2018 at 8:30 AM shane knapp wrote: > >> i'm really close (for real: really close!) on the ubuntu port... but one >> build has be

need assistance debugging a strange build failure...

2018-11-05 Thread shane knapp
an example job w/this failure is here: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7-ubuntu-testing/30/consoleFull thoughts? am i missing something obvious? i've checked and there are no avro system packages installed on any of the workers (centos or ubuntu). than

Re: python lint is broken on master branch

2018-10-31 Thread shane knapp
'. >> > flake8 checks failed. >> > >> > As an example please see >> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-lint/9080/console >> > >> > Any ideas? >> >> ---

Re: python lint is broken on master branch

2018-10-31 Thread shane knapp
e > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-lint/9080/console > > > > Any ideas? > > ----- > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org &

Re: GitHub is out of order

2018-10-22 Thread shane knapp
shane knapp wrote: > quick update: > > the github status messages have gone from "red" to "orange", which to me > still means "we're broken". :) > > https://status.github.com/messages > > i'm still holding off on restarting jenkins, but if thin

Re: GitHub is out of order

2018-10-22 Thread shane knapp
art it back up and keep an eye on PRBs/github statii/etc. shane On Mon, Oct 22, 2018 at 9:57 AM shane knapp wrote: > i've actually taken jenkins down completely. > > after i see an 'all clear' on their status page, i'll bring jenkins back > up. > > On Mon, Oct 22, 2018 at 9:34

Re: GitHub is out of order

2018-10-22 Thread shane knapp
i've actually taken jenkins down completely. after i see an 'all clear' on their status page, i'll bring jenkins back up. On Mon, Oct 22, 2018 at 9:34 AM shane knapp wrote: > jenkins is no longer accepting new builds. > > On Mon, Oct 22, 2018 at 8:55 AM Hyukjin Kwon wrote: > >

Re: GitHub is out of order

2018-10-22 Thread shane knapp
Mon) 12:40 PM, Dongjoon Hyun wrote: >> >>> Hi, All. >>> >>> Currently, GitHub is out of order. Apache Spark repo is also affected. >>> Newly filed pull requests to Apache Spark repository seem to disappear >>> repeatedly, too. >>> >>> https:

Re: moving the spark jenkins job builder repo from dbricks --> spark

2018-10-17 Thread shane knapp
t this. shane > On Wed, Oct 10, 2018 at 12:06 PM shane knapp wrote: > >> Not sure if that's what you meant; but it should be ok for the jenkins >>> servers to manually sync with master after you (or someone else) have >>> verified the changes. That should prevent

Re: moving the spark jenkins job builder repo from dbricks --> spark

2018-10-10 Thread shane knapp
ccess to > some test jenkins server. > > JJB has some built-in lint and testing, so that'll be the first step in verifying the build configs. i still have a dream where i have a fully functioning jenkins staging deployment... one day i will make that happen. :) shane -- Shane Knapp U

Re: moving the spark jenkins job builder repo from dbricks --> spark

2018-10-10 Thread shane knapp
/jenkins-job-builder/ On Tue, Oct 9, 2018 at 10:22 PM Sean Owen wrote: > Some responses inline -- this discussion can go to dev@ though. > > dev@ added. > On Tue, Oct 9, 2018 at 3:28 PM shane knapp wrote: > > JJB templates in spark repo: > > * code path is currently und

Re: [build system] jenkins wedged, not accepting new PRBs

2018-10-09 Thread shane knapp
...and we're back and happily building. On Tue, Oct 9, 2018 at 3:12 PM shane knapp wrote: > i just restarted jenkins... hopefully this will fix the issue. > > of course, nothing in the logs to show why this happened (nor is there > ever). > > shane > -- > Shane

[build system] jenkins wedged, not accepting new PRBs

2018-10-09 Thread shane knapp
i just restarted jenkins... hopefully this will fix the issue. of course, nothing in the logs to show why this happened (nor is there ever). shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: [VOTE] SPARK 2.4.0 (RC2)

2018-10-04 Thread shane knapp
ild/mvn at runtime. -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

[build system] jenkins wedged post-SSL cert fiasco, restarting now

2018-09-24 Thread shane knapp
jenkins will be back up and building Real Soon Now[tm]... sorry for the inconvenience! -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: Something wrong of Jenkins proxy

2018-09-24 Thread shane knapp
cert renewed and jenkins is reachable now. On Sun, Sep 23, 2018 at 8:58 PM, shane knapp wrote: > i don't manage the certs on the box doing the reverse proxy, so i've > reached out to the proper party and hopefully things will be sorted by > early tomorrow. > > On Sun, Sep 2

Re: Something wrong of Jenkins proxy

2018-09-23 Thread shane knapp
i don't manage the certs on the box doing the reverse proxy, so i've reached out to the proper party and hopefully things will be sorted by early tomorrow. On Sun, Sep 23, 2018 at 8:37 PM, shane knapp wrote: > for now, you can visit: > > https://hadrian.ist.berkeley.ed

Re: Something wrong of Jenkins proxy

2018-09-23 Thread shane knapp
i just noticed this... taking a look now. On Sun, Sep 23, 2018 at 4:38 AM, Yuanjian Li wrote: > Hi devs, > Is there something wrong of Jenkins proxy? > [image: image.png] > I got this proxy 500 whole days. > > Thanks, > Yuanjian Li > -- Shane Knapp UC Berkeley EECS

Re: Something wrong of Jenkins proxy

2018-09-23 Thread shane knapp
for now, you can visit: https://hadrian.ist.berkeley.edu/jenkins/ something is up w/the reverse proxy setup. On Sun, Sep 23, 2018 at 8:37 PM, shane knapp wrote: > i just noticed this... taking a look now. > > On Sun, Sep 23, 2018 at 4:38 AM, Yuanjian Li > wrote:

Re: Branch 2.4 is cut

2018-09-07 Thread shane knapp
you, Shane! :D > > Bests, > Dongjoon. > > On Fri, Sep 7, 2018 at 9:51 AM shane knapp wrote: > >> i'll try and get to the 2.4 branch stuff today... >> >> -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: Branch 2.4 is cut

2018-09-07 Thread shane knapp
> On Thu, Sep 6, 2018 at 12:32 AM Wenchen Fan >>>>> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I've cut the branch-2.4 since all the major blockers are resolved. If >>>>>> no objections I'll shortly followup with an RC to get the QA started in >>>>>> parallel. >>>>>> >>>>>> Committers, please only merge PRs to branch-2.4 that are bug fixes, >>>>>> performance regression fixes, document changes, or test suites changes. >>>>>> >>>>>> Thanks, >>>>>> Wenchen >>>>>> >>>>> -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: Nightly Builds in the docs (in spark-nightly/spark-master-bin/latest? Can't seem to find it)

2018-09-04 Thread shane knapp
mastering-kafka-streams > >> > Follow me at https://twitter.com/jaceklaskowski > >> > >> - > >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org > >> > > > -- > Marcelo > > - > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org > > -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: Jenkins automatic disabling service - who and why?

2018-09-04 Thread shane knapp
;> instance, see https://github.com/apache/spark/pull/18447 >>>> I don't explicitly object this idea but at least can I ask who and why >>>> this was started? >>>> Is it for notification purpose or to save resource? Did I miss some >>>> discussion about this? >>>> >>>> -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-30 Thread shane knapp
ark but also about a lot of Scala libraries that >>> stopped supporting Scala 2.11, if Spark 2.4 will not support Scala 2.12, >>> then people will not be able to use them in their Zeppelin, Jupyter and >>> other notebooks together with Spark. >>> >>> > > > > -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: [discuss][minor] impending python 3.x jenkins upgrade... 3.5.x? 3.6.x?

2018-08-21 Thread shane knapp
versions (latest micros should > be fine). > > On Mon, Aug 20, 2018 at 7:07 PM shane knapp wrote: > >> initially, i'd like to just choose one version to have the primary tests >> against, but i'm also not opposed to supporting more of a matrix. the >> biggest

Re: [discuss][minor] impending python 3.x jenkins upgrade... 3.5.x? 3.6.x?

2018-08-20 Thread shane knapp
ion in Jenkins and highest version via AppVeyor FWIW. >> I don't have a strong preference opinion on this since we have been >> having compatibility issues for each Python version. >> >> >> On Tue, Aug 14, 2018 at 4:15 AM, shane knapp wrote: >> >>> hey everyon

Re: [DISCUSS] SparkR support on k8s back-end for Spark 2.4

2018-08-16 Thread shane knapp
a couple of weeks after the 2.4 cut. > Part of the intent here is to allow this to happen without Shane having to > reorganize his complex upgrade schedule and make it even more complicated. > > this. exactly. :) -- Shane Knapp UC Berkeley EECS Research / RISELab Staff

Re: [DISCUSS] SparkR support on k8s back-end for Spark 2.4

2018-08-15 Thread shane knapp
est" builds that don't require > credentials such as GPG keys). > > awesome++ > Perhaps we should think about revamping these jobs instead of keeping > them as is. > i fully support this. which is exactly why i punted on even trying to get them ported over to the ubunt

Re: [DISCUSS] SparkR support on k8s back-end for Spark 2.4

2018-08-15 Thread shane knapp
ge, i will make it happen. shane (who wants everyone to remember that it's just little old me running this... not a team of people) ;) -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

[discuss][minor] impending python 3.x jenkins upgrade... 3.5.x? 3.6.x?

2018-08-13 Thread shane knapp
be fully backwards-compatible w/3.4. of course, this needs to be taken w/a grain of salt, as we're mostly focused on actual python package requirements, rather than worrying about core python functionality. thoughts? comments? thanks in advance, shane -- Shane Knapp UC Berkeley EECS Research

Re: [R] discuss: removing lint-r checks for old branches

2018-08-11 Thread shane knapp
ntr since it is missing some tests. > > Also these seems like real test failures? Are these only happening in 2.1 > and 2.2? > > > ------ > *From:* shane knapp > *Sent:* Friday, August 10, 2018 4:04 PM > *To:* Sean Owen > *Cc:* Shivaram Venkatarama

Re: [R] discuss: removing lint-r checks for old branches

2018-08-10 Thread shane knapp
< shiva...@eecs.berkeley.edu> wrote: > Sounds good to me as well. Thanks Shane. > > Shivaram > On Fri, Aug 10, 2018 at 1:40 PM Reynold Xin wrote: > > > > SGTM > > > > On Fri, Aug 10, 2018 at 1:39 PM shane knapp wrote: > >> > >> https://i

Re: [R] discuss: removing lint-r checks for old branches

2018-08-10 Thread shane knapp
/agreemsg On Fri, Aug 10, 2018 at 4:02 PM, Sean Owen wrote: > Seems OK to proceed with shutting off lintr, as it was masking those. > > On Fri, Aug 10, 2018 at 6:01 PM shane knapp wrote: > >> ugh... R unit tests failed on both of these builds. >> https://amplab.cs.b

[R] discuss: removing lint-r checks for old branches

2018-08-10 Thread shane knapp
for the 2.4 cut/code freeze, but i wanted to get this done before it gets pushed down my queue and before we revisit the ubuntu port. thanks in advance, shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-10 Thread shane knapp
at all spark branches will happily pass against 3.5, it will not happen until after the 2.4 cut. :) however, from my (limited) testing, it does look like that's the case. still not gonna pull the trigger on it until after the cut. shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-10 Thread shane knapp
python 3.5/pyarrow 0.10.0 build: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-master-test-sbt-hadoop-2.6-python-3.5-arrow-0.10.0-ubuntu-testing/ On Fri, Aug 10, 2018 at 10:44 AM, shane knapp wrote: > see: https://github.com/apache/spark/pull/21939#issuecomm

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-10 Thread shane knapp
extra work, so no objections from me to hold > off on things for now. > > On Fri, Aug 10, 2018 at 9:48 AM, shane knapp wrote: > >> On Fri, Aug 10, 2018 at 9:47 AM, Wenchen Fan wrote: >> >>> It seems safer to skip the arrow 0.10.0 upgrade for Spark 2.4 and leav

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-10 Thread shane knapp
On Fri, Aug 10, 2018 at 9:47 AM, Wenchen Fan wrote: > It seems safer to skip the arrow 0.10.0 upgrade for Spark 2.4 and leave it > to Spark 3.0, so that we have more time to test. Any objections? > none here. -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical L

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-10 Thread shane knapp
e is no consensus about >>> what is the right fix yet. Likely to miss it in Spark 2.4 because it's a >>> long-standing issue, not a regression. >>> >> >> This is a really serious data loss bug. Yes its very complex, but we >> absolutely have to f

Re: [pyspark][SPARK-25079]: preparing to enter the brave new world of python3.5!

2018-08-09 Thread shane knapp
also, i looked pretty closely @ the python3.5 release notes, and nothing caught my eye as being a showstopper. On Thu, Aug 9, 2018 at 10:41 AM, shane knapp wrote: > please see: https://issues.apache.org/jira/browse/SPARK-25079 > > this is holding back the arrow 0.10.0 upgrade. >

[pyspark][SPARK-25079]: preparing to enter the brave new world of python3.5!

2018-08-09 Thread shane knapp
-- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: [build system] IMPORTANT: taking centos workers offline for pyarrow upgrade

2018-08-08 Thread shane knapp
tion. please hold. > > On Wed, Aug 8, 2018 at 11:48 AM, shane knapp wrote: > >> well... i've been running in to problems (aka dependency hell), and just >> hit a show-stopper: >> >> UnsatisfiableError: The following specifications were found to be in >

Re: [build system] IMPORTANT: taking centos workers offline for pyarrow upgrade

2018-08-08 Thread shane knapp
at 11:48 AM, shane knapp wrote: > well... i've been running in to problems (aka dependency hell), and just > hit a show-stopper: > > UnsatisfiableError: The following specifications were found to be in > conflict: > - pyarrow 0.10.* -> arrow-cpp 0.10.0.* -> python >

Re: [build system] IMPORTANT: taking centos workers offline for pyarrow upgrade

2018-08-08 Thread shane knapp
o see the dependencies for each package. yep, we're testing against python 3.4, and pyarrow 0.10.0 needs 3.5+ putting the workers back on-line until i figure out what to do next. shane On Wed, Aug 8, 2018 at 10:31 AM, shane knapp wrote: > pyarrow 0.10.0 has been released, and this is importa

[build system] IMPORTANT: taking centos workers offline for pyarrow upgrade

2018-08-08 Thread shane knapp
the centos workers in to quiet mode now, prepping the upgrade and once the majority of existing builds are done, i'll kill any outliers (and retrigger them later) and perform the upgrade. expect this to take ~3 hours, max. shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical

[build system] jenkins/github commit access exploit

2018-08-07 Thread shane knapp
ful of how auth tokens are passed around in builds. there are masked 'password'-style env vars for things like that, and are easily located in job configs. we are not immune to exploits like this, so please be careful. :) shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical

[build system] bumped pull request builder job timeout to 400mins

2018-08-07 Thread shane knapp
i hate doing this, because our tests and builds take WAY too long, but this should help get PRs through before the code freeze. -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-07 Thread shane knapp
> > According to the status, I think we should wait a few more days. Any > objections? > > none here. i'm also pretty certain that waiting until after the code freeze to start testing the GHPRB on ubuntu is the wisest course of action for us. shane -- Shane Knapp UC Berkele

Re: Set up Scala 2.12 test build in Jenkins

2018-08-06 Thread shane knapp
On Mon, Aug 6, 2018 at 12:46 PM, shane knapp wrote: > i'll get something set up quickly by hand today, and make a TODO to get > the job config checked in to the jenkins job builder configs later this > week. > > shane > > On Sun, Aug 5, 2018 at 7:10 AM, Sean Owen wrote: > >&

Re: Set up Scala 2.12 test build in Jenkins

2018-08-06 Thread shane knapp
> profiles that are enabled. > > I can already see two test failures for the 2.12 build right now and will > try to fix those, but this should help verify whether the failures are > 'real' and detect them going forward. > > > -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: code freeze and branch cut for Apache Spark 2.4

2018-08-01 Thread shane knapp
n order to get 2.4 out on time. >> >> > -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: [build system] DOWNTIME jenkins unreachable overnight

2018-08-01 Thread shane knapp
the UPS has been replaced, and you can now access the wonderful entity known as jenkins via the internet superhighway! shane (who only really showed up early to work and didn't actually help replace said UPS) On Tue, Jul 31, 2018 at 5:14 PM, shane knapp wrote: > our building is fina

[build system] DOWNTIME jenkins unreachable overnight

2018-07-31 Thread shane knapp
our building is finally replacing the broken UPS that keeps biting us... ...which means another bit of downtime. :( it begins in 6 hours (11pm PDT) and will be finished tomorrow (august 1st) by ~8am PDT. shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https

[build system] two workers will be reimaged w/ubuntu tomorrow

2018-07-30 Thread shane knapp
ub.com/apache/spark/pull/21584 2) stop needing 2 builds for pull requests (one for regular tests on centos, one to test against minikube on ubuntu). questions/comments/concerns? shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: [SPARK-24950] issues running DateTimeUtilsSuite daysToMillis and millisToDays w/java 8 181-b13

2018-07-27 Thread shane knapp
On Fri, Jul 27, 2018 at 1:23 PM shane knapp wrote: > >> hey everyone! >> >> i'm making great progress on porting the spark builds to run under ubuntu >> 16.04LTS, but have hit a show-stopper in my testing. >> >> i am not a scala person by any definition of

[SPARK-24950] issues running DateTimeUtilsSuite daysToMillis and millisToDays w/java 8 181-b13

2018-07-27 Thread shane knapp
in to are here: https://issues.apache.org/jira/browse/SPARK-24950 thanks in advance, shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: Build timeout -- continuous-integration/appveyor/pr — AppVeyor build failed

2018-07-24 Thread shane knapp
; just FWIW, I talked about this here (https://github.com/apache/spark/pull/20146#issuecomment-406132543) too for possible solutions to > handle this. > > > > > On Wed, Jul 25, 2018 at 4:32 AM, shane knapp wrote: > >> revisiting this thread... >> >> i pushed a

Re: Build timeout -- continuous-integration/appveyor/pr — AppVeyor build failed

2018-07-24 Thread shane knapp
t of >>>> commits from master and I got the following error: >>>> >>>> *continuous-integration/appveyor/pr *— AppVeyor build failed >>>> >>>> due to: >>>> >>>> *Build execution time has reached the maximum allowed time

[build system] upped build retention for GHPRB builds

2018-07-24 Thread shane knapp
. shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

[build system] spark/k8s integration tests are now working

2018-07-13 Thread shane knapp
after upgrading minikube to v0.28.0 and much wailing and gnashing of teeth, it was discovered that v0.25.0 actually *works* as expected and the k8s integration tests are now green! side note, i've also opportunistically upgraded the minikube VM drivers from kvm to kvm2. shane -- Shane Knapp UC

Re: [build system] ubuntu workers temporarily offline

2018-07-11 Thread shane knapp
ok, things seem much happier now. On Wed, Jul 11, 2018 at 8:57 PM, shane knapp wrote: > i'm seeing some strange docker/minikube errors, so i'm currently rebooting > the boxes. when they're back up, i will retrigger any killed builds and > send an all-clear. > > On Wed, Jul 11,

Re: [build system] ubuntu workers temporarily offline

2018-07-11 Thread shane knapp
i'm seeing some strange docker/minikube errors, so i'm currently rebooting the boxes. when they're back up, i will retrigger any killed builds and send an all-clear. On Wed, Jul 11, 2018 at 7:40 PM, shane knapp wrote: > done, and the workers are back online. > > $ pssh -h ubuntu_worke

Re: [build system] ubuntu workers temporarily offline

2018-07-11 Thread shane knapp
PM, shane knapp wrote: > i'll be taking amp-jenkins-staging-worker-0{1,2} offline to upgrade > minikube to v0.28.0. > > this is currently blocking: https://github.com/apache/spark/pull/21583 > > this should be a relatively short downtime, and i'll reply back here when > it

[build system] ubuntu workers temporarily offline

2018-07-11 Thread shane knapp
i'll be taking amp-jenkins-staging-worker-0{1,2} offline to upgrade minikube to v0.28.0. this is currently blocking: https://github.com/apache/spark/pull/21583 this should be a relatively short downtime, and i'll reply back here when it's done. shane -- Shane Knapp UC Berkeley EECS Research

Re: [build system] taking ubuntu workers offline for docker update

2018-07-09 Thread shane knapp
this is done. On Mon, Jul 9, 2018 at 6:48 PM, shane knapp wrote: > we need to update docker to something more modern (17.05.0-ce -> > 18.03.1-ce), so i have taken the two ubuntu workers offline and once the > current builds finish, i will perform the update. > > this should

[build system] taking ubuntu workers offline for docker update

2018-07-09 Thread shane knapp
we need to update docker to something more modern (17.05.0-ce -> 18.03.1-ce), so i have taken the two ubuntu workers offline and once the current builds finish, i will perform the update. this shouldn't take more than an hour. shane -- Shane Knapp UC Berkeley EECS Research / RISELab St

Re: Jenkins build errors

2018-06-18 Thread shane knapp
> Caused by: sbt.ForkMain$ForkError: java.io.IOException: error=2, No such > file or directory > at java.lang.UNIXProcess.forkAndExec(Native Method) > at java.lang.UNIXProcess.(UNIXProcess.java:248) > at java.lang.ProcessImpl.start(ProcessImpl.java:134) > at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) > ... 17 more > > -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

[build system] DOWNTIME ALERT! jenkins will be down all day july 16th (saturday)

2018-06-11 Thread shane knapp
ay. shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: [build system] meet your build engineer @ spark ai summit SF 2018

2018-06-05 Thread shane knapp
, 2018 at 11:11 AM, shane knapp wrote: > hey everyone! > > if you ever wanted to meet the one-man operation that keeps things going, > talk about future build system plans, complain about the fact that we're > still on centos 6 (yes, i know), or just say hi, i'll be manning the

Re: [build system] meet your build engineer @ spark ai summit SF 2018

2018-05-02 Thread shane knapp
RISELab booth at summit all three days! > > :) > > shane > -- > Shane Knapp > UC Berkeley EECS Research / RISELab Staff Technical Lead > https://rise.cs.berkeley.edu > -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

[build system] meet your build engineer @ spark ai summit SF 2018

2018-05-02 Thread shane knapp
hey everyone! if you ever wanted to meet the one-man operation that keeps things going, talk about future build system plans, complain about the fact that we're still on centos 6 (yes, i know), or just say hi, i'll be manning the RISELab booth at summit all three days! :) shane -- Shane Knapp

Re: [build system] jenkins master unreachable, build system currently down

2018-05-01 Thread shane knapp
and we're back! there was apparently a firewall migration yesterday that went sideways. shane On Mon, Apr 30, 2018 at 8:27 PM, shane knapp <skn...@berkeley.edu> wrote: > we just noticed that we're unable to connect to jenkins, and have reached > out to our NOC support staff at our

[build system] jenkins master unreachable, build system currently down

2018-04-30 Thread shane knapp
we just noticed that we're unable to connect to jenkins, and have reached out to our NOC support staff at our colo. until we hear back, there's nothing we can do. i'll update the list as soon as i hear something. sorry for the inconvenience! shane -- Shane Knapp UC Berkeley EECS Research

Re: [build system] experiencing network issues, git fetch timeouts likely

2018-04-03 Thread shane knapp
...and we're back! On Tue, Apr 3, 2018 at 8:10 AM, shane knapp <skn...@berkeley.edu> wrote: > this apparently caused jenkins to get wedged overnight. i'm restarting > it now. > > On Mon, Apr 2, 2018 at 9:12 PM, shane knapp <skn...@berkeley.edu> wrote: > >> the

Re: [build system] experiencing network issues, git fetch timeouts likely

2018-04-03 Thread shane knapp
this apparently caused jenkins to get wedged overnight. i'm restarting it now. On Mon, Apr 2, 2018 at 9:12 PM, shane knapp <skn...@berkeley.edu> wrote: > the problem was identified and fixed, and we should be good as of about an > hour ago. > > sorry for any inconvenience!

Re: [build system] experiencing network issues, git fetch timeouts likely

2018-04-02 Thread shane knapp
the problem was identified and fixed, and we should be good as of about an hour ago. sorry for any inconvenience! On Mon, Apr 2, 2018 at 4:15 PM, shane knapp <skn...@berkeley.edu> wrote: > hey all! > > we're having network issues on campus right now, and the jenkins workers >

[build system] experiencing network issues, git fetch timeouts likely

2018-04-02 Thread shane knapp
on. shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: [build system] rebooting firewall, access to jenkins will return shortly

2018-01-26 Thread shane knapp
and we're back! On Fri, Jan 26, 2018 at 2:32 PM, shane knapp <skn...@berkeley.edu> wrote: > our firewall was running a bit... slowly... and needed a reboot. this > means access to jenkins will be gone for ~10 mins. > > i'll send out an all-clear when we're back up and running. >

[build system] rebooting firewall, access to jenkins will return shortly

2018-01-26 Thread shane knapp
our firewall was running a bit... slowly... and needed a reboot. this means access to jenkins will be gone for ~10 mins. i'll send out an all-clear when we're back up and running.

Re: [build system] currently experiencing git timeouts when building

2018-01-18 Thread shane knapp
est-maven-hadoop-2.7 > > Timeouts by day: > 2018-01-09 4 > 2018-01-10 13 > 2018-01-11 27 > 2018-01-12 74 > 2018-01-13 9 > 2018-01-14 2 > 2018-01-15 8 > 2018-01-16 34 > > Total builds: 4112 > Total timeouts: 171 > Percentage of a

Re: Build timed out for `branch-2.3 (hadoop-2.7)`

2018-01-18 Thread shane knapp
this doesn't have anything to do w/the git timeouts... those will timeout the build 10 mins after starting (and failing on the initial fetch call). On Wed, Jan 17, 2018 at 9:51 PM, Sameer Agarwal wrote: > FYI, I ended up bumping the build timeouts from 255 to 275 minutes.

Re: [build system] currently experiencing git timeouts when building

2018-01-16 Thread shane knapp
2018-01-15 8 2018-01-16 34 Total builds: 4112 Total timeouts: 171 Percentage of all builds timing out: 4.15856031128 On Wed, Jan 10, 2018 at 9:54 AM, shane knapp <skn...@berkeley.edu> wrote: > i just noticed we're starting to see the once-yearly rash of git timeouts > w
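
The figures quoted in this thread can be re-derived with a few lines of Python. This is only a sketch for checking the arithmetic, not the script that produced the original report. It assumes the fused columns in the quoted reply split as a date followed by a count (for example, "2018-01-094" read as 2018-01-09 with 4 timeouts); the fact that the counts then sum to the stated total of 171 supports that reading. The variable names are illustrative.

```python
# Daily git-timeout counts as read from the quoted reply in this thread.
# Assumption: each fused token is "<date><count>", e.g. 2018-01-09 -> 4.
timeouts_by_day = {
    "2018-01-09": 4,
    "2018-01-10": 13,
    "2018-01-11": 27,
    "2018-01-12": 74,
    "2018-01-13": 9,
    "2018-01-14": 2,
    "2018-01-15": 8,
    "2018-01-16": 34,
}

# "Total builds" as stated in the message.
TOTAL_BUILDS = 4112

# Sum the per-day counts and express them as a share of all builds.
total_timeouts = sum(timeouts_by_day.values())
percent_timing_out = 100.0 * total_timeouts / TOTAL_BUILDS

print("Total timeouts:", total_timeouts)
print("Percentage of all builds timing out: %.4f" % percent_timing_out)
```

The computed share agrees with the percentage given in the message, which suggests the per-day split above is the intended one.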

Re: [build system] yet another power outage at our colo

2018-01-16 Thread shane knapp
ok, we're back up and ready to build. sorry for the inconvenience. On Tue, Jan 16, 2018 at 9:59 AM, shane knapp <skn...@berkeley.edu> wrote: > all non-UPS machines (read: all jenkins workers) temporarily lost power a > few minutes ago, and i will need to reconnect them t

[build system] yet another power outage at our colo

2018-01-16 Thread shane knapp
all non-UPS machines (read: all jenkins workers) temporarily lost power a few minutes ago, and i will need to reconnect them to the master. this means no builds for ~20 mins. i will also be installing a plugin for the spark-on-k8s builds (

[build system] patching for meltdown/specter this week

2018-01-03 Thread shane knapp
i'll be patching the build system once the patches are released. https://security.googleblog.com/2018/01/todays-cpu-vulnerability-what-you-need.html https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html whee!

Re: [build system] UPS failure in Soda Hall took down jenkins' reverse proxy

2017-12-20 Thread shane knapp
...and we're back! On Wed, Dec 20, 2017 at 7:51 PM, shane knapp <skn...@berkeley.edu> wrote: > the build system is up, but you can't reach it through normal channels ( > amplab.cs.berkeley.edu/jenkins or rise.cs.berkeley.edu/jenkins) as the > machine hosting the reverse pr

[build system] UPS failure in Soda Hall took down jenkins' reverse proxy

2017-12-20 Thread shane knapp
the build system is up, but you can't reach it through normal channels (amplab.cs.berkeley.edu/jenkins or rise.cs.berkeley.edu/jenkins) as the machine hosting the reverse proxy is down due to a UPS fault during normal maintenance... machines are coming up now, and we should have network

Re: [build system] power outage @ berkeley, again. jenkins offline ~2-6am nov 29th

2017-11-29 Thread shane knapp
this maintenance was cancelled last night, and will take place some time in 2018. i'll be sure to update everyone when i get more information. On Tue, Nov 28, 2017 at 11:53 AM, shane knapp <skn...@berkeley.edu> wrote: > more electrical repairs need to be done on the high volt

[build system] power outage @ berkeley, again. jenkins offline ~2-6am nov 29th

2017-11-28 Thread shane knapp
more electrical repairs need to be done on the high voltage leads to our building, and we will be losing power overnight. this means the PRB builds will not be working as amplab.cs.berkeley.edu will be down. timer-based builds will still run normally. i'll get everything back up and running

Re: Jenkins upgrade/Test Parallelization & Containerization

2017-11-12 Thread shane knapp
hey all, i'm finally back from vacation this week and will be following up once i whittle down my inbox. in summation: jenkins worker upgrades will be happening. the biggest one is the move to ubuntu... we need containerized builds for this, but i don't have the cycles to really do all of this

Re: Spark build is failing in amplab Jenkins

2017-11-05 Thread shane knapp
hello from the canary islands! ;) i just saw this thread, and another one about a quick power loss at the colo where our machines are hosted. the master is on UPS but the workers aren't... and when they come back, the PATH variable specified in the workers' configs gets dropped and we see

[build system] jenkins wedged, restarting service

2017-10-19 Thread shane knapp
jenkins got itself in to a 'state' this morning, and required a restart. it should be back up and building now. sorry for the inconvenience! shane

Re: Raise Jenkins timeout?

2017-10-09 Thread shane knapp
++joshrosen On Mon, Oct 9, 2017 at 1:48 AM, Sean Owen wrote: > I'm seeing jobs killed regularly, presumably because the time out (210 > minutes?) > > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA% >

Re: Nightly builds for master branch failed

2017-10-05 Thread shane knapp
...and we're green: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-maven-snapshots/2025/ On Thu, Oct 5, 2017 at 9:46 AM, shane knapp <skn...@berkeley.edu> wrote: > not a problem. :) > > On Thu, Oct 5, 2017 at 9:26 AM, Felix Cheung <felixcheun...@hotmail.com> >

Re: Nightly builds for master branch failed

2017-10-05 Thread shane knapp
not a problem. :) On Thu, Oct 5, 2017 at 9:26 AM, Felix Cheung <felixcheun...@hotmail.com> wrote: > Thanks Shane! > > -- > *From:* shane knapp <skn...@berkeley.edu> > *Sent:* Thursday, October 5, 2017 9:14:54 AM > *To:* Felix Cheung >

Re: Nightly builds for master branch failed

2017-10-05 Thread shane knapp
yep, it was a corrupted jar on amp-jenkins-worker-01. i grabbed a new one from maven.org and kicked off a fresh build. On Thu, Oct 5, 2017 at 9:03 AM, shane knapp <skn...@berkeley.edu> wrote: > yep, looking now. > > On Wed, Oct 4, 2017 at 10:04 PM, Felix Cheung <felixch

Re: Welcoming Tejas Patil as a Spark committer

2017-09-29 Thread shane knapp
congrats, and welcome! :) On Fri, Sep 29, 2017 at 12:58 PM, Matei Zaharia wrote: > Hi all, > > The Spark PMC recently added Tejas Patil as a committer on the > project. Tejas has been contributing across several areas of Spark for > a while, focusing especially on

Re: Signing releases with pwendell or release manager's key?

2017-09-18 Thread shane knapp
i will detail how we control access to the jenkins infra tomorrow. we're pretty well locked down, but there is absolutely room for improvement. this thread is also a good reminder that we (RMs + pwendell + ?) should audit who still has, but does not need direct (or special) access to jenkins.

Re: [build system] tonight's downtime

2017-08-29 Thread shane knapp
alright, we're back up! On Tue, Aug 29, 2017 at 9:13 AM, shane knapp <skn...@berkeley.edu> wrote: > ok, we were up for a little bit, but had to take the webserver down > due to a failed disk in the RAID array. > > given that this was our only hardware casualty, i will happily
