Re: [RESULT][VOTE] Spark 2.2.1 (RC2)

2017-12-13 Thread Sean Owen
org> wrote: > I saw the svn move on Monday so I’m working on the website updates. > > I will look into maven today. I will ask if I couldn’t do it. > > > On Wed, Dec 6, 2017 at 10:49 AM Sean Owen <so...@cloudera.com> wrote: > >> Pardon, did this release

Re: BUILD FAILURE due to...not found: value AnalysisBarrier in spark-catalyst_2.11?

2017-12-08 Thread Sean Owen
Build is fine for me, and on Jenkins. Try a clean build? On Fri, Dec 8, 2017 at 11:04 AM Jacek Laskowski wrote: > Hi, > > Just got BUILD FAILURE and have been wondering if it's just me or is this > a known issue that's being worked on? > > (Sorry if that's just my local setup

Re: [RESULT][VOTE] Spark 2.2.1 (RC2)

2017-12-06 Thread Sean Owen
che...@apache.org> wrote: > This vote passes. Thanks everyone for testing this release. > > > +1: > > Sean Owen (binding) > > Herman van Hövell tot Westerflier (binding) > > Wenchen Fan (binding) > > Shivaram Venkataraman (binding) > > Felix Cheung >

Re: Does anyone know how to build spark with scala12.4?

2017-11-29 Thread Sean Owen
No, you have to run ./dev/change-scala-version.sh 2.12 before building for 2.12. That makes all the necessary POM changes. On Wed, Nov 29, 2017 at 8:11 PM Zhang, Liyun wrote: > Hi Sean: > > I have tried to use following script to build package but have problem( > I am

Re: Publishing official docker images for KubernetesSchedulerBackend

2017-11-29 Thread Sean Owen
nsibility to maintain and publish that image. If there is more >> than one way to do it and publishing a particular image is more just a >> convenience, then my bias tends more away from maintaining and publish it. >> >> On Wed, Nov 29, 2017 at 5:14 AM, Sean Owen <so...@clo

Re: Publishing official docker images for KubernetesSchedulerBackend

2017-11-29 Thread Sean Owen
Source code is the primary release; compiled binary releases are conveniences that are also released. A docker image sounds fairly different though. To the extent it's the standard delivery mechanism for some artifact (think: pyspark on PyPI as well) that makes sense, but is that the situation? if

Re: Does anyone know how to build spark with scala12.4?

2017-11-28 Thread Sean Owen
anks > Jerry > > 2017-11-28 21:51 GMT+08:00 Sean Owen <so...@cloudera.com>: > >> The Scala 2.12 profile mostly works, but not all tests pass. Use >> -Pscala-2.12 on the command line to build. >> >> On Tue, Nov 28, 2017 at 5:36 AM Ofir Manor <ofir.ma...

Re: Does anyone know how to build spark with scala12.4?

2017-11-28 Thread Sean Owen
tor / fix Spark source code to support > Scala 2.12 - look for multiple emails on this list in the last months from > Sean Owen on his progress. > Once Spark supports Scala 2.12, I think the next target would be JDK 9 > support. > > Ofir Manor > > Co-Founder & CTO |

Re: Object in compiler mirror not found - maven build

2017-11-26 Thread Sean Owen
I'm not seeing that on OS X or Linux. It sounds a bit like you have an old version of zinc or scala or something installed. On Sun, Nov 26, 2017 at 3:55 PM Tomasz Dudek wrote: > Hello everyone, > > I would love to help develop Apache Spark. I have run into a (very

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-26 Thread Sean Owen
>> val preferredMirror = >> Seq("wget", "https://www.apache.org/dyn/closer.lua?preferred=true", "-q", >> "-O", "-").!!.trim >> val url = s" >> $preferredMirror/spark/spark-$version/spark-$version-bin-hadoop2.7.

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-25 Thread Sean Owen
I hit the same StackOverflowError as in the previous RC test, but, pretty sure this is just because the increased thread stack size JVM flag isn't applied consistently. This seems to resolve it: https://github.com/apache/spark/pull/19820 This wouldn't block release IMHO. I am currently

Re: Thoughts on extending ML exporting in Spark?

2017-11-19 Thread Sean Owen
To paraphrase, you are mostly suggesting a new API for reading/writing models, not a new serialization? and the API should be more like the other DataFrame writer APIs, and more extensible? That's better than introducing any new format for sure, as there are already 1.5 supported formats -- the

Re: [VOTE] Spark 2.2.1 (RC1)

2017-11-15 Thread Sean Owen
The signature is fine, with your new sig. Updated hashes look fine too. LICENSE is still fine to my knowledge. Is anyone else seeing this failure? - GenerateOrdering with ShortType *** RUN ABORTED *** java.lang.StackOverflowError: at

Re: Cutting the RC for Spark 2.2.1 release

2017-11-13 Thread Sean Owen
It's repo.maven.apache.org ? On Mon, Nov 13, 2017 at 12:52 PM Felix Cheung wrote: > I did change it, but getting unknown host? > > [ERROR] Non-resolvable parent POM for > org.apache.spark:spark-parent_2.11:2.2.1-SNAPSHOT: Could not transfer > artifact

Re: Cutting the RC for Spark 2.2.1 release

2017-11-13 Thread Sean Owen
I'm not seeing a problem building, myself. However we could change the location of the Maven Repository in our POM to https://repo.maven.apache.org/maven2/ without any consequence. The only reason we overrode it was to force it to use HTTPS which still doesn't look like the default (!):
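The POM change described above would look roughly like the following — a hedged sketch of a Maven `<repositories>` override forcing HTTPS for Central, not necessarily Spark's exact POM:

```xml
<!-- Sketch: override Maven Central to use HTTPS. The id must be
     "central" so that this entry replaces Maven's built-in default. -->
<repositories>
  <repository>
    <id>central</id>
    <url>https://repo.maven.apache.org/maven2/</url>
    <releases><enabled>true</enabled></releases>
    <snapshots><enabled>false</enabled></snapshots>
  </repository>
</repositories>
```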

Re: is there a way for removing hadoop from spark

2017-11-12 Thread Sean Owen
Nothing about Spark depends on a cluster. The Hadoop client libs are required as they are part of the API but there is no need to remove that if you aren't using YARN. Indeed you can't but they're just libs. On Sun, Nov 12, 2017, 9:36 PM wrote: > @Jörn Spark without Hadoop is

Re: Jenkins upgrade/Test Parallelization & Containerization

2017-11-08 Thread Sean Owen
u <hol...@pigscanfly.ca> > *Sent:* Tuesday, November 7, 2017 2:14:18 PM > *To:* Sean Owen > *Cc:* Xin Lu; dev@spark.apache.org > *Subject:* Re: Jenkins upgrade/Test Parallelization & Containerization > > True, I think we've seen that the Amp Lab Jenkins needs to be more focus

Re: Jenkins upgrade/Test Parallelization & Containerization

2017-11-07 Thread Sean Owen
Faster tests would be great. I recall that the straightforward ways to parallelize via Maven haven't worked because many tests collide with one another. Is this about running each module's tests in a container? that should work. I can see how this is becoming essential for repeatable and reliable

Re: Spark build is failing in amplab Jenkins

2017-11-04 Thread Sean Owen
Agree, seeing this somewhat regularly on the pull request builder. Do some machines inadvertently have Python 2.6? some builds succeed, so may just be one or a few. CC Shane. On Thu, Nov 2, 2017 at 5:39 PM Pralabh Kumar wrote: > Hi Dev > > Spark build is failing in

Re: Kicking off the process around Spark 2.2.1

2017-11-02 Thread Sean Owen
The feature freeze is "mid November" : http://spark.apache.org/versioning-policy.html Let's say... Nov 15? anybody have a better date? Although it'd be nice to get 2.2.1 out sooner than later in all events, and kind of makes sense to get out first, they need not go in order. It just might be

Re: [Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-02 Thread Sean Owen
+0 simply because I don't feel I know enough to have an opinion. I have no reason to doubt the change though, from a skim through the doc. On Wed, Nov 1, 2017 at 3:37 PM Reynold Xin wrote: > Earlier I sent out a discussion thread for CP in Structured Streaming: > >

Re: Anyone knows how to build and spark on jdk9?

2017-10-27 Thread Sean Owen
Certainly, Scala 2.12 support precedes Java 9 support. A lot of the work is in place already, and the last issue is dealing with how Scala closures are now implemented quite differently with lambdas / invokedynamic. This affects the ClosureCleaner. For the interested, this is as far as I know the

Re: Kicking off the process around Spark 2.2.1

2017-10-25 Thread Sean Owen
It would be reasonably consistent with the timing of other x.y.1 releases, and more release managers sounds useful, yeah. Note also that in theory the code freeze for 2.3.0 starts in about 2 weeks. On Wed, Oct 25, 2017 at 12:29 PM Holden Karau wrote: > Now that Spark

Raise Jenkins timeout?

2017-10-09 Thread Sean Owen
I'm seeing jobs killed regularly, presumably because the time out (210 minutes?) https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.6/3907/console Possibly related: this master-SBT-2.7 build hasn't passed in weeks:

Re: netlib-java not maintaned anymore?

2017-10-06 Thread Sean Owen
It doesn't stop working though, so I don't think it means we have to stop using it. It's a simple connector that doesn't depend on other stuff. Unless there's another better option, or until it no longer works, I think it's fine. On Fri, Oct 6, 2017 at 9:29 AM Takeshi Yamamuro

Re: Disabling Closed -> Reopened transition for non-committers

2017-10-05 Thread Sean Owen
as Resolved. >> >> I support this idea. I don't think this is unfriendly as it sounds >> in practice. This case should be quite occasional I guess. >> >> >> 2017-10-05 20:02 GMT+09:00 Sean Owen <so...@cloudera.com>: >> >>>

Re: Disabling Closed -> Reopened transition for non-committers

2017-10-05 Thread Sean Owen
ption , where the links to wiki > entries are lists of "common causes of these networking issues" are a > checklist for everyone; the real facts of hostnames and ports are there for > tracking things down. The core Java io networking errors are without > meaningful information, so it's

Re: Disabling Closed -> Reopened transition for non-committers

2017-10-05 Thread Sean Owen
o be sure, this is only for JIRA and not for github PR, right? > > If then +1 but I think the access control on JIRA does not necessarily > match the committer list, and is manually maintained, last I hear. > > -- > *From:* Sean Owen <so...@cloudera.com

Re: Disabling Closed -> Reopened transition for non-committers

2017-10-04 Thread Sean Owen
gmail.com> wrote: > It can stop reopening, but new JIRA issues with duplicate content will be > created intentionally instead. > > Is that policy (privileged reopening) used in other Apache communities for > that purpose? > > > On Wed, Oct 4, 2017 at 7:06 PM, Sean Ow

Disabling Closed -> Reopened transition for non-committers

2017-10-04 Thread Sean Owen
We have this problem occasionally, where a disgruntled user continually reopens an issue after it's closed. https://issues.apache.org/jira/browse/SPARK-21999 (Feel free to comment on this one if anyone disagrees) Regardless of that particular JIRA, I'd like to disable the Closed -> Reopened

Re: Configuration docs pages are broken

2017-10-03 Thread Sean Owen
I think this was fixed in https://issues.apache.org/jira/browse/SPARK-21593 but the docs only go out with releases. It would be fixed when 2.2.1 goes out. On Tue, Oct 3, 2017 at 5:53 PM Nick Dimiduk wrote: > Heya, > > Looks like the Configuration sections of your docs, both

Re: [VOTE] Spark 2.1.2 (RC4)

2017-10-03 Thread Sean Owen
+1 same as last RC. Tests pass, sigs and hashes are OK. On Tue, Oct 3, 2017 at 7:24 AM Holden Karau wrote: > Please vote on releasing the following candidate as Apache Spark version 2 > .1.2. The vote is open until Saturday October 7th at 9:00 PST and passes > if a

Re: Should Flume integration be behind a profile?

2017-10-02 Thread Sean Owen
>> On Sun, Oct 1, 2017 at 3:50 PM, Reynold Xin <r...@databricks.com> wrote: >> > Probably should do 1, and then it is an easier transition in 3.0. >> > >> > On Sun, Oct 1, 2017 at 1:28 AM Sean Owen <so...@cloudera.com> wrote: >> >> >> >> I tried an

Re: Should Flume integration be behind a profile?

2017-10-01 Thread Sean Owen
examples, deprecate 2. Put Flume behind a profile, remove examples, but don't deprecate 3. Punt until Spark 3.0, when this integration would probably be removed entirely (?) On Tue, Sep 26, 2017 at 10:36 AM Sean Owen <so...@cloudera.com> wrote: > Not a big deal, but I'm wondering whet

Re: Inclusion of Spark on SDKMAN

2017-09-28 Thread Sean Owen
I don't think Spark would ever be distributed except through the ASF and mainstream channels like Maven Central, but you can redistribute the bits as-is as you like. This would be in line with the terms of the Apache license. On Thu, Sep 28, 2017 at 6:17 AM Marco Vermeulen

Re: [VOTE] Spark 2.1.2 (RC2)

2017-09-27 Thread Sean Owen
+1 I tested the source release. Hashes and signature (your signature) check out, project builds and tests pass with -Phadoop-2.7 -Pyarn -Phive -Pmesos on Debian 9. List of issues look good and there are no open issues at all for 2.1.2. Great work on improving the build process and docs. On

Should Flume integration be behind a profile?

2017-09-26 Thread Sean Owen
Not a big deal, but I'm wondering whether Flume integration should at least be opt-in and behind a profile? it still sees some use (at least on our end) but not applicable to the majority of users. Most other third-party framework integrations are behind a profile, like YARN, Mesos, Kinesis, Kafka

Re: [VOTE][SPIP] SPARK-21866 Image support in Apache Spark

2017-09-21 Thread Sean Owen
Am I right that this doesn't mean other packages would use this representation, but that they could? The representation looked fine to me w.r.t. what DL frameworks need. My previous comment was that this is actually quite lightweight. It's kind of like how I/O support is provided for CSV and

Re: A little Scala 2.12 help

2017-09-19 Thread Sean Owen
eConverter[T]. I'm working through this and other deprecated items in 2.12 and preparing more 2.11-compatible changes that allow these to work cleanly in 2.12. On Fri, Sep 15, 2017 at 11:21 AM Sean Owen <so...@cloudera.com> wrote: > I'm working on updating to Scala 2.12, and, have hit a c

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-17 Thread Sean Owen
: >>>> >>>>> Indeed it's limited to a people with login permissions on the Jenkins >>>>> host (and perhaps further limited, I'm not certain). Shane probably knows >>>>> more about the ACLs, so I'll ask him in the other thread for specifi

Signing releases with pwendell or release manager's key?

2017-09-15 Thread Sean Owen
Yeah I had meant to ask about that in the past. While I presume Patrick consents to this and all that, it does mean that anyone with access to said Jenkins scripts can create a signed Spark release, regardless of who they are. I haven't thought through whether that's a theoretical issue we can

A little Scala 2.12 help

2017-09-15 Thread Sean Owen
I'm working on updating to Scala 2.12, and, have hit a compile error in Scala 2.12 that I'm struggling to design a fix to (that doesn't modify the API significantly). If you "./dev/change-scala-version.sh 2.12" and compile, you'll see errors like... [error]

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-14 Thread Sean Owen
+1 Very nice. The sigs and hashes look fine, it builds fine for me on Debian Stretch with Java 8, yarn/hive/hadoop-2.7 profiles, and passes tests. Yes as you say, no outstanding issues except for this which doesn't look critical, as it's not a regression. SPARK-21985 PySpark PairDeserializer is

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-14 Thread Sean Owen
I think the search filter is OK, but for whatever reason the filter link includes what JIRA you're currently browsing, and that one is not actually included in the filter. It opens on a JIRA that's not included, but the search results look correct. project = SPARK AND fixVersion = 2.1.2 On Thu,

Re: What is d3kbcqa49mib13.cloudfront.net ?

2017-09-14 Thread Sean Owen
, Sep 13, 2017 at 10:26 AM, Shivaram Venkataraman >>> > <shiva...@eecs.berkeley.edu> wrote: >>> >> >>> >> The bucket comes from Cloudfront, a CDN thats part of AWS. There was a >>> >> bunch of discussion about this back in 2013 >

Re: What is d3kbcqa49mib13.cloudfront.net ?

2017-09-13 Thread Sean Owen
13 > > https://lists.apache.org/thread.html/9a72ff7ce913dd85a6b112b1b2de536dcda74b28b050f70646aba0ac@1380147885@%3Cdev.spark.apache.org%3E > > Shivaram > > On Wed, Sep 13, 2017 at 9:30 AM, Sean Owen <so...@cloudera.com> wrote: > > Not a big deal, but Mark noticed that this t

What is d3kbcqa49mib13.cloudfront.net ?

2017-09-13 Thread Sean Owen
Not a big deal, but Mark noticed that this test now downloads Spark artifacts from the same 'direct download' link available on the downloads page: https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala#L53

Re: 2.1.2 maintenance release?

2017-09-12 Thread Sean Owen
2017 9:27 AM >> Subject: Re: 2.1.2 maintenance release? >> To: Felix Cheung <felixcheun...@hotmail.com>, Holden Karau < >> hol...@pigscanfly.ca>, Sean Owen <so...@cloudera.com>, dev < >> dev@spark.apache.org> >> >> >> >> +1 as w

Re: Putting Kafka 0.8 behind an (opt-in) profile

2017-09-11 Thread Sean Owen
no non-deprecated support now? On Thu, Sep 7, 2017 at 10:32 AM Sean Owen <so...@cloudera.com> wrote: > For those following along, see discussions at > https://github.com/apache/spark/pull/19134 > > It's now also clear that we'd need to remove Kafka 0.8 examples if Kafka > 0.

CVE-2017-12612 Unsafe deserialization in Apache Spark launcher API

2017-09-08 Thread Sean Owen
Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Versions of Apache Spark from 1.6.0 until 2.1.1 Description: In Apache Spark 1.6.0 until 2.1.1, the launcher API performs unsafe deserialization of data received by its socket. This makes applications launched

Re: 2.1.2 maintenance release?

2017-09-08 Thread Sean Owen
Let's look at the standard ASF guidance, which actually surprised me when I first read it: https://www.apache.org/foundation/voting.html VOTES ON PACKAGE RELEASES Votes on whether a package is ready to be released use majority approval -- i.e. at least three PMC members must vote affirmatively

Re: Putting Kafka 0.8 behind an (opt-in) profile

2017-09-07 Thread Sean Owen
, 2017 at 3:00 PM Cody Koeninger <c...@koeninger.org> wrote: > I kind of doubt the kafka 0.10 integration is going to change much at > all before the upgrade to 0.11 > > On Wed, Sep 6, 2017 at 8:57 AM, Sean Owen <so...@cloudera.com> wrote: > > Thanks, I can do that. We

2.1.2 maintenance release?

2017-09-07 Thread Sean Owen
In a separate conversation about bugs and a security issue fixed in 2.1.x and 2.0.x, Marcelo suggested it could be time for a maintenance release. I'm not sure what our stance on 2.0.x is, but 2.1.2 seems like it could be valuable to release. Thoughts? I believe Holden had expressed interest in

Re: Putting Kafka 0.8 behind an (opt-in) profile

2017-09-06 Thread Sean Owen
> +1 to going ahead and giving a deprecation warning now > > On Tue, Sep 5, 2017 at 6:39 AM, Sean Owen <so...@cloudera.com> wrote: > > On the road to Scala 2.12, we'll need to make Kafka 0.8 support optional > in > > the build, because it is not available for Sc

Re: descriptions not updated in "Useful Developer Tools" of the website

2017-09-06 Thread Sean Owen
True, can you make a pull request vs github.com/apache/spark-website? I think users probably have to add this to spark.{executor|driver}.extraJavaOptions On Wed, Sep 6, 2017 at 8:08 AM Takeshi Yamamuro wrote: > hi, devs, > > I found some descriptions were not updated in >

Putting Kafka 0.8 behind an (opt-in) profile

2017-09-05 Thread Sean Owen
On the road to Scala 2.12, we'll need to make Kafka 0.8 support optional in the build, because it is not available for Scala 2.12. https://github.com/apache/spark/pull/19134 adds that profile. I mention it because this means that Kafka 0.8 becomes "opt-in" and has to be explicitly enabled, and

Re: Moving Scala 2.12 forward one step

2017-09-01 Thread Sean Owen
hat Spark isn’t supporting 2.12 > soon enough. I thought SPARK-14220 was blocked mainly because the changes > are hard, but if not, maybe we can release such a branch sooner. > > Matei > > > On Aug 31, 2017, at 3:59 AM, Sean Owen <so...@cloudera.com> wrote: > >

Re: Moving Scala 2.12 forward one step

2017-08-31 Thread Sean Owen
wrote: > Hi Sean, > > Do we have a planned target version for Scala 2.12 support? Several other > projects like Zeppelin, Livy which rely on Spark repl also require changes > to support this Scala 2.12. > > Thanks > Jerry > > On Thu, Aug 31, 2017 at 5:55 PM,

Re: Moving Scala 2.12 forward one step

2017-08-31 Thread Sean Owen
ny improvements in benchmarks? > > > On 31 August 2017 at 12:25, Sean Owen <so...@cloudera.com> wrote: > >> Calling attention to the question of Scala 2.12 again for moment. I'd >> like to make a modest step towards support. Have a look again, if you >> would, at SPAR

Moving Scala 2.12 forward one step

2017-08-31 Thread Sean Owen
Calling attention to the question of Scala 2.12 again for a moment. I'd like to make a modest step towards support. Have a look again, if you would, at SPARK-14280: https://github.com/apache/spark/pull/18645 This is a lot of the change for 2.12 that doesn't break 2.11, and really doesn't add any

Are there multiple processes out there running JIRA <-> Github maintenance tasks?

2017-08-28 Thread Sean Owen
Like whatever reassigns JIRAs after a PR is closed? It seems to be going crazy, or maybe there are many running. Not sure who owns that, but can he/she take a look?

Re: Thoughts on release cadence?

2017-08-24 Thread Sean Owen
are every 4 months. > > > Tom > On Monday, July 31, 2017, 2:23:10 PM CDT, Sean Owen <so...@cloudera.com> > wrote: > > > Done at https://spark.apache.org/versioning-policy.html > > On Mon, Jul 31, 2017 at 6:22 PM Reynold Xin <r...@databricks.com> wrote: &g

Re: Use Apache ORC in Apache Spark 2.3

2017-08-11 Thread Sean Owen
-private@ list for future replies. This is not a PMC conversation. On Fri, Aug 11, 2017 at 3:17 AM Andrew Ash wrote: > @Reynold no I don't use the HiveCatalog -- I'm using a custom > implementation of ExternalCatalog instead. > > On Thu, Aug 10, 2017 at 3:34 PM, Dong Joon

Re: Some PRs not automatically linked to JIRAs

2017-08-02 Thread Sean Owen
Hyukjin mentioned this here earlier today and had run it manually, but yeah I'm not sure where it normally runs or why it hasn't. Shane not sure if you're the person to ask? On Wed, Aug 2, 2017 at 7:47 PM Bryan Cutler wrote: > Hi Devs, > > I've noticed a couple PRs recently

Re: [VOTE] [SPIP] SPARK-18085: Better History Server scalability

2017-08-01 Thread Sean Owen
(Direct link to design doc, linked from JIRA) https://issues.apache.org/jira/browse/SPARK-18085 https://issues.apache.org/jira/secure/attachment/12835040/spark_hs_next_gen.pdf I know Marcelo has looked closely at this issue for a long while and trust his judgment about what needs to be fixed, and

Re: Thoughts on release cadence?

2017-07-31 Thread Sean Owen
Done at https://spark.apache.org/versioning-policy.html On Mon, Jul 31, 2017 at 6:22 PM Reynold Xin <r...@databricks.com> wrote: > We can just say release in December, and code freeze mid Nov? > > On Mon, Jul 31, 2017 at 10:14 AM, Sean Owen <so...@cloudera.com> wrot

Re: Thoughts on release cadence?

2017-07-31 Thread Sean Owen
gt; > On Sun, Jul 30, 2017 at 3:34 PM, Reynold Xin <r...@databricks.com> wrote: > >> This is reasonable ... +1 >> >> >> On Sun, Jul 30, 2017 at 2:19 AM, Sean Owen <so...@cloudera.com> wrote: >> >>> The project had traditionally posted some gui

Thoughts on release cadence?

2017-07-30 Thread Sean Owen
The project had traditionally posted some guidance about upcoming releases. The last release cycle was about 6 months. What about penciling in December 2017 for 2.3.0? http://spark.apache.org/versioning-policy.html

Re: Tests failing with run-tests.py SyntaxError

2017-07-28 Thread Sean Owen
sure before - > https://issues.apache.org/jira/browse/SPARK-20149 > > On 28 Jul 2017 9:56 pm, "Sean Owen" <so...@cloudera.com> wrote: > >> File "./dev/run-tests.py", line 124 >> {m: set(m.dependencies).intersec

Tests failing with run-tests.py SyntaxError

2017-07-28 Thread Sean Owen
File "./dev/run-tests.py", line 124 {m: set(m.dependencies).intersection(modules_to_test) for m in modules_to_test}, sort=True) ^ SyntaxError: invalid syntax It seems like tests are failing intermittently with this type of error,
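The `SyntaxError` above is exactly what a Python 2.6 interpreter reports for a dict comprehension, which only became available in Python 2.7 — consistent with the suspicion elsewhere in this thread that some Jenkins machines had Python 2.6. A minimal sketch of the two forms (module names here are illustrative, not the real `run-tests.py` module graph):

```python
# Dict comprehensions ({k: v for ...}) are a Python 2.7+ feature; on
# Python 2.6 the parser fails exactly as in the traceback above.
# The 2.6-compatible equivalent passes a generator of pairs to dict().
modules_to_test = ["core", "sql", "mllib"]
dependencies = {"core": [], "sql": ["core"], "mllib": ["core", "sql"]}

# Python 2.7+ form (the one that triggers SyntaxError on 2.6):
deps_27 = {m: set(dependencies[m]).intersection(modules_to_test)
           for m in modules_to_test}

# Python 2.6-compatible form, same result:
deps_26 = dict((m, set(dependencies[m]).intersection(modules_to_test))
               for m in modules_to_test)
```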

Re: 2.2.0 under Unreleased Versions in JIRA?

2017-07-16 Thread Sean Owen
Done, it just needed to be marked as released. On Sun, Jul 16, 2017 at 12:03 PM Jacek Laskowski wrote: > Hi, > > Just noticed that 2.2.0 label is under Unreleased Versions in JIRA. > Since it's out, I think 2.2.1 and 2.3.0 are valid only. Correct? > > Pozdrawiam, > Jacek

Re: Testing Apache Spark with JDK 9 Early Access builds

2017-07-14 Thread Sean Owen
IIRC Scala 2.11 doesn't work on Java 9, and full support will come in 2.13. I think that may be the biggest gating factor for Spark. At least, we can get going on 2.12 support now that 2.10 support is dropped. On Fri, Jul 14, 2017 at 5:23 PM Matei Zaharia wrote: > FYI,

Re: Crowdsourced triage Scapegoat compiler plugin warnings

2017-07-13 Thread Sean Owen
h sharing this with the user list, there must be people > willing to collaborate who are not on the dev list. > > 2017-07-13 10:00 GMT+01:00 Sean Owen <so...@cloudera.com>: > >> I don't think everything needs to be triaged. There are a ton of useful >> changes that have been

Re: Crowdsourced triage Scapegoat compiler plugin warnings

2017-07-13 Thread Sean Owen
I don't think everything needs to be triaged. There are a ton of useful changes that have been identified. I think you could just pick some warning types where they've all been triaged and go fix them. On Thu, Jul 13, 2017 at 9:16 AM Hyukjin Kwon wrote: > Hi all, > > >

CVE-2017-7678 Apache Spark XSS web UI MHTML vulnerability

2017-07-12 Thread Sean Owen
Severity: Low Vendor: The Apache Software Foundation Versions Affected: Versions of Apache Spark before 2.2.0 Description: It is possible for an attacker to take advantage of a user's trust in the server to trick them into visiting a link that points to a shared Spark cluster and submits data

Re: [VOTE] Apache Spark 2.2.0 (RC6)

2017-07-01 Thread Sean Owen
+1 binding. Same as last time. All tests pass with -Phive -Phadoop-2.7 -Pyarn, all sigs and licenses look OK. We have one issue opened yesterday for 2.2.0: https://issues.apache.org/jira/browse/SPARK-21267 I assume this isn't really meant to be in this release, and sounds non-essential, so OK.

Re: Is there something wrong with jenkins?

2017-06-26 Thread Sean Owen
The Arrow change broke the build: https://github.com/apache/spark/pull/15821#issuecomment-310894657 Do we need to revert this? I don't want to but it's also blocking testing. On Mon, Jun 26, 2017 at 12:19 PM Yuming Wang wrote: > Hi All, > > Is there something wrong with

Re: Question on Spark code

2017-06-25 Thread Sean Owen
rg/slf4j/simple/SimpleLogger.java#L599 > > Please correct me if I am wrong. > > > > > On Sun, Jun 25, 2017 at 3:04 AM, Sean Owen <so...@cloudera.com> wrote: > >> Maybe you are looking for declarations like this. "=> String" means the >> arg i

Re: Question on Spark code

2017-06-25 Thread Sean Owen
Maybe you are looking for declarations like this. "=> String" means the arg isn't evaluated until it's used, which is just what you want with log statements. The message isn't constructed unless it will be logged. protected def logInfo(msg: => String) { On Sun, Jun 25, 2017 at 10:28 AM kant
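The by-name parameter above means the log message is never built unless it will actually be logged. The same idea can be mimicked in Python by hiding message construction behind a zero-argument callable — a minimal sketch with hypothetical helpers, not Spark's actual logging code:

```python
# Lazy log-message construction, analogous to Scala's "msg: => String"
# by-name parameter: the message is only built if logging is enabled.
# log_info and the counter are illustrative, not a real logging API.
calls = {"built": 0}

def expensive_message():
    # Stands in for an expensive string construction
    calls["built"] += 1
    return "state dump: " + ",".join(str(i) for i in range(5))

def log_info(msg_fn, enabled=True):
    # msg_fn is never invoked when logging is disabled
    if enabled:
        return msg_fn()
    return None

log_info(expensive_message, enabled=False)       # message not constructed
out = log_info(expensive_message, enabled=True)  # constructed exactly once
```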

Re: [VOTE] Apache Spark 2.2.0 (RC5)

2017-06-21 Thread Sean Owen
+1 Sigs/hashes look good. Tests pass on Java 8 / Ubuntu 17 with -Pyarn -Phive -Phadoop-2.7 for me. The only open issues for 2.2.0 are: SPARK-21144 Unexpected results when the data schema and partition schema have the duplicate columns SPARK-18267 Distribute PySpark via Python Package Index

Question: why is Externalizable used?

2017-06-19 Thread Sean Owen
Just wanted to call attention to this question, mostly because I'm curious: https://github.com/apache/spark/pull/18343#issuecomment-309388668 Why is Externalizable (+ KryoSerializable) used instead of Serializable? and should the first two always go together?

Re: Crowdsourced triage Scapegoat compiler plugin warnings

2017-06-17 Thread Sean Owen
Looks like a whole lot of the results have been analyzed. I suspect there's more than enough to act on already. I think we should wait until after 2.2 is done. Anybody prefer how to proceed here -- just open a JIRA to take care of a batch of related types of issues and go for it? On Sat, Jun 17,

Re: the dependence length of RDD, can its size be greater than 1 pleaae?

2017-06-15 Thread Sean Owen
Yes. Imagine an RDD that results from a union of other RDDs. On Thu, Jun 15, 2017, 09:11 萝卜丝炒饭 <1427357...@qq.com> wrote: > Hi all, > > The RDD code keeps a member as below: > dependencies_ : seq[Dependency[_]] > > It is a seq, that means it can keep more than one dependency. > > I have an issue
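A toy model of the point above — these classes are illustrative, not Spark's actual RDD API: an RDD built as a union carries one dependency per parent, so the dependency sequence can have any length.

```python
# Toy model of an RDD's dependency list. In Spark a union RDD depends
# on every parent, so dependencies_ is a seq of length > 1.
class FakeRDD:
    def __init__(self, dependencies=()):
        self.dependencies = list(dependencies)

def union(*rdds):
    # the resulting RDD depends on each parent RDD
    return FakeRDD(dependencies=rdds)

a, b, c = FakeRDD(), FakeRDD(), FakeRDD()
u = union(a, b, c)
```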

Re: [apache/spark] [TEST][SPARKR][CORE] Fix broken SparkSubmitSuite (#18283)

2017-06-15 Thread Sean Owen
Cc Shane? On Thu, Jun 15, 2017, 08:39 Felix Cheung wrote: > I guess that script can be changed to use JAVA_HOME instead of blindly > assume it's accessible... > are these new machines in Jenkins? > > — > You are receiving this because you were mentioned. > Reply to

SBT / PR builder builds failing on "include an external JAR in SparkR"

2017-06-12 Thread Sean Owen
I noticed the PR builder builds are all failing with: [info] - correctly builds R packages included in a jar with --packages !!! IGNORED !!! [info] - include an external JAR in SparkR *** FAILED *** (32 milliseconds) [info] new java.io.File(rScriptDir).exists() was false

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-09 Thread Sean Owen
Different errors as in https://issues.apache.org/jira/browse/SPARK-20520 but that's also reporting R test failures. I went back and tried to run the R tests and they passed, at least on Ubuntu 17 / R 3.3. On Fri, Jun 9, 2017 at 9:12 AM Nick Pentreath wrote: > All

Re: Are release docs part of a release?

2017-06-09 Thread Sean Owen
t you're quoting > from are the source code and convenience binaries. There's definitely room > for interpretation here, but I don't think it would be a problem as long as > we do something reasonable. > > On Tue, Jun 6, 2017 at 2:15 AM, Sean Owen <so...@cloudera.com> wrote: >

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-08 Thread Sean Owen
e the only remaining 2.2 issues, FYI: SPARK-20520 R streaming tests failed on Windows SPARK-15799 Release SparkR on CRAN SPARK-18267 Distribute PySpark via Python Package Index (pypi) On Tue, Jun 6, 2017 at 12:20 AM Sean Owen <so...@cloudera.com> wrote: > On the latest Ubuntu, Java 8, with

Re: Is static volatile variable different with static variable in the closure?

2017-06-07 Thread Sean Owen
static and volatile are unrelated. Being volatile doesn't change the properties of the variable with respect to being static. On Wed, Jun 7, 2017 at 4:01 PM Chang Chen wrote: > Static variable will be initialized in worker node JVM, will not be > serialized from master.

Are release docs part of a release?

2017-06-06 Thread Sean Owen
That's good, but, I think we should agree on whether release docs are part of a release. It's important to reasoning about releases. To be clear, you're suggesting that, say, right now you are OK with updating this page with a few more paragraphs?

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-06 Thread Sean Owen
On Tue, Jun 6, 2017 at 1:06 AM Michael Armbrust wrote: > Regarding the readiness of this and previous RCs. I did cut RC1 & RC2 > knowing that they were unlikely to pass. That said, I still think these > early RCs are valuable. I know several users that wanted to test

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-05 Thread Sean Owen
On the latest Ubuntu, Java 8, with -Phive -Phadoop-2.7 -Pyarn, this passes all tests. It's looking good, pending a double-check on the outstanding JIRA questions. All the hashes and sigs are correct. On Mon, Jun 5, 2017 at 8:15 PM Michael Armbrust wrote: > Please vote

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-05 Thread Sean Owen
h options long term if this vote passes. Looks like the > remaining JIRAs are doc/website updates that can happen after the vote or > QA that should be done on this RC. I think we are ready to start testing > this release seriously! > > On Mon, Jun 5, 2017 at 12:40 PM, Sean O

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-05 Thread Sean Owen
Xiao opened a blocker on 2.2.0 this morning: SPARK-20980 Rename the option `wholeFile` to `multiLine` for JSON and CSV I don't see that this should block? We still have 7 Critical issues: SPARK-20520 R streaming tests failed on Windows SPARK-20512 SparkR 2.2 QA: Programming guide, migration

Re: [build system] jenkins got itself wedged...

2017-05-18 Thread Sean Owen
I'm not sure if it's related, but I still can't get Jenkins to test PRs. For example, triggering it through the spark-prs.appspot.com UI gives me... https://spark-prs.appspot.com/trigger-jenkins/18012 Internal Server Error That might be from the appspot app though? But posting "Jenkins test this

Re: [VOTE] Apache Spark 2.2.0 (RC2)

2017-05-04 Thread Sean Owen
The tests pass, licenses are OK, sigs, etc. I'd endorse it but we do still have blockers, so I assume people mean we need there will be another RC at some point. Blocker SPARK-20503 ML 2.2 QA: API: Python API coverage SPARK-20501 ML, Graph 2.2 QA: API: New Scala APIs, docs SPARK-20502 ML, Graph

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-05-01 Thread Sean Owen
en it hits avro 1.7.7 on the > classpath. Avro 1.8.0 is not binary compatible with 1.7.7. > > [0] - https://issues.apache.org/jira/browse/SPARK-19697 > [1] - > https://github.com/apache/parquet-mr/blob/apache-parquet-1.8.2/pom.xml#L96 > > On Sun, Apr 30, 2017 at 3:28 AM, Sean Owen &

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-04-30 Thread Sean Owen
I have one more issue that, if it needs to be fixed, needs to be fixed for 2.2.0. I'm fixing build warnings for the release and noticed that checkstyle actually complains there are some Java methods named in TitleCase, like `ProcessingTimeTimeout`:

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-04-27 Thread Sean Owen
By the way the RC looks good. Sigs and license are OK, tests pass with -Phive -Pyarn -Phadoop-2.7. +1 from me. On Thu, Apr 27, 2017 at 7:31 PM Michael Armbrust wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.2.0. The vote is open

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-04-27 Thread Sean Owen
on, which I don't think needs to >> block testing on an RC (and in fact probably needs an RC to test?). >> Joseph, please correct me if I'm wrong. It is unlikely this first RC is >> going to pass, but I wanted to get the ball rolling on testing 2.2. >> >> On Thu, Apr 27,
