Re: [VOTE] Release Spark 3.2.1 (RC2)

2022-01-24 Thread Michael Heuer
+1 (non-binding) michael > On Jan 24, 2022, at 7:30 AM, Gengliang Wang wrote: > > +1 (non-binding) > > On Mon, Jan 24, 2022 at 6:26 PM Dongjoon Hyun > wrote: > +1 > > Dongjoon. > > On Sat, Jan 22, 2022 at 7:19 AM Mridul Muralidharan

Re: [VOTE] Release Spark 3.2.0 (RC7)

2021-10-06 Thread Michael Heuer
+1 (non-binding) michael > On Oct 6, 2021, at 11:49 AM, Gengliang Wang wrote: > > Starting with my +1(non-binding) > > Thanks, > Gengliang > > On Thu, Oct 7, 2021 at 12:48 AM Gengliang Wang > wrote: > Please vote on releasing the following candidate as Apache

Re: [VOTE] Release Spark 3.2.0 (RC6)

2021-09-28 Thread Michael Heuer
+1 (non-binding) Works for us, as with previous RCs. michael > On Sep 28, 2021, at 10:45 AM, Gengliang Wang wrote: > > Starting with my +1(non-binding) > > Thanks, > Gengliang > > On Tue, Sep 28, 2021 at 11:45 PM Gengliang Wang > wrote: > Please vote on

Re: [VOTE] Release Spark 3.2.0 (RC3)

2021-09-20 Thread Michael Heuer
+1 (non-binding) Spark 3.2.0 RC3, Parquet 1.12.1, and Avro 1.10.2 together remove the need for various conflict-preventing workarounds we've had to maintain for several years. Cheers, michael > On Sep 18, 2021, at 10:18 PM, Gengliang Wang wrote: > > Please vote on releasing the

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Michael Heuer
would well explain it if > you're exposed to the Spark classpath and have your own different Jackson dep. > > On Sun, Aug 22, 2021 at 1:21 PM Michael Heuer wrote: > We're seeing runtime classpath issues with Avro 1.10.2, Parquet 1.12.0, an

Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-22 Thread Michael Heuer
We're seeing runtime classpath issues with Avro 1.10.2, Parquet 1.12.0, and Spark 3.2.0 RC1. Our dependency tree is deep though, and will require further investigation. https://github.com/bigdatagenomics/adam/pull/2289 $ mvn test ... *** RUN

Re: Recovering SparkR on CRAN?

2020-12-22 Thread Michael Heuer
FYI > > On Wed, Dec 23, 2020 at 9:22 AM, Michael Heuer wrote: > Anecdotally, as a project downstream of Spark, we've been prevented from > pushing to CRAN because of this > > https://github.com/bigdatagenomics/ada

Re: Recovering SparkR on CRAN?

2020-12-22 Thread Michael Heuer
Anecdotally, as a project downstream of Spark, we've been prevented from pushing to CRAN because of this https://github.com/bigdatagenomics/adam/issues/1851 We've given up and marked it as WontFix. michael > On Dec 22, 2020, at 5:14 PM,

Re: [VOTE] Amend Spark's Semantic Versioning Policy

2020-03-09 Thread Michael Heuer
+1 (non-binding) I am disappointed, however, that this only mentions the API and not dependencies and transitive dependencies. As Spark does not provide separation between its runtime classpath and the classpath used by applications, I believe Spark's dependencies and transitive dependencies should
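
For context on the classpath concern above, the closest knob Spark currently offers is the experimental userClassPathFirst pair of settings, which prefer application jars over Spark's bundled copies. A minimal sketch of the relevant configuration keys (illustrative only; these settings can introduce their own conflicts, and in practice they are usually set via spark-defaults.conf or --conf on spark-submit rather than in code):

    import org.apache.spark.sql.SparkSession

    // Sketch only: ask Spark to prefer the application's jars over its own
    // bundled dependencies. Shown here merely to illustrate the key names.
    val spark = SparkSession.builder()
      .appName("user-classpath-first-sketch")
      .config("spark.driver.userClassPathFirst", "true")
      .config("spark.executor.userClassPathFirst", "true")
      .getOrCreate()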

Re: Spark 2.4.5 release for Parquet and Avro dependency updates?

2019-11-22 Thread Michael Heuer
To clarify, I don't think that Parquet 1.10.1 to 1.11.0 is a > runtime-incompatible change. The example mixed 1.11.0 and 1.10.1 in the same > execution. > > Michael, please be more careful about announcing compatibility problems in > other communities. If you've observed proble

Spark 2.4.5 release for Parquet and Avro dependency updates?

2019-11-22 Thread Michael Heuer
Hello, Avro 1.8.2 to 1.9.1 is a binary-incompatible update, and it appears that Parquet 1.10.1 to 1.11 will be a runtime-incompatible update (see the thread on dev@parquet).
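
As an illustration of the kind of workaround downstream builds end up carrying, a build can pin Avro and Parquet explicitly so transitive versions pulled in via Spark do not drift across the incompatible boundaries mentioned above. A sketch in sbt syntax (ADAM itself builds with Maven, so this is illustrative only):

    // build.sbt sketch: pin Avro and Parquet so transitive resolution does
    // not silently upgrade past the binary/runtime compatibility boundaries.
    dependencyOverrides ++= Seq(
      "org.apache.avro"    % "avro"           % "1.8.2",
      "org.apache.parquet" % "parquet-column" % "1.10.1",
      "org.apache.parquet" % "parquet-hadoop" % "1.10.1"
    )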

Re: Thoughts on Spark 3 release, or a preview release

2019-09-16 Thread Michael Heuer
gjoon.h...@gmail.com>> wrote: >> Thank you, Sean. >> >> I'm also +1 for the following three. >> >> 1. Start to ramp down (by the official branch-3.0 cut) >> 2. Apache Spark 3.0.0-preview in 2019 >> 3. Apache Spark 3.0.0 in early 2020 >>

Re: Thoughts on Spark 3 release, or a preview release

2019-09-11 Thread Michael Heuer
I would love to see Spark + Hadoop + Parquet + Avro compatibility problems resolved, e.g. https://issues.apache.org/jira/browse/SPARK-25588 https://issues.apache.org/jira/browse/SPARK-27781

Re: JDK11 Support in Apache Spark

2019-08-26 Thread Michael Heuer
That is not true for any downstream users who also provide a library. Whatever build mess you create in Apache Spark, we'll have to inherit it. ;) michael > On Aug 26, 2019, at 12:32 PM, Dongjoon Hyun wrote: > > As Shane wrote, not yet. > > `one build for works for both` is our

Re: Hadoop version(s) compatible with spark-2.4.3-bin-without-hadoop-scala-2.12

2019-05-21 Thread Michael Heuer
add avro-1.8.2.jar and jline-2.14.6.jar to the jars folder. I believe these > > jars missing in the provided profile is simply a mistake. > > > > best, > > koert > > > > On Mon, May 20, 2019 at 3:37 PM Michael Heuer > wrote:

Hadoop version(s) compatible with spark-2.4.3-bin-without-hadoop-scala-2.12

2019-05-20 Thread Michael Heuer
Hello, Which Hadoop version or versions are compatible with Spark 2.4.3 and Scala 2.12? The binary distribution spark-2.4.3-bin-without-hadoop-scala-2.12.tgz is missing avro-1.8.2.jar, so when attempting to run with Hadoop 2.7.7 there are classpath conflicts at runtime, as Hadoop 2.7.7
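
When chasing conflicts like this, a quick way to see which Avro the JVM actually resolved is to ask the class itself where it was loaded from. A small sketch (assumes it runs in a Spark shell or job with Avro somewhere on the classpath):

    // Print the jar that provided org.apache.avro.Schema at runtime, which
    // reveals whether Spark's jars folder or the Hadoop install "won".
    val avroSource = classOf[org.apache.avro.Schema]
      .getProtectionDomain.getCodeSource.getLocation
    println(s"Avro loaded from: $avroSource")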

re: vote thread for Avro 1.9.0-RC4

2019-05-13 Thread Michael Heuer
All, FYI, in case you are not also subscribed to d...@avro.apache.org, there is a vote thread currently in progress for Avro version 1.9.0-RC4, which is binary and source incompatible with version 1.8.2 http://people.apache.org/~busbey/avro/1.9.0-RC4/1.8.2_to_1.9.0RC4_compat_report.html

Re: [VOTE] Release Apache Spark 2.4.3

2019-05-01 Thread Michael Heuer
+1 (non-binding) The binary release files are correctly built with Scala 2.11.12. Thank you, michael > On May 1, 2019, at 9:39 AM, Xiao Li wrote: > > Please vote on releasing the following candidate as Apache Spark version > 2.4.3. > > The vote is open until May 5th PST and passes if a

Re: [VOTE] Release Apache Spark 2.4.2

2019-04-26 Thread Michael Heuer
it > beyond what I mentioned below? You have a project with interdependent Python > and Scala components? > > On Fri, Apr 26, 2019 at 11:02 AM Michael Heuer wrote: > We certainly can't be the only project downstream of Spark that includes

Re: [VOTE] Release Apache Spark 2.4.2

2019-04-26 Thread Michael Heuer
e... just PySpark apps that > are using a Scala-based library? Trying to make sure we understand what is > and isn't a problem here. > > On Fri, Apr 26, 2019 at 9:44 AM Michael Heuer wrote: > This will also cause problems in Conda builds

Re: [VOTE] Release Apache Spark 2.4.2

2019-04-26 Thread Michael Heuer
This will also cause problems in Conda builds that depend on pyspark https://anaconda.org/conda-forge/pyspark and Homebrew builds that depend on apache-spark, as that also uses the binary distribution.

Re: Spark 2.4.2

2019-04-18 Thread Michael Heuer
+100 > On Apr 18, 2019, at 1:48 AM, Reynold Xin wrote: > > We should have shaded all Spark’s dependencies :( > > On Wed, Apr 17, 2019 at 11:47 PM Sean Owen > wrote: > For users that would inherit Jackson and use it directly, or whose > dependencies do. Spark itself
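
For downstream projects, the shading Reynold alludes to can at least be applied on the application side. A sketch using sbt-assembly shade rules (illustrative only; assumes the sbt-assembly plugin is enabled, and relocation has its own costs for classes that cross the Spark API boundary):

    // build.sbt sketch with sbt-assembly: relocate the application's Jackson
    // so it cannot collide with the Jackson bundled on Spark's classpath.
    assembly / assemblyShadeRules := Seq(
      ShadeRule.rename("com.fasterxml.jackson.**" -> "shadedjackson.@1").inAll
    )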

Re: [VOTE] Release Apache Spark 2.4.1 (RC6)

2019-03-10 Thread Michael Heuer
I'm not saying that this issue should be a blocker for 2.4.1; rather, I'm looking for help moving things along. I'm not a committer in any of the Spark, Parquet, or Avro projects. > On Mar 10, 2019, at 8:53 PM, Sean Owen wrote: > > From https://issues.apache.org/jira/browse/SPARK-25588, I'm

Re: [VOTE] Release Apache Spark 2.4.1 (RC6)

2019-03-10 Thread Michael Heuer
Any chance we could get some movement on this for 2.4.1? https://issues.apache.org/jira/browse/SPARK-25588 https://github.com/apache/parquet-mr/pull/560 It would require a new Parquet release,

Re: [VOTE] SPARK 2.4.0 (RC2)

2018-10-01 Thread Michael Heuer
FYI, I’ve opened two new issues against 2.4.0 RC2 https://issues.apache.org/jira/browse/SPARK-25587 https://issues.apache.org/jira/browse/SPARK-25588 that are regressions against 2.3.1, and may

Re: [VOTE] SPARK 2.3.2 (RC6)

2018-09-21 Thread Michael Heuer
+1 (non-binding) Bumping our build to Spark 2.3.2 RC6, Avro 1.8.2, and Parquet 1.8.3 works for us, running on 2.3.2 RC6 and on older Spark versions. https://github.com/bigdatagenomics/adam/pull/2055 michael On Thu, Sep 20, 2018 at 7:09 PM, Ryan Blue wrote: > Changing my vote to +1

Re: SparkR was removed from CRAN on 2018-05-01

2018-05-29 Thread Michael Heuer
A friendly request to please be transparent about the changes being requested and how those are addressed. As a downstream library that would like to get into CRAN, it is hard for us when upstream comes and goes https://github.com/bigdatagenomics/adam/issues/1851 On Tue, May 29, 2018 at 1:52 PM,

Re: Please keep s3://spark-related-packages/ alive

2018-02-27 Thread Michael Heuer
On Tue, Feb 27, 2018 at 8:17 AM, Sean Owen wrote: > See http://apache-spark-developers-list.1001551.n3.nabble.com/What-is-d3kbcqa49mib13-cloudfront-net-td22427.html -- it was 'retired', yes. > > Agree with all that, though they're intended for occasional individual use > and

Re: [VOTE] Spark 2.3.0 (RC2)

2018-02-01 Thread Michael Heuer
We found two classes new to Spark 2.3.0 that must be registered in Kryo for our tests to pass on RC2: org.apache.spark.sql.execution.datasources.BasicWriteTaskStats and org.apache.spark.sql.execution.datasources.ExecutedWriteSummary. https://github.com/bigdatagenomics/adam/pull/1897 Perhaps a mention
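
For anyone hitting the same failures, one way to register the two classes by name (a sketch; assumes a build that enforces registration via spark.kryo.registrationRequired) is the string-valued spark.kryo.classesToRegister setting, since the classes are internal to Spark:

    import org.apache.spark.SparkConf

    // Sketch: register the two Spark-internal classes by fully qualified name
    // so that runs with spark.kryo.registrationRequired=true keep passing.
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrationRequired", "true")
      .set("spark.kryo.classesToRegister",
        "org.apache.spark.sql.execution.datasources.BasicWriteTaskStats," +
          "org.apache.spark.sql.execution.datasources.ExecutedWriteSummary")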

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-05-01 Thread Michael Heuer
> On May 1, 2017, at 10:02 AM, Ryan Blue wrote:

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-05-01 Thread Michael Heuer
Version 2.2.0 bumps the dependency version for Parquet to 1.8.2 but does not bump the dependency version for Avro (currently at 1.7.7). Though perhaps not clear from the issue I reported [0], this means that Spark is internally inconsistent, in that a call through Parquet (which depends on Avro

Re: [VOTE] Release Apache Parquet 1.8.2 RC1

2017-01-24 Thread Michael Heuer
Per comment https://github.com/bigdatagenomics/adam/pull/1360#issuecomment-274681650 and Jenkins failure https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1757/HADOOP_VERSION=2.6.0,SCALAVER=2.10,SPARK_VERSION=1.5.2,label=centos/ when bumping our build to 1.8.2-rc1, unit tests succeed but we

Re: Spark 1.x/2.x qualifiers in downstream artifact names

2016-09-16 Thread Michael Heuer
described below. Relevant pull requests: https://github.com/bigdatagenomics/adam/pull/1123 https://github.com/bigdatagenomics/utils/pull/78 Thanks! michael > > On Wed, Aug 24, 2016 at 6:02 PM, Michael Heuer <heue...@gmail.com> wrote: > > Have you seen any successful applications

Re: Spark 1.x/2.x qualifiers in downstream artifact names

2016-08-24 Thread Michael Heuer
aven.apache.org/pom.html > > It has been used to ship code for Hadoop 1 vs 2 APIs. > > In a way it's the same idea as Scala's "_2.xx" naming convention, with > a less unfortunate implementation. > > > On Wed, Aug 24, 2016 at 5:41 PM, Michael Heuer <heue...@

Re: Spark 1.x/2.x qualifiers in downstream artifact names

2016-08-24 Thread Michael Heuer
Spark 2.x > and Scala 2.10 & 2.11 > > On Wed, Aug 24, 2016 at 9:41 AM, Michael Heuer <heue...@gmail.com> wrote: > >> Hello, >> >> We're a project downstream of Spark and need to provide separate >> artifacts for Spark 1.x and Spark 2.x. Has

Spark 1.x/2.x qualifiers in downstream artifact names

2016-08-24 Thread Michael Heuer
Hello, We're a project downstream of Spark and need to provide separate artifacts for Spark 1.x and Spark 2.x. Has any convention been established or even proposed for artifact names and/or qualifiers? We are currently thinking org.bdgenomics.adam:adam-{core,apis,cli}_2.1[0,1] for Spark 1.x
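
One possible scheme, shown purely as a hypothetical sketch in sbt terms (ADAM builds with Maven), is to fold a Spark qualifier into the module name next to the usual Scala binary-version suffix:

    // Hypothetical naming sketch: publish e.g. adam-core-spark2_2.11 for the
    // Spark 2.x / Scala 2.11 combination, alongside adam-core_2.10 for 1.x.
    val sparkQualifier = "-spark2"
    name := "adam-core" + sparkQualifier
    crossScalaVersions := Seq("2.10.6", "2.11.8")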