Jan 31 is pretty reasonable.

Happy Holidays!

Xiao

On Tue, Dec 24, 2019 at 5:52 AM Sean Owen <sro...@gmail.com> wrote:

> Yep, always happens. Is earlier realistic, like Jan 15? it's all arbitrary
> but indeed this has been in progress for a while, and there's a downside to
> not releasing it, to making the gap to 3.0 larger.
> On my end I don't know of anything that's holding up a release; is it
> basically DSv2?
>
> BTW these are the items still targeted to 3.0.0, some of which may not
> have been legitimately tagged. It may be worth reviewing what's still open
> and necessary, and what should be untargeted.
>
> SPARK-29768 nondeterministic expression fails column pruning
> SPARK-29345 Add an API that allows a user to define and observe arbitrary metrics on streaming queries
> SPARK-29348 Add observable metrics
> SPARK-29429 Support Prometheus monitoring natively
> SPARK-29577 Implement p-value simulation and unit tests for chi2 test
> SPARK-28900 Test Pyspark, SparkR on JDK 11 with run-tests
> SPARK-28883 Fix a flaky test: ThriftServerQueryTestSuite
> SPARK-28717 Update SQL ALTER TABLE RENAME to use TableCatalog API
> SPARK-28588 Build a SQL reference doc
> SPARK-28629 Capture the missing rules in HiveSessionStateBuilder
> SPARK-28684 Hive module support JDK 11
> SPARK-28548 explain() shows wrong result for persisted DataFrames after some operations
> SPARK-28264 Revisiting Python / pandas UDF
> SPARK-28301 fix the behavior of table name resolution with multi-catalog
> SPARK-28155 do not leak SaveMode to file source v2
> SPARK-28103 Cannot infer filters from union table with empty local relation table properly
> SPARK-27986 Support Aggregate Expressions with filter
> SPARK-28024 Incorrect numeric values when out of range
> SPARK-27936 Support local dependency uploading from --py-files
> SPARK-27780 Shuffle server & client should be versioned to enable smoother upgrade
> SPARK-27714 Support Join Reorder based on Genetic Algorithm when the # of joined tables > 12
> SPARK-27471 Reorganize public v2 catalog API
> SPARK-27520 Introduce a global config system to replace hadoopConfiguration
> SPARK-24625 put all the backward compatible behavior change configs under spark.sql.legacy.*
> SPARK-24941 Add RDDBarrier.coalesce() function
> SPARK-25017 Add test suite for ContextBarrierState
> SPARK-25083 remove the type erasure hack in data source scan
> SPARK-25383 Image data source supports sample pushdown
> SPARK-27272 Enable blacklisting of node/executor on fetch failures by default
> SPARK-27296 Efficient User Defined Aggregators
> SPARK-25128 multiple simultaneous job submissions against k8s backend cause driver pods to hang
> SPARK-26664 Make DecimalType's minimum adjusted scale configurable
> SPARK-21559 Remove Mesos fine-grained mode
> SPARK-24942 Improve cluster resource management with jobs containing barrier stage
> SPARK-25914 Separate projection from grouping and aggregate in logical Aggregate
> SPARK-20964 Make some keywords reserved along with the ANSI/SQL standard
> SPARK-26221 Improve Spark SQL instrumentation and metrics
> SPARK-26425 Add more constraint checks in file streaming source to avoid checkpoint corruption
> SPARK-25843 Redesign rangeBetween API
> SPARK-25841 Redesign window function rangeBetween API
> SPARK-25752 Add trait to easily whitelist logical operators that produce named output from CleanupAliases
> SPARK-25640 Clarify/Improve EvalType for grouped aggregate and window aggregate
> SPARK-25531 new write APIs for data source v2
> SPARK-25547 Pluggable jdbc connection factory
> SPARK-20845 Support specification of column names in INSERT INTO
> SPARK-24724 Discuss necessary info and access in barrier mode + Kubernetes
> SPARK-24725 Discuss necessary info and access in barrier mode + Mesos
> SPARK-25074 Implement maxNumConcurrentTasks() in MesosFineGrainedSchedulerBackend
> SPARK-23710 Upgrade the built-in Hive to 2.3.5 for hadoop-3.2
> SPARK-25186 Stabilize Data Source V2 API
> SPARK-25376 Scenarios we should handle but missed in 2.4 for barrier execution mode
> SPARK-7768 Make user-defined type (UDT) API public
> SPARK-14922 Alter Table Drop Partition Using Predicate-based Partition Spec
> SPARK-15694 Implement ScriptTransformation in sql/core
> SPARK-18134 SQL: MapType in Group BY and Joins not working
> SPARK-19842 Informational Referential Integrity Constraints Support in Spark
> SPARK-22231 Support of map, filter, withColumn, dropColumn in nested list of structures
> SPARK-22386 Data Source V2 improvements
> SPARK-24723 Discuss necessary info and access in barrier mode + YARN
>
>
> On Mon, Dec 23, 2019 at 5:48 PM Reynold Xin <r...@databricks.com> wrote:
>
>> We've pushed out 3.0 multiple times. The latest release window documented
>> on the website <http://spark.apache.org/versioning-policy.html> says
>> we'd code freeze and cut branch-3.0 early Dec. It looks like we are
>> suffering a bit from the tragedy of the commons, that nobody is pushing for
>> getting the release out. I understand the natural tendency for each
>> individual is to finish or extend the feature/bug that the person has been
>> working on. At some point we need to say "this is it" and get the release
>> out. I'm happy to help drive this process.
>>
>> To be realistic, I don't think we should just code freeze *today*.
>> Although we have updated the website, contributors have all been operating
>> under the assumption that all active developments are still going on. I
>> propose we *cut the branch on **Jan 31**, and code freeze and switch
>> over to bug squashing mode, and try to get the 3.0 official release out in
>> Q1*. That is, by default no new features can go into the branch starting Jan
>> 31.
>>
>> What do you think?
>>
>> And happy holidays everybody.