Yep, always happens. Is earlier realistic, like Jan 15? It's all arbitrary, but indeed this has been in progress for a while, and there's a downside to not releasing it: the gap to 3.0 keeps growing. On my end I don't know of anything that's holding up a release; is it basically DSv2?
BTW, these are the items still targeted to 3.0.0, some of which may not have been legitimately tagged. It may be worth reviewing what's still open and necessary, and what should be untargeted.

SPARK-29768 nondeterministic expression fails column pruning
SPARK-29345 Add an API that allows a user to define and observe arbitrary metrics on streaming queries
SPARK-29348 Add observable metrics
SPARK-29429 Support Prometheus monitoring natively
SPARK-29577 Implement p-value simulation and unit tests for chi2 test
SPARK-28900 Test Pyspark, SparkR on JDK 11 with run-tests
SPARK-28883 Fix a flaky test: ThriftServerQueryTestSuite
SPARK-28717 Update SQL ALTER TABLE RENAME to use TableCatalog API
SPARK-28588 Build a SQL reference doc
SPARK-28629 Capture the missing rules in HiveSessionStateBuilder
SPARK-28684 Hive module support JDK 11
SPARK-28548 explain() shows wrong result for persisted DataFrames after some operations
SPARK-28264 Revisiting Python / pandas UDF
SPARK-28301 fix the behavior of table name resolution with multi-catalog
SPARK-28155 do not leak SaveMode to file source v2
SPARK-28103 Cannot infer filters from union table with empty local relation table properly
SPARK-27986 Support Aggregate Expressions with filter
SPARK-28024 Incorrect numeric values when out of range
SPARK-27936 Support local dependency uploading from --py-files
SPARK-27780 Shuffle server & client should be versioned to enable smoother upgrade
SPARK-27714 Support Join Reorder based on Genetic Algorithm when the # of joined tables > 12
SPARK-27471 Reorganize public v2 catalog API
SPARK-27520 Introduce a global config system to replace hadoopConfiguration
SPARK-24625 put all the backward compatible behavior change configs under spark.sql.legacy.*
SPARK-24941 Add RDDBarrier.coalesce() function
SPARK-25017 Add test suite for ContextBarrierState
SPARK-25083 remove the type erasure hack in data source scan
SPARK-25383 Image data source supports sample pushdown
SPARK-27272 Enable blacklisting of node/executor on fetch failures by default
SPARK-27296 Efficient User Defined Aggregators
SPARK-25128 multiple simultaneous job submissions against k8s backend cause driver pods to hang
SPARK-26664 Make DecimalType's minimum adjusted scale configurable
SPARK-21559 Remove Mesos fine-grained mode
SPARK-24942 Improve cluster resource management with jobs containing barrier stage
SPARK-25914 Separate projection from grouping and aggregate in logical Aggregate
SPARK-20964 Make some keywords reserved along with the ANSI/SQL standard
SPARK-26221 Improve Spark SQL instrumentation and metrics
SPARK-26425 Add more constraint checks in file streaming source to avoid checkpoint corruption
SPARK-25843 Redesign rangeBetween API
SPARK-25841 Redesign window function rangeBetween API
SPARK-25752 Add trait to easily whitelist logical operators that produce named output from CleanupAliases
SPARK-25640 Clarify/Improve EvalType for grouped aggregate and window aggregate
SPARK-25531 new write APIs for data source v2
SPARK-25547 Pluggable jdbc connection factory
SPARK-20845 Support specification of column names in INSERT INTO
SPARK-24724 Discuss necessary info and access in barrier mode + Kubernetes
SPARK-24725 Discuss necessary info and access in barrier mode + Mesos
SPARK-25074 Implement maxNumConcurrentTasks() in MesosFineGrainedSchedulerBackend
SPARK-23710 Upgrade the built-in Hive to 2.3.5 for hadoop-3.2
SPARK-25186 Stabilize Data Source V2 API
SPARK-25376 Scenarios we should handle but missed in 2.4 for barrier execution mode
SPARK-7768 Make user-defined type (UDT) API public
SPARK-14922 Alter Table Drop Partition Using Predicate-based Partition Spec
SPARK-15694 Implement ScriptTransformation in sql/core
SPARK-18134 SQL: MapType in Group BY and Joins not working
SPARK-19842 Informational Referential Integrity Constraints Support in Spark
SPARK-22231 Support of map, filter, withColumn, dropColumn in nested list of structures
SPARK-22386 Data Source V2 improvements
SPARK-24723 Discuss necessary info and access in barrier mode + YARN

On Mon, Dec 23, 2019 at 5:48 PM Reynold Xin <r...@databricks.com> wrote:

> We've pushed out 3.0 multiple times. The latest release window documented
> on the website <http://spark.apache.org/versioning-policy.html> says we'd
> code freeze and cut branch-3.0 early Dec. It looks like we are suffering a
> bit from the tragedy of the commons, in that nobody is pushing for getting
> the release out. I understand the natural tendency for each individual is
> to finish or extend the feature/bug that the person has been working on.
> At some point we need to say "this is it" and get the release out. I'm
> happy to help drive this process.
>
> To be realistic, I don't think we should just code freeze *today*.
> Although we have updated the website, contributors have all been operating
> under the assumption that all active developments are still going on. I
> propose we *cut the branch on Jan 31, code freeze and switch over to bug
> squashing mode, and try to get the 3.0 official release out in Q1*. That
> is, by default no new features can go into the branch starting Jan 31.
>
> What do you think?
>
> And happy holidays everybody.