Looks nice, happy holiday, all!

Bests,
Takeshi

On Wed, Dec 25, 2019 at 3:56 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> +1 for January 31st.
>
> Bests,
> Dongjoon.
>
> On Tue, Dec 24, 2019 at 7:11 AM Xiao Li <lix...@databricks.com> wrote:
>
>> Jan 31 is pretty reasonable. Happy Holidays!
>>
>> Xiao
>>
>> On Tue, Dec 24, 2019 at 5:52 AM Sean Owen <sro...@gmail.com> wrote:
>>
>>> Yep, always happens. Is earlier realistic, like Jan 15? it's all
>>> arbitrary but indeed this has been in progress for a while, and there's a
>>> downside to not releasing it, to making the gap to 3.0 larger.
>>> On my end I don't know of anything that's holding up a release; is it
>>> basically DSv2?
>>>
>>> BTW these are the items still targeted to 3.0.0, some of which may not
>>> have been legitimately tagged. It may be worth reviewing what's still open
>>> and necessary, and what should be untargeted.
>>>
>>> SPARK-29768 nondeterministic expression fails column pruning
>>> SPARK-29345 Add an API that allows a user to define and observe arbitrary metrics on streaming queries
>>> SPARK-29348 Add observable metrics
>>> SPARK-29429 Support Prometheus monitoring natively
>>> SPARK-29577 Implement p-value simulation and unit tests for chi2 test
>>> SPARK-28900 Test Pyspark, SparkR on JDK 11 with run-tests
>>> SPARK-28883 Fix a flaky test: ThriftServerQueryTestSuite
>>> SPARK-28717 Update SQL ALTER TABLE RENAME to use TableCatalog API
>>> SPARK-28588 Build a SQL reference doc
>>> SPARK-28629 Capture the missing rules in HiveSessionStateBuilder
>>> SPARK-28684 Hive module support JDK 11
>>> SPARK-28548 explain() shows wrong result for persisted DataFrames after some operations
>>> SPARK-28264 Revisiting Python / pandas UDF
>>> SPARK-28301 fix the behavior of table name resolution with multi-catalog
>>> SPARK-28155 do not leak SaveMode to file source v2
>>> SPARK-28103 Cannot infer filters from union table with empty local relation table properly
>>> SPARK-27986 Support Aggregate Expressions with filter
>>> SPARK-28024 Incorrect numeric values when out of range
>>> SPARK-27936 Support local dependency uploading from --py-files
>>> SPARK-27780 Shuffle server & client should be versioned to enable smoother upgrade
>>> SPARK-27714 Support Join Reorder based on Genetic Algorithm when the # of joined tables > 12
>>> SPARK-27471 Reorganize public v2 catalog API
>>> SPARK-27520 Introduce a global config system to replace hadoopConfiguration
>>> SPARK-24625 put all the backward compatible behavior change configs under spark.sql.legacy.*
>>> SPARK-24941 Add RDDBarrier.coalesce() function
>>> SPARK-25017 Add test suite for ContextBarrierState
>>> SPARK-25083 remove the type erasure hack in data source scan
>>> SPARK-25383 Image data source supports sample pushdown
>>> SPARK-27272 Enable blacklisting of node/executor on fetch failures by default
>>> SPARK-27296 Efficient User Defined Aggregators
>>> SPARK-25128 multiple simultaneous job submissions against k8s backend cause driver pods to hang
>>> SPARK-26664 Make DecimalType's minimum adjusted scale configurable
>>> SPARK-21559 Remove Mesos fine-grained mode
>>> SPARK-24942 Improve cluster resource management with jobs containing barrier stage
>>> SPARK-25914 Separate projection from grouping and aggregate in logical Aggregate
>>> SPARK-20964 Make some keywords reserved along with the ANSI/SQL standard
>>> SPARK-26221 Improve Spark SQL instrumentation and metrics
>>> SPARK-26425 Add more constraint checks in file streaming source to avoid checkpoint corruption
>>> SPARK-25843 Redesign rangeBetween API
>>> SPARK-25841 Redesign window function rangeBetween API
>>> SPARK-25752 Add trait to easily whitelist logical operators that produce named output from CleanupAliases
>>> SPARK-25640 Clarify/Improve EvalType for grouped aggregate and window aggregate
>>> SPARK-25531 new write APIs for data source v2
>>> SPARK-25547 Pluggable jdbc connection factory
>>> SPARK-20845 Support specification of column names in INSERT INTO
>>> SPARK-24724 Discuss necessary info and access in barrier mode + Kubernetes
>>> SPARK-24725 Discuss necessary info and access in barrier mode + Mesos
>>> SPARK-25074 Implement maxNumConcurrentTasks() in MesosFineGrainedSchedulerBackend
>>> SPARK-23710 Upgrade the built-in Hive to 2.3.5 for hadoop-3.2
>>> SPARK-25186 Stabilize Data Source V2 API
>>> SPARK-25376 Scenarios we should handle but missed in 2.4 for barrier execution mode
>>> SPARK-7768 Make user-defined type (UDT) API public
>>> SPARK-14922 Alter Table Drop Partition Using Predicate-based Partition Spec
>>> SPARK-15694 Implement ScriptTransformation in sql/core
>>> SPARK-18134 SQL: MapType in Group BY and Joins not working
>>> SPARK-19842 Informational Referential Integrity Constraints Support in Spark
>>> SPARK-22231 Support of map, filter, withColumn, dropColumn in nested list of structures
>>> SPARK-22386 Data Source V2 improvements
>>> SPARK-24723 Discuss necessary info and access in barrier mode + YARN
>>>
>>>
>>> On Mon, Dec 23, 2019 at 5:48 PM Reynold Xin <r...@databricks.com> wrote:
>>>
>>>> We've pushed out 3.0 multiple times. The latest release window
>>>> documented on the website
>>>> <http://spark.apache.org/versioning-policy.html> says we'd code freeze
>>>> and cut branch-3.0 early Dec. It looks like we are suffering a bit from the
>>>> tragedy of the commons, that nobody is pushing for getting the release out.
>>>> I understand the natural tendency for each individual is to finish or
>>>> extend the feature/bug that the person has been working on. At some point
>>>> we need to say "this is it" and get the release out. I'm happy to help
>>>> drive this process.
>>>>
>>>> To be realistic, I don't think we should just code freeze *today*.
>>>> Although we have updated the website, contributors have all been operating
>>>> under the assumption that all active developments are still going on. I
>>>> propose we *cut the branch on **Jan 31**, and code freeze and switch
>>>> over to bug squashing mode, and try to get the 3.0 official release out in
>>>> Q1*. That is, by default no new features can go into the branch
>>>> starting Jan 31.
>>>>
>>>> What do you think?
>>>>
>>>> And happy holidays everybody.
>>>>
>>
>> --
>> [image: Databricks Summit - Watch the talks]
>> <https://databricks.com/sparkaisummit/north-america>
>>
>

--
---
Takeshi Yamamuro