Awesome Shane.

On Wed, Feb 5, 2020 at 7:29 AM, Xiao Li <lix...@databricks.com> wrote:

> Thank you, Shane!
>
> Xiao

On Tue, Feb 4, 2020 at 2:16 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Thank you, Shane! :D
>
> Bests,
> Dongjoon

On Tue, Feb 4, 2020 at 13:28 shane knapp ☠ <skn...@berkeley.edu> wrote:

> All the 3.0 builds have been created and are currently churning away!
>
> (The failed builds were due to a silly bug in the build scripts sneaking its way back in, but that's resolved now.)
>
> shane
>
> --
> Shane Knapp
> Computer Guy / Voice of Reason
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu

On Sat, Feb 1, 2020 at 6:16 PM Reynold Xin <r...@databricks.com> wrote:

> Note that branch-3.0 was cut. Please focus on testing and polish, and let's get the release out!

On Wed, Jan 29, 2020 at 3:41 PM, Reynold Xin <r...@databricks.com> wrote:

> Just a reminder - code freeze is coming this Fri!
>
> There can always be exceptions, but those should be exceptions and discussed on a case-by-case basis rather than becoming the norm.

On Tue, Dec 24, 2019 at 4:55 PM, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

> Jan 31 sounds good to me.
>
> Just curious, do we allow any exceptions to the code freeze? One case that comes to mind: a feature with multiple subtasks, where some subtasks have been merged and the remaining subtask(s) are still in review. In that case, do we allow those subtasks a few more days to get reviewed and merged?
>
> Happy Holidays!
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)

On Wed, Dec 25, 2019 at 8:36 AM Takeshi Yamamuro <linguin....@gmail.com> wrote:

> Looks nice, happy holidays, all!
>
> Bests,
> Takeshi

On Wed, Dec 25, 2019 at 3:56 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> +1 for January 31st.
>
> Bests,
> Dongjoon.

On Tue, Dec 24, 2019 at 7:11 AM Xiao Li <lix...@databricks.com> wrote:

> Jan 31 is pretty reasonable. Happy Holidays!
>
> Xiao

On Tue, Dec 24, 2019 at 5:52 AM Sean Owen <sro...@gmail.com> wrote:

> Yep, it always happens. Is earlier realistic, like Jan 15? It's all arbitrary, but indeed this has been in progress for a while, and there's a downside to not releasing: the gap to 3.0 grows larger. On my end I don't know of anything that's holding up a release; is it basically DSv2?
>
> BTW, these are the items still targeted to 3.0.0, some of which may not have been legitimately tagged. It may be worth reviewing what's still open and necessary, and what should be untargeted:
>
> SPARK-29768 Nondeterministic expression fails column pruning
> SPARK-29345 Add an API that allows a user to define and observe arbitrary metrics on streaming queries
> SPARK-29348 Add observable metrics
> SPARK-29429 Support Prometheus monitoring natively
> SPARK-29577 Implement p-value simulation and unit tests for chi2 test
> SPARK-28900 Test Pyspark, SparkR on JDK 11 with run-tests
> SPARK-28883 Fix a flaky test: ThriftServerQueryTestSuite
> SPARK-28717 Update SQL ALTER TABLE RENAME to use TableCatalog API
> SPARK-28588 Build a SQL reference doc
> SPARK-28629 Capture the missing rules in HiveSessionStateBuilder
> SPARK-28684 Hive module support JDK 11
> SPARK-28548 explain() shows wrong result for persisted DataFrames after some operations
> SPARK-28264 Revisiting Python / pandas UDF
> SPARK-28301 Fix the behavior of table name resolution with multi-catalog
> SPARK-28155 Do not leak SaveMode to file source v2
> SPARK-28103 Cannot infer filters from union table with empty local relation table properly
> SPARK-27986 Support Aggregate Expressions with filter
> SPARK-28024 Incorrect numeric values when out of range
> SPARK-27936 Support local dependency uploading from --py-files
> SPARK-27780 Shuffle server & client should be versioned to enable smoother upgrade
> SPARK-27714 Support Join Reorder based on Genetic Algorithm when the # of joined tables > 12
> SPARK-27471 Reorganize public v2 catalog API
> SPARK-27520 Introduce a global config system to replace hadoopConfiguration
> SPARK-24625 Put all the backward compatible behavior change configs under spark.sql.legacy.*
> SPARK-24941 Add RDDBarrier.coalesce() function
> SPARK-25017 Add test suite for ContextBarrierState
> SPARK-25083 Remove the type erasure hack in data source scan
> SPARK-25383 Image data source supports sample pushdown
> SPARK-27272 Enable blacklisting of node/executor on fetch failures by default
> SPARK-27296 Efficient User Defined Aggregators
> SPARK-25128 Multiple simultaneous job submissions against k8s backend cause driver pods to hang
> SPARK-26664 Make DecimalType's minimum adjusted scale configurable
> SPARK-21559 Remove Mesos fine-grained mode
> SPARK-24942 Improve cluster resource management with jobs containing barrier stage
> SPARK-25914 Separate projection from grouping and aggregate in logical Aggregate
> SPARK-20964 Make some keywords reserved along with the ANSI/SQL standard
> SPARK-26221 Improve Spark SQL instrumentation and metrics
> SPARK-26425 Add more constraint checks in file streaming source to avoid checkpoint corruption
> SPARK-25843 Redesign rangeBetween API
> SPARK-25841 Redesign window function rangeBetween API
> SPARK-25752 Add trait to easily whitelist logical operators that produce named output from CleanupAliases
> SPARK-25640 Clarify/Improve EvalType for grouped aggregate and window aggregate
> SPARK-25531 New write APIs for data source v2
> SPARK-25547 Pluggable jdbc connection factory
> SPARK-20845 Support specification of column names in INSERT INTO
> SPARK-24724 Discuss necessary info and access in barrier mode + Kubernetes
> SPARK-24725 Discuss necessary info and access in barrier mode + Mesos
> SPARK-25074 Implement maxNumConcurrentTasks() in MesosFineGrainedSchedulerBackend
> SPARK-23710 Upgrade the built-in Hive to 2.3.5 for hadoop-3.2
> SPARK-25186 Stabilize Data Source V2 API
> SPARK-25376 Scenarios we should handle but missed in 2.4 for barrier execution mode
> SPARK-7768 Make user-defined type (UDT) API public
> SPARK-14922 Alter Table Drop Partition Using Predicate-based Partition Spec
> SPARK-15694 Implement ScriptTransformation in sql/core
> SPARK-18134 SQL: MapType in Group BY and Joins not working
> SPARK-19842 Informational Referential Integrity Constraints Support in Spark
> SPARK-22231 Support of map, filter, withColumn, dropColumn in nested list of structures
> SPARK-22386 Data Source V2 improvements
> SPARK-24723 Discuss necessary info and access in barrier mode + YARN

On Mon, Dec 23, 2019 at 5:48 PM Reynold Xin <r...@databricks.com> wrote:

> We've pushed out 3.0 multiple times. The latest release window documented on the website <http://spark.apache.org/versioning-policy.html> says we'd code freeze and cut branch-3.0 in early December. It looks like we are suffering a bit from the tragedy of the commons, in that nobody is pushing to get the release out. I understand that the natural tendency for each individual is to finish or extend the feature/bug that person has been working on. At some point we need to say "this is it" and get the release out. I'm happy to help drive this process.
>
> To be realistic, I don't think we should just code freeze *today*. Although we have updated the website, contributors have all been operating under the assumption that active development is still going on. I propose we *cut the branch on Jan 31, code freeze and switch over to bug-squashing mode, and try to get the 3.0 official release out in Q1*. That is, by default no new features can go into the branch starting Jan 31.
>
> What do you think?
>
> And happy holidays everybody.