Since it's a backport from master to branch-2.3 for ORC 1.4.3, I made a
backport PR.
https://github.com/apache/spark/pull/21093
Thank you for raising this issue and confirming, Henry and Xiao. :)
Bests,
Dongjoon.
On Tue, Apr 17, 2018 at 12:01 AM, Xiao Li wrote:
> Yes,
Hello everybody
We (at the University of Zagreb and the University of Washington) have
implemented an optimization of Spark's sort-merge join (SMJ) which has
improved the performance of our jobs considerably, and we would like to know
if the Spark community thinks it would be useful to include this in the
Not a bug.
When standardization is disabled, MLlib LR still standardizes the
features internally, but it scales the coefficients back at the end (after
training has finished). So it gets the same result as training without
standardization. The purpose is to improve the rate of convergence. So
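
To illustrate the "scale the coefficients back" step, here is a minimal sketch in plain Python, not Spark code. It assumes the standardization divides each feature by its sample standard deviation without centering, and the coefficient values are made up for illustration rather than learned. It only shows that coefficients found in the scaled feature space, once divided by the per-feature standard deviation, give the same margin on the original features.

```python
import math

def stddev(column):
    """Sample standard deviation of one feature column."""
    mean = sum(column) / len(column)
    return math.sqrt(sum((v - mean) ** 2 for v in column) / (len(column) - 1))

def margin(weights, intercept, row):
    """Linear margin w . x + b that feeds the logistic function."""
    return sum(w * x for w, x in zip(weights, row)) + intercept

# Toy data: two features on very different scales (made-up values).
rows = [[1.0, 200.0], [2.0, 150.0], [3.0, 400.0], [4.0, 100.0]]

# Assumption: standardization here means dividing by the standard deviation only.
sigmas = [stddev([r[j] for r in rows]) for j in range(2)]
scaled_rows = [[x / s for x, s in zip(r, sigmas)] for r in rows]

# Hypothetical coefficients, standing in for what training on the scaled
# features would produce.
scaled_weights = [0.8, -1.5]
intercept = 0.25

# Scale the coefficients back into the original feature space.
original_weights = [w / s for w, s in zip(scaled_weights, sigmas)]

# Both parameterizations yield the same margin, hence the same predictions.
for raw, scaled in zip(rows, scaled_rows):
    a = margin(scaled_weights, intercept, scaled)
    b = margin(original_weights, intercept, raw)
    assert math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
```

The optimizer benefits from features on comparable scales, while the reported coefficients still refer to the original, unscaled features.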
Yea definitely not. The only requirement is that the DataReader/WriterFactory
must support at least one DataFormat.
> how are we going to express capability of the given reader of its
supported format(s), or specific support for each of “real-time data in row
format, and history data in columnar