Re: [VOTE] Apache Spark 2.1.1 (RC4)

2017-04-28 Thread Felix Cheung
+1. Tested R on Linux and Windows. The previous issue with building vignettes on Windows (a stack overflow in ALS) still reproduces, but as confirmed the issue was already present in 2.1.0, so this isn't a regression (and hope for the best on CRAN...). https://issues.apache.org/jira/browse/SPARK-20402

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-04-28 Thread Koert Kuipers
we have been testing the 2.2.0 snapshots in the last few weeks for in-house unit tests, integration tests and real workloads and we are very happy with it. the only issue i had so far (some encoders not being serializable anymore) has already been dealt with by wenchen. On Thu, Apr 27, 2017 at 6:49

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-04-28 Thread Kazuaki Ishizaki
+1 (non-binding) I tested it on Ubuntu 16.04 and OpenJDK8 on ppc64le. All of the tests for core have passed.. $ java -version openjdk version "1.8.0_111" OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-2ubuntu0.16.04.2-b14) OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode) $

Re: [VOTE] Apache Spark 2.1.1 (RC4)

2017-04-28 Thread Denny Lee
+1 On Fri, Apr 28, 2017 at 9:17 AM Kazuaki Ishizaki wrote: > +1 (non-binding) > > I tested it on Ubuntu 16.04 and OpenJDK8 on ppc64le. All of the tests for > core have passed.. > > $ java -version > openjdk version "1.8.0_111" > OpenJDK Runtime Environment (build >

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-04-28 Thread Koert Kuipers
this is about column names containing dots that do not target fields inside structs? so not a.b as in field b inside struct a, but somehow a field called a.b? i didn't even know it is supported at all. it's something i would never try because it sounds like a bad idea to go there... On Fri, Apr 28, 2017
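
The distinction raised in this message (a.b as field b inside struct a, versus a single top-level column literally named a.b) can be illustrated with a small sketch. This is plain Python and not Spark's resolver: the helper names `split_path` and `resolve` and the sample `row` are hypothetical, and the backtick convention is only modeled on the way Spark SQL generally lets an identifier be quoted.

```python
from typing import Any, List, Mapping


def split_path(name: str) -> List[str]:
    """Split a column reference on dots, except dots inside backticks."""
    parts: List[str] = []
    buf: List[str] = []
    quoted = False
    for ch in name:
        if ch == "`":
            quoted = not quoted          # toggle quoting, drop the backtick
        elif ch == "." and not quoted:
            parts.append("".join(buf))   # unquoted dot ends a path segment
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return parts


def resolve(row: Mapping[str, Any], name: str) -> Any:
    """Walk the row one path segment at a time (hypothetical toy resolver)."""
    value: Any = row
    for part in split_path(name):
        value = value[part]
    return value


# Hypothetical row holding both a struct "a" and a column literally named "a.b".
row = {"a": {"b": 1}, "a.b": 2}

nested = resolve(row, "a.b")     # field b inside struct a
literal = resolve(row, "`a.b`")  # top-level column named "a.b"
```

In this toy model the unquoted reference always means the nested field, and the dotted column is only reachable when quoted, which is why mixing the two in one schema is easy to get wrong.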

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-04-28 Thread Andrew Ash
-1 due to regression from 2.1.1. In 2.2.0-rc1 we bumped the Parquet version from 1.8.1 to 1.8.2 in commit 26a4cba3ff. Parquet 1.8.2 includes a backport from 1.9.0: PARQUET-389 in commit

Re: [VOTE] Apache Spark 2.1.1 (RC4)

2017-04-28 Thread Kazuaki Ishizaki
+1 (non-binding) I tested it on Ubuntu 16.04 and OpenJDK8 on ppc64le. All of the tests for core have passed.. $ java -version openjdk version "1.8.0_111" OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-2ubuntu0.16.04.2-b14) OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode) $

Re: [VOTE] Apache Spark 2.1.1 (RC4)

2017-04-28 Thread Tom Graves
+1 Tom Graves On Thursday, April 27, 2017 5:37 PM, vaquar khan wrote: +1  Regards, Vaquar khan On Apr 27, 2017 4:11 PM, "Holden Karau" wrote: +1 (non-binding) PySpark packaging issue from the earlier RC seems to have been fixed. On Thu,

RandomForest caching

2017-04-28 Thread madhu phatak
Hi, I am testing RandomForestClassification with 50 GB of data which is cached in memory. I have 64 GB of RAM, of which 28 GB is used for caching the original dataset. When I run random forest, it caches around 300 GB of intermediate data, which uncaches the original dataset. This caching is triggered