GitHub user zhuangxue opened a pull request:
https://github.com/apache/spark/pull/16188
Branch 1.6 decision tree
What algorithm is used in the Spark decision tree (ID3, C4.5, or CART)?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/spark branch-1.6
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16188.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16188
commit 4c28b4c8f342fde937ff77ab30f898dfe3186c03
Author: Gabriele Nizzoli
Date: 2016-02-02T18:57:18Z
[SPARK-13121][STREAMING] java mapWithState mishandles scala Option
The Java mapWithState overload that takes a Function3 performs the wrong
conversion of a Java `Optional` to a Scala `Option`. The fixed code uses the
same conversion as the mapWithState overload that takes a Function4.
`Optional.fromNullable(v.get)` fails if v is `None`; it is better to use
`JavaUtils.optionToOptional(v)` instead.
Author: Gabriele Nizzoli
Closes #11007 from gabrielenizzoli/branch-1.6.
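The failure mode in this commit is the generic "unwrap, then re-wrap" anti-pattern: calling `.get` on an empty container throws before the conversion ever runs. A minimal sketch of the same pattern using `java.util.Optional` as a stand-in (Spark's code uses Guava's `Optional` and Scala's `Option`; the class and variable names here are illustrative only):

```java
import java.util.NoSuchElementException;
import java.util.Optional;

public class OptionConversionDemo {
    public static void main(String[] args) {
        Optional<String> none = Optional.empty();

        // Buggy pattern from the report: unwrapping the value before re-wrapping
        // throws as soon as the container is empty (the `None` case).
        try {
            Optional.ofNullable(none.get());
        } catch (NoSuchElementException e) {
            System.out.println("unwrap-first fails: " + e.getClass().getSimpleName());
        }

        // Safe pattern: convert the container as a whole, so an empty input simply
        // stays empty -- the same idea as using JavaUtils.optionToOptional(v).
        Optional<String> converted = none.map(String::toUpperCase);
        System.out.println("container-level conversion: present=" + converted.isPresent());
    }
}
```

The safe version never touches the wrapped value, so the empty case costs nothing and cannot throw.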
commit 9c0cf22f7681ae05d894ae05f6a91a9467787519
Author: Grzegorz Chilkiewicz
Date: 2016-02-02T19:16:24Z
[SPARK-12711][ML] ML StopWordsRemover does not protect itself from column
name duplication
Fixes the problem and verifies the fix with a test suite.
Also adds an optional parameter, nullable (Boolean), to
SchemaUtils.appendColumn
and deduplicates the SchemaUtils.appendColumn functions.
Author: Grzegorz Chilkiewicz
Closes #10741 from grzegorz-chilkiewicz/master.
(cherry picked from commit b1835d727234fdff42aa8cadd17ddcf43b0bed15)
Signed-off-by: Joseph K. Bradley
commit 3c92333ee78f249dae37070d3b6558b9c92ec7f4
Author: Daoyuan Wang
Date: 2016-02-02T19:09:40Z
[SPARK-13056][SQL] map column would throw NPE if value is null
Jira:
https://issues.apache.org/jira/browse/SPARK-13056
Create a map like
{ "a": "somestring", "b": null }
and run a query like
SELECT col["b"] FROM t1;
An NPE would be thrown.
Author: Daoyuan Wang
Closes #10964 from adrian-wang/npewriter.
(cherry picked from commit 358300c795025735c3b2f96c5447b1b227d4abc1)
Signed-off-by: Michael Armbrust
Conflicts:
sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
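The bug above is the usual null-value dereference: the lookup itself succeeds, but downstream code assumes the value is non-null. A small Java analogue of the reported map and the null-safe behavior the fix restores (plain `HashMap` standing in for a SQL map column; names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class NullMapValueDemo {
    public static void main(String[] args) {
        // Analogue of the reported map: { "a": "somestring", "b": null }
        Map<String, String> col = new HashMap<>();
        col.put("a", "somestring");
        col.put("b", null);

        // Dereferencing the looked-up value without a null check NPEs on
        // col["b"], just as the reported query did.
        try {
            System.out.println(col.get("b").length());
        } catch (NullPointerException e) {
            System.out.println("lookup of b: NPE");
        }

        // Null-safe handling: propagate NULL instead of dereferencing,
        // which is the behavior the fix restores for map columns.
        String v = col.get("b");
        System.out.println("lookup of b: " + (v == null ? "NULL" : v));
    }
}
```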
commit e81333be05cc5e2a41e5eb1a630c5af59a47dd23
Author: Kevin (Sangwoo) Kim
Date: 2016-02-02T21:24:09Z
[DOCS] Update StructType.scala
The example will throw an error like
:20: error: not found: value StructType
Need to add this line:
import org.apache.spark.sql.types._
Author: Kevin (Sangwoo) Kim
Closes #10141 from swkimme/patch-1.
(cherry picked from commit b377b03531d21b1d02a8f58b3791348962e1f31b)
Signed-off-by: Michael Armbrust
commit 2f8abb4afc08aa8dc4ed763bcb93ff6b1d6f0d78
Author: Adam Budde
Date: 2016-02-03T03:35:33Z
[SPARK-13122] Fix race condition in MemoryStore.unrollSafely()
https://issues.apache.org/jira/browse/SPARK-13122
A race condition can occur in MemoryStore's unrollSafely() method if two
threads that return the same value for currentTaskAttemptId() execute this
method concurrently. This change makes the operation of reading the initial
amount of unroll memory used, performing the unroll, and updating the
associated memory maps atomic, in order to avoid this race condition.
The initial proposed fix wraps all of unrollSafely() in a
memoryManager.synchronized { } block. A cleaner approach might be to
introduce a mechanism that synchronizes based on task attempt ID. An
alternative option might be to track unroll/pending-unroll memory based on
block ID rather than task attempt ID.
Author: Adam Budde
Closes #11012 from budde/master.
(cherry picked from commit ff71261b651a7b289ea2312abd6075da8b838ed9)
Signed-off-by: Andrew Or
Conflicts:
core/src/main/scala/org/apache/spark/storage/MemoryStore.scala
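The race described above is a classic read-modify-write interleaving: two threads read the same initial memory value, and one update is lost. A minimal Java sketch of the synchronized-block fix pattern (the lock object and field names are illustrative, not Spark's actual code):

```java
public class UnrollSyncDemo {
    private long unrollMemoryUsed = 0;
    private final Object lock = new Object(); // stand-in for the shared memory manager

    // Fix pattern from the commit: read the current amount and write the
    // update inside one synchronized block, so concurrent callers that share
    // the same key cannot interleave between the read and the write.
    void reserve(long delta) {
        synchronized (lock) {
            long current = unrollMemoryUsed;   // read ...
            unrollMemoryUsed = current + delta; // ... and update, atomically
        }
    }

    public static void main(String[] args) throws InterruptedException {
        UnrollSyncDemo store = new UnrollSyncDemo();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10000; j++) store.reserve(1);
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        // Without the lock, lost updates would leave this below 40000.
        System.out.println(store.unrollMemoryUsed);
    }
}
```

Synchronizing per task attempt ID or per block ID, as the commit suggests, would narrow the lock's scope but requires a keyed lock map rather than the single monitor shown here.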
commit 5fe8796c2fa859e30cf5ba293bee8957e23163bc
Author: Mario Briggs
Date: 2016-02-03T17:50:28Z
[SPARK-12739][STREAMING] Details of batch in Streaming tab uses two
Duration columns
I have clearly prefixed the two 'Duration' columns in the 'Details of Batch'
Streaming tab as 'Output Op Duration' and 'Job Duration'.
Author: Mario Briggs
Author: mariobriggs
Closes #11022 from