spark git commit: [SPARK-17717][SQL] Add exist/find methods to Catalog.

2016-09-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2f7395670 -> 74ac1c438 [SPARK-17717][SQL] Add exist/find methods to Catalog. ## What changes were proposed in this pull request? The current user facing catalog does not implement methods for checking object existence or finding objects.

spark git commit: Updated the following PR with minor changes to allow cherry-pick to branch-2.0

2016-09-29 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-2.0 0cdd7370a -> a99ea4c9e Updated the following PR with minor changes to allow cherry-pick to branch-2.0 [SPARK-17697][ML] Fixed bug in summary calculations that pattern match against label without casting In calling

spark git commit: [SPARK-17697][ML] Fixed bug in summary calculations that pattern match against label without casting

2016-09-29 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 39eb3bb1e -> 2f7395670 [SPARK-17697][ML] Fixed bug in summary calculations that pattern match against label without casting ## What changes were proposed in this pull request? In calling LogisticRegression.evaluate and

spark git commit: [SPARK-17412][DOC] All test should not be run by `root` or any admin user

2016-09-29 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 3993ebca2 -> 39eb3bb1e [SPARK-17412][DOC] All test should not be run by `root` or any admin user ## What changes were proposed in this pull request? `FsHistoryProviderSuite` fails if `root` user runs it. The test case **SPARK-3697:

spark git commit: [SPARK-17676][CORE] FsHistoryProvider should ignore hidden files

2016-09-29 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 29396e7d1 -> 3993ebca2 [SPARK-17676][CORE] FsHistoryProvider should ignore hidden files ## What changes were proposed in this pull request? FsHistoryProvider was writing a hidden file (to check the fs's clock). Even though it deleted the

spark git commit: [SPARK-17721][MLLIB][ML] Fix for multiplying transposed SparseMatrix with SparseVector

2016-09-29 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 4ecc648ad -> 29396e7d1 [SPARK-17721][MLLIB][ML] Fix for multiplying transposed SparseMatrix with SparseVector ## What changes were proposed in this pull request? * changes the implementation of gemv with transposed SparseMatrix and

spark git commit: [SPARK-17721][MLLIB][ML] Fix for multiplying transposed SparseMatrix with SparseVector

2016-09-29 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-2.0 7c9450b00 -> 0cdd7370a [SPARK-17721][MLLIB][ML] Fix for multiplying transposed SparseMatrix with SparseVector ## What changes were proposed in this pull request? * changes the implementation of gemv with transposed SparseMatrix and

spark git commit: [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQL syntax

2016-09-29 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 566d7f282 -> 4ecc648ad [SPARK-17612][SQL] Support `DESCRIBE table PARTITION` SQL syntax ## What changes were proposed in this pull request? This PR implements `DESCRIBE table PARTITION` SQL Syntax again. It was supported until Spark

spark git commit: [SPARK-17653][SQL] Remove unnecessary distincts in multiple unions

2016-09-29 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master fe33121a5 -> 566d7f282 [SPARK-17653][SQL] Remove unnecessary distincts in multiple unions ## What changes were proposed in this pull request? Currently, for `Union [Distinct]`, a `Distinct` operator must be placed on top of

spark git commit: [SPARK-17699] Support for parsing JSON string columns

2016-09-29 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 027dea8f2 -> fe33121a5 [SPARK-17699] Support for parsing JSON string columns Spark SQL has great support for reading text files that contain JSON data. However, in many cases the JSON data is just one column amongst others. This is

spark git commit: [SPARK-17715][SCHEDULER] Make task launch logs DEBUG

2016-09-29 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master cb87b3ced -> 027dea8f2 [SPARK-17715][SCHEDULER] Make task launch logs DEBUG ## What changes were proposed in this pull request? Ramp down the task launch logs from INFO to DEBUG. Task launches can happen orders of magnitude more than

spark git commit: [SPARK-17672] Spark 2.0 history server web Ui takes too long for a single application

2016-09-29 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 f7839e47c -> 7c9450b00 [SPARK-17672] Spark 2.0 history server web Ui takes too long for a single application Added a new API getApplicationInfo(appId: String) in class ApplicationHistoryProvider and class SparkUI to get app info. In

spark git commit: [SPARK-17672] Spark 2.0 history server web Ui takes too long for a single application

2016-09-29 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 7f779e743 -> cb87b3ced [SPARK-17672] Spark 2.0 history server web Ui takes too long for a single application Added a new API getApplicationInfo(appId: String) in class ApplicationHistoryProvider and class SparkUI to get app info. In this

spark git commit: [SPARK-17648][CORE] TaskScheduler really needs offers to be an IndexedSeq

2016-09-29 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 958200497 -> 7f779e743 [SPARK-17648][CORE] TaskScheduler really needs offers to be an IndexedSeq ## What changes were proposed in this pull request? The Seq[WorkerOffer] is accessed by index, so it really should be an IndexedSeq,

spark git commit: [SPARK-17712][SQL] Fix invalid pushdown of data-independent filters beneath aggregates

2016-09-29 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 7ffafa3bf -> f7839e47c [SPARK-17712][SQL] Fix invalid pushdown of data-independent filters beneath aggregates ## What changes were proposed in this pull request? This patch fixes a minor correctness issue impacting the pushdown of

spark git commit: [SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown predicates correctly in non-deterministic condition.

2016-09-29 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-2.0 ca8130050 -> 7ffafa3bf [SPARK-16343][SQL] Improve the PushDownPredicate rule to pushdown predicates correctly in non-deterministic condition. ## What changes were proposed in this pull request? Currently our Optimizer may reorder the

spark git commit: [DOCS] Reorganize explanation of Accumulators and Broadcast Variables

2016-09-29 Thread vanzin
Repository: spark Updated Branches: refs/heads/master b2e9731ca -> 958200497 [DOCS] Reorganize explanation of Accumulators and Broadcast Variables ## What changes were proposed in this pull request? The discussion of the interaction of Accumulators and Broadcast Variables should logically

spark git commit: [MINOR][DOCS] Fix th doc. of spark-streaming with kinesis

2016-09-29 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.0 7d612a7d5 -> ca8130050 [MINOR][DOCS] Fix th doc. of spark-streaming with kinesis ## What changes were proposed in this pull request? This PR only fixes the `spark-kinesis-integration` document. Since `SPARK-17418` prevented all

spark git commit: [MINOR][DOCS] Fix th doc. of spark-streaming with kinesis

2016-09-29 Thread srowen
Repository: spark Updated Branches: refs/heads/master b35b0dbbf -> b2e9731ca [MINOR][DOCS] Fix th doc. of spark-streaming with kinesis ## What changes were proposed in this pull request? This PR only fixes the `spark-kinesis-integration` document. Since `SPARK-17418` prevented all the

spark git commit: [SPARK-17614][SQL] sparkSession.read() .jdbc(***) use the sql syntax "where 1=0" that Cassandra does not support

2016-09-29 Thread srowen
Repository: spark Updated Branches: refs/heads/master f7082ac12 -> b35b0dbbf [SPARK-17614][SQL] sparkSession.read() .jdbc(***) use the sql syntax "where 1=0" that Cassandra does not support ## What changes were proposed in this pull request? Use dialect's table-exists query rather than

spark git commit: [SPARK-17704][ML][MLLIB] ChiSqSelector performance improvement.

2016-09-29 Thread yliang
Repository: spark Updated Branches: refs/heads/master a19a1bb59 -> f7082ac12 [SPARK-17704][ML][MLLIB] ChiSqSelector performance improvement. ## What changes were proposed in this pull request? Several performance improvements for `ChiSqSelector`: 1. Keep `selectedFeatures` ordered

spark git commit: [SPARK-16356][FOLLOW-UP][ML] Enforce ML test of exception for local/distributed Dataset.

2016-09-29 Thread yliang
Repository: spark Updated Branches: refs/heads/master 37eb9184f -> a19a1bb59 [SPARK-16356][FOLLOW-UP][ML] Enforce ML test of exception for local/distributed Dataset. ## What changes were proposed in this pull request? #14035 added `testImplicits` to ML unit tests and promoted