Repository: spark
Updated Branches:
refs/heads/master 1115e8e73 - a3afa4a1b
SPARK-5815 [MLLIB] Part 2. Deprecate SVDPlusPlus APIs that expose DoubleMatrix
from JBLAS
This deprecates runSVDPlusPlus and updates run, for 1.4.0 / master only
Author: Sean Owen so...@cloudera.com
Closes #4625
Repository: spark
Updated Branches:
refs/heads/master a3afa4a1b - 5c78be7a5
[SPARK-5799][SQL] Compute aggregation function on specified numeric columns
Compute aggregation function on specified numeric columns. For example:
val df = Seq(("a", 1, 0, "b"), ("b", 2, 4, "c"), ("a", 2, 3,
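The idea can be sketched in plain Python (the Scala snippet above is truncated; the column indices, data, and helper name below are illustrative, not Spark's actual DataFrame API):

```python
from collections import defaultdict

def agg_numeric(rows, key_idx, agg_cols, fn):
    """Group rows by the key column, then apply fn only to the
    requested numeric columns within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_idx]].append(row)
    return {
        key: {c: fn(r[c] for r in grp) for c in agg_cols}
        for key, grp in groups.items()
    }

rows = [("a", 1, 0, "b"), ("b", 2, 4, "c"), ("a", 2, 3, "d")]
print(agg_numeric(rows, key_idx=0, agg_cols=[1, 2], fn=sum))
# {'a': {1: 3, 2: 3}, 'b': {1: 2, 2: 4}}
```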
Repository: spark
Updated Branches:
refs/heads/master 275a0c081 - 104b2c458
[SQL] Initial support for reporting location of error in sql string
Author: Michael Armbrust mich...@databricks.com
Closes #4587 from marmbrus/position and squashes the following commits:
0810052 [Michael Armbrust]
Repository: spark
Updated Branches:
refs/heads/branch-1.3 c2eaaea9f - 63fa123f1
[SQL] Initial support for reporting location of error in sql string
Author: Michael Armbrust mich...@databricks.com
Closes #4587 from marmbrus/position and squashes the following commits:
0810052 [Michael
Repository: spark
Updated Branches:
refs/heads/branch-1.3 1a8895560 - c2eaaea9f
[SPARK-5824] [SQL] add null format in ctas and set default col comment to null
Author: Daoyuan Wang daoyuan.w...@intel.com
Closes #4609 from adrian-wang/ctas and squashes the following commits:
0a75d5a [Daoyuan
Repository: spark
Updated Branches:
refs/heads/branch-1.3 fef2267cd - 1a8895560
[SQL] [Minor] Update the SpecificMutableRow.copy
When profiling the Join / Aggregate queries via VisualVM, I noticed that lots of
`SpecificMutableRow` objects were created, as well as `MutableValue` objects, since the
Repository: spark
Updated Branches:
refs/heads/master 8e25373ce - cc552e042
[SQL] [Minor] Update the SpecificMutableRow.copy
When profiling the Join / Aggregate queries via VisualVM, I noticed that lots of
`SpecificMutableRow` objects were created, as well as `MutableValue` objects, since the
Repository: spark
Updated Branches:
refs/heads/master 9baac56cc - 8e25373ce
SPARK-5795 [STREAMING] api.java.JavaPairDStream.saveAsNewAPIHadoopFiles may not
be friendly to Java
Revise the JavaPairDStream API declaration of the saveAs Hadoop methods, to allow
them to be called directly as intended.
CC
Repository: spark
Updated Branches:
refs/heads/branch-1.3 63fa123f1 - 0368494c5
[SQL] Add fetched row count in SparkSQLCLIDriver
before this change:
```scala
Time taken: 0.619 seconds
```
after this change:
```scala
Time taken: 0.619 seconds, Fetched: 4 row(s)
```
Author: OopsOutOfMemory
Repository: spark
Updated Branches:
refs/heads/master cc552e042 - 275a0c081
[SPARK-5824] [SQL] add null format in ctas and set default col comment to null
Author: Daoyuan Wang daoyuan.w...@intel.com
Closes #4609 from adrian-wang/ctas and squashes the following commits:
0a75d5a [Daoyuan
Repository: spark
Updated Branches:
refs/heads/master 5c78be7a5 - 9baac56cc
Minor fixes for commit https://github.com/apache/spark/pull/4592.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9baac56c
Tree:
Repository: spark
Updated Branches:
refs/heads/branch-1.3 0165e9d13 - fef2267cd
SPARK-5795 [STREAMING] api.java.JavaPairDStream.saveAsNewAPIHadoopFiles may not
be friendly to Java
Revise the JavaPairDStream API declaration of the saveAs Hadoop methods, to allow
them to be called directly as intended.
Repository: spark
Updated Branches:
refs/heads/master 104b2c458 - b4d7c7032
[SQL] Add fetched row count in SparkSQLCLIDriver
before this change:
```scala
Time taken: 0.619 seconds
```
after this change:
```scala
Time taken: 0.619 seconds, Fetched: 4 row(s)
```
Author: OopsOutOfMemory
Repository: spark
Updated Branches:
refs/heads/master b4d7c7032 - 6f54dee66
[SPARK-5296] [SQL] Add more filter types for data sources API
This PR adds the following filter types for the data sources API:
- `IsNull`
- `IsNotNull`
- `Not`
- `And`
- `Or`
The code which converts Catalyst predicate
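A minimal sketch of such a filter algebra, written in plain Python rather than the actual Scala filter classes of the data sources API, with a small interpreter over rows (names mirror the list above; the evaluator is illustrative):

```python
from dataclasses import dataclass

class Filter:
    pass

@dataclass
class IsNull(Filter):
    attr: str

@dataclass
class IsNotNull(Filter):
    attr: str

@dataclass
class Not(Filter):
    child: Filter

@dataclass
class And(Filter):
    left: Filter
    right: Filter

@dataclass
class Or(Filter):
    left: Filter
    right: Filter

def evaluate(f, row):
    """Evaluate a filter against a row (a dict; None models SQL NULL)."""
    if isinstance(f, IsNull):
        return row.get(f.attr) is None
    if isinstance(f, IsNotNull):
        return row.get(f.attr) is not None
    if isinstance(f, Not):
        return not evaluate(f.child, row)
    if isinstance(f, And):
        return evaluate(f.left, row) and evaluate(f.right, row)
    if isinstance(f, Or):
        return evaluate(f.left, row) or evaluate(f.right, row)
    raise TypeError(f"unknown filter: {f}")

print(evaluate(And(IsNotNull("name"), IsNull("age")), {"name": "x", "age": None}))  # True
```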
Repository: spark
Updated Branches:
refs/heads/branch-1.3 0368494c5 - 363a9a7d5
[SPARK-5296] [SQL] Add more filter types for data sources API
This PR adds the following filter types for the data sources API:
- `IsNull`
- `IsNotNull`
- `Not`
- `And`
- `Or`
The code which converts Catalyst
Repository: spark
Updated Branches:
refs/heads/master bb05982dd - c01c4ebcf
SPARK-5357: Update commons-codec version to 1.10 (current)
Resolves https://issues.apache.org/jira/browse/SPARK-5357
In commons-codec 1.5, Base64 instances are not thread-safe; that was the case only
in versions 1.4-1.6.
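The classic workaround for a codec whose instances are not thread-safe, short of upgrading, is to give each thread its own instance. Illustrated here in Python with a hypothetical stateful encoder (Python's own base64 module is stateless, so the `StatefulEncoder` below is purely a stand-in):

```python
import threading

class StatefulEncoder:
    """Stand-in for a codec whose instances keep internal state."""
    def __init__(self):
        self.buffer = []

    def encode(self, data):
        self.buffer.append(data)  # instance state: unsafe to share across threads
        return data.encode().hex()

_local = threading.local()

def encode(data):
    # One encoder per thread: no shared mutable state, no races.
    if not hasattr(_local, "enc"):
        _local.enc = StatefulEncoder()
    return _local.enc.encode(data)

print(encode("spark"))  # '737061726b'
```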
Repository: spark
Updated Branches:
refs/heads/branch-1.3 363a9a7d5 - 864d77e0d
[SPARK-5833] [SQL] Adds REFRESH TABLE command
Lifts `HiveMetastoreCatalog.refreshTable` to `Catalog`. Adds `RefreshTable`
command to refresh (possibly cached) metadata in external data sources tables.
Repository: spark
Updated Branches:
refs/heads/master c51ab37fa - bb05982dd
SPARK-5841: remove DiskBlockManager shutdown hook on stop
After a call to stop, the shutdown hook is redundant, and causes a
memory leak.
Author: Matt Whelan mwhe...@perka.com
Closes #4627 from MattWhelan/SPARK-5841
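The pattern can be sketched in plain Python (this is not Spark's actual DiskBlockManager; `atexit` stands in for the JVM shutdown-hook mechanism): once an explicit stop() has done the cleanup, the hook is deregistered so it no longer pins the object in memory.

```python
import atexit

class DiskManager:
    def __init__(self):
        self.stopped = False
        # The hook guarantees cleanup on abnormal exit...
        atexit.register(self._cleanup)

    def _cleanup(self):
        self.stopped = True

    def stop(self):
        self._cleanup()
        # ...but after an explicit stop() it is redundant and, by holding a
        # reference to self, would leak this instance. Remove it.
        atexit.unregister(self._cleanup)

m = DiskManager()
m.stop()
print(m.stopped)  # True
```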
Repository: spark
Updated Branches:
refs/heads/branch-1.3 864d77e0d - dd977dfed
SPARK-5841: remove DiskBlockManager shutdown hook on stop
After a call to stop, the shutdown hook is redundant, and causes a
memory leak.
Author: Matt Whelan mwhe...@perka.com
Closes #4627 from
Repository: spark
Updated Branches:
refs/heads/branch-1.2 0df26bb97 - 432ceca2a
HOTFIX: Style issue causing build break
Caused by #4601
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/432ceca2
Tree:
Repository: spark
Updated Branches:
refs/heads/master 1294a6e01 - b1bd1dd32
[SPARK-5788] [PySpark] capture the exception in python write thread
An exception in the Python writer thread will shut down the executor.
Author: Davies Liu dav...@databricks.com
Closes #4577 from davies/exception and
Repository: spark
Updated Branches:
refs/heads/branch-1.3 52994d83b - c2a9a6176
[SPARK-5788] [PySpark] capture the exception in python write thread
An exception in the Python writer thread will shut down the executor.
Author: Davies Liu dav...@databricks.com
Closes #4577 from davies/exception and
Repository: spark
Updated Branches:
refs/heads/master c01c4ebcf - 0cfda8461
[SPARK-2313] Use socket to communicate GatewayServer port back to Python driver
This patch changes PySpark so that the GatewayServer's port is communicated
back to the Python process that launches it over a local
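The mechanism can be sketched with plain sockets (names and the port value are illustrative, not PySpark's actual code): the launcher listens on an ephemeral loopback port, and the launched process connects back and writes the gateway's port over that connection.

```python
import socket
import struct
import threading

def launcher():
    # Parent: listen on an ephemeral loopback port and wait for the child.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    callback_port = srv.getsockname()[1]

    def child(gateway_port):
        # Child: connect back and report the gateway's port as 4 big-endian bytes.
        with socket.create_connection(("127.0.0.1", callback_port)) as s:
            s.sendall(struct.pack("!i", gateway_port))

    t = threading.Thread(target=child, args=(25333,))
    t.start()
    conn, _ = srv.accept()
    data = conn.recv(4)
    conn.close()
    srv.close()
    t.join()
    return struct.unpack("!i", data)[0]

print(launcher())  # 25333
```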
Repository: spark
Updated Branches:
refs/heads/branch-1.3 8c4561984 - b70b8ba0a
[SPARK-2313] Use socket to communicate GatewayServer port back to Python driver
This patch changes PySpark so that the GatewayServer's port is communicated
back to the Python process that launches it over a local
Repository: spark
Updated Branches:
refs/heads/branch-1.3 b70b8ba0a - ad8fd4fb3
HOTFIX: Break in Jekyll build from #4589
That patch had a line break in the middle of a {{ }} expression, which is not
allowed.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/master 0cfda8461 - 04b401da8
HOTFIX: Break in Jekyll build from #4589
That patch had a line break in the middle of a {{ }} expression, which is not
allowed.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/master 04b401da8 - 5b6cd65cd
[SPARK-5746][SQL] Check invalid cases for the write path of data source API
JIRA: https://issues.apache.org/jira/browse/SPARK-5746
liancheng marmbrus
Author: Yin Huai yh...@databricks.com
Closes #4617 from
Repository: spark
Updated Branches:
refs/heads/master f3ff1eb29 - cb6c48c87
[SQL] Optimize arithmetic and predicate operators
Existing implementation of arithmetic operators and BinaryComparison operators
have redundant type checking codes, e.g.:
Expression.n2 is used by
Repository: spark
Updated Branches:
refs/heads/master cb6c48c87 - e189cbb05
[SPARK-4865][SQL]Include temporary tables in SHOW TABLES
This PR adds a `ShowTablesCommand` to support the `SHOW TABLES [IN databaseName]`
SQL command. The result of `SHOW TABLES` has two columns, `tableName` and
Repository: spark
Updated Branches:
refs/heads/branch-1.3 639a3c2fd - 8a94bf76b
[SPARK-4865][SQL]Include temporary tables in SHOW TABLES
This PR adds a `ShowTablesCommand` to support the `SHOW TABLES [IN databaseName]`
SQL command. The result of `SHOW TABLES` has two columns, `tableName` and
Repository: spark
Updated Branches:
refs/heads/master e189cbb05 - 1294a6e01
SPARK-5848: tear down the ConsoleProgressBar timer
The timer is a GC root, and failing to terminate it leaks SparkContext
instances.
Author: Matt Whelan mwhe...@perka.com
Closes #4635 from MattWhelan/SPARK-5848 and
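In plain Python the same leak pattern and its fix look like this (`threading.Timer` stands in for the progress bar's refresh timer; cancelling it lets the object holding the callback be collected):

```python
import threading

class ConsoleProgressBar:
    def __init__(self):
        self.refresh_count = 0
        # A live timer thread is a GC root: via its callback it keeps
        # `self` (and everything `self` references) reachable.
        self.timer = threading.Timer(3600, self.refresh)
        self.timer.start()

    def refresh(self):
        self.refresh_count += 1

    def stop(self):
        # Tear the timer down so the owning context can be collected.
        self.timer.cancel()
        self.timer.join()

bar = ConsoleProgressBar()
bar.stop()
print(bar.refresh_count)  # 0: the timer never fired
```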
Repository: spark
Updated Branches:
refs/heads/branch-1.3 8a94bf76b - 52994d83b
SPARK-5848: tear down the ConsoleProgressBar timer
The timer is a GC root, and failing to terminate it leaks SparkContext
instances.
Author: Matt Whelan mwhe...@perka.com
Closes #4635 from MattWhelan/SPARK-5848
Repository: spark
Updated Branches:
refs/heads/branch-1.2 6f47114d9 - f468688f1
[SPARK-5788] [PySpark] capture the exception in python write thread
An exception in the Python writer thread will shut down the executor.
Author: Davies Liu dav...@databricks.com
Closes #4577 from davies/exception and
Repository: spark
Updated Branches:
refs/heads/branch-1.2 f9d8c5e3f - 7f19c7c1b
[SPARK-1600] Refactor FileInputStream tests to remove Thread.sleep() calls and
SystemClock usage (branch-1.2 backport)
(This PR backports #3801 into `branch-1.2` (1.2.2))
This patch refactors Spark Streaming's
Repository: spark
Updated Branches:
refs/heads/branch-1.3 419865475 - a15a0a02c
[SPARK-5839][SQL]HiveMetastoreCatalog does not recognize table names and
aliases of data source tables.
JIRA: https://issues.apache.org/jira/browse/SPARK-5839
Author: Yin Huai yh...@databricks.com
Closes #4626
Repository: spark
Updated Branches:
refs/heads/branch-1.2 7f19c7c1b - 1af7ca15f
[SPARK-5441][pyspark] Make SerDeUtil PairRDD to Python conversions more robust
SerDeUtil.pairRDDToPython and SerDeUtil.pythonToPairRDD now both support empty
RDDs by checking the result of take(1) instead of
Repository: spark
Updated Branches:
refs/heads/branch-1.2 1af7ca15f - 6f47114d9
[SPARK-5361] Multiple Java RDD <-> Python RDD conversions not working correctly
This is found through reading RDD from `sc.newAPIHadoopRDD` and writing it back
using `rdd.saveAsNewAPIHadoopFile` in pyspark.
It
Repository: spark
Updated Branches:
refs/heads/master b1bd1dd32 - 16687651f
[SPARK-3340] Deprecate ADD_JARS and ADD_FILES
I created a patch that disables the environment variables.
The Scala or Python shell then logs a warning message to notify users of the
deprecation, with the following
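The deprecation check can be sketched like so (plain Python; the exact warning wording and the suggested replacement flags are assumptions, not the patch's actual message):

```python
import os
import warnings

def check_deprecated_env(env=None):
    """Warn about deprecated environment variables instead of silently honoring them."""
    if env is None:
        env = os.environ
    for var, flag in (("ADD_JARS", "--jars"), ("ADD_FILES", "--files")):
        if var in env:
            warnings.warn(
                f"{var} is deprecated, use spark-submit {flag} instead",
                DeprecationWarning,
            )

check_deprecated_env({"ADD_JARS": "extra.jar"})
```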
Repository: spark
Updated Branches:
refs/heads/branch-1.3 c2a9a6176 - d8c70fb6d
[SPARK-3340] Deprecate ADD_JARS and ADD_FILES
I created a patch that disables the environment variables.
The Scala or Python shell then logs a warning message to notify users of the
deprecation, with the following
Repository: spark
Updated Branches:
refs/heads/master 16687651f - 58a82a788
[SPARK-5849] Handle more types of invalid JSON requests in
SubmitRestProtocolMessage.parseAction
This patch improves SubmitRestProtocol's handling of invalid JSON requests in
cases where those requests were parsable
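A hedged sketch of the idea (plain Python, not the actual SubmitRestProtocol code): parse defensively, and distinguish malformed JSON from JSON that is valid but is not a request carrying the expected action field.

```python
import json

def parse_action(body):
    """Return the action name, or a structured error instead of crashing."""
    try:
        msg = json.loads(body)
    except json.JSONDecodeError as e:
        return {"error": f"malformed JSON: {e.msg}"}
    if not isinstance(msg, dict) or "action" not in msg:
        return {"error": "valid JSON but missing 'action' field"}
    return {"action": msg["action"]}

print(parse_action('{"action": "CreateSubmissionRequest"}'))
print(parse_action('{"action": '))   # malformed JSON
print(parse_action('[1, 2, 3]'))     # parsable, but not a request object
```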
Repository: spark
Updated Branches:
refs/heads/branch-1.3 d8c70fb6d - 385a339a2
[SPARK-5849] Handle more types of invalid JSON requests in
SubmitRestProtocolMessage.parseAction
This patch improves SubmitRestProtocol's handling of invalid JSON requests in
cases where those requests were
Repository: spark
Updated Branches:
refs/heads/branch-1.3 385a339a2 - e355b54de
[SQL] Various DataFrame doc changes.
Added a bunch of tags.
Also changed parquetFile to take varargs rather than a string followed by
varargs.
Author: Reynold Xin r...@databricks.com
Closes #4636 from
Repository: spark
Updated Branches:
refs/heads/master 58a82a788 - 0e180bfc3
[SQL] Various DataFrame doc changes.
Added a bunch of tags.
Also changed parquetFile to take varargs rather than a string followed by
varargs.
Author: Reynold Xin r...@databricks.com
Closes #4636 from rxin/df-doc
Repository: spark
Updated Branches:
refs/heads/branch-1.2 a39da171c - 0df26bb97
[SPARK-5363] [PySpark] check ending mark in non-block way
There is a chance of deadlock where the Python process waits for the ending mark
from the JVM, but the mark is consumed by a corrupted stream.
This PR checks the
Repository: spark
Updated Branches:
refs/heads/master a51d51ffa - d380f324c
[SPARK-5853][SQL] Schema support in Row.
Author: Reynold Xin r...@databricks.com
Closes #4640 from rxin/SPARK-5853 and squashes the following commits:
9c6f569 [Reynold Xin] [SPARK-5853][SQL] Schema support in Row.
Repository: spark
Updated Branches:
refs/heads/branch-1.3 c6a70694b - d0701d9bf
[SPARK-5853][SQL] Schema support in Row.
Author: Reynold Xin r...@databricks.com
Closes #4640 from rxin/SPARK-5853 and squashes the following commits:
9c6f569 [Reynold Xin] [SPARK-5853][SQL] Schema support in
Repository: spark
Updated Branches:
refs/heads/master 0e180bfc3 - ac6fe67e1
[SPARK-5363] [PySpark] check ending mark in non-block way
There is a chance of deadlock where the Python process waits for the ending mark
from the JVM, but the mark is consumed by a corrupted stream.
This PR checks the ending
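The non-blocking check can be sketched with select (plain Python; the mark value, timeout, and names are illustrative): only consume the ending mark when it is already available, instead of blocking forever on a stream that may be corrupted.

```python
import select
import socket

END_OF_STREAM = b"\xff\xff\xff\xff"  # illustrative ending mark

def check_end_mark(sock, timeout=0.1):
    """Return True only if the ending mark can be read without blocking."""
    ready, _, _ = select.select([sock], [], [], timeout)
    if not ready:
        return False  # nothing to read yet: don't deadlock waiting for it
    return sock.recv(4) == END_OF_STREAM

jvm_side, python_side = socket.socketpair()
jvm_side.sendall(END_OF_STREAM)
print(check_end_mark(python_side))  # True
```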
Repository: spark
Updated Branches:
refs/heads/branch-1.3 e355b54de - baad6b3cf
[SPARK-5363] [PySpark] check ending mark in non-block way
There is a chance of deadlock where the Python process waits for the ending mark
from the JVM, but the mark is consumed by a corrupted stream.
This PR checks the
Repository: spark
Updated Branches:
refs/heads/master ac6fe67e1 - a51d51ffa
SPARK-5850: Remove experimental label for Scala 2.11 and FlumePollingStream
Author: Patrick Wendell patr...@databricks.com
Closes #4638 from pwendell/SPARK-5850 and squashes the following commits:
386126f [Patrick
Repository: spark
Updated Branches:
refs/heads/branch-1.2 f468688f1 - a39da171c
[SPARK-5395] [PySpark] fix python process leak while coalesce()
Currently, the Python process is released back into the pool only after the task
has finished, which causes many processes to be forked when coalesce() is called.
This PR
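The pooling idea reduces to releasing workers promptly so later tasks reuse them instead of forking. A toy sketch (plain Python, not PySpark's actual worker pool):

```python
class WorkerPool:
    def __init__(self):
        self.idle = []
        self.forked = 0

    def acquire(self):
        if self.idle:
            return self.idle.pop()   # reuse an idle worker
        self.forked += 1             # fork only when none is available
        return object()

    def release(self, worker):
        self.idle.append(worker)

pool = WorkerPool()
for _ in range(10):                  # e.g. many small tasks from coalesce()
    w = pool.acquire()
    pool.release(w)                  # release promptly, before the next task
print(pool.forked)  # 1
```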
Repository: spark
Updated Branches:
refs/heads/master 6f54dee66 - c51ab37fa
[SPARK-5833] [SQL] Adds REFRESH TABLE command
Lifts `HiveMetastoreCatalog.refreshTable` to `Catalog`. Adds `RefreshTable`
command to refresh (possibly cached) metadata in external data sources tables.
Repository: spark
Updated Branches:
refs/heads/branch-1.3 d0701d9bf - dfe0fa01c
[SPARK-5802][MLLIB] cache transformed data in glm
If we need to transform the input data, we should cache the output to avoid
re-computing feature vectors every iteration. dbtsai
Author: Xiangrui Meng
Repository: spark
Updated Branches:
refs/heads/master d380f324c - fd84229e2
[SPARK-5802][MLLIB] cache transformed data in glm
If we need to transform the input data, we should cache the output to avoid
re-computing feature vectors every iteration. dbtsai
Author: Xiangrui Meng
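The caching motivation can be sketched in plain Python (a toy transformer stands in for the feature scaling; the training loop is illustrative, not MLlib's GLM code): transform once, reuse the result every iteration.

```python
class Scaler:
    """Toy feature transformer that counts how often it runs."""
    def __init__(self):
        self.calls = 0

    def transform(self, data):
        self.calls += 1
        return [x * 2.0 for x in data]

def train(data, scaler, iterations=5):
    cached = scaler.transform(data)  # transform once, reuse every iteration
    weight = 0.0
    for _ in range(iterations):
        weight += sum(cached)        # stand-in for a gradient step
    return weight

scaler = Scaler()
train([1.0, 2.0], scaler)
print(scaler.calls)  # 1, not one call per iteration
```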
Repository: spark
Updated Branches:
refs/heads/branch-1.3 dfe0fa01c - e9241fa70
HOTFIX: Style issue causing build break
Caused by #4601
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e9241fa7
Tree:
Repository: spark
Updated Branches:
refs/heads/master fd84229e2 - c06e42f2c
HOTFIX: Style issue causing build break
Caused by #4601
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c06e42f2
Tree:
Repository: spark
Updated Branches:
refs/heads/master c78a12c4c - d51d6ba15
[Ml] SPARK-5804 Explicitly manage cache in Crossvalidator k-fold loop
On a big dataset, explicitly unpersisting the train and validation folds allows
more data to be loaded into memory in the next loop iteration. On my
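The explicit cache management in the k-fold loop can be sketched like this (plain Python; `persist`/`unpersist` are toy counters, not Spark's storage machinery):

```python
class Dataset:
    """Toy dataset tracking how many datasets are currently cached."""
    live_cached = 0

    def __init__(self, rows):
        self.rows = rows

    def persist(self):
        Dataset.live_cached += 1
        return self

    def unpersist(self):
        Dataset.live_cached -= 1

def cross_validate(folds):
    for i, validation in enumerate(folds):
        train = Dataset([r for j, f in enumerate(folds) if j != i for r in f.rows])
        train.persist()
        validation.persist()
        _ = len(train.rows)  # stand-in for fit/evaluate on this fold
        # Unpersisting before the next iteration frees memory for its folds.
        train.unpersist()
        validation.unpersist()
    return Dataset.live_cached

print(cross_validate([Dataset([1]), Dataset([2]), Dataset([3])]))  # 0
```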
Repository: spark
Updated Branches:
refs/heads/branch-1.3 0d932058e - 066301c65
[Minor] [SQL] Renames stringRddToDataFrame to stringRddToDataFrameHolder for
consistency
Repository: spark
Updated Branches:
refs/heads/branch-1.3 9cf7d7088 - 0d932058e
[Ml] SPARK-5804 Explicitly manage cache in Crossvalidator k-fold loop
On a big dataset, explicitly unpersisting the train and validation folds allows
more data to be loaded into memory in the next loop iteration. On my
[SPARK-4553] [SPARK-5767] [SQL] Wires Parquet data source with the newly
introduced write support for data source API
This PR migrates the Parquet data source to the new data source write support
API. Now users can also overwrite and append to existing tables. Notice
that inserting into
Repository: spark
Updated Branches:
refs/heads/master 199a9e802 - 3ce58cf9c
http://git-wip-us.apache.org/repos/asf/spark/blob/3ce58cf9/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
--
diff --git
Repository: spark
Updated Branches:
refs/heads/master d51d6ba15 - 199a9e802
[Minor] [SQL] Renames stringRddToDataFrame to stringRddToDataFrameHolder for
consistency
Repository: spark
Updated Branches:
refs/heads/branch-1.3 066301c65 - 78f7edb85
http://git-wip-us.apache.org/repos/asf/spark/blob/78f7edb8/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
--
diff --git