Repository: spark
Updated Branches:
refs/heads/master 6550086bb -> 51f99fb25
[SQL] Fix typo in DataframeWriter doc
## What changes were proposed in this pull request?
The format of none should be consistent with the other compression
codecs (`snappy`, `lz4`) as `none`.
## How was this
Repository: spark
Updated Branches:
refs/heads/branch-2.2 467ee8dff -> 690f491f6
[SPARK-12717][PYTHON][BRANCH-2.2] Adding thread-safe broadcast pickle registry
## What changes were proposed in this pull request?
When using PySpark broadcast variables in a multi-threaded environment,
Repository: spark
Updated Branches:
refs/heads/branch-2.1 b31b30209 -> d93e45b8b
[SPARK-12717][PYTHON][BRANCH-2.1] Adding thread-safe broadcast pickle registry
## What changes were proposed in this pull request?
When using PySpark broadcast variables in a multi-threaded environment,
Repository: spark
Updated Branches:
refs/heads/master e7c59b417 -> 97ba49183
[SPARK-21602][R] Add map_keys and map_values functions to R
## What changes were proposed in this pull request?
This PR adds `map_values` and `map_keys` to R API.
```r
> df <- createDataFrame(cbind(model =
```
Repository: spark
Updated Branches:
refs/heads/master 58da1a245 -> 77cc0d67d
[SPARK-12717][PYTHON] Adding thread-safe broadcast pickle registry
## What changes were proposed in this pull request?
When using PySpark broadcast variables in a multi-threaded environment,
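The fix makes the broadcast pickle registry thread-local, so concurrent threads no longer corrupt each other's state. A rough pure-Python sketch of the thread-local-registry idea (class and variable names here are hypothetical, not PySpark's actual code):

```python
import threading

class BroadcastRegistry(threading.local):
    """Per-thread registry: each thread sees only its own broadcasts.
    Hypothetical sketch, not PySpark's actual class."""
    def __init__(self):
        self.items = set()

    def add(self, broadcast_id):
        self.items.add(broadcast_id)

registry = BroadcastRegistry()

def worker(out):
    registry.add("b1")              # registered from a second thread
    out.append(sorted(registry.items))

seen = []
t = threading.Thread(target=worker, args=(seen,))
t.start()
t.join()
registry.add("b2")                  # registered from the main thread
# the worker saw only "b1"; the main thread sees only "b2"
```

Because `threading.local` re-runs `__init__` per thread on first access, no explicit locking is needed for the per-thread sets.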
Repository: spark
Updated Branches:
refs/heads/master 42b9eda80 -> 966083105
[SPARK-21712][PYSPARK] Clarify type error for Column.substr()
Proposed changes:
* Clarify the type error that `Column.substr()` gives.
Test plan:
* Tested this manually.
* Test code:
```python
from
```
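The type check behind `Column.substr()` is of roughly this shape, with the clarified error naming the offending types. A pure-Python stand-in (a sketch, not the actual PySpark source):

```python
def check_substr_args(startPos, length):
    """Hypothetical stand-in for the check Column.substr() performs:
    both arguments must have the same type."""
    if type(startPos) != type(length):
        raise TypeError(
            "startPos and length must be the same type. "
            "Got {} and {}, respectively.".format(type(startPos), type(length)))
    return True

check_substr_args(1, 3)        # same types: fine
try:
    check_substr_args(1, 3.0)  # int vs float: a descriptive TypeError
except TypeError as e:
    message = str(e)
```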
Repository: spark
Updated Branches:
refs/heads/master da8c59bde -> b0bdfce9c
[MINOR][BUILD] Download RAT and R version info over HTTPS; use RAT 0.12
## What changes were proposed in this pull request?
This is trivial, but bugged me. We should download software over HTTPS.
And we can use RAT
Repository: spark
Updated Branches:
refs/heads/master 6847e93cf -> 0fcde87aa
[SPARK-21658][SQL][PYSPARK] Add default None for value in na.replace in PySpark
## What changes were proposed in this pull request?
JIRA issue: https://issues.apache.org/jira/browse/SPARK-21658
Add default None for
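The signature change can be sketched in plain Python: `value` now defaults to `None`, so a dict `to_replace` can be passed on its own (a simplified stand-in, not the actual `DataFrameNaFunctions.replace`):

```python
def replace(rows, to_replace, value=None):
    """Sketch of the na.replace signature change: `value` defaults to None,
    so a dict `to_replace` can be passed alone. Simplified, pure Python."""
    if isinstance(to_replace, dict):
        mapping = to_replace
    else:
        if value is None:
            raise TypeError("value is required when to_replace is not a dict")
        mapping = {to_replace: value}
    return [mapping.get(r, r) for r in rows]

replaced = replace(["Alice", None, "Bob"], {"Alice": "A"})  # no value needed
```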
Repository: spark-website
Updated Branches:
refs/heads/asf-site 6ff5039f3 -> 0e09b2f58
Update committer page
Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/0e09b2f5
Tree:
Repository: spark
Updated Branches:
refs/heads/master 1ba967b25 -> d4e7f20f5
[SPARKR][BUILD] AppVeyor change to latest R version
## What changes were proposed in this pull request?
R version update
## How was this patch tested?
AppVeyor
Author: Felix Cheung
Repository: spark
Updated Branches:
refs/heads/master 6830e90de -> f1a798b57
[MINOR] Minor comment fixes in merge_spark_pr.py script
## What changes were proposed in this pull request?
This PR proposes to fix a few typos in `merge_spark_pr.py`.
- `# usage: ./apache-pr-merge.py
Repository: spark
Updated Branches:
refs/heads/master ee1304199 -> 08ef7d718
[MINOR][R][BUILD] More reliable detection of R version for Windows in AppVeyor
## What changes were proposed in this pull request?
This PR proposes to use https://rversions.r-pkg.org/r-release-win instead of
Repository: spark
Updated Branches:
refs/heads/master 894d5a453 -> 3a45c7fee
[INFRA] Close stale PRs
## What changes were proposed in this pull request?
This PR proposes to close stale PRs, mostly the same instances with #18017
Closes #14085 - [SPARK-16408][SQL] SparkSQL Added file get
Repository: spark
Updated Branches:
refs/heads/master 310454be3 -> 07a2b8738
[SPARK-21778][SQL] Simpler Dataset.sample API in Scala / Java
## What changes were proposed in this pull request?
Dataset.sample requires a boolean flag withReplacement as the first argument.
However, most of the
Repository: spark
Updated Branches:
refs/heads/master 054ddb2f5 -> a28728a9a
[SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support converting MapType to
json for PySpark and SparkR
## What changes were proposed in this pull request?
In the previous work SPARK-21513, we have allowed `MapType` and
Repository: spark
Updated Branches:
refs/heads/master f180b6534 -> c11f24a94
[SPARK-18136] Fix SPARK_JARS_DIR for Python pip install on Windows
## What changes were proposed in this pull request?
Fix for setup of `SPARK_JARS_DIR` on Windows as it looks for
`%SPARK_HOME%\RELEASE` file
Repository: spark
Updated Branches:
refs/heads/branch-2.2 de6274a58 -> c0a34a9ff
[SPARK-18136] Fix SPARK_JARS_DIR for Python pip install on Windows
## What changes were proposed in this pull request?
Fix for setup of `SPARK_JARS_DIR` on Windows as it looks for
`%SPARK_HOME%\RELEASE` file
Repository: spark
Updated Branches:
refs/heads/branch-2.1 03db72149 -> 0b3e7cc6a
[SPARK-18136] Fix SPARK_JARS_DIR for Python pip install on Windows
## What changes were proposed in this pull request?
Fix for setup of `SPARK_JARS_DIR` on Windows as it looks for
`%SPARK_HOME%\RELEASE` file
Repository: spark
Updated Branches:
refs/heads/master 1da5822e6 -> a8d9ec8a6
[SPARK-21780][R] Simpler Dataset.sample API in R
## What changes were proposed in this pull request?
This PR makes `sample(...)` able to omit `withReplacement`, defaulting to `FALSE`.
In short, the following examples
Repository: spark
Updated Branches:
refs/heads/master 1d1a09be9 -> 1270e7175
[SPARK-22086][DOCS] Add expression description for CASE WHEN
## What changes were proposed in this pull request?
In SQL conditional expressions, only CASE WHEN lacks an expression
description. This patch fills the
Repository: spark
Updated Branches:
refs/heads/master 73d906722 -> f4073020a
[SPARK-22032][PYSPARK] Speed up StructType conversion
## What changes were proposed in this pull request?
StructType.fromInternal is calling f.fromInternal(v) for every field.
We can use precalculated information
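The idea is to decide once, per struct, which fields actually need a `fromInternal` conversion, and skip the per-value calls otherwise. A simplified pure-Python sketch (hypothetical names, not the real `StructType` code):

```python
class Field:
    def __init__(self, name, convert=None):
        self.name = name
        self.convert = convert              # None means "no conversion needed"

class Struct:
    def __init__(self, fields):
        self.fields = fields
        # precalculated once instead of being re-decided for every row
        self._needs_conversion = [f.convert is not None for f in fields]
        self._needs_any = any(self._needs_conversion)

    def from_internal(self, values):
        if not self._needs_any:             # fast path: pass straight through
            return tuple(values)
        return tuple(f.convert(v) if need else v
                     for f, v, need in zip(self.fields, values,
                                           self._needs_conversion))

s = Struct([Field("name"), Field("score", convert=float)])
row = s.from_internal(["a", 3])             # only "score" is converted
```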
Repository: spark
Updated Branches:
refs/heads/branch-2.2 51e5a821d -> 42852bb17
[SPARK-21985][PYSPARK] PairDeserializer is broken for double-zipped RDDs
## What changes were proposed in this pull request?
(edited)
Fixes a bug introduced in #16121
In PairDeserializer convert each batch of
Repository: spark
Updated Branches:
refs/heads/branch-2.1 e49c997fe -> 3ae7ab8e8
[SPARK-21985][PYSPARK] PairDeserializer is broken for double-zipped RDDs
## What changes were proposed in this pull request?
(edited)
Fixes a bug introduced in #16121
In PairDeserializer convert each batch of
Repository: spark
Updated Branches:
refs/heads/master f4073020a -> 6adf67dd1
[SPARK-21985][PYSPARK] PairDeserializer is broken for double-zipped RDDs
## What changes were proposed in this pull request?
(edited)
Fixes a bug introduced in #16121
In PairDeserializer convert each batch of keys
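The deserializer zips a stream of key batches with a stream of value batches; for double-zipped RDDs the batches could have different lengths, silently misaligning pairs. The fix materializes each batch and checks the lengths before zipping. A simplified pure-Python sketch of that idea (not Spark's actual `PairDeserializer`):

```python
def load_pairs(key_batches, val_batches):
    """Sketch of PairDeserializer-style batch zipping (simplified):
    materialize each batch so its length can be checked before zipping."""
    for kbatch, vbatch in zip(key_batches, val_batches):
        kbatch, vbatch = list(kbatch), list(vbatch)
        if len(kbatch) != len(vbatch):
            raise ValueError(
                "Can not deserialize PairRDD with different number of items "
                "in batches: (%d, %d)" % (len(kbatch), len(vbatch)))
        for pair in zip(kbatch, vbatch):
            yield pair

pairs = list(load_pairs([iter([1, 2]), iter([3])], [["a", "b"], ["c"]]))
```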
Repository: spark
Updated Branches:
refs/heads/branch-2.2 309c401a5 -> a86831d61
[SPARK-22043][PYTHON] Improves error message for show_profiles and dump_profiles
## What changes were proposed in this pull request?
This PR proposes to improve error message from:
```
>>> sc.show_profiles()
```
Repository: spark
Updated Branches:
refs/heads/branch-2.1 99de4b8f5 -> b35136a9e
[SPARK-22043][PYTHON] Improves error message for show_profiles and dump_profiles
## What changes were proposed in this pull request?
This PR proposes to improve error message from:
```
>>> sc.show_profiles()
```
Repository: spark
Updated Branches:
refs/heads/master 6308c65f0 -> 7c7266208
[SPARK-22043][PYTHON] Improves error message for show_profiles and dump_profiles
## What changes were proposed in this pull request?
This PR proposes to improve error message from:
```
>>> sc.show_profiles()
```
Repository: spark
Updated Branches:
refs/heads/master d2b2932d8 -> 3e6a714c9
[SPARK-21766][PYSPARK][SQL] DataFrame toPandas() raises ValueError with
nullable int columns
## What changes were proposed in this pull request?
When calling `DataFrame.toPandas()` (without Arrow enabled), if there
Repository: spark
Updated Branches:
refs/heads/master 2b6ff0cef -> e17901d6d
[SPARK-22049][DOCS] Confusing behavior of from_utc_timestamp and
to_utc_timestamp
## What changes were proposed in this pull request?
Clarify behavior of to_utc_timestamp/from_utc_timestamp with an example
## How
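The documented semantics can be illustrated in plain Python: `from_utc_timestamp` treats a timezone-naive timestamp as a UTC instant and renders it in the given zone (a pure-Python sketch of the behavior, not Spark's implementation; the fixed -8h offset stands in for a real zone and ignores DST):

```python
from datetime import datetime, timedelta, timezone

def from_utc_timestamp(ts, tz):
    """Treat the naive input as a UTC instant and render it in zone tz."""
    return ts.replace(tzinfo=timezone.utc).astimezone(tz).replace(tzinfo=None)

pst = timezone(timedelta(hours=-8))  # fixed offset for the sketch
shifted = from_utc_timestamp(datetime(2017, 1, 1, 0, 0), pst)
# midnight UTC on 2017-01-01 reads as 16:00 the previous day at UTC-8
```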
Repository: spark
Updated Branches:
refs/heads/master 0c03297bf -> c7b46d4d8
[SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows command scripts
## What changes were proposed in this pull request?
None of the Windows command scripts can handle quotes in parameters.
Run a windows command
Repository: spark
Updated Branches:
refs/heads/master 2028e5a82 -> bfc7e1fe1
[SPARK-20396][SQL][PYSPARK] groupby().apply() with pandas udf
## What changes were proposed in this pull request?
This PR adds an apply() function on df.groupby(). apply() takes a pandas udf
that is a
Repository: spark
Updated Branches:
refs/heads/master c8affec21 -> ae61f187a
[SPARK-22206][SQL][SPARKR] gapply in R can't work on empty grouping columns
## What changes were proposed in this pull request?
Looks like `FlatMapGroupsInRExec.requiredChildDistribution` didn't consider
empty
Repository: spark
Updated Branches:
refs/heads/branch-2.1 5bb4e931b -> 920372a19
[SPARK-22206][SQL][SPARKR] gapply in R can't work on empty grouping columns
## What changes were proposed in this pull request?
Looks like `FlatMapGroupsInRExec.requiredChildDistribution` didn't consider
empty
Repository: spark
Updated Branches:
refs/heads/branch-2.2 81232ce03 -> 8a4e7dd89
[SPARK-22206][SQL][SPARKR] gapply in R can't work on empty grouping columns
## What changes were proposed in this pull request?
Looks like `FlatMapGroupsInRExec.requiredChildDistribution` didn't consider
empty
Repository: spark
Updated Branches:
refs/heads/master e0503a722 -> 014dc8471
[SPARK-22233][CORE] Allow user to filter out empty split in HadoopRDD
## What changes were proposed in this pull request?
Add a flag spark.files.ignoreEmptySplits. When true, methods like that use
HadoopRDD and
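The flag's effect can be sketched as a simple filter over input splits before partitions are created (hypothetical names; the real change lives in HadoopRDD's split handling):

```python
def splits_to_use(splits, ignore_empty_splits=False):
    """Sketch of the spark.files.ignoreEmptySplits idea: optionally drop
    zero-length input splits before creating partitions."""
    if ignore_empty_splits:
        return [s for s in splits if s["length"] > 0]
    return list(splits)

splits = [{"path": "a", "length": 10}, {"path": "b", "length": 0}]
kept = splits_to_use(splits, ignore_empty_splits=True)  # empty split dropped
```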
Repository: spark
Updated Branches:
refs/heads/master dbb824125 -> 0dfc1ec59
[SPARK-21726][SQL][FOLLOW-UP] Check for structural integrity of the plan in
Optimizer in test mode
## What changes were proposed in this pull request?
The condition in `Optimizer.isPlanIntegral` is wrong. We should
Repository: spark
Updated Branches:
refs/heads/master d8f454086 -> 313c6ca43
[SPARK-21875][BUILD] Fix Java style bugs
## What changes were proposed in this pull request?
Fix Java code style so `./dev/lint-java` succeeds
## How was this patch tested?
Run `./dev/lint-java`
Author: Andrew
Repository: spark
Updated Branches:
refs/heads/master 6949a9c5c -> d8f454086
[SPARK-21839][SQL] Support SQL config for ORC compression
## What changes were proposed in this pull request?
This PR aims to support `spark.sql.orc.compression.codec` like Parquet's
Repository: spark
Updated Branches:
refs/heads/master 16c4c03c7 -> 64936c14a
[SPARK-21903][BUILD][FOLLOWUP] Upgrade scalastyle-maven-plugin and scalastyle
as well in POM and SparkBuild.scala
## What changes were proposed in this pull request?
This PR proposes to match scalastyle version in
Repository: spark
Updated Branches:
refs/heads/master fa0092bdd -> aad212547
Fixed pandoc dependency issue in python/setup.py
## Problem Description
When pyspark is listed as a dependency of another package, installing
the other package will cause an install failure in pyspark. When the
Repository: spark
Updated Branches:
refs/heads/branch-2.2 342cc2a4c -> 49968de52
Fixed pandoc dependency issue in python/setup.py
## Problem Description
When pyspark is listed as a dependency of another package, installing
the other package will cause an install failure in pyspark. When the
Repository: spark
Updated Branches:
refs/heads/master 1a9857476 -> 371e4e205
[SPARK-21513][SQL] Allow UDF to_json support converting MapType to json
## What changes were proposed in this pull request?
UDF to_json only supports converting `StructType` or `ArrayType` of
`StructType`s to a json
Repository: spark
Updated Branches:
refs/heads/branch-2.2 182478e03 -> b1b5a7fdc
[SPARK-20098][PYSPARK] dataType's typeName fix
## What changes were proposed in this pull request?
The `typeName` classmethod has been fixed by using a type -> typeName map.
## How was this patch tested?
local build
Repository: spark
Updated Branches:
refs/heads/master f76790557 -> 520d92a19
[SPARK-20098][PYSPARK] dataType's typeName fix
## What changes were proposed in this pull request?
The `typeName` classmethod has been fixed by using a type -> typeName map.
## How was this patch tested?
local build
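The map-based approach can be sketched as a lookup table keyed by class name, with a fallback for regular names (a toy illustration; the table is an invented subset, not PySpark's actual mapping):

```python
_TYPE_NAMES = {"IntegerType": "integer", "StringType": "string"}  # illustrative subset

class DataType:
    @classmethod
    def typeName(cls):
        # look the name up in an explicit map rather than deriving it from
        # the class name, which breaks for irregularly named types
        return _TYPE_NAMES.get(cls.__name__, cls.__name__[:-4].lower())

class IntegerType(DataType):
    pass

name = IntegerType.typeName()   # resolved via the map
```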
Repository: spark
Updated Branches:
refs/heads/master 8a5eb5068 -> 6b45d7e94
[SPARK-21954][SQL] JacksonUtils should verify MapType's value type instead of
key type
## What changes were proposed in this pull request?
`JacksonUtils.verifySchema` verifies if a data type can be converted to
Repository: spark
Updated Branches:
refs/heads/branch-2.2 987682160 -> 182478e03
[SPARK-21954][SQL] JacksonUtils should verify MapType's value type instead of
key type
## What changes were proposed in this pull request?
`JacksonUtils.verifySchema` verifies if a data type can be converted to
Repository: spark
Updated Branches:
refs/heads/master 6273a711b -> 828fab035
[BUILD][TEST][SPARKR] add sparksubmitsuite to appveyor tests
## What changes were proposed in this pull request?
more file regex
## How was this patch tested?
Jenkins, AppVeyor
Author: Felix Cheung
Repository: spark
Updated Branches:
refs/heads/master dd7816758 -> 7d0a3ef4c
[SPARK-21610][SQL][FOLLOWUP] Corrupt records are not handled properly when
creating a dataframe from a file
## What changes were proposed in this pull request?
When the `requiredSchema` only contains
Repository: spark
Updated Branches:
refs/heads/branch-2.2 211d81beb -> 8acce00ac
[SPARK-22107] Change as to alias in python quickstart
## What changes were proposed in this pull request?
Updated docs so that a line of python in the quick start guide executes. Closes
#19283
## How was this
Repository: spark
Updated Branches:
refs/heads/master 576c43fb4 -> 20adf9aa1
[SPARK-22107] Change as to alias in python quickstart
## What changes were proposed in this pull request?
Updated docs so that a line of python in the quick start guide executes. Closes
#19283
## How was this
Repository: spark
Updated Branches:
refs/heads/master ceaec9383 -> 1fdfe6935
[SPARK-22112][PYSPARK] Supports RDD of strings as input in spark.read.csv in
PySpark
## What changes were proposed in this pull request?
We added a method to the scala API for creating a `DataFrame` from
Repository: spark
Updated Branches:
refs/heads/master f21f6ce99 -> ceaec9383
[BUILD] Close stale PRs
Closes #13794
Closes #18474
Closes #18897
Closes #18978
Closes #19152
Closes #19238
Closes #19295
Closes #19334
Closes #19335
Closes #19347
Closes #19236
Closes #19244
Closes #19300
Closes
Repository: spark
Updated Branches:
refs/heads/master ce204780e -> d8e825e3b
[SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_udf and add doctests
## What changes were proposed in this pull request?
This change disables the use of 0-parameter pandas_udfs due to the API being
overly
Repository: spark
Updated Branches:
refs/heads/master c6610a997 -> 02c91e03f
[SPARK-22063][R] Fixes lint check failures in R by latest commit sha1 ID of
lint-r
## What changes were proposed in this pull request?
Currently, we set lintr to jimhester/lintra769c0b (see
Repository: spark
Updated Branches:
refs/heads/master 9244957b5 -> 7bf4da8a3
[MINOR] Fixed up pandas_udf related docs and formatting
## What changes were proposed in this pull request?
Fixed some minor issues with pandas_udf related docs and formatting.
## How was this patch tested?
NA
Repository: spark
Updated Branches:
refs/heads/master d2b8b63b9 -> 12e740bba
[SPARK-22130][CORE] UTF8String.trim() scans " " twice
## What changes were proposed in this pull request?
This PR allows us to scan a string including only white space (e.g. `" "`)
once while the current
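The single-scan idea: locate the first non-space once, and if the whole string is spaces, return empty without a second pass. A pure-Python sketch of the algorithm (the real change is in `UTF8String`, in Java):

```python
def trim_spaces(s):
    """Trim ASCII spaces, scanning an all-space string only once."""
    n = len(s)
    i = 0
    while i < n and s[i] == ' ':
        i += 1
    if i == n:          # string was all spaces: no need to scan again
        return ""
    j = n - 1
    while s[j] == ' ':  # safe: s[i] is non-space, so the loop must stop
        j -= 1
    return s[i:j + 1]
```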
Repository: spark
Updated Branches:
refs/heads/master 12e740bba -> 09cbf3df2
[SPARK-22125][PYSPARK][SQL] Enable Arrow Stream format for vectorized UDF.
## What changes were proposed in this pull request?
Currently we use Arrow File format to communicate with Python worker when
invoking
Repository: spark
Updated Branches:
refs/heads/master 2274d84ef -> 9d48bd0b3
[SPARK-22093][TESTS] Fixes `assume` in `UtilsSuite` and `HiveDDLSuite`
## What changes were proposed in this pull request?
This PR proposes to remove `assume` in `Utils.resolveURIs` and replace `assume`
to `assert`
Repository: spark
Updated Branches:
refs/heads/master 846bc61cf -> 95713eb4f
[SPARK-21804][SQL] json_tuple returns null values within repeated columns
except the first one
## What changes were proposed in this pull request?
When json_tuple is extracting values from JSON, it returns null
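The fixed semantics can be illustrated in plain Python: every requested field yields its value, even when the same field is requested more than once (a sketch of the behavior, not Spark's generated code):

```python
import json

def json_tuple(json_str, *fields):
    """Extract the requested top-level fields; repeated fields all get
    their value, and malformed JSON yields all-None."""
    try:
        obj = json.loads(json_str)
    except ValueError:
        return tuple(None for _ in fields)
    return tuple(obj.get(f) if isinstance(obj, dict) else None for f in fields)

result = json_tuple('{"a": 1, "b": 2}', "a", "a", "b")  # "a" repeated
```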
Repository: spark
Updated Branches:
refs/heads/master 95713eb4f -> dc5d34d8d
[SPARK-19165][PYTHON][SQL] PySpark APIs using columns as arguments should
validate input types for column
## What changes were proposed in this pull request?
While preparing to take over
Repository: spark
Updated Branches:
refs/heads/master c108a5d30 -> 751f51336
[SPARK-21070][PYSPARK] Attempt to update cloudpickle again
## What changes were proposed in this pull request?
Based on https://github.com/apache/spark/pull/18282 by rgbkrk this PR attempts
to update to the current
Repository: spark
Updated Branches:
refs/heads/master 522e1f80d -> 3b66b1c44
[MINOR][DOCS] Minor doc fixes related with doc build and uses script dir in SQL
doc gen script
## What changes were proposed in this pull request?
This PR proposes both:
- Add information about Javadoc, SQL docs
Repository: spark
Updated Branches:
refs/heads/master 73e04ecc4 -> 41e0eb71a
[SPARK-21773][BUILD][DOCS] Installs mkdocs if missing in the path in SQL
documentation build
## What changes were proposed in this pull request?
This PR proposes to install `mkdocs` by `pip install` if missing in
Repository: spark
Updated Branches:
refs/heads/master 734ed7a7b -> b30a11a6a
[SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed
and incorrect paths
## What changes were proposed in this pull request?
`org.apache.spark.deploy.RPackageUtilsSuite`
```
- jars
```
Repository: spark-website
Updated Branches:
refs/remotes/apache/asf-site [created] 1895d5cb0
Update committer page
Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/1895d5cb
Tree:
Repository: spark
Updated Branches:
refs/heads/master acb7fed23 -> 07fd68a29
[SPARK-21897][PYTHON][R] Add unionByName API to DataFrame in Python and R
## What changes were proposed in this pull request?
This PR proposes to add a wrapper for `unionByName` API to R and Python as well.
Repository: spark
Updated Branches:
refs/heads/master f5e10a34e -> 5cd8ea99f
[SPARK-21779][PYTHON] Simpler DataFrame.sample API in Python
## What changes were proposed in this pull request?
This PR makes `DataFrame.sample(...)` able to omit `withReplacement`, defaulting
to `False`, consistently with
Repository: spark
Updated Branches:
refs/heads/master 5cd8ea99f -> 648a8626b
[SPARK-21789][PYTHON] Remove obsolete codes for parsing abstract schema strings
## What changes were proposed in this pull request?
This PR proposes to remove private functions that look not used in the main
codes,
Repository: spark
Updated Branches:
refs/heads/master 4e7a29efd -> 7f3c6ff4f
[SPARK-21903][BUILD] Upgrade scalastyle to 1.0.0.
## What changes were proposed in this pull request?
1.0.0 fixes an issue with import order, explicit type for public methods, line
length limitation and comment
Repository: spark
Updated Branches:
refs/heads/master 3d0e17424 -> e47f48c73
[SPARK-20886][CORE] HadoopMapReduceCommitProtocol to handle
FileOutputCommitter.getWorkPath==null
## What changes were proposed in this pull request?
Handles the situation where a
Repository: spark
Updated Branches:
refs/heads/master 4482ff23a -> ecf437a64
[SPARK-21534][SQL][PYSPARK] PickleException when creating dataframe from python
row with empty bytearray
## What changes were proposed in this pull request?
`PickleException` is thrown when creating dataframe from
Repository: spark
Updated Branches:
refs/heads/master 02218c4c7 -> 9104add4c
[SPARK-22217][SQL] ParquetFileFormat to support arbitrary OutputCommitters
## What changes were proposed in this pull request?
`ParquetFileFormat` to relax its requirement of output committer class from
Repository: spark
Updated Branches:
refs/heads/branch-2.2 cd51e2c32 -> cfc04e062
[SPARK-22217][SQL] ParquetFileFormat to support arbitrary OutputCommitters
## What changes were proposed in this pull request?
`ParquetFileFormat` to relax its requirement of output committer class from
Repository: spark
Updated Branches:
refs/heads/branch-2.2 010b50cea -> f8c83fdc5
[SPARK-21551][PYTHON] Increase timeout for PythonRDD.serveIterator
Backport of https://github.com/apache/spark/pull/18752
(https://issues.apache.org/jira/browse/SPARK-21551)
(cherry picked from commit
Repository: spark
Updated Branches:
refs/heads/master 884d4f95f -> d9798c834
[SPARK-22313][PYTHON] Mark/print deprecation warnings as DeprecationWarning for
deprecated APIs
## What changes were proposed in this pull request?
This PR proposes to mark the existing warnings as
Repository: spark
Updated Branches:
refs/heads/master 3d90b2cb3 -> 209b9361a
[SPARK-20791][PYSPARK] Use Arrow to create Spark DataFrame from Pandas
## What changes were proposed in this pull request?
This change uses Arrow to optimize the creation of a Spark DataFrame from a
Pandas
Repository: spark
Updated Branches:
refs/heads/master fba63c1a7 -> d49d9e403
[SPARK-21693][R][FOLLOWUP] Reduce shuffle partitions running R worker in few
tests to speed up
## What changes were proposed in this pull request?
This is a followup to reduce AppVeyor test time. This PR proposes
Repository: spark
Updated Branches:
refs/heads/master 1edb3175d -> b4edafa99
[SPARK-22495] Fix setup of SPARK_HOME variable on Windows
## What changes were proposed in this pull request?
Fixing the way how `SPARK_HOME` is resolved on Windows. While the previous
version was working with the
Repository: spark
Updated Branches:
refs/heads/master 572af5027 -> 327d25fe1
[SPARK-22572][SPARK SHELL] spark-shell does not re-initialize on :replay
## What changes were proposed in this pull request?
Ticket: [SPARK-22572](https://issues.apache.org/jira/browse/SPARK-22572)
## How was this
Repository: spark
Updated Branches:
refs/heads/branch-2.2 38a0532cf -> d7b14746d
[SPARK-22654][TESTS] Retry Spark tarball download if failed in
HiveExternalCatalogVersionsSuite
## What changes were proposed in this pull request?
Adds a simple loop to retry download of Spark tarballs from
Repository: spark
Updated Branches:
refs/heads/master 9c29c5576 -> 6eb203fae
[SPARK-22654][TESTS] Retry Spark tarball download if failed in
HiveExternalCatalogVersionsSuite
## What changes were proposed in this pull request?
Adds a simple loop to retry download of Spark tarballs from
Repository: spark
Updated Branches:
refs/heads/master 6eb203fae -> 932bd09c8
[SPARK-22635][SQL][ORC] FileNotFoundException while reading ORC files
containing special characters
## What changes were proposed in this pull request?
SPARK-22146 fix the FileNotFoundException issue only for the
Repository: spark
Updated Branches:
refs/heads/master 087879a77 -> 33d43bf1b
[SPARK-22484][DOC] Document PySpark DataFrame csv writer behavior whe…
## What changes were proposed in this pull request?
In PySpark API Document, DataFrame.write.csv() says that setting the quote
parameter to
Repository: spark
Updated Branches:
refs/heads/master 8ff474f6e -> ab6f60c4d
[SPARK-22585][CORE] Path in addJar is not url encoded
## What changes were proposed in this pull request?
This updates the behavior of the `addJar` method of the `SparkContext` class. If a path
without any scheme is passed as
Repository: spark
Updated Branches:
refs/heads/master ab6f60c4d -> 92cfbeeb5
[SPARK-21866][ML][PYTHON][FOLLOWUP] Few cleanups and fix image test failure in
Python 3.6.0 / NumPy 1.13.3
## What changes were proposed in this pull request?
Image test seems failed in Python 3.6.0 / NumPy 1.13.3.
Repository: spark
Updated Branches:
refs/heads/master ee10ca7ec -> aa4cf2b19
[SPARK-22651][PYTHON][ML] Prevent initiating multiple Hive clients for
ImageSchema.readImages
## What changes were proposed in this pull request?
Calling `ImageSchema.readImages` multiple times as below in PySpark
Repository: spark
Updated Branches:
refs/heads/branch-2.2 ba00bd961 -> f3f8c8767
[SPARK-22635][SQL][ORC] FileNotFoundException while reading ORC files
containing special characters
## What changes were proposed in this pull request?
SPARK-22146 fix the FileNotFoundException issue only for
Repository: spark
Updated Branches:
refs/heads/master 46776234a -> 0c8fca460
[SPARK-22811][PYSPARK][ML] Fix pyspark.ml.tests failure when Hive is not
available.
## What changes were proposed in this pull request?
pyspark.ml.tests is missing a py4j import. I've added the import and fixed the
Repository: spark
Updated Branches:
refs/heads/master 704af4bd6 -> 17cdabb88
[SPARK-19809][SQL][TEST] NullPointerException on zero-size ORC file
## What changes were proposed in this pull request?
Until 2.2.1, Spark raises `NullPointerException` on zero-size ORC files.
Usually, these
Repository: spark
Updated Branches:
refs/heads/master fbfa9be7e -> 3a07eff5a
[SPARK-22813][BUILD] Use lsof or /usr/sbin/lsof in run-tests.py
## What changes were proposed in this pull request?
In [the environment where `/usr/sbin/lsof` does not
Repository: spark
Updated Branches:
refs/heads/master 772e4648d -> fbfa9be7e
Revert "Revert "[SPARK-22496][SQL] thrift server adds operation logs""
This reverts commit e58f275678fb4f904124a4a2a1762f04c835eb0e.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/master 0c8fca460 -> c2aeddf9e
[SPARK-22817][R] Use fixed testthat version for SparkR tests in AppVeyor
## What changes were proposed in this pull request?
`testthat` 2.0.0 has been released and AppVeyor has now started to use it instead of
1.0.2. And
Repository: spark
Updated Branches:
refs/heads/branch-2.2 b4f4be396 -> 1e4cca02f
[SPARK-22817][R] Use fixed testthat version for SparkR tests in AppVeyor
## What changes were proposed in this pull request?
`testthat` 2.0.0 has been released and AppVeyor has now started to use it instead of
1.0.2. And
Repository: spark
Updated Branches:
refs/heads/branch-2.1 ca19271cc -> 7bdad58e2
[SPARK-22377][BUILD] Use /usr/sbin/lsof if lsof does not exist in
release-build.sh
## What changes were proposed in this pull request?
This PR proposes to use `/usr/sbin/lsof` if `lsof` is missing in the path
Repository: spark
Updated Branches:
refs/heads/master f7534b37e -> c8b7f97b8
[SPARK-22377][BUILD] Use /usr/sbin/lsof if lsof does not exist in
release-build.sh
## What changes were proposed in this pull request?
This PR proposes to use `/usr/sbin/lsof` if `lsof` is missing in the path to
Repository: spark
Updated Branches:
refs/heads/branch-2.2 d905e85d2 -> 3ea6fd0c4
[SPARK-22377][BUILD] Use /usr/sbin/lsof if lsof does not exist in
release-build.sh
## What changes were proposed in this pull request?
This PR proposes to use `/usr/sbin/lsof` if `lsof` is missing in the path
Repository: spark
Updated Branches:
refs/heads/master b10837ab1 -> 57c5514de
[SPARK-22554][PYTHON] Add a config to control if PySpark should use daemon or
not for workers
## What changes were proposed in this pull request?
This PR proposes to add a flag to control if PySpark should use
Repository: spark
Updated Branches:
refs/heads/master d54bfec2e -> b10837ab1
[SPARK-22557][TEST] Use ThreadSignaler explicitly
## What changes were proposed in this pull request?
ScalaTest 3.0 uses an implicit `Signaler`. This PR makes sure all Spark
tests use `ThreadSignaler`
Repository: spark
Updated Branches:
refs/heads/master dce1610ae -> 8f0e88df0
[SPARK-20791][PYTHON][FOLLOWUP] Check for unicode column names in
createDataFrame with Arrow
## What changes were proposed in this pull request?
If schema is passed as a list of unicode strings for column names,
Repository: spark
Updated Branches:
refs/heads/master 3eb315d71 -> 223d83ee9
[SPARK-22476][R] Add dayofweek function to R
## What changes were proposed in this pull request?
This PR adds `dayofweek` to R API:
```r
data <- list(list(d = as.Date("2012-12-13")),
list(d =
```