Repository: spark
Updated Branches:
refs/heads/master 9b618fb0d - f0dcbe8a7
[SPARK-8541] [PYSPARK] test the absolute error in approx doctests
A minor change but one which is (presumably) visible on the public api docs
webpage.
Author: Scott Taylor git...@megatron.me.uk
Closes #6942 from
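The change swaps exact-value assertions in the `approx` doctests for a bounded absolute-error check. A minimal sketch of the pattern (the helper name and the numbers here are illustrative, not taken from the patch):

```python
# Instead of asserting an exact value for an approximate result, assert that
# the absolute error stays within a tolerance.
def within_abs_error(approx, exact, tol):
    """Return True if the approximate result is within `tol` of the exact one."""
    return abs(approx - exact) <= tol

exact = sum(range(1000))   # 499500
approx = 499520.0          # a hypothetical approximate answer
print(within_abs_error(approx, exact, tol=100.0))  # True: |error| = 20 <= 100
```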
Repository: spark
Updated Branches:
refs/heads/branch-1.4 22cc1ab66 - d0943afbc
[SPARK-8541] [PYSPARK] test the absolute error in approx doctests
A minor change but one which is (presumably) visible on the public api docs
webpage.
Author: Scott Taylor git...@megatron.me.uk
Closes #6942
Repository: spark
Updated Branches:
refs/heads/branch-1.3 45b4527e3 - 716dcf631
[SPARK-8541] [PYSPARK] test the absolute error in approx doctests
A minor change but one which is (presumably) visible on the public api docs
webpage.
Author: Scott Taylor git...@megatron.me.uk
Closes #6942
Repository: spark
Updated Branches:
refs/heads/branch-1.4 152f4465d - bd9bbd611
[SPARK-8462] [DOCS] Documentation fixes for Spark SQL
This fixes various minor documentation issues on the Spark SQL page
Author: Lars Francke lars.fran...@gmail.com
Closes #6890 from lfrancke/SPARK-8462 and
Repository: spark
Updated Branches:
refs/heads/master 43f50decd - 4ce3bab89
[SPARK-8462] [DOCS] Documentation fixes for Spark SQL
This fixes various minor documentation issues on the Spark SQL page
Author: Lars Francke lars.fran...@gmail.com
Closes #6890 from lfrancke/SPARK-8462 and
Repository: spark
Updated Branches:
refs/heads/master dc4131389 - 43f50decd
[SPARK-8135] Don't load defaults when reconstituting Hadoop Configurations
Author: Sandy Ryza sa...@cloudera.com
Closes #6679 from sryza/sandy-spark-8135 and squashes the following commits:
c5554ff [Sandy Ryza]
that the items usually have similar size, so we don't need
to adjust the batch size after first spill.
cc JoshRosen rxin angelini
Author: Davies Liu dav...@databricks.com
Closes #6714 from davies/batch_size and squashes the following commits:
b170dfb [Davies Liu] update test
b9be832 [Davies Liu] Merge
(it was introduced for the old AMPCamp training, but isn't used anymore).
Author: Josh Rosen joshro...@databricks.com
Closes #6808 from JoshRosen/SPARK-8353 and squashes the following commits:
e59d8a7 [Josh Rosen] Suppress underline on hover
f518b6a [Josh Rosen] Turn on for all headers, since we use H1s
Repository: spark
Updated Branches:
refs/heads/master 6765ef98d - 50a0496a4
[SPARK-7017] [BUILD] [PROJECT INFRA] Refactor dev/run-tests into Python
All, this is a first attempt at refactoring `dev/run-tests` into Python.
Initially I merely converted all Bash calls over to Python, then moved
Repository: spark
Updated Branches:
refs/heads/master 3b6107704 - 9db73ec12
[SPARK-8381][SQL]reuse typeConvert when convert Seq[Row] to catalyst type
Reuse typeConvert when converting Seq[Row] to CatalystType.
Author: Lianhui Wang lianhuiwan...@gmail.com
Closes #6831 from
Repository: spark
Updated Branches:
refs/heads/master d1069cba4 - 165f52f2f
[HOTFIX] [PROJECT-INFRA] Fix bug in dev/run-tests for MLlib-only PRs
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/165f52f2
Tree:
of UnsafeRows, since UnsafeRowConverter already used integers when calculating the size requirements for rows.
Author: Josh Rosen joshro...@databricks.com
Closes #6809 from JoshRosen/sql-bytes-vs-words-fix and squashes the following commits:
6520339 [Josh Rosen] Updates to reflect fact
should be safe.
Author: Josh Rosen joshro...@databricks.com
Closes #6773 from JoshRosen/SPARK-8319 and squashes the following commits:
7a14129 [Josh Rosen] Revise comments; add handler to guard against future
ShuffleManager implementations
07bb2c9 [Josh Rosen] Update comment to clarify
this in followup patches.
Author: Josh Rosen joshro...@databricks.com
Closes #6618 from JoshRosen/SPARK-8062-branch-1.2 and squashes the following
commits:
652fa3c [Josh Rosen] Re-name test and reapply fix
66fc600 [Josh Rosen] Fix and minimize regression test (verified that it still
fails)
1d8d125
Repository: spark
Updated Branches:
refs/heads/branch-1.4 f1d4e7e31 - df0bf71ee
[HOTFIX] Remove trailing whitespace to fix Scalastyle checks
866652c903d06d1cb4356283e0741119d84dcc21 enabled this check.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/master 1281a3518 - 66a53a696
[HOTFIX] Replace FunSuite with SparkFunSuite.
This fixes a build break introduced by merging
a6430028ecd7a6130f1eb15af9ec00e242c46725,
which fails the new style checks that ensure that we use SparkFunSuite instead
of
Repository: spark
Updated Branches:
refs/heads/master 1c5b19827 - 82a396c2f
[SPARK-7910] [TINY] [JAVAAPI] expose partitioner information in javardd
Author: Holden Karau hol...@pigscanfly.ca
Closes #6464 from holdenk/SPARK-7910-expose-partitioner-information-in-javardd
and squashes the
that this problem only occurs when auto-reset is disabled and reference-tracking is enabled.
Author: Josh Rosen joshro...@databricks.com
Closes #6293 from JoshRosen/kryo-instance-reuse-bug and squashes the following commits:
e19726d [Josh Rosen] Add fix for SPARK-7766.
71845e3 [Josh Rosen] Add failing regression
Repository: spark
Updated Branches:
refs/heads/branch-1.4 d6cb04463 - afde4019b
[SPARK-7760] add /json back into master worker pages; add test
Author: Imran Rashid iras...@cloudera.com
Closes #6284 from squito/SPARK-7760 and squashes the following commits:
5e02d8a [Imran Rashid] style;
Repository: spark
Updated Branches:
refs/heads/master 63a5ce75e - a16357413
[SPARK-7795] [CORE] Speed up task scheduling in standalone mode by reusing
serializer
My experiments with scheduling very short tasks in standalone cluster mode
indicated that a significant amount of time was being
Repository: spark
Updated Branches:
refs/heads/branch-1.4 e597692ac - c9a80fc40
[SPARK-7711] Add a startTime property to match the corresponding one in Scala
Author: Holden Karau hol...@pigscanfly.ca
Closes #6275 from holdenk/SPARK-771-startTime-is-missing-from-pyspark and
squashes the
Repository: spark
Updated Branches:
refs/heads/master 3d085 - 6b18cdc1b
[SPARK-7711] Add a startTime property to match the corresponding one in Scala
Author: Holden Karau hol...@pigscanfly.ca
Closes #6275 from holdenk/SPARK-771-startTime-is-missing-from-pyspark and
squashes the
build and a regular build. If a build is a regular one, we always set _RUN_SQL_TESTS to true.
cc JoshRosen nchammas
Author: Yin Huai yh...@databricks.com
Closes #5955 from yhuai/runSQLTests and squashes the following commits:
3d399bc [Yin Huai] Always run SQL tests in master build.
and will cause problem because it is too early and before some
assert checking. E.g., if an attempt with incorrect `keyLengthBytes` marks
`isDefined` as true, the location can not be used later.
ping JoshRosen
Author: Liang-Chi Hsieh vii...@gmail.com
Closes #6324 from viirya/dup_isdefined
JoshRosen/SPARK-7251 and squashes the following commits:
05bd90a [Josh Rosen] Compare capacity, not size, to MAX_CAPACITY
2a20d71 [Josh Rosen] Fix maximum BytesToBytesMap capacity
bc4854b [Josh Rosen] Guard against overflow when growing BytesToBytesMap
f5feadf [Josh Rosen] Add test for iterating over
We now use Guava's `Iterators.emptyIterator()` in place of `Collections.emptyIterator()`, which isn't present in all Java 6 versions.
Author: Josh Rosen joshro...@databricks.com
Closes #6298 from JoshRosen/SPARK-7719-fix-java-6-test-code and squashes the following commits:
5c9bd85 [Josh Rosen] Re
on a stress test which launches huge numbers of short-lived shuffle map tasks back-to-back in the same JVM.
Author: Josh Rosen joshro...@databricks.com
Closes #6227 from JoshRosen/SPARK-7698 and squashes the following commits:
fd6cb55 [Josh Rosen] SoftReference - WeakReference
b154e86 [Josh Rosen
Repository: spark
Updated Branches:
refs/heads/master 9dadf019b - 32fbd297d
[SPARK-6216] [PYSPARK] check python version of worker with driver
This PR reverts #5404 and instead passes the driver's Python version into the JVM, checking it in the worker before deserializing the closure, so it works with
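The idea can be sketched with hypothetical function names (the real check lives in PySpark's worker startup): the driver ships its major.minor version, and the worker compares it to its own before deserializing anything:

```python
import sys

def version_tag():
    # Major.minor is what must match for pickled closures to deserialize
    # consistently (e.g. "2.7", "3.4").
    return "%d.%d" % sys.version_info[:2]

def check_worker_version(driver_version):
    # Hypothetical worker-side guard, run before deserializing the closure.
    worker_version = version_tag()
    if worker_version != driver_version:
        raise RuntimeError(
            "Python in worker has different version %s than that in driver %s"
            % (worker_version, driver_version))

check_worker_version(version_tag())  # same interpreter: no error
```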
fix.
Author: Josh Rosen joshro...@databricks.com
Closes #6176 from JoshRosen/SPARK-7660-wrap-snappy and squashes the following
commits:
8b77aae [Josh Rosen] Wrap SnappyOutputStream to fix SPARK-7660
(cherry picked from commit f2cc6b5bccc3a70fd7d69183b1a068800831fe19)
Signed-off-by: Josh Rosen
Repository: spark
Updated Branches:
refs/heads/master e8f0e016e - 7da33ce50
[HOTFIX] Add workaround for SPARK-7660 to fix JavaAPISuite failures.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7da33ce5
Tree:
Repository: spark
Updated Branches:
refs/heads/branch-1.4 7aa269f4b - 1206a5597
[HOTFIX] Add workaround for SPARK-7660 to fix JavaAPISuite failures.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1206a559
Tree:
Repository: spark
Updated Branches:
refs/heads/master 008a60dd3 - 35d6a99cb
[SPARK-7436] Fixed instantiation of custom recovery mode factory and added tests
Author: Jacek Lewandowski lewandowski.ja...@gmail.com
Closes #5977 from jacek-lewandowski/SPARK-7436 and squashes the following
[SPARK-6627] Finished rename to ShuffleBlockResolver
The previous cleanup-commit for SPARK-6627 renamed ShuffleBlockManager to ShuffleBlockResolver, but didn't rename the associated subclasses and variables; this commit does that.
I'm unsure whether it's ok to rename ExternalShuffleBlockManager,
Repository: spark
Updated Branches:
refs/heads/master 2d05f325d - 4b3bb0e43
http://git-wip-us.apache.org/repos/asf/spark/blob/4b3bb0e4/network/shuffle/src/test/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolverSuite.java
Repository: spark
Updated Branches:
refs/heads/branch-1.3 edcd3643a - 7fd212b57
[SPARK-7436] Fixed instantiation of custom recovery mode factory and added tests
Author: Jacek Lewandowski lewandowski.ja...@gmail.com
Closes #5975 from jacek-lewandowski/SPARK-7436-1.3 and squashes the following
Repository: spark
Updated Branches:
refs/heads/branch-1.4 4f01f5b56 - 89d94878f
[SPARK-7436] Fixed instantiation of custom recovery mode factory and added tests
Author: Jacek Lewandowski lewandowski.ja...@gmail.com
Closes #5976 from jacek-lewandowski/SPARK-7436-1.4 and squashes the following
Repository: spark
Updated Branches:
refs/heads/master 002c12384 - 845d1d4d0
Add `Private` annotation.
This was originally added as part of #4435, which was reverted.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
Repository: spark
Updated Branches:
refs/heads/branch-1.4 d651e2838 - 2163367ea
Add `Private` annotation.
This was originally added as part of #4435, which was reverted.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit:
and comments clarifying when this works for KryoSerializer.
This change allows the optimizations in #4450 to be applied for shuffles that use `SqlSerializer2`.
Author: Josh Rosen joshro...@databricks.com
Closes #5924 from JoshRosen/SPARK-7311 and squashes the following commits:
50a68ca [Josh Rosen
Repository: spark
Updated Branches:
refs/heads/master c688e3c5e - 0092abb47
Some minor cleanup after SPARK-4550.
JoshRosen this PR addresses the comments you left on #4450 after it got merged.
Author: Sandy Ryza sa...@cloudera.com
Closes #5916 from sryza/sandy-spark-4550-cleanup
Repository: spark
Updated Branches:
refs/heads/branch-1.4 4afb578b7 - 762ff2e11
Some minor cleanup after SPARK-4550.
JoshRosen this PR addresses the comments you left on #4450 after it got merged.
Author: Sandy Ryza sa...@cloudera.com
Closes #5916 from sryza/sandy-spark-4550-cleanup
Repository: spark
Updated Branches:
refs/heads/master 968ad9721 - 77176619a
[SPARK-6661] Python type errors should print type, not object
Author: Elisey Zanko elisey.za...@gmail.com
Closes #5361 from 31z4/spark-6661 and squashes the following commits:
73c5d79 [Elisey Zanko] Python type
http://git-wip-us.apache.org/repos/asf/spark/blob/04e44b37/examples/src/main/python/sort.py
--
diff --git a/examples/src/main/python/sort.py b/examples/src/main/python/sort.py
index bb686f1..f6b0ecb 100755
---
Repository: spark
Updated Branches:
refs/heads/master 55f553a97 - 04e44b37c
http://git-wip-us.apache.org/repos/asf/spark/blob/04e44b37/python/pyspark/tests.py
--
diff --git a/python/pyspark/tests.py b/python/pyspark/tests.py
[SPARK-4897] [PySpark] Python 3 support
This PR update PySpark to support Python 3 (tested with 3.4).
Known issue: unpickle array from Pyrolite is broken in Python 3, those tests
are skipped.
TODO: ec2/spark-ec2.py is not fully tested with python3.
Author: Davies Liu dav...@databricks.com
http://git-wip-us.apache.org/repos/asf/spark/blob/04e44b37/python/pyspark/sql/types.py
--
diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py
deleted file mode 100644
index ef76d84..000
---
http://git-wip-us.apache.org/repos/asf/spark/blob/04e44b37/python/pyspark/sql/_types.py
--
diff --git a/python/pyspark/sql/_types.py b/python/pyspark/sql/_types.py
new file mode 100644
index 000..492c0cb
--- /dev/null
+++
in
Python may be GCed, then the broadcast will be destroyed in JVM before the
PythonRDD.
This PR changes to use PythonRDD to track the lifecycle of the broadcast object. It also refactors getNumPartitions() to avoid unnecessary creation of a PythonRDD, which could be heavy.
cc JoshRosen
Repository: spark
Updated Branches:
refs/heads/branch-1.2 964f54478 - 8e9fc27aa
Revert [SPARK-5634] [core] Show correct message in HS when no incomplete apps
f...
This reverts commit 5845a62361c39eb97df5de01c982821c8858de76.
This was reverted because it broke compilation for branch-1.2.
/xerial/snappy-java/issues/100).
Author: Josh Rosen joshro...@databricks.com
Closes #5512 from JoshRosen/snappy-1.1.1.7 and squashes the following commits:
f1ac0f8 [Josh Rosen] Upgrade to snappy-java 1.1.1.7.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip
://github.com/xerial/snappy-java/issues/100).
Author: Josh Rosen joshro...@databricks.com
Closes #5512 from JoshRosen/snappy-1.1.1.7 and squashes the following commits:
f1ac0f8 [Josh Rosen] Upgrade to snappy-java 1.1.1.7.
(cherry picked from commit 6adb8bcbf0a1a7bfe2990de18c59c66cd7a0aeb8)
Signed-off
Repository: spark
Updated Branches:
refs/heads/master 4d4b24927 - a76b921a9
Revert [SPARK-6352] [SQL] Add DirectParquetOutputCommitter
This reverts commit b29663eeea440b1d1a288d41b5ddf67e77c5bd54.
I'm reverting this because it broke test compilation for the Hadoop 1.x
profiles.
Project:
Repository: spark
Updated Branches:
refs/heads/master 5c2844c51 - dea5dacc5
[HOTFIX] Add explicit return types to fix lint errors
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dea5dacc
Tree:
Repository: spark
Updated Branches:
refs/heads/branch-1.3 ea13948b9 - 8d4176132
[SPARK-6677] [SQL] [PySpark] fix cached classes
It's possible for two DataType objects to have the same id (memory address) at different times, so we should check the cached classes to verify that each was generated by
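A minimal sketch of the verification the fix describes, with a hypothetical cache (CPython can reuse an address once the first object is collected, so `id()` alone is not a safe cache key):

```python
_cached_cls = {}

def cached_class(datatype):
    # Hypothetical cache keyed by id(); the verification step is the point.
    key = id(datatype)
    hit = _cached_cls.get(key)
    # Two objects can occupy the same address at different times, so a hit is
    # trusted only if the cached entry was built from an equal datatype.
    if hit is not None and hit[0] == datatype:
        return hit[1]
    cls = type("Row_%s" % datatype, (object,), {})  # stand-in for codegen
    _cached_cls[key] = (datatype, cls)
    return cls

dt = "struct<int>"
print(cached_class(dt) is cached_class(dt))  # True: verified cache hit
```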
Repository: spark
Updated Branches:
refs/heads/master 0cc8fcb4c - 5d8f7b9e8
[SPARK-6677] [SQL] [PySpark] fix cached classes
It's possible to have two DataType object with same id (memory address) at
different time, we should check the cached classes to verify that it's
generated by given
Repository: spark
Updated Branches:
refs/heads/master b9baa4cd9 - 0375134f4
[SPARK-5969][PySpark] Fix descending pyspark.rdd.sortByKey.
The samples should always be sorted in ascending order, because
bisect.bisect_left is used on it. The reverse order of the result is already
achieved in
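The constraint can be seen directly with the standard library: `bisect.bisect_left` only yields correct split points over an ascending sequence, so the sampled bounds must stay ascending even for a descending sort (the bounds and keys below are illustrative):

```python
import bisect

# Range-partitioning sketch: split points must be sorted ascending because
# bisect.bisect_left assumes an ascending sequence; for a descending sort the
# resulting partitions are simply consumed in reverse.
bounds = sorted([40, 10, 30])     # [10, 30, 40]

def partition_of(key):
    return bisect.bisect_left(bounds, key)

print([partition_of(k) for k in (5, 10, 35, 99)])  # [0, 0, 2, 3]
```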
Repository: spark
Updated Branches:
refs/heads/master 0375134f4 - 4740d6a15
[SPARK-6216] [PySpark] check the python version in worker
Author: Davies Liu dav...@databricks.com
Closes #5404 from davies/check_version and squashes the following commits:
e559248 [Davies Liu] add tests
ec33b5f
Repository: spark
Updated Branches:
refs/heads/branch-1.3 ec3e76f1e - 48321b83d
[SPARK-5969][PySpark] Fix descending pyspark.rdd.sortByKey.
The samples should always be sorted in ascending order, because
bisect.bisect_left is used on it. The reverse order of the result is already
achieved
Repository: spark
Updated Branches:
refs/heads/branch-1.2 7a1583917 - daec1c635
[SPARK-5969][PySpark] Fix descending pyspark.rdd.sortByKey.
The samples should always be sorted in ascending order, because
bisect.bisect_left is used on it. The reverse order of the result is already
achieved
Repository: spark
Updated Branches:
refs/heads/master 15e0d2bd1 - f7e21dd1e
[SPARK-6506] [pyspark] Do not try to retrieve SPARK_HOME when not needed...
In particular, this makes pyspark in yarn-cluster mode fail unless
SPARK_HOME is set, when it's not really needed.
Author: Marcelo
Repository: spark
Updated Branches:
refs/heads/branch-1.3 cdef7d080 - e967ecaca
[SPARK-6506] [pyspark] Do not try to retrieve SPARK_HOME when not needed...
In particular, this makes pyspark in yarn-cluster mode fail unless
SPARK_HOME is set, when it's not really needed.
Author: Marcelo
that subclass ShuffleSuite.scala. This commit fixes that
problem.
JoshRosen would be great if you could take a look at this, since you wrote this
test originally.
Author: Kay Ousterhout kayousterh...@gmail.com
Closes #5401 from kayousterhout/SPARK-6753 and squashes the following commits:
368c540 [Kay
Repository: spark
Updated Branches:
refs/heads/branch-1.3 1cde04f21 - ab1b8edb8
[SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py
The spark_ec2.py script uses public_dns_name everywhere in the script except
for testing ssh availability, which is done using the public ip address
Repository: spark
Updated Branches:
refs/heads/master a0846c4b6 - 6f0d55d76
[SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py
The spark_ec2.py script uses public_dns_name everywhere in the script except
for testing ssh availability, which is done using the public ip address of
by metrics users, but it's probably okay to
do this in a major release as long as we document it in the release notes.
Author: Josh Rosen joshro...@databricks.com
Closes #5372 from JoshRosen/driver-id-fix and squashes the following commits:
42d3c10 [Josh Rosen] Clarify comment
0c5d04b [Josh Rosen
Author: Josh Rosen joshro...@databricks.com
Closes #5397 from JoshRosen/SPARK-6737 and squashes the following commits:
af3b02f [Josh Rosen] Consolidate stage completion handling code in a single method.
e96ce3a [Josh Rosen] Consolidate stage completion handling code in a single method.
3052aea [Josh
reproduction.
This patch fixes this issue by ensuring proper cleanup of these resources. It
also adds logging for unexpected error cases.
(See #4944 for the corresponding PR for 1.3/1.4).
Author: Josh Rosen joshro...@databricks.com
Closes #5174 from JoshRosen/executorclassloaderleak-branch-1.2
Repository: spark
Updated Branches:
refs/heads/master 0cce5451a - e3202aa2e
SPARK-6414: Spark driver failed with NPE on job cancelation
Use Option for ActiveJob.properties to avoid NPE bug
Author: Hung Lin hung@gmail.com
Closes #5124 from hunglin/SPARK-6414 and squashes the following
Repository: spark
Updated Branches:
refs/heads/branch-1.3 a6664dcd8 - 58e2b3fcd
SPARK-6414: Spark driver failed with NPE on job cancelation
Use Option for ActiveJob.properties to avoid NPE bug
Author: Hung Lin hung@gmail.com
Closes #5124 from hunglin/SPARK-6414 and squashes the
operation if
there are many (e.g. thousands) of retained jobs.
This patch adds a new map to `JobProgressListener` in order to speed up these
lookups.
Author: Josh Rosen joshro...@databricks.com
Closes #4830 from JoshRosen/statustracker-job-group-indexing and squashes the
following commits
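A sketch of the indexing idea with hypothetical names (not the actual `JobProgressListener` fields): maintain a group-to-job-ids map alongside the retained jobs so group lookups stop scanning every job.

```python
from collections import defaultdict

jobs = {}                          # job_id -> group (the retained jobs)
jobs_by_group = defaultdict(set)   # the added index

def job_started(job_id, group):
    jobs[job_id] = group
    jobs_by_group[group].add(job_id)

def job_ids_for_group(group):
    # O(result size) lookup instead of O(number of retained jobs).
    return sorted(jobs_by_group[group])

job_started(1, "etl"); job_started(2, "etl"); job_started(3, "ml")
print(job_ids_for_group("etl"))  # [1, 2]
```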
Repository: spark
Updated Branches:
refs/heads/branch-1.2 a73055f7f - 8fa09a480
SPARK-6414: Spark driver failed with NPE on job cancelation
Use Option for ActiveJob.properties to avoid NPE bug
Author: Hung Lin hung@gmail.com
Closes #5124 from hunglin/SPARK-6414 and squashes the
Repository: spark
Updated Branches:
refs/heads/master 6e1c1ec67 - 440ea31b7
[SPARK-6621][Core] Fix the bug that calling EventLoop.stop in
EventLoop.onReceive/onError/onStart doesn't call onStop
Author: zsxwing zsxw...@gmail.com
Closes #5280 from zsxwing/SPARK-6621 and squashes the following
Repository: spark
Updated Branches:
refs/heads/branch-1.3 d21f77988 - ac705aa83
[SPARK-6621][Core] Fix the bug that calling EventLoop.stop in
EventLoop.onReceive/onError/onStart doesn't call onStop
Author: zsxwing zsxw...@gmail.com
Closes #5280 from zsxwing/SPARK-6621 and squashes the
Repository: spark
Updated Branches:
refs/heads/branch-1.3 1160cc9e1 - ee2bd70a4
[SPARK-6667] [PySpark] remove setReuseAddress
Reusing the address on the server side caused the server to fail to acknowledge connected connections, so remove it.
This PR will retry once after a timeout; it also adds
Repository: spark
Updated Branches:
refs/heads/master 424e987df - 0cce5451a
[SPARK-6667] [PySpark] remove setReuseAddress
Reusing the address on the server side caused the server to fail to acknowledge connected connections, so remove it.
This PR will retry once after a timeout; it also adds a
Repository: spark
Updated Branches:
refs/heads/branch-1.2 758ebf77d - a73055f7f
[SPARK-6667] [PySpark] remove setReuseAddress
Reusing the address on the server side caused the server to fail to acknowledge connected connections, so remove it.
This PR will retry once after a timeout; it also adds
Repository: spark
Updated Branches:
refs/heads/branch-1.3 bc04fa2e2 - 98f72dfc1
[SPARK-6553] [pyspark] Support functools.partial as UDF
Use `f.__repr__()` instead of `f.__name__` when instantiating
`UserDefinedFunction`s, so `functools.partial`s may be used.
Author: ksonj k...@siberie.de
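The underlying Python behavior can be checked directly: `functools.partial` objects carry no `__name__`, so naming a UDF via `f.__name__` raises `AttributeError`, while `repr(f)` works for both plain functions and partials:

```python
import functools

def add(a, b):
    return a + b

inc = functools.partial(add, 1)

# functools.partial objects have no __name__ attribute; repr() is the
# fallback that works for both plain functions and partials.
try:
    name = inc.__name__
except AttributeError:
    name = repr(inc)

print(inc(41))  # 42
```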
Repository: spark
Updated Branches:
refs/heads/master 86b439935 - 757b2e917
[SPARK-6553] [pyspark] Support functools.partial as UDF
Use `f.__repr__()` instead of `f.__name__` when instantiating
`UserDefinedFunction`s, so `functools.partial`s may be used.
Author: ksonj k...@siberie.de
Author: Josh Rosen joshro...@databricks.com
Closes #5276 from JoshRosen/SPARK-6614 and squashes the following commits:
d532ba7 [Josh Rosen] Check whether failed task was authorized committer
cbb3784 [Josh Rosen] Add regression test for SPARK-6614
Project: http://git-wip-us.apache.org/repos/asf/spark
to this bug.
Author: Josh Rosen joshro...@databricks.com
Closes #5050 from JoshRosen/javardd-si-8905-fix and squashes the following
commits:
2feb068 [Josh Rosen] Use intermediate abstract classes to work around SPARK-3266
d5f3e5d [Josh Rosen] Add failing regression tests for SPARK-3266
(cherry
Repository: spark
Updated Branches:
refs/heads/master 3b5aaa6a5 - f17d43b03
[SPARK-6219] [Build] Check that Python code compiles
This PR expands the Python lint checks so that they check for obvious
compilation errors in our Python code.
For example:
```
$ ./dev/lint-python
Python lint
```
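A sketch of the kind of check involved, using the standard library's `py_compile` (the actual lint script's invocation may differ): compiling a file catches syntax errors without executing it.

```python
import os
import py_compile
import tempfile

# Write a file with an obvious compilation error, then try to compile it.
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "w") as f:
    f.write("def broken(:\n")

try:
    py_compile.compile(path, doraise=True)
    compiles = True
except py_compile.PyCompileError:
    compiles = False
os.unlink(path)
print(compiles)  # False: the syntax error is caught without running the file
```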
Repository: spark
Updated Branches:
refs/heads/master 3db138742 - 540b2a4ea
[SPARK-6394][Core] cleanup BlockManager companion object and improve the
getCacheLocs method in DAGScheduler
The current implementation includes searching a HashMap many times; we can avoid this.
Actually if you look