Repository: spark
Updated Branches:
refs/heads/master 855d12ac0 - 4dfe180fc
[SPARK-5473] [EC2] Expose SSH failures after status checks pass
If there is some fatal problem with launching a cluster, `spark-ec2` just hangs
without giving the user useful feedback on what the problem is.
This PR
Repository: spark
Updated Branches:
refs/heads/branch-1.3 5782ee29e - f2aa7b757
[SPARK-5473] [EC2] Expose SSH failures after status checks pass
If there is some fatal problem with launching a cluster, `spark-ec2` just hangs
without giving the user useful feedback on what the problem is.
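The shape of such a fix can be sketched generically: after the EC2 status checks pass, probe SSH a bounded number of times and, on final failure, show the user the last error instead of looping silently. A minimal Python sketch (the `probe` callable and message format are hypothetical, not spark-ec2's actual code):
```python
# Illustrative sketch: retry a probe a bounded number of times, and if it
# never succeeds, surface the last failure to the user instead of hanging.
# `probe` is a hypothetical callable returning (ok, detail).
def wait_for_ssh(probe, max_attempts=3):
    last_detail = None
    for _ in range(max_attempts):
        ok, detail = probe()
        if ok:
            return
        last_detail = detail
    raise RuntimeError(
        "Failed to SSH into the cluster after {} attempts; last error:\n{}"
        .format(max_attempts, last_detail))
```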
Repository: spark
Updated Branches:
refs/heads/branch-1.3 6ddbca494 - 663d34ec8
[SQL] Remove the duplicated code
Author: Cheng Hao hao.ch...@intel.com
Closes #4494 from chenghao-intel/tiny_code_change and squashes the following
commits:
450dfe7 [Cheng Hao] remove the duplicated code
Repository: spark
Updated Branches:
refs/heads/master a2d33d0b0 - bd0b5ea70
[SQL] Remove the duplicated code
Author: Cheng Hao hao.ch...@intel.com
Closes #4494 from chenghao-intel/tiny_code_change and squashes the following
commits:
450dfe7 [Cheng Hao] remove the duplicated code

Repository: spark
Updated Branches:
refs/heads/master ef2f55b97 - c15134632
[SPARK-4964][Streaming][Kafka] More updates to Exactly-once Kafka stream
Changes
- Added example
- Added a critical unit test that verifies that offset ranges can be recovered
through checkpoints
Might add more
Repository: spark
Updated Branches:
refs/heads/master 3ec3ad295 - d302c4800
[SPARK-5698] Do not let user request negative # of executors
Otherwise we might crash the ApplicationMaster. Why? Please see
https://issues.apache.org/jira/browse/SPARK-5698.
sryza I believe this is also relevant in
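The guard itself is simple; a hedged Python sketch (hypothetical function name, not Spark's actual API) of rejecting the bad value before it ever reaches the ApplicationMaster:
```python
# Illustrative sketch of the validation described above: fail fast on a
# negative executor request instead of forwarding it to the cluster manager.
def request_executors(num_executors):
    if num_executors < 0:
        raise ValueError(
            "Attempted to request a negative number of executor(s) "
            "{} from the cluster manager. Please specify a positive number!"
            .format(num_executors))
    return num_executors
```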
Repository: spark
Updated Branches:
refs/heads/master bd0b5ea70 - ef2f55b97
[SPARK-5597][MLLIB] save/load for decision trees and ensembles
This is based on # from jkbradley with the following changes:
1. Node schema updated to
~~~
treeId: Int
nodeId: Int
predict/
|- predict:
Repository: spark
Updated Branches:
refs/heads/branch-1.3 663d34ec8 - 01905c41e
[SPARK-5597][MLLIB] save/load for decision trees and ensembles
This is based on # from jkbradley with the following changes:
1. Node schema updated to
~~~
treeId: Int
nodeId: Int
predict/
|-
Repository: spark
Updated Branches:
refs/heads/branch-1.3 01905c41e - 281614d7c
[SPARK-4964][Streaming][Kafka] More updates to Exactly-once Kafka stream
Changes
- Added example
- Added a critical unit test that verifies that offset ranges can be recovered
through checkpoints
Might add more
Repository: spark
Updated Branches:
refs/heads/branch-1.3 fa67877c2 - 43972b5d1
[SPARK-5678] Convert DataFrame to pandas.DataFrame and Series
```
pyspark.sql.DataFrame.to_pandas = to_pandas(self) unbound pyspark.sql.DataFrame method
Collect all the rows and return a `pandas.DataFrame`.
```
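The core of such a conversion is collecting the rows and pivoting them into the column-major dict that `pandas.DataFrame(...)` accepts. A dependency-free sketch of that pivot (hypothetical helper, not the actual `to_pandas` implementation):
```python
# Illustrative sketch: pivot collected row tuples into a dict of columns,
# the form pandas.DataFrame's constructor accepts directly.
def rows_to_columns(field_names, rows):
    return {name: [row[i] for row in rows]
            for i, name in enumerate(field_names)}
```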
Repository: spark
Updated Branches:
refs/heads/branch-1.3 c88d4ab1d - fa67877c2
[SPARK-5664][BUILD] Restore stty settings when exiting from SBT's spark-shell
For launching spark-shell from SBT.
Author: Liang-Chi Hsieh vii...@gmail.com
Closes #4451 from viirya/restore_stty and squashes the
Repository: spark
Updated Branches:
refs/heads/master afb131637 - dae216147
[SPARK-5664][BUILD] Restore stty settings when exiting from SBT's spark-shell
For launching spark-shell from SBT.
Author: Liang-Chi Hsieh vii...@gmail.com
Closes #4451 from viirya/restore_stty and squashes the
Repository: spark
Updated Branches:
refs/heads/branch-1.0 4b9234905 - 444ccdd80
SPARK-3242 [EC2] Spark 1.0.2 ec2 scripts creates clusters with Spark 1.0.1
installed by default
tdas you recorded this as a blocker to-do for branch 1.0. Seemed easy, so
here's a PR?
Author: Sean Owen
Repository: spark
Updated Branches:
refs/heads/branch-1.3 f2aa7b757 - c88d4ab1d
SPARK-4267 [YARN] Failing to launch jobs on Spark on YARN with Hadoop 2.5.0 or
later
Before passing to YARN, escape arguments in extraJavaOptions args, in order
to correctly handle cases like -Dfoo=one two
Repository: spark
Updated Branches:
refs/heads/master 0793ee1b4 - de7806048
SPARK-4267 [YARN] Failing to launch jobs on Spark on YARN with Hadoop 2.5.0 or
later
Before passing to YARN, escape arguments in extraJavaOptions args, in order
to correctly handle cases like -Dfoo=one two three.
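The failure mode is easy to reproduce in miniature: once `-Dfoo=one two three` is split on whitespace, the grouping is lost unless each argument is quoted before any shell-interpreting layer sees it. A Python sketch of the idea, using `shlex.quote` as a stand-in for the Scala-side escaping helper:
```python
import shlex

# Illustrative sketch: quote each option so an embedded space survives a
# later shell-style split. Options without unsafe characters pass through
# unchanged.
def escape_java_opts(opts):
    return " ".join(shlex.quote(opt) for opt in opts)
```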
Repository: spark
Updated Branches:
refs/heads/master 2a3629253 - 0ee53ebce
[SPARK-2096][SQL] support dot notation on array of struct
The rule is simple: if you want `a.b` to work, then `a` must be some level of
nested array of struct (level 0 means just a StructType), and the result of
`a.b`
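The described semantics can be illustrated in plain Python, with dicts standing in for structs and lists for arrays (this is not Catalyst code):
```python
# Illustrative sketch: resolve field `name` on a value that is either a
# struct (dict) or an arbitrarily nested array of structs. Level 0 is a
# plain struct; each array level maps the extraction over its elements.
def extract_field(value, name):
    if isinstance(value, dict):
        return value[name]
    return [extract_field(v, name) for v in value]
```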
Repository: spark
Updated Branches:
refs/heads/branch-1.3 ce2c89cfb - 15f557fd9
[SPARK-2096][SQL] support dot notation on array of struct
The rule is simple: if you want `a.b` to work, then `a` must be some level of
nested array of struct (level 0 means just a StructType). And the result of
Repository: spark
Updated Branches:
refs/heads/branch-1.3 15f557fd9 - e2bf59af1
[SPARK-5648][SQL] support alter ... unset tblproperties(key)
Make HiveContext support `alter ... unset tblproperties(key)`, e.g.:
alter view viewName unset tblproperties(k)
alter table tableName unset
Repository: spark
Updated Branches:
refs/heads/master 0ee53ebce - d08e7c2b4
[SPARK-5648][SQL] support alter ... unset tblproperties(key)
Make HiveContext support `alter ... unset tblproperties(key)`, e.g.:
alter view viewName unset tblproperties(k)
alter table tableName unset tblproperties(k)
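A sketch of the UNSET semantics with a plain dict standing in for the table's property map (illustrative only, not HiveContext's implementation; that a missing key errors without IF EXISTS is an assumption here):
```python
# Illustrative sketch: UNSET TBLPROPERTIES over a property dict. Without
# if_exists, removing a nonexistent key raises; with it, the key is skipped.
def unset_tblproperties(props, keys, if_exists=False):
    for key in keys:
        if key not in props and not if_exists:
            raise KeyError("Property {!r} does not exist".format(key))
        props.pop(key, None)
    return props
```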
Repository: spark
Updated Branches:
refs/heads/master d08e7c2b4 - 3ec3ad295
[SPARK-5699] [SQL] [Tests] Runs hive-thriftserver tests whenever SQL code is
modified
Repository: spark
Updated Branches:
refs/heads/branch-1.3 e2bf59af1 - 71f0f5115
[SPARK-5699] [SQL] [Tests] Runs hive-thriftserver tests whenever SQL code is
modified
Repository: spark
Updated Branches:
refs/heads/branch-1.3 71f0f5115 - 62b1e1fc0
[SPARK-5698] Do not let user request negative # of executors
Otherwise we might crash the ApplicationMaster. Why? Please see
https://issues.apache.org/jira/browse/SPARK-5698.
sryza I believe this is also
Repository: spark
Updated Branches:
refs/heads/branch-1.2 63eee523e - 515f65804
[SPARK-5698] Do not let user request negative # of executors
Otherwise we might crash the ApplicationMaster. Why? Please see
https://issues.apache.org/jira/browse/SPARK-5698.
sryza I believe this is also
Repository: spark
Updated Branches:
refs/heads/branch-1.2 4bad85485 - 97541b22e
[SPARK-5691] Fixing wrong data structure lookup for dupe app registration
In Master's registerApplication method, it checks if the application had
already registered by examining the addressToWorker hash map. In
Repository: spark
Updated Branches:
refs/heads/branch-1.0 444ccdd80 - f74bccbe3
[SPARK-5691] Fixing wrong data structure lookup for dupe app registration
In Master's registerApplication method, it checks if the application had
already registered by examining the addressToWorker hash map. In
Repository: spark
Updated Branches:
refs/heads/master dae216147 - 6fe70d843
[SPARK-5691] Fixing wrong data structure lookup for dupe app registratio...
In Master's registerApplication method, it checks if the application had
already registered by examining the addressToWorker hash map. In
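The bug class is general: a duplicate check is only sound if it consults a map keyed by the value being checked. A Python sketch with hypothetical names (not Master's actual fields), deduplicating registrations by application address:
```python
# Illustrative sketch: the duplicate check consults the map keyed by the
# address being checked, so a re-registration from the same address is
# detected instead of silently creating a second entry.
class Registry:
    def __init__(self):
        self.address_to_app = {}  # the map the duplicate check must use
        self.id_to_app = {}

    def register(self, app_id, address):
        if address in self.address_to_app:
            return False  # duplicate registration: ignore
        self.address_to_app[address] = app_id
        self.id_to_app[app_id] = address
        return True
```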
Author: pwendell
Date: Mon Feb 9 21:27:38 2015
New Revision: 1658579
URL: http://svn.apache.org/r1658579
Log:
Adding release source doc
Added:
spark/releases/_posts/2015-02-09-spark-release-1-2-1.md
Modified:
spark/site/releases/spark-release-1-2-1.html
Added:
Repository: spark
Updated Branches:
refs/heads/branch-1.3 43972b5d1 - 6a0144c63
[SPARK-5691] Fixing wrong data structure lookup for dupe app registratio...
In Master's registerApplication method, it checks if the application had
already registered by examining the addressToWorker hash map. In
Repository: spark
Updated Branches:
refs/heads/branch-1.1 40bce6350 - 03d4097bc
[SPARK-5691] Fixing wrong data structure lookup for dupe app registration
In Master's registerApplication method, it checks if the application had
already registered by examining the addressToWorker hash map. In
Repository: spark
Updated Branches:
refs/heads/master 4dfe180fc - 0793ee1b4
SPARK-2149. [MLLIB] Univariate kernel density estimation
Author: Sandy Ryza sa...@cloudera.com
Closes #1093 from sryza/sandy-spark-2149 and squashes the following commits:
5f06b33 [Sandy Ryza] More review comments
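For reference, the univariate Gaussian KDE is simple to state: the density at x is the average of Gaussian kernels of bandwidth h centered at each sample. A pure-Python sketch of the math (not MLlib's distributed implementation):
```python
import math

# Illustrative sketch: Gaussian kernel density estimate at point x, given
# samples and bandwidth h. Each sample contributes a normalized Gaussian
# bump; the estimate is their mean.
def kde(samples, h, x):
    norm = 1.0 / (len(samples) * h * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-((x - s) / h) ** 2 / 2.0) for s in samples)
```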
Repository: spark
Updated Branches:
refs/heads/master 6fe70d843 - 0765af9b2
[SPARK-4905][STREAMING] FlumeStreamSuite fix.
Using String constructor instead of CharsetDecoder to see if it fixes the issue
of empty strings in Flume test output.
Author: Hari Shreedharan hshreedha...@apache.org
Repository: spark
Updated Branches:
refs/heads/branch-1.3 6a0144c63 - 18c5a999b
[SPARK-4905][STREAMING] FlumeStreamSuite fix.
Using String constructor instead of CharsetDecoder to see if it fixes the issue
of empty strings in Flume test output.
Author: Hari Shreedharan
Repository: spark
Updated Branches:
refs/heads/branch-1.2 97541b22e - 63eee523e
[SPARK-4905][STREAMING] FlumeStreamSuite fix.
Using String constructor instead of CharsetDecoder to see if it fixes the issue
of empty strings in Flume test output.
Author: Hari Shreedharan
Repository: spark
Updated Branches:
refs/heads/master f48199eb3 - b884daa58
[SPARK-5611] [EC2] Allow spark-ec2 repo and branch to be set on CLI of
spark_ec2.py
and by extension, the ami-list
Useful for using alternate spark-ec2 repos or branches.
Author: Florian Verhein
Repository: spark
Updated Branches:
refs/heads/master b884daa58 - 68b25cf69
[SQL] Add some missing DataFrame functions.
- as with a `Symbol`
- distinct
- sqlContext.emptyDataFrame
- move add/remove col out of RDDApi section
Author: Michael Armbrust mich...@databricks.com
Closes #4437 from
Repository: spark
Updated Branches:
refs/heads/branch-1.3 1e2fab22b - a70dca025
[SQL] Add some missing DataFrame functions.
- as with a `Symbol`
- distinct
- sqlContext.emptyDataFrame
- move add/remove col out of RDDApi section
Author: Michael Armbrust mich...@databricks.com
Closes #4437
Repository: spark
Updated Branches:
refs/heads/branch-1.3 a70dca025 - e24160142
[SQL] Code cleanup.
I added an unnecessary line of code in
https://github.com/apache/spark/commit/13531dd97c08563e53dacdaeaf1102bdd13ef825.
My bad. Let's delete it.
Author: Yin Huai yh...@databricks.com
Closes
Repository: spark
Updated Branches:
refs/heads/branch-1.3 e24160142 - 379233cd0
[SPARK-5696] [SQL] [HOTFIX] Asks HiveThriftServer2 to re-initialize log4j using
Hive configurations
In this way, log4j configurations overridden by jets3t-0.9.2.jar can again be
overridden by Hive's default log4j
Repository: spark
Updated Branches:
refs/heads/master 5f0b30e59 - b8080aa86
[SPARK-5696] [SQL] [HOTFIX] Asks HiveThriftServer2 to re-initialize log4j using
Hive configurations
In this way, log4j configurations overridden by jets3t-0.9.2.jar can again be
overridden by Hive's default log4j
Repository: spark
Updated Branches:
refs/heads/branch-1.3 379233cd0 - ce2c89cfb
[SPARK-5614][SQL] Predicate pushdown through Generate.
Currently, Catalyst's rules cannot push predicates through Generate nodes.
Furthermore, partition pruning in HiveTableScan cannot be applied to those
Repository: spark
Updated Branches:
refs/heads/master b8080aa86 - 2a3629253
[SPARK-5614][SQL] Predicate pushdown through Generate.
Currently, Catalyst's rules cannot push predicates through Generate nodes.
Furthermore, partition pruning in HiveTableScan cannot be applied to those
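The payoff of pushing a predicate below a Generate can be seen in miniature: when the predicate references only the generator's input, filtering before the explode step produces the same rows while touching fewer of them. A plain-Python sketch (not Catalyst rules):
```python
# Illustrative sketch: exploding each row's items then filtering, versus
# filtering first then exploding. When the predicate only reads parent-row
# fields, the results are identical, but the pushed-down form expands fewer
# rows.
def generate_then_filter(rows, pred):
    return [(r["id"], item) for r in rows for item in r["items"] if pred(r)]

def filter_then_generate(rows, pred):
    return [(r["id"], item) for r in rows if pred(r) for item in r["items"]]
```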
Repository: spark
Updated Branches:
refs/heads/master 0765af9b2 - f48199eb3
[SPARK-5675][SQL] XyzType companion object should subclass XyzType
Otherwise, the following will always return false in Java.
```scala
dataType instanceof StringType
```
Author: Reynold Xin r...@databricks.com
Repository: spark
Updated Branches:
refs/heads/branch-1.3 18c5a999b - 1e2fab22b
[SPARK-5675][SQL] XyzType companion object should subclass XyzType
Otherwise, the following will always return false in Java.
```scala
dataType instanceof StringType
```
Author: Reynold Xin r...@databricks.com
Repository: spark
Updated Branches:
refs/heads/branch-1.1 03d4097bc - 651ceaeb3
[SQL] Fix flaky SET test
Author: Michael Armbrust mich...@databricks.com
Closes #4480 from marmbrus/fixSetTests and squashes the following commits:
f2e501e [Michael Armbrust] [SQL] Fix flaky SET test
Repository: spark
Updated Branches:
refs/heads/master de7806048 - afb131637
[SPARK-5678] Convert DataFrame to pandas.DataFrame and Series
```
pyspark.sql.DataFrame.to_pandas = to_pandas(self) unbound pyspark.sql.DataFrame method
Collect all the rows and return a `pandas.DataFrame`.
```
Repository: spark
Updated Branches:
refs/heads/branch-1.3 62b1e1fc0 - f0562b423
[SPARK-5469] restructure pyspark.sql into multiple files
All the DataTypes moved into pyspark.sql.types
The changes can be tracked by `--find-copies-harder -M25`
```
davies@localhost:~/work/spark/python$ git diff --find-copies-harder -M25 --numstat master..
```
(diff fragments: python/pyspark/sql.py deleted; python/pyspark/sql/__init__.py and python/pyspark/sql/types.py added)
Repository: spark
Updated Branches:
refs/heads/master d302c4800 - 08488c175
[SPARK-5469] restructure pyspark.sql into multiple files
All the DataTypes moved into pyspark.sql.types
The changes can be tracked by `--find-copies-harder -M25`
```
davies@localhost:~/work/spark/python$ git diff --find-copies-harder -M25 --numstat master..
```
(diff fragments: python/pyspark/sql.py deleted; python/pyspark/sql/__init__.py and python/pyspark/sql/types.py added)
Repository: spark
Updated Branches:
refs/heads/branch-1.3 f0562b423 - dad05e068
Add a config option to print DAG.
Add a config option, spark.rddDebug.enable, that controls whether DAG info is
printed. When spark.rddDebug.enable is true, information about the DAG is
printed to the log.
Author:
Repository: spark
Updated Branches:
refs/heads/master 08488c175 - 31d435ecf
Add a config option to print DAG.
Add a config option, spark.rddDebug.enable, that controls whether DAG info is
printed. When spark.rddDebug.enable is true, information about the DAG is
printed to the log.
Author:
Repository: spark
Updated Branches:
refs/heads/branch-1.3 dad05e068 - ebf1df03d
SPARK-4900 [MLLIB] MLlib SingularValueDecomposition ARPACK IllegalStateException
Fix ARPACK error code mapping, at least. It's not yet clear whether the error
is what we expect from ARPACK. If it isn't, not sure
Repository: spark
Updated Branches:
refs/heads/master 31d435ecf - 36c4e1d75
SPARK-4900 [MLLIB] MLlib SingularValueDecomposition ARPACK IllegalStateException
Fix ARPACK error code mapping, at least. It's not yet clear whether the error
is what we expect from ARPACK. If it isn't, not sure if
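The general fix pattern is to decode the solver's raw `info` return code into a readable message rather than failing opaquely. A Python sketch (the code values below are placeholders, not ARPACK's actual table):
```python
# Illustrative sketch: map a solver's nonzero return code to a readable
# error message instead of raising a bare exception. The entries here are
# hypothetical examples, not ARPACK's real code table.
ERROR_MESSAGES = {
    1: "Maximum number of iterations taken without convergence",
    -1: "Invalid input dimension",
}

def check_info(info):
    if info == 0:
        return
    raise RuntimeError("ARPACK returned non-zero code {}: {}".format(
        info, ERROR_MESSAGES.get(info, "unknown error code")))
```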
Repository: spark
Updated Branches:
refs/heads/master 36c4e1d75 - 20a601310
[SPARK-2996] Implement userClassPathFirst for driver, yarn.
Yarn's config option `spark.yarn.user.classpath.first` does not work the same
way as
`spark.files.userClassPathFirst`; Yarn's version is a lot more dangerous, in
that it
modifies the system classpath, instead of restricting the changes
Repository: spark
Updated Branches:
refs/heads/branch-1.3 ebf1df03d - 6a1e0f967
[SPARK-2996] Implement userClassPathFirst for driver, yarn.
Yarn's config option `spark.yarn.user.classpath.first` does not work the same
way as
`spark.files.userClassPathFirst`; Yarn's version is a lot more dangerous, in
that it
modifies the system classpath, instead of restricting the changes
Repository: spark
Updated Branches:
refs/heads/branch-1.3 6a1e0f967 - 832625509
[SPARK-5703] AllJobsPage throws empty.max exception
If you have a `SparkListenerJobEnd` event without the corresponding
`SparkListenerJobStart` event, then `JobProgressListener` will create an empty
`JobUIData`
Repository: spark
Updated Branches:
refs/heads/branch-1.2 515f65804 - 53de2378e
[SPARK-5703] AllJobsPage throws empty.max exception
If you have a `SparkListenerJobEnd` event without the corresponding
`SparkListenerJobStart` event, then `JobProgressListener` will create an empty
`JobUIData`
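The crash pattern is Scala's `Seq.max` thrown on an empty collection; the guard is an explicit default. A Python analogue of the fix:
```python
# Illustrative sketch: Scala's stageIds.max throws on an empty sequence
# (the `empty.max` exception); supplying an explicit default keeps a
# JobEnd event with no matching JobStart from crashing the page.
def last_stage_id(stage_ids):
    return max(stage_ids, default=-1)
```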
Repository: spark
Updated Branches:
refs/heads/branch-1.3 832625509 - 6ddbca494
[SPARK-5701] Only set ShuffleReadMetrics when task has shuffle deps
The updateShuffleReadMetrics method in TaskMetrics (called by the executor
heartbeater) will currently always add a ShuffleReadMetrics to
Repository: spark
Updated Branches:
refs/heads/master a95ed5215 - a2d33d0b0
[SPARK-5701] Only set ShuffleReadMetrics when task has shuffle deps
The updateShuffleReadMetrics method in TaskMetrics (called by the executor
heartbeater) will currently always add a ShuffleReadMetrics to
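A sketch of the guard described above, with a dict standing in for TaskMetrics (hypothetical structure, not Spark's actual class): only materialize shuffle-read metrics when the task actually has shuffle dependencies.
```python
# Illustrative sketch: skip creating shuffle-read metrics for a task with no
# shuffle dependencies, so downstream code can treat "metrics present" as
# "a shuffle actually happened".
def update_shuffle_read_metrics(task):
    if not task.get("shuffle_deps"):
        return None
    merged = {"records_read": sum(d["records_read"]
                                  for d in task["shuffle_deps"])}
    task["shuffle_read_metrics"] = merged
    return merged
```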