spark git commit: [SPARK-5147][Streaming] Delete the received data WAL log periodically

2015-01-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.2 dd184290f -> cab410c52 [SPARK-5147][Streaming] Delete the received data WAL log periodically This is a refactored fix based on jerryshao's PR #4037. This enables deletion of old WAL files containing the received block data. Improvements

spark git commit: [SPARK-5147][Streaming] Delete the received data WAL log periodically

2015-01-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master fcb3e1862 -> 3027f06b4 [SPARK-5147][Streaming] Delete the received data WAL log periodically This is a refactored fix based on jerryshao's PR #4037. This enables deletion of old WAL files containing the received block data. Improvements ove
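
As a rough illustration of the periodic cleanup idea (not the actual WriteAheadLogManager code), a time-threshold sweep over WAL segment files might look like the sketch below; `logDir` and `retentionMs` are hypothetical names.

```scala
import java.io.File

// Minimal sketch: delete WAL segment files whose last-modified time
// falls before a retention cutoff. Illustrative only.
def cleanupOldLogs(logDir: File, retentionMs: Long): Unit = {
  val cutoff = System.currentTimeMillis() - retentionMs
  Option(logDir.listFiles()).getOrElse(Array.empty[File])
    .filter(f => f.isFile && f.lastModified() < cutoff)
    .foreach { f =>
      if (!f.delete()) println(s"Failed to delete old WAL file: ${f.getPath}")
    }
}
```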

spark git commit: [SPARK-5317]Set BoostingStrategy.defaultParams With Enumeration Algo.Classification or Algo.Regression

2015-01-21 Thread meng
Repository: spark Updated Branches: refs/heads/master ca7910d6d -> fcb3e1862 [SPARK-5317]Set BoostingStrategy.defaultParams With Enumeration Algo.Classification or Algo.Regression JIRA Issue: https://issues.apache.org/jira/browse/SPARK-5317 When setting the BoostingStrategy.defaultParams("Cla
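
A hedged sketch of the API shape this change targets: requesting default boosting parameters with the Algo enumeration instead of a raw string. Exact overloads can differ between Spark versions.

```scala
import org.apache.spark.mllib.tree.configuration.{Algo, BoostingStrategy}

// Obtain default parameters via the enumeration values rather than strings.
val classificationParams = BoostingStrategy.defaultParams(Algo.Classification)
val regressionParams = BoostingStrategy.defaultParams(Algo.Regression)

// BoostingStrategy fields are mutable, so defaults can be adjusted afterwards.
classificationParams.numIterations = 50
```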

spark git commit: [SPARK-3424][MLLIB] cache point distances during k-means|| init

2015-01-21 Thread meng
Repository: spark Updated Branches: refs/heads/master 27bccc5ea -> ca7910d6d [SPARK-3424][MLLIB] cache point distances during k-means|| init This PR ports the following feature implemented in #2634 by derrickburns: * During k-means|| initialization, we should cache costs (squared distances)
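
An illustrative sketch of the caching idea only (not the MLlib patch itself): carry a per-point cost RDD between k-means|| rounds and update it against just the newly added centers. `updateCosts` and `squaredDistance` are made-up helper names.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.linalg.Vector

// Plain squared Euclidean distance between two dense-representable vectors.
def squaredDistance(a: Vector, b: Vector): Double = {
  val x = a.toArray; val y = b.toArray
  var i = 0; var sum = 0.0
  while (i < x.length) { val d = x(i) - y(i); sum += d * d; i += 1 }
  sum
}

// Update cached per-point costs using only the newly chosen centers
// (assumes newCenters is non-empty), so each round avoids recomputing
// distances to all previously selected centers.
def updateCosts(points: RDD[Vector], costs: RDD[Double],
                newCenters: Seq[Vector]): RDD[Double] = {
  points.zip(costs).map { case (p, oldCost) =>
    math.min(oldCost, newCenters.map(c => squaredDistance(p, c)).min)
  }.cache()  // cache so the next sampling round reuses these costs
}
```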

spark git commit: [SPARK-5202] [SQL] Add hql variable substitution support

2015-01-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9bad06226 -> 27bccc5ea [SPARK-5202] [SQL] Add hql variable substitution support https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution This is a blocking issue for CLI users; it impacts existing hql scripts
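
A hedged usage sketch of the substitution syntax described on that wiki page; `sc` is an existing SparkContext and the `logs` table is hypothetical.

```scala
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
// Define a substitution variable, then reference it with ${hivevar:...}.
hiveContext.sql("SET hivevar:target_date=2015-01-21")
hiveContext.sql("SELECT * FROM logs WHERE dt = '${hivevar:target_date}'")
  .collect().foreach(println)
```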

spark git commit: [SPARK-5355] make SparkConf thread-safe

2015-01-21 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 079b3be81 -> dd184290f [SPARK-5355] make SparkConf thread-safe The SparkConf is not thread-safe, but is accessed by many threads. getAll() could return only part of the configs if another thread is accessing it at the same time. This PR changes SparkConf

spark git commit: [SPARK-5355] make SparkConf thread-safe

2015-01-21 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 3be2a887b -> 9bad06226 [SPARK-5355] make SparkConf thread-safe The SparkConf is not thread-safe, but is accessed by many threads. getAll() could return only part of the configs if another thread is accessing it at the same time. This PR changes SparkConf.set
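
A minimal sketch of the general approach (not the exact patch): back the settings with a ConcurrentHashMap so concurrent set() and getAll() calls do not corrupt or partially read the underlying map.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._

// Illustrative stand-in for the configuration container, not Spark's SparkConf.
class ThreadSafeConf {
  private val settings = new ConcurrentHashMap[String, String]()

  def set(key: String, value: String): this.type = { settings.put(key, value); this }
  def get(key: String): Option[String] = Option(settings.get(key))
  def getAll: Array[(String, String)] =
    settings.entrySet().asScala.map(e => (e.getKey, e.getValue)).toArray
}
```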

svn commit: r1653723 [2/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2015-01-21 Thread matei
Modified: spark/site/releases/spark-release-0-5-2.html URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-0-5-2.html?rev=1653723&r1=1653722&r2=1653723&view=diff == --- spark/site/releases/spark-release-0-5

svn commit: r1653723 [1/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2015-01-21 Thread matei
Author: matei Date: Thu Jan 22 00:27:22 2015 New Revision: 1653723 URL: http://svn.apache.org/r1653723 Log: Add Summit East news item Added: spark/news/_posts/2015-01-21-spark-summit-east-agenda-posted.md spark/site/news/spark-summit-east-agenda-posted.html Modified: spark/site/commun

spark git commit: [SPARK-4984][CORE][WEBUI] Adding a pop-up containing the full job description when it is very long

2015-01-21 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master ba19689fe -> 3be2a887b [SPARK-4984][CORE][WEBUI] Adding a pop-up containing the full job description when it is very long In some cases the job description can be very long, such as a long SQL query; refer to #3718. This PR adds a pop-up for job

spark git commit: Make sure only owner can read / write to directories created for the job.

2015-01-21 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 bb8bd11da -> 079b3be81 Make sure only owner can read / write to directories created for the job. Whenever a directory is created by the utility method, immediately restrict its permissions so that only the owner has access to its conten
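
A hedged sketch of the idea behind this change; the real logic lives in Spark's utility code, and `createOwnerOnlyDir` is a made-up helper.

```scala
import java.io.File
import java.io.IOException

// Immediately after creating a directory, drop group/other access so only
// the owner can read, write, or traverse it.
def createOwnerOnlyDir(parent: File, name: String): File = {
  val dir = new File(parent, name)
  if (!dir.mkdirs() && !dir.isDirectory) {
    throw new IOException(s"Failed to create directory ${dir.getAbsolutePath}")
  }
  // set*(false, false) clears the bit for everyone; set*(true, true) re-enables
  // it for the owner only.
  dir.setReadable(false, false); dir.setReadable(true, true)
  dir.setWritable(false, false); dir.setWritable(true, true)
  dir.setExecutable(false, false); dir.setExecutable(true, true)
  dir
}
```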

spark git commit: [SQL] [Minor] Remove deprecated parquet tests

2015-01-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master b328ac6c8 -> ba19689fe [SQL] [Minor] Remove deprecated parquet tests This PR removes the deprecated `ParquetQuerySuite`, renamed `ParquetQuerySuite2` to `ParquetQuerySuite`, and refactored changes introduced in #4115 to `ParquetFilterSuit

spark git commit: Revert "[SPARK-5244] [SQL] add coalesce() in sql parser"

2015-01-21 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 8361078ef -> b328ac6c8 Revert "[SPARK-5244] [SQL] add coalesce() in sql parser" This reverts commit 812d3679f5f97df7b667cbc3365a49866ebc02d5. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/

spark git commit: [SPARK-5009] [SQL] Long keyword support in SQL Parsers

2015-01-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 812d3679f -> 8361078ef [SPARK-5009] [SQL] Long keyword support in SQL Parsers * The `SqlLexical.allCaseVersions` will cause a `StackOverflowException` if the keyword is too long; the patch fixes that by normalizing all of the keywords i
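
An illustrative sketch of the normalization idea rather than the actual lexer code: keep keywords in one canonical case and upper-case the incoming token, instead of enumerating every case variant of every keyword (which grows exponentially with keyword length).

```scala
// Example subset of keywords stored once, in upper case.
val keywords: Set[String] = Set("SELECT", "APPROXIMATE", "TRANSFORM")

// Normalize the token before the lookup instead of expanding case variants.
def isKeyword(token: String): Boolean = keywords.contains(token.toUpperCase)
```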

spark git commit: [SPARK-5006][Deploy]spark.port.maxRetries doesn't work

2015-01-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.2 37db20c94 -> bb8bd11da [SPARK-5006][Deploy]spark.port.maxRetries doesn't work https://issues.apache.org/jira/browse/SPARK-5006 I think the issue was introduced in https://github.com/apache/spark/pull/1777. I have not dug into Mesos's backend yet
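
For reference, a hedged usage sketch of the property itself; the app name is hypothetical, and the setting can equally go into spark-defaults.conf.

```scala
import org.apache.spark.SparkConf

// Configure port retries before any ports are bound on the driver/executors.
val conf = new SparkConf()
  .setAppName("port-retry-example")   // hypothetical app name
  .set("spark.port.maxRetries", "32") // try up to 32 successive ports on bind failure
```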

spark git commit: [SPARK-5244] [SQL] add coalesce() in sql parser

2015-01-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3ee3ab592 -> 812d3679f [SPARK-5244] [SQL] add coalesce() in sql parser Author: Daoyuan Wang Closes #4040 from adrian-wang/coalesce and squashes the following commits: 0ac8e8f [Daoyuan Wang] add coalesce() in sql parser Project: http://
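
A hedged usage sketch of what the parser change enables; `sc` is an existing SparkContext and the `people` table is hypothetical.

```scala
import org.apache.spark.sql.SQLContext

// With the parser change, COALESCE can be used directly in Spark SQL.
val sqlContext = new SQLContext(sc)
val result = sqlContext.sql("SELECT COALESCE(nickname, name, 'unknown') FROM people")
result.collect().foreach(println)
```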

spark git commit: [SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop

2015-01-21 Thread ankurdave
Repository: spark Updated Branches: refs/heads/branch-1.2 e90f6b5c6 -> 37db20c94 [SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop I looked into GraphGenerators#chooseCell, and found that chooseCell can't generate more edges than pow(2

spark git commit: [SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop

2015-01-21 Thread ankurdave
Repository: spark Updated Branches: refs/heads/master 7450a992b -> 3ee3ab592 [SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT graph generator to prevent infinite loop I looked into GraphGenerators#chooseCell, and found that chooseCell can't generate more edges than pow(2, (2
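
An illustrative sketch of the guard (not the exact GraphX code): with 2^level vertices the generator picks cells in a 2^level x 2^level adjacency matrix, so it can never produce more than 2^(2*level) distinct edges, and larger requests should be rejected up front instead of looping forever.

```scala
// Hypothetical standalone version of the validation for illustration.
def validateRmatRequest(requestedNumVertices: Long, numEdges: Long): Unit = {
  val level = math.ceil(math.log(requestedNumVertices) / math.log(2)).toInt
  val maxEdges = math.pow(2, 2 * level)
  require(numEdges <= maxEdges,
    s"numEdges must be <= $maxEdges for $requestedNumVertices requested vertices")
}
```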

spark git commit: [SPARK-4161]Spark shell class path is not correctly set if "spark.driver.extraClassPath" is set in defaults.conf

2015-01-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.1 8b9b4b24a -> 7805480db [SPARK-4161]Spark shell class path is not correctly set if "spark.driver.extraClassPath" is set in defaults.conf Author: GuoQiang Li Closes #3050 from witgo/SPARK-4161 and squashes the following commits: abb6f

spark git commit: [SPARK-4749] [mllib]: Allow initializing KMeans clusters using a seed

2015-01-21 Thread meng
Repository: spark Updated Branches: refs/heads/master aa1e22b17 -> 7450a992b [SPARK-4749] [mllib]: Allow initializing KMeans clusters using a seed This implements the functionality for SPARK-4749 and provides unit tests in Scala and PySpark. Author: nate.crosswhite Author: nxwhite-str Auth
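
A hedged usage sketch of the added seed parameter: fixing the seed makes the random initialization (including k-means||) reproducible across runs. `sc` is an existing SparkContext and the points below are made up.

```scala
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

val data = sc.parallelize(Seq(
  Vectors.dense(0.0, 0.0), Vectors.dense(1.0, 1.0),
  Vectors.dense(9.0, 8.0), Vectors.dense(8.0, 9.0)))

val model = new KMeans()
  .setK(2)
  .setSeed(42L)   // the parameter added by this change
  .run(data)
```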

spark git commit: [SPARK-4569] Rename 'externalSorting' in Aggregator

2015-01-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.1 ecdeca038 -> 8b9b4b24a [SPARK-4569] Rename 'externalSorting' in Aggregator Hi all - I've renamed the unhelpfully named variable and added a comment clarifying what's actually happening. Author: Ilya Ganelin Closes #3666 from ilganel

spark git commit: [MLlib] [SPARK-5301] Missing conversions and operations on IndexedRowMatrix and CoordinateMatrix

2015-01-21 Thread meng
Repository: spark Updated Branches: refs/heads/master 2eeada373 -> aa1e22b17 [MLlib] [SPARK-5301] Missing conversions and operations on IndexedRowMatrix and CoordinateMatrix * Transpose is missing from CoordinateMatrix (this is cheap to compute, so it should be there) * IndexedRowMatrix shou
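
A hedged sketch of why the transpose is cheap for a CoordinateMatrix: it only swaps each entry's row and column index. The merged change adds this as a method; the standalone helper below is illustrative.

```scala
import org.apache.spark.mllib.linalg.distributed.{CoordinateMatrix, MatrixEntry}

// Swap (i, j) in every entry and flip the declared dimensions.
def transpose(mat: CoordinateMatrix): CoordinateMatrix = {
  val swapped = mat.entries.map(e => MatrixEntry(e.j, e.i, e.value))
  new CoordinateMatrix(swapped, mat.numCols(), mat.numRows())
}
```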

spark git commit: [SPARK-4161]Spark shell class path is not correctly set if "spark.driver.extraClassPath" is set in defaults.conf

2015-01-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.2 1d730170c -> e90f6b5c6 [SPARK-4161]Spark shell class path is not correctly set if "spark.driver.extraClassPath" is set in defaults.conf Author: GuoQiang Li Closes #3050 from witgo/SPARK-4161 and squashes the following commits: abb6f

spark git commit: [SPARK-4759] Fix driver hanging from coalescing partitions

2015-01-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.2 fd6266f45 -> 0c13eed25 [SPARK-4759] Fix driver hanging from coalescing partitions The driver hangs sometimes when we coalesce RDD partitions. See JIRA for more details and reproduction. This is because our use of empty string as defau

spark git commit: [SPARK-4569] Rename 'externalSorting' in Aggregator

2015-01-21 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.2 0c13eed25 -> 1d730170c [SPARK-4569] Rename 'externalSorting' in Aggregator Hi all - I've renamed the unhelpfully named variable and added a comment clarifying what's actually happening. Author: Ilya Ganelin Closes #3666 from ilganel

spark git commit: SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnA...

2015-01-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 8c06a5faa -> 2eeada373 SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnA... ...llocator The goal of this PR is to simplify YarnAllocator as much as possible and get it up to the level of code quality we see in the r

spark git commit: [SPARK-5336][YARN]spark.executor.cores must not be less than spark.task.cpus

2015-01-21 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 424d8c6ff -> 8c06a5faa [SPARK-5336][YARN]spark.executor.cores must not be less than spark.task.cpus https://issues.apache.org/jira/browse/SPARK-5336 Author: WangTao Author: WangTaoTheTonic Closes #4123 from WangTaoTheTonic/SPARK-5336 an
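
An illustrative check only (the real validation sits in the YARN code path): an executor configured with fewer cores than spark.task.cpus could never schedule a single task, so the combination should be rejected early.

```scala
import org.apache.spark.SparkConf

def validateCoreSettings(conf: SparkConf): Unit = {
  val executorCores = conf.getInt("spark.executor.cores", 1)
  val taskCpus = conf.getInt("spark.task.cpus", 1)
  require(executorCores >= taskCpus,
    s"spark.executor.cores ($executorCores) must not be less than spark.task.cpus ($taskCpus)")
}
```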