spark git commit: [Doc] Improve Python DataFrame documentation

2015-03-31 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 c4c982a65 -> e527b3590 [Doc] Improve Python DataFrame documentation Author: Reynold Xin Closes #5287 from rxin/pyspark-df-doc-cleanup-context and squashes the following commits: 1841b60 [Reynold Xin] Lint. f2007f1 [Reynold Xin] func

spark git commit: [Doc] Improve Python DataFrame documentation

2015-03-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 37326079d -> 305abe1e5 [Doc] Improve Python DataFrame documentation Author: Reynold Xin Closes #5287 from rxin/pyspark-df-doc-cleanup-context and squashes the following commits: 1841b60 [Reynold Xin] Lint. f2007f1 [Reynold Xin] function

spark git commit: [SPARK-6614] OutputCommitCoordinator should clear authorized committer only after authorized committer fails, not after any failure

2015-03-31 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.3 d85164637 -> c4c982a65 [SPARK-6614] OutputCommitCoordinator should clear authorized committer only after authorized committer fails, not after any failure In OutputCommitCoordinator, there is some logic to clear the authorized committ

spark git commit: [SPARK-6614] OutputCommitCoordinator should clear authorized committer only after authorized committer fails, not after any failure

2015-03-31 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 0e00f12d3 -> 37326079d [SPARK-6614] OutputCommitCoordinator should clear authorized committer only after authorized committer fails, not after any failure In OutputCommitCoordinator, there is some logic to clear the authorized committer's
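
The fix above can be sketched in plain Python (illustrative only, not Spark's Scala implementation): the coordinator must clear a partition's authorized committer only when the failed attempt *is* the authorized one, so an unrelated attempt's failure no longer revokes authorization.

```python
class OutputCommitCoordinator:
    """Sketch of the SPARK-6614 behavior; names are illustrative."""

    def __init__(self):
        # (stage, partition) -> attempt id currently authorized to commit
        self.authorized = {}

    def can_commit(self, stage, partition, attempt):
        key = (stage, partition)
        if key not in self.authorized:
            self.authorized[key] = attempt  # first asker wins
            return True
        return self.authorized[key] == attempt

    def task_failed(self, stage, partition, attempt):
        key = (stage, partition)
        # The fix: clear only if the failed attempt is the authorized committer.
        if self.authorized.get(key) == attempt:
            del self.authorized[key]
```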

spark git commit: [SPARK-5692] [MLlib] Word2Vec save/load

2015-03-31 Thread meng
Repository: spark Updated Branches: refs/heads/master 2036bc599 -> 0e00f12d3 [SPARK-5692] [MLlib] Word2Vec save/load Word2Vec model now supports saving and loading. a] The Metadata stored in JSON format consists of "version", "classname", "vectorSize" and "numWords" b] The data stored in Par
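
A minimal sketch of the metadata side of the save/load format described above, using the JSON keys named in the commit ("version", "classname", "vectorSize", "numWords"); in the real implementation the word vectors themselves are written to Parquet, which is omitted here.

```python
import json
import os


def save_metadata(path, vector_size, num_words):
    # Write the JSON metadata file described in the commit message.
    meta = {
        "version": "1.0",
        "classname": "org.apache.spark.mllib.feature.Word2VecModel",
        "vectorSize": vector_size,
        "numWords": num_words,
    }
    with open(os.path.join(path, "metadata.json"), "w") as f:
        json.dump(meta, f)


def load_metadata(path):
    # Read the metadata back; a real loader would validate version/classname.
    with open(os.path.join(path, "metadata.json")) as f:
        return json.load(f)
```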

spark git commit: [SPARK-6633][SQL] Should be "Contains" instead of "EndsWith" when constructing sources.StringContains

2015-03-31 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 5a957fe0d -> d85164637 [SPARK-6633][SQL] Should be "Contains" instead of "EndsWith" when constructing sources.StringContains Author: Liang-Chi Hsieh Closes #5299 from viirya/stringcontains and squashes the following commits: c1ece4c

spark git commit: [SPARK-6633][SQL] Should be "Contains" instead of "EndsWith" when constructing sources.StringContains

2015-03-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master beebb7ffc -> 2036bc599 [SPARK-6633][SQL] Should be "Contains" instead of "EndsWith" when constructing sources.StringContains Author: Liang-Chi Hsieh Closes #5299 from viirya/stringcontains and squashes the following commits: c1ece4c [Li
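
The bug fixed here was a copy-paste-style error in the translation from predicates to source filters: the `contains` case constructed a `StringEndsWith` filter instead of `StringContains`. A toy Python sketch of the corrected mapping (filter names mirror Spark's, the rest is illustrative):

```python
from collections import namedtuple

StringStartsWith = namedtuple("StringStartsWith", "attribute value")
StringEndsWith = namedtuple("StringEndsWith", "attribute value")
StringContains = namedtuple("StringContains", "attribute value")


def translate(op, attribute, value):
    # Map a string predicate onto the matching data-source filter.
    if op == "startswith":
        return StringStartsWith(attribute, value)
    if op == "endswith":
        return StringEndsWith(attribute, value)
    if op == "contains":
        # The SPARK-6633 fix: build StringContains here, not StringEndsWith.
        return StringContains(attribute, value)
    raise ValueError(f"unsupported op: {op}")
```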

spark git commit: [SPARK-5371][SQL] Propagate types after function conversion, before further resolution

2015-03-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 045228f38 -> 5a957fe0d [SPARK-5371][SQL] Propagate types after function conversion, before further resolution Before it was possible for a query to flip back and forth from a resolved state, allowing resolution to propagate up before c

spark git commit: [SPARK-6255] [MLLIB] Support multiclass classification in Python API

2015-03-31 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 46de6c05e -> b5bd75d90 [SPARK-6255] [MLLIB] Support multiclass classification in Python API Python API parity check for classification and multiclass classification support, major disparities need to be added for Python: ```scala LogisticR
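
As a rough illustration of what multiclass prediction adds over the binary case, here is a minimal, hypothetical sketch (not Spark's code): with one weight vector and intercept per class, prediction becomes an argmax over per-class linear scores rather than a single threshold.

```python
def predict_multiclass(weights, features, intercepts):
    # weights: one list of coefficients per class; intercepts: one per class.
    scores = [
        sum(w * x for w, x in zip(ws, features)) + b
        for ws, b in zip(weights, intercepts)
    ]
    # Return the index of the highest-scoring class.
    return max(range(len(scores)), key=scores.__getitem__)
```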

spark git commit: [SPARK-6145][SQL] fix ORDER BY on nested fields

2015-03-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 778c87686 -> 045228f38 [SPARK-6145][SQL] fix ORDER BY on nested fields This PR is based on work by cloud-fan in #4904, but with two differences: - We isolate the logic for Sort's special handling into `ResolveSortReferences` - We avoi

spark git commit: [SPARK-5371][SQL] Propagate types after function conversion, before further resolution

2015-03-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b5bd75d90 -> beebb7ffc [SPARK-5371][SQL] Propagate types after function conversion, before further resolution Before it was possible for a query to flip back and forth from a resolved state, allowing resolution to propagate up before coerc

spark git commit: [SPARK-6598][MLLIB] Python API for IDFModel

2015-03-31 Thread meng
Repository: spark Updated Branches: refs/heads/master cd48ca501 -> 46de6c05e [SPARK-6598][MLLIB] Python API for IDFModel This is the sub-task of SPARK-6254. Wrapping IDFModel `idf` member function for pyspark. Author: lewuathe Closes #5264 from Lewuathe/SPARK-6598 and squashes the following
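
The `idf` vector exposed by this wrapper follows MLlib's smoothed inverse-document-frequency formula, idf(t) = log((m + 1) / (df(t) + 1)) where m is the document count; a stdlib-only sketch:

```python
import math


def idf_vector(doc_freq, num_docs):
    # Smoothed IDF: terms appearing in every document get weight 0,
    # rarer terms get larger weights; the +1 avoids division by zero.
    return [math.log((num_docs + 1) / (df + 1)) for df in doc_freq]
```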

spark git commit: [SPARK-6145][SQL] fix ORDER BY on nested fields

2015-03-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 810201447 -> cd48ca501 [SPARK-6145][SQL] fix ORDER BY on nested fields This PR is based on work by cloud-fan in #4904, but with two differences: - We isolate the logic for Sort's special handling into `ResolveSortReferences` - We avoid cr
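
What resolving an ORDER BY over a nested field amounts to, reduced to a toy Python model (dicts standing in for structs, a dotted path standing in for the nested attribute reference):

```python
from functools import reduce


def sort_by_path(rows, path):
    # Resolve a dotted path like "a.b" into nested dicts, then sort by it.
    keys = path.split(".")
    return sorted(rows, key=lambda row: reduce(lambda d, k: d[k], keys, row))
```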

spark git commit: [SPARK-6575] [SQL] Adds configuration to disable schema merging while converting metastore Parquet tables

2015-03-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 9ebefb1f1 -> 778c87686 [SPARK-6575] [SQL] Adds configuration to disable schema merging while converting metastore Parquet tables Consider a metastore Parquet table that 1. doesn't have schema evolution issue 2. has lots of data files

spark git commit: [SPARK-6575] [SQL] Adds configuration to disable schema merging while converting metastore Parquet tables

2015-03-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a7992ffaf -> 810201447 [SPARK-6575] [SQL] Adds configuration to disable schema merging while converting metastore Parquet tables Consider a metastore Parquet table that 1. doesn't have schema evolution issue 2. has lots of data files and/
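
The cost being avoided here is easy to see in a sketch: schema merging must union the fields of every file's footer, so a table with many files pays a per-file price even when all schemas are identical. A toy model of the merge step (schemas as name-to-type dicts; not Spark's implementation):

```python
def merge_schemas(schemas):
    # Union fields across all file schemas; conflicting types are an error.
    merged = {}
    for schema in schemas:
        for name, dtype in schema.items():
            if merged.setdefault(name, dtype) != dtype:
                raise TypeError(f"conflicting types for field {name!r}")
    return merged
```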

spark git commit: [SPARK-6555] [SQL] Overrides equals() and hashCode() for MetastoreRelation

2015-03-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 fd600cec0 -> 9ebefb1f1 [SPARK-6555] [SQL] Overrides equals() and hashCode() for MetastoreRelation Also removes temporary workarounds made in #5183 and #5251.

spark git commit: [SPARK-6555] [SQL] Overrides equals() and hashCode() for MetastoreRelation

2015-03-31 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d01a6d8c3 -> a7992ffaf [SPARK-6555] [SQL] Overrides equals() and hashCode() for MetastoreRelation Also removes temporary workarounds made in #5183 and #5251.
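
A Python analogue of the change (illustrative, not Spark's Scala class): overriding equality and hashing by the relation's identifying fields lets two lookups of the same metastore table compare equal, which is what made the earlier workarounds unnecessary.

```python
class MetastoreRelationSketch:
    # Hypothetical stand-in for MetastoreRelation: equality and hashing
    # are defined by (database, table), not by object identity.
    def __init__(self, database, table):
        self.database = database
        self.table = table

    def __eq__(self, other):
        return (isinstance(other, MetastoreRelationSketch)
                and (self.database, self.table) == (other.database, other.table))

    def __hash__(self):
        return hash((self.database, self.table))
```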

spark git commit: [SPARK-4894][mllib] Added Bernoulli option to NaiveBayes model in mllib

2015-03-31 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master a05835b89 -> d01a6d8c3 [SPARK-4894][mllib] Added Bernoulli option to NaiveBayes model in mllib Added optional model type parameter for NaiveBayes training. Can be either Multinomial or Bernoulli. When Bernoulli is given the Bernoulli smo

spark git commit: [SPARK-6542][SQL] add CreateStruct

2015-03-31 Thread lian
Repository: spark Updated Branches: refs/heads/master 314afd0e2 -> a05835b89 [SPARK-6542][SQL] add CreateStruct Similar to `CreateArray`, we can add `CreateStruct` to create nested columns. marmbrus Author: Xiangrui Meng Closes #5195 from mengxr/SPARK-6542 and squashes the following commit
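
In row terms, `CreateStruct` packs several existing columns into one nested column, the struct counterpart of `CreateArray`; a toy dict-based sketch:

```python
def create_struct(row, columns):
    # Build a nested "struct" value from the named columns of a row.
    return {name: row[name] for name in columns}
```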

spark git commit: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupRelation should use fine-grained lock

2015-03-31 Thread lian
Repository: spark Updated Branches: refs/heads/master b80a030e9 -> 314afd0e2 [SPARK-6618][SQL] HiveMetastoreCatalog.lookupRelation should use fine-grained lock JIRA: https://issues.apache.org/jira/browse/SPARK-6618 Author: Yin Huai Closes #5281 from yhuai/lookupRelationLock and squashes th
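
The idea behind the fine-grained lock can be sketched with a per-key lock table (illustrative Python, not the actual HiveMetastoreCatalog code): lookups of *different* tables no longer serialize behind one coarse lock, while lookups of the same table still do.

```python
import threading
from collections import defaultdict


class TableCache:
    # Hypothetical cache with one lock per table name.
    def __init__(self, loader):
        self._loader = loader
        self._cache = {}
        self._locks = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()  # protects the lock table itself

    def lookup(self, name):
        with self._locks_guard:
            lock = self._locks[name]
        with lock:  # only same-table lookups contend here
            if name not in self._cache:
                self._cache[name] = self._loader(name)
            return self._cache[name]
```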

spark git commit: [SPARK-6618][SQL] HiveMetastoreCatalog.lookupRelation should use fine-grained lock

2015-03-31 Thread lian
Repository: spark Updated Branches: refs/heads/branch-1.3 cf651a46e -> fd600cec0 [SPARK-6618][SQL] HiveMetastoreCatalog.lookupRelation should use fine-grained lock JIRA: https://issues.apache.org/jira/browse/SPARK-6618 Author: Yin Huai Closes #5281 from yhuai/lookupRelationLock and squashe

spark git commit: [SPARK-6623][SQL] Alias DataFrame.na.drop and DataFrame.na.fill in Python.

2015-03-31 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 a97d4e6bf -> cf651a46e [SPARK-6623][SQL] Alias DataFrame.na.drop and DataFrame.na.fill in Python. To maintain consistency with the Scala API. Author: Reynold Xin Closes #5284 from rxin/df-na-alias and squashes the following commits:

spark git commit: [SPARK-6623][SQL] Alias DataFrame.na.drop and DataFrame.na.fill in Python.

2015-03-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master f07e71406 -> b80a030e9 [SPARK-6623][SQL] Alias DataFrame.na.drop and DataFrame.na.fill in Python. To maintain consistency with the Scala API. Author: Reynold Xin Closes #5284 from rxin/df-na-alias and squashes the following commits: 19f
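
The semantics being aliased are the usual dropna/fillna behavior; a toy sketch over rows-as-dicts (illustrative, not the DataFrame implementation):

```python
def drop_na(rows):
    # dropna: keep only rows with no None values in any column.
    return [r for r in rows if all(v is not None for v in r.values())]


def fill_na(rows, value):
    # fillna: replace every None with the given default value.
    return [{k: (value if v is None else v) for k, v in r.items()}
            for r in rows]
```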

spark git commit: [SPARK-6625][SQL] Add common string filters to data sources.

2015-03-31 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 67c885e3c -> a97d4e6bf [SPARK-6625][SQL] Add common string filters to data sources. Filters such as startsWith, endsWith, contains will be very useful for data sources that provide search functionality, e.g. Succinct, Elastic Search, S

spark git commit: [SPARK-6625][SQL] Add common string filters to data sources.

2015-03-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 56775571c -> f07e71406 [SPARK-6625][SQL] Add common string filters to data sources. Filters such as startsWith, endsWith, contains will be very useful for data sources that provide search functionality, e.g. Succinct, Elastic Search, Solr.
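
Evaluating the three string filters named above is straightforward; a minimal sketch of what a data source would check per row (the dispatch table is illustrative):

```python
def apply_string_filter(op, column_value, literal):
    # Evaluate one of the three source-level string filters on a value.
    ops = {
        "startsWith": column_value.startswith,
        "endsWith": column_value.endswith,
        "contains": column_value.__contains__,
    }
    return ops[op](literal)
```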