[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-113145019 [Test build #35125 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35125/consoleFull) for PR 6880 at commit [`0c0a478`](https://github.com/apache/spark/commit/0c0a478568b8abcd37744f2435ef359e9d7f2392). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113145071 [Test build #35126 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35126/consoleFull) for PR 6262 at commit [`3ab8c7a`](https://github.com/apache/spark/commit/3ab8c7a4e666fc0b9d60b1462e8f233b94ce783e).
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user yijieshen commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113145106 I prefer to keep the original column names in the newly created struct, since I think they are more meaningful than `col1, col2, col3`, and we could just name the unnamed columns `col1, col2 ...`, which is also compatible with Hive's semantics. I've also made related changes in #6874 to loosen the parameter requirements of [`struct`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L723).
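The naming rule proposed in this comment can be sketched without Spark. This is only a minimal illustration: each struct input is modelled as an optional name (`Some` for a named column, `None` for an unnamed expression), `StructNamingSketch` and `fieldNames` are hypothetical names that do not exist in Spark, and numbering the fallback names by position is an assumption, since the comment does not say how unnamed columns are counted.

```scala
object StructNamingSketch {
  // Hypothetical helper: keep a column's own name when it has one,
  // otherwise fall back to a positional, Hive-style name (col1, col2, ...).
  def fieldNames(inputs: Seq[Option[String]]): Seq[String] =
    inputs.zipWithIndex.map {
      case (Some(name), _) => name
      case (None, idx)     => s"col${idx + 1}"
    }

  def main(args: Array[String]): Unit = {
    // "a" and "b" are named columns; the middle input is an unnamed expression.
    println(fieldNames(Seq(Some("a"), None, Some("b"))))
  }
}
```

With this rule a struct built only from unnamed expressions still gets the Hive-style `col1, col2, ...` names, while named columns keep their names.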
[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6882#issuecomment-113145081 [Test build #35129 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35129/consoleFull) for PR 6882 at commit [`402f746`](https://github.com/apache/spark/commit/402f746e4215a28c49806d84c1d3d993f18c9f8d).
[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6877#issuecomment-113145010 [Test build #35128 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35128/consoleFull) for PR 6877 at commit [`a3cd55b`](https://github.com/apache/spark/commit/a3cd55b61440fd9121e50b35fb3a0325986cd550).
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113146743 Merged build started.
[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6843#discussion_r32727925

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala ---
    @@ -313,3 +313,131 @@ case class StringLength(child: Expression) extends UnaryExpression with ExpectsI
         defineCodeGen(ctx, ev, c => s"($c).length()")
       }
     }
    +
    +/**
    + * Returns the numeric value of the first character of str.
    + */
    +case class Ascii(child: Expression) extends UnaryExpression with ExpectsInputTypes {
    +  override def dataType: DataType = IntegerType
    +  override def expectedChildTypes: Seq[DataType] = Seq(StringType)
    +
    +  override def eval(input: InternalRow): Any = {
    +    val string = child.eval(input)
    +    if (string == null) {
    +      null
    +    } else {
    +      val bytes = string.asInstanceOf[UTF8String].getBytes
    +      if (bytes.length > 0) {
    --- End diff --

    I copied the logic from Hive; Hive doesn't check whether it's a UTF-8 string.
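The quoted diff stops at the length check, so what follows is a standalone sketch of the Hive-style behaviour the comment refers to, not the PR's actual code. It assumes the result is the first byte of the UTF-8 encoding, or 0 for an empty string; `AsciiSketch` is a hypothetical name and `Option` stands in for SQL NULL.

```scala
import java.nio.charset.StandardCharsets

object AsciiSketch {
  // Assumed behaviour, modelled on the Hive logic mentioned in the comment:
  // return the first byte of the UTF-8 encoding, or 0 for an empty string.
  // None stands in for SQL NULL.
  def ascii(s: String): Option[Int] =
    Option(s).map { str =>
      val bytes = str.getBytes(StandardCharsets.UTF_8)
      if (bytes.length > 0) bytes(0).toInt else 0
    }

  def main(args: Array[String]): Unit = {
    println(ascii("Spark"))  // first byte is 'S'
    println(ascii(""))       // empty string
    println(ascii("\u00e9")) // multi-byte character: only the first byte is used
    println(ascii(null))     // NULL input
  }
}
```

Because JVM bytes are signed, a string that starts with a multi-byte character yields a negative value here, which is the consequence of not checking for UTF-8 input.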
[GitHub] spark pull request: [Followup SPARK-8387][WEBUI] Update driver log...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6878#issuecomment-113109990 Merged build finished. Test PASSed.
[GitHub] spark pull request: [Followup SPARK-8387][WEBUI] Update driver log...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6878#issuecomment-113109866 [Test build #35120 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35120/console) for PR 6878 at commit [`13be948`](https://github.com/apache/spark/commit/13be948b455ee4ee6db5bd6beafd9854e5428e68).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8160][SQL]Support using external sortin...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6875#issuecomment-113121395 Build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113144419 [Test build #35127 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35127/consoleFull) for PR 6881 at commit [`35fa5fb`](https://github.com/apache/spark/commit/35fa5fbd7b97879acdf1d2027ed0fa587b8ae301).
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113144080 Merged build started.
[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5707#issuecomment-113149919 Merged build triggered.
[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5707#issuecomment-113149945 Merged build started.
[GitHub] spark pull request: [SPARK-8160][SQL]Support using external sortin...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6875#issuecomment-113121333 [Test build #35122 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35122/console) for PR 6875 at commit [`44a3e62`](https://github.com/apache/spark/commit/44a3e62bc1cc2d06ec29a7095e9c77e1da21b772).
 * This patch **passes all tests**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
   * `trait GeneratedAggregate`
   * `case class HashGeneratedAggregate(`
   * `case class SortMergeAggregate(`
   * `case class ComputedAggregate(`
   * `case class SortMergeGeneratedAggregate(`
[GitHub] spark pull request: [SPARK-7155] [CORE] Allow newAPIHadoopFile to ...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5708#discussion_r32720959

    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -926,7 +926,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
         // The call to new NewHadoopJob automatically adds security credentials to conf,
         // so we don't need to explicitly add them ourselves
         val job = new NewHadoopJob(conf)
    -    NewFileInputFormat.addInputPath(job, new Path(path))
    +    // Use addInputPaths so that newAPIHadoopFile aligns with hadoopFile in taking
    +    // comma separated files as input. (see SPARK-7155)
    +    NewFileInputFormat.addInputPaths(job, path)
    --- End diff --

    ... but then what would you do about the inconsistency problem? Some methods would then use one, others would use the other. That's a bigger problem.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113124607 @yijieshen
[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...
GitHub user chenghao-intel opened a pull request:

    https://github.com/apache/spark/pull/6882

    [SPARK-7862] [SQL] Disable the error message redirect to stderr

    This is a follow-up of #6404. ScriptTransformation prints the error message to stderr directly, which would probably be a disaster for the application log.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chenghao-intel/spark verbose

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/6882.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #6882

commit 402f746e4215a28c49806d84c1d3d993f18c9f8d
Author: Cheng Hao hao.ch...@intel.com
Date: 2015-06-18T12:12:50Z

    disable the error message redirection for stderr
[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...
Github user yijieshen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6874#discussion_r32726438

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala ---
    @@ -79,3 +80,44 @@ case class CreateStruct(children: Seq[Expression]) extends Expression {
         InternalRow(children.map(_.eval(input)): _*)
       }
     }
    +
    +/**
    + * Creates a struct with the given field names and values
    + *
    + * @param children Seq(name1, val1, name2, val2, ...)
    + */
    +case class CreateNamedStruct(children: Seq[Expression]) extends Expression {
    +  assert(children.size % 2 == 0, "NamedStruct expects an even number of arguments.")
    +
    +  private val nameExprs = children.zipWithIndex.filter(_._2 % 2 == 0).map(_._1)
    +  private val valExprs = children.zipWithIndex.filter(_._2 % 2 == 1).map(_._1)
    +
    +  private lazy val names = nameExprs.map { case name =>
    +    name match {
    +      case NonNullLiteral(str, StringType) =>
    +        str.asInstanceOf[UTF8String].toString
    +      case _ =>
    +        throw new IllegalArgumentException("Expressions of odd index should be" +
    +          s" Literal(_, StringType), get ${name.dataType} instead")
    +    }
    +  }
    +
    +  override def foldable: Boolean = children.forall(_.foldable)
    +
    +  override lazy val resolved: Boolean = childrenResolved
    --- End diff --

    Got it.
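The alternating `Seq(name1, val1, name2, val2, ...)` layout in the diff above can be illustrated in plain Scala. This sketch pairs the arguments with `grouped(2)` rather than the index filtering the diff uses, and it substitutes `Any` for Catalyst's `Expression`; `NamedStructSketch` and `toFields` are hypothetical names introduced only for this example.

```scala
object NamedStructSketch {
  // Split Seq(name1, val1, name2, val2, ...) into (name, value) pairs.
  // Any stands in for Catalyst's Expression in this standalone sketch.
  def toFields(children: Seq[Any]): Seq[(String, Any)] = {
    require(children.size % 2 == 0, "expects an even number of arguments")
    children.grouped(2).map {
      case Seq(name: String, value) => (name, value)
      case Seq(other, _) =>
        throw new IllegalArgumentException(s"field name must be a String, got: $other")
    }.toList
  }

  def main(args: Array[String]): Unit = {
    // Two fields: "id" holding 1 and "score" holding 2.5.
    println(toFields(Seq("id", 1, "score", 2.5)))
  }
}
```

As in the diff, an odd number of arguments or a non-string in a name position is rejected up front instead of producing a malformed struct.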
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113150996 OK, sounds reasonable to me. Closing this PR.
[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6843#issuecomment-113160423 [Test build #35132 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35132/consoleFull) for PR 6843 at commit [`05cc18e`](https://github.com/apache/spark/commit/05cc18e37be9f2e23d3fe99a20892e91330ce469).
[GitHub] spark pull request: [SPARK-7155] [CORE] Allow newAPIHadoopFile to ...
Github user EugenCepoi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5708#discussion_r32720782

    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -926,7 +926,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
         // The call to new NewHadoopJob automatically adds security credentials to conf,
         // so we don't need to explicitly add them ourselves
         val job = new NewHadoopJob(conf)
    -    NewFileInputFormat.addInputPath(job, new Path(path))
    +    // Use addInputPaths so that newAPIHadoopFile aligns with hadoopFile in taking
    +    // comma separated files as input. (see SPARK-7155)
    +    NewFileInputFormat.addInputPaths(job, path)
    --- End diff --

    The reason to use addInputPaths would be for preserving compatibility. I was lucky to have some unit tests that detected this change, but others might encounter it in production. But as this has already been released, I guess we can stick with `setInputPaths`.
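The compatibility concern comes down to how one path string is interpreted. The sketch below models the two interpretations in plain Scala without calling Hadoop: `singlePath` and `commaSeparatedPaths` are hypothetical helpers, and the comma handling is deliberately simplified (an assumption: it ignores glob braces such as `{a,b}`, which a real implementation has to treat specially).

```scala
object InputPathsSketch {
  // Single-path style: the whole string is treated as one path.
  def singlePath(path: String): List[String] = List(path)

  // Comma-separated style (simplified): every comma starts a new path.
  // Assumption: glob braces such as {a,b} are not handled in this sketch.
  def commaSeparatedPaths(paths: String): List[String] =
    paths.split(",").toList

  def main(args: Array[String]): Unit = {
    val input = "data/a.txt,data/b.txt"
    println(singlePath(input))          // one path that contains a comma
    println(commaSeparatedPaths(input)) // two separate paths
  }
}
```

A file whose name itself contains a comma is read as one path under the first interpretation and as two under the second, which is the kind of behaviour change a unit test could catch.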
[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6874#discussion_r32724856

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -737,6 +735,21 @@ object functions {
       }

       /**
    +   * Creates a new struct column with given field names and columns.
    +   * The input columns should be of length 2*n and follow (name1, col1, name2, col2),
    +   * name* should be String Literal
    +   *
    +   * @group normal_funcs
    +   * @since 1.5.0
    +   */
    +  @scala.annotation.varargs
    +  def named_struct(cols: Column*): Column = {
    +    require(cols.length % 2 == 0,
    +      s"named_struct expects an even number of arguments.")
    +    CreateNamedStruct(cols.map(_.expr))
    +  }
    +
    +  /**
    --- End diff --

    We usually also provide a column-name version of the API. For example:
    ```
    def namedStruct(colName: String, colNames: String*): Column
    ```
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
GitHub user chenghao-intel opened a pull request:

    https://github.com/apache/spark/pull/6881

    [SPARK-8283][SQL] CreateStruct should not specify the field names

    `CreateStruct` = `GenericUDFStruct`, which always gives the default column names for the output struct, like (col1, col2...colN)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chenghao-intel/spark struct

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/6881.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #6881

commit 673ef10e19ceef6b03129e289c4fab45d3585f92
Author: Cheng Hao hao.ch...@intel.com
Date: 2015-06-18T11:15:48Z

    Give default field names for CreateStruct

commit b49227671574db6284ddaad016e5d30b96788f2a
Author: Cheng Hao hao.ch...@intel.com
Date: 2015-06-18T11:22:13Z

    scalastyle

commit 35fa5fbd7b97879acdf1d2027ed0fa587b8ae301
Author: Cheng Hao hao.ch...@intel.com
Date: 2015-06-18T11:30:08Z

    fix the bugs in unittest for CreateStruct
[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6882#issuecomment-113144079 Merged build started.
[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6877#issuecomment-113144078 Merged build started.
[GitHub] spark pull request: [SPARK-7913][Core]Make AppendOnlyMap use the s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6879#issuecomment-113143934 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-7913][Core]Make AppendOnlyMap use the s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6879#issuecomment-113143840 [Test build #35124 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35124/console) for PR 6879 at commit [`912c0ad`](https://github.com/apache/spark/commit/912c0adeb92d7c33af05c99970640a66868be374).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113144081 Merged build started.
[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-113144077 Merged build started.
[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...
Github user yijieshen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6874#discussion_r32726086

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -737,6 +735,21 @@ object functions {
       }

       /**
    +   * Creates a new struct column with given field names and columns.
    +   * The input columns should be of length 2*n and follow (name1, col1, name2, col2),
    +   * name* should be String Literal
    +   *
    +   * @group normal_funcs
    +   * @since 1.5.0
    +   */
    +  @scala.annotation.varargs
    +  def named_struct(cols: Column*): Column = {
    +    require(cols.length % 2 == 0,
    +      s"named_struct expects an even number of arguments.")
    +    CreateNamedStruct(cols.map(_.expr))
    +  }
    +
    +  /**
    --- End diff --

    I found it a little difficult to name the parameters in this API, since they should be fieldName1, value1, fieldName2, value2. I'll consider this again.
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113112104 Merged build triggered.
[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6877#issuecomment-113111717 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6877#issuecomment-113111606 [Test build #35119 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35119/console) for PR 6877 at commit [`923cee4`](https://github.com/apache/spark/commit/923cee4586f5747ad596deefc352aac0429a2dc1).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5707#issuecomment-113150494 [Test build #35131 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35131/consoleFull) for PR 5707 at commit [`1502d13`](https://github.com/apache/spark/commit/1502d13e535cd76aa3afaaca70a7cbe0c28b4d29).
[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6843#issuecomment-113159214 Merged build triggered.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113125643 Merged build triggered.
[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6877#issuecomment-113136109 Merged build triggered.
[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6882#issuecomment-113137581 Merged build triggered.
[GitHub] spark pull request: [SPARK-8301][SQL] Improve UTF8String substring...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/6804#issuecomment-113149719 This should probably be `Nonnull`, which gives a stricter check by FindBugs. It would be great if you could run FindBugs locally after the change.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user chenghao-intel closed the pull request at: https://github.com/apache/spark/pull/6881
[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6874#discussion_r32724484

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala ---
    @@ -79,3 +80,44 @@ case class CreateStruct(children: Seq[Expression]) extends Expression {
         InternalRow(children.map(_.eval(input)): _*)
       }
     }
    +
    +/**
    + * Creates a struct with the given field names and values
    + *
    + * @param children Seq(name1, val1, name2, val2, ...)
    + */
    +case class CreateNamedStruct(children: Seq[Expression]) extends Expression {
    +  assert(children.size % 2 == 0, "NamedStruct expects an even number of arguments.")
    +
    +  private val nameExprs = children.zipWithIndex.filter(_._2 % 2 == 0).map(_._1)
    +  private val valExprs = children.zipWithIndex.filter(_._2 % 2 == 1).map(_._1)
    +
    +  private lazy val names = nameExprs.map { case name =>
    +    name match {
    +      case NonNullLiteral(str, StringType) =>
    +        str.asInstanceOf[UTF8String].toString
    +      case _ =>
    +        throw new IllegalArgumentException("Expressions of odd index should be" +
    +          s" Literal(_, StringType), get ${name.dataType} instead")
    +    }
    +  }
    +
    +  override def foldable: Boolean = children.forall(_.foldable)
    +
    +  override lazy val resolved: Boolean = childrenResolved
    --- End diff --

    We'd better remove this, as it's covered by its parent class.
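The name/value pairing that `CreateNamedStruct` applies to its children can be sketched outside Catalyst. The following is a minimal plain-Python illustration, not the Spark implementation: `split_named_struct_args` is a hypothetical helper, and plain Python strings stand in for the non-null string literals the Scala code requires at the name positions.

```python
def split_named_struct_args(children):
    """Split (name1, val1, name2, val2, ...) into parallel name and value lists.

    Hypothetical helper for illustration only; it mirrors the even/odd index
    filtering in the diff, with Python strings standing in for string literals.
    """
    if len(children) % 2 != 0:
        raise ValueError("named struct expects an even number of arguments")

    names = list(children[0::2])   # positions 0, 2, 4, ... hold the field names
    values = list(children[1::2])  # positions 1, 3, 5, ... hold the field values

    for name in names:
        if not isinstance(name, str):
            raise TypeError(
                f"field names must be strings, got {type(name).__name__} instead"
            )
    return names, values


names, values = split_named_struct_args(["a", 1, "b", 2.5])
print(dict(zip(names, values)))
```

Slicing with a step of 2 plays the role of `zipWithIndex.filter(_._2 % 2 == ...)` in the Scala version; both keep the names and values in their original order.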
[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...
Github user yijieshen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6874#discussion_r32725827

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -737,6 +735,21 @@ object functions {
       }

       /**
    +   * Creates a new struct column with given field names and columns.
    +   * The input columns should be of length 2*n and follow (name1, col1, name2, col2),
    +   * name* should be String Literal
    +   *
    +   * @group normal_funcs
    +   * @since 1.5.0
    +   */
    +  @scala.annotation.varargs
    +  def named_struct(cols: Column*): Column = {
    --- End diff --

    OK.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113146910 [Test build #35130 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35130/consoleFull) for PR 6881 at commit [`2efe8ba`](https://github.com/apache/spark/commit/2efe8ba0ad1a8371c0493b7e247a683156da17b0).
[GitHub] spark pull request: [SPARK-7155] [CORE] Allow newAPIHadoopFile to ...
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5708#discussion_r32719742

    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -926,7 +926,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
         // The call to new NewHadoopJob automatically adds security credentials to conf,
         // so we don't need to explicitly add them ourselves
         val job = new NewHadoopJob(conf)
    -    NewFileInputFormat.addInputPath(job, new Path(path))
    +    // Use addInputPaths so that newAPIHadoopFile aligns with hadoopFile in taking
    +    // comma separated files as input. (see SPARK-7155)
    +    NewFileInputFormat.addInputPaths(job, path)
    --- End diff --

    The problem is that the rest of the API already used `setInputPaths`, so one or the other behavior really needed to change in order to fix that. I think the logic was that nobody _should_ have been relying on anything but the method arg to set the path. I personally think it's less confusing to not have two ways to specify a path. At this point, though, I think it would need a very good reason to change the behavior again, since it's not a question of fixing an inconsistency anymore.
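The behavioral difference behind this change can be illustrated with a deliberately simplified model. In the sketch below a plain Python list stands in for a job's input paths, and `add_input_path` and `add_input_paths` are hypothetical stand-ins named after the Hadoop methods in the diff. The comma handling is an assumption reduced to a bare `split(",")`; Hadoop's actual parsing is not reproduced here.

```python
def add_input_path(job_paths, path):
    """Model of the singular call: the whole argument is one path."""
    job_paths.append(path)


def add_input_paths(job_paths, comma_separated_paths):
    """Model of the plural call: the argument is split on commas first.

    Simplified assumption for illustration; empty segments are dropped.
    """
    job_paths.extend(p for p in comma_separated_paths.split(",") if p)


arg = "/data/a.txt,/data/b.txt"

before = []
add_input_path(before, arg)    # one entry that contains a literal comma

after = []
add_input_paths(after, arg)    # two separate entries

print(before)
print(after)
```

Under this model the same string argument yields one input path before the change and two after it, which is the alignment with `hadoopFile` that the code comment in the diff describes.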
[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...
Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6874#discussion_r32724921

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -737,6 +735,21 @@ object functions {
       }

       /**
    +   * Creates a new struct column with given field names and columns.
    +   * The input columns should be of length 2*n and follow (name1, col1, name2, col2),
    +   * name* should be String Literal
    +   *
    +   * @group normal_funcs
    +   * @since 1.5.0
    +   */
    +  @scala.annotation.varargs
    +  def named_struct(cols: Column*): Column = {
    --- End diff --

    The function name should be in camel case: `namedStruct`?
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113146720 Merged build triggered.
[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6843#issuecomment-113159272 Merged build started.
[GitHub] spark pull request: [SPARK-6740][SQL] Fix NOT operator precedence.
Github user smola commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6326#discussion_r32732289

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
    @@ -228,7 +228,12 @@ class SqlParser extends AbstractSparkSQLParser with DataTypeParser {
           andExpression * (OR ^^^ { (e1: Expression, e2: Expression) => Or(e1, e2) })

       protected lazy val andExpression: Parser[Expression] =
    -    comparisonExpression * (AND ^^^ { (e1: Expression, e2: Expression) => And(e1, e2) })
    +    booleanFactor * (AND ^^^ { (e1: Expression, e2: Expression) => And(e1, e2) })
    --- End diff --

    @chenghao-intel Not sure how. Adding the NOT clause in the expression rule would break precedence rules. Also, binding expression -> orExpression -> andExpression -> booleanFactor -> comparison is pretty much the way it is expressed in the grammars for standard SQL.
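The rule chain described in this comment (orExpression, then andExpression, then booleanFactor, then comparison) can be sketched as a small recursive-descent parser. This is a plain-Python illustration rather than the Scala parser-combinator code from the PR. The body of `booleanFactor` is not shown in the diff, so the NOT handling below is an assumption, and the tokenizer, the tuple-based tree, and the `=`-only comparison are all simplifications introduced here.

```python
import re


class BooleanParser:
    """Each rule delegates to the next tighter-binding rule, as in the comment."""

    def __init__(self, text):
        # Simplified tokenizer: words and '=' only; anything else is ignored.
        self.tokens = re.findall(r"\w+|=", text)
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos].upper()
        return None

    def advance(self):
        if self.pos >= len(self.tokens):
            raise ValueError("unexpected end of input")
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def or_expression(self):
        left = self.and_expression()
        while self.peek() == "OR":
            self.advance()
            left = ("Or", left, self.and_expression())
        return left

    def and_expression(self):
        left = self.boolean_factor()
        while self.peek() == "AND":
            self.advance()
            left = ("And", left, self.boolean_factor())
        return left

    def boolean_factor(self):
        # Assumed shape: an optional NOT wrapping what follows it.
        if self.peek() == "NOT":
            self.advance()
            return ("Not", self.boolean_factor())
        return self.comparison()

    def comparison(self):
        left = self.advance()
        if self.peek() == "=":
            self.advance()
            return ("Eq", left, self.advance())
        return left


def parse(text):
    parser = BooleanParser(text)
    tree = parser.or_expression()
    if parser.pos != len(parser.tokens):
        raise ValueError("trailing tokens after expression")
    return tree


print(parse("NOT a = b AND c"))
```

Because NOT lives in its own rule between AND and comparison, `NOT a = b AND c` groups as `And(Not(Eq(a, b)), c)`: NOT binds looser than the comparison and tighter than AND.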
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113165776 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113169097 Merged build triggered.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113169123 Merged build started.
[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...
Github user dibbhatt commented on the pull request: https://github.com/apache/spark/pull/6707#issuecomment-113174385 Hi @tdas. Let me know if the latest changes are fine.
[GitHub] spark pull request: SPARK-4644 blockjoin
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6883#issuecomment-113178842 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113184674 Merged build started.
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113184578 Merged build triggered.
[GitHub] spark pull request: [SPARK-8320] [Streaming] Add example in stream...
Github user koeninger commented on the pull request: https://github.com/apache/spark/pull/6862#issuecomment-113184801 I'm not a Python programmer, but isn't the direct translation of that `kafkaStreams = map(lambda _: KafkaUtils.createStream(...), range(0, numStreams))`? Maybe append is more idiomatic... At any rate, what's there looks like it will work.
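The alternatives raised in this comment can be compared side by side without Spark. In the sketch below `create_stream` is a hypothetical stand-in for `KafkaUtils.createStream(...)`, whose real arguments are elided in the thread; it takes an index only so the resulting streams are distinguishable.

```python
def create_stream(index):
    """Hypothetical stand-in for KafkaUtils.createStream(...)."""
    return f"stream-{index}"


num_streams = 3

# Explicit loop with append.
appended = []
for i in range(num_streams):
    appended.append(create_stream(i))

# map(): on Python 3 this returns a lazy iterator, so nothing is created
# until it is consumed; list() forces the calls and yields a real list.
mapped = list(map(create_stream, range(num_streams)))

# List comprehension: eager on both Python 2 and Python 3.
comprehended = [create_stream(i) for i in range(num_streams)]

print(appended == mapped == comprehended)
```

All three produce the same list. The practical difference is laziness: a bare `map(...)` on Python 3 would defer stream creation until something iterates over it, whereas the loop and the comprehension create the streams immediately.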
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113169281 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8348][SQL] Add in operator to DataFrame...
Github user yu-iskw commented on the pull request: https://github.com/apache/spark/pull/6824#issuecomment-113181215 Jenkins, test this please.
[GitHub] spark pull request: [SPARK-8056][SQL] Design an easier way to cons...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6686#issuecomment-113189491 Merged build triggered.
[GitHub] spark pull request: [SPARK-8056][SQL] Design an easier way to cons...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6686#issuecomment-113189528 Merged build started.
[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/6843#issuecomment-113191198

    [Test build #35132 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35132/console) for PR 6843 at commit [`05cc18e`](https://github.com/apache/spark/commit/05cc18e37be9f2e23d3fe99a20892e91330ce469).
    * This patch **passes all tests**.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
      * `case class Ascii(child: Expression) extends UnaryExpression`
      * `case class Base64(child: Expression) extends UnaryExpression`
      * `case class UnBase64(child: Expression) extends UnaryExpression`
      * `case class Decode(bin: Expression, charset: Expression) extends Expression`
      * `case class Encode(value: Expression, charset: Expression) extends Expression`
[GitHub] spark pull request: [SPARK-6740][SQL] Fix NOT operator precedence.
Github user smola commented on the pull request: https://github.com/apache/spark/pull/6326#issuecomment-113166469 @marmbrus Done.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/6881#issuecomment-113166685

    [Test build #35127 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35127/console) for PR 6881 at commit [`35fa5fb`](https://github.com/apache/spark/commit/35fa5fbd7b97879acdf1d2027ed0fa587b8ae301).
    * This patch **fails PySpark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user suyanNone commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113166545 @andrewor14 @srowen Already refined per the comments.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113166266 Merged build triggered.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113166319 Merged build started.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113166774 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113166827 [Test build #35133 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35133/consoleFull) for PR 4055 at commit [`0c161a7`](https://github.com/apache/spark/commit/0c161a7abc99d470c6450af143bb580fec7e3bc3).
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113169866 [Test build #35134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35134/consoleFull) for PR 4055 at commit [`d836d83`](https://github.com/apache/spark/commit/d836d83001225cc170fa2d38ebd8c35430b7bfdc).
[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an Arti...
Github user hntd187 commented on the pull request: https://github.com/apache/spark/pull/1290#issuecomment-11317 @avulanov Also, this is going to add a dependency on the HDF5 library. I think it should be handled the way netlib is handled, with the user having to enable a profile when building Spark. So normally it wouldn't be available, but if you build with the profile enabled you can use it. I'll update the POM to account for that.
[GitHub] spark pull request: [SPARK-7155] [CORE] Allow newAPIHadoopFile to ...
Github user EugenCepoi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5708#discussion_r32741755

    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -926,7 +926,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
         // The call to new NewHadoopJob automatically adds security credentials to conf,
         // so we don't need to explicitly add them ourselves
         val job = new NewHadoopJob(conf)
    -    NewFileInputFormat.addInputPath(job, new Path(path))
    +    // Use addInputPaths so that newAPIHadoopFile aligns with hadoopFile in taking
    +    // comma separated files as input. (see SPARK-7155)
    +    NewFileInputFormat.addInputPaths(job, path)
    --- End diff --

    Continued the conversation on the [JIRA ticket SPARK-8439](https://issues.apache.org/jira/browse/SPARK-8439).
[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6647#issuecomment-113190638 Merged build triggered.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113167204 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113167200 [Test build #35133 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35133/console) for PR 4055 at commit [`0c161a7`](https://github.com/apache/spark/commit/0c161a7abc99d470c6450af143bb580fec7e3bc3).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6882#issuecomment-113177974 [Test build #35129 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35129/console) for PR 6882 at commit [`402f746`](https://github.com/apache/spark/commit/402f746e4215a28c49806d84c1d3d993f18c9f8d).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class ElementwiseProduct(VectorTransformer):`
  * `case class CreateStruct(children: Seq[Expression]) extends Expression`
  * `case class Logarithm(left: Expression, right: Expression)`
  * `case class SetCommand(kv: Option[(String, Option[String])]) extends RunnableCommand with Logging`
[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6882#issuecomment-113178136 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8348][SQL] Add in operator to DataFrame...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6824#issuecomment-113182938 Merged build triggered.
[GitHub] spark pull request: [SPARK-8348][SQL] Add in operator to DataFrame...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6824#issuecomment-113182987 Merged build started.
[GitHub] spark pull request: [SPARK-8348][SQL] Add in operator to DataFrame...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6824#issuecomment-113183907 [Test build #35135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35135/consoleFull) for PR 6824 at commit [`6f744ac`](https://github.com/apache/spark/commit/6f744ac88cb6c0905bf2297bd4d85e53037090fb).
[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6877#issuecomment-113188524 [Test build #35128 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35128/console) for PR 6877 at commit [`a3cd55b`](https://github.com/apache/spark/commit/a3cd55b61440fd9121e50b35fb3a0325986cd550).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6647#issuecomment-113190675 Merged build started.
[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6647#issuecomment-113196994 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6647#issuecomment-113196957 [Test build #35138 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35138/console) for PR 6647 at commit [`02ed8a3`](https://github.com/apache/spark/commit/02ed8a347b2e35557a466a4d9b82694473e72e37).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class UnresolvedAlias(child: Expression) extends NamedExpression`
  * `abstract class ExtractValueWithStruct extends ExtractValue`
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113165679 [Test build #35126 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35126/console) for PR 6262 at commit [`3ab8c7a`](https://github.com/apache/spark/commit/3ab8c7a4e666fc0b9d60b1462e8f233b94ce783e).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5707#issuecomment-113169048 Merged build finished. Test FAILed.
[GitHub] spark pull request: SPARK-4644 blockjoin
GitHub user koertkuipers opened a pull request: https://github.com/apache/spark/pull/6883 SPARK-4644 blockjoin
Although the discussion (and design doc) under SPARK-4644 seems focused on other aspects of skew (mostly OOM) than this pull request (which focuses on avoiding a single reducer taking a long time), I decided to put this pull request under SPARK-4644 anyhow, to avoid the proliferation of JIRA tickets. If this is not the right place, let me know and I will move it.
Inspired by block join in scalding. From the scalding docs: This is useful in cases where the data has extreme skew. A symptom of this is that we may see a job stuck for a very long time on a small number of reducers. A block join is a way to get around this: we add a random integer field and a replica field to every tuple in the left and right pipes. We then join on the original keys and on these new dummy fields. These dummy fields make it less likely that the skewed keys will be hashed to the same reducer. The final data size is right * rightReplication + left * leftReplication but because of the fragmentation, we are guaranteed the same number of hits as the original join.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/tresata/spark feat-blockjoin
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/6883.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #6883
commit 77d8fee6ad7ba5f83eb0c82b7f1625e2206a5446
Author: Koert Kuipers ko...@tresata.com
Date: 2015-06-17T20:35:18Z
    add blockJoin, blockLeftOuterJoin and blockRightOuterJoin to spark core
commit d1fd3e020812c72c44a6461d9c94065e2784cdbb
Author: Koert Kuipers ko...@tresata.com
Date: 2015-06-17T23:48:43Z
    correct scaladocs for block join functions
commit 2114df748f62b53155d7db5524e163504cead228
Author: Koert Kuipers ko...@tresata.com
Date: 2015-06-18T03:36:21Z
    add block joins to java api
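The replicate-and-salt idea quoted from the scalding docs in this pull request description can be sketched in plain Python, without Spark. This is only an illustration under stated assumptions: `add_replication_fields` and `block_join` are hypothetical names, not the functions added by the pull request, and the in-memory dictionary stands in for the shuffle that a real cluster would perform. Each left record is copied `left_replication` times and each right record `right_replication` times, and the two dummy fields are arranged so that every original (left, right) pair meets in exactly one block.

```python
import random
from collections import defaultdict


def add_replication_fields(pairs, replication, other_replication, swap, rng):
    """Copy each (key, value) `replication` times and tag every copy with a block id.

    One coordinate of the block id is the replica number; the other is a random
    integer drawn once per record from range(other_replication).
    """
    out = []
    for key, value in pairs:
        rand = rng.randrange(other_replication)
        for rep in range(replication):
            block = (rand, rep) if swap else (rep, rand)
            out.append(((key, block), value))
    return out


def block_join(left, right, left_replication, right_replication, seed=0):
    """Inner join on key, spread over left_replication * right_replication blocks per key."""
    rng = random.Random(seed)
    new_left = add_replication_fields(left, left_replication, right_replication, False, rng)
    new_right = add_replication_fields(right, right_replication, left_replication, True, rng)

    # Group the right side by (key, block); in a cluster this is where the
    # salted key would decide which reducer receives the record.
    index = defaultdict(list)
    for salted_key, value in new_right:
        index[salted_key].append(value)

    return [
        (key, (left_value, right_value))
        for (key, block), left_value in new_left
        for right_value in index.get((key, block), [])
    ]
```

Sorting the output of `block_join(left, right, 3, 2)` and comparing it with a plain nested-loop join over the same inputs should give identical lists, which is the "same number of hits" property the quoted text describes.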
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113184900 [Test build #35136 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35136/consoleFull) for PR 6262 at commit [`3344a21`](https://github.com/apache/spark/commit/3344a2171eeb54c07e9b8af036e327e4e4de143f).
[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6877#issuecomment-113188582 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8056][SQL] Design an easier way to cons...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6686#issuecomment-113190209 [Test build #35137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35137/consoleFull) for PR 6686 at commit [`8109e00`](https://github.com/apache/spark/commit/8109e0067b3abce6f4eec937b39c6d7db2eb6b71).
[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6647#issuecomment-113191390 [Test build #35138 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35138/consoleFull) for PR 6647 at commit [`02ed8a3`](https://github.com/apache/spark/commit/02ed8a347b2e35557a466a4d9b82694473e72e37).
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113191005 [Test build #35134 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35134/console) for PR 4055 at commit [`d836d83`](https://github.com/apache/spark/commit/d836d83001225cc170fa2d38ebd8c35430b7bfdc).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4055#issuecomment-113191053 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6843#issuecomment-113191227 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5707#issuecomment-113169006 [Test build #35131 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35131/console) for PR 5707 at commit [`1502d13`](https://github.com/apache/spark/commit/1502d13e535cd76aa3afaaca70a7cbe0c28b4d29).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6881#issuecomment-113169252 [Test build #35130 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35130/console) for PR 6881 at commit [`2efe8ba`](https://github.com/apache/spark/commit/2efe8ba0ad1a8371c0493b7e247a683156da17b0).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [WIP][SQL][SPARK-7088] Fix analysis for 3rd pa...
Github user smola commented on the pull request: https://github.com/apache/spark/pull/6853#issuecomment-113169112 @marmbrus Yes, this patch is meant just to delay the check until check analysis. The reason is that just because the ResolveReferences rule cannot resolve the plan, that does not mean that no other rule can resolve it. I think this is the main idea behind how rules work in Catalyst, right? Each rule takes care of what it knows and ignores the unknown. With respect to my use case, my custom logical plan does produce new attributes. I have also added resolution rules for it on my side. So yes, analysis is checked. But the current problem with this is that I need to maintain a copy of ResolveReferences (i.e. FixedResolveReferences) in my code, instead of just adding my new logic in ResolveMyCustomPlan. Then I have to override SQLContext and the analyzer just to be able to replace the default rule with mine.
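The argument in this comment, that each rule handles what it knows and that failure should only be reported once all rules have run, can be illustrated with a toy rule executor. This is a minimal sketch in Python and not Catalyst's actual API: `Node`, `resolve_builtin`, `resolve_custom`, `analyze` and `check_analysis` are all hypothetical names chosen for the illustration.

```python
from typing import Callable, List, NamedTuple, Tuple


class Node(NamedTuple):
    kind: str        # hypothetical node kind, e.g. "project" or "custom"
    resolved: bool


Plan = Tuple[Node, ...]
Rule = Callable[[Plan], Plan]


def resolve_builtin(plan: Plan) -> Plan:
    """Resolves only "project" nodes and leaves every other node untouched."""
    return tuple(n._replace(resolved=True) if n.kind == "project" else n for n in plan)


def resolve_custom(plan: Plan) -> Plan:
    """A user-supplied rule that only knows about "custom" nodes."""
    return tuple(n._replace(resolved=True) if n.kind == "custom" else n for n in plan)


def analyze(plan: Plan, rules: List[Rule], max_iterations: int = 10) -> Plan:
    """Apply all rules repeatedly until the plan stops changing."""
    for _ in range(max_iterations):
        new_plan = plan
        for rule in rules:
            new_plan = rule(new_plan)
        if new_plan == plan:
            break
        plan = new_plan
    return plan


def check_analysis(plan: Plan) -> None:
    """The only place that fails: runs after every rule has had its chance."""
    unresolved = [n.kind for n in plan if not n.resolved]
    if unresolved:
        raise ValueError(f"unresolved nodes: {unresolved}")
```

With both rules registered, `check_analysis(analyze(plan, [resolve_builtin, resolve_custom]))` passes for a plan containing a "custom" node. If `resolve_builtin` raised as soon as it met a node it could not handle, the custom rule would never get to run, which is the situation the comment describes.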
[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-113197110 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-8320] [Streaming] Add example in stream...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/6862#discussion_r32744362
--- Diff: docs/streaming-programming-guide.md ---
@@ -1937,6 +1937,16 @@ JavaPairDStream<String, String> unifiedStream = streamingContext.union(kafkaStre
 unifiedStream.print();
 {% endhighlight %}
 </div>
+<div data-lang="python" markdown="1">
+{% highlight python %}
+numStreams = 5
+kafkaStreams = []
+for _ in range (numStreams):
+    kafkaStreams.append(KafkaUtils.createStream(...))
--- End diff --
Nit: List comprehension is more Pythonic
```
kafkaStreams = [KafkaUtils.createStream(...) for _ in range (numStreams)]
```
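The equivalence behind this nit can be checked with a small self-contained sketch. The arguments to `KafkaUtils.createStream(...)` are elided in the diff, so the example below uses a hypothetical stand-in, `create_stream`, that just returns a fresh label on each call; nothing here touches Spark or Kafka.

```python
import itertools

_ids = itertools.count()


def create_stream():
    """Hypothetical stand-in for KafkaUtils.createStream(...); returns a new label per call."""
    return f"stream-{next(_ids)}"


num_streams = 5

# Loop form, as written in the diff under review.
streams_loop = []
for _ in range(num_streams):
    streams_loop.append(create_stream())

# Comprehension form suggested in the review comment.
streams_comp = [create_stream() for _ in range(num_streams)]
```

Both forms call the factory once per iteration, so each list holds `num_streams` distinct objects. That differs from `[create_stream()] * num_streams`, which would repeat a single object.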
[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6880#issuecomment-113197083 **[Test build #35125 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35125/console)** for PR 6880 at commit [`0c0a478`](https://github.com/apache/spark/commit/0c0a478568b8abcd37744f2435ef359e9d7f2392) after a configured wait of `175m`.