date:20150618

[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6880#issuecomment-113145019
  
  [Test build #35125 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35125/consoleFull) for PR 6880 at commit [`0c0a478`](https://github.com/apache/spark/commit/0c0a478568b8abcd37744f2435ef359e9d7f2392).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6262#issuecomment-113145071
  
  [Test build #35126 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35126/consoleFull) for PR 6262 at commit [`3ab8c7a`](https://github.com/apache/spark/commit/3ab8c7a4e666fc0b9d60b1462e8f233b94ce783e).



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread yijieshen

Github user yijieshen commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113145106
  
I'd prefer to keep the original column names in the newly created struct, since
I think they're more meaningful than `col1, col2, col3`, and we could just name
the unnamed columns `col1, col2 ...`, which is also compatible with Hive's
semantics.
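
The naming rule described above can be sketched in plain Scala. This is only an illustration, not the code from either PR: `StructFieldNames` and `fieldNames` are hypothetical names, and numbering the fallback names by position (starting at 1) is an assumption about the Hive-style `col1, col2 ...` convention.

```scala
object StructFieldNames {
  /** Keep a column's own name when it has one; otherwise fall back to a
    * positional Hive-style name (col1, col2, ...), numbered from 1.
    * Hypothetical helper for illustration only. */
  def fieldNames(columnNames: Seq[Option[String]]): Seq[String] =
    columnNames.zipWithIndex.map {
      case (Some(name), _) => name             // named column: keep the original name
      case (None, idx)     => s"col${idx + 1}" // unnamed expression: positional default
    }
}
```

For example, `StructFieldNames.fieldNames(Seq(Some("a"), None, Some("c")))` keeps `a` and `c` and names only the middle field by position.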

I've also made related changes in #6874 to loosen the parameter requirements of 
[`struct`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L723).
 



[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6882#issuecomment-113145081
  
  [Test build #35129 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35129/consoleFull) for PR 6882 at commit [`402f746`](https://github.com/apache/spark/commit/402f746e4215a28c49806d84c1d3d993f18c9f8d).



[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6877#issuecomment-113145010
  
  [Test build #35128 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35128/consoleFull) for PR 6877 at commit [`a3cd55b`](https://github.com/apache/spark/commit/a3cd55b61440fd9121e50b35fb3a0325986cd550).



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113146743
  
Merged build started.



[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...

2015-06-18 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/6843#discussion_r32727925
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala ---
@@ -313,3 +313,131 @@ case class StringLength(child: Expression) extends UnaryExpression with ExpectsI
     defineCodeGen(ctx, ev, c => s"($c).length()")
   }
 }
+
+/**
+ * Returns the numeric value of the first character of str.
+ */
+case class Ascii(child: Expression) extends UnaryExpression with ExpectsInputTypes {
+  override def dataType: DataType = IntegerType
+  override def expectedChildTypes: Seq[DataType] = Seq(StringType)
+
+  override def eval(input: InternalRow): Any = {
+    val string = child.eval(input)
+    if (string == null) {
+      null
+    } else {
+      val bytes = string.asInstanceOf[UTF8String].getBytes
+      if (bytes.length > 0) {
--- End diff --

I copied the logic from Hive; Hive doesn't check whether it's a UTF-8 string.
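
For reference, the behaviour under discussion can be sketched outside Catalyst in plain Scala. This is not the PR's code: `AsciiSketch` and `ascii` are hypothetical names, the quoted diff is cut off before the branches, and returning 0 for an empty string is an assumption based on Hive's `ascii` UDF.

```scala
import java.nio.charset.StandardCharsets

object AsciiSketch {
  /** Value of the first byte of the UTF-8 encoding of `str`.
    * None stands in for SQL NULL; the 0 for an empty string is an
    * assumption modelled on Hive's ascii(). Hypothetical helper. */
  def ascii(str: String): Option[Int] =
    Option(str).map { s =>
      val bytes = s.getBytes(StandardCharsets.UTF_8)
      // No check that the first character is ASCII: a multi-byte
      // character yields its (signed) leading byte.
      if (bytes.length > 0) bytes(0).toInt else 0
    }
}
```

Because only the first byte is inspected, a non-ASCII first character such as `é` (UTF-8 bytes `0xC3 0xA9`) produces a negative value from the signed leading byte, which is the edge case the review comment is pointing at.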



[GitHub] spark pull request: [Followup SPARK-8387][WEBUI] Update driver log...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6878#issuecomment-113109990
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [Followup SPARK-8387][WEBUI] Update driver log...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6878#issuecomment-113109866
  
  [Test build #35120 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35120/console) for PR 6878 at commit [`13be948`](https://github.com/apache/spark/commit/13be948b455ee4ee6db5bd6beafd9854e5428e68).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-8160][SQL]Support using external sortin...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6875#issuecomment-113121395
  
Build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113144419
  
  [Test build #35127 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35127/consoleFull) for PR 6881 at commit [`35fa5fb`](https://github.com/apache/spark/commit/35fa5fbd7b97879acdf1d2027ed0fa587b8ae301).



[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6262#issuecomment-113144080
  
Merged build started.



[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5707#issuecomment-113149919
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5707#issuecomment-113149945
  
Merged build started.



[GitHub] spark pull request: [SPARK-8160][SQL]Support using external sortin...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6875#issuecomment-113121333
  
  [Test build #35122 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35122/console) for PR 6875 at commit [`44a3e62`](https://github.com/apache/spark/commit/44a3e62bc1cc2d06ec29a7095e9c77e1da21b772).
 * This patch **passes all tests**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
  * `trait GeneratedAggregate `
  * `case class HashGeneratedAggregate(`
  * `case class SortMergeAggregate(`
  * `  case class ComputedAggregate(`
  * `case class SortMergeGeneratedAggregate(`




[GitHub] spark pull request: [SPARK-7155] [CORE] Allow newAPIHadoopFile to ...

2015-06-18 Thread srowen

Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5708#discussion_r32720959
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -926,7 +926,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
     // The call to new NewHadoopJob automatically adds security credentials to conf,
     // so we don't need to explicitly add them ourselves
     val job = new NewHadoopJob(conf)
-    NewFileInputFormat.addInputPath(job, new Path(path))
+    // Use addInputPaths so that newAPIHadoopFile aligns with hadoopFile in taking
+    // comma separated files as input. (see SPARK-7155)
+    NewFileInputFormat.addInputPaths(job, path)
--- End diff --

... but then what would you do about the inconsistency problem? Some 
methods would then use one, and others would use the other. That's a bigger problem.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113124607
  
@yijieshen



[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...

2015-06-18 Thread chenghao-intel

GitHub user chenghao-intel opened a pull request:

https://github.com/apache/spark/pull/6882

[SPARK-7862] [SQL] Disable the error message redirect to stderr

This is a follow-up to #6404. ScriptTransformation prints error messages 
directly to stderr, which could be a disaster for the application log.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenghao-intel/spark verbose

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6882.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6882


commit 402f746e4215a28c49806d84c1d3d993f18c9f8d
Author: Cheng Hao <hao.ch...@intel.com>
Date:   2015-06-18T12:12:50Z

disable the error message redirection for stderr





[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...

2015-06-18 Thread yijieshen

Github user yijieshen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6874#discussion_r32726438
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala ---
@@ -79,3 +80,44 @@ case class CreateStruct(children: Seq[Expression]) extends Expression {
     InternalRow(children.map(_.eval(input)): _*)
   }
 }
+
+/**
+ * Creates a struct with the given field names and values
+ *
+ * @param children Seq(name1, val1, name2, val2, ...)
+ */
+case class CreateNamedStruct(children: Seq[Expression]) extends Expression {
+  assert(children.size % 2 == 0, "NamedStruct expects an even number of arguments.")
+
+  private val nameExprs = children.zipWithIndex.filter(_._2 % 2 == 0).map(_._1)
+  private val valExprs = children.zipWithIndex.filter(_._2 % 2 == 1).map(_._1)
+
+  private lazy val names = nameExprs.map { case name =>
+    name match {
+      case NonNullLiteral(str, StringType) =>
+        str.asInstanceOf[UTF8String].toString
+      case _ =>
+        throw new IllegalArgumentException("Expressions of odd index should be" +
+          s" Literal(_, StringType), get ${name.dataType} instead")
+    }
+  }
+
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override lazy val resolved: Boolean = childrenResolved
--- End diff --

Got it.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113150996
  
OK, sounds reasonable to me; closing this PR.



[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6843#issuecomment-113160423
  
  [Test build #35132 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35132/consoleFull) for PR 6843 at commit [`05cc18e`](https://github.com/apache/spark/commit/05cc18e37be9f2e23d3fe99a20892e91330ce469).



[GitHub] spark pull request: [SPARK-7155] [CORE] Allow newAPIHadoopFile to ...

2015-06-18 Thread EugenCepoi

Github user EugenCepoi commented on a diff in the pull request:

https://github.com/apache/spark/pull/5708#discussion_r32720782
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -926,7 +926,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
     // The call to new NewHadoopJob automatically adds security credentials to conf,
     // so we don't need to explicitly add them ourselves
     val job = new NewHadoopJob(conf)
-    NewFileInputFormat.addInputPath(job, new Path(path))
+    // Use addInputPaths so that newAPIHadoopFile aligns with hadoopFile in taking
+    // comma separated files as input. (see SPARK-7155)
+    NewFileInputFormat.addInputPaths(job, path)
--- End diff --

The reason to use addInputPaths would be to preserve compatibility. I 
was lucky to have some unit tests that detected this change, but others 
might encounter it in production. 

But as this has already been released, I guess we can stick with 
`setInputPaths`.
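
To make the compatibility concern concrete, here is a deliberately simplified model in plain Scala of the two interpretations of the `path` argument. It does not call Hadoop: `InputPathsSketch` and both helpers are hypothetical, and the naive comma split is an assumption that ignores details of Hadoop's real path parsing (for example glob braces).

```scala
object InputPathsSketch {
  /** Old interpretation: the whole string is one path, as when it is
    * wrapped in a single Path object. Hypothetical helper. */
  def asSinglePath(path: String): Seq[String] = Seq(path)

  /** New interpretation: the string is a comma-separated list of paths.
    * Naive split for illustration only; not Hadoop's actual parser. */
  def asCommaSeparatedPaths(paths: String): Seq[String] =
    paths.split(",").toSeq
}
```

Under this model a file whose name contains a comma, such as `/data/a,b.txt`, is one input under the old interpretation and two inputs (`/data/a` and `b.txt`) under the new one, which is the kind of change existing unit tests could catch.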



[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...

2015-06-18 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/6874#discussion_r32724856
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -737,6 +735,21 @@ object functions {
   }
 
   /**
+   * Creates a new struct column with given field names and columns.
+   * The input columns should be of length 2*n and follow (name1, col1, name2, col2),
+   * name* should be String Literal
+   *
+   * @group normal_funcs
+   * @since 1.5.0
+   */
+  @scala.annotation.varargs
+  def named_struct(cols: Column*): Column = {
+    require(cols.length % 2 == 0,
+      s"named_struct expects an even number of arguments.")
+    CreateNamedStruct(cols.map(_.expr))
+  }
+
+  /**
--- End diff --

We usually also provide a column-name version of the API. For example:
```
def namedStruct(colName: String, colNames: String*): Column
```



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread chenghao-intel

GitHub user chenghao-intel opened a pull request:

https://github.com/apache/spark/pull/6881

[SPARK-8283][SQL] CreateStruct should not specify the field names

`CreateStruct` = `GenericUDFStruct`, which always gives the default column 
names to the output struct, like (col1, col2...colN)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenghao-intel/spark struct

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6881.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6881


commit 673ef10e19ceef6b03129e289c4fab45d3585f92
Author: Cheng Hao <hao.ch...@intel.com>
Date:   2015-06-18T11:15:48Z

Give default field names for CreateStruct

commit b49227671574db6284ddaad016e5d30b96788f2a
Author: Cheng Hao <hao.ch...@intel.com>
Date:   2015-06-18T11:22:13Z

scalastyle

commit 35fa5fbd7b97879acdf1d2027ed0fa587b8ae301
Author: Cheng Hao <hao.ch...@intel.com>
Date:   2015-06-18T11:30:08Z

fix the bugs in unittest for CreateStruct





[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6882#issuecomment-113144079
  
Merged build started.



[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6877#issuecomment-113144078
  
Merged build started.



[GitHub] spark pull request: [SPARK-7913][Core]Make AppendOnlyMap use the s...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6879#issuecomment-113143934
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7913][Core]Make AppendOnlyMap use the s...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6879#issuecomment-113143840
  
  [Test build #35124 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35124/console) for PR 6879 at commit [`912c0ad`](https://github.com/apache/spark/commit/912c0adeb92d7c33af05c99970640a66868be374).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113144081
  
Merged build started.



[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6880#issuecomment-113144077
  
Merged build started.



[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...

2015-06-18 Thread yijieshen

Github user yijieshen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6874#discussion_r32726086
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -737,6 +735,21 @@ object functions {
   }
 
   /**
+   * Creates a new struct column with given field names and columns.
+   * The input columns should be of length 2*n and follow (name1, col1, name2, col2),
+   * name* should be String Literal
+   *
+   * @group normal_funcs
+   * @since 1.5.0
+   */
+  @scala.annotation.varargs
+  def named_struct(cols: Column*): Column = {
+    require(cols.length % 2 == 0,
+      s"named_struct expects an even number of arguments.")
+    CreateNamedStruct(cols.map(_.expr))
+  }
+
+  /**
--- End diff --

I found it a little difficult to name the parameters in this API, since 
they should be fieldName1, value1, fieldName2, value2. I'll think about this again.
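
The alternating layout (fieldName1, value1, fieldName2, value2) can be illustrated with a small plain-Scala sketch that splits a flat argument list into names and values. `NamedStructArgs` and `splitPairs` are hypothetical names and this is not the PR's implementation, which filters by index instead; it only shows the pairing that makes the varargs hard to name.

```scala
object NamedStructArgs {
  /** Split a flat (name1, value1, name2, value2, ...) sequence into
    * (names, values). Hypothetical helper for illustration only. */
  def splitPairs[A](children: Seq[A]): (Seq[A], Seq[A]) = {
    require(children.size % 2 == 0, "expects an even number of arguments")
    // Each group of two holds one field name followed by its value.
    val pairs = children.grouped(2).toList
    (pairs.map(_.head), pairs.map(_.last))
  }
}
```

Grouping in twos keeps each name next to its value, so a mismatch shows up as an odd argument count rather than as a silently shifted field.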



[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6262#issuecomment-113112104
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6877#issuecomment-113111717
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6877#issuecomment-113111606
  
  [Test build #35119 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35119/console)
 for   PR 6877 at commit 
[`923cee4`](https://github.com/apache/spark/commit/923cee4586f5747ad596deefc352aac0429a2dc1).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5707#issuecomment-113150494
  
  [Test build #35131 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35131/consoleFull)
 for   PR 5707 at commit 
[`1502d13`](https://github.com/apache/spark/commit/1502d13e535cd76aa3afaaca70a7cbe0c28b4d29).



[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6843#issuecomment-113159214
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113125643
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6877#issuecomment-113136109
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6882#issuecomment-113137581
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8301][SQL] Improve UTF8String substring...

2015-06-18 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/6804#issuecomment-113149719
  
This should probably be `Nonnull`, which lets FindBugs perform a stricter 
check. It would be great if you could run FindBugs locally after the change.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread chenghao-intel

Github user chenghao-intel closed the pull request at:

https://github.com/apache/spark/pull/6881



[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...

2015-06-18 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/6874#discussion_r32724484
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypes.scala ---
@@ -79,3 +80,44 @@ case class CreateStruct(children: Seq[Expression]) extends Expression {
     InternalRow(children.map(_.eval(input)): _*)
   }
 }
+
+/**
+ * Creates a struct with the given field names and values
+ *
+ * @param children Seq(name1, val1, name2, val2, ...)
+ */
+case class CreateNamedStruct(children: Seq[Expression]) extends Expression {
+  assert(children.size % 2 == 0, "NamedStruct expects an even number of arguments.")
+
+  private val nameExprs = children.zipWithIndex.filter(_._2 % 2 == 0).map(_._1)
+  private val valExprs = children.zipWithIndex.filter(_._2 % 2 == 1).map(_._1)
+
+  private lazy val names = nameExprs.map { case name =>
+    name match {
+      case NonNullLiteral(str, StringType) =>
+        str.asInstanceOf[UTF8String].toString
+      case _ =>
+        throw new IllegalArgumentException("Expressions of odd index should be" +
+          s" Literal(_, StringType), get ${name.dataType} instead")
+    }
+  }
+
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override lazy val resolved: Boolean = childrenResolved
--- End diff --

We'd better remove this, as it's covered by its parent class.



[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...

2015-06-18 Thread yijieshen

Github user yijieshen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6874#discussion_r32725827
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -737,6 +735,21 @@ object functions {
   }
 
   /**
+   * Creates a new struct column with given field names and columns.
+   * The input columns should be of length 2*n and follow (name1, col1, name2, col2),
+   * name* should be String Literal
+   *
+   * @group normal_funcs
+   * @since 1.5.0
+   */
+  @scala.annotation.varargs
+  def named_struct(cols: Column*): Column = {
--- End diff --

OK.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113146910
  
  [Test build #35130 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35130/consoleFull)
 for   PR 6881 at commit 
[`2efe8ba`](https://github.com/apache/spark/commit/2efe8ba0ad1a8371c0493b7e247a683156da17b0).



[GitHub] spark pull request: [SPARK-7155] [CORE] Allow newAPIHadoopFile to ...

2015-06-18 Thread srowen

Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5708#discussion_r32719742
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -926,7 +926,9 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
     // The call to new NewHadoopJob automatically adds security credentials to conf,
     // so we don't need to explicitly add them ourselves
     val job = new NewHadoopJob(conf)
-    NewFileInputFormat.addInputPath(job, new Path(path))
+    // Use addInputPaths so that newAPIHadoopFile aligns with hadoopFile in taking
+    // comma separated files as input. (see SPARK-7155)
+    NewFileInputFormat.addInputPaths(job, path)
--- End diff --

The problem is that the rest of the API already used `setInputPaths` so one 
or the other behavior really needed to change in order to fix that. I think the 
logic was that nobody _should_ have been relying on anything but the method arg 
to set the path. I personally think it's less confusing to not have two ways to 
specify a path. At this point though I think it would need a very good reason 
to change the behavior again since it's not a question of fixing an 
inconsistency anymore.
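
To make the behavior change concrete, here is a rough Python sketch of what accepting "comma separated files as input" implies for a single path string. The helper `split_input_paths` is hypothetical and is not the Hadoop implementation; it assumes, as a simplification, that commas inside a `{...}` glob group belong to the path rather than separating paths.

```python
def split_input_paths(comma_separated):
    """Split a comma-separated path string, keeping {a,b} glob groups intact."""
    paths, current, depth = [], [], 0
    for ch in comma_separated:
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
        if ch == "," and depth == 0:
            # A top-level comma ends the current path.
            paths.append("".join(current))
            current = []
        else:
            current.append(ch)
    paths.append("".join(current))
    return [p for p in paths if p]  # drop empty segments
```

Under this sketch, a caller who previously passed a single path containing a literal comma would now see it treated as two inputs, which is the kind of compatibility concern raised in this thread.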



[GitHub] spark pull request: [SPARK-8407][SQL]complex type constructors: st...

2015-06-18 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/6874#discussion_r32724921
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -737,6 +735,21 @@ object functions {
   }
 
   /**
+   * Creates a new struct column with given field names and columns.
+   * The input columns should be of length 2*n and follow (name1, col1, name2, col2),
+   * name* should be String Literal
+   *
+   * @group normal_funcs
+   * @since 1.5.0
+   */
+  @scala.annotation.varargs
+  def named_struct(cols: Column*): Column = {
--- End diff --

The function name should be in camel case: `namedStruct`?



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113146720
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6843#issuecomment-113159272
  
Merged build started.



[GitHub] spark pull request: [SPARK-6740][SQL] Fix NOT operator precedence.

2015-06-18 Thread smola

Github user smola commented on a diff in the pull request:

https://github.com/apache/spark/pull/6326#discussion_r32732289
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -228,7 +228,12 @@ class SqlParser extends AbstractSparkSQLParser with DataTypeParser {
     andExpression * (OR ^^^ { (e1: Expression, e2: Expression) => Or(e1, e2) })
 
   protected lazy val andExpression: Parser[Expression] =
-    comparisonExpression * (AND ^^^ { (e1: Expression, e2: Expression) => And(e1, e2) })
+    booleanFactor * (AND ^^^ { (e1: Expression, e2: Expression) => And(e1, e2) })
--- End diff --

@chenghao-intel Not sure how. Adding the NOT clause in the expression rule 
would break precedence rules. Also, binding expression -> orExpression -> 
andExpression -> booleanFactor -> comparison is pretty much the way it is 
expressed in the grammars for standard SQL.
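
The rule chain described here can be illustrated with a small recursive-descent parser. This is a minimal Python sketch, not the Spark `SqlParser`: the token format, the tuple-based tree, and the reduced `comparison` rule (an identifier optionally followed by `=` and an operand) are all assumptions made for the example. It shows NOT binding tighter than AND, and AND binding tighter than OR.

```python
def parse(tokens):
    """Parse a token list using expression -> or -> and -> booleanFactor -> comparison."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def or_expression():
        left = and_expression()
        while peek() == "OR":
            take()
            left = ("Or", left, and_expression())
        return left

    def and_expression():
        left = boolean_factor()
        while peek() == "AND":
            take()
            left = ("And", left, boolean_factor())
        return left

    def boolean_factor():
        # NOT lives below AND, so it applies to a single comparison only.
        if peek() == "NOT":
            take()
            return ("Not", boolean_factor())
        return comparison()

    def comparison():
        left = take()
        if peek() == "=":
            take()
            return ("Eq", left, take())
        return left

    tree = or_expression()
    if pos != len(tokens):
        raise ValueError("unexpected token: %r" % peek())
    return tree
```

With this layering, `NOT a = 1 AND b = 2` parses as `And(Not(Eq(a, 1)), Eq(b, 2))` rather than `Not(And(...))`.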



[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6262#issuecomment-113165776
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113169097
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113169123
  
Merged build started.



[GitHub] spark pull request: [SPARK-8080][STREAMING] Receiver.store with It...

2015-06-18 Thread dibbhatt

Github user dibbhatt commented on the pull request:

https://github.com/apache/spark/pull/6707#issuecomment-113174385
  
Hi @tdas, let me know if the latest changes are fine.



[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113178842
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6262#issuecomment-113184674
  
Merged build started.



[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6262#issuecomment-113184578
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8320] [Streaming] Add example in stream...

2015-06-18 Thread koeninger

Github user koeninger commented on the pull request:

https://github.com/apache/spark/pull/6862#issuecomment-113184801
  
I'm not a Python programmer, but isn't the direct translation of that:

kafkaStreams = map(lambda _: KafkaUtils.createStream(...), range(0, numStreams))

Maybe append is more idiomatic... At any rate, what's there looks like it 
will work.
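
For comparison, the three spellings mentioned in this comment can be put side by side. The sketch below uses a hypothetical `create_stream` stand-in in place of `KafkaUtils.createStream(...)`, whose arguments are elided in the thread. One caveat worth keeping in mind: in Python 3, `map` returns a lazy iterator rather than a list, so it has to be wrapped in `list(...)` if the result is reused.

```python
def create_stream(index):
    # Hypothetical stand-in for KafkaUtils.createStream(...).
    return "stream-%d" % index

num_streams = 3

# Direct translation with map; list() is needed on Python 3.
via_map = list(map(lambda i: create_stream(i), range(num_streams)))

# List comprehension, usually considered the idiomatic form.
via_comprehension = [create_stream(i) for i in range(num_streams)]

# Explicit loop with append.
via_append = []
for i in range(num_streams):
    via_append.append(create_stream(i))
```

All three build the same list of streams, so the choice is a matter of style.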



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113169281
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-8348][SQL] Add in operator to DataFrame...

2015-06-18 Thread yu-iskw

Github user yu-iskw commented on the pull request:

https://github.com/apache/spark/pull/6824#issuecomment-113181215
  
Jenkins, test this please.



[GitHub] spark pull request: [SPARK-8056][SQL] Design an easier way to cons...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6686#issuecomment-113189491
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8056][SQL] Design an easier way to cons...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6686#issuecomment-113189528
  
Merged build started.



[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6843#issuecomment-113191198
  
  [Test build #35132 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35132/console)
 for   PR 6843 at commit 
[`05cc18e`](https://github.com/apache/spark/commit/05cc18e37be9f2e23d3fe99a20892e91330ce469).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Ascii(child: Expression) extends UnaryExpression `
  * `case class Base64(child: Expression) extends UnaryExpression `
  * `case class UnBase64(child: Expression) extends UnaryExpression `
  * `case class Decode(bin: Expression, charset: Expression) extends Expression `
  * `case class Encode(value: Expression, charset: Expression) extends Expression `




[GitHub] spark pull request: [SPARK-6740][SQL] Fix NOT operator precedence.

2015-06-18 Thread smola

Github user smola commented on the pull request:

https://github.com/apache/spark/pull/6326#issuecomment-113166469
  
@marmbrus Done.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113166685
  
  [Test build #35127 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35127/console)
 for   PR 6881 at commit 
[`35fa5fb`](https://github.com/apache/spark/commit/35fa5fbd7b97879acdf1d2027ed0fa587b8ae301).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread suyanNone

Github user suyanNone commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113166545
  
@andrewor14 @srowen I have already refined this according to the comments.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113166266
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113166319
  
Merged build started.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113166774
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113166827
  
  [Test build #35133 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35133/consoleFull)
 for   PR 4055 at commit 
[`0c161a7`](https://github.com/apache/spark/commit/0c161a7abc99d470c6450af143bb580fec7e3bc3).



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113169866
  
  [Test build #35134 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35134/consoleFull)
 for   PR 4055 at commit 
[`d836d83`](https://github.com/apache/spark/commit/d836d83001225cc170fa2d38ebd8c35430b7bfdc).



[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an Arti...

2015-06-18 Thread hntd187

Github user hntd187 commented on the pull request:

https://github.com/apache/spark/pull/1290#issuecomment-11317
  
@avulanov Also, we're going to have to add a dependency on the HDF5 library 
for this. I think this should be handled the way netlib is handled, with the 
user having to enable a profile when building Spark. So normally it wouldn't 
be available, but if you build with it you can use it. I'll update the POM to 
account for that.



[GitHub] spark pull request: [SPARK-7155] [CORE] Allow newAPIHadoopFile to ...

2015-06-18 Thread EugenCepoi

Github user EugenCepoi commented on a diff in the pull request:

https://github.com/apache/spark/pull/5708#discussion_r32741755
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -926,7 +926,9 @@ class SparkContext(config: SparkConf) extends Logging 
with ExecutorAllocationCli
 // The call to new NewHadoopJob automatically adds security 
credentials to conf,
 // so we don't need to explicitly add them ourselves
 val job = new NewHadoopJob(conf)
-NewFileInputFormat.addInputPath(job, new Path(path))
+// Use addInputPaths so that newAPIHadoopFile aligns with hadoopFile 
in taking
+// comma separated files as input. (see SPARK-7155)
+NewFileInputFormat.addInputPaths(job, path)
--- End diff --

continued the conversation on the [jira ticket 
SPARK-8439](https://issues.apache.org/jira/browse/SPARK-8439).



[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6647#issuecomment-113190638
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113167204
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113167200
  
  [Test build #35133 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35133/console)
 for   PR 4055 at commit 
[`0c161a7`](https://github.com/apache/spark/commit/0c161a7abc99d470c6450af143bb580fec7e3bc3).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6882#issuecomment-113177974
  
  [Test build #35129 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35129/console)
 for   PR 6882 at commit 
[`402f746`](https://github.com/apache/spark/commit/402f746e4215a28c49806d84c1d3d993f18c9f8d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class ElementwiseProduct(VectorTransformer):`
  * `case class CreateStruct(children: Seq[Expression]) extends Expression `
  * `case class Logarithm(left: Expression, right: Expression)`
  * `case class SetCommand(kv: Option[(String, Option[String])]) extends 
RunnableCommand with Logging `




[GitHub] spark pull request: [SPARK-7862] [SQL] Disable the error message r...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6882#issuecomment-113178136
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-8348][SQL] Add in operator to DataFrame...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6824#issuecomment-113182938
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-8348][SQL] Add in operator to DataFrame...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6824#issuecomment-113182987
  
Merged build started.



[GitHub] spark pull request: [SPARK-8348][SQL] Add in operator to DataFrame...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6824#issuecomment-113183907
  
  [Test build #35135 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35135/consoleFull)
 for   PR 6824 at commit 
[`6f744ac`](https://github.com/apache/spark/commit/6f744ac88cb6c0905bf2297bd4d85e53037090fb).



[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6877#issuecomment-113188524
  
  [Test build #35128 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35128/console)
 for   PR 6877 at commit 
[`a3cd55b`](https://github.com/apache/spark/commit/a3cd55b61440fd9121e50b35fb3a0325986cd550).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6647#issuecomment-113190675
  
Merged build started.



[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6647#issuecomment-113196994
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6647#issuecomment-113196957
  
  [Test build #35138 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35138/console)
 for   PR 6647 at commit 
[`02ed8a3`](https://github.com/apache/spark/commit/02ed8a347b2e35557a466a4d9b82694473e72e37).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class UnresolvedAlias(child: Expression) extends NamedExpression`
  * `abstract class ExtractValueWithStruct extends ExtractValue `




[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6262#issuecomment-113165679
  
  [Test build #35126 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35126/console)
 for   PR 6262 at commit 
[`3ab8c7a`](https://github.com/apache/spark/commit/3ab8c7a4e666fc0b9d60b1462e8f233b94ce783e).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5707#issuecomment-113169048
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread koertkuipers

GitHub user koertkuipers opened a pull request:

https://github.com/apache/spark/pull/6883

SPARK-4644 blockjoin

Although the discussion (and design doc) under SPARK-4644 seems focused on 
other aspects of skew (mostly OOM) than this pull request (which focuses on 
avoiding a single reducer taking a long time), I decided to put this pull 
request under SPARK-4644 anyhow, to avoid the proliferation of JIRA tickets. 
If this is not the right place, let me know and I will move it.

Inspired by block join in scalding.
From scalding docs:

This is useful in cases where the data has extreme skew. A symptom of this 
is that we may see a job stuck for a very long time on a small number of 
reducers.
A block join is a way to get around this: we add a random integer field and a 
replica field to every tuple in the left and right pipes. We then join on the 
original keys and on these new dummy fields. These dummy fields make it less 
likely that the skewed keys will be hashed to the same reducer.

The final data size is right * rightReplication + left * leftReplication 
but because of the fragmentation, we are guaranteed the same number of hits as 
the original join.
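
As an illustration of the scheme quoted above, here is a minimal Python sketch 
on plain in-memory lists. It is not the code in this pull request and not the 
scalding implementation; the names `add_replication_fields` and `block_join` 
and the `seed` parameter are hypothetical, chosen only for this example. Each 
left record gets a random block index in the range of the right replication 
and is copied once per left replica; the right side does the mirror image, so 
every original pair meets in exactly one block.

```python
import random
from collections import defaultdict


def add_replication_fields(records, replication, other_replication, swap, rng):
    """Copy each (key, value) `replication` times and tag it with a block pair.

    One block field is a random integer below `other_replication`; the other
    enumerates the replicas. `swap` mirrors the pair for the right-hand side.
    """
    out = []
    for key, value in records:
        rand_block = rng.randrange(other_replication)
        for rep in range(replication):
            block = (rep, rand_block) if swap else (rand_block, rep)
            out.append(((key, block), value))
    return out


def block_join(left, right, left_replication=1, right_replication=1, seed=0):
    """Inner join on (key, block), then drop the dummy block fields."""
    rng = random.Random(seed)  # hypothetical: seeded only to make runs repeatable
    new_left = add_replication_fields(
        left, left_replication, right_replication, False, rng)
    new_right = add_replication_fields(
        right, right_replication, left_replication, True, rng)

    # Hash the right side by composite key, as a reducer would see it.
    index = defaultdict(list)
    for composite_key, value in new_right:
        index[composite_key].append(value)

    return [
        (composite_key[0], (left_value, right_value))
        for composite_key, left_value in new_left
        for right_value in index.get(composite_key, [])
    ]
```

In this sketch the replicated inputs hold left * leftReplication and 
right * rightReplication records, while the joined output contains the same 
pairs as a plain inner join on the original key, only spread over more 
composite keys.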


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tresata/spark feat-blockjoin

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6883.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6883


commit 77d8fee6ad7ba5f83eb0c82b7f1625e2206a5446
Author: Koert Kuipers ko...@tresata.com
Date:   2015-06-17T20:35:18Z

add blockJoin, blockLeftOuterJoin and blockRightOuterJoin to spark core

commit d1fd3e020812c72c44a6461d9c94065e2784cdbb
Author: Koert Kuipers ko...@tresata.com
Date:   2015-06-17T23:48:43Z

correct scaladocs for block join functions

commit 2114df748f62b53155d7db5524e163504cead228
Author: Koert Kuipers ko...@tresata.com
Date:   2015-06-18T03:36:21Z

add block joins to java api





[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6262#issuecomment-113184900
  
  [Test build #35136 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35136/consoleFull)
 for   PR 6262 at commit 
[`3344a21`](https://github.com/apache/spark/commit/3344a2171eeb54c07e9b8af036e327e4e4de143f).



[GitHub] spark pull request: [SPARK-8434][SQL]Add a pretty parameter to s...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6877#issuecomment-113188582
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-8056][SQL] Design an easier way to cons...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6686#issuecomment-113190209
  
  [Test build #35137 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35137/consoleFull)
 for   PR 6686 at commit 
[`8109e00`](https://github.com/apache/spark/commit/8109e0067b3abce6f4eec937b39c6d7db2eb6b71).



[GitHub] spark pull request: [SPARK-8104][SQL] auto alias expressions in an...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6647#issuecomment-113191390
  
  [Test build #35138 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35138/consoleFull)
 for   PR 6647 at commit 
[`02ed8a3`](https://github.com/apache/spark/commit/02ed8a347b2e35557a466a4d9b82694473e72e37).



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113191005
  
  [Test build #35134 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35134/console)
 for   PR 4055 at commit 
[`d836d83`](https://github.com/apache/spark/commit/d836d83001225cc170fa2d38ebd8c35430b7bfdc).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-5259][CORE]Make sure mapStage.pendingta...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4055#issuecomment-113191053
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-8238][SPARK-8239][SPARK-8242][SPARK-824...

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6843#issuecomment-113191227
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-6263][MLLIB] Python MLlib API missing i...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5707#issuecomment-113169006
  
  [Test build #35131 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35131/console)
 for   PR 5707 at commit 
[`1502d13`](https://github.com/apache/spark/commit/1502d13e535cd76aa3afaaca70a7cbe0c28b4d29).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-8283][SQL] CreateStruct should not spec...

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6881#issuecomment-113169252
  
  [Test build #35130 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35130/console)
 for   PR 6881 at commit 
[`2efe8ba`](https://github.com/apache/spark/commit/2efe8ba0ad1a8371c0493b7e247a683156da17b0).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [WIP][SQL][SPARK-7088] Fix analysis for 3rd pa...

2015-06-18 Thread smola

Github user smola commented on the pull request:

https://github.com/apache/spark/pull/6853#issuecomment-113169112
  
@marmbrus Yes, this patch is meant just to delay the check until check 
analysis. The reason is that just because the ResolveReferences rule cannot 
resolve the plan does not mean that no other rule can resolve it. I think this 
is the main idea behind how rules work in catalyst, right? Each rule takes 
care of what it knows and ignores the unknown.

With respect to my use case, my custom logical plan does produce new 
attributes. I have also added resolution rules for it on my side. So yes, 
analysis is checked. But the current problem with this is that I need to 
maintain a copy of ResolveReferences (i.e. FixedResolveReferences) in my code, 
instead of just adding my new logic in ResolveMyCustomPlan. Then I have to 
override SQLContext and the analyzer just to be able to replace the default 
rule with mine.
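
To make the idea concrete, here is a small Python sketch of the pattern: each 
rule resolves only what it recognizes, and the failure check runs once after 
all rules have had their turn. This is a toy model, not Catalyst code; `Attr`, 
`resolve_builtin`, `resolve_custom`, `analyze` and `check_analysis` are 
hypothetical names invented for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Attr:
    name: str
    resolved: bool = False


def resolve_builtin(attrs, known=frozenset({"id", "name"})):
    # Resolves only the columns it knows about; leaves everything else alone.
    return [replace(a, resolved=True) if a.name in known else a for a in attrs]


def resolve_custom(attrs):
    # A third-party rule that handles attributes produced by a custom plan.
    return [
        replace(a, resolved=True) if a.name.startswith("custom_") else a
        for a in attrs
    ]


def analyze(attrs, rules, max_iterations=10):
    # Apply every rule repeatedly until nothing changes (a fixed point).
    for _ in range(max_iterations):
        new_attrs = attrs
        for rule in rules:
            new_attrs = rule(new_attrs)
        if new_attrs == attrs:
            break
        attrs = new_attrs
    return attrs


def check_analysis(attrs):
    # The only place that fails: runs after all rules, not inside any one rule.
    unresolved = [a.name for a in attrs if not a.resolved]
    if unresolved:
        raise ValueError(f"unresolved attributes: {unresolved}")
    return attrs
```

With the check deferred like this, adding `resolve_custom` to the rule list is 
enough in the sketch; the built-in rule does not have to be copied or replaced 
just to stop it from failing early.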



[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6880#issuecomment-113197110
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-8320] [Streaming] Add example in stream...

2015-06-18 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/6862#discussion_r32744362
  
--- Diff: docs/streaming-programming-guide.md ---
@@ -1937,6 +1937,16 @@ JavaPairDStream<String, String> unifiedStream = 
streamingContext.union(kafkaStre
 unifiedStream.print();
 {% endhighlight %}
 </div>
+<div data-lang="python" markdown="1">
+{% highlight python %}
+numStreams = 5
+kafkaStreams = []
+for _ in range (numStreams):
+ kafkaStreams.append(KafkaUtils.createStream(...))
--- End diff --

Nit: List comprehension is more Pythonic
```
kafkaStreams = [KafkaUtils.createStream(...) for _ in range(numStreams)]
```



[GitHub] spark pull request: [SPARK-8402][MLLIB] DP Means Clustering

2015-06-18 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6880#issuecomment-113197083
  
**[Test build #35125 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35125/console)**
 for PR 6880 at commit 
[`0c0a478`](https://github.com/apache/spark/commit/0c0a478568b8abcd37744f2435ef359e9d7f2392)
 after a configured wait of `175m`.


