date:20160319

[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...

2016-03-19 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11836#discussion_r56755343
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -31,17 +32,34 @@ import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, SubqueryAlias}
  * proxy to the underlying metastore (e.g. Hive Metastore) and it also manages temporary
  * tables and functions of the Spark Session that it belongs to.
  */
-class SessionCatalog(externalCatalog: ExternalCatalog) {
+class SessionCatalog(externalCatalog: ExternalCatalog, conf: CatalystConf) {
   import ExternalCatalog._
 
-  private[this] val tempTables = new ConcurrentHashMap[String, LogicalPlan]
-  private[this] val tempFunctions = new ConcurrentHashMap[String, CatalogFunction]
+  def this(externalCatalog: ExternalCatalog) {
+    this(externalCatalog, new SimpleCatalystConf(true))
+  }
+
+  protected[this] val tempTables = new ConcurrentHashMap[String, LogicalPlan]
+  protected[this] val tempFunctions = new ConcurrentHashMap[String, CatalogFunction]
 
   // Note: we track current database here because certain operations do not explicitly
   // specify the database (e.g. DROP TABLE my_table). In these cases we must first
   // check whether the temporary table or function exists, then, if not, operate on
   // the corresponding item in the current database.
-  private[this] var currentDb = "default"
+  protected[this] var currentDb = {
+    val defaultName = "default"
+    val defaultDbDefinition = CatalogDatabase(defaultName, "default database", "", Map())
+    // Initialize default database if it doesn't already exist
+    createDatabase(defaultDbDefinition, ignoreIfExists = true)
+    defaultName
+  }
+
+  /**
+   * Format table name, taking into account case sensitivity.
+   */
+  protected[this] def formatTableName(name: String): String = {
+    if (conf.caseSensitiveAnalysis) name else name.toLowerCase
--- End diff --

Later, it would be good to use this to handle database names as well, for
consistency (it will not actually have any effect right now, though).
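
A minimal sketch of what that could look like, assuming a hypothetical
`NameFormatter` helper and a plain `caseSensitive` flag standing in for
`conf.caseSensitiveAnalysis` (neither name comes from the PR):

```scala
// Hypothetical helper, not part of the PR: one place that normalizes
// catalog identifiers according to the case-sensitivity setting.
final class NameFormatter(caseSensitive: Boolean) {
  private def format(name: String): String =
    if (caseSensitive) name else name.toLowerCase

  // Mirrors formatTableName from the diff above.
  def formatTableName(name: String): String = format(name)

  // Assumed addition: the same rule applied to database names.
  def formatDatabaseName(name: String): String = format(name)
}

object NameFormatterDemo {
  def main(args: Array[String]): Unit = {
    val insensitive = new NameFormatter(caseSensitive = false)
    println(insensitive.formatDatabaseName("Sales_DB"))
    println(insensitive.formatTableName("Orders"))
  }
}
```

Routing both kinds of names through one private function keeps the rule in a
single place, so a later change to the normalization cannot diverge between
tables and databases.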


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...

2016-03-19 Thread yhuai

Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11836#discussion_r56755267
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -31,17 +32,34 @@ import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, SubqueryAlias}
  * proxy to the underlying metastore (e.g. Hive Metastore) and it also manages temporary
  * tables and functions of the Spark Session that it belongs to.
  */
-class SessionCatalog(externalCatalog: ExternalCatalog) {
+class SessionCatalog(externalCatalog: ExternalCatalog, conf: CatalystConf) {
   import ExternalCatalog._
 
-  private[this] val tempTables = new ConcurrentHashMap[String, LogicalPlan]
-  private[this] val tempFunctions = new ConcurrentHashMap[String, CatalogFunction]
+  def this(externalCatalog: ExternalCatalog) {
+    this(externalCatalog, new SimpleCatalystConf(true))
+  }
+
+  protected[this] val tempTables = new ConcurrentHashMap[String, LogicalPlan]
+  protected[this] val tempFunctions = new ConcurrentHashMap[String, CatalogFunction]
 
   // Note: we track current database here because certain operations do not explicitly
   // specify the database (e.g. DROP TABLE my_table). In these cases we must first
   // check whether the temporary table or function exists, then, if not, operate on
   // the corresponding item in the current database.
-  private[this] var currentDb = "default"
+  protected[this] var currentDb = {
+    val defaultName = "default"
+    val defaultDbDefinition = CatalogDatabase(defaultName, "default database", "", Map())
+    // Initialize default database if it doesn't already exist
+    createDatabase(defaultDbDefinition, ignoreIfExists = true)
+    defaultName
+  }
+
+  /**
+   * Format table name, taking into account case sensitivity.
+   */
+  protected[this] def formatTableName(name: String): String = {
+    if (conf.caseSensitiveAnalysis) name else name.toLowerCase
--- End diff --

Just a note here: the Hive metastore is always case insensitive, so the
case sensitivity setting mainly affects temp tables and temp functions.
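
To illustrate the effect on temp tables, here is a small sketch. The
`TempTableRegistry` class is hypothetical, and it stores a `String` in place
of a `LogicalPlan` so that it stays self-contained:

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the session-local temp table map; the value
// type is a String rather than a LogicalPlan to keep the sketch standalone.
final class TempTableRegistry(caseSensitive: Boolean) {
  private val tempTables = new ConcurrentHashMap[String, String]

  private def formatTableName(name: String): String =
    if (caseSensitive) name else name.toLowerCase

  // Keys are normalized on write...
  def register(name: String, plan: String): Unit = {
    tempTables.put(formatTableName(name), plan)
    ()
  }

  // ...and on read, so lookups agree with the configured sensitivity.
  def lookup(name: String): Option[String] =
    Option(tempTables.get(formatTableName(name)))
}
```

With `caseSensitive = false`, a table registered as `MyTable` is found under
`mytable`; with `caseSensitive = true`, it is not.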



[GitHub] spark pull request: [SPARK-13629] [ML] Add binary toggle Param to ...

2016-03-19 Thread hhbyyh

Github user hhbyyh commented on the pull request:

https://github.com/apache/spark/pull/11536#issuecomment-198511013
  
Got it. Thanks.



[GitHub] spark pull request: [SPARK-13974][SQL] sub-query names do not need...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11783#issuecomment-197987787
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53436/



[GitHub] spark pull request: [SPARK-13038] [PySpark] Add load/save to pipel...

2016-03-19 Thread jkbradley

Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/11683#issuecomment-197552521
  
https://issues.apache.org/jira/browse/SPARK-13951



[GitHub] spark pull request: [SPARK-13898][SQL] Merge DatasetHolder and Dat...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11737#issuecomment-197734353
  
**[Test build #53394 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53394/consoleFull)**
 for PR 11737 at commit 
[`59cae95`](https://github.com/apache/spark/commit/59cae95a34fb8bd8cfee0da5b34fc5d27b0f85d2).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13972] [SQL] [FOLLOW-UP] When creating ...

2016-03-19 Thread liancheng

Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/11825#issuecomment-198464786
  
LGTM



[GitHub] spark pull request: [SPARK-13873] [SQL] Avoid copy of UnsafeRow wh...

2016-03-19 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11740#discussion_r56376958
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegen.scala ---
@@ -396,7 +401,7 @@ case class WholeStageCodegen(child: SparkPlan) extends UnaryNode with CodegenSup
     s"""
       |$evaluateInputs
       |${code.code.trim}
-      |append(${code.value}.copy());
+      |append(${code.value}$doCopy);
--- End diff --

If only one row will be buffered here, we do not need to copy it.
The parent of WholeStageCodegen will do the copy if needed.
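
The reasoning can be sketched outside Spark. The `MutableRow` class below is
a hypothetical stand-in for a reused row object, not the real `UnsafeRow`:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical mutable row standing in for a reused UnsafeRow.
final class MutableRow(var value: Int) {
  def copy(): MutableRow = new MutableRow(value)
}

object RowBufferingDemo {
  // Produces `n` rows by overwriting a single reused row object and
  // buffers either the row itself or a copy of it.
  def buffer(n: Int, doCopy: Boolean): List[Int] = {
    val reused = new MutableRow(0)
    val buffered = ArrayBuffer.empty[MutableRow]
    for (i <- 1 to n) {
      reused.value = i
      buffered += (if (doCopy) reused.copy() else reused)
    }
    buffered.map(_.value).toList
  }

  def main(args: Array[String]): Unit = {
    println(buffer(3, doCopy = true))  // each entry keeps its own value
    println(buffer(3, doCopy = false)) // every entry aliases the same row
    println(buffer(1, doCopy = false)) // a single buffered row is unaffected
  }
}
```

Without a copy, several buffered references all point at the one reused
object and observe its last value; with only one buffered row there is
nothing to alias, which is why the copy can be skipped in that case.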



[GitHub] spark pull request: [SPARK-13845][CORE]Using onBlockUpdated to rep...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11679#issuecomment-197753298
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-8884][MLlib] 1-sample Anderson-Darling ...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11780#issuecomment-197789313
  
**[Test build #53416 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53416/consoleFull)**
 for PR 11780 at commit 
[`a9a59cb`](https://github.com/apache/spark/commit/a9a59cba0d0ba29c8a483a6bf18426db14d59860).



[GitHub] spark pull request: [SPARK-13977] [SQL] Brings back Shuffled hash ...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11788#issuecomment-198026929
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13977] [SQL] Brings back Shuffled hash ...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11788#issuecomment-198032509
  
**[Test build #53451 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53451/consoleFull)**
 for PR 11788 at commit 
[`6385777`](https://github.com/apache/spark/commit/638577786d4849665fad561b0042efebe5babb8f).



[GitHub] spark pull request: [SPARK-13764][SQL] Parse modes in JSON data so...

2016-03-19 Thread HyukjinKwon

Github user HyukjinKwon commented on the pull request:

https://github.com/apache/spark/pull/11756#issuecomment-197636844
  
@cloud-fan Sorry, one more question. Would it be good to expose
`spark.sql.columnNameOfCorruptRecord` as an option, just like the compression
option for other data sources?
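
One way such an option could resolve is sketched below. The per-source option
name `columnNameOfCorruptRecord` and the `resolve` helper are assumptions made
for illustration; the comment above does not fix either of them:

```scala
object CorruptRecordOption {
  // Session-wide configuration key mentioned in the comment above.
  val SessionKey = "spark.sql.columnNameOfCorruptRecord"
  // Hypothetical per-source option name, assumed for this sketch.
  val OptionKey = "columnNameOfCorruptRecord"

  // Prefer the per-source option, then the session conf, then a default.
  def resolve(
      options: Map[String, String],
      sessionConf: Map[String, String],
      default: String = "_corrupt_record"): String =
    options.get(OptionKey)
      .orElse(sessionConf.get(SessionKey))
      .getOrElse(default)
}
```

Falling back to the session conf would keep existing jobs working while
letting a single read override the column name.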



[GitHub] spark pull request: [SPARK-13816][Graphx] Add parameter checks for...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11655#issuecomment-197277636
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [WIP][SPARK-13809][SQL] State store for stream...

2016-03-19 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11645#discussion_r56376064
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreRDDSuite.scala
 ---
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import java.io.File
+import java.nio.file.Files
+
+import scala.util.Random
+
+import org.scalatest.{BeforeAndAfter, BeforeAndAfterAll}
+
+import org.apache.spark.{SparkConf, SparkContext, SparkFunSuite}
+import org.apache.spark.LocalSparkContext._
+import org.apache.spark.rdd.RDD
+import org.apache.spark.scheduler.ExecutorCacheTaskLocation
+import org.apache.spark.sql.catalyst.util.quietly
+import org.apache.spark.util.Utils
+
+class StateStoreRDDSuite extends SparkFunSuite with BeforeAndAfter with BeforeAndAfterAll {
+
+  private val conf = new SparkConf().setMaster("local").setAppName(this.getClass.getCanonicalName)
+  private var tempDir = Files.createTempDirectory("StateStoreRDDSuite").toString
+
+  import StateStoreCoordinatorSuite._
+  import StateStoreSuite._
+
+  after {
+    StateStore.stop()
+  }
+
+  override def afterAll(): Unit = {
+    super.afterAll()
+    Utils.deleteRecursively(new File(tempDir))
+  }
+
+  test("versioning and immutability") {
+    quietly {
+      withSpark(new SparkContext(conf)) { sc =>
+        val path = Utils.createDirectory(tempDir, Random.nextString(10)).toString
--- End diff --

Use `createTempDir` here instead.



[GitHub] spark pull request: [SPARK-12469][CORE][WIP/RFC] Consistent accumu...

2016-03-19 Thread squito

Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/11105#discussion_r56421658
  
--- Diff: core/src/main/scala/org/apache/spark/Accumulable.scala ---
@@ -146,6 +212,32 @@ class Accumulable[R, T] private (
   def merge(term: R) { value_ = param.addInPlace(value_, term)}
 
   /**
+   * Merge in pending updates for ac consistent accumulators or merge accumulated values for
+   * regular accumulators. This is only called on the driver when merging task results together.
+   */
+  private[spark] def internalMerge(term: Any) {
+    if (!consistent) {
+      merge(term.asInstanceOf[R])
+    } else {
+      mergePending(term.asInstanceOf[mutable.HashMap[(Int, Int, Int), R]])
+    }
+  }
+
+  /**
+   * Merge another Accumulable's pending updates, checks to make sure that each pending update has
+   * not already been processed before updating.
+   */
+  private[spark] def mergePending(term: mutable.HashMap[(Int, Int, Int), R]) = {
+    term.foreach{case ((rddId, shuffleId, splitId), v) =>
+      val splits = processed.getOrElseUpdate((rddId, shuffleId), new mutable.BitSet())
+      if (!splits.contains(splitId)) {
+        splits += splitId
+        value_ = param.addInPlace(value_, v)
+      }
--- End diff --

I don't think you need both `completed` and `processed`; they seem to be
doing more or less the same thing. You could change this to:

```scala
term.foreach { case (partitionKey, v) =>
  if (!completed.contains(partitionKey)) {
    completed += partitionKey
    value_ = param.addInPlace(value_, v)
  }
}
```

If I understand correctly, there is a logical distinction between the two:
`processed` was used on the driver, that is, across multiple tasks, to track
what had already been accumulated. `completed`, on the other hand, was used
on the executors to track the updates coming from *one task*. Typically it
would contain only a single entry, for the one `(rdd, shuffle, partition)`
tuple in use, since tasks *usually* compute only one partition, but that
isn't true when a coalesce is involved.

If that explanation sounds correct, it's probably best to put it into a
comment somewhere. I'd still say you can eliminate `processed` and just
explain how `completed` is used in two different ways, on the executors and
on the driver.
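
A self-contained sketch of the single-set approach follows. `ConsistentSum`
is a hypothetical, simplified accumulator over `Long` values that models only
the de-duplication; it is not the `Accumulable` class from the PR:

```scala
import scala.collection.mutable

// Hypothetical simplified accumulator; only the de-duplication is modeled.
final class ConsistentSum {
  // Keys are (rddId, shuffleId, splitId), as in the diff above.
  private val completed = mutable.HashSet.empty[(Int, Int, Int)]
  private var value_ = 0L

  def value: Long = value_

  // Merge pending updates, skipping partitions that were already counted.
  def mergePending(term: scala.collection.Map[(Int, Int, Int), Long]): Unit =
    term.foreach { case (partitionKey, v) =>
      // `add` returns false when the key is already present.
      if (completed.add(partitionKey)) {
        value_ += v
      }
    }
}
```

Merging the same partition key twice, as can happen when a task is re-run,
leaves the total unchanged because the second update is skipped.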



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197537016
  
**[Test build #53343 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53343/consoleFull)**
 for PR 10355 at commit 
[`9feca44`](https://github.com/apache/spark/commit/9feca44e2796baaa9ccf5dfb03e2b9a0eabb731b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [WIP][SPARK-13809][SQL] State store for stream...

2016-03-19 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11645#discussion_r56376208
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala
 ---
@@ -0,0 +1,471 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import java.io.File
+
+import scala.collection.mutable
+import scala.util.Random
+
+import org.apache.hadoop.fs.Path
+import org.scalatest.{BeforeAndAfter, PrivateMethodTester}
+import org.scalatest.concurrent.Eventually._
+import org.scalatest.time.SpanSugar._
+
+import org.apache.spark.{SparkConf, SparkContext, SparkFunSuite}
+import org.apache.spark.LocalSparkContext._
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
+import org.apache.spark.sql.catalyst.util.quietly
+import org.apache.spark.unsafe.types.UTF8String
+import org.apache.spark.util.Utils
+
+class StateStoreSuite extends SparkFunSuite with BeforeAndAfter with 
PrivateMethodTester {
+  type MapType = mutable.HashMap[InternalRow, InternalRow]
+
+  import StateStoreCoordinatorSuite._
+  import StateStoreSuite._
+
+  private val tempDir = Utils.createTempDir().toString
+
+  after {
+StateStore.stop()
+  }
+
+  test("update, remove, commit, and all data iterator") {
+val provider = newStoreProvider()
+
+// Verify state before starting a new set of updates
+assert(provider.latestIterator().isEmpty)
+
+val store = provider.getStore(0)
+assert(!store.hasCommitted)
+intercept[IllegalStateException] {
+  store.iterator()
+}
+intercept[IllegalStateException] {
+  store.updates()
+}
+
+// Verify state after updating
+update(store, "a", 1)
+intercept[IllegalStateException] {
+  store.iterator()
+}
+intercept[IllegalStateException] {
+  store.updates()
+}
+assert(provider.latestIterator().isEmpty)
+
+// Make updates, commit and then verify state
+update(store, "b", 2)
+update(store, "aa", 3)
+remove(store, _.startsWith("a"))
+assert(store.commit() === 1)
+
+assert(store.hasCommitted)
+assert(unwrapToSet(store.iterator()) === Set("b" -> 2))
+assert(unwrapToSet(provider.latestIterator()) === Set("b" -> 2))
+assert(fileExists(provider, version = 1, isSnapshot = false))
+assert(getDataFromFiles(provider) === Set("b" -> 2))
+
+// Trying to get newer versions should fail
+intercept[Exception] {
+  provider.getStore(2)
+}
+intercept[Exception] {
+  getDataFromFiles(provider, 2)
+}
+
+// New updates to the reloaded store with new version, and does not 
change old version
+val reloadedStore = new HDFSBackedStateStoreProvider(store.id, 
provider.directory).getStore(1)
+update(reloadedStore, "c", 4)
+assert(reloadedStore.commit() === 2)
+assert(unwrapToSet(reloadedStore.iterator()) === Set("b" -> 2, "c" -> 
4))
+assert(getDataFromFiles(provider) === Set("b" -> 2, "c" -> 4))
+assert(getDataFromFiles(provider, version = 1) === Set("b" -> 2))
+assert(getDataFromFiles(provider, version = 2) === Set("b" -> 2, "c" 
-> 4))
+  }
+
+  test("updates iterator with all combos of updates and removes") {
+val provider = newStoreProvider()
+var currentVersion: Int = 0
+def withStore(body: StateStore => Unit): Unit = {
+  val store = provider.getStore(currentVersion)
+  body(store)
+  currentVersion += 1
+}
+
+// New data should be seen in updates as value added, even if they had 
multiple updates
+withStore { store =>
+  update(store, "a", 1)
+  update(store, "aa", 1)
+  update(store, "aa", 2)

[GitHub] spark pull request: [Spark-13772] fix data type mismatch for decim...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11605#issuecomment-197827396
  
**[Test build #53421 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53421/consoleFull)**
 for PR 11605 at commit 
[`42addd6`](https://github.com/apache/spark/commit/42addd64bfb864ff59fecc5c4c11852d7cd49f60).



[GitHub] spark pull request: [Minor][DOC] Add JavaStreamingTestExample

2016-03-19 Thread MLnick

Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11776#discussion_r56478518
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/mllib/JavaStreamingTestExample.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.mllib;
+
+
+import org.apache.spark.Accumulator;
+import org.apache.spark.api.java.function.VoidFunction;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.function.Function;
+// $example on$
+import org.apache.spark.mllib.stat.test.BinarySample;
+import org.apache.spark.mllib.stat.test.StreamingTest;
+import org.apache.spark.mllib.stat.test.StreamingTestResult;
+// $example off$
+import org.apache.spark.SparkConf;
+import org.apache.spark.streaming.Duration;
+import org.apache.spark.streaming.Seconds;
+import org.apache.spark.streaming.api.java.JavaDStream;
+import org.apache.spark.streaming.api.java.JavaStreamingContext;
+import org.apache.spark.util.Utils;
+
+
+/**
+ * Perform streaming testing using Welch's 2-sample t-test on a stream of 
data, where the data
+ * stream arrives as text files in a directory. Stops when the two groups 
are statistically
+ * significant (p-value < 0.05) or after a user-specified timeout in 
number of batches is exceeded.
+ *
+ * The rows of the text files must be in the form `Boolean, Double`. For 
example:
+ *   false, -3.92
+ *   true, 99.32
+ *
+ * Usage:
+ *   JavaStreamingTestExample <dataDir> <batchDuration> <numBatchesTimeout>
+ *
+ * To run on your local machine using the directory `dataDir` with 5 
seconds between each batch and
+ * a timeout after 100 insignificant batches, call:
+ *$ bin/run-example mllib.JavaStreamingTestExample dataDir 5 100
+ *
+ * As you add text files to `dataDir` the significance test wil 
continually update every
+ * `batchDuration` seconds until the test becomes significant (p-value < 
0.05) or the number of
+ * batches processed exceeds `numBatchesTimeout`.
+ */
+public class JavaStreamingTestExample {
+  public static void main(String[] args) {
+if (args.length != 3) {
+  System.err.println("Usage: JavaStreamingTestExample " +
+"<dataDir> <batchDuration> <numBatchesTimeout>");
+System.exit(1);
+}
+
+String dataDir = args[0];
+Duration batchDuration = Seconds.apply(Long.valueOf(args[1]));
+int numBatchesTimeout = Integer.valueOf(args[2]);
+
+SparkConf conf = new 
SparkConf().setMaster("local").setAppName("StreamingTestExample");
+JavaStreamingContext ssc = new JavaStreamingContext(conf, 
batchDuration);
+
+
ssc.checkpoint(Utils.createTempDir(System.getProperty("java.io.tmpdir"), 
"spark").toString());
+
+// $example on$
+JavaDStream data = ssc.textFileStream(dataDir).map(
+  new Function() {
+@Override
+public BinarySample call(String line) throws Exception {
+  String[] ts = line.split(",");
+  boolean label = Boolean.valueOf(ts[0]);
+  double value = Double.valueOf(ts[1]);
+  return new BinarySample(label, value);
+}
+  });
+
+StreamingTest streamingTest = new StreamingTest()
+  .setPeacePeriod(0)
+  .setWindowSize(0)
+  .setTestMethod("welch");
+
+JavaDStream out = 
streamingTest.registerStream(data);
+out.print();
+// $example off$
+
+// Stop processing if test becomes significant or we time out
+final Accumulator timeoutCounter =
+  ssc.sparkContext().accumulator(numBatchesTimeout);
+
+out.foreachRDD(new VoidFunction() {
+  @Override
+  public void call(JavaRDD rdd) throws Exception {
+timeoutCounter.add(-1);
+
+long cntSignificant = rdd.filter(new Function() {
+  @Override
+  public Boolean

[GitHub] spark pull request: [SPARK-13922][SQL] Filter rows with null attri...

2016-03-19 Thread nongli

Github user nongli commented on a diff in the pull request:

https://github.com/apache/spark/pull/11749#discussion_r56416653
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnarBatch.java
 ---
@@ -345,15 +358,25 @@ public void setColumn(int ordinal, ColumnVector 
column) {
* in this batch will not include this row.
*/
   public final void markFiltered(int rowId) {
-assert(filteredRows[rowId] == false);
+assert(!filteredRows[rowId]);
 filteredRows[rowId] = true;
 ++numRowsFiltered;
   }
 
+  /**
+   * Marks a given column as non-nullable. Any row that has a NULL value 
for the corresponding
+   * attribute is filtered out.
+   */
+  public final void filterNullsInColumn(int ordinal) {
+assert(!nullFilteredColumns.contains(ordinal));
--- End diff --

I don't think this assert is necessary. Calling this twice for the same column
seems perfectly valid, and allowing it makes this API easier to use.
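
A minimal sketch of the idempotent behaviour suggested here, assuming the null-filtered ordinals are kept in a `Set<Integer>`. `NullFilterRegistry` and `size()` are hypothetical names used only for illustration; this is not the actual `ColumnarBatch` code.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the null-filter bookkeeping; not the real ColumnarBatch.
final class NullFilterRegistry {
  private final Set<Integer> nullFilteredColumns = new TreeSet<>();

  // Marks a column as non-nullable. Registering the same ordinal again is a
  // no-op rather than an assertion failure. Returns true only on first registration.
  boolean filterNullsInColumn(int ordinal) {
    return nullFilteredColumns.add(ordinal);
  }

  // Number of distinct columns registered so far.
  int size() {
    return nullFilteredColumns.size();
  }

  public static void main(String[] args) {
    NullFilterRegistry registry = new NullFilterRegistry();
    registry.filterNullsInColumn(2);
    registry.filterNullsInColumn(2); // repeated call is harmless
    System.out.println(registry.size());
  }
}
```

With a set-backed field, callers do not need to track which columns they have already registered.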


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-13871][SQL] Support for inferring filte...

2016-03-19 Thread sameeragarwal

Github user sameeragarwal commented on the pull request:

https://github.com/apache/spark/pull/11665#issuecomment-197456454
  
thanks, all comments addressed!



[GitHub] spark pull request: [SPARK-13866][SQL] Handle decimal type in CSV ...

2016-03-19 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/11724#discussion_r56444849
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala
 ---
@@ -108,14 +109,38 @@ private[csv] object CSVInferSchema {
   }
 
   private def tryParseDouble(field: String): DataType = {
-if ((allCatch opt field.toDouble).isDefined) {
+val doubleTry = allCatch opt field.toDouble
--- End diff --

Ah, numeric types with fractional parts can also be `Decimal`, which has a
precision and a scale.

```scala
import java.math.BigDecimal
scala> BigDecimal.valueOf(1.)
res4: java.math.BigDecimal = 1.

scala> BigDecimal.valueOf(1.).precision
res6: Int = 5

scala> BigDecimal.valueOf(1.).scale
res7: Int = 4
```

```scala
import java.math.BigDecimal
scala> BigDecimal.valueOf(1)
res5: java.math.BigDecimal = 1

scala> BigDecimal.valueOf(1).precision
res8: Int = 1

scala> BigDecimal.valueOf(1).scale
res9: Int = 0
```

`DoubleType` can lose precision when a value has too many fractional digits.
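
A small sketch of the same point in plain Java, assuming nothing beyond `java.math.BigDecimal`. The class name `DecimalVsDouble`, its helper methods, and the literals are illustrative only and are not part of `CSVInferSchema`.

```java
import java.math.BigDecimal;

// Illustrative helper; the names here are hypothetical, not Spark APIs.
final class DecimalVsDouble {
  // Returns {precision, scale} of the decimal literal exactly as written.
  static int[] precisionAndScale(String literal) {
    BigDecimal d = new BigDecimal(literal);
    return new int[] {d.precision(), d.scale()};
  }

  // True when the literal keeps its exact value after a round trip through double.
  static boolean survivesDouble(String literal) {
    BigDecimal exact = new BigDecimal(literal);
    BigDecimal viaDouble = new BigDecimal(Double.toString(Double.parseDouble(literal)));
    return exact.compareTo(viaDouble) == 0;
  }

  public static void main(String[] args) {
    int[] ps = precisionAndScale("1.2345");
    System.out.println("precision=" + ps[0] + ", scale=" + ps[1]);
    // A short fraction survives the trip through double ...
    System.out.println(survivesDouble("1.2345"));
    // ... but a fraction with more digits than a double can hold does not.
    System.out.println(survivesDouble("0.12345678901234567890123"));
  }
}
```

A field that fails the round trip is one where inferring a decimal type instead of a double would preserve the written value.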



[GitHub] spark pull request: [SPARK-13826][SQL] Addendum: update documentat...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11814#issuecomment-198223965
  
**[Test build #53512 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53512/consoleFull)**
 for PR 11814 at commit 
[`3b3fcca`](https://github.com/apache/spark/commit/3b3fcca9e48a6c07d8364966e153d57ee0e13290).



[GitHub] spark pull request: [SPARK-13815][MLlib] Provide better Exception ...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11772#issuecomment-197565822
  
Can one of the admins verify this patch?



[GitHub] spark pull request: Added transitive closure transformation to Cat...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11777#issuecomment-197769134
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-13974][SQL] sub-query names do not need...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11783#issuecomment-197913025
  
**[Test build #53430 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53430/consoleFull)**
 for PR 11783 at commit 
[`e4edc0e`](https://github.com/apache/spark/commit/e4edc0e22212d9bb2a09bb84eb2e75de128d3736).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13805][SQL] Generate code that get a va...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11636#issuecomment-198109663
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13986][CORE][MLLIB] Remove `DeveloperAp...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11797#issuecomment-198231936
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53503/
Test PASSed.



[GitHub] spark pull request: [SPARK-13980][WIP] Incrementally serialize blo...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11791#issuecomment-198086746
  
**[Test build #53456 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53456/consoleFull)**
 for PR 11791 at commit 
[`7dc3623`](https://github.com/apache/spark/commit/7dc362331f3f549670ecd9488db456b4136a3ad7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13579][build][wip] Stop building the ma...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11796#issuecomment-198098184
  
**[Test build #53472 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53472/consoleFull)**
 for PR 11796 at commit 
[`54336b6`](https://github.com/apache/spark/commit/54336b6305a0cc5fb2b80247f5ab76e1b6f08407).



[GitHub] spark pull request: [SPARK-13826][SQL] Revises Dataset ScalaDoc

2016-03-19 Thread liancheng

Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/11769#issuecomment-197827320
  
Well, I have already minimized the changes, but the diff is still too large to display :(



[GitHub] spark pull request: [SPARK-13118][SQL] Expression encoding for opt...

2016-03-19 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11708#issuecomment-197701587
  
Thanks - merging in master.




[GitHub] spark pull request: [SPARK-13963][ML] Adding binary toggle param t...

2016-03-19 Thread BryanCutler

GitHub user BryanCutler opened a pull request:

https://github.com/apache/spark/pull/11832

[SPARK-13963][ML] Adding binary toggle param to HashingTF

## What changes were proposed in this pull request?
Adds a binary toggle parameter to `ml.feature.HashingTF` and to
`mllib.feature.HashingTF`, since the former wraps the latter. When true, this
parameter sets every non-zero term count to 1, turning term-count features into
binary values that are well suited to discrete probability models.
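
The effect of the toggle can be sketched without Spark. The class `BinaryHashingTf` and its `transform` method below are hypothetical, and `String.hashCode` stands in for whatever hash function `HashingTF` actually uses, so bucket indices will not match Spark's.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Spark-free illustration of a hashing term-frequency transform.
final class BinaryHashingTf {
  // Hashes each term into one of numFeatures buckets and counts occurrences.
  // When binary is true, every non-zero count is reported as 1.0 instead.
  static Map<Integer, Double> transform(List<String> terms, int numFeatures, boolean binary) {
    Map<Integer, Double> counts = new TreeMap<>();
    for (String term : terms) {
      // Assumption: String.hashCode is a stand-in for the real hash function.
      int index = Math.floorMod(term.hashCode(), numFeatures);
      if (binary) {
        counts.put(index, 1.0);
      } else {
        counts.merge(index, 1.0, Double::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String> terms = Arrays.asList("a", "b", "a");
    System.out.println(transform(terms, 1000, false)); // raw term counts
    System.out.println(transform(terms, 1000, true));  // presence flags only
  }
}
```

In the binary case a repeated term contributes the same value as a term seen once, which is the input shape a Bernoulli-style model expects.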

## How was this patch tested?
Added unit tests for ML and MLlib



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/BryanCutler/spark binary-param-HashingTF-SPARK-13963

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11832


commit a5ff3309c0d07e57177374133130803eb98ebffb
Author: Bryan Cutler 
Date:   2016-03-18T21:19:19Z

[SPARK-13963] Adding binary toggle to HashingTF in ml/mllib

commit 31097231769860b86d1d3234ebf7d4e95f96e5cb
Author: Bryan Cutler 
Date:   2016-03-18T21:19:48Z

Added unit test for HashingTF binary toggle

commit ca1436166a1292f92d72408c10cf606623b31bbd
Author: Bryan Cutler 
Date:   2016-03-18T21:26:34Z

fixed param description text





[GitHub] spark pull request: [SPARK-13921] Store serialized blocks as multi...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11748#issuecomment-197603358
  
**[Test build #53351 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53351/consoleFull)**
 for PR 11748 at commit 
[`4f5074e`](https://github.com/apache/spark/commit/4f5074ece49030a6e7134f7ece706ed441c02ee4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13921] Store serialized blocks as multi...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11748#issuecomment-198081291
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12182][ML] Distributed binning for tree...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10231#issuecomment-198519918
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53555/
Test PASSed.



[GitHub] spark pull request: [SPARK-13942][CORE][DOCS] Remove Shark-related...

2016-03-19 Thread dongjoon-hyun

GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/11770

[SPARK-13942][CORE][DOCS] Remove Shark-related docs and visibility for 2.x

## What changes were proposed in this pull request?

`Shark` was merged into `Spark SQL` in [July 2014](https://databricks.com/blog/2014/07/01/shark-spark-sql-hive-on-spark-and-the-future-of-sql-on-spark.html).

The following seem to be the only remaining legacy items.

**Migration Guide**
```
- ## Migration Guide for Shark Users
- ...
- ### Scheduling
- ...
- ### Reducer number
- ...
- ### Caching
```

**SparkEnv visibility and comments**
```
- *
- * NOTE: This is not intended for external use. This is exposed for Shark 
and may be made private
- *   in a future release.
  */
 @DeveloperApi
-class SparkEnv (
+private[spark] class SparkEnv (
```

For Spark 2.x, we should clean up those docs and comments in any case. 
However, changing the visibility of the `SparkEnv` class might be controversial. 
As a first attempt, this PR proposes changing both, following the 
note (*This is exposed for Shark*). The visibility change might be dropped 
during the review process.

## How was this patch tested?

Passes the Jenkins tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-13942

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11770.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11770


commit f91c5480e9a3c4644a9f95ebbc48833abbed2ea0
Author: Dongjoon Hyun 
Date:   2016-03-16T19:53:34Z

[SPARK-13942][CORE][DOCS] Remove Shark-related docs and visibility for 2.x





[GitHub] spark pull request: [SPARK-13449] Naive Bayes wrapper in SparkR

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11486#issuecomment-197676388
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-8971][MLLIB][ML] Support balanced class...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8112#issuecomment-197808784
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53406/
Test PASSed.



[GitHub] spark pull request: [SPARK-13950] [SQL] generate code for sort mer...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11771#issuecomment-197592218
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53371/
Test FAILed.



[GitHub] spark pull request: [SPARK-13989][SQL] Remove non-vectorized/unsaf...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11799#issuecomment-198538107
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12639] [SQL] Mark Filters Fully Handled...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11317#issuecomment-197605244
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53374/
Test FAILed.



[GitHub] spark pull request: [SPARK-13928] Move org.apache.spark.Logging in...

2016-03-19 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11764#issuecomment-197645762
  
Sorry I missed it as this message is so far away from the final one... 
@JoshRosen thanks again!



[GitHub] spark pull request: [MINOR][SQL][BUILD] Remove duplicated lines

2016-03-19 Thread dongjoon-hyun

GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/11773

[MINOR][SQL][BUILD] Remove duplicated lines

## What changes were proposed in this pull request?

This PR removes three duplicated lines. The first one causes the 
following unreachable-code warning.
```
JoinSuite.scala:52: unreachable code
[warn]   case j: BroadcastHashJoin => j
```
The other two are consecutive repeated entries in the `Seq` of MiMa filters.

## How was this patch tested?

Passes the existing Jenkins tests.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark remove_duplicated_line

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11773.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11773







[GitHub] spark pull request: [SPARK-13764][SQL] Parse modes in JSON data so...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11756#issuecomment-198236937
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53507/
Test PASSed.



[GitHub] spark pull request: [SPARK-13989][SQL] Remove non-vectorized/unsaf...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11799#issuecomment-198141927
  
**[Test build #53477 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53477/consoleFull)**
 for PR 11799 at commit 
[`ef90585`](https://github.com/apache/spark/commit/ef90585abd1a33806ee51b7acbd589a3cb33af72).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13885][YARN] Fix attempt id regression ...

2016-03-19 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11721



[GitHub] spark pull request: [SPARK-13826][SQL] Revises Dataset ScalaDoc

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11769#issuecomment-197898940
  
**[Test build #53427 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53427/consoleFull)**
 for PR 11769 at commit 
[`50502a5`](https://github.com/apache/spark/commit/50502a5152b9b0c0458ebd6b7ad48524b1422c58).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [Spark-13772] fix data type mismatch for decim...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11605#issuecomment-197875847
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53422/
Test PASSed.



[GitHub] spark pull request: [WIP][SPARK-13809][SQL] State store for stream...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11645#issuecomment-197671956
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53388/
Test FAILed.



[GitHub] spark pull request: [SPARK-13761] [ML] Deprecate validateParams

2016-03-19 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/11620#discussion_r56418523
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
 ---
@@ -322,7 +337,8 @@ object GeneralizedLinearRegression extends 
DefaultParamsReadable[GeneralizedLine
 
   /**
* A description of the error distribution to be used in the model.
-   * @param name the name of the family.
+*
--- End diff --

indentation



[GitHub] spark pull request: [SPARK-14014] [SQL] Replace existing catalog w...

2016-03-19 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/11836#issuecomment-198572871
  
@yhuai @rxin



[GitHub] spark pull request: [SPARK-13805][SQL] Generate code that get a va...

2016-03-19 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/11636#discussion_r56567647
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -199,7 +210,8 @@ class CodegenContext {
   case StringType => s"$input.getUTF8String($ordinal)"
   case BinaryType => s"$input.getBinary($ordinal)"
   case CalendarIntervalType => s"$input.getInterval($ordinal)"
-  case t: StructType => s"$input.getStruct($ordinal, ${t.size})"
+  case t: StructType => if (!isColumnarType(input)) { s"$input.getStruct($ordinal, ${t.size})" }
--- End diff --

Why not make them have the same APIs?



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-198057482
  
**[Test build #53458 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53458/consoleFull)**
 for PR 10355 at commit 
[`e8a56d5`](https://github.com/apache/spark/commit/e8a56d5927b4652ac0b89fb3783a495a239d0eae).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13924][SQL] officially support multi-in...

2016-03-19 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11754#issuecomment-197457013
  
Merging in master.  Thanks.




[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-19 Thread hhbyyh

Github user hhbyyh commented on the pull request:

https://github.com/apache/spark/pull/11549#issuecomment-197773071
  
Thanks @mengxr @thunterdb @yanboliang for the review. Sent an update:
1. Resolved the conflict with GLMSummary.
2. Reverted the part related to summary statistics.
3. Extracted the family and link names in R.



[GitHub] spark pull request: [SPARK-13977] [SQL] Brings back Shuffled hash ...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11788#issuecomment-198266969
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [Spark-13772] fix data type mismatch for decim...

2016-03-19 Thread liancheng

Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/11605#issuecomment-197825834
  
ok to test



[GitHub] spark pull request: [SPARK-13980][WIP] Incrementally serialize blo...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11791#issuecomment-198218448
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53496/
Test PASSed.



[GitHub] spark pull request: [SPARK-13432][SQL] add the source file name an...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11301#issuecomment-198239021
  
**[Test build #53517 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53517/consoleFull)**
 for PR 11301 at commit 
[`62fa1de`](https://github.com/apache/spark/commit/62fa1de1832fa23a9adc425073a96d494a806798).
 * This patch **fails R style tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13871][SQL] Support for inferring filte...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11665#issuecomment-197457623
  
**[Test build #53330 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53330/consoleFull)**
 for PR 11665 at commit 
[`336a18e`](https://github.com/apache/spark/commit/336a18e3e9f55514545a28f0bb32658dee2ff70b).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13898][SQL] Merge DatasetHolder and Dat...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11737#issuecomment-197734597
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-197750038
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53397/
Test PASSed.



[GitHub] spark pull request: [SPARK-13853][SQL] QueryPlan sub-classes shoul...

2016-03-19 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/11673#discussion_r56599894
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1450,7 +1450,9 @@ object CleanupAliases extends Rule[LogicalPlan] {
 case c: CreateStructUnsafe if !stop =>
   stop = true
   c.copy(children = c.children.map(trimNonTopLevelAliases))
-case Alias(child, _) if !stop => child
+// Only eliminate aliases for named expressions, otherwise we may turn an `Alias` to a
+// normal expression and break the type requirement for it.
+case Alias(child: NamedExpression, _) if !stop => child
--- End diff --

I tried to do it before, but not all logical plans can be accessed in 
catalyst, for example, `EvaluatePython`. Should we have a more general 
mechanism for declaring operators producing new attributes?
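
To illustrate the pattern in the diff above, here is a minimal sketch on a toy expression model. All types and the `AliasTrimming` object are hypothetical stand-ins, not Catalyst's actual classes; the only point is that an alias is dropped only when the thing it wraps is itself a named expression.

```scala
// Toy model only: these types are hypothetical stand-ins, not Catalyst's classes.
sealed trait Expression
sealed trait NamedExpression extends Expression { def name: String }

case class Literal(value: Int) extends Expression
case class AttributeRef(name: String) extends NamedExpression
case class Alias(child: Expression, name: String) extends NamedExpression

object AliasTrimming {
  // Drop an alias only when what it wraps is itself named, so a slot that
  // requires a NamedExpression never ends up holding a bare expression.
  def trimAlias(e: NamedExpression): NamedExpression = e match {
    case Alias(child: NamedExpression, _) => child
    case other => other
  }
}
```

In this sketch, `Alias(Literal(1), "one")` is left untouched because removing the alias would leave an unnamed `Literal` where a named expression is expected.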



[GitHub] spark pull request: [SPARK-13805][SQL] Generate code that get a va...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11636#issuecomment-198597887
  
**[Test build #53584 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53584/consoleFull)**
 for PR 11636 at commit 
[`cdd3078`](https://github.com/apache/spark/commit/cdd3078c4a252ab8701cd1bfb08e911f3878db65).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197586274
  
**[Test build #53368 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53368/consoleFull)**
 for PR 10355 at commit 
[`3b424dc`](https://github.com/apache/spark/commit/3b424dc88211c83615ef8f16402879cc9eb45c2e).



[GitHub] spark pull request: [SPARK-13982][SparkR] KMean's predict: Feature...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11793#issuecomment-198081241
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13926] Automatically use Kryo serialize...

2016-03-19 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11755



[GitHub] spark pull request: [SPARK-13742][Core] Add non-iterator interface...

2016-03-19 Thread holdenk

Github user holdenk commented on a diff in the pull request:

https://github.com/apache/spark/pull/11578#discussion_r56619813
  
--- Diff: 
core/src/main/scala/org/apache/spark/util/random/RandomSampler.scala ---
@@ -41,6 +41,12 @@ trait RandomSampler[T, U] extends Pseudorandom with Cloneable with Serializable
   /** take a random sample */
   def sample(items: Iterator[T]): Iterator[U]
--- End diff --

I think if we want to keep it (which could make sense) - we should maybe 
add a default implementation for it so we don't have duplicate sampling logic 
(as we do right now in the classes)
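
One way to read the suggestion above, as a rough sketch: the trait could implement the iterator-based `sample` once, in terms of a per-item method, so that concrete samplers only supply the per-item decision. The names `SimpleSampler` and `BernoulliLikeSampler` are hypothetical and this is not Spark's actual `RandomSampler`.

```scala
import scala.util.Random

// Hypothetical, simplified trait; not Spark's actual RandomSampler.
trait SimpleSampler[T] {
  // How many times the next item should appear in the sample (0 means skip).
  def sample(): Int

  // Default iterator-based sampling, written once in terms of sample().
  def sample(items: Iterator[T]): Iterator[T] =
    items.flatMap(item => Iterator.fill(sample())(item))
}

// Keeps each item independently with probability `fraction`.
class BernoulliLikeSampler[T](fraction: Double, seed: Long) extends SimpleSampler[T] {
  private val rng = new Random(seed)
  override def sample(): Int = if (rng.nextDouble() < fraction) 1 else 0
}
```

With this shape, a new sampler overrides only `sample()`, and the iterator variant is inherited rather than reimplemented in each class.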



[GitHub] spark pull request: [SPARK-13989][SQL] Remove non-vectorized/unsaf...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11799#issuecomment-198142033
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13889][YARN][Branch-1.6]Fix the calcula...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11813#issuecomment-198215415
  
**[Test build #53505 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53505/consoleFull)**
 for PR 11813 at commit 
[`17d8bc1`](https://github.com/apache/spark/commit/17d8bc1f13c3b29e22ecbec6a9f08491e5970368).



[GitHub] spark pull request: [SPARK-13926] Automatically use Kryo serialize...

2016-03-19 Thread nongli

Github user nongli commented on the pull request:

https://github.com/apache/spark/pull/11755#issuecomment-197552094
  
LGTM



[GitHub] spark pull request: [SPARK-13853][SQL] QueryPlan sub-classes shoul...

2016-03-19 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/11673#issuecomment-197792870
  
Hi @marmbrus, yes, your idea makes sense: we should use `Alias` to produce 
new attributes where we can. Apart from leaf nodes, there are some special cases 
that still need `producedAttributes`: `Generate`, `Expand`, `ScriptTransform`, 
and the aggregate-related operators. In this PR, I made `EvaluatePython` use an 
alias for its new attribute and cleaned up those special cases to use 
`producedAttributes`.
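
As a rough illustration of why an operator might need to declare the attributes it produces, here is a toy plan model. The names (`PlanNode`, `Scan`, `Explode`, and the string-based attribute sets) are hypothetical and only loosely mirror the idea discussed in this comment; the assumption is that an operator's own expressions refer to the new attribute, so it would look like a missing input unless the operator declares it.

```scala
// Toy model only: hypothetical names, not Catalyst's actual classes.
trait PlanNode {
  def children: Seq[PlanNode]
  // Attributes this node makes available to its parent.
  def output: Set[String]
  // Attributes this node's expressions refer to.
  def references: Set[String]
  // Attributes the node creates itself, without an Alias in its expressions.
  def producedAttributes: Set[String] = Set.empty

  def inputSet: Set[String] = children.flatMap(_.output).toSet
  // Referenced attributes that neither the children nor the node itself supply.
  def missingInput: Set[String] = references -- inputSet -- producedAttributes
}

case class Scan(columns: Set[String]) extends PlanNode {
  val children: Seq[PlanNode] = Nil
  val output: Set[String] = columns
  val references: Set[String] = Set.empty
  // A leaf produces everything it outputs.
  override def producedAttributes: Set[String] = output
}

// Generator-like operator: emits one new column that no child provides.
case class Explode(input: String, generated: String, child: PlanNode) extends PlanNode {
  val children: Seq[PlanNode] = Seq(child)
  val output: Set[String] = child.output + generated
  val references: Set[String] = Set(input, generated)
  override def producedAttributes: Set[String] = Set(generated)
}
```

In this sketch, `Explode("xs", "x", Scan(Set("xs")))` reports no missing input because `x` is declared as produced; without that override, `x` would be flagged as unresolved.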



[GitHub] spark pull request: [SPARK-13919] [SQL] fix column pruning through...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11828#issuecomment-198585992
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [Minor][DOC] Add JavaStreamingTestExample

2016-03-19 Thread MLnick

Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/11776#discussion_r56469940
  
--- Diff: 
examples/src/main/java/org/apache/spark/examples/mllib/JavaStreamingTestExample.java
 ---
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.mllib;
+
+
+import org.apache.spark.Accumulator;
+// $example on$
+import org.apache.spark.api.java.function.VoidFunction;
--- End diff --

Don't think this import is actually required, as that code is after the 
final `$example off$`?



[GitHub] spark pull request: [SPARK-13908][SQL] Add a LocalLimit for Collec...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11817#issuecomment-198296418
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13761] [ML] Deprecate validateParams

2016-03-19 Thread jkbradley

Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/11620#issuecomment-197621807
  
No problem. Thanks for the PR!
LGTM
Merging with master



[GitHub] spark pull request: [SPARK-7425] [ML] spark.ml Predictor should su...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10355#issuecomment-197623813
  
Build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13989][SQL] Remove non-vectorized/unsaf...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11799#issuecomment-198136602
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53476/
Test FAILed.



[GitHub] spark pull request: [SPARK-14011][CORE][SQL] Enable `LineLength` J...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11831#issuecomment-198543525
  
**[Test build #53566 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53566/consoleFull)**
 for PR 11831 at commit 
[`2923ef0`](https://github.com/apache/spark/commit/2923ef095369376be03a868c2bf2375294dab6d1).



[GitHub] spark pull request: [SPARK-13958]Executor OOM due to unbounded gro...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11794#issuecomment-198534785
  
**[Test build #53550 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53550/consoleFull)**
 for PR 11794 at commit 
[`4db3880`](https://github.com/apache/spark/commit/4db388084de12d805ae905d9062db75e130a7b73).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13904][Scheduler]Add support for plugga...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11723#issuecomment-198300194
  
**[Test build #53525 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53525/consoleFull)**
 for PR 11723 at commit 
[`ae808d7`](https://github.com/apache/spark/commit/ae808d73e022077dba6ad999627589eed4730270).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13928] Move org.apache.spark.Logging in...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11764#issuecomment-197681101
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-13995][SQL] Constraints should take car...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11809#issuecomment-198209937
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13973] [PySpark]: `ipython notebook` is...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11829#issuecomment-198552276
  
**[Test build #53567 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53567/consoleFull)**
 for PR 11829 at commit 
[`e1a0c40`](https://github.com/apache/spark/commit/e1a0c40d4ec4101074f8e5310f357cedbdbec60a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13942][CORE][DOCS] Remove Shark-related...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11770#issuecomment-197593028
  
**[Test build #53370 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53370/consoleFull)**
 for PR 11770 at commit 
[`74e51b1`](https://github.com/apache/spark/commit/74e51b13faffa322a7bceff74254301f06113b49).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13928] Move org.apache.spark.Logging in...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11764#issuecomment-197700857
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53392/
Test FAILed.



[GitHub] spark pull request: [SPARK-13871][SQL] Support for inferring filte...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11665#issuecomment-197458671
  
**[Test build #53331 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53331/consoleFull)**
 for PR 11665 at commit 
[`92d935f`](https://github.com/apache/spark/commit/92d935fc8c1204b5d8272655dbe0e606270e5854).



[GitHub] spark pull request: [SPARK-13972][SQ][WIP] hive tests should fail ...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11782#issuecomment-198274820
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53524/
Test FAILed.



[GitHub] spark pull request: [Trivial][Docs] Two typos in comment

2016-03-19 Thread zhengruifeng

GitHub user zhengruifeng opened a pull request:

https://github.com/apache/spark/pull/11761

[Trivial][Docs] Two typos in comment

## What changes were proposed in this pull request?
two typos


## How was this patch tested?
no tests




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhengruifeng/spark typo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11761.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11761


commit a3fd6ff37a59c411188dddba6e618576bb3ea8f6
Author: Zheng RuiFeng 
Date:   2016-03-16T11:27:13Z

simple typo





[GitHub] spark pull request: [SPARK-12719][HOTFIX] Fix compilation against ...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11798#issuecomment-198168724
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12299][CORE][WIP] Remove history servin...

2016-03-19 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/10991#issuecomment-197606014
  
I don't think that the Master should have any event-log consumption logic 
whatsoever.



[GitHub] spark pull request: [SPARK-13764][SQL] Parse modes in JSON data so...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11756#issuecomment-198217639
  
**[Test build #53507 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53507/consoleFull)**
 for PR 11756 at commit 
[`3ff900e`](https://github.com/apache/spark/commit/3ff900ec904991e79bf6267c16ee38dfc15660be).



[GitHub] spark pull request: [SPARK-13976][SQL] do not remove sub-queries a...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11786#issuecomment-197992472
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-13805][SQL] Generate code that get a va...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11636#issuecomment-198585189
  
**[Test build #53584 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53584/consoleFull)**
 for PR 11636 at commit 
[`cdd3078`](https://github.com/apache/spark/commit/cdd3078c4a252ab8701cd1bfb08e911f3878db65).



[GitHub] spark pull request: [SPARK-13068][PYSPARK][ML] Type conversion for...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11663#issuecomment-198009920
  
**[Test build #53446 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53446/consoleFull)**
 for PR 11663 at commit 
[`c483da8`](https://github.com/apache/spark/commit/c483da8a6b1c305a96ce2a4a25ac8a0a1f98e653).



[GitHub] spark pull request: [WIP][SPARK-13883][SQL] Parquet Implementation...

2016-03-19 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11709#issuecomment-197585281
  
**[Test build #53367 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53367/consoleFull)**
 for PR 11709 at commit 
[`8ccdd77`](https://github.com/apache/spark/commit/8ccdd77fb1bdb63a589a214e6151d81b54eeb524).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-13926] Automatically use Kryo serialize...

2016-03-19 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11755#issuecomment-197718175
  
Merging in master!




[GitHub] spark pull request: [SPARK-13761] [ML] Remove remaining uses of va...

2016-03-19 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11790#issuecomment-198035808
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53447/
Test PASSed.


