date:20170723

[GitHub] spark issue #18721: [SPARK-21516][SQL][Test] Overriding afterEach() in Datas...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18721
  
**[Test build #79899 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79899/testReport)**
 for PR 18721 at commit 
[`0e4d057`](https://github.com/apache/spark/commit/0e4d057528198d711df501450b1ef6fb84abe491).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #17953: [SPARK-20680][SQL] Spark-sql do not support for void col...

2017-07-23 Thread LantaoJin

Github user LantaoJin commented on the issue:

https://github.com/apache/spark/pull/17953
  
@HyukjinKwon Sure. Thank you for reminding me. I almost forgot it.



[GitHub] spark pull request #18719: [SPARK-21512][SQL][TEST] DatasetCacheSuite needs ...

2017-07-23 Thread kiszk

Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/18719#discussion_r128953394
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
@@ -25,6 +25,11 @@ import org.apache.spark.storage.StorageLevel
 class DatasetCacheSuite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
+  // Clear all persistent datasets after each test
+  override def afterEach(): Unit = {
+    spark.sharedState.cacheManager.clearCache()
--- End diff --

I submitted #18721



[GitHub] spark pull request #18721: [SPARK-21516][SQL][Test] Overriding afterEach() i...

2017-07-23 Thread kiszk

GitHub user kiszk opened a pull request:

https://github.com/apache/spark/pull/18721

[SPARK-21516][SQL][Test] Overriding afterEach() in DatasetCacheSuite must 
call super.afterEach()

## What changes were proposed in this pull request?

This PR ensures that the overriding `afterEach()` method in 
`DatasetCacheSuite` calls `super.afterEach()`. When we override `afterEach()` 
in a test suite, we have to call `super.afterEach()`.
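
To illustrate why the `super.afterEach()` call matters, here is a minimal sketch that mimics stacked cleanup hooks with plain Scala traits. It does not use ScalaTest or Spark; `BaseSuite`, `SharedContext`, `BrokenSuite`, and `FixedSuite` are hypothetical names chosen only for this illustration.

```scala
import scala.collection.mutable.ListBuffer

object AfterEachSketch {
  // Hypothetical stand-in for a test framework's base suite.
  trait BaseSuite {
    val log: ListBuffer[String] = ListBuffer.empty
    def afterEach(): Unit = { log += "base cleanup" }
  }

  // Hypothetical stand-in for a shared-context mixin with its own cleanup.
  trait SharedContext extends BaseSuite {
    override def afterEach(): Unit = {
      log += "shared context cleanup"
      super.afterEach()
    }
  }

  // Forgets to call super.afterEach(): the cleanup of the mixins is skipped.
  class BrokenSuite extends SharedContext {
    override def afterEach(): Unit = { log += "clear cache" }
  }

  // Calls super.afterEach(): every cleanup hook in the chain still runs.
  class FixedSuite extends SharedContext {
    override def afterEach(): Unit = {
      try { log += "clear cache" }
      finally { super.afterEach() }
    }
  }

  // Runs the hook once and returns the cleanup steps in the order they ran.
  def run(suite: BaseSuite): List[String] = {
    suite.afterEach()
    suite.log.toList
  }
}
```

In this sketch, `BrokenSuite` only records its own step, while `FixedSuite` also triggers the cleanup defined by the traits it mixes in.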

This is a follow-up of #18719 and SPARK-21512.

## How was this patch tested?

Used the existing test suite

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kiszk/spark SPARK-21516

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18721.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18721


commit 0e4d057528198d711df501450b1ef6fb84abe491
Author: Kazuaki Ishizaki 
Date:   2017-07-24T05:43:38Z

initial commit





[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18711
  
**[Test build #79898 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79898/testReport)**
 for PR 18711 at commit 
[`28b2c89`](https://github.com/apache/spark/commit/28b2c89ccf2e3302d2e0da6b1591241875f6c73c).



[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...

2017-07-23 Thread 10110346

Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18711#discussion_r128952895
  
--- Diff: docs/configuration.md ---
@@ -1106,7 +1106,7 @@ Apart from these, the following properties are also 
available, and may be useful
 parameter allows an application to run multiple executors on the
 same worker, provided that there are enough cores on that
 worker. Otherwise, only one executor per application will run on
-each worker.
+each worker usually.
--- End diff --

thanks, I will update it



[GitHub] spark issue #18684: [SPARK-21475][Core] Use NIO's Files API to replace FileI...

2017-07-23 Thread jerryshao

Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/18684
  
@srowen any further comment? 



[GitHub] spark issue #18649: [SPARK-21395][SQL] Spark SQL hive-thriftserver doesn't r...

2017-07-23 Thread debugger87

Github user debugger87 commented on the issue:

https://github.com/apache/spark/pull/18649
  
@cloud-fan @srowen If we can't find a maintainer of hive-thriftserver in 
Spark, I may have to close this PR in a few days.



[GitHub] spark issue #18708: [SPARK-21339] [CORE] spark-shell --packages option does ...

2017-07-23 Thread devaraj-kavali

Github user devaraj-kavali commented on the issue:

https://github.com/apache/spark/pull/18708
  
Thanks @HyukjinKwon for checking this and for the link.

> Are you saying "file:///C:/Users//.ivy2/jars/.jar" is not the correct 
form of URI on Windows?

This URI form is correct for java.io.File and also for ClassLoader. During 
my inspection I saw that the jar gets loaded from the ClassLoader with this 
URI form.

SparkSubmit.scala

```
  private def addJarToClasspath(localJar: String, loader: MutableURLClassLoader) {
    val uri = Utils.resolveURI(localJar)
    uri.getScheme match {
      case "file" | "local" =>
        val file = new File(uri.getPath)
        if (file.exists()) {
          loader.addURL(file.toURI.toURL)
        } else {
```


But this URI form is not supported by `java -classpath`. In the code below, 
the `jars` have the `file:///` scheme, which is not supported and causes 
the issue.

Main.scala

```
  private[repl] def doMain(args: Array[String], _interp: SparkILoop): Unit = {
    interp = _interp
    val jars = Utils.getUserJars(conf, isShell = true).mkString(File.pathSeparator)
    val interpArguments = List(
      "-Yrepl-class-based",
      "-Yrepl-outdir", s"${outputDir.getAbsolutePath}",
      "-classpath", jars
```

We can also see the difference with these java commands.

On Windows:
```
C:\>java -classpath file:///C:/work/test/a1.jar com.test.second.ClassZ
Error: Could not find or load main class com.test.second.ClassZ

C:\>java -classpath C:/work/test/a1.jar com.test.second.ClassZ
From main method
```

On Unix:

```
[devaraj@devaraj-work-pc ~]$ java -classpath /home/devaraj/work/install/test/a1.jar com.test.second.ClassZ
From main method

[devaraj@devaraj-work-pc ~]$ java -classpath file:///home/devaraj/work/install/test/a1.jar com.test.second.ClassZ
From main method
```
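
One possible reading of the difference above, offered as an assumption rather than something established in this thread: `java -classpath` splits its argument on the platform path separator, which is `:` on Unix and `;` on Windows. On Unix, a `file:///...` entry would then be split into `file` and a remainder that is still a usable path, while on Windows the whole string stays one entry that is not a valid file path. The sketch below shows that splitting, plus a hypothetical helper `toClasspathEntry` (not part of Spark) that drops the `file:` scheme. It does not attempt to handle Windows drive letters.

```scala
import java.net.{URI, URISyntaxException}
import java.util.regex.Pattern

object ClasspathSketch {
  // Split a classpath string on the given platform separator (":" or ";").
  def splitClasspath(classpath: String, separator: String): List[String] =
    classpath.split(Pattern.quote(separator)).toList

  // Hypothetical helper: drop the "file:" scheme so the entry is a plain path.
  // Entries that are not valid URIs, or that use another scheme, are returned unchanged.
  def toClasspathEntry(jar: String): String =
    try {
      val uri = new URI(jar)
      if (uri.getScheme == "file") uri.getPath else jar
    } catch {
      case _: URISyntaxException => jar
    }
}
```

Under this reading, a fix would convert each jar to a plain path before joining the entries with `File.pathSeparator`.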




[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-07-23 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/18664
  
In my opinion, we should definitely specify the timezone to keep the 
timestamp correct.
I'm not sure which one is the most suitable yet, but the candidates would be:

1. `"UTC"`
   Spark SQL stores a timestamp internally as the number of microseconds 
   since `1970-01-01 00:00:00.0 UTC`.
2. `SQLConf.SESSION_LOCAL_TIMEZONE`
   Spark SQL uses this timezone to represent and calculate values in 
   timezone-related operations. If the config value is not set, it falls 
   back to `DateTimeUtils.defaultTimeZone()`.
3. `DateTimeUtils.defaultTimeZone()`
   The system timezone.

Ideally we would also specify the timezone when 
`spark.conf.set("spark.sql.execution.arrow.enable", "false")`, but would 
that affect backward compatibility?
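
To make the trade-off concrete, here is a small sketch using only `java.time` (no Spark or Arrow APIs): the same microsecond count since the epoch renders as different wall-clock values depending on the zone applied. The helper name `render` and the sample zone `Asia/Tokyo` are assumptions for illustration.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.temporal.ChronoUnit

object TimezoneSketch {
  // Interpret microseconds since 1970-01-01 00:00:00 UTC as wall-clock time in `zone`.
  def render(micros: Long, zone: String): LocalDateTime = {
    val instant = Instant.EPOCH.plus(micros, ChronoUnit.MICROS)
    LocalDateTime.ofInstant(instant, ZoneId.of(zone))
  }
}
```

For example, `TimezoneSketch.render(0L, "UTC")` and `TimezoneSketch.render(0L, "Asia/Tokyo")` describe the same instant but differ by nine hours on the wall clock, which is why the choice among the candidates changes what the user sees.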



[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18711
  
Merged build finished. Test PASSed.



[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18711
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79895/
Test PASSed.



[GitHub] spark issue #18071: [SPARK-20855][Docs][DStream] Update the Spark kinesis do...

2017-07-23 Thread yssharma

Github user yssharma commented on the issue:

https://github.com/apache/spark/pull/18071
  
I noticed that now. Yes I will post an updated patch today. Thanks 
@HyukjinKwon 



[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18711
  
**[Test build #79895 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79895/testReport)**
 for PR 18711 at commit 
[`efcec8e`](https://github.com/apache/spark/commit/efcec8e6ff0a5728520ddcafbc44a3c2622172fd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18709
  
Merged build finished. Test PASSed.



[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18709
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79896/
Test PASSed.



[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18709
  
**[Test build #79896 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79896/testReport)**
 for PR 18709 at commit 
[`9fba6db`](https://github.com/apache/spark/commit/9fba6db4bb450a024c75a9eed158b69dbd1afd41).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18687
  
This really depends on how you implement the global statement cache and 
management. All the compiled plans can be stored in the cache. The plans can be 
reused, if possible (the reused plans might not be identical). 

The plan recompilation in this PR can be part of global statement cache and 
management. Plan recompilation can be manual or automatic. In this specific 
case, it should be done automatically. 



[GitHub] spark issue #18612: [SPARK-21388][ML][PySpark] GBTs inherit from HasStepSize...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18612
  
Merged build finished. Test PASSed.



[GitHub] spark issue #18612: [SPARK-21388][ML][PySpark] GBTs inherit from HasStepSize...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18612
  
**[Test build #79897 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79897/testReport)**
 for PR 18612 at commit 
[`aaa9187`](https://github.com/apache/spark/commit/aaa918770cc2505f37dd6f792bb6879f97e30b06).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #18093: [WIP][SPARK-20774][SQL] Cancel all jobs when QueryExecti...

2017-07-23 Thread liyichao

Github user liyichao commented on the issue:

https://github.com/apache/spark/pull/18093
  
Sorry about that, I will test it when I have time.



[GitHub] spark issue #18612: [SPARK-21388][ML][PySpark] GBTs inherit from HasStepSize...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18612
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79897/
Test PASSed.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-23 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128947347
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
+
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
+ */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: Expression,
+upper: Expression)
+  extends WindowFrame {
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def
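
The doc comments in the diff above distinguish physical offsets (`ROWS`) from logical offsets (`RANGE`). As a rough illustration outside Spark, the sketch below computes a windowed sum over an already sorted sequence for `ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING` and for `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING`. The names `rowsSum` and `rangeSum` are hypothetical, and this is not how Catalyst evaluates frames.

```scala
object FrameSketch {
  // ROWS frame: neighbours are chosen by position in the ordered partition.
  def rowsSum(sorted: Vector[Int], preceding: Int, following: Int): Vector[Int] =
    sorted.indices.map { i =>
      val from = math.max(0, i - preceding)
      val until = math.min(sorted.length, i + following + 1)
      sorted.slice(from, until).sum
    }.toVector

  // RANGE frame: neighbours are chosen by ORDER BY value, i.e. every row whose
  // value lies in [v - preceding, v + following], so peers share one frame.
  def rangeSum(sorted: Vector[Int], preceding: Int, following: Int): Vector[Int] =
    sorted.map { v =>
      sorted.filter(x => x >= v - preceding && x <= v + following).sum
    }
}
```

With the input `Vector(1, 2, 2, 4)`, the two rows holding the value 2 are peers: they get the same result under the range frame but different results under the row frame.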

[GitHub] spark issue #18071: [SPARK-20855][Docs][DStream] Update the Spark kinesis do...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18071
  
It looks like the last test failed on style checking. Would you have some 
time to fix it up?



[GitHub] spark issue #18071: [SPARK-20855][WIP][DStream] Update the Spark kinesis doc...

2017-07-23 Thread yssharma

Github user yssharma commented on the issue:

https://github.com/apache/spark/pull/18071
  
@HyukjinKwon The PR is ready. Just waiting for some [emoji] from the reviewers.



[GitHub] spark issue #18250: [SPARK-21024][SQL] CSV parse mode handles Univocity pars...

2017-07-23 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/18250
  
I think we can come back to this if we find a better way to handle it, so 
I'll close this for now (we had better keep this discussion in JIRA).



[GitHub] spark pull request #18250: [SPARK-21024][SQL] CSV parse mode handles Univoci...

2017-07-23 Thread maropu

Github user maropu closed the pull request at:

https://github.com/apache/spark/pull/18250



[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...

2017-07-23 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18711#discussion_r128945012
  
--- Diff: docs/configuration.md ---
@@ -1106,7 +1106,7 @@ Apart from these, the following properties are also 
available, and may be useful
 parameter allows an application to run multiple executors on the
 same worker, provided that there are enough cores on that
 worker. Otherwise, only one executor per application will run on
-each worker.
+each worker usually.
--- End diff --

```
In standalone and Mesos coarse-grained modes, setting this
parameter allows an application to launch multiple executors on the
same worker in one schedule iteration, provided that there are enough
cores on that worker. Otherwise, only one executor per application
will be scheduled on each worker during one schedule iteration.
```



[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...

2017-07-23 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18711#discussion_r128945886
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala 
---
@@ -580,7 +580,8 @@ private[deploy] class Master(
* The number of cores assigned to each executor is configurable. When 
this is explicitly set,
* multiple executors from the same application may be launched on the 
same worker if the worker
* has enough cores and memory. Otherwise, each executor grabs all the 
cores available on the
-   * worker by default, in which case only one executor may be launched on 
each worker.
+   * worker by default, in which case only one executor may be launched on 
each worker only one
+   * schedules.
--- End diff --

Should also add:
```
Note that when `spark.executor.cores` is not set, we may still launch 
multiple executors from the same application on the same worker. Consider appA 
and appB both having one executor running on worker1, with appA.coresLeft > 0. 
If appB then finishes and releases all its cores on worker1, in the next 
schedule iteration appA launches a new executor that grabs all the free cores 
on worker1, so we get multiple executors from appA running on worker1.
```
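
The scenario in the suggested note can be sketched as a toy model. This is only an illustration under stated assumptions: `ScheduleSketch`, `Executor`, `scheduleOnce`, the 8-core worker, and the 4-core executors are all hypothetical and are not taken from `Master.scala`.

```scala
// Toy model of the scenario above; NOT Spark's Master implementation.
// All names and core counts here are hypothetical.
object ScheduleSketch {
  final case class Executor(app: String, worker: String, cores: Int)

  // One schedule iteration for `app` when spark.executor.cores is unset:
  // a new executor grabs all free cores on the worker, capped by coresLeft.
  def scheduleOnce(
      app: String,
      coresLeft: Int,
      worker: String,
      freeCores: Int,
      running: List[Executor]): List[Executor] = {
    val grab = math.min(coresLeft, freeCores)
    if (grab > 0) Executor(app, worker, grab) :: running else running
  }

  def demo(): List[Executor] = {
    val workerCores = 8
    // appA and appB each already hold one executor on worker1.
    val initial = List(Executor("appA", "worker1", 4), Executor("appB", "worker1", 4))
    // appB finishes and releases all its cores on worker1.
    val afterB = initial.filterNot(_.app == "appB")
    val free = workerCores - afterB.map(_.cores).sum
    // Next iteration: appA still has coresLeft > 0, so it launches again.
    scheduleOnce(app = "appA", coresLeft = 4, worker = "worker1", freeCores = free, running = afterB)
  }
}
```

In this sketch `ScheduleSketch.demo()` ends with two appA executors on worker1, even though each single iteration launched at most one.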



[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...

2017-07-23 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18711#discussion_r128943527
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala 
---
@@ -580,7 +580,8 @@ private[deploy] class Master(
* The number of cores assigned to each executor is configurable. When 
this is explicitly set,
* multiple executors from the same application may be launched on the 
same worker if the worker
* has enough cores and memory. Otherwise, each executor grabs all the 
cores available on the
-   * worker by default, in which case only one executor may be launched on 
each worker.
+   * worker by default, in which case only one executor may be launched on 
each worker only one
--- End diff --

Perhaps rephrase this to:
```
in which case only one executor may be launched on each worker during one 
schedule iteration.
```
?



[GitHub] spark issue #13320: [SPARK-13184][SQL] Add a datasource-specific option minP...

2017-07-23 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/13320
  
No, so I'll close this for now and we move this discussion to jira.



[GitHub] spark pull request #13320: [SPARK-13184][SQL] Add a datasource-specific opti...

2017-07-23 Thread maropu

Github user maropu closed the pull request at:

https://github.com/apache/spark/pull/13320



[GitHub] spark issue #17882: [WIP][SPARK-20079][yarn] Re registration of AM hangs spa...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17882
  
Thank you @jerryshao.



[GitHub] spark issue #18154: [SPARK-20932][ML]CountVectorizer support handle persiste...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18154
  
I don't know ML well enough to review this. I just wanted to check whether it 
is in progress in any way.



[GitHub] spark issue #16577: [SPARK-19214][SQL] Typed aggregate count output field na...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16577
  
@aray WDYT on ^?



[GitHub] spark issue #16445: [SPARK-19043][SQL]Make SparkSQLSessionManager more confi...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16445
  
Hi @yaooqinn, is this PR active? If so, would you address or answer the 
review comments above?



[GitHub] spark issue #18093: [SPARK-20774][SQL] Cancel all jobs when QueryExection th...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18093
  
I think a patch should at least be manually tested @liyichao.



[GitHub] spark issue #18071: [SPARK-20855][WIP][DStream] Update the Spark kinesis doc...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18071
  
Hi @yssharma, is it still WIP? I think we should at least get the PR into a 
mergeable state.



[GitHub] spark issue #17882: [WIP][SPARK-20079][yarn] Re registration of AM hangs spa...

2017-07-23 Thread jerryshao

Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/17882
  
I think this could be closed; @vanzin already created a new PR based on 
this (#18663).



[GitHub] spark issue #18127: [SPARK-6628][SQL][Branch-2.1] Fix ClassCastException whe...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18127
  
Hi @weiqingy, I just wonder if it is in progress in any way.



[GitHub] spark issue #13706: [SPARK-15988] [SQL] Implement DDL commands: Create/Drop ...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13706
  
@gatorsmile, I just wonder if this PR is ready to proceed further. It 
looks like the PR you linked was merged properly.



[GitHub] spark issue #17371: [SPARK-19903][PYSPARK][SS] window operator miss the `wat...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17371
  
gentle ping @uncleGen, is this PR still active?



[GitHub] spark issue #14601: [SPARK-13979][Core] Killed executor is re spawned withou...

2017-07-23 Thread agsachin

Github user agsachin commented on the issue:

https://github.com/apache/spark/pull/14601
  
Thanks @jiangxb2987, I will add this test case by tomorrow and will update 
the PR with the results.



[GitHub] spark issue #18154: [SPARK-20932][ML]CountVectorizer support handle persiste...

2017-07-23 Thread zhengruifeng

Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/18154
  
@hhbyyh @HyukjinKwon Sorry for the late reply. 
I think it may be better to use special logic if it performs more 
efficiently. 
What is your opinion? @yanboliang @HyukjinKwon 



[GitHub] spark issue #18242: Modify JavaModelSelectionViaCrossValidationExample

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18242
  
ping @arsinux 



[GitHub] spark issue #17953: [SPARK-20680][SQL] Spark-sql do not support for void col...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17953
  
@LantaoJin do you have some time to address the review comment above?



[GitHub] spark issue #17979: [SPARK-19320][MESOS][WIP]allow specifying a hard limit o...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17979
  
@yanji84, is this PR active? If so, would you answer the question above?



[GitHub] spark issue #17180: [SPARK-19839][Core]release longArray in BytesToBytesMap

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17180
  
Hi @zhzhan, is this PR active? If so, would you answer or address the review 
comment?



[GitHub] spark issue #13320: [SPARK-13184][SQL] Add a datasource-specific option minP...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13320
  
@maropu, I just wonder if it is in progress in any way.



[GitHub] spark issue #16365: [SPARK-18950][SQL] Report conflicting fields when mergin...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16365
  
ping @bravo-zhang for adding the test.



[GitHub] spark issue #18154: [SPARK-20932][ML]CountVectorizer support handle persiste...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18154
  
@zhengruifeng Would you answer or address the review comments above?



[GitHub] spark pull request #18717: [SPARK-21510] [SQL] Add isMaterialized() and eage...

2017-07-23 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18717#discussion_r128944222
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2739,6 +2752,26 @@ class Dataset[T] private[sql](
   }
 
   /**
+   * Returns true when the Dataset is cached and materialized.
+   *
+   * @group basic
+   * @since 2.3.0
+   */
+  def isMaterialized(): Boolean = {
--- End diff --

What are the use scenarios for this API? RDD also doesn't have an API to 
check whether it's materialized. I think in practice we don't need to check if 
a Dataset/RDD is materialized, except for test purposes.



[GitHub] spark issue #12147: [SPARK-14361][SQL]Window function exclude clause

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/12147
  
Gentle ping @xwu0226, how is the update going?



[GitHub] spark issue #17882: [WIP][SPARK-20079][yarn] Re registration of AM hangs spa...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17882
  
gentle ping @witgo for review comments above.



[GitHub] spark issue #18612: [SPARK-21388][ML][PySpark] GBTs inherit from HasStepSize...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18612
  
**[Test build #79897 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79897/testReport)**
 for PR 18612 at commit 
[`aaa9187`](https://github.com/apache/spark/commit/aaa918770cc2505f37dd6f792bb6879f97e30b06).



[GitHub] spark issue #11205: [SPARK-11334][Core] Handle maximum task failure situatio...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/11205
  
gentle ping @rustagi 



[GitHub] spark issue #16766: [SPARK-19426][SQL] Custom coalesce for Dataset

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16766
  
Hi @mariusvniekerk, would you be able to fix the javadoc errors?



[GitHub] spark issue #14239: [SPARK-16593] [CORE] [WIP] Provide a pre-fetch mechanism...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14239
  
Is it @f7753 ?



[GitHub] spark issue #14158: [SPARK-13547] [SQL] [WEBUI] Add SQL query in web UI's SQ...

2017-07-23 Thread nblintao

Github user nblintao commented on the issue:

https://github.com/apache/spark/pull/14158
  
@HyukjinKwon It's still active. I'll fix it when I'm available. Thanks.



[GitHub] spark issue #14085: [SPARK-16408][SQL] SparkSQL Added file get Exception: is...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14085
  
@zenglinxi0615 Could you answer the question above if you are active?



[GitHub] spark issue #18328: [SPARK-21121][SQL] Support changing storage level via th...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18328
  
@dosoft Is this PR active? If so, would you mind replying to the 
review comment above?



[GitHub] spark pull request #18717: [SPARK-21510] [SQL] Add isMaterialized() and eage...

2017-07-23 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18717#discussion_r128943409
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2739,6 +2752,26 @@ class Dataset[T] private[sql](
   }
 
   /**
+   * Returns true when the Dataset is cached and materialized.
+   *
+   * @group basic
+   * @since 2.3.0
+   */
+  def isMaterialized(): Boolean = {
+queryExecution.sparkPlan match {
+  case i: InMemoryTableScanExec =>
+val blockManager = sparkSession.sparkContext.env.blockManager
+val rdd = i.relation.cachedColumnBuffers
+val blockIDs = rdd.partitions.indices.map(index => 
RDDBlockId(rdd.id, index))
+blockIDs.foreach { bid =>
+  if (blockManager.get(bid).nonEmpty) 
blockManager.releaseLock(bid) else return false
--- End diff --

I think this won't work for remote blocks. `BlockManager.get` doesn't 
acquire any locks for remote blocks. Once you release the lock on them, you 
will get an exception.

Btw, for remote blocks, it fetches the blocks back, so it's costly for this 
purpose.
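
To make the concern concrete, here is a minimal toy model of the get-then-release loop from the diff. It assumes, as the comment states, that a local hit takes a read lock while a remote hit does not. `ToyBlockManager` and everything in it is hypothetical and is not Spark's `BlockManager`.

```scala
import scala.collection.mutable

// Toy model only; NOT Spark's BlockManager. All names are hypothetical.
object BlockLockSketch {
  final class ToyBlockManager(local: Set[Int], remote: Set[Int]) {
    private val readLocks = mutable.Set.empty[Int]

    // Assumption from the review comment: local hits acquire a read lock,
    // remote hits are fetched back without acquiring any lock.
    def get(id: Int): Option[String] =
      if (local(id)) { readLocks += id; Some(s"local-$id") }
      else if (remote(id)) Some(s"fetched-$id")
      else None

    // Releasing a lock that was never acquired fails.
    def releaseLock(id: Int): Unit = {
      if (!readLocks.remove(id)) {
        throw new IllegalStateException(s"no lock held for block $id")
      }
    }
  }

  // Mirrors the loop in the diff: get, then release unconditionally.
  def isMaterialized(bm: ToyBlockManager, ids: Seq[Int]): Boolean =
    ids.forall { id =>
      if (bm.get(id).nonEmpty) { bm.releaseLock(id); true } else false
    }
}
```

With only local blocks the check behaves as intended; once one of the blocks is remote, the unconditional `releaseLock` throws in this model, which is the failure mode described above.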



[GitHub] spark issue #17530: [SPARK-5158] Access kerberized HDFS from Spark standalon...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17530
  
gentle ping @themodernlife on ^.



[GitHub] spark issue #11494: [SPARK-10399][CORE][SQL] Introduce OffHeapMemoryBlock to...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/11494
  
gentle ping @yzotov.



[GitHub] spark issue #14158: [SPARK-13547] [SQL] [WEBUI] Add SQL query in web UI's SQ...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14158
  
gentle ping @nblintao, is this PR active? If so, I guess the test failure 
should be fixed if related.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18651: [SPARK-21383][Core] Fix the YarnAllocator allocat...

2017-07-23 Thread djvulee

Github user djvulee commented on a diff in the pull request:

https://github.com/apache/spark/pull/18651#discussion_r128943316
  
--- Diff: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
 ---
@@ -525,9 +534,11 @@ private[yarn] class YarnAllocator(
   } catch {
 case NonFatal(e) =>
   logError(s"Failed to launch executor $executorId on 
container $containerId", e)
-  // Assigned container should be released immediately to 
avoid unnecessary resource
-  // occupation.
+  // Assigned container should be released immediately
+  // to avoid unnecessary resource occupation.
   amClient.releaseAssignedContainer(containerId)
+  } finally {
+numExecutorsStarting.decrementAndGet()
--- End diff --

I agree that putting `numExecutorsStarting.decrementAndGet()` together with 
`numExecutorsRunning.incrementAndGet()` in `updateInternalState` is better 
if we can.

The reason I put `numExecutorsStarting.decrementAndGet()` in the `finally` 
block is that if an exception is thrown that is not `NonFatal` and it is caught 
by the following code, we may not be able to allocate the resources we 
specified. This is the same concern @vanzin raised.

We may double count in the current code, but this only slows down the 
allocation rate for a short time.
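
The `finally` argument can be illustrated with a small standalone sketch. It is not the `YarnAllocator` code: `StartingCounterSketch` and `launch` are hypothetical, and only `AtomicInteger` and `scala.util.control.NonFatal` are real library pieces. `InterruptedException` is used as the example because `NonFatal` does not match it.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.util.control.NonFatal

// Standalone sketch; NOT the YarnAllocator implementation.
object StartingCounterSketch {
  val numExecutorsStarting = new AtomicInteger(0)

  // Hypothetical stand-in for launching one executor in a container.
  def launch(body: () => Unit): Unit = {
    numExecutorsStarting.incrementAndGet()
    try {
      body()
    } catch {
      case NonFatal(_) =>
        // The real code would log and release the assigned container here.
        ()
    } finally {
      // Runs for non-fatal errors and for errors NonFatal does not match,
      // so the "starting" count never leaks.
      numExecutorsStarting.decrementAndGet()
    }
  }
}
```

In this sketch a `RuntimeException` is swallowed by the `NonFatal` case, an `InterruptedException` propagates to the caller, and in both cases the counter is back at zero afterwards.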




[GitHub] spark issue #15227: [SPARK-17655][SQL]Remove unused variables declarations a...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15227
  
gentle ping @yaooqinn on ^.



[GitHub] spark issue #18269: [SPARK-21056][SQL] Use at most one spark job to list fil...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18269
  
gentle ping @bbossy, I just want to be sure if it is in progress in any way.



[GitHub] spark issue #14601: [SPARK-13979][Core] Killed executor is re spawned withou...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14601
  
gentle ping @agsachin.



[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread jiangxb1987

Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/18711
  
Yea, I understand the issue now. Thank you for clarifying!



[GitHub] spark pull request #18717: [SPARK-21510] [SQL] Add isMaterialized() and eage...

2017-07-23 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18717#discussion_r128942266
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
@@ -25,9 +25,22 @@ import org.apache.spark.storage.StorageLevel
 class DatasetCacheSuite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
-  // Clear all persistent datasets after each test
   override def afterEach(): Unit = {
-    spark.sharedState.cacheManager.clearCache()
+    try {
+      // Clear all persistent datasets after each test
+      spark.sharedState.cacheManager.clearCache()
+    } finally {
+      super.afterEach()
+    }
+  }
+
+  test("eager persist") {
+    val ds = Seq("1", "2").toDF()
+    ds.persist(eager = false)
+    assert(!ds.isMaterialized())
+    ds.persist(eager = true)
+    ds.collect()
--- End diff --

To test whether it is eagerly persisted, why do we need a `collect` here?



[GitHub] spark issue #18250: [SPARK-21024][SQL] CSV parse mode handles Univocity pars...

2017-07-23 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/18250
  
ok, thanks!



[GitHub] spark issue #17519: [SPARK-15352][Doc] follow-up: add configuration docs for...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17519
  
Hi @lins05, is this PR active?



[GitHub] spark issue #18354: [SPARK-18016][SQL][CATALYST][BRANCH-2.1] Code Generation...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18354
  
@bdrillard, it sounds like this was properly backported. Would you close this?



[GitHub] spark issue #18687: [SPARK-21484][SQL] Fix inconsistent query plans of Datas...

2017-07-23 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18687
  
cc @cloud-fan Can you help review this too? Thanks.



[GitHub] spark issue #18383: [SPARK-21167][SS] Set kafka clientId while fetch message...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18383
  
gentle ping @dijingran. I want to check whether this is in progress in any way.



[GitHub] spark issue #18250: [SPARK-21024][SQL] CSV parse mode handles Univocity pars...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18250
  
I am sorry @maropu. I can't think of a good way to handle this for now. 
I will come back after thinking about it more.



[GitHub] spark issue #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties from s...

2017-07-23 Thread yaooqinn

Github user yaooqinn commented on the issue:

https://github.com/apache/spark/pull/18668
  
ping @cloud-fan 



[GitHub] spark issue #18687: [SPARK-21484][SQL][WIP] Fix inconsistent query plans of ...

2017-07-23 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18687
  
> So far, we do not support dynamic SQL statement, but this is a potential 
feature we can explore in the future. A global statement cache and management 
can reduce the optimization costs, especially when our CBO optimizer is more 
advanced.

I agree. The dynamic statement cache is used to reduce the costly 
preparation process by reusing prepared (parsed, analyzed and optimized) 
statements. IIUC, this only works for identical dynamic SQL statements. As the 
CBO optimizer becomes more advanced and more costly, the statement cache might 
help reduce the cost for identical statements.

The current cache in SparkSQL is not a statement cache, but a cache of query 
plan fragments (and their execution results). A query doesn't need to be 
identical to a cached query plan. It can reuse the cached plan even when the 
cached one is just a fragment of it.

So it seems to me they are orthogonal and can be complementary.
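
To make the distinction concrete, here is a toy sketch. Everything in it (`CacheSketch`, the `Plan` node types, `StatementCache`, `useCachedFragments`) is hypothetical and made up for illustration; it is not Spark's `CacheManager` or any existing API. It contrasts a cache that only hits on identical statement text with one that substitutes any matching sub-plan:

```scala
import scala.collection.mutable

object CacheSketch {
  // Hypothetical, heavily simplified plan nodes.
  sealed trait Plan
  case class Scan(table: String) extends Plan
  case class Filter(cond: String, child: Plan) extends Plan
  case class Project(cols: Seq[String], child: Plan) extends Plan
  case class CachedRelation(original: Plan) extends Plan

  // Statement cache: a hit requires the exact same statement text, and what
  // is saved is the preparation (parse, analyze, optimize) work.
  final class StatementCache(prepare: String => Plan) {
    private val cache = mutable.Map.empty[String, Plan]
    var prepareCalls = 0
    def get(sql: String): Plan =
      cache.getOrElseUpdate(sql, { prepareCalls += 1; prepare(sql) })
  }

  // Fragment cache: any subtree equal to a cached plan is replaced, so a
  // different query can still reuse the cached part.
  def useCachedFragments(plan: Plan, cached: Set[Plan]): Plan =
    if (cached.contains(plan)) CachedRelation(plan)
    else plan match {
      case Filter(c, child) => Filter(c, useCachedFragments(child, cached))
      case Project(cs, child) => Project(cs, useCachedFragments(child, cached))
      case other => other
    }
}
```

In this toy model, a projection over a cached filter reuses the cached filter even though the full query was never seen before, while the statement cache helps only when the same text is submitted again.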








[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17096
  
gentle ping ...



[GitHub] spark issue #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support for arra...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16611
  
gentle ping ...



[GitHub] spark issue #18592: [SPARK-21368][SQL] TPCDSQueryBenchmark can't refer query...

2017-07-23 Thread maropu

Github user maropu commented on the issue:

https://github.com/apache/spark/pull/18592
  
@gatorsmile Could we merge this first? I feel we could discuss this further on JIRA.



[GitHub] spark issue #18304: [SPARK-21098] Set lineseparator csv multiline and csv wr...

2017-07-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18304
  
Would you mind if I ask you to wait for the resolution of 
`https://github.com/apache/spark/pull/18581`? Strictly speaking, they are 
orthogonal, as that PR tries not to change the default line separator, but I 
would like to wait and see other opinions on this.



[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread 10110346

Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/18711
  
When the other application finishes, it will release its cores.



[GitHub] spark issue #18713: [SPARK-21509][SQL] Add a config to enable adaptive query...

2017-07-23 Thread jinxing64

Github user jinxing64 commented on the issue:

https://github.com/apache/spark/pull/18713
  
cc @cloud-fan @jiangxb1987 



[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread jiangxb1987

Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/18711
  
How can we wait for new free cores on the same worker, if we only start one 
executor for each worker initially?



[GitHub] spark issue #18687: [SPARK-21484][SQL][WIP] Fix inconsistent query plans of ...

2017-07-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/18687
  
So far, we do not support dynamic SQL statements, but this is a potential 
feature we can explore in the future. A global statement cache and its 
management can reduce the optimization costs, especially when our CBO 
optimizer is more advanced.

At the same time, it also resolves a more general issue. We can invalidate 
all the physical plans that are built based on stale info. Data cache is 
just one of the examples. It could also include outdated statistics. The 
previously optimized plan might not make sense any more when a large amount 
of data is inserted into or deleted from the underlying tables.



[GitHub] spark issue #18719: [SPARK-21512][SQL][TEST] DatasetCacheSuite needs to exec...

2017-07-23 Thread kiszk

Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/18719
  
Good catch, I will create a follow-up PR today.



[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18709
  
**[Test build #79896 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79896/testReport)**
 for PR 18709 at commit 
[`9fba6db`](https://github.com/apache/spark/commit/9fba6db4bb450a024c75a9eed158b69dbd1afd41).



[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18711
  
**[Test build #79895 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79895/testReport)**
 for PR 18711 at commit 
[`efcec8e`](https://github.com/apache/spark/commit/efcec8e6ff0a5728520ddcafbc44a3c2622172fd).



[GitHub] spark pull request #18719: [SPARK-21512][SQL][TEST] DatasetCacheSuite needs ...

2017-07-23 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18719#discussion_r128937203
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
@@ -25,6 +25,11 @@ import org.apache.spark.storage.StorageLevel
 class DatasetCacheSuite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
+  // Clear all persistent datasets after each test
+  override def afterEach(): Unit = {
+    spark.sharedState.cacheManager.clearCache()
--- End diff --

+1, ideally we should always call `super.afterEach`. @kiszk can you send a 
follow-up PR? Thanks.
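
A small self-contained sketch of why the `super.afterEach` call matters. It deliberately avoids ScalaTest: `BaseSuite`, `SharedContext`, `CacheSuite`, and the `log` buffer are hypothetical stand-ins for the real suite hierarchy. Each override does its own cleanup and then delegates to `super` inside `finally`, so the cleanup of the mixed-in traits still runs when the local step throws:

```scala
import scala.collection.mutable.ArrayBuffer

object AfterEachSketch {
  // Records which cleanup steps ran, in order.
  val log = ArrayBuffer.empty[String]

  // Hypothetical stand-in for the test framework's base trait.
  trait BaseSuite {
    def afterEach(): Unit = log += "base cleanup"
  }

  // Hypothetical stand-in for a shared-context mixin with its own cleanup.
  trait SharedContext extends BaseSuite {
    override def afterEach(): Unit = {
      try log += "shared context cleanup"
      finally super.afterEach()
    }
  }

  // The suite's own cleanup may fail; `finally` keeps the chain intact.
  class CacheSuite(failOnClear: Boolean) extends SharedContext {
    override def afterEach(): Unit = {
      try {
        if (failOnClear) throw new IllegalStateException("clearCache failed")
        log += "cache cleared"
      } finally {
        super.afterEach()
      }
    }
  }
}
```

If `CacheSuite.afterEach` omitted the `super` call, neither the shared-context cleanup nor the base cleanup would run after any test in the suite.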



[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-07-23 Thread 10110346

Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/18711
  
@jiangxb1987 If `app.coresLeft` is not zero and there are no more free cores 
left, the allocation does not end there. Once some workers have free cores 
again, this app will continue to be assigned cores, and at that point another 
executor may be launched on the same worker.



[GitHub] spark issue #18687: [SPARK-21484][SQL][WIP] Fix inconsistent query plans of ...

2017-07-23 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/18687
  
Do we have dynamic SQL statements?



[GitHub] spark issue #18714: [SPARK-20236][SQL] hive style partition overwrite

2017-07-23 Thread ericl

Github user ericl commented on the issue:

https://github.com/apache/spark/pull/18714
  
Got it.

On Sun, Jul 23, 2017, 10:40 PM Wenchen Fan  wrote:

> *@cloud-fan* commented on this pull request.
> --
>
> In
> core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
> :
>
> > @@ -52,12 +55,22 @@ class HadoopMapReduceCommitProtocol(jobId: String, path: String)
> */
>@transient private var addedAbsPathFiles: mutable.Map[String, String] = null
>
> +  @transient private var partitionPaths: mutable.Set[String] = null
> +
> +  @transient private var stagingDir: Path = _
>
> stagingDir may not be needed, but we do need partitionPaths, which tracks
> partitions with default path.




[GitHub] spark issue #18717: [SPARK-21510] [SQL] Add isMaterialized() and eager persi...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18717
  
Merged build finished. Test PASSed.



[GitHub] spark issue #18717: [SPARK-21510] [SQL] Add isMaterialized() and eager persi...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18717
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79894/
Test PASSed.



[GitHub] spark issue #18717: [SPARK-21510] [SQL] Add isMaterialized() and eager persi...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18717
  
**[Test build #79894 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79894/testReport)**
 for PR 18717 at commit 
[`9077258`](https://github.com/apache/spark/commit/9077258af86d7403e28723d9d6d457bd7bf1cc76).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18709
  
Merged build finished. Test FAILed.



[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18709
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79893/
Test FAILed.



[GitHub] spark issue #18709: [SPARK-21504] [SQL] Add spark version info into table me...

2017-07-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18709
  
**[Test build #79893 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79893/testReport)**
 for PR 18709 at commit 
[`4c511c3`](https://github.com/apache/spark/commit/4c511c30e968afad9666d4e5d125ef45387cf491).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


