[GitHub] spark issue #19354: [SPARK-20992][Scheduler] Add links in documentation to N...

2017-09-27 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/19354
  
@rcgenova I'm referring to https://issues.apache.org/jira/browse/SPARK-18278
This note was added in https://github.com/apache/spark/pull/17522 and I 
actually disagree that this should have gone in. Why not add this when it's 
actually in Spark? @foxish @rxin @mridulm 
I think this PR here is one of the reasons, actually


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19377: [SPARK-22154] add a shutdown hook that explains why the ...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19377
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #19377: [SPARK-22154] add a shutdown hook that explains why the ...

2017-09-27 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/19377
  
The events of a shutdown are already logged; what does this add? It's 
already clear that something is shutting down.


---




[GitHub] spark pull request #19377: [SPARK-22154] add a shutdown hook that explains w...

2017-09-27 Thread liu-zhaokun
GitHub user liu-zhaokun opened a pull request:

https://github.com/apache/spark/pull/19377

[SPARK-22154] add a shutdown hook that explains why the output is 
terminating

It would be nice to add a shutdown hook here that explains why the output 
is terminating. Otherwise, if the worker dies, the executor logs will silently 
stop.
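
A minimal sketch of the idea, not the code from this pull request: it uses the plain JVM `Runtime.addShutdownHook` API, and the object name `ShutdownHookSketch`, the `emit` callback, and the message text are all hypothetical names chosen for illustration.

```scala
object ShutdownHookSketch {
  // Hypothetical message; the wording used in the actual patch may differ.
  val shutdownMessage: String =
    "Worker is shutting down; executor log output ends here."

  // Registers a JVM shutdown hook that writes one final line through `emit`.
  // Returns the hook thread so that callers can deregister it if needed.
  def install(emit: String => Unit): Thread = {
    val hook = new Thread(new Runnable {
      override def run(): Unit = emit(shutdownMessage)
    })
    Runtime.getRuntime.addShutdownHook(hook)
    hook
  }
}
```

A caller could register it with something like `ShutdownHookSketch.install(msg => System.err.println(msg))`; the hook runs on normal JVM exit or on a signal such as SIGTERM, but not when the process is killed outright.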


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liu-zhaokun/spark master0928

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19377.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19377


commit 0be78c3e41edb27a00153696fb8ab74a95904131
Author: liuzhaokun 
Date:   2017-09-28T06:50:40Z

[SPARK-22154] add a shutdown hook that explains why the output is 
terminating




---




[GitHub] spark pull request #19358: [SPARK-22135] [MESOS] metrics in spark-dispatcher...

2017-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19358


---




[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2017-09-27 Thread DonnyZone
Github user DonnyZone commented on the issue:

https://github.com/apache/spark/pull/19175
  
cc @hvanhovell @gatorsmile 


---




[GitHub] spark issue #19358: [SPARK-22135] [MESOS] metrics in spark-dispatcher not be...

2017-09-27 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19358
  
Merging to master and branch 2.2. Thanks!


---




[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-27 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19287
  
Generally it looks fine to me.

CC @markhamstra @squito, could you please help review it? Thanks!


---




[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-09-27 Thread szhem
Github user szhem commented on the issue:

https://github.com/apache/spark/pull/19294
  
@mridulm Regarding FileFormatWriter I've implemented some basic tests which 
show that

1. [FileFormatWriter fails](https://github.com/apache/spark/blob/3f958a99921d149fb9fdf7ba7e78957afdad1405/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L118) even before setupJob on the committer is called [if the path is null](https://github.com/apache/spark/pull/19294/files#diff-bc98a3d91cf4f95f4f473146400044aaR40)

       FileOutputFormat.setOutputPath(job, new Path(outputSpec.outputPath))

2. [FileFormatWriter succeeds](https://github.com/apache/spark/pull/19294/files#diff-bc98a3d91cf4f95f4f473146400044aaR70) in case of default partitioning [when customPath is not defined](https://github.com/apache/spark/blob/3f958a99921d149fb9fdf7ba7e78957afdad1405/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L501) (the second branch of the `if` statement)

       val currentPath = if (customPath.isDefined) {
         committer.newTaskTempFileAbsPath(taskAttemptContext, customPath.get, ext)
       } else {
         committer.newTaskTempFile(taskAttemptContext, partDir, ext)
       }

3. [FileFormatWriter succeeds](https://github.com/apache/spark/pull/19294/files#diff-bc98a3d91cf4f95f4f473146400044aaR107) in case of custom partitioning [when customPath is defined](https://github.com/apache/spark/blob/3f958a99921d149fb9fdf7ba7e78957afdad1405/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L499) (the first branch of the `if` statement)

       val currentPath = if (customPath.isDefined) {
         committer.newTaskTempFileAbsPath(taskAttemptContext, customPath.get, ext)
       } else {
         committer.newTaskTempFile(taskAttemptContext, partDir, ext)
       }

Is there anything else I can help with to be sure nothing else was affected?


---




[GitHub] spark pull request #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on...

2017-09-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19370#discussion_r141532178
  
--- Diff: bin/run-example.cmd ---
@@ -17,6 +17,13 @@ rem See the License for the specific language governing 
permissions and
 rem limitations under the License.
 rem
 
-set SPARK_HOME=%~dp0..
+rem Figure out where the Spark framework is installed
+set FIND_SPARK_HOME_SCRIPT=%~dp0find_spark_home.py
+if exist "%FIND_SPARK_HOME_SCRIPT%" (
+  for /f %%i in ('python %FIND_SPARK_HOME_SCRIPT%') do set SPARK_HOME=%%i
--- End diff --

I'm not sure `python` is the right command to call here (python2 or python3? 
Is it on the PATH?), and I don't think `cmd /c foo.py` works either.


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread fjh100456
Github user fjh100456 commented on the issue:

https://github.com/apache/spark/pull/19218
  
@dongjoon-hyun  @gatorsmile 
I've fixed them. Could you please review it again? Thanks.


---




[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19287
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19287
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82262/
Test PASSed.


---




[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19287
  
**[Test build #82262 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82262/testReport)**
 for PR 19287 at commit 
[`71b0d58`](https://github.com/apache/spark/commit/71b0d5821ca4d2738fb608073acbb8f0ba1d8d29).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-09-27 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/19294
  
@szhem Did you resolve the issues with newTaskTempFile, newTaskTempFileAbsPath, 
etc. potentially still throwing an NPE when the path is null?
I saw multiple invocations in Spark SQL that call into them 
(`FileFormatWriter.newOutputWriter`, for example).

I did run against SHC (the Spark HBase connector), and that seems to work fine 
for basic writes, though that is probably because the run was not exhaustive 
and does not exercise other options in Spark SQL.

Essentially, any use of `path` in `HadoopMapReduceCommitProtocol` is 
potentially problematic if `path` can be `null`.
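
One way to make such call sites safe can be sketched as follows. This is an illustration only, not the change made in the pull request: `PathGuardSketch`, `outputPath`, and `tempFileFor` are hypothetical names, and the `_temporary` layout is assumed purely for the example.

```scala
object PathGuardSketch {
  // Wraps a possibly-null output path; Option(null) evaluates to None.
  def outputPath(path: String): Option[String] = Option(path)

  // Hypothetical helper: builds a temp-file location only when a path exists,
  // and fails with a descriptive error instead of a NullPointerException.
  def tempFileFor(path: String, fileName: String): String =
    outputPath(path) match {
      case Some(dir) => s"$dir/_temporary/$fileName"
      case None =>
        throw new IllegalStateException(
          s"No output path configured; cannot create temp file $fileName")
    }
}
```

Routing every access through a single `Option`-returning accessor means a missing path surfaces as one explicit, explainable failure rather than an NPE at whichever call site happens to touch it first.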


---




[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-27 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16578
  
@mallman For the problem 
https://github.com/apache/spark/pull/16578#issuecomment-327797222 identified by 
@snir, I've submitted a fix at https://github.com/VideoAmp/spark-public/pull/8. 
I think it can solve this problem. Can you review it? Thanks.


---




[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19330
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82264/
Test FAILed.


---




[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19330
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19330
  
**[Test build #82264 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82264/testReport)**
 for PR 19330 at commit 
[`0b2b52c`](https://github.com/apache/spark/commit/0b2b52c2bc2e03b7583a55f6bfc8f5c3d21a72e1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #82267 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82267/testReport)**
 for PR 19222 at commit 
[`732cad1`](https://github.com/apache/spark/commit/732cad1689eef2d0456be9afe7bfe417828ce0ee).


---




[GitHub] spark issue #19376: [SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExch...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19376
  
**[Test build #82266 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82266/testReport)**
 for PR 19376 at commit 
[`4d08353`](https://github.com/apache/spark/commit/4d0835382e766984593a96b4b5dc02631bdc3476).


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19218
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19363
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19218
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82261/
Test PASSed.


---




[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19363
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82259/
Test PASSed.


---




[GitHub] spark issue #19376: [SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExch...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19376
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82265/
Test FAILed.


---




[GitHub] spark issue #19376: [SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExch...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19376
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19376: [SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExch...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19376
  
**[Test build #82265 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82265/testReport)**
 for PR 19376 at commit 
[`1de6165`](https://github.com/apache/spark/commit/1de6165cbe6abe3af5dee7882b8cd9185493).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19218
  
**[Test build #82261 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82261/testReport)**
 for PR 19218 at commit 
[`7615939`](https://github.com/apache/spark/commit/7615939268e1bea213e6511dcf6e5346ec21fb23).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19363
  
**[Test build #82259 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82259/testReport)**
 for PR 19363 at commit 
[`289b23d`](https://github.com/apache/spark/commit/289b23d2536cf190a40f4a5494a57e0854771e10).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19376: [SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExch...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19376
  
**[Test build #82265 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82265/testReport)**
 for PR 19376 at commit 
[`1de6165`](https://github.com/apache/spark/commit/1de6165cbe6abe3af5dee7882b8cd9185493).


---




[GitHub] spark pull request #19376: [SPARK-22153][SQL] Rename ShuffleExchange -> Shuf...

2017-09-27 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/19376

[SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExchangeExec

## What changes were proposed in this pull request?
For some reason when we added the Exec suffix to all physical operators, we 
missed this one. I was looking for this physical operator today and couldn't 
find it, because I was looking for ExchangeExec.

## How was this patch tested?
This is a simple rename and should be covered by existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark SPARK-22153

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19376.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19376


commit 9e9627dba1157499798a06f21d5aedf83dbe9acd
Author: Reynold Xin 
Date:   2017-09-28T04:18:07Z

[SPARK-22153][SQL] Rename ShuffleExchange -> ShuffleExchangeExec

commit 1de6165cbe6abe3af5dee7882b8cd9185493
Author: Reynold Xin 
Date:   2017-09-28T04:19:43Z

Fix 100 char




---




[GitHub] spark pull request #19344: [SPARK-22122][SQL] Respect WITH clauses to count ...

2017-09-27 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/19344#discussion_r141522496
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
 ---
@@ -66,25 +68,23 @@ object TPCDSQueryBenchmark extends Logging {
 classLoader = Thread.currentThread().getContextClassLoader)
 
   // This is an indirect hack to estimate the size of each query's 
input by traversing the
-  // logical plan and adding up the sizes of all tables that appear in 
the plan. Note that this
-  // currently doesn't take WITH subqueries into account which might 
lead to fairly inaccurate
-  // per-row processing time for those cases.
-  val queryRelations = scala.collection.mutable.HashSet[String]()
-  spark.sql(queryString).queryExecution.logical.map {
-case UnresolvedRelation(t: TableIdentifier) =>
-  queryRelations.add(t.table)
-case lp: LogicalPlan =>
-  lp.expressions.foreach { _ foreach {
-case subquery: SubqueryExpression =>
-  subquery.plan.foreach {
-case UnresolvedRelation(t: TableIdentifier) =>
-  queryRelations.add(t.table)
-case _ =>
-  }
-case _ =>
-  }
+  // logical plan and adding up the sizes of all tables that appear in 
the plan.
+  val planToCheck = 
mutable.Stack[LogicalPlan](spark.sql(queryString).queryExecution.logical)
--- End diff --

Oh, yeah. I just followed what the original code does, but the 
suggestion sounds good to me, so I'll update soon. Thanks.


---




[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19330
  
**[Test build #82264 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82264/testReport)**
 for PR 19330 at commit 
[`0b2b52c`](https://github.com/apache/spark/commit/0b2b52c2bc2e03b7583a55f6bfc8f5c3d21a72e1).


---




[GitHub] spark issue #19272: [Spark-21842][Mesos] Support Kerberos ticket renewal and...

2017-09-27 Thread ArtRand
Github user ArtRand commented on the issue:

https://github.com/apache/spark/pull/19272
  
Hey @kalvinnchau, thanks for having the patience to try this. This is a 
curious error, though. 

If you look at the `addAll` method called by 
`UserGroupInformation.addCredentials()`, it should overwrite the current 
credentials. 
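
The overwrite behaviour described here can be sketched with a toy model. This is an assumption-laden illustration, not Hadoop code: `TokenOverwriteSketch` and its `Creds` type are hypothetical stand-ins that map a token alias to a sequence number, and the alias string is only an example.

```scala
object TokenOverwriteSketch {
  // Toy stand-in for a credentials store: token alias -> token sequence number.
  type Creds = Map[String, Int]

  // Overwriting merge: an entry in `update` replaces any entry in `current`
  // that has the same alias; entries unique to either side are kept.
  def addAll(current: Creds, update: Creds): Creds = current ++ update
}
```

Under this model, `TokenOverwriteSketch.addAll(Map("ha-hdfs:hdfs" -> 29), Map("ha-hdfs:hdfs" -> 31))` yields a store that holds only token 31 for that alias, which is the outcome one would expect to see in the executor logs if the update works.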

I tried to reproduce your error but, being less patient, I changed my HDFS 
setup to request that the tokens be updated every minute instead of every day by 
adding the following to hdfs-site.xml:
```
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>6</value>
</property>
```

I added some logging to the executor backend to check if they were indeed 
being updated. 

```
case UpdateDelegationTokens(tokens) =>
  logInfo("Got request to update tokens")
  val oldCreds = UserGroupInformation.getCurrentUser.getCredentials
  for (t <- oldCreds.getAllTokens.asScala) {
    logInfo(s"Old Creds ${DelegationTokenIdentifier.stringifyToken(t)}")
  }
  val creds = SparkHadoopUtil.get.deserialize(tokens)
  for (t <- creds.getAllTokens.asScala) {
    val s = DelegationTokenIdentifier.stringifyToken(t)
    logInfo(s"Got new tokens $s")
  }
  SparkHadoopUtil.get.addDelegationTokens(tokens, env.conf)
  val newCreds = UserGroupInformation.getCurrentUser.getCredentials
  for (t <- newCreds.getAllTokens.asScala) {
    logInfo(s"New creds ${DelegationTokenIdentifier.stringifyToken(t)}")
  }
```

And indeed, when I check the logs, the token number has been updated. 
```
17/09/28 03:32:58 INFO CoarseGrainedExecutorBackend: Got request to update tokens
17/09/28 03:32:58 INFO CoarseGrainedExecutorBackend: Old Creds HDFS_DELEGATION_TOKEN token 29 for hdfs on ha-hdfs:hdfs
17/09/28 03:32:59 INFO CoarseGrainedExecutorBackend: Got new tokens HDFS_DELEGATION_TOKEN token 31 for hdfs on ha-hdfs:hdfs
17/09/28 03:32:59 INFO CoarseGrainedExecutorBackend: New creds HDFS_DELEGATION_TOKEN token 31 for hdfs on ha-hdfs:hdfs
```
Then, some time later (in fact, there was another update in between):
```
17/09/28 03:35:14 INFO CoarseGrainedExecutorBackend: Got request to update tokens
17/09/28 03:35:14 INFO CoarseGrainedExecutorBackend: Old Creds HDFS_DELEGATION_TOKEN token 34 for hdfs on ha-hdfs:hdfs
17/09/28 03:35:14 INFO CoarseGrainedExecutorBackend: Got new tokens HDFS_DELEGATION_TOKEN token 35 for hdfs on ha-hdfs:hdfs
17/09/28 03:35:14 INFO CoarseGrainedExecutorBackend: New creds HDFS_DELEGATION_TOKEN token 35 for hdfs on ha-hdfs:hdfs
```

I will run a 24h experiment to verify, but hopefully there is a way to 
validate that the update is working without waiting that long just to debug!

@vanzin Could you eyeball this? Am I missing something obvious? 


---




[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19370
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82258/
Test PASSed.


---




[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19370
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19370
  
**[Test build #82258 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82258/testReport)**
 for PR 19370 at commit 
[`0b12975`](https://github.com/apache/spark/commit/0b12975b0d224ed4173616b34106c3c3911371b9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82263/
Test FAILed.


---




[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #82263 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82263/testReport)**
 for PR 19222 at commit 
[`f2b27ff`](https://github.com/apache/spark/commit/f2b27ff9689f6b8dbcbdc6c46007d8b6f92ad241).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #82263 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82263/testReport)**
 for PR 19222 at commit 
[`f2b27ff`](https://github.com/apache/spark/commit/f2b27ff9689f6b8dbcbdc6c46007d8b6f92ad241).


---




[GitHub] spark pull request #19344: [SPARK-22122][SQL] Respect WITH clauses to count ...

2017-09-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19344#discussion_r141520921
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
 ---
@@ -66,25 +68,23 @@ object TPCDSQueryBenchmark extends Logging {
 classLoader = Thread.currentThread().getContextClassLoader)
 
   // This is an indirect hack to estimate the size of each query's 
input by traversing the
-  // logical plan and adding up the sizes of all tables that appear in 
the plan. Note that this
-  // currently doesn't take WITH subqueries into account which might 
lead to fairly inaccurate
-  // per-row processing time for those cases.
-  val queryRelations = scala.collection.mutable.HashSet[String]()
-  spark.sql(queryString).queryExecution.logical.map {
-case UnresolvedRelation(t: TableIdentifier) =>
-  queryRelations.add(t.table)
-case lp: LogicalPlan =>
-  lp.expressions.foreach { _ foreach {
-case subquery: SubqueryExpression =>
-  subquery.plan.foreach {
-case UnresolvedRelation(t: TableIdentifier) =>
-  queryRelations.add(t.table)
-case _ =>
-  }
-case _ =>
-  }
+  // logical plan and adding up the sizes of all tables that appear in 
the plan.
+  val planToCheck = 
mutable.Stack[LogicalPlan](spark.sql(queryString).queryExecution.logical)
--- End diff --

The analyzer rule `CTESubstitution` will replace `With`.


---




[GitHub] spark pull request #19344: [SPARK-22122][SQL] Respect WITH clauses to count ...

2017-09-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19344#discussion_r141520163
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
 ---
@@ -66,25 +68,23 @@ object TPCDSQueryBenchmark extends Logging {
 classLoader = Thread.currentThread().getContextClassLoader)
 
   // This is an indirect hack to estimate the size of each query's 
input by traversing the
-  // logical plan and adding up the sizes of all tables that appear in 
the plan. Note that this
-  // currently doesn't take WITH subqueries into account which might 
lead to fairly inaccurate
-  // per-row processing time for those cases.
-  val queryRelations = scala.collection.mutable.HashSet[String]()
-  spark.sql(queryString).queryExecution.logical.map {
-case UnresolvedRelation(t: TableIdentifier) =>
-  queryRelations.add(t.table)
-case lp: LogicalPlan =>
-  lp.expressions.foreach { _ foreach {
-case subquery: SubqueryExpression =>
-  subquery.plan.foreach {
-case UnresolvedRelation(t: TableIdentifier) =>
-  queryRelations.add(t.table)
-case _ =>
-  }
-case _ =>
-  }
+  // logical plan and adding up the sizes of all tables that appear in 
the plan.
+  val planToCheck = 
mutable.Stack[LogicalPlan](spark.sql(queryString).queryExecution.logical)
--- End diff --

Why not use the plan that has been analyzed?
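
To illustrate the question, here is a minimal, self-contained sketch of why traversing a plan after CTE substitution can find tables that a traversal of the raw parsed plan misses. Everything in it is an assumption made for illustration: `Plan`, `Relation`, `CteRef`, `Join`, `With`, `substituteCtes`, and `collectTables` are toy stand-ins, not Spark's classes or the code in this PR.

```scala
import scala.collection.mutable

// Toy stand-ins for logical plan nodes; these are NOT Spark's classes.
sealed trait Plan { def children: Seq[Plan] }
case class Relation(table: String) extends Plan { val children: Seq[Plan] = Nil }
case class CteRef(name: String) extends Plan { val children: Seq[Plan] = Nil }
case class Join(left: Plan, right: Plan) extends Plan {
  val children: Seq[Plan] = Seq(left, right)
}
// Models "WITH name AS (definition) body".
case class With(name: String, definition: Plan, body: Plan) extends Plan {
  // Assumption: only the body is a regular child, so a naive traversal
  // never visits the CTE definition.
  val children: Seq[Plan] = Seq(body)
}

object PlanSketch {
  // Inline every CTE reference, roughly what a substitution rule achieves.
  def substituteCtes(plan: Plan, ctes: Map[String, Plan] = Map.empty): Plan = plan match {
    case With(name, definition, body) =>
      substituteCtes(body, ctes + (name -> substituteCtes(definition, ctes)))
    case CteRef(name) => ctes.getOrElse(name, plan)
    case Join(l, r) => Join(substituteCtes(l, ctes), substituteCtes(r, ctes))
    case r: Relation => r
  }

  // Stack-based traversal that gathers every table name reachable via children.
  def collectTables(plan: Plan): Set[String] = {
    val tables = mutable.HashSet[String]()
    var stack = List(plan)
    while (stack.nonEmpty) {
      val node = stack.head
      stack = stack.tail
      node match {
        case Relation(t) => tables += t
        case other => stack = other.children.toList ::: stack
      }
    }
    tables.toSet
  }
}
```

In this toy model, for `With("recent", Relation("store_sales"), Join(CteRef("recent"), Relation("date_dim")))`, `collectTables` on the raw plan finds only `date_dim`, while `collectTables(substituteCtes(plan))` finds both `store_sales` and `date_dim`.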


---




[GitHub] spark pull request #19361: [SPARK-22140] Add TPCDSQuerySuite

2017-09-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19361#discussion_r141519844
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala ---
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.spark.sql.catalyst.util.resourceToString
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSQLContext
+
+class TPCDSQuerySuite extends QueryTest with SharedSQLContext with 
BeforeAndAfterAll {
+
+  /**
+   * Drop all the tables
+   */
+  protected override def afterAll(): Unit = {
+try {
+  spark.sessionState.catalog.reset()
+} finally {
+  super.afterAll()
+}
+  }
+
+  override def beforeAll() {
+super.beforeAll()
+sql(
+  """
+|CREATE TABLE `catalog_page` (
+|`cp_catalog_page_sk` INT, `cp_catalog_page_id` STRING, 
`cp_start_date_sk` INT,
+|`cp_end_date_sk` INT, `cp_department` STRING, `cp_catalog_number` 
INT,
+|`cp_catalog_page_number` INT, `cp_description` STRING, `cp_type` 
STRING)
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `catalog_returns` (
+|`cr_returned_date_sk` INT, `cr_returned_time_sk` INT, 
`cr_item_sk` INT,
+|`cr_refunded_customer_sk` INT, `cr_refunded_cdemo_sk` INT, 
`cr_refunded_hdemo_sk` INT,
+|`cr_refunded_addr_sk` INT, `cr_returning_customer_sk` INT, 
`cr_returning_cdemo_sk` INT,
+|`cr_returning_hdemo_sk` INT, `cr_returning_addr_sk` INT, 
`cr_call_center_sk` INT,
+|`cr_catalog_page_sk` INT, `cr_ship_mode_sk` INT, 
`cr_warehouse_sk` INT, `cr_reason_sk` INT,
+|`cr_order_number` INT, `cr_return_quantity` INT, 
`cr_return_amount` DECIMAL(7,2),
+|`cr_return_tax` DECIMAL(7,2), `cr_return_amt_inc_tax` 
DECIMAL(7,2), `cr_fee` DECIMAL(7,2),
+|`cr_return_ship_cost` DECIMAL(7,2), `cr_refunded_cash` 
DECIMAL(7,2),
+|`cr_reversed_charge` DECIMAL(7,2), `cr_store_credit` DECIMAL(7,2),
+|`cr_net_loss` DECIMAL(7,2))
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `customer` (
+|`c_customer_sk` INT, `c_customer_id` STRING, `c_current_cdemo_sk` 
INT,
+|`c_current_hdemo_sk` INT, `c_current_addr_sk` INT, 
`c_first_shipto_date_sk` INT,
+|`c_first_sales_date_sk` INT, `c_salutation` STRING, 
`c_first_name` STRING,
+|`c_last_name` STRING, `c_preferred_cust_flag` STRING, 
`c_birth_day` INT,
+|`c_birth_month` INT, `c_birth_year` INT, `c_birth_country` 
STRING, `c_login` STRING,
+|`c_email_address` STRING, `c_last_review_date` STRING)
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `customer_address` (
+|`ca_address_sk` INT, `ca_address_id` STRING, `ca_street_number` 
STRING,
+|`ca_street_name` STRING, `ca_street_type` STRING, 
`ca_suite_number` STRING,
+|`ca_city` STRING, `ca_county` STRING, `ca_state` STRING, `ca_zip` 
STRING,
+|`ca_country` STRING, `ca_gmt_offset` DECIMAL(5,2), 
`ca_location_type` STRING)
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `customer_demographics` (
+|`cd_demo_sk` INT, `cd_gender` STRING, `cd_marital_status` STRING,
+|`cd_education_status` STRING, `cd_purchase_estimate` INT, 
`cd_credit_rating` STRING,
+|`cd_dep_count` INT, `cd_dep_employed_count` INT, 
`cd_dep_college_count` INT)
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `date_dim` (
+|`d_date_sk` INT, `d_date_id` STRING, `d_date` STRING,
+|`d_month_seq` INT, `d_week_seq

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19287
  
**[Test build #82262 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82262/testReport)**
 for PR 19287 at commit 
[`71b0d58`](https://github.com/apache/spark/commit/71b0d5821ca4d2738fb608073acbb8f0ba1d8d29).


---




[GitHub] spark issue #19344: [SPARK-22122][SQL] Respect WITH clauses to count input r...

2017-09-27 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/19344
  
ping


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19218
  
**[Test build #82261 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82261/testReport)**
 for PR 19218 at commit 
[`7615939`](https://github.com/apache/spark/commit/7615939268e1bea213e6511dcf6e5346ec21fb23).


---




[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/19287#discussion_r141513379
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala ---
@@ -74,6 +81,10 @@ class TaskInfo(
 gettingResultTime = time
   }
 
+  private[spark] def markKilledAttempt: Unit = {
--- End diff --

Sorry for missing that; I'll change it right now.


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19218
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82260/
Test FAILed.


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19218
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19218
  
**[Test build #82260 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82260/testReport)**
 for PR 19218 at commit 
[`fd73145`](https://github.com/apache/spark/commit/fd731457e774895a4e28c3d5118ab5e6113151d6).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19218
  
**[Test build #82260 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82260/testReport)**
 for PR 19218 at commit 
[`fd73145`](https://github.com/apache/spark/commit/fd731457e774895a4e28c3d5118ab5e6113151d6).


---




[GitHub] spark pull request #19287: [SPARK-22074][Core] Task killed by other attempt ...

2017-09-27 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/19287#discussion_r141510861
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskInfo.scala ---
@@ -74,6 +81,10 @@ class TaskInfo(
 gettingResultTime = time
   }
 
+  private[spark] def markKilledAttempt: Unit = {
--- End diff --

I think I suggested adding parentheses to this method signature, `def 
markKilledByOtherAttempt()`; can you please change it?

> It is better to change the method signature to `def 
markKilledByOtherAttempt(): Unit = {`. Since this method has a side effect, it is 
better to add parentheses.
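
As a small illustration of the convention being requested, the sketch below declares a side-effecting method with parentheses and a pure accessor without them. `AttemptInfo` and its members are hypothetical names for this example only; this is not Spark's `TaskInfo`.

```scala
// Hypothetical, simplified stand-in for a task-tracking class; not Spark's TaskInfo.
class AttemptInfo {
  private var killedByOther = false

  // Side-effecting: declared (and called) with parentheses.
  def markKilledByOtherAttempt(): Unit = {
    killedByOther = true
  }

  // Pure accessor: no parentheses, so it reads like a field.
  def killedByOtherAttempt: Boolean = killedByOther
}
```

A caller then writes `info.markKilledByOtherAttempt()` for the mutation and `info.killedByOtherAttempt` for the read, which makes the state change visible at the call site.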


---




[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19363
  
**[Test build #82259 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82259/testReport)**
 for PR 19363 at commit 
[`289b23d`](https://github.com/apache/spark/commit/289b23d2536cf190a40f4a5494a57e0854771e10).


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19374
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19374
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82256/
Test PASSed.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19374
  
**[Test build #82256 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82256/testReport)**
 for PR 19374 at commit 
[`675d5b6`](https://github.com/apache/spark/commit/675d5b6002bea13fdecf1bc834e02bce064b7e0a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-09-27 Thread yaooqinn
Github user yaooqinn commented on a diff in the pull request:

https://github.com/apache/spark/pull/19363#discussion_r141509523
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ---
@@ -54,6 +55,14 @@ class KeyValueGroupedDataset[K, V] private[sql](
   private def sparkSession = queryExecution.sparkSession
 
   /**
+   * Returns the schema of this Dataset.
+   *
+   * @group basic
+   * @since 2.3.0
+   */
+  def schema: StructType = queryExecution.analyzed.schema
--- End diff --

ok


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19374
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19374
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82255/
Test PASSed.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19374
  
**[Test build #82255 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82255/testReport)**
 for PR 19374 at commit 
[`5f7a187`](https://github.com/apache/spark/commit/5f7a1878cf52bf5dc3df204497eff02e5a2c2e86).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19374
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19374
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82254/
Test PASSed.


---




[GitHub] spark issue #19358: [SPARK-22135] [MESOS] metrics in spark-dispatcher not be...

2017-09-27 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19358
  
LGTM.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19374
  
**[Test build #82254 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82254/testReport)**
 for PR 19374 at commit 
[`1f102fd`](https://github.com/apache/spark/commit/1f102fd4271043a03e137f229b610e481bca53e0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #19338: [SPARK-22123][CORE] Add latest failure reason for...

2017-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19338


---




[GitHub] spark pull request #19375: [MINOR] Fixed up pandas_udf related docs and form...

2017-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19375


---




[GitHub] spark issue #19375: [MINOR] Fixed up pandas_udf related docs and formatting

2017-09-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19375
  
Merged to master.


---




[GitHub] spark issue #19338: [SPARK-22123][CORE] Add latest failure reason for task s...

2017-09-27 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19338
  
LGTM, merging to master.


---




[GitHub] spark issue #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows ...

2017-09-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19090
  
@jsnowacki, would you mind double-checking this PR when you have some 
time?


---




[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-09-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19370
  
cc @holdenk, @felixcheung and @ueshin who I believe are interested in this.


---




[GitHub] spark pull request #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on...

2017-09-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19370#discussion_r141503492
  
--- Diff: bin/pyspark2.cmd ---
@@ -18,7 +18,12 @@ rem limitations under the License.
 rem
 
 rem Figure out where the Spark framework is installed
-set SPARK_HOME=%~dp0..
+set FIND_SPARK_HOME_SCRIPT=%~dp0find_spark_home.py
+if exist "%FIND_SPARK_HOME_SCRIPT%" (
+  for /f %%i in ('python %FIND_SPARK_HOME_SCRIPT%') do set SPARK_HOME=%%i
--- End diff --

Mind adding some comments? I believe this mirrors the logic here:


https://github.com/apache/spark/blob/9244957b500cb2b458c32db2c63293a1444690d7/bin/find-spark-home#L28-L40

which detects `find_spark_home.py`, a script that should be included in a pip 
installation:


https://github.com/apache/spark/blob/aad2125475dcdeb4a0410392b6706511db17bac4/python/setup.py#L143-L145

It'd be nicer if the PR description explained this.
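
For readers who do not work with batch files, the decision being discussed can be sketched in Scala as follows. This is only an illustration under stated assumptions: `SparkHomeSketch`, `resolveSparkHome`, and the `runFinder` hook (a stand-in for running `python find_spark_home.py` and reading its output) are hypothetical names, and the fallback branch is inferred from the removed `set SPARK_HOME=%~dp0..` line in the diff.

```scala
import java.nio.file.{Files, Path, Paths}

object SparkHomeSketch {
  // `runFinder` is a hypothetical hook standing in for
  // "run python find_spark_home.py and read what it prints".
  def resolveSparkHome(binDir: Path, runFinder: Path => String): Path = {
    val finder = binDir.resolve("find_spark_home.py")
    if (Files.exists(finder)) {
      // Script is present next to the launcher: ask it for the location.
      Paths.get(runFinder(finder))
    } else {
      // Assumed fallback: the parent of the bin directory.
      binDir.resolve("..").normalize()
    }
  }
}
```

The point of the sketch is the branch itself: the presence of `find_spark_home.py` beside the launcher decides whether the location is computed by the script or derived from the launcher's own directory.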


---




[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19370
  
**[Test build #82258 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82258/testReport)**
 for PR 19370 at commit 
[`0b12975`](https://github.com/apache/spark/commit/0b12975b0d224ed4173616b34106c3c3911371b9).


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19374
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82253/
Test FAILed.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19374
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19374
  
**[Test build #82253 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82253/testReport)**
 for PR 19374 at commit 
[`0e5e5e0`](https://github.com/apache/spark/commit/0e5e5e0ef0d2ba030af71132955c63aadf4ca970).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-09-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19370
  
ok to test


---




[GitHub] spark pull request #19361: [SPARK-22140] Add TPCDSQuerySuite

2017-09-27 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/19361#discussion_r141501351
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala ---
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.spark.sql.catalyst.util.resourceToString
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSQLContext
+
+class TPCDSQuerySuite extends QueryTest with SharedSQLContext with 
BeforeAndAfterAll {
+
+  /**
+   * Drop all the tables
+   */
+  protected override def afterAll(): Unit = {
+try {
+  spark.sessionState.catalog.reset()
+} finally {
+  super.afterAll()
+}
+  }
+
+  override def beforeAll() {
+super.beforeAll()
+sql(
+  """
+|CREATE TABLE `catalog_page` (
+|`cp_catalog_page_sk` INT, `cp_catalog_page_id` STRING, 
`cp_start_date_sk` INT,
+|`cp_end_date_sk` INT, `cp_department` STRING, `cp_catalog_number` 
INT,
+|`cp_catalog_page_number` INT, `cp_description` STRING, `cp_type` 
STRING)
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `catalog_returns` (
+|`cr_returned_date_sk` INT, `cr_returned_time_sk` INT, 
`cr_item_sk` INT,
+|`cr_refunded_customer_sk` INT, `cr_refunded_cdemo_sk` INT, 
`cr_refunded_hdemo_sk` INT,
+|`cr_refunded_addr_sk` INT, `cr_returning_customer_sk` INT, 
`cr_returning_cdemo_sk` INT,
+|`cr_returning_hdemo_sk` INT, `cr_returning_addr_sk` INT, 
`cr_call_center_sk` INT,
+|`cr_catalog_page_sk` INT, `cr_ship_mode_sk` INT, 
`cr_warehouse_sk` INT, `cr_reason_sk` INT,
+|`cr_order_number` INT, `cr_return_quantity` INT, 
`cr_return_amount` DECIMAL(7,2),
+|`cr_return_tax` DECIMAL(7,2), `cr_return_amt_inc_tax` 
DECIMAL(7,2), `cr_fee` DECIMAL(7,2),
+|`cr_return_ship_cost` DECIMAL(7,2), `cr_refunded_cash` 
DECIMAL(7,2),
+|`cr_reversed_charge` DECIMAL(7,2), `cr_store_credit` DECIMAL(7,2),
+|`cr_net_loss` DECIMAL(7,2))
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `customer` (
+|`c_customer_sk` INT, `c_customer_id` STRING, `c_current_cdemo_sk` 
INT,
+|`c_current_hdemo_sk` INT, `c_current_addr_sk` INT, 
`c_first_shipto_date_sk` INT,
+|`c_first_sales_date_sk` INT, `c_salutation` STRING, 
`c_first_name` STRING,
+|`c_last_name` STRING, `c_preferred_cust_flag` STRING, 
`c_birth_day` INT,
+|`c_birth_month` INT, `c_birth_year` INT, `c_birth_country` 
STRING, `c_login` STRING,
+|`c_email_address` STRING, `c_last_review_date` STRING)
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `customer_address` (
+|`ca_address_sk` INT, `ca_address_id` STRING, `ca_street_number` 
STRING,
+|`ca_street_name` STRING, `ca_street_type` STRING, 
`ca_suite_number` STRING,
+|`ca_city` STRING, `ca_county` STRING, `ca_state` STRING, `ca_zip` 
STRING,
+|`ca_country` STRING, `ca_gmt_offset` DECIMAL(5,2), 
`ca_location_type` STRING)
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `customer_demographics` (
+|`cd_demo_sk` INT, `cd_gender` STRING, `cd_marital_status` STRING,
+|`cd_education_status` STRING, `cd_purchase_estimate` INT, 
`cd_credit_rating` STRING,
+|`cd_dep_count` INT, `cd_dep_employed_count` INT, 
`cd_dep_college_count` INT)
+|USING parquet
+  """.stripMargin)
+
+sql(
+  """
+|CREATE TABLE `date_dim` (
+|`d_date_sk` INT, `d_date_id` STRING, `d_date` STRING,
+|`d_month_seq` INT, `d_week_seq`

[GitHub] spark issue #19267: [WIP][SPARK-20628][CORE] Blacklist nodes when they trans...

2017-09-27 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/19267
  
@juanrh do you plan on working more on this before removing the "WIP"? Not 
sure what your expectation is here. People generally skip over "WIP"s with so 
many other PRs to look at.


---




[GitHub] spark pull request #19362: [SPARK-22141][SQL] Propagate empty relation befor...

2017-09-27 Thread caneGuy
Github user caneGuy commented on a diff in the pull request:

https://github.com/apache/spark/pull/19362#discussion_r141301509
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -136,6 +134,8 @@ abstract class Optimizer(sessionCatalog: SessionCatalog)
 Batch("LocalRelation", fixedPoint,
   ConvertToLocalRelation,
   PropagateEmptyRelation) ::
+Batch("Check Cartesian Products", Once,
--- End diff --

I think the comment on `CheckCartesianProducts` should also be updated to 
mention this constraint.


---




[GitHub] spark issue #19375: [MINOR] Fixed up pandas_udf related docs and formatting

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19375
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82257/
Test PASSed.


---




[GitHub] spark issue #19375: [MINOR] Fixed up pandas_udf related docs and formatting

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19375
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19375: [MINOR] Fixed up pandas_udf related docs and formatting

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19375
  
**[Test build #82257 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82257/testReport)**
 for PR 19375 at commit 
[`70f4c45`](https://github.com/apache/spark/commit/70f4c45a97cfb18ca3d2f19501956cdf948750e3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #19361: [SPARK-22140] Add TPCDSQuerySuite

2017-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19361


---




[GitHub] spark issue #19361: [SPARK-22140] Add TPCDSQuerySuite

2017-09-27 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19361
  
Thanks! Merged to master/2.2


---




[GitHub] spark issue #19375: [MINOR] Fixed up pandas_udf related docs and formatting

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19375
  
**[Test build #82257 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82257/testReport)**
 for PR 19375 at commit 
[`70f4c45`](https://github.com/apache/spark/commit/70f4c45a97cfb18ca3d2f19501956cdf948750e3).


---




[GitHub] spark issue #19375: [MINOR] Fixed up pandas_udf related docs and formatting

2017-09-27 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/19375
  
@HyukjinKwon @ueshin hopefully this is ok to do without a related JIRA, 
just some minor cleanup.  Thanks!


---




[GitHub] spark pull request #19375: [MINOR] Fixed up pandas_udf related docs and form...

2017-09-27 Thread BryanCutler
GitHub user BryanCutler opened a pull request:

https://github.com/apache/spark/pull/19375

[MINOR] Fixed up pandas_udf related docs and formatting

## What changes were proposed in this pull request?

Fixed some minor issues with pandas_udf related docs and formatting.

## How was this patch tested?

NA


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/BryanCutler/spark arrow-pandas_udf-cleanup-minor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19375.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19375


commit 70f4c45a97cfb18ca3d2f19501956cdf948750e3
Author: Bryan Cutler 
Date:   2017-09-27T23:47:01Z

fixed up some doc typos and formatting




---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19374
  
**[Test build #82256 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82256/testReport)** for PR 19374 at commit [`675d5b6`](https://github.com/apache/spark/commit/675d5b6002bea13fdecf1bc834e02bce064b7e0a).


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19374
  
**[Test build #82255 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82255/testReport)** for PR 19374 at commit [`5f7a187`](https://github.com/apache/spark/commit/5f7a1878cf52bf5dc3df204497eff02e5a2c2e86).


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19374
  
**[Test build #82254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82254/testReport)** for PR 19374 at commit [`1f102fd`](https://github.com/apache/spark/commit/1f102fd4271043a03e137f229b610e481bca53e0).


---




[GitHub] spark issue #19354: [SPARK-20992][Scheduler] Add links in documentation to N...

2017-09-27 Thread rcgenova
Github user rcgenova commented on the issue:

https://github.com/apache/spark/pull/19354
  
We don't see much in the way of Kubernetes pull requests against Spark. Can 
you elaborate on exactly what you mean by "it's being integrated now"?


---




[GitHub] spark pull request #19194: [SPARK-20589] Allow limiting task concurrency per...

2017-09-27 Thread dhruve
Github user dhruve commented on a diff in the pull request:

https://github.com/apache/spark/pull/19194#discussion_r141482741
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -512,6 +535,9 @@ private[spark] class TaskSetManager(
   serializedTask)
   }
 } else {
+  if (runningTasks >= maxConcurrentTasks) {
+logDebug("Already running max. no. of concurrent tasks.")
--- End diff --

I'll make this change and also update the comments to explain the 
behavior so far. Also, I am not clear from the earlier reply what the 
resolution was for accounting for the activeJobId. Do you have any further input?
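
A minimal standalone sketch of the guard shown in the diff above, for readers without the full file at hand. It is not the PR's code: `ConcurrencyLimitedQueue`, `tryLaunch`, and `taskFinished` are hypothetical names, and only `runningTasks` and `maxConcurrentTasks` come from the diff.

```scala
// Hypothetical, simplified stand-in for the scheduling check in the diff;
// this is not Spark's TaskSetManager.
final class ConcurrencyLimitedQueue(maxConcurrentTasks: Int) {
  require(maxConcurrentTasks > 0, "maxConcurrentTasks must be positive")

  private var runningTasks = 0

  /** Returns true if a task slot was claimed, false if the cap is already reached. */
  def tryLaunch(): Boolean = {
    if (runningTasks >= maxConcurrentTasks) {
      // Corresponds to the logDebug branch in the diff: launch nothing.
      false
    } else {
      runningTasks += 1
      true
    }
  }

  /** Frees one slot when a task finishes. */
  def taskFinished(): Unit = {
    if (runningTasks > 0) runningTasks -= 1
  }

  /** Number of tasks currently counted as running. */
  def running: Int = runningTasks
}
```

With a cap of 2, the third `tryLaunch()` call is declined until `taskFinished()` frees a slot.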


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/19374
  
@ArtRand @susanxhuynh please review.


---




[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-09-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19374
  
**[Test build #82253 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82253/testReport)** for PR 19374 at commit [`0e5e5e0`](https://github.com/apache/spark/commit/0e5e5e0ef0d2ba030af71132955c63aadf4ca970).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-27 Thread dhruve
Github user dhruve commented on the issue:

https://github.com/apache/spark/pull/19194
  
Configuring at the stage level seems to be the appropriate and more 
deterministic choice. If we agree on changing the API, we can start another 
effort looking in that direction. Till then we can mark this feature as 
experimental or have an undocumented config.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-09-27 Thread skonto
GitHub user skonto opened a pull request:

https://github.com/apache/spark/pull/19374

[SPARK-22145][MESOS] fix supervise with checkpointing on mesos

## What changes were proposed in this pull request?

- Fixes the issue with frameworkId being recovered from checkpointed data.
- Keeps the submission driver id as the only index for all data structures in 
the dispatcher.
- Allocates a different task id per driver retry to satisfy the Mesos 
requirements.

Check the relevant ticket.

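
As a rough illustration of the per-retry task id idea described above, here is a sketch under stated assumptions. It is not the patch itself: `DriverTaskIds`, its method names, and the `-retry-` separator are hypothetical and chosen only for this example.

```scala
// Hypothetical sketch: the object name, method names, and the "-retry-"
// separator are assumptions for illustration, not taken from the patch.
object DriverTaskIds {
  private val RetrySeparator = "-retry-"

  /** Task id for an attempt: the bare submission id first, then a suffixed id per retry. */
  def taskId(submissionId: String, retries: Int): String = {
    require(retries >= 0, "retries must be non-negative")
    if (retries == 0) submissionId else s"$submissionId$RetrySeparator$retries"
  }

  /** Recovers the submission id, which stays the single index key, from any task id. */
  def submissionId(taskId: String): String = {
    val idx = taskId.indexOf(RetrySeparator)
    if (idx < 0) taskId else taskId.substring(0, idx)
  }
}
```

Each retry gets a distinct task id, while every lookup still maps back to the same submission id. The sketch assumes the submission id never contains the separator itself.
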
## How was this patch tested?

Tested manually: launched a streaming job with checkpointing to HDFS, 
made the driver fail several times, and observed the behavior:

![image](https://user-images.githubusercontent.com/7945591/30940500-f7d2a744-a3e9-11e7-8c56-f2ccbb271e80.png)


![image](https://user-images.githubusercontent.com/7945591/30940550-19bc15de-a3ea-11e7-8a11-f48abfe36720.png)


![image](https://user-images.githubusercontent.com/7945591/30940524-083ea308-a3ea-11e7-83ae-00d3fa17b928.png)


![image](https://user-images.githubusercontent.com/7945591/30940579-2f0fb242-a3ea-11e7-82f9-86179da28b8c.png)


![image](https://user-images.githubusercontent.com/7945591/30940591-3b561b0e-a3ea-11e7-9dbd-e71912bb2ef3.png)


![image](https://user-images.githubusercontent.com/7945591/30940605-49c810ca-a3ea-11e7-8af5-67930851fd38.png)


![image](https://user-images.githubusercontent.com/7945591/30940631-59f4a288-a3ea-11e7-88cb-c3741b72bb13.png)


![image](https://user-images.githubusercontent.com/7945591/30940642-62346c9e-a3ea-11e7-8935-82e494925f67.png)


![image](https://user-images.githubusercontent.com/7945591/30940653-6c46d53c-a3ea-11e7-8dd1-5840d484d28c.png)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/skonto/spark fix_retry

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19374.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19374


commit 0e5e5e0ef0d2ba030af71132955c63aadf4ca970
Author: Stavros Kontopoulos 
Date:   2017-09-27T22:04:38Z

fix supervise with checkpointing




---




[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19270
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82252/


---




[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19270
  
Merged build finished. Test PASSed.


---



