[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14971
  
**[Test build #65220 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65220/consoleFull)**
 for PR 14971 at commit 
[`d3dcb56`](https://github.com/apache/spark/commit/d3dcb564509fd2a32a3fadefb811495affaaa466).





[GitHub] spark issue #14947: [SPARK-17388][SQL] Support for inferring type date/times...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14947
  
**[Test build #65219 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65219/consoleFull)**
 for PR 14947 at commit 
[`cda9d7a`](https://github.com/apache/spark/commit/cda9d7a3daea3b13398b20fadf06be4d8620f493).





[GitHub] spark issue #14947: [SPARK-17388][SQL] Support for inferring type date/times...

2016-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14947
  
Hi @davies, it seems you made some changes related to this before. Could you please take a look?





[GitHub] spark issue #14388: [SPARK-16362][SQL] Support ArrayType and StructType in v...

2016-09-10 Thread mallman
Github user mallman commented on the issue:

https://github.com/apache/spark/pull/14388
  
@viirya Any progress on this?





[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...

2016-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15049
  
cc @davies @andreweduffy @rdblue 





[GitHub] spark issue #15049: [SPARK-17310][SQL] Add an option to disable record-level...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15049
  
**[Test build #65218 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65218/consoleFull)**
 for PR 15049 at commit 
[`7b2e27e`](https://github.com/apache/spark/commit/7b2e27e5e6510679323def779cf0c2f99b195adc).





[GitHub] spark pull request #15049: [SPARK-17310][SQL] Add an option to disable recor...

2016-09-10 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/15049

[SPARK-17310][SQL] Add an option to disable record-level filter in 
Parquet-side

## What changes were proposed in this pull request?

There is a concern that Spark-side codegen row-by-row filtering might generally be faster than Parquet's, because Parquet's record-level filter incurs type boxing and virtual function calls that Spark's generated code avoids.

So, this PR adds an option to enable/disable record-by-record filtering on the Parquet side.

This was also discussed in https://github.com/apache/spark/pull/14671.
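
A hedged usage sketch of the proposed option (it toggles the same setting the benchmark below drives through `SQLConf.PARQUET_RECORD_FILTER_ENABLED`; the literal key name here is an assumption, not confirmed in this description):

```scala
// Rely on Spark-side codegen filtering only; Parquet still prunes row groups
// via pushed-down filters but skips its record-by-record filter.
// (Key name assumed; see SQLConf.PARQUET_RECORD_FILTER_ENABLED in this PR.)
spark.conf.set("spark.sql.parquet.recordLevelFilter.enabled", "false")
spark.read.parquet("/tmp/data").filter("a = 5").count()
```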

## How was this patch tested?

Benchmarks were performed manually. I generated a billion (1,000,000,000) records and tested equality comparisons concatenated with `OR`, with the number of filters ranging from 5 to 30.

Spark-side filtering is indeed faster in this test case, and the gap grows as the filter tree becomes larger.

The details are as below:

**Code**

```scala
test("Parquet-side filter vs Spark-side filter - record by record") {
  withTempPath { path =>
    val N = 1000 * 1000 * 1000
    val df = spark.range(N).toDF("a")
    df.write.parquet(path.getAbsolutePath)

    val benchmark = new Benchmark("Parquet-side vs Spark-side", N)
    Seq(5, 10, 20, 30).foreach { num =>
      val filterExpr = (0 to num).map(i => s"a = $i").mkString(" OR ")

      benchmark.addCase(s"Parquet-side filter - number of filters [$num]", 3) { _ =>
        withSQLConf(
            SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> false.toString,
            SQLConf.PARQUET_RECORD_FILTER_ENABLED.key -> true.toString) {
          // We should strip the Spark-side filter to compare correctly.
          stripSparkFilter(
            spark.read.parquet(path.getAbsolutePath).filter(filterExpr)).count()
        }
      }

      benchmark.addCase(s"Spark-side filter - number of filters [$num]", 3) { _ =>
        withSQLConf(
            SQLConf.PARQUET_VECTORIZED_READER_ENABLED.key -> false.toString,
            SQLConf.PARQUET_RECORD_FILTER_ENABLED.key -> false.toString) {
          spark.read.parquet(path.getAbsolutePath).filter(filterExpr).count()
        }
      }
    }

    benchmark.run()
  }
}
```

**Result**

```
Parquet-side vs Spark-side:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------
Parquet-side filter - number of filters [5]       4268 / 4367        234.3          4.3        0.8X
Spark-side filter - number of filters [5]         3709 / 3741        269.6          3.7        0.9X
Parquet-side filter - number of filters [10]      5673 / 5727        176.3          5.7        0.6X
Spark-side filter - number of filters [10]        3588 / 3632        278.7          3.6        0.9X
Parquet-side filter - number of filters [20]      8024 / 8440        124.6          8.0        0.4X
Spark-side filter - number of filters [20]        3912 / 3946        255.6          3.9        0.8X
Parquet-side filter - number of filters [30]     11936 / 12041        83.8         11.9        0.3X
Spark-side filter - number of filters [30]        3929 / 3978        254.5          3.9        0.8X
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-17310

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15049.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15049


commit 7b2e27e5e6510679323def779cf0c2f99b195adc
Author: hyukjinkwon 
Date:   2016-09-11T04:34:21Z

Add an option to disable record-level filter in Parquet-side







[GitHub] spark issue #15048: [SPARK-17409] [SQL] Do Not Optimize Query in CTAS More T...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15048
  
**[Test build #65217 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65217/consoleFull)**
 for PR 15048 at commit 
[`da7deed`](https://github.com/apache/spark/commit/da7deed2e1e9e350affcee909159a200a4b7d5b8).





[GitHub] spark pull request #15048: [SPARK-17409] [SQL] Do Not Optimize Query in CTAS...

2016-09-10 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/15048

[SPARK-17409] [SQL] Do Not Optimize Query in CTAS More Than Once

### What changes were proposed in this pull request?
As explained in https://github.com/apache/spark/pull/14797:
> Some analyzer rules make assumptions about logical plans, and the optimizer may break those assumptions. We should not pass an optimized query plan into QueryExecution (it will be analyzed again), otherwise we may hit some weird bugs.
> For example, we have a rule for decimal calculation that promotes the precision before binary operations and uses PromotePrecision as a placeholder to indicate that the rule should not be applied twice. But an Optimizer rule will remove this placeholder, which breaks the assumption; the rule then gets applied twice and causes a wrong result.

We should not optimize the query in CTAS more than once. For example, 
```Scala
spark.range(99, 101).createOrReplaceTempView("tab1")
val sqlStmt = "SELECT id, cast(id as long) * cast('1.0' as decimal(38, 18)) as num FROM tab1"
sql(s"CREATE TABLE tab2 USING PARQUET AS $sqlStmt")
checkAnswer(spark.table("tab2"), sql(sqlStmt))
```
Before this PR, the results do not match
```
== Results ==
!== Correct Answer - 2 ==   == Spark Answer - 2 ==
![100,100.00]   [100,null]
 [99,99.00] [99,99.00]
```
After this PR, the results match.
```
+---+--+
|id |num   |
+---+--+
|99 |99.00 |
|100|100.00|
+---+--+
```

In this PR, we do not treat the `query` in CTAS as a child. Thus, the `query` will not be optimized when optimizing the CTAS statement. However, we still need to analyze it in order to normalize and verify the CTAS in the Analyzer. We do this in the analyzer rule `PreprocessDDL`, because so far only this rule needs the analyzed plan of the `query`.

### How was this patch tested?
Added a test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark ctasOptimized

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15048.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15048


commit f7941e846c5ed42a4453518500fbf4938f3f1032
Author: gatorsmile 
Date:   2016-09-11T04:09:02Z

fix

commit 3a203f920abf742b2f2ab344d0231f992d8e5355
Author: gatorsmile 
Date:   2016-09-11T04:20:39Z

Merge remote-tracking branch 'upstream/master' into ctasOptimized

commit da7deed2e1e9e350affcee909159a200a4b7d5b8
Author: gatorsmile 
Date:   2016-09-11T04:38:07Z

one more test case







[GitHub] spark issue #15044: [WIP][SQL][SPARK-17490] Optimize SerializeFromObject() f...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15044
  
**[Test build #65216 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65216/consoleFull)**
 for PR 15044 at commit 
[`2b22d12`](https://github.com/apache/spark/commit/2b22d128ef4c51643cd4dcdbe17a1f3d28362a90).





[GitHub] spark issue #14788: [SPARK-17174][SQL] Add the support for TimestampType for...

2016-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14788
  
Hi @cloud-fan and @hvanhovell, could I ask you to take another look?





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13758
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65215/
Test PASSed.





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13758
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13758
  
**[Test build #65215 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65215/consoleFull)**
 for PR 13758 at commit 
[`80a9038`](https://github.com/apache/spark/commit/80a90385f469e7bce24f456467de3ac6821b771a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13758
  
**[Test build #65215 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65215/consoleFull)**
 for PR 13758 at commit 
[`80a9038`](https://github.com/apache/spark/commit/80a90385f469e7bce24f456467de3ac6821b771a).





[GitHub] spark issue #15040: [WIP] [SPARK-17487] [SQL] Configurable bucketing info ex...

2016-09-10 Thread tejasapatil
Github user tejasapatil commented on the issue:

https://github.com/apache/spark/pull/15040
  
@cloud-fan : cc'ing you as you have a lot of context about bucketing in Spark. I am looking for early feedback on the approach of this change. I have included details in the PR description.





[GitHub] spark issue #15047: [SPARK-17495] [SQL] Add Hash capability semantically equ...

2016-09-10 Thread tejasapatil
Github user tejasapatil commented on the issue:

https://github.com/apache/spark/pull/15047
  
@rxin : can you recommend someone to review this PR?





[GitHub] spark issue #15047: [SPARK-17495] [SQL] Add Hash capability semantically equ...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15047
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15047: [SPARK-17495] [SQL] Add Hash capability semantically equ...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15047
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65214/
Test PASSed.





[GitHub] spark issue #15047: [SPARK-17495] [SQL] Add Hash capability semantically equ...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15047
  
**[Test build #65214 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65214/consoleFull)**
 for PR 15047 at commit 
[`c898f5a`](https://github.com/apache/spark/commit/c898f5af10ead29416fec7fee49de5c37a7f48cb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class HiveHash(children: Seq[Expression], seed: Int) extends HashExpression[Int]`





[GitHub] spark pull request #14644: [MESOS] Enable GPU support with Mesos

2016-09-10 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/14644#discussion_r78283698
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
 ---
@@ -103,6 +103,7 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
   private val stateLock = new ReentrantLock
 
   val extraCoresPerExecutor = conf.getInt("spark.mesos.extra.cores", 0)
+  val maxGpus = conf.getInt("spark.mesos.gpus.max", 0)
--- End diff --

@klueska I don't think you need to autodiscover anything; the concept is similar to the max cpus setting in the scheduler.
@tnachen I think there should be some logic checking the current total against the configured max gpus, as in the cpusMax case. I don't see any. I expect offers to be split; in that case we need to check the sum of the assigned gpus against the max, right?





[GitHub] spark issue #15047: [SPARK-17495] [SQL] Add Hash capability semantically equ...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15047
  
**[Test build #65214 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65214/consoleFull)**
 for PR 15047 at commit 
[`c898f5a`](https://github.com/apache/spark/commit/c898f5af10ead29416fec7fee49de5c37a7f48cb).





[GitHub] spark pull request #15047: [SPARK-17495] [SQL] Add Hash capability semantica...

2016-09-10 Thread tejasapatil
GitHub user tejasapatil opened a pull request:

https://github.com/apache/spark/pull/15047

[SPARK-17495] [SQL] Add Hash capability semantically equivalent to Hive's

## What changes were proposed in this pull request?

Jira : https://issues.apache.org/jira/browse/SPARK-17495

Spark internally uses Murmur3Hash for partitioning, which is different from the hash used by Hive. For queries that use bucketing, this leads to different results if one runs the same query on both engines. We want backward compatibility so that users can switch parts of their applications across the engines without observing regressions.

This PR includes `HiveHash`, `HiveHashFunction`, and `HiveHasher`, which mimic Hive's hashing at 
https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java#L638

I am intentionally not introducing any usages of this hash function in the rest of the code, to keep this PR small. My eventual goal is to have Hive bucketing support in Spark. Once this PR gets in, I will make the hash function pluggable in the relevant areas (e.g. `HashPartitioning`'s `partitionIdExpression` has Murmur3 hardcoded: 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala#L265)
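
A minimal sketch of exercising the new expression (the constructor signature is the one reported by the test-build summary elsewhere in this thread; the package location and the `eval` call are assumptions, not code from this PR):

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{HiveHash, Literal}

// Hash two literal values with the Hive-compatible hash; with only Literal
// children the input row is never consulted, so an empty row suffices.
val hiveHash = HiveHash(Seq(Literal(42), Literal("spark")), seed = 0)
println(hiveHash.eval(InternalRow.empty))
```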

## How was this patch tested?

Added `HiveHashSuite`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tejasapatil/spark SPARK-17495_hive_hash

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15047.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15047


commit c898f5af10ead29416fec7fee49de5c37a7f48cb
Author: Tejas Patil 
Date:   2016-09-10T02:59:24Z

Add Hashing capability equivalent to Hive







[GitHub] spark issue #14912: [SPARK-17357][SQL] Fix current predicate pushdown

2016-09-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14912
  
I am wondering whether it makes more sense to maintain multiple semantically equivalent predicate sets for each `Filter`. In your example, we have both `(a > 10 || b > 2) && (a > 10 || c == 3)` and `(a > 10) || (b > 2 && c == 3)`. If we also consider predicate transitivity inference and predicate simplification at the same time, we could have multiple semantically equivalent predicate sets. Then we have more chances to push down the predicates.





[GitHub] spark issue #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data Sources w...

2016-09-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/15046
  
cc @yhuai @cloud-fan 





[GitHub] spark issue #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data Sources w...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15046
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data Sources w...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15046
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65213/
Test PASSed.





[GitHub] spark issue #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data Sources w...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15046
  
**[Test build #65213 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65213/consoleFull)**
 for PR 15046 at commit 
[`4ab1b8a`](https://github.com/apache/spark/commit/4ab1b8a45c9a8b9ed1f7ee85202eddf397235df4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14956: [SPARK-17389] [ML] [MLLIB] KMeans speedup with better ch...

2016-09-10 Thread mateiz
Github user mateiz commented on the issue:

https://github.com/apache/spark/pull/14956
  
Cool, thanks for improving the PIC test.





[GitHub] spark issue #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST clause in OR...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14842
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST clause in OR...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14842
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65212/
Test PASSed.





[GitHub] spark issue #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST clause in OR...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14842
  
**[Test build #65212 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65212/consoleFull)**
 for PR 14842 at commit 
[`8e5a223`](https://github.com/apache/spark/commit/8e5a223806e02f00759e250d704f2d248e9f9e41).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #15015: [SPARK-16445][MLlib][SparkR] Fix @return descript...

2016-09-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15015





[GitHub] spark issue #15015: [SPARK-16445][MLlib][SparkR] Fix @return description for...

2016-09-10 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/15015
  
Thanks @keypointt - Merging into master





[GitHub] spark issue #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data Sources w...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15046
  
**[Test build #65213 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65213/consoleFull)**
 for PR 15046 at commit 
[`4ab1b8a`](https://github.com/apache/spark/commit/4ab1b8a45c9a8b9ed1f7ee85202eddf397235df4).





[GitHub] spark pull request #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data So...

2016-09-10 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/15046

[SPARK-17492] [SQL] Fix Reading Cataloged Data Sources without Extending 
SchemaRelationProvider

### What changes were proposed in this pull request?
For data sources that do not extend `SchemaRelationProvider`, we expect users not to specify schemas when creating tables; if a schema is supplied by the user, an exception is issued.

Since Spark 2.1, for any data source, to avoid inferring the schema every time, we store the schema in the metastore catalog. Thus, when reading a cataloged data source table, the schema can be read from the metastore catalog. In this case, we also get the exception. For example,

```Scala
sql(
  s"""
     |CREATE TABLE relationProvierWithSchema
     |USING org.apache.spark.sql.sources.SimpleScanSource
     |OPTIONS (
     |  From '1',
     |  To '10'
     |)
   """.stripMargin)
spark.table(tableName).show()
```
```
org.apache.spark.sql.sources.SimpleScanSource does not allow user-specified schemas.;
```

This PR fixes the above issue. When building a data source, we introduce a flag `isSchemaFromUsers` to indicate whether the schema was really input by the user. If true, we issue an exception. Otherwise, we call the `createRelation` of `RelationProvider` to generate the `BaseRelation`, which contains the actual schema.
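
A minimal, self-contained sketch of the check described above (the flag name `isSchemaFromUsers` comes from this description; the helper, its parameters, and the exception type are illustrative assumptions, not the PR's code):

```scala
import org.apache.spark.sql.types.StructType

// Only reject a schema that truly came from the user; a schema read back from
// the metastore catalog for an existing table passes through. (Spark itself
// would raise an AnalysisException here; a plain exception keeps this sketch
// compilable outside the sql package.)
def checkUserSchema(
    providerName: String,
    extendsSchemaRelationProvider: Boolean,
    schema: Option[StructType],
    isSchemaFromUsers: Boolean): Unit = {
  if (schema.isDefined && isSchemaFromUsers && !extendsSchemaRelationProvider) {
    throw new IllegalArgumentException(
      s"$providerName does not allow user-specified schemas.")
  }
}
```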

### How was this patch tested?
Added a few cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark tempViewCases

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15046.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15046


commit 17c2d50c9d4788aaa68f7f57fc873762940a8e9d
Author: gatorsmile 
Date:   2016-09-10T15:31:26Z

fix

commit 00a49fe60f86775e19f038791a766195d506087a
Author: gatorsmile 
Date:   2016-09-10T15:41:18Z

clean

commit 335e0d6d5a19b30ec000db8d935869e006dd81e7
Author: gatorsmile 
Date:   2016-09-10T15:42:11Z

clean

commit 4ab1b8a45c9a8b9ed1f7ee85202eddf397235df4
Author: gatorsmile 
Date:   2016-09-10T16:12:26Z

add one more test case







[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14452
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65211/
Test PASSed.





[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14452
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14452
  
**[Test build #65211 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65211/consoleFull)**
 for PR 14452 at commit 
[`23e2dc8`](https://github.com/apache/spark/commit/23e2dc865eef690eb273cc69888ca577eaa603a2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST clause in OR...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14842
  
**[Test build #65212 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65212/consoleFull)**
 for PR 14842 at commit 
[`8e5a223`](https://github.com/apache/spark/commit/8e5a223806e02f00759e250d704f2d248e9f9e41).





[GitHub] spark pull request #15039: [SPARK-17447] Performance improvement in Partitio...

2016-09-10 Thread codlife
Github user codlife commented on a diff in the pull request:

https://github.com/apache/spark/pull/15039#discussion_r78278899
  
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -55,14 +55,16 @@ object Partitioner {
   * We use two method parameters (rdd, others) to enforce callers passing at least 1 RDD.
   */
  def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
-    val bySize = (Seq(rdd) ++ others).sortBy(_.partitions.length).reverse
-    for (r <- bySize if r.partitioner.isDefined && r.partitioner.get.numPartitions > 0) {
-      return r.partitioner.get
-    }
-    if (rdd.context.conf.contains("spark.default.parallelism")) {
-      new HashPartitioner(rdd.context.defaultParallelism)
+    val rdds = (Seq(rdd) ++ others)
+    val hashPartitioner = rdds.filter(_.partitioner.exists(_.numPartitions > 0))
--- End diff --

First time committing, but I enjoy the process. I have updated.





[GitHub] spark pull request #15039: [SPARK-17447] Performance improvement in Partitio...

2016-09-10 Thread codlife
Github user codlife commented on a diff in the pull request:

https://github.com/apache/spark/pull/15039#discussion_r78278876
  
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -55,14 +55,16 @@ object Partitioner {
   * We use two method parameters (rdd, others) to enforce callers passing at least 1 RDD.
   */
  def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
-    val bySize = (Seq(rdd) ++ others).sortBy(_.partitions.length).reverse
-    for (r <- bySize if r.partitioner.isDefined && r.partitioner.get.numPartitions > 0) {
-      return r.partitioner.get
-    }
-    if (rdd.context.conf.contains("spark.default.parallelism")) {
-      new HashPartitioner(rdd.context.defaultParallelism)
+    val rdds = (Seq(rdd) ++ others)
+    val hashPartitioner = rdds.filter(_.partitioner.exists(_.numPartitions > 0))
--- End diff --

@srowen Thank you, I will learn a lot about the code style.





[GitHub] spark pull request #15039: [SPARK-17447] Performance improvement in Partitio...

2016-09-10 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/15039#discussion_r78278727
  
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -55,14 +55,16 @@ object Partitioner {
   * We use two method parameters (rdd, others) to enforce callers passing at least 1 RDD.
   */
  def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
-    val bySize = (Seq(rdd) ++ others).sortBy(_.partitions.length).reverse
-    for (r <- bySize if r.partitioner.isDefined && r.partitioner.get.numPartitions > 0) {
-      return r.partitioner.get
-    }
-    if (rdd.context.conf.contains("spark.default.parallelism")) {
-      new HashPartitioner(rdd.context.defaultParallelism)
+    val rdds = (Seq(rdd) ++ others)
+    val hashPartitioner = rdds.filter(_.partitioner.exists(_.numPartitions > 0))
--- End diff --

hasPartitioner, not hashPartitioner. You should copy the code I provided.
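
For context, a minimal self-contained sketch of the direction under discussion (the PR's diff is truncated above and srowen's exact suggestion is not shown here, so the details below are assumptions, not the code he provided):

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.{HashPartitioner, Partitioner}

// Prefer the largest existing partitioner among the inputs; otherwise size a
// HashPartitioner from spark.default.parallelism or the largest partition count.
// Note: `rdd.context.conf` mirrors the existing Partitioner.scala code, which
// lives inside the org.apache.spark package.
def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
  val rdds = Seq(rdd) ++ others
  val hasPartitioner = rdds.filter(_.partitioner.exists(_.numPartitions > 0))
  if (hasPartitioner.nonEmpty) {
    hasPartitioner.maxBy(_.partitions.length).partitioner.get
  } else if (rdd.context.conf.contains("spark.default.parallelism")) {
    new HashPartitioner(rdd.context.defaultParallelism)
  } else {
    new HashPartitioner(rdds.map(_.partitions.length).max)
  }
}
```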





[GitHub] spark issue #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST clause in OR...

2016-09-10 Thread xwu0226
Github user xwu0226 commented on the issue:

https://github.com/apache/spark/pull/14842
  
Retest please





[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15035
  
Do you mind if I ask whether this works with the vectorized Parquet reader too? I know the normal Parquet reader uses `SpecificMutableRow`, but IIRC the vectorized Parquet reader relies on `ColumnarBatch`, which does not use `SpecificMutableRow`.





[GitHub] spark issue #14956: [SPARK-17389] [ML] [MLLIB] KMeans speedup with better ch...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14956
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14956: [SPARK-17389] [ML] [MLLIB] KMeans speedup with better ch...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14956
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65210/
Test PASSed.





[GitHub] spark issue #14956: [SPARK-17389] [ML] [MLLIB] KMeans speedup with better ch...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14956
  
**[Test build #65210 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65210/consoleFull)**
 for PR 14956 at commit 
[`b5aaec9`](https://github.com/apache/spark/commit/b5aaec9a398fc4ac0754efb1e14345c3464acd49).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15035: [SPARK-17477]: SparkSQL cannot handle schema evolution f...

2016-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/15035
  
Shouldn't we change the reading path for Parquet rather than changing the target row, to avoid per-record type dispatch? Also, it seems like a Parquet-specific issue, and I wonder whether making changes in the row is a good approach.

I remember my PR to support upcasting in the schema for Parquet, https://github.com/apache/spark/pull/14215, which I decided to close in favor of a better approach.

I haven't taken a close look yet, but I will and will leave some comments.





[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14737
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65209/
Test PASSed.





[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14737
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14737
  
**[Test build #65209 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65209/consoleFull)**
 for PR 14737 at commit 
[`5163a51`](https://github.com/apache/spark/commit/5163a51a81ea509bd76b3452fa33fb83078c279e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14452
  
**[Test build #65211 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65211/consoleFull)**
 for PR 14452 at commit 
[`23e2dc8`](https://github.com/apache/spark/commit/23e2dc865eef690eb273cc69888ca577eaa603a2).





[GitHub] spark pull request #12575: [SPARK-14803][SQL][Optimizer] A bug in EliminateS...

2016-09-10 Thread sun-rui
Github user sun-rui closed the pull request at:

https://github.com/apache/spark/pull/12575





[GitHub] spark issue #15045: [Spark Core][MINOR] fix "default partitioner cannot part...

2016-09-10 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/15045
  
Why bother saying 'specified' or 'default' at all though? It's probably 
even more informative to state that HashPartitioner doesn't work, no matter 
what the source. If the user specified HashPartitioner, that's clear. If they 
didn't, they'll still recognize that the other half of the message is relevant: 
something doesn't like their array keys.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14956: [SPARK-17389] [ML] [MLLIB] KMeans speedup with be...

2016-09-10 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/14956#discussion_r78277526
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala
 ---
@@ -395,7 +395,7 @@ object PowerIterationClustering extends Logging {
 val points = v.mapValues(x => Vectors.dense(x)).cache()
 val model = new KMeans()
   .setK(k)
-  .setSeed(0L)
+  .setSeed(5L)
--- End diff --

I got the tests to pass reliably by simply making the two sets of points 
generated in this test both contain 10 points, not 10 and 40. Balancing them 
made the issue go away.

As to why the paper 'works', I'm actually not clear that it does. It does not 
actually just k-means cluster the values; they say they run 100 clusterings and 
take the most common cluster assignment. It's a little ambiguous what this 
means, but it may be the source of the difference. AFAICT the current PIC test 
presents a situation that PIC clustering will often get wrong if it uses 
straight k-means internally.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14956: [SPARK-17389] [ML] [MLLIB] KMeans speedup with better ch...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14956
  
**[Test build #65210 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65210/consoleFull)**
 for PR 14956 at commit 
[`b5aaec9`](https://github.com/apache/spark/commit/b5aaec9a398fc4ac0754efb1e14345c3464acd49).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14956: [SPARK-17389] [ML] [MLLIB] KMeans speedup with be...

2016-09-10 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/14956#discussion_r78277099
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala
 ---
@@ -395,7 +395,7 @@ object PowerIterationClustering extends Logging {
 val points = v.mapValues(x => Vectors.dense(x)).cache()
 val model = new KMeans()
   .setK(k)
-  .setSeed(0L)
+  .setSeed(5L)
--- End diff --

Sorry Matei, I kind of missed your point. Yes, it's a bit more strange to be 
changing a seed in a non-test file; same reasoning, but I agree. There's a seed 
here to begin with for determinism, but it probably shouldn't matter.

I think I understand the problem, and I think it's the test. k-means is 
used to cluster 1D data. The test case generates two concentric circles of 
points of radius 1 and 4, which are intended to form k=2 separate clusters in 
the derived values that are clustered by k-means internally. That's even clear 
from looking at the similarities plotted:


![rplot](https://cloud.githubusercontent.com/assets/822522/18410522/2de05f76-775d-11e6-8e75-07d3d31e5cae.png)

While it's clear what the clustering is supposed to be, it's not actually 
the lowest-cost k-means clustering. Many runs find the 'wrong' but lower-cost 
clustering, which folds a few of the leftmost elements of the right group into 
the left one. Many other runs get the 'right' answer, which is a big local 
minimum but not optimal. In fact, k-means|| init seems to do worse than random 
init here exactly because it's less likely to find that local minimum.

I don't think the choice of radii matters here, since the resulting values 
above are basically invariant.

I'm going to have to read the paper more to understand what the difference 
is here. It's not quite the k-means change at issue here, and we can make this 
test pass easily by either:

- Setting the seed back to 0 and fixing init steps = 5 for this use of k-means, 
because that happens to work. Then this implementation doesn't change at all, 
though it does more work just to make the test pass.
- Setting the seed to, say, 5 to get this to pass, on the theory that the choice 
of seed still doesn't seem to matter per se, and 5 is no worse than 0.

Obviously I want to understand a little more about how this is ever 
supposed to work in PIC, though it ends up being a slightly different issue.
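
For concreteness, a rough sketch of the shape of the data being discussed, 
assuming an existing SparkContext `sc` (the point counts and radii mirror the 
description above, but this is not the actual test code): two concentric circles 
whose points collapse to 1D values that k-means then has to split into k=2 
clusters.

```scala
import scala.math.{Pi, cos, sin}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

// Generate n points on a circle of radius r.
def circle(r: Double, n: Int): Seq[(Double, Double)] =
  (0 until n).map(i => (r * cos(2 * Pi * i / n), r * sin(2 * Pi * i / n)))

val points = circle(1.0, 10) ++ circle(4.0, 40)

// PIC ultimately hands k-means one value per point; here we simply cluster the
// radii as stand-in 1D values, with an explicit seed as in the diff above.
val oneD = sc.parallelize(points.map { case (x, y) => Vectors.dense(math.hypot(x, y)) })
val model = new KMeans().setK(2).setSeed(5L).run(oneD)
```

With radii of 1 and 4 the two groups are well separated in 1D, so the interesting 
failure mode is purely about which local minimum k-means lands in.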


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13758
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13758
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65208/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13758
  
**[Test build #65208 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65208/consoleFull)**
 for PR 13758 at commit 
[`8639319`](https://github.com/apache/spark/commit/863931994a2f24936b5312f7a8f79ae8204d57b1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix "default partitioner cannot part...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65207/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix "default partitioner cannot part...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15045
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix "default partitioner cannot part...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15045
  
**[Test build #65207 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65207/consoleFull)**
 for PR 15045 at commit 
[`6520854`](https://github.com/apache/spark/commit/6520854c565b87c80bd96a26b9b2aaefa0c5f752).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix "default partitioner cannot part...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15045
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix "default partitioner cannot part...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65206/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix "default partitioner cannot part...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15045
  
**[Test build #65206 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65206/consoleFull)**
 for PR 15045 at commit 
[`d423b41`](https://github.com/apache/spark/commit/d423b4165a0b778852a76cf8d04615ec2465c4d0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15039: [SPARK-17447] Performance improvement in Partitio...

2016-09-10 Thread codlife
Github user codlife commented on a diff in the pull request:

https://github.com/apache/spark/pull/15039#discussion_r78276638
  
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -55,14 +55,15 @@ object Partitioner {
* We use two method parameters (rdd, others) to enforce callers passing at least 1 RDD.
*/
   def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
-val bySize = (Seq(rdd) ++ others).sortBy(_.partitions.length).reverse
-for (r <- bySize if r.partitioner.isDefined && r.partitioner.get.numPartitions > 0) {
-  return r.partitioner.get
+val rdds = Seq(rdd) ++ others
+val filteredRdds = rdds.filter( _.partitioner.exists(_.numPartitions > 0 ))
--- End diff --

@srowen Thank you very much. I am new to Spark, but I'm very interested in it. 
I have fixed my code style. Thanks.
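
For anyone reading along, a hedged sketch of the direction this diff points in 
(not the exact patch under review): keep only RDDs that already carry a usable 
partitioner, reuse the one with the most partitions, and otherwise fall back to 
a HashPartitioner.

```scala
import org.apache.spark.{HashPartitioner, Partitioner}
import org.apache.spark.rdd.RDD

def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
  val rdds = Seq(rdd) ++ others
  // Keep only the RDDs that already have a partitioner with at least one partition.
  val hasPartitioner = rdds.filter(_.partitioner.exists(_.numPartitions > 0))
  if (hasPartitioner.nonEmpty) {
    // Reuse the existing partitioner of the RDD with the most partitions.
    hasPartitioner.maxBy(_.partitions.length).partitioner.get
  } else if (rdd.context.getConf.contains("spark.default.parallelism")) {
    new HashPartitioner(rdd.context.defaultParallelism)
  } else {
    new HashPartitioner(rdds.map(_.partitions.length).max)
  }
}
```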


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15022: [SPARK-17465] [Spark Core] Inappropriate memory manageme...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15022
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65200/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15022: [SPARK-17465] [Spark Core] Inappropriate memory manageme...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15022
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15022: [SPARK-17465] [Spark Core] Inappropriate memory manageme...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15022
  
**[Test build #65200 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65200/consoleFull)**
 for PR 15022 at commit 
[`6b11fe8`](https://github.com/apache/spark/commit/6b11fe8d07728e9add07d8df5845658f9fef3e60).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14969
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65204/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14969
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14969
  
**[Test build #65204 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65204/consoleFull)**
 for PR 14969 at commit 
[`c725891`](https://github.com/apache/spark/commit/c7258916b8f34cc31edcb7033e783d990a3fa769).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14737
  
**[Test build #65209 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65209/consoleFull)**
 for PR 14737 at commit 
[`5163a51`](https://github.com/apache/spark/commit/5163a51a81ea509bd76b3452fa33fb83078c279e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13758
  
**[Test build #65208 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65208/consoleFull)**
 for PR 13758 at commit 
[`8639319`](https://github.com/apache/spark/commit/863931994a2f24936b5312f7a8f79ae8204d57b1).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14894: [SPARK-17330] [SPARK UT] Clean up spark-warehouse in UT

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14894
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14894: [SPARK-17330] [SPARK UT] Clean up spark-warehouse in UT

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14894
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65205/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14894: [SPARK-17330] [SPARK UT] Clean up spark-warehouse in UT

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14894
  
**[Test build #65205 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65205/consoleFull)**
 for PR 14894 at commit 
[`cd8e9a4`](https://github.com/apache/spark/commit/cd8e9a4ddf704f2d01df870cc898af53e62a9d2f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15042: [SPARK-17449] [Documentation] [Relation between heartbea...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15042
  
**[Test build #3254 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3254/consoleFull)**
 for PR 15042 at commit 
[`83031c4`](https://github.com/apache/spark/commit/83031c4d285db633c6468ef8471810765f62c0be).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14969
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65203/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14969
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14969
  
**[Test build #65203 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65203/consoleFull)**
 for PR 14969 at commit 
[`4dda55c`](https://github.com/apache/spark/commit/4dda55ca2f614228fbd6f926fd201073894a8abf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix partitionBy error message

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15045
  
**[Test build #65207 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65207/consoleFull)**
 for PR 15045 at commit 
[`6520854`](https://github.com/apache/spark/commit/6520854c565b87c80bd96a26b9b2aaefa0c5f752).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15041
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65202/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix partitionBy error message

2016-09-10 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/15045
  
Oh, there are 5 similar messages. I checked the others: they may be given the 
default partitioner, so I updated their message to "Specified or default 
partitioner...". But the one in `partitionBy` must be set by the user and can't 
use the default, because the `partitionBy` API exists to let the user specify 
how the RDD should be partitioned. In the other cases, the `partitioner` 
parameter is optional, and if the user doesn't specify it, it falls back to the 
default `HashPartitioner`. I have updated the code now. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15041
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15041
  
**[Test build #65202 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65202/consoleFull)**
 for PR 15041 at commit 
[`e7c6b16`](https://github.com/apache/spark/commit/e7c6b1625b6a67cfab958f64f5238811d5a39640).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix partitionBy error message

2016-09-10 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/15045
  
There are 5 instances of this check in the file -- they should all be 
handled the same way. I'm not sure this is accurate either, because some code 
paths lead to these methods when HashPartitioner is used as a default. Just say 
that HashPartitioner can't be used? Refactor this into one shared check method?
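
One possible shape for the shared check (purely a hedged sketch with a 
hypothetical helper name, not what the PR implements): every call site in 
PairRDDFunctions could delegate to a single method so the wording lives in one 
place.

```scala
import org.apache.spark.{HashPartitioner, Partitioner, SparkException}

// Hypothetical helper; the five call sites would each invoke this before shuffling.
def checkCanPartition(partitioner: Partitioner, keyClass: Class[_]): Unit = {
  if (keyClass.isArray && partitioner.isInstanceOf[HashPartitioner]) {
    throw new SparkException("HashPartitioner cannot partition array keys.")
  }
}
```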


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14961: [SPARK-17379] [BUILD] Upgrade netty-all to 4.0.41 final ...

2016-09-10 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14961
  
@a-roberts are you in a position to add this change to this PR as an 
experiment? I can try it on the side too. I can't seem to reproduce the failure 
locally, even when fully rebuilding the project with a newer netty.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15045: [Spark Core][MINOR] fix partitionBy error message

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15045
  
**[Test build #65206 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65206/consoleFull)**
 for PR 15045 at commit 
[`d423b41`](https://github.com/apache/spark/commit/d423b4165a0b778852a76cf8d04615ec2465c4d0).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15045: [Spark Core][MINOR] fix partitionBy error message

2016-09-10 Thread WeichenXu123
GitHub user WeichenXu123 opened a pull request:

https://github.com/apache/spark/pull/15045

[Spark Core][MINOR] fix partitionBy error message

## What changes were proposed in this pull request?

In order to avoid confusing the user, it is better to change the 
`PairRDDFunctions.partitionBy` error message from
`Default partitioner cannot partition array keys.`
to
`Specified partitioner cannot partition array keys.`

## How was this patch tested?

N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WeichenXu123/spark fix_partitionBy_error_message

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15045.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15045


commit d423b4165a0b778852a76cf8d04615ec2465c4d0
Author: WeichenXu 
Date:   2016-09-08T12:24:06Z

fix partitionBy error message




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...

2016-09-10 Thread lins05
Github user lins05 commented on the issue:

https://github.com/apache/spark/pull/15043
  
Did a simple test and it does fix the bug. One interesting thing: while 
records.count() returns a smaller number than the actual count, the Spark UI 
still shows the correct record count; in my test case it's 2999808 vs. 
30.

![screen shot 2016-09-10 at 5 57 38 
pm](https://cloud.githubusercontent.com/assets/717363/18409696/70a407e0-7780-11e6-9f22-7b55c24b0595.png)

![screen shot 2016-09-10 at 5 58 26 
pm](https://cloud.githubusercontent.com/assets/717363/18409697/75f5254e-7780-11e6-98fd-5cae496f7c22.png)
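
As a generic illustration of the bug class being fixed here (plain JVM 
serialization, not the Spark shuffle code itself): records written to a buffered 
serialization stream are only guaranteed to be visible to readers after the 
stream is flushed or closed, so counting before close can come up short.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

val buffer = new ByteArrayOutputStream()
val out = new ObjectOutputStream(buffer)
(1 to 1000).foreach(i => out.writeObject(Integer.valueOf(i)))

val beforeClose = buffer.size()  // may miss bytes still sitting in the stream's block buffer
out.close()
val afterClose = buffer.size()   // includes everything once the stream is closed
```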



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14671: [SPARK-17091][SQL] ParquetFilters rewrite IN to OR of Eq

2016-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14671
  
Thanks for confirming this. I will work on this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15028: [SPARK-17336][PYSPARK] Fix appending multiple times to P...

2016-09-10 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/15028
  
I think the current behavior might be worse on that dimension ... you might 
get several different versions of things at once on the classpath, not just 
redundant copies. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14969
  
**[Test build #65204 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65204/consoleFull)**
 for PR 14969 at commit 
[`c725891`](https://github.com/apache/spark/commit/c7258916b8f34cc31edcb7033e783d990a3fa769).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14894: [SPARK-17330] [SPARK UT] Clean up spark-warehouse in UT

2016-09-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14894
  
**[Test build #65205 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65205/consoleFull)**
 for PR 14894 at commit 
[`cd8e9a4`](https://github.com/apache/spark/commit/cd8e9a4ddf704f2d01df870cc898af53e62a9d2f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14894: [SPARK-17330] [SPARK UT] Clean up spark-warehouse in UT

2016-09-10 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14894
  
Jenkins test this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14985: [SPARK-17396][core] Share the task support betwee...

2016-09-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14985


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14985: [SPARK-17396][core] Share the task support between Union...

2016-09-10 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14985
  
Merged to master/2.0


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


