[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14452
  
**[Test build #63108 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63108/consoleFull)**
 for PR 14452 at commit 
[`55a44c8`](https://github.com/apache/spark/commit/55a44c85ebfb9a065902995662c2353fdc562224).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14452
  
**[Test build #63107 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63107/consoleFull)**
 for PR 14452 at commit 
[`00b29ed`](https://github.com/apache/spark/commit/00b29ede65b84e0fc99ab9e0ebd33f6092077bbc).





[GitHub] spark pull request #14452: [SPARK-16849][SQL] Improve subquery execution by ...

2016-08-01 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/14452

[SPARK-16849][SQL] Improve subquery execution by deduplicating the 
subqueries with the same results

## What changes were proposed in this pull request?

Subqueries in Spark SQL are each executed even when they have the same physical 
plan and produce the same results. We should be able to deduplicate subqueries 
that are referenced multiple times in a query.
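
The idea can be illustrated with a small, Spark-independent sketch. It is not the code in this pull request: `SubqueryCache`, `plan_key`, and the lambda "executors" are hypothetical names, and a plain string stands in for a canonicalized physical plan.

```python
# Hypothetical illustration only; this is not Spark's planner code.
class SubqueryCache:
    """Run each distinct plan once and reuse its result for later references."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # counts how many plans were actually executed

    def run(self, plan_key, execute):
        # plan_key stands in for a canonicalized physical plan (an assumption).
        if plan_key not in self._results:
            self.executions += 1
            self._results[plan_key] = execute()
        return self._results[plan_key]


cache = SubqueryCache()
first = cache.run("SELECT max(v) FROM t", lambda: 42)
second = cache.run("SELECT max(v) FROM t", lambda: 42)  # reused, not re-run
other = cache.run("SELECT min(v) FROM t", lambda: 7)
```

In this sketch the second reference to the same plan key returns the stored result, so only two of the three references trigger an execution.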

## How was this patch tested?

Jenkins tests.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 single-exec-subquery

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14452


commit 00b29ede65b84e0fc99ab9e0ebd33f6092077bbc
Author: Liang-Chi Hsieh 
Date:   2016-08-01T03:41:34Z

Dedup common subqueries.







[GitHub] spark issue #14420: [SPARK-14204] [SQL] register driverClass rather than use...

2016-08-01 Thread zzcclp
Github user zzcclp commented on the issue:

https://github.com/apache/spark/pull/14420
  
@JoshRosen can you have a look at this pr?





[GitHub] spark issue #14451: [SPARK-16848][SQL] Make jdbc() and read.format("jdbc") c...

2016-08-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14451
  
Yeah, keep it open. That PR just tries to cover all the possible holes 
(corner cases). I do not care which PR is merged but, in my 
opinion, we need to cover all the cases. That is for the Read API. Originally, I 
thought we should do the same for the Write API. Later, it seemed that the 
effort was not worthwhile, so I did not continue it.





[GitHub] spark issue #14451: [SPARK-16848][SQL] Make jdbc() and read.format("jdbc") c...

2016-08-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14451
  
Oh, you took a look already. Yes, it seems your PR includes this change. Do 
you mind if I leave this open? This bit could arguably be merged quickly.

I don't mind if this is credited to him (for other reviewers).





[GitHub] spark issue #14451: [SPARK-16848][SQL] Make jdbc() and read.format("jdbc") c...

2016-08-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14451
  
Is this related to: https://github.com/apache/spark/pull/13770?





[GitHub] spark issue #14451: [SPARK-16848][SQL] Make jdbc() and read.format("jdbc") c...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14451
  
**[Test build #63106 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63106/consoleFull)**
 for PR 14451 at commit 
[`3def251`](https://github.com/apache/spark/commit/3def251fb9bd213e0d343cd404f9896a576c0d74).





[GitHub] spark pull request #14451: Make jdbc() and read.format("jdbc") consistently ...

2016-08-01 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/14451

Make jdbc() and read.format("jdbc") consistently throwing exception for 
user-specified schema

## What changes were proposed in this pull request?

Currently,

```scala
spark.read.schema(StructType(Seq())).jdbc(...).show()
```

does not throw an exception, whereas

```scala
spark.read.schema(StructType(Seq())).option(...).format("jdbc").load().show()
```

does as below:

```
jdbc does not allow user-specified schemas.;
org.apache.spark.sql.AnalysisException: jdbc does not allow user-specified schemas.;
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:320)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
at org.apache.spark.sql.jdbc.JDBCSuite$$anonfun$17.apply$mcV$sp(JDBCSuite.scala:351)
```

It would make sense to throw the same exception in both cases when the user 
specifies a schema.

This PR makes the behaviour consistent for both jdbc APIs.
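
One way to picture the consistent behaviour is a toy reader in which both entry points share a single validation path. This is a sketch under stated assumptions, not Spark's `DataFrameReader`: `ToyReader`, `AnalysisError`, and `raises_analysis_error` are hypothetical names used only for illustration.

```python
# Toy model only; ToyReader and AnalysisError are hypothetical, not Spark classes.
class AnalysisError(Exception):
    pass


class ToyReader:
    def __init__(self):
        self._user_schema = None
        self._format = None

    def schema(self, schema):
        self._user_schema = schema
        return self

    def format(self, name):
        self._format = name
        return self

    def load(self):
        # Single place where the user-specified schema is rejected for jdbc.
        if self._format == "jdbc" and self._user_schema is not None:
            raise AnalysisError("jdbc does not allow user-specified schemas.")
        return "relation"

    def jdbc(self, url, table):
        # Delegate to the same path as format("jdbc").load() so both agree.
        return self.format("jdbc").load()


def raises_analysis_error(call):
    try:
        call()
    except AnalysisError:
        return True
    return False


via_jdbc = raises_analysis_error(lambda: ToyReader().schema([]).jdbc("url", "table"))
via_load = raises_analysis_error(lambda: ToyReader().schema([]).format("jdbc").load())
```

Because `jdbc()` delegates to `load()` in this sketch, the two calls cannot drift apart: either both raise or neither does.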

## How was this patch tested?

Unit test in `JDBCSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-16848

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14451.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14451


commit 3def251fb9bd213e0d343cd404f9896a576c0d74
Author: hyukjinkwon 
Date:   2016-08-02T05:07:57Z

Make jdbc() and read.format("jdbc") consistently throwing exception for 
user-specified schema







[GitHub] spark issue #14449: [SPARK-16843][MLLIB] add the percentage ChiSquareSelecto...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14449
  
Can one of the admins verify this patch?





[GitHub] spark issue #14298: [SPARK-16283][SQL] Implement `percentile_approx` SQL fun...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14298
  
**[Test build #63105 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63105/consoleFull)**
 for PR 14298 at commit 
[`c0acf16`](https://github.com/apache/spark/commit/c0acf1697ad369302068aeaecad59e812038d14a).





[GitHub] spark issue #14450: [SPARK-16847][SQL] Prevent to potentially read corrupt s...

2016-08-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14450
  
This will be a legitimate change because it replaces the deprecated usage 
of the constructor. Let me cc @liancheng and @srowen as well, as it is also 
partly about the build.





[GitHub] spark issue #14450: [SPARK-16847][SQL] Prevent to potentially read corrupt s...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14450
  
**[Test build #63104 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63104/consoleFull)**
 for PR 14450 at commit 
[`3c46111`](https://github.com/apache/spark/commit/3c461117852c86eae631b06cacfd72773653083c).





[GitHub] spark pull request #14450: [SPARK-16847][SQL] Prevent to potentially read co...

2016-08-01 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/14450

[SPARK-16847][SQL] Prevent to potentially read corrupt statistics on binary 
in Parquet via VectorizedReader

## What changes were proposed in this pull request?

It is still possible to read corrupt Parquet statistics.
This problem was found in 
[PARQUET-251](https://issues.apache.org/jira/browse/PARQUET-251), and we 
previously disabled filter pushdown on binary columns in Spark.

We enabled this after upgrading Parquet, but there seems to be a potential 
incompatibility with Parquet files written by older Spark versions.

Currently, this does not affect Parquet's standard API. However, in Spark we 
implemented a vectorized reader separately from Parquet's standard API. The 
standard API handles this case, but the vectorized reader does not.

This will be okay in Spark 2.0 because the vectorized reader does not use the 
statistics (see https://github.com/apache/spark/pull/13701). However, if 
we support this, we will hit this potential incompatibility.

It is okay to just pass `FileMetaData`. This is being handled in parquet-mr 
(See 
https://github.com/apache/parquet-mr/commit/e3b95020f777eb5e0651977f654c1662e3ea1f29)

## How was this patch tested?

N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-16847

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14450.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14450


commit 3c461117852c86eae631b06cacfd72773653083c
Author: hyukjinkwon 
Date:   2016-08-02T04:31:04Z

Prevent to potentially read corrupt statstics on binary in Parquet via 
VectorizedReader







[GitHub] spark pull request #14449: [SPARK-16843][MLLIB] add the percentage ChiSquare...

2016-08-01 Thread mpjlu
GitHub user mpjlu opened a pull request:

https://github.com/apache/spark/pull/14449

[SPARK-16843][MLLIB] add the percentage ChiSquareSelector feature

## What changes were proposed in this pull request?

add the percentage ChiSquareSelector feature


## How was this patch tested?

Added Scala unit tests.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mpjlu/spark chisquare2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14449.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14449


commit fb3c9a93e4b1b20f6738a3b56d8fb0604fbbb59e
Author: Peng, Meng 
Date:   2016-08-01T05:00:16Z

add the percentage ChiSquareSelector feature







[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14446
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63101/
Test PASSed.





[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14446
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14446
  
**[Test build #63101 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63101/consoleFull)**
 for PR 14446 at commit 
[`1054b74`](https://github.com/apache/spark/commit/1054b74f18193378942b7fde26df36e06bff765e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #11411: [SPARK-13385][MLlib] Enable AssociationRules to generate...

2016-08-01 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/11411
  
Test this please





[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...

2016-08-01 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/12135
  
Test this please





[GitHub] spark issue #12983: [SPARK-15213][PySpark] Unify 'range' usages

2016-08-01 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/12983
  
@srowen In Python 2, `xrange` is more efficient than `range`.  
This PR adds `range = xrange` to files such as `python/pyspark/accumulators.py` 
and `python/pyspark/heapq3.py`, so those files may run faster in Python 2.
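
For reference, a minimal sketch of this kind of alias is shown below. The version guard is an assumption made here so that the snippet also runs on Python 3; the exact form used in the PR's files may differ.

```python
import sys

if sys.version_info[0] < 3:
    # Python 2: range() builds a full list, while xrange() yields values lazily.
    # The guard keeps this branch from running on Python 3, where xrange is gone.
    range = xrange  # noqa: F821

# The rest of the module can then call range() on either version.
total = sum(i for i in range(5))
```

On Python 3 the branch is skipped and the built-in `range`, which is already lazy, is used unchanged.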





[GitHub] spark pull request #13657: [SPARK-15939][ML][PySpark] Clarify ml.linalg usag...

2016-08-01 Thread zhengruifeng
Github user zhengruifeng closed the pull request at:

https://github.com/apache/spark/pull/13657





[GitHub] spark issue #14368: [SPARK-16734][EXAMPLES][SQL] Revise examples of all lang...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14368
  
**[Test build #63103 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63103/consoleFull)**
 for PR 14368 at commit 
[`e1a521f`](https://github.com/apache/spark/commit/e1a521fce6221629634b7f85335dc0ae568dd4c0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14368: [SPARK-16734][EXAMPLES][SQL] Revise examples of all lang...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14368
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14368: [SPARK-16734][EXAMPLES][SQL] Revise examples of all lang...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14368
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63103/
Test PASSed.





[GitHub] spark issue #14448: [Spark-16579][SparkR] Add install.spark function

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14448
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14448: [Spark-16579][SparkR] Add install.spark function

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14448
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63102/
Test PASSed.





[GitHub] spark issue #14448: [Spark-16579][SparkR] Add install.spark function

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14448
  
**[Test build #63102 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63102/consoleFull)**
 for PR 14448 at commit 
[`370cc5d`](https://github.com/apache/spark/commit/370cc5d567ab7d1568d64b2d3b3b63af5f22725f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14368: [SPARK-16734][EXAMPLES][SQL] Revise examples of all lang...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14368
  
**[Test build #63103 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63103/consoleFull)**
 for PR 14368 at commit 
[`e1a521f`](https://github.com/apache/spark/commit/e1a521fce6221629634b7f85335dc0ae568dd4c0).





[GitHub] spark issue #13893: [SPARK-14172][SQL] Hive table partition predicate not pa...

2016-08-01 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/13893
  
ping @cloud-fan 





[GitHub] spark issue #14368: [SPARK-16734][EXAMPLES][SQL] Revise examples of all lang...

2016-08-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14368
  
retest this please





[GitHub] spark issue #14368: [SPARK-16734][EXAMPLES][SQL] Revise examples of all lang...

2016-08-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14368
  
LGTM





[GitHub] spark issue #14401: [SPARK-16793][SQL]Set the temporary warehouse path to sc...

2016-08-01 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/14401
  
@rxin As @yhuai previously mentioned, this change helps in the following 
cases:
1. Right now, we first set the warehouse path to the default one, and 
then we override the setting in `TestHiveSharedState` when we create 
`metadataHive`. This flow is not easy to follow and can cause confusion when 
debugging.
2. Removing the `warehousePath` field will be the first step in removing 
`TestHiveSessionState` and `TestHiveSharedState`, so that we can really test 
the reflection logic based on the setting of `CATALOG_IMPLEMENTATION`.





[GitHub] spark issue #14427: [SPARK-16818] Exchange reuse incorrectly reuses scans ov...

2016-08-01 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/14427
  
Done





[GitHub] spark pull request #14427: [SPARK-16818] Exchange reuse incorrectly reuses s...

2016-08-01 Thread ericl
Github user ericl closed the pull request at:

https://github.com/apache/spark/pull/14427





[GitHub] spark issue #14427: [SPARK-16818] Exchange reuse incorrectly reuses scans ov...

2016-08-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14427
  
@ericl can you close this?






[GitHub] spark issue #14427: [SPARK-16818] Exchange reuse incorrectly reuses scans ov...

2016-08-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14427
  
Merging in branch-2.0.






[GitHub] spark issue #14448: [Spark-16579][SparkR] Add install.spark function

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14448
  
**[Test build #63102 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63102/consoleFull)**
 for PR 14448 at commit 
[`370cc5d`](https://github.com/apache/spark/commit/370cc5d567ab7d1568d64b2d3b3b63af5f22725f).





[GitHub] spark pull request #14448: [Spark-16579][SparkR] Add install.spark function

2016-08-01 Thread junyangq
GitHub user junyangq opened a pull request:

https://github.com/apache/spark/pull/14448

[Spark-16579][SparkR] Add install.spark function

## What changes were proposed in this pull request?

Add an `install.spark` function to the SparkR package. Users can run 
`install.spark()` from within R to install Spark to a local directory if no 
existing installation is found.

It looks for installation files from three sources, in the following order:

1. user-provided mirror site given in `mirrorUrl`

2. mirror site suggested by the Apache website

3. hardcoded backup option
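
The lookup order above can be sketched as a simple fallback chain. This is only an illustration written in Python rather than R; the function names, the `try_download` callback, and the backup URL are hypothetical and are not taken from the SparkR implementation.

```python
# Illustrative fallback chain; all names and URLs here are assumptions,
# not the actual install.spark code.
HYPOTHETICAL_BACKUP_URL = "https://backup.example.invalid/spark"


def candidate_sources(mirror_url=None, suggested_mirror=None,
                      backup_url=HYPOTHETICAL_BACKUP_URL):
    """Return (label, url) pairs in the order they should be tried."""
    sources = []
    if mirror_url:
        sources.append(("user mirror", mirror_url))            # 1. mirrorUrl
    if suggested_mirror:
        sources.append(("suggested mirror", suggested_mirror))  # 2. Apache suggestion
    sources.append(("backup", backup_url))                      # 3. hardcoded backup
    return sources


def first_working_source(sources, try_download):
    """Try each source in order; return the first one that succeeds, else None."""
    for label, url in sources:
        if try_download(url):
            return label, url
    return None
```

A caller would pass a `try_download` function that returns `True` when the package was fetched successfully, so a failing user mirror falls through to the next source instead of aborting.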

## How was this patch tested?

Manual tests.

(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/junyangq/spark SPARK-16579-2.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14448.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14448


commit 0b676314a13a8a796ee45baf99f4bc6d936d01d5
Author: Junyang Qian 
Date:   2016-07-29T22:24:07Z

Add install.spark function to SparkR

Users can download and install Spark package inside R console







[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14446
  
**[Test build #63101 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63101/consoleFull)**
 for PR 14446 at commit 
[`1054b74`](https://github.com/apache/spark/commit/1054b74f18193378942b7fde26df36e06bff765e).





[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread clockfly
Github user clockfly commented on the issue:

https://github.com/apache/spark/pull/14446
  
 retest this please.





[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11157
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63099/
Test PASSed.





[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11157
  
Merged build finished. Test PASSed.





[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11157
  
**[Test build #63099 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63099/consoleFull)**
 for PR 11157 at commit 
[`efc1d18`](https://github.com/apache/spark/commit/efc1d183c2f04bd1bd71f2b5425432a588b68caa).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14446
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14446
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63097/
Test FAILed.





[GitHub] spark issue #14440: [SPARK-16835][ML] add training data unpersist handling w...

2016-08-01 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/14440
  
sounds reasonable...





[GitHub] spark issue #14396: [SPARK-16787] SparkContext.addFile() should not throw if...

2016-08-01 Thread JoshRosen
Github user JoshRosen commented on the issue:

https://github.com/apache/spark/pull/14396
  
@zsxwing, I meant to describe what happens on executors in the following 
scenario:

- `addFile(foo)` is called for the first time at `timestamp = 1`
- A task runs on an executor and downloads the copy of the file added at 
`timestamp = 1`.
  - By default, the [file fetch cache is 
enabled](https://github.com/apache/spark/blob/2eedc00b04ef8ca771ff64c4f834c25f835f5f44/core/src/main/scala/org/apache/spark/util/Utils.scala#L432)
 and filenames in that cache incorporate timestamps. Thus, this file will be 
downloaded to a file named `$timestamp_cache`.
- `addFile(foo)` is called a second time at `timestamp = 2` and the same 
file is passed to it.
- A task runs on an executor and discovers that the added file's timestamp 
(2) is newer than the timestamp of the file that it has already downloaded (1), 
so it tries to fetch files again:
  - Because the file with the newer timestamp is not present in the fetch 
file cache, a new copy of the file will be downloaded. **<--- this is the 
second download I was referring to**

If the fetch file cache is disabled, on the other hand, then we directly 
call 
[`doFetchFile`](https://github.com/apache/spark/blob/2eedc00b04ef8ca771ff64c4f834c25f835f5f44/core/src/main/scala/org/apache/spark/util/Utils.scala#L617)
 which, in turn, will call `downloadFile()`, which [downloads the file to a 
temporary 
file](https://github.com/apache/spark/blob/2eedc00b04ef8ca771ff64c4f834c25f835f5f44/core/src/main/scala/org/apache/spark/util/Utils.scala#L499)
 before considering whether to overwrite an existing file.

In either case, it looks like re-adding a file with a new timestamp will 
trigger downloads on the executors and those downloads will be unnecessary if 
the file's contents are unchanged.
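
The scenario above can be replayed with a toy model. This is a sketch in Python under simplifying assumptions, not Spark's actual code: the `Executor` class, its fields, and `downloads_after_readd` are hypothetical names, and the only behavior modeled is that the cache key incorporates the timestamp and that the uncached path always downloads first.

```python
# Toy model of executor-side file fetching; not Spark's implementation.
class Executor:
    def __init__(self, use_fetch_cache=True):
        self.use_fetch_cache = use_fetch_cache
        self.cache = set()    # cache keys, each including the timestamp
        self.current = {}     # file name -> timestamp already fetched
        self.downloads = 0    # number of downloads performed so far

    def run_task(self, added_files):
        """added_files maps file name -> timestamp recorded by addFile()."""
        for name, ts in added_files.items():
            if self.current.get(name, -1) >= ts:
                continue              # this version (or newer) is already local
            if self.use_fetch_cache:
                key = (name, ts)      # timestamp is part of the cache key
                if key not in self.cache:
                    self.downloads += 1
                    self.cache.add(key)
            else:
                self.downloads += 1   # downloads to a temp file before comparing
            self.current[name] = ts


def downloads_after_readd(use_fetch_cache):
    """Add the same file at timestamp 1, then again at timestamp 2."""
    executor = Executor(use_fetch_cache)
    executor.run_task({"foo": 1})
    executor.run_task({"foo": 2})
    return executor.downloads
```

In this model both `downloads_after_readd(True)` and `downloads_after_readd(False)` return 2, which matches the claim that re-adding an unchanged file with a new timestamp triggers a second download in either mode.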





[GitHub] spark pull request #14440: [SPARK-16835][ML] add training data unpersist han...

2016-08-01 Thread WeichenXu123
Github user WeichenXu123 closed the pull request at:

https://github.com/apache/spark/pull/14440





[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14446
  
**[Test build #63097 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63097/consoleFull)**
 for PR 14446 at commit 
[`1054b74`](https://github.com/apache/spark/commit/1054b74f18193378942b7fde26df36e06bff765e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13778: [SPARK-16062][SPARK-15989][SQL] Fix two bugs of Python-o...

2016-08-01 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/13778
  
ping @cloud-fan again, this has been waiting for a while. Do you have time to 
look at it again? Thanks.





[GitHub] spark issue #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptron Class...

2016-08-01 Thread keypointt
Github user keypointt commented on the issue:

https://github.com/apache/spark/pull/14447
  
Hi @mengxr, could you please tell me how to debug the R wrapper? Thanks a lot.

I tried reading the documentation and searching online, but could not figure 
it out myself. The error message from the SparkR console is too vague (see 
below). I also tried to `tail -f` the Spark logs but found no error messages, 
and I tried to create a `RWrapperSuite`, but the class is private and cannot 
be accessed.

```
> model <- spark.mlp(irisDF, ~ Sepal_Length + Sepal_Width + Petal_Length + 
Petal_Width, blockSize=128, initialWeights=seq(1, 9, by = 2), layers=3, 
solver='LBFGS', seed=1234L, maxIter=100, tol=0.5, stepSize=1)
16/08/01 17:16:38 ERROR RBackendHandler: fit on 
org.apache.spark.ml.r.MultilayerPerceptronClassifierWrapper failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
In addition: Warning message:
In if (is.na(object)) { :
  the condition has length > 1 and only the first element will be used
```





[GitHub] spark issue #14442: [SPARK-16836][SQL] Add support for CURRENT_DATE/CURRENT_...

2016-08-01 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14442
  
Can you add an end-to-end test for this in SQLQuerySuite? It's not a great 
place but we will refactor it soon.






[GitHub] spark issue #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptron Class...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14447
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptron Class...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14447
  
**[Test build #63100 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63100/consoleFull)**
 for PR 14447 at commit 
[`04f7fed`](https://github.com/apache/spark/commit/04f7fed0682548068d4bfddebce7bed276432a4d).
 * This patch **fails R style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptron Class...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14447
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63100/
Test FAILed.





[GitHub] spark issue #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptron Class...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14447
  
**[Test build #63100 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63100/consoleFull)**
 for PR 14447 at commit 
[`04f7fed`](https://github.com/apache/spark/commit/04f7fed0682548068d4bfddebce7bed276432a4d).





[GitHub] spark pull request #14434: [SPARK-16828][SQL] remove MaxOf and MinOf

2016-08-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14434





[GitHub] spark pull request #14447: [SPARK-16445][MLlib][SparkR] Multilayer Perceptro...

2016-08-01 Thread keypointt
GitHub user keypointt opened a pull request:

https://github.com/apache/spark/pull/14447

[SPARK-16445][MLlib][SparkR] Multilayer Perceptron Classifier wrapper in 
SparkR

https://issues.apache.org/jira/browse/SPARK-16445

## What changes were proposed in this pull request?

Create Multilayer Perceptron Classifier wrapper in SparkR


## How was this patch tested?

Tested manually on local machine

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/keypointt/spark SPARK-16445

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14447.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14447


commit 400a5ec398bbae0f50f3220cec2acc20bc8b1d6a
Author: Xin Ren 
Date:   2016-07-15T19:20:17Z

[SPARK-16445] add to r method list

commit 2e3fe27bd7f6109cacbe3e6b8a675c4034cc11e4
Author: Xin Ren 
Date:   2016-07-16T17:54:32Z

Merge branch 'master' into SPARK-16445

commit 4d822b0d8e67adf264ff2019766ddbf3221934e8
Author: Xin Ren 
Date:   2016-07-18T06:26:50Z

Merge branch 'master' into SPARK-16445

commit ce6f74d5020282762adbd96e101ed95ae9abf92c
Author: Xin Ren 
Date:   2016-07-19T06:56:29Z

[SPARK-16445] add R method for monmlp

commit eff4097ffd9d66b7f226b07a8b6e89c4f7c15336
Author: Xin Ren 
Date:   2016-07-20T06:26:46Z

[SPARK-16445] add fit() in r wrapper

commit fb87bd58f1490356d9c0b99d791194e1a18f03e6
Author: Xin Ren 
Date:   2016-07-22T05:59:45Z

[SPARK-16445] model exists already, remove added ones

commit 0ed2280d7ca7d25b86411b3d97fa3e85353b19b1
Author: Xin Ren 
Date:   2016-07-22T06:11:04Z

[SPARK-16445] rename, monmlp, to, mlp

commit 2d1d1400fc168a7628c62df88d8267ca12eceb0a
Author: Xin Ren 
Date:   2016-07-22T06:22:56Z

[SPARK-16445] fix styles

commit bddde5c09bd65a2608c4287c9461b61c3598efab
Author: Xin Ren 
Date:   2016-07-22T06:42:07Z

[SPARK-16445] r style fix

commit fc3b9492f6333e1049d2ea483e141f442a152098
Author: Xin Ren 
Date:   2016-07-22T06:51:12Z

[SPARK-16445] missed json4s import

commit f3aa8fd75a67c557e193a9f030b07001781097a1
Author: Xin Ren 
Date:   2016-07-26T22:04:16Z

Merge branch 'master' into SPARK-16445

commit 61c8122a2584dafb581b045bd3cd7c9742022786
Author: Xin Ren 
Date:   2016-07-26T22:04:31Z

Merge branch 'SPARK-16445' of https://github.com/keypointt/spark into 
SPARK-16445

commit 07638f4f310469109ca766d14916a77960f80987
Author: Xin Ren 
Date:   2016-07-27T00:10:04Z

[SPARK-16445] correct r method name

commit 79675ad567e16494d1d2445b773dc6fd3649bc7c
Author: Xin Ren 
Date:   2016-07-27T01:03:34Z

[SPARK-16445] tmp save

commit 2d66705a4f26e2823c7102032f596d74a278bc68
Author: Xin Ren 
Date:   2016-07-29T22:36:39Z

[SPARK-16445] fix model name

commit b7c4f0cd4870054eb628b333c016fabea37eb957
Author: Xin Ren 
Date:   2016-07-30T00:50:55Z

[SPARK-16445] fix parameters

commit 52c23106d1623a6f54fb7ed2eae842988e8c7bbf
Author: Xin Ren 
Date:   2016-08-01T22:19:29Z

Merge branch 'master' into SPARK-16445

commit 04f7fed0682548068d4bfddebce7bed276432a4d
Author: Xin Ren 
Date:   2016-08-02T00:52:27Z

[SPARK-16445] r test failing







[GitHub] spark issue #14434: [SPARK-16828][SQL] remove MaxOf and MinOf

2016-08-01 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14434
  
Thanks. Merging to master.





[GitHub] spark pull request #14392: [SPARK-16446] [SparkR] [ML] Gaussian Mixture Mode...

2016-08-01 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/14392#discussion_r73079016
  
--- Diff: R/pkg/R/mllib.R ---
@@ -632,3 +659,106 @@ setMethod("predict", signature(object = 
"AFTSurvivalRegressionModel"),
   function(object, newData) {
 return(dataFrame(callJMethod(object@jobj, "transform", 
newData@sdf)))
   })
+
+#' Multivariate Gaussian Mixture Model (GMM)
+#'
+#' Fits multivariate gaussian mixture model against a Spark DataFrame, 
similarly to R's
+#' mvnormalmixEM(). Users can call \code{summary} to print a summary of 
the fitted model,
+#' \code{predict} to make predictions on new data, and 
\code{write.ml}/\code{read.ml}
+#' to save/load fitted models.
+#'
+#' @param data SparkDataFrame for training
+#' @param formula A symbolic description of the model to be fitted. 
Currently only a few formula
+#'operators are supported, including '~', '.', ':', '+', 
and '-'.
+#'Note that the response variable of formula is empty in 
spark.mvnormalmixEM.
+#' @param k Number of independent Gaussians in the mixture model.
+#' @param maxIter Maximum iteration number
+#' @param tol The convergence tolerance
+#' @aliases spark.mvnormalmixEM,SparkDataFrame,formula-method
+#' @return \code{spark.mvnormalmixEM} returns a fitted multivariate 
gaussian mixture model
+#' @rdname spark.mvnormalmixEM
+#' @name spark.mvnormalmixEM
+#' @export
+#' @examples
+#' \dontrun{
+#' sparkR.session()
+#' library(mvtnorm)
+#' set.seed(100)
+#' a <- rmvnorm(4, c(0, 0))
+#' b <- rmvnorm(6, c(3, 4))
+#' data <- rbind(a, b)
+#' df <- createDataFrame(as.data.frame(data))
+#' model <- spark.mvnormalmixEM(df, ~ V1 + V2, k = 2)
+#' summary(model)
+#'
+#' # fitted values on training data
+#' fitted <- predict(model, df)
+#' head(select(fitted, "V1", "prediction"))
+#'
+#' # save fitted model to input path
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#'
+#' # can also read back the saved model and print
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.mvnormalmixEM since 2.1.0
+#' @seealso mixtools: 
\url{https://cran.r-project.org/web/packages/mixtools/}
+#' @seealso \link{predict}, \link{read.ml}, \link{write.ml}
+setMethod("spark.mvnormalmixEM", signature(data = "SparkDataFrame", 
formula = "formula"),
+  function(data, formula, k = 2, maxIter = 100, tol = 0.01) {
+formula <- paste(deparse(formula), collapse = "")
+jobj <- 
callJStatic("org.apache.spark.ml.r.GaussianMixtureWrapper", "fit", data@sdf,
+formula, as.integer(k), 
as.integer(maxIter), tol)
+return(new("GaussianMixtureModel", jobj = jobj))
+  })
+
+#  Get the summary of a multivariate gaussian mixture model
+
+#' @param object A fitted gaussian mixture model
+#' @return \code{summary} returns the model's lambda, mu, sigma and 
posterior
+#' @rdname spark.mvnormalmixEM
--- End diff --

You can also run the `check-cran.sh` script in `R/` and see if there are 
any warnings related to the methods being added in this PR.





[GitHub] spark issue #13647: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2016-08-01 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/13647
  
ping @mengxr @jkbradley @yanboliang Can you give me some comments on this 
PR? I can start improving it for 2.1+.

Thanks!





[GitHub] spark issue #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn't work

2016-08-01 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/14433
  
@felixcheung I will try to retrieve the terminal/shell type before printing 
the message. I will update the PR if I can find a way to do that. Thanks!





[GitHub] spark issue #14445: [SPARK-16320] [SQL] Fix performance regression for parqu...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14445
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63096/
Test PASSed.





[GitHub] spark issue #14445: [SPARK-16320] [SQL] Fix performance regression for parqu...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14445
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14445: [SPARK-16320] [SQL] Fix performance regression for parqu...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14445
  
**[Test build #63096 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63096/consoleFull)**
 for PR 14445 at commit 
[`272fb81`](https://github.com/apache/spark/commit/272fb8100f1861d78f78d7bc34e1ff68284b773a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14368: [SPARK-16734][EXAMPLES][SQL] Revise examples of a...

2016-08-01 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14368#discussion_r73076628
  
--- Diff: examples/src/main/r/RSparkSQLExample.R ---
@@ -18,31 +18,43 @@
 library(SparkR)
 
 # $example on:init_session$
-sparkR.session(appName = "MyApp", sparkConfig = list(spark.executor.memory = "1g"))
+sparkR.session(appName = "MyApp", sparkConfig = list(spark.some.config.option = "some-value"))
--- End diff --

It's just an example of how to set extra configuration options. The option is
not read anywhere.





[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11157
  
**[Test build #63099 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63099/consoleFull)**
 for PR 11157 at commit 
[`efc1d18`](https://github.com/apache/spark/commit/efc1d183c2f04bd1bd71f2b5425432a588b68caa).





[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11157
  
Merged build finished. Test FAILed.





[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11157
  
**[Test build #63098 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63098/consoleFull)**
 for PR 11157 at commit 
[`a8e828f`](https://github.com/apache/spark/commit/a8e828fecb88c254a89ee82e68c9aa548969dfdb).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11157
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63098/
Test FAILed.





[GitHub] spark pull request #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor po...

2016-08-01 Thread skonto
Github user skonto commented on a diff in the pull request:

https://github.com/apache/spark/pull/11157#discussion_r73074211
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala
 ---
@@ -356,4 +374,233 @@ private[mesos] trait MesosSchedulerUtils extends 
Logging {
 
sc.conf.getTimeAsSeconds("spark.mesos.rejectOfferDurationForReachedMaxCores", 
"120s")
   }
 
+  /**
+   * Checks executor ports if they are within some range of the offered 
list of ports ranges,
+   *
+   * @param sc the Spark Context
+   * @param ports the list of ports to check
+   * @return true if ports are within range false otherwise
+   */
+  protected def checkPorts(sc: SparkContext, ports: List[(Long, Long)]): 
Boolean = {
+
+def checkIfInRange(port: Long, ps: List[(Long, Long)]): Boolean = {
+  ps.exists(r => r._1 <= port & r._2 >= port)
+}
+
+val portsToCheck = ManagedPorts.getPortValues(sc.conf)
+val nonZeroPorts = portsToCheck.filter(_ != 0)
+val withinRange = nonZeroPorts.forall(p => checkIfInRange(p, ports))
+// make sure we have enough ports to allocate per offer
+ports.map(r => r._2 - r._1 + 1).sum >= portsToCheck.size && withinRange
+  }
+
+  /**
+   * Partitions port resources.
+   *
+   * @param conf the spark config
+   * @param ports the ports offered
+   * @return resources left, port resources to be used and the list of 
assigned ports
+   */
+  def partitionPorts(
+  conf: SparkConf,
+  ports: List[Resource])
+: (List[Resource], List[Resource], List[Long]) = {
+val taskPortRanges = getRangeResourceWithRoleInfo(ports.asJava, 
"ports")
+val portsToCheck = ManagedPorts.getPortValues(conf)
+val nonZeroPorts = portsToCheck.filter(_ != 0)
+// reserve non zero ports first
+val nonZeroResources = reservePorts(taskPortRanges, nonZeroPorts)
+// reserve actual port numbers for zero ports - not set by the user
+val numOfZeroPorts = portsToCheck.count(_ == 0)
+val randPorts = pickRandomPortsFromRanges(nonZeroResources._1, 
numOfZeroPorts)
+val zeroResources = reservePorts(nonZeroResources._1, randPorts)
+val (portResourcesLeft, portResourcesToBeUsed) =
+  createResources(nonZeroResources, zeroResources)
+(portResourcesLeft, portResourcesToBeUsed, nonZeroPorts ++ randPorts)
+  }
+
+  private object ManagedPorts {
+val portNames = List("spark.executor.port", "spark.blockManager.port")
+
+def getPortValues(conf: SparkConf): List[Long] = {
+  portNames.map(conf.getLong(_, 0))
+}
+  }
+
+  private def createResources(
--- End diff --

ok





[GitHub] spark issue #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor port rest...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11157
  
**[Test build #63098 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63098/consoleFull)**
 for PR 11157 at commit 
[`a8e828f`](https://github.com/apache/spark/commit/a8e828fecb88c254a89ee82e68c9aa548969dfdb).





[GitHub] spark issue #14446: [SPARK-16841][SQL] Improves the row level metrics perfor...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14446
  
**[Test build #63097 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63097/consoleFull)**
 for PR 14446 at commit 
[`1054b74`](https://github.com/apache/spark/commit/1054b74f18193378942b7fde26df36e06bff765e).





[GitHub] spark pull request #14434: [SPARK-16828][SQL] remove MaxOf and MinOf

2016-08-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14434#discussion_r73073652
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -662,10 +662,6 @@ object NullPropagation extends Rule[LogicalPlan] {
   case e @ Substring(_, Literal(null, _), _) => Literal.create(null, 
e.dataType)
   case e @ Substring(_, _, Literal(null, _)) => Literal.create(null, 
e.dataType)
 
-  // MaxOf and MinOf can't do null propagation
-  case e: MaxOf => e
-  case e: MinOf => e
--- End diff --

No, we put `MaxOf` and `MinOf` here because they are a special case of
`BinaryArithmetic`, but `Greatest` and `Least` are not binary.
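
For illustration, a minimal sketch in plain Scala (not Catalyst code) of the
two null behaviors involved. It models SQL NULL as `None`. The names
`addNullPropagating` and `greatestSkippingNulls` are hypothetical, and the
sketch assumes that ordinary binary arithmetic yields NULL when either input is
NULL while `Greatest` skips NULL inputs.

```scala
object NullHandlingSketch {
  // Null-propagating binary operator: the result is NULL (None)
  // as soon as either operand is NULL.
  def addNullPropagating(a: Option[Int], b: Option[Int]): Option[Int] =
    for (x <- a; y <- b) yield x + y

  // N-ary "greatest" that skips NULL inputs (assumed semantics);
  // the result is NULL only when every input is NULL.
  def greatestSkippingNulls(values: Option[Int]*): Option[Int] = {
    val present = values.flatten
    if (present.isEmpty) None else Some(present.max)
  }
}
```

An operator that skips NULLs cannot be folded to a NULL literal just because
one argument is NULL, which is why it needs separate treatment from the
generic binary rule.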





[GitHub] spark pull request #14446: [SPARK-16841][SQL] Improves the row level metrics...

2016-08-01 Thread clockfly
GitHub user clockfly opened a pull request:

https://github.com/apache/spark/pull/14446

[SPARK-16841][SQL] Improves the row level metrics performance when reading 
Parquet table

## What changes were proposed in this pull request?

When reading from a Parquet table, Spark updates row-level metrics such as
recordsRead and bytesRead.
The current implementation is not very efficient: updating these metrics may
take 20% of the read time.

Test benchmark:
```
// Generates parquet table with nested columns

spark.range(1).select(struct($"id").as("nc")).write.parquet("/tmp/data4")

// Runs `block` once, prints the elapsed time, and returns it in milliseconds
def time[R](block: => R): Long = {
  val t0 = System.nanoTime()
  val result = block // call-by-name
  val t1 = System.nanoTime()
  val elapsedMs = (t1 - t0) / 1000000 // nanoseconds to milliseconds
  println("Elapsed time: " + elapsedMs + "ms")
  elapsedMs
}

// Average elapsed time over 20 reads
val x = (0 until 20).toList.map(_ =>
  time(spark.read.parquet("/tmp/data4").filter($"nc.id" < 100).collect())).sum / 20
```

## How was this patch tested?

Existing unit tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/clockfly/spark improve_metrics_performance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14446.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14446


commit 1054b74f18193378942b7fde26df36e06bff765e
Author: Sean Zhong 
Date:   2016-08-01T23:35:30Z

improve row level metrics performance







[GitHub] spark issue #14411: [SPARK-16804][SQL] Correlated subqueries containing LIMI...

2016-08-01 Thread nsyca
Github user nsyca commented on the issue:

https://github.com/apache/spark/pull/14411
  
@hvanhovell,

First, my apologies for the delayed replies. I am travelling this week and
only have sporadic connectivity. Thank you for explaining the implementation
and the reasoning behind it. It is very helpful for a beginner like me.

My bad! What I meant in my previous comment about rewriting subqueries to
joins is actually the moving of the correlated predicates from their original
positions to outside the scopes of the subqueries, specifically the call to
the function pullOutCorrelatedPredicates(). I hope I got it right this time.
I see this as one of the root causes of many problems. Bear with me: I don't
have a good solution yet, as I am still getting familiar with the code. Here
is an example of the problems, in my opinion. With the rewrite, we cannot
distinguish between the EXISTS form and the IN form of the original SQL.

select * from t1 where exists (select 1 from t2 where t1.c1=t2.c2)
-and-
select * from t1 where t1.c1 in (select t2.c2 from t2)

are represented the same way after the Analysis phase. This is not an issue
because they are semantically equivalent. However, when we add NOT,

select * from t1 where not exists (select 1 from t2 where t1.c1=t2.c2)
-and-
select * from t1 where t1.c1 not in (select t2.c2 from t2)

are NOT semantically equivalent when T2.C2 can produce NULL values.
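
To make the NULL case concrete, here is a minimal sketch in plain Scala (no
Spark involved) that models SQL three-valued logic with `Option`. The helper
names `sqlEq`, `notExists` and `notIn` are hypothetical and only mirror the two
queries above.

```scala
object NotInVsNotExists {
  // SQL equality under three-valued logic: None stands for UNKNOWN,
  // which is the result whenever a NULL is involved.
  def sqlEq(a: Option[Int], b: Option[Int]): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  // NOT EXISTS (SELECT 1 FROM t2 WHERE t1.c1 = t2.c2):
  // a t2 row counts as a match only if the predicate is definitely TRUE.
  def notExists(c1: Option[Int], t2: Seq[Option[Int]]): Boolean =
    !t2.exists(c2 => sqlEq(c1, c2).contains(true))

  // t1.c1 NOT IN (SELECT t2.c2 FROM t2):
  // the row is kept only if every comparison is definitely FALSE;
  // a single UNKNOWN comparison filters the row out.
  def notIn(c1: Option[Int], t2: Seq[Option[Int]]): Boolean =
    t2.forall(c2 => sqlEq(c1, c2).contains(false))

  def main(args: Array[String]): Unit = {
    val t1: Seq[Option[Int]] = Seq(Some(1), Some(2))
    val t2: Seq[Option[Int]] = Seq(Some(1), None) // t2.c2 contains a NULL
    println(t1.filter(notExists(_, t2))) // keeps the row with c1 = 2
    println(t1.filter(notIn(_, t2)))     // keeps no rows
  }
}
```

With a NULL in t2.c2, NOT EXISTS still returns the t1 row whose c1 is 2, while
NOT IN returns nothing, so a rewrite that treats the two forms identically
changes the result.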

Lastly, your comment on the SAMPLE operator seems right. I will give it a
shot and add it to this PR.

Thanks again for your patience.





[GitHub] spark issue #14445: [SPARK-16320] [SQL] Fix performance regression for parqu...

2016-08-01 Thread clockfly
Github user clockfly commented on the issue:

https://github.com/apache/spark/pull/14445
  
@gatorsmile  Thanks! updated.





[GitHub] spark pull request #14176: [SPARK-16525][SQL] Enable Row Based HashMap in Ha...

2016-08-01 Thread sameeragarwal
Github user sameeragarwal commented on a diff in the pull request:

https://github.com/apache/spark/pull/14176#discussion_r73070005
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -279,9 +280,15 @@ case class HashAggregateExec(
 .map(_.asInstanceOf[DeclarativeAggregate])
   private val bufferSchema = 
StructType.fromAttributes(aggregateBufferAttributes)
 
-  // The name for Vectorized HashMap
-  private var vectorizedHashMapTerm: String = _
-  private var isVectorizedHashMapEnabled: Boolean = _
+  // The name for Fast HashMap
+  private var fastHashMapTerm: String = _
+  // whether vectorized hashmap or row based hashmap is enabled
+  // we make sure that at most one of the two flags is true
+  // i.e., assertFalse(isVectorizedHashMapEnabled && 
isRowBasedHashMapEnabled)
+  private var isVectorizedHashMapEnabled: Boolean = false
+  private var isRowBasedHashMapEnabled: Boolean = false
+  // auxiliary flag, true if any of two above is true
+  private var isFastHashMapEnabled: Boolean = false
--- End diff --

Sure, what I meant was that we can even initialize it with 
`isVectorizedHashMapEnabled || isRowBasedHashMapEnabled` to make the implied 
semantics clear.
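
A minimal standalone sketch of that idea, assuming the two flags are known
when the object is built. The `Flags` case class is hypothetical and uses
immutable fields rather than the `private var`s in the diff.

```scala
object FastHashMapFlags {
  final case class Flags(
      isVectorizedHashMapEnabled: Boolean,
      isRowBasedHashMapEnabled: Boolean) {
    // At most one of the two hash map implementations may be enabled.
    require(
      !(isVectorizedHashMapEnabled && isRowBasedHashMapEnabled),
      "vectorized and row-based hash maps are mutually exclusive")

    // Derived flag: defined directly from the other two,
    // so the relationship is visible at the declaration.
    val isFastHashMapEnabled: Boolean =
      isVectorizedHashMapEnabled || isRowBasedHashMapEnabled
  }
}
```

If the two flags are assigned later, as `var`s, a `def` would keep the derived
value in sync, whereas a `val` or an initialized `var` is computed only once.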





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14384
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63095/
Test FAILed.





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14384
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14445: [SPARK-16320][SQL] Fix performance regression for parque...

2016-08-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14445
  
Maybe we can correct the PR description and make it more accurate.

This PR avoids the extra memory copy when the vectorized Parquet record
reader is not being used for reading a non-partitioned Parquet table. One
typical case is a Parquet table with non-atomic types, including null,
UDTs, arrays, structs, and maps. Another case is when users set
`spark.sql.parquet.enableVectorizedReader` to `false`.
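
For reference, a sketch of the second case, assuming a local Spark 2.x
session with spark-sql on the classpath. The application name is arbitrary and
the path `/tmp/data4` is borrowed from the benchmark in the PR description.

```scala
import org.apache.spark.sql.SparkSession

object DisableVectorizedReader {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("non-vectorized-parquet-read") // arbitrary name
      .master("local[*]")
      // Force the non-vectorized Parquet read path
      .config("spark.sql.parquet.enableVectorizedReader", "false")
      .getOrCreate()

    // Assumes a Parquet table was written to this path beforehand
    spark.read.parquet("/tmp/data4").show()
    spark.stop()
  }
}
```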

Is my understanding correct?





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14384
  
**[Test build #63095 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63095/consoleFull)**
 for PR 14384 at commit 
[`119e576`](https://github.com/apache/spark/commit/119e57601b6bd0b9aa0ad29ca20624f18f13a362).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14445: [SPARK-16320][SQL] Fix performance regression for parque...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14445
  
**[Test build #63096 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63096/consoleFull)**
 for PR 14445 at commit 
[`272fb81`](https://github.com/apache/spark/commit/272fb8100f1861d78f78d7bc34e1ff68284b773a).





[GitHub] spark issue #14442: [SPARK-16836][SQL] Add support for CURRENT_DATE/CURRENT_...

2016-08-01 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14442
  
LGTM

FYI, MySQL and PostgreSQL support NOW as a synonym of CURRENT_TIMESTAMP





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14384
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14384
  
**[Test build #63094 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63094/consoleFull)**
 for PR 14384 at commit 
[`b376dfb`](https://github.com/apache/spark/commit/b376dfb35dc8ebb90804264dd5683514a3166d9e).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14384
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63094/
Test FAILed.





[GitHub] spark pull request #14445: [SPARK-16320][SQL] Fix performance regression for...

2016-08-01 Thread clockfly
GitHub user clockfly opened a pull request:

https://github.com/apache/spark/pull/14445

[SPARK-16320][SQL] Fix performance regression for parquet table with nested 
fields

## What changes were proposed in this pull request?

For a non-partitioned Parquet table with nested columns, Spark 2.0 adds an
extra, unnecessary memory copy to append partition values for each row.

By fixing this bug, we get about a 30% performance gain in a test case like
this:

```
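// Generates parquet table with nested columns

spark.range(1).select(struct($"id").as("nc")).write.parquet("/tmp/data4")

// `time` is not defined in this snippet; it is assumed to be a helper that
// runs its argument and returns the elapsed time in milliseconds
val t0 = System.nanoTime()
val x = (0 until 20).toList.map(_ =>
  time(spark.read.parquet("/tmp/data4").filter($"nc.id" < 100).collect())).sum / 20
// Total wall-clock time for all 20 reads, converted from nanoseconds
println("Elapsed time: " + (System.nanoTime() - t0) / 1000000 + "ms")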
``` 


## How was this patch tested?

Existing unit tests




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/clockfly/spark fix_parquet_regression_2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14445.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14445


commit 272fb8100f1861d78f78d7bc34e1ff68284b773a
Author: Sean Zhong 
Date:   2016-08-01T04:29:44Z

fix parquet_regression







[GitHub] spark pull request #14439: [SPARK-16714][SPARK-16735][SPARK-16646] array, ma...

2016-08-01 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14439#discussion_r73064350
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
 ---
@@ -157,6 +145,26 @@ object TypeCoercion {
 })
   }
 
+  /**
+   * Similar to [[findWiderCommonType]], but can't promote to string.
+   */
+  private def findWiderTypeWithoutStringPromotion(types: Seq[DataType]): 
Option[DataType] = {
--- End diff --

It is weird that its name is `findWiderTypeWithoutStringPromotion` because 
`findTightestCommonTypeOfTwo` is used inside. Also, let's add more docs to this 
method.





[GitHub] spark issue #14439: [SPARK-16714][SPARK-16735][SPARK-16646] array, map, grea...

2016-08-01 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/14439
  
It would be good to summarize the behavior of other systems in the 
description. Let's also explain the behavioral change of this PR in the 
description, so that others can understand its implications. 

Also, I am wondering if we can change the behavior of 
`DecimalPrecision.widerDecimalType`. Right now, `widerDecimalType` will 
truncate the integral part, which is not intuitive. 
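
The truncation can be sketched with a small standalone model. It assumes, without confirmation from this thread, that the widening rule takes the larger scale and the larger count of integral digits, then clamps precision and scale independently to a maximum of 38. `Dec`, `bounded`, `wider`, and `MaxPrecision` are names made up for this sketch, and the rule should be checked against the Spark source before relying on it.

```scala
// Sketch of the assumed widening rule; not copied from Spark.
object DecimalWidening {
  val MaxPrecision = 38 // assumed cap, applied to both precision and scale

  // A (precision, scale) pair standing in for a decimal type.
  final case class Dec(precision: Int, scale: Int) {
    def integralDigits: Int = precision - scale
  }

  // Clamp precision and scale independently. When the requested precision
  // exceeds the cap, the scale is kept and the integral digits are lost.
  def bounded(precision: Int, scale: Int): Dec =
    Dec(math.min(precision, MaxPrecision), math.min(scale, MaxPrecision))

  // Widen two decimals: keep the larger scale and the larger integral range.
  def wider(a: Dec, b: Dec): Dec = {
    val scale = math.max(a.scale, b.scale)
    val range = math.max(a.integralDigits, b.integralDigits)
    bounded(range + scale, scale)
  }
}
```

Under these assumptions, `DecimalWidening.wider(Dec(10, 2), Dec(5, 4))` gives `Dec(12, 4)` and nothing is lost. Widening `Dec(38, 0)` with `Dec(38, 38)` asks for precision 76 and scale 38, which clamps to `Dec(38, 38)` and leaves zero integral digits. That second case is the kind of truncation the comment calls unintuitive.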





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14384
  
**[Test build #63095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63095/consoleFull)** for PR 14384 at commit [`119e576`](https://github.com/apache/spark/commit/119e57601b6bd0b9aa0ad29ca20624f18f13a362).





[GitHub] spark issue #14212: [SPARK-16558][Examples][MLlib] examples/mllib/LDAExample...

2016-08-01 Thread yinxusen
Github user yinxusen commented on the issue:

https://github.com/apache/spark/pull/14212
  
@MLnick They serve different purposes. This one is for users who have built 
their tools upon it. The `LatentDirichletAllocationExample` is for ML docs.





[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14444
  
Can one of the admins verify this patch?





[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-08-01 Thread eyalfa
Github user eyalfa commented on the issue:

https://github.com/apache/spark/pull/14444
  
@cloud-fan 





[GitHub] spark pull request #14444: [SPARK-16839] [SQL] redundant aliases after clean...

2016-08-01 Thread eyalfa
GitHub user eyalfa opened a pull request:

https://github.com/apache/spark/pull/14444

[SPARK-16839] [SQL] redundant aliases after cleanupAliases

## What changes were proposed in this pull request?
A failing test for now; a proposed fix will follow soon.

## How was this patch tested?
Ran the analysis suite and confirmed that the added test fails while the 
existing tests still pass.





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/eyalfa/spark SPARK-16839_redundant_aliases_after_cleanupAliases

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14444.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14444


commit 313d584532a4435e02e39e574070354cdef240ea
Author: eyal farago 
Date:   2016-08-01T21:51:45Z

SPARK-16839_redundant_aliases_after_cleanupAliases: failing test.







[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14384
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14384
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63093/
Test FAILed.





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14384
  
**[Test build #63093 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63093/consoleFull)** for PR 14384 at commit [`b376dfb`](https://github.com/apache/spark/commit/b376dfb35dc8ebb90804264dd5683514a3166d9e).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14441: [SPARK-16837] [SQL] TimeWindow incorrectly drops slideDu...

2016-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14441
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63090/
Test PASSed.




