[GitHub] spark issue #19124: [SPARK-21912][SQL] ORC/Parquet table should not create i...

2017-09-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19124
  
Thank you for reviewing and helping with this PR, @tejasapatil, @viirya, 
and @HyukjinKwon, too!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...

2017-09-06 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18975#discussion_r137449077
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala
 ---
@@ -524,6 +525,50 @@ class DDLParserSuite extends PlanTest with 
SharedSQLContext {
 assert(e.message.contains("you can only specify one of them."))
   }
 
+  test("insert overwrite directory") {
+val v1 = "INSERT OVERWRITE DIRECTORY '/tmp/file' USING parquet SELECT 
1 as a"
+parser.parsePlan(v1) match {
+  case InsertIntoDir(_, storage, provider, query, overwrite) =>
+assert(storage.locationUri != None && 
storage.locationUri.get.toString == "/tmp/file")
--- End diff --

Nit:
```Scala
assert(storage.locationUri.isDefined && storage.locationUri.get.toString == "/tmp/file")
```


---




[GitHub] spark issue #19124: [SPARK-21912][SQL] ORC/Parquet table should not create i...

2017-09-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19124
  
@gatorsmile, thank you for your help! This PR was almost entirely made by you.


---




[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...

2017-09-06 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18975#discussion_r137448976
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala
 ---
@@ -32,7 +32,8 @@ import 
org.apache.spark.sql.catalyst.dsl.plans.DslLogicalPlan
 import org.apache.spark.sql.catalyst.expressions.JsonTuple
 import org.apache.spark.sql.catalyst.parser.ParseException
 import org.apache.spark.sql.catalyst.plans.PlanTest
-import org.apache.spark.sql.catalyst.plans.logical.{Generate, LogicalPlan, 
Project, ScriptTransformation}
+import org.apache.spark.sql.catalyst.plans.logical.{Generate, 
InsertIntoDir, LogicalPlan,
+Project, ScriptTransformation}
--- End diff --

We do not have a character limit for import lines. If an import is too long, our style is to split it:
```Scala
import org.apache.spark.sql.catalyst.plans.logical.{Generate, InsertIntoDir, LogicalPlan}
import org.apache.spark.sql.catalyst.plans.logical.{Project, ScriptTransformation}
```


---




[GitHub] spark pull request #19124: [SPARK-21912][SQL] ORC/Parquet table should not c...

2017-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19124


---




[GitHub] spark issue #19124: [SPARK-21912][SQL] ORC/Parquet table should not create i...

2017-09-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19124
  
Thanks! Merged to master.


---




[GitHub] spark issue #19124: [SPARK-21912][SQL] ORC/Parquet table should not create i...

2017-09-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19124
  
LGTM


---




[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17451
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81493/
Test FAILed.


---




[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17451
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17451
  
**[Test build #81493 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81493/testReport)**
 for PR 17451 at commit 
[`b4d928d`](https://github.com/apache/spark/commit/b4d928d41b9d1d97c512d1f6c5381db4589cd793).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSub...

2017-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19151


---




[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19086
  
If we follow our current approach, rename becomes a special case. All the 
other commands follow a different resolution path.

Just curious, which company are you from? I am trying to see the impact of 
this PR.


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19151
  
Thanks! Merged to master.


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19151
  
LGTM


---




[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19149
  
**[Test build #81495 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81495/testReport)**
 for PR 19149 at commit 
[`1a22533`](https://github.com/apache/spark/commit/1a22533e21fd98e815ad425e6e46228b97e55386).


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19150
  
`BlockGeneratorSuite` and `StreamTest` are fixed and tested.


---




[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19136
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81490/
Test PASSed.


---




[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19136
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19136
  
**[Test build #81490 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81490/testReport)**
 for PR 19136 at commit 
[`89cbfb7`](https://github.com/apache/spark/commit/89cbfb7c98325852c5c97d321d83fe91154b129e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class DataSourceV2Options `
  * `public abstract class DataSourceV2Reader `
  * `class RowToUnsafeRowReadTask implements ReadTask `
  * `class RowToUnsafeDataReader implements DataReader `
  * `class DataSourceRDDPartition(val index: Int, val readTask: 
ReadTask[UnsafeRow])`
  * `class DataSourceRDD(`
  * `case class DataSourceV2Relation(`
  * `case class DataSourceV2ScanExec(`


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19150
  
**[Test build #81494 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81494/testReport)**
 for PR 19150 at commit 
[`678d1b2`](https://github.com/apache/spark/commit/678d1b214f2d8cec72b47782177e406cbab9f5ee).


---




[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...

2017-09-06 Thread keypointt
Github user keypointt commented on the issue:

https://github.com/apache/spark/pull/17451
  
```
>>> from pyspark.ml.feature import Word2Vec
>>> sent = ("a b " * 100 + "a c " * 10).split(" ")
>>> doc = spark.createDataFrame([(sent,), (sent,)], ["sentence"])
>>> word2Vec = Word2Vec(vectorSize=5, seed=42, inputCol="sentence", 
outputCol="model")
>>> model = word2Vec.fit(doc)
```
Above is the setup, and I created the `vec` below. It works nicely with 
`model.findSynonyms`:
```
>>> from pyspark.ml.linalg import Vectors
>>> vec = Vectors.dense([0.267, -0.2691, 0.058, -0.0801, 0.1821, 0.4162, 
0.0259, -0.2163, 0.1787, 0.0764])

>>> model.findSynonyms(vec, 2)
DataFrame[word: string, similarity: double]
```
But `vec` cannot be passed to `model.findSynonymsArray`, even though it is a Vector:
```
>>> model.findSynonymsArray(vec, 2)
word:
[0.267,-0.2691,0.058,-0.0801,0.1821,0.4162,0.0259,-0.2163,0.1787,0.0764]
Traceback (most recent call last):
  File "", line 1, in 
  File 
"/Users/renxin/Documents/workspace/spark/python/pyspark/ml/feature.py", line 
2951, in findSynonymsArray
tuples = self._java_obj.findSynonymsArray(word, num)
  File 
"/Users/renxin/Documents/workspace/spark/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py",
 line 1160, in __call__
  File 
"/Users/renxin/Documents/workspace/spark/python/pyspark/sql/utils.py", line 63, 
in deco
return f(*a, **kw)
  File 
"/Users/renxin/Documents/workspace/spark/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py",
 line 324, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling 
o65.findSynonymsArray. Trace:
py4j.Py4JException: Method findSynonymsArray([class java.util.ArrayList, 
class java.lang.Integer]) does not exist
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
at py4j.Gateway.invoke(Gateway.java:274)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)


>>> type(vec)

```

Here `vec` is taken as a `java.util.ArrayList`. 
Does `self._java_obj.findSynonymsArray(word, num)` behave differently from 
`self._call_java("findSynonyms", word, num)` for the Vector type? 

Thank you, Holden 😄 
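
One plausible explanation, sketched as a toy: assuming `_call_java` converts each Python argument to a Java object before dispatching, while a direct call on `self._java_obj` hands the raw Python value to the gateway, the two paths would see different argument types. The snippet below is pure Python with no Spark or py4j involved. `JavaVector`, `to_java`, `FakeJavaModel`, and `call_with_conversion` are all hypothetical names used only to illustrate "convert first, then dispatch".

```python
from typing import Any, Callable, List


class JavaVector:
    """Hypothetical marker for a value already converted to a JVM vector."""

    def __init__(self, values: List[float]) -> None:
        self.values = list(values)


def to_java(arg: Any) -> Any:
    """Hypothetical converter: turn Python float sequences into JavaVector."""
    if isinstance(arg, (list, tuple)) and all(isinstance(x, float) for x in arg):
        return JavaVector(list(arg))
    return arg  # ints, strings, etc. pass through unchanged


class FakeJavaModel:
    """Hypothetical JVM-side object whose method only accepts JavaVector."""

    def find_synonyms(self, word: Any, num: int) -> str:
        if not isinstance(word, JavaVector):
            raise TypeError(f"no method find_synonyms({type(word).__name__}, int)")
        return f"top {num} synonyms for a vector of length {len(word.values)}"


def call_with_conversion(obj: Any, name: str, *args: Any) -> Any:
    """Convert every argument first, then dispatch to the named method."""
    method: Callable[..., Any] = getattr(obj, name)
    return method(*[to_java(a) for a in args])


model = FakeJavaModel()
vec = [0.267, -0.2691, 0.058]

# Path 1: arguments are converted before the call, so the method matches.
converted_result = call_with_conversion(model, "find_synonyms", vec, 2)

# Path 2: direct call with the raw Python value, so no overload matches.
try:
    model.find_synonyms(vec, 2)
    direct_error = None
except TypeError as exc:
    direct_error = str(exc)
```

If that assumption holds, routing the new method through the same conversion step as `findSynonyms` would be the thing to try.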


---




[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17451
  
**[Test build #81493 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81493/testReport)**
 for PR 17451 at commit 
[`b4d928d`](https://github.com/apache/spark/commit/b4d928d41b9d1d97c512d1f6c5381db4589cd793).


---




[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...

2017-09-06 Thread keypointt
Github user keypointt commented on the issue:

https://github.com/apache/spark/pull/17451
  
`self._java_obj.findSynonymsArray` is totally a much nicer and more elegant 
solution 👍 


---




[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19077
  
**[Test build #81492 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81492/testReport)**
 for PR 19077 at commit 
[`0c6647c`](https://github.com/apache/spark/commit/0c6647cca3868a24f07c077bd9e37d436b49f5e8).


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19150
  
Oh, I see. Thank you so much. I'll add that.


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19150
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81485/
Test FAILed.


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19150
  
**[Test build #81485 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81485/testReport)**
 for PR 19150 at commit 
[`ab339b3`](https://github.com/apache/spark/commit/ab339b31b311035ebb75e8f079000d306cab16b8).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class DriverSuite extends SparkFunSuite with TimeLimits `
  * `class AsyncRDDActionsSuite extends SparkFunSuite with 
BeforeAndAfterAll with TimeLimits `
  * `class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext 
with TimeLimits `
  * `class EventLoopSuite extends SparkFunSuite with TimeLimits `
  * `trait StreamTest extends QueryTest with SharedSQLContext with 
TimeLimits with BeforeAndAfterAll `


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19150
  
Merged build finished. Test FAILed.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137442821
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +48,13 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
--- End diff --

Thanks, I will add a check.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137442763
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-if (shouldPool(size)) {
+long alignedSize = 
ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Yeah, I think it's acceptable.


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19150
  
It looks like `stop ensures correct shutdown` in `BlockGeneratorSuite` depends 
on interrupting the thread; see http://www.scalatest.org/release_notes/3.0.0:

> If you were relying on the default behavior of interrupting a thread on 
the JVM in ScalaTest 2.2.x, you'll need to define an implicit val referring to 
a `ThreadSignaler`

It looks like ScalaTest changed the default for good reasons, but we should 
explicitly set `ThreadSignaler` to conservatively keep the previous 
behaviour.

```scala
implicit val defaultSignaler: Signaler = ThreadSignaler
```

I just double checked this passes the pending tests.


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19150
  
Thank you, @jerryshao. I'll fix the typo.


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137441814
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +48,13 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
--- End diff --

It is dangerous to reset to an invalid size. We should add a check here or 
put a WARNING in the method comment. 
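
A minimal sketch of such a check, written in Python rather than the PR's Java. The class and method names are hypothetical, and it assumes that a valid size is any value between 0 and the capacity the block was originally allocated with.

```python
class MemoryBlockSketch:
    """Hypothetical stand-in for a memory block with a resettable logical size."""

    def __init__(self, capacity: int) -> None:
        if capacity < 0:
            raise ValueError("capacity must be non-negative")
        self._capacity = capacity  # bytes actually backing the block
        self._size = capacity      # logical size exposed to callers

    def size(self) -> int:
        return self._size

    def reset_size(self, new_size: int) -> None:
        # Reject sizes that would point past the backing storage.
        if not 0 <= new_size <= self._capacity:
            raise ValueError(
                f"invalid size {new_size}; must be in [0, {self._capacity}]"
            )
        self._size = new_size
```

Failing fast here keeps a bad size from surfacing later as an out-of-bounds access far from the call that caused it.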


---




[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-09-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19077#discussion_r137439954
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java
 ---
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-if (shouldPool(size)) {
+long alignedSize = 
ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Maybe minor, but some small allocations will now be counted by the pooling 
mechanism although they were not before, e.g. `POOLING_THRESHOLD_BYTES - 1`.
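
To make the boundary case concrete, here is a small Python sketch (not the Java code in the PR). It assumes that word alignment rounds up to a multiple of 8 bytes and that the pooling threshold is 1 MiB; `round_to_word` and `should_pool` are hypothetical stand-ins for the methods touched by the diff.

```python
# Sketch only: mirrors the idea of the diff, not the actual Java implementation.
WORD_SIZE = 8  # assumption: words are 8 bytes
POOLING_THRESHOLD_BYTES = 1024 * 1024  # assumption: 1 MiB threshold


def round_to_word(num_bytes: int) -> int:
    """Round num_bytes up to the nearest multiple of WORD_SIZE."""
    remainder = num_bytes % WORD_SIZE
    return num_bytes if remainder == 0 else num_bytes + (WORD_SIZE - remainder)


def should_pool(size: int) -> bool:
    """Pool only allocations at or above the threshold."""
    return size >= POOLING_THRESHOLD_BYTES


requested = POOLING_THRESHOLD_BYTES - 1
pooled_before = should_pool(requested)                # decision on the raw size
pooled_after = should_pool(round_to_word(requested))  # decision on the aligned size
```

Under these assumptions only requests within one word below the threshold change behaviour, which is why the effect is likely minor.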


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19151
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19151
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81489/
Test PASSed.


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19151
  
**[Test build #81489 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81489/testReport)**
 for PR 19151 at commit 
[`f05f281`](https://github.com/apache/spark/commit/f05f281eb5fda2b68e7e5f7a1a61a87a7a4bc467).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/19135
  
Is it better to do batch unrolling? I.e., we can check memory usage and 
request memory every 10 records or so, instead of doing it for every record.
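
A rough sketch of the batching idea in Python, not Spark's actual unrolling code. The function name `unroll`, the `reserve` callback, and the fixed `bytes_per_record` estimate are all assumptions made for illustration; a real implementation would estimate sizes from the data.

```python
from typing import Callable, Iterable, List, Tuple


def unroll(
    records: Iterable[object],
    reserve: Callable[[int], bool],
    check_period: int = 10,
    bytes_per_record: int = 100,
) -> Tuple[List[object], bool]:
    """Buffer records, asking for memory once per `check_period` records.

    Returns the buffered records and whether the whole input fit.
    """
    buffered: List[object] = []
    for record in records:
        # Reserve memory for the next batch only at batch boundaries.
        if len(buffered) % check_period == 0:
            if not reserve(check_period * bytes_per_record):
                return buffered, False  # out of memory: stop unrolling
        buffered.append(record)
    return buffered, True


# Usage: count how often memory is requested for 25 records.
calls: List[int] = []


def always_grant(num_bytes: int) -> bool:
    calls.append(num_bytes)
    return True


values, fits = unroll(range(25), always_grant)
```

The trade-off is that each request may over-reserve by up to one batch, in exchange for far fewer calls into the memory manager.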


---




[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19077
  
**[Test build #81491 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81491/testReport)**
 for PR 19077 at commit 
[`729df24`](https://github.com/apache/spark/commit/729df248bf44818202d1ca61b30ab43daf8aea8d).


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19151
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81486/
Test PASSed.


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19151
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19151
  
**[Test build #81486 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81486/testReport)**
 for PR 19151 at commit 
[`4fc4d05`](https://github.com/apache/spark/commit/4fc4d05fd8dfa5397f790051196893d2b6fb2ca5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2

2017-09-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19136#discussion_r137435478
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SchemaRequiredDataSourceV2.java
 ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.sql.sources.v2.reader.DataSourceV2Reader;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * A variant of `DataSourceV2` which requires users to provide a schema 
when reading data. A data
+ * source can inherit both `DataSourceV2` and `SchemaRequiredDataSourceV2` 
if it supports both schema
+ * inference and user-specified schemas.
--- End diff --

cc @rdblue for the new API of schema reference.


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19150
  
LGTM. There's a typo in the PR description: it should be "Timeouts is 
deprecated.", not "TimeLimits".


---




[GitHub] spark issue #19152: [SPARK-21915][ML][PySpark] Model 1 and Model 2 ParamMaps...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19152
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19136
  
**[Test build #81490 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81490/testReport)**
 for PR 19136 at commit 
[`89cbfb7`](https://github.com/apache/spark/commit/89cbfb7c98325852c5c97d321d83fe91154b129e).


---




[GitHub] spark pull request #19152: [SPARK-21915][ML][PySpark] Model 1 and Model 2 Pa...

2017-09-06 Thread marktab
GitHub user marktab opened a pull request:

https://github.com/apache/spark/pull/19152

[SPARK-21915][ML][PySpark] Model 1 and Model 2 ParamMaps Missing

@dongjoon-hyun @HyukjinKwon

Error in PySpark example code:
/examples/src/main/python/ml/estimator_transformer_param_example.py

The original Scala code says
println("Model 2 was fit using parameters: " + 
model2.parent.extractParamMap)

The parent is lr

There is no method for accessing parent as is done in Scala.

This code has been tested in Python, and returns values consistent with 
Scala

## What changes were proposed in this pull request?

Proposing to call the lr variable instead of model1 or model2

## How was this patch tested?

This patch was tested with Spark 2.1.0 by comparing the Scala and PySpark 
results. PySpark currently returns nothing for those two print lines.

The output for model2 in PySpark should be

{Param(parent='LogisticRegression_4187be538f744d5a9090', name='tol', 
doc='the convergence tolerance for iterative algorithms (>= 0).'): 1e-06,
Param(parent='LogisticRegression_4187be538f744d5a9090', 
name='elasticNetParam', doc='the ElasticNet mixing parameter, in range [0, 1]. 
For alpha = 0, the penalty is an L2 penalty. For alpha = 1, it is an L1 
penalty.'): 0.0,
Param(parent='LogisticRegression_4187be538f744d5a9090', 
name='predictionCol', doc='prediction column name.'): 'prediction',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='featuresCol', 
doc='features column name.'): 'features',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='labelCol', 
doc='label column name.'): 'label',
Param(parent='LogisticRegression_4187be538f744d5a9090', 
name='probabilityCol', doc='Column name for predicted class conditional 
probabilities. Note: Not all models output well-calibrated probability 
estimates! These probabilities should be treated as confidences, not precise 
probabilities.'): 'myProbability',
Param(parent='LogisticRegression_4187be538f744d5a9090', 
name='rawPredictionCol', doc='raw prediction (a.k.a. confidence) column 
name.'): 'rawPrediction',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='family', 
doc='The name of family which is a description of the label distribution to be 
used in the model. Supported options: auto, binomial, multinomial'): 'auto',
Param(parent='LogisticRegression_4187be538f744d5a9090', 
name='fitIntercept', doc='whether to fit an intercept term.'): True,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='threshold', 
doc='Threshold in binary classification prediction, in range [0, 1]. If 
threshold and thresholds are both set, they must match.e.g. if threshold is p, 
then thresholds must be equal to [1-p, p].'): 0.55,
Param(parent='LogisticRegression_4187be538f744d5a9090', 
name='aggregationDepth', doc='suggested depth for treeAggregate (>= 2).'): 2,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='maxIter', 
doc='max number of iterations (>= 0).'): 30,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='regParam', 
doc='regularization parameter (>= 0).'): 0.1,
Param(parent='LogisticRegression_4187be538f744d5a9090', 
name='standardization', doc='whether to standardize the training features 
before fitting the model.'): True}
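
As a rough illustration of why the parameters are printed from lr rather than from the fitted model, here is a toy, pure-Python sketch. It is not the PySpark API: `ToyEstimator`, `ToyModel`, and `extract_param_map` are hypothetical names, and the starting values 10 and 0.01 are assumptions; only the overrides 30, 0.1, and 0.55 come from the output above. The sketch assumes a param map is resolved as defaults, then explicitly set values, then per-call overrides.

```python
class ToyModel:
    """Toy fitted model that keeps no parameters of its own (assumption)."""

    def extract_param_map(self):
        return {}


class ToyEstimator:
    """Toy stand-in for an estimator; NOT the PySpark API."""

    def __init__(self, **params):
        # Hypothetical defaults, for illustration only.
        self._defaults = {"maxIter": 100, "regParam": 0.0, "threshold": 0.5}
        self._set = dict(params)

    def extract_param_map(self, extra=None):
        # Later sources win: defaults < explicitly set values < extra overrides.
        merged = dict(self._defaults)
        merged.update(self._set)
        merged.update(extra or {})
        return merged

    def fit(self, data, extra=None):
        # The returned model carries no params in this sketch, mirroring the
        # empty output reported for the PySpark model.
        return ToyModel()


lr = ToyEstimator(maxIter=10, regParam=0.01)
overrides = {"maxIter": 30, "regParam": 0.1, "threshold": 0.55}
model2 = lr.fit(data=None, extra=overrides)

print(model2.extract_param_map())       # what the model itself reports
print(lr.extract_param_map(overrides))  # what the estimator reports
```

In this sketch the model reports nothing, while the estimator, given the same overrides used for fitting, reports the values the model was fit with.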

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/marktab/spark branch-2.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19152.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19152


commit a2ccb8a83d13d39c95f0ac1cac1c74dca064
Author: MarkTab marktab.net 
Date:   2017-09-07T02:20:59Z

Model 1 and Model 2 ParamMaps Missing

@dongjoon-hyun @HyukjinKwon

Error in PySpark example code:

[https://github.com/apache/spark/blob/master/examples/src/main/python/ml/estimator_transformer_param_example.py]

The original Scala code says
println("Model 2 was fit using parameters: " + 
model2.parent.extractParamMap)

The parent is lr

There is no method for accessing parent as is done in Scala.

This code has been tested in Python, and returns values consistent with 
Scala




---




[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...

2017-09-06 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue:

https://github.com/apache/spark/pull/19144
  
@srowen Can this PR pass through?


---




[GitHub] spark issue #19145: [spark-21933][yarn] Spark Streaming request more executo...

2017-09-06 Thread klion26
Github user klion26 commented on the issue:

https://github.com/apache/spark/pull/19145
  
@HyukjinKwon I am sorry about that; I have changed the title format.


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19150
  
Thank you for review and approval!


---




[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-06 Thread ConeyLiu
Github user ConeyLiu commented on the issue:

https://github.com/apache/spark/pull/19135
  
@jiangxb1987 OK, I can test it later. The following picture is from a run of 
kmeans with the source data stored in off-heap memory; you can see that the CPU 
time spent in `reserveUnrollMemoryForThisTask` is very high.

![pic](https://user-images.githubusercontent.com/12733256/30142120-a3a3dd42-9344-11e7-9ae3-1c36bedf8939.png)



---




[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18975
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81482/
Test PASSed.


---




[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18975
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18975
  
**[Test build #81482 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81482/testReport)**
 for PR 18975 at commit 
[`6c24b1b`](https://github.com/apache/spark/commit/6c24b1be90fdf0e65c80ae24f81c75d34f7e1542).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #19141: [SPARK-21384] [YARN] Spark + YARN fails with Loca...

2017-09-06 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/19141#discussion_r137427105
  
--- Diff: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala 
---
@@ -565,7 +565,6 @@ private[spark] class Client(
   distribute(jarsArchive.toURI.getPath,
 resType = LocalResourceType.ARCHIVE,
 destName = Some(LOCALIZED_LIB_DIR))
-  jarsArchive.delete()
--- End diff --

Agree with Marcelo, this is a valid concern; we should avoid such a 
regression here.


---




[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81480/
Test PASSed.


---




[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18659
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18975
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81481/
Test PASSed.


---




[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18975
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18659
  
**[Test build #81480 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81480/testReport)**
 for PR 18659 at commit 
[`4f6c950`](https://github.com/apache/spark/commit/4f6c95092066ee31a670ca827fbb892ac66df870).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18975
  
**[Test build #81481 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81481/testReport)**
 for PR 18975 at commit 
[`28fcb39`](https://github.com/apache/spark/commit/28fcb39028d93ec6ecea9eecf289c0e88b6c9ae6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81479/
Test PASSed.


---




[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18659
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18659
  
**[Test build #81479 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81479/testReport)**
 for PR 18659 at commit 
[`fdea603`](https://github.com/apache/spark/commit/fdea603ae0ac6a8c27ec8161920f8c77549784e8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #19141: [SPARK-21384] [YARN] Spark + YARN fails with Loca...

2017-09-06 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/19141#discussion_r137425482
  
--- Diff: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala 
---
@@ -565,7 +565,6 @@ private[spark] class Client(
   distribute(jarsArchive.toURI.getPath,
 resType = LocalResourceType.ARCHIVE,
 destName = Some(LOCALIZED_LIB_DIR))
-  jarsArchive.delete()
--- End diff --

You're undoing the fix for SPARK-20741. If this is causing a problem and 
you want to fix it, you need to skip the deletion only when the specific 
scenario that's causing the problem happens.
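
A minimal sketch of that idea in Python, under stated assumptions: it supposes the problematic scenario is one where the archive is not actually copied, so the staged path and the temporary file are the same file. `distribute` and `stage_archive` here are hypothetical helpers for illustration, not the signatures in `Client.scala`.

```python
import os
import shutil


def distribute(src_path: str, dest_dir: str) -> bool:
    """Copy src_path into dest_dir unless it already lives there.

    Returns True when a separate copy was made. Hypothetical helper.
    """
    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    if os.path.abspath(dest_path) == os.path.abspath(src_path):
        return False  # Nothing copied: the staged file is the source file.
    shutil.copy2(src_path, dest_path)
    return True


def stage_archive(src_path: str, dest_dir: str) -> None:
    copied = distribute(src_path, dest_dir)
    # Keep the cleanup in the normal case, and skip it only when deleting
    # would remove the only copy of the archive.
    if copied:
        os.remove(src_path)
```

The point of the design is that the cleanup stays in place by default and is bypassed only under a narrowly defined condition.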


---




[GitHub] spark issue #19145: add logic to test whether the complete container has bee...

2017-09-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19145
  
Could you fix the title to follow the form `[SPARK-][COMPONENT] Title`, as 
described in http://spark.apache.org/contributing.html?
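
For illustration only, a title in that form could be checked with a small regular expression. The pattern below is an assumption about the convention (a JIRA number in the first tag, then one or more upper-case component tags), not a rule taken from the contributing page, and `is_well_formed_title` is a hypothetical helper.

```python
import re

# Assumed shape: "[SPARK-<digits>]", one or more "[COMPONENT]" tags,
# then a non-empty title.
TITLE_RE = re.compile(r"^\[SPARK-\d+\](?:\s*\[[A-Z][A-Z0-9_-]*\])+\s+\S.*$")


def is_well_formed_title(title: str) -> bool:
    """Return True when the title matches the assumed pattern."""
    return TITLE_RE.match(title) is not None


print(is_well_formed_title("[SPARK-21939][TEST] Use TimeLimits instead of Timeouts"))
print(is_well_formed_title("add logic to test whether the complete container has been"))
```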


---




[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17096
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17096
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81488/
Test PASSed.


---




[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17096
  
**[Test build #81488 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81488/testReport)**
 for PR 17096 at commit 
[`830b4fe`](https://github.com/apache/spark/commit/830b4fe1f71befb97debd9286306b3f872eb1c09).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19149
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19149
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81483/
Test FAILed.


---




[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19149
  
**[Test build #81483 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81483/testReport)**
 for PR 19149 at commit 
[`e5501e1`](https://github.com/apache/spark/commit/e5501e1f46317a82b915d952f4ee192e5eb8e61d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19141: [SPARK-21384] [YARN] Spark + YARN fails with LocalFileSy...

2017-09-06 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19141
  
OK to test.

(I may not have the permission to trigger Jenkins test 😞 )


---




[GitHub] spark pull request #18981: Fixed pandoc dependency issue in python/setup.py

2017-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18981


---




[GitHub] spark issue #19141: [SPARK-21384] [YARN] Spark + YARN fails with LocalFileSy...

2017-09-06 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/19141
  
I see, thanks for the explanation.


---




[GitHub] spark issue #18981: Fixed pandoc dependency issue in python/setup.py

2017-09-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18981
  
Merged to master and branch-2.2


---




[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-06 Thread jinxing64
Github user jinxing64 commented on the issue:

https://github.com/apache/spark/pull/19086
  
Is it not OK to follow Spark's current behavior? (It will be different from 
Hive.)
I made this PR because we are migrating from Hive to Spark and many of our 
users rely on this function.



---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19151
  
**[Test build #81489 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81489/testReport)**
 for PR 19151 at commit 
[`f05f281`](https://github.com/apache/spark/commit/f05f281eb5fda2b68e7e5f7a1a61a87a7a4bc467).


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/18982
  
Hmmm, I can reproduce the error with Python 3; I'll look into it tomorrow.


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/18982
  
No problem @holdenk, I updated the test to use `transform()`. See if it 
looks OK to you now (pending Jenkins). Thanks!


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18982
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81487/
Test FAILed.


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18982
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18982
  
**[Test build #81487 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81487/testReport)**
 for PR 18982 at commit 
[`482c025`](https://github.com/apache/spark/commit/482c02507e38909e934a9f2b7ea06612eaea5ce0).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17096
  
**[Test build #81488 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81488/testReport)**
 for PR 17096 at commit 
[`830b4fe`](https://github.com/apache/spark/commit/830b4fe1f71befb97debd9286306b3f872eb1c09).


---




[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-09-06 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/17096
  
Jenkins retest this please.


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19151
  
**[Test build #81486 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81486/testReport)**
 for PR 19151 at commit 
[`4fc4d05`](https://github.com/apache/spark/commit/4fc4d05fd8dfa5397f790051196893d2b6fb2ca5).


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18982
  
**[Test build #81487 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81487/testReport)**
 for PR 18982 at commit 
[`482c025`](https://github.com/apache/spark/commit/482c02507e38909e934a9f2b7ea06612eaea5ce0).


---




[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...

2017-09-06 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/19151
  
cc @gatorsmile for review. Thanks.


---




[GitHub] spark pull request #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSub...

2017-09-06 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/19151

[SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery should not produce 
unresolved query plans

## What changes were proposed in this pull request?

This is a follow-up to #19050 that deals with the `ExistenceJoin` case.

## How was this patch tested?

Added test.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-21835-followup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19151.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19151


commit 4fc4d05fd8dfa5397f790051196893d2b6fb2ca5
Author: Liang-Chi Hsieh 
Date:   2017-09-07T00:04:07Z

Deal with ExistenceJoin case.




---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/18982
  
Jenkins retest this please


---




[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19150
  
**[Test build #81485 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81485/testReport)**
 for PR 19150 at commit 
[`ab339b3`](https://github.com/apache/spark/commit/ab339b31b311035ebb75e8f079000d306cab16b8).


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18982
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81484/
Test FAILed.


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18982
  
**[Test build #81484 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81484/testReport)**
 for PR 18982 at commit 
[`482c025`](https://github.com/apache/spark/commit/482c02507e38909e934a9f2b7ea06612eaea5ce0).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18982
  
Merged build finished. Test FAILed.


---




[GitHub] spark pull request #19150: [SPARK-21939][TEST] Use TimeLimits instead of Tim...

2017-09-06 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/19150

[SPARK-21939][TEST] Use TimeLimits instead of Timeouts

## What changes were proposed in this pull request?

Since ScalaTest 3.0.0, `org.scalatest.concurrent.Timeouts` is deprecated.
This PR replaces the deprecated one with 
`org.scalatest.concurrent.TimeLimits`.

```scala
-import org.scalatest.concurrent.Timeouts._
+import org.scalatest.concurrent.TimeLimits._
```

## How was this patch tested?

Pass the existing test suites.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-21939

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19150.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19150


commit ab339b31b311035ebb75e8f079000d306cab16b8
Author: Dongjoon Hyun 
Date:   2017-09-06T23:22:11Z

[SPARK-21939][TEST] Use TimeLimits instead of Timeouts




---




[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18982
  
**[Test build #81484 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81484/testReport)**
 for PR 18982 at commit 
[`482c025`](https://github.com/apache/spark/commit/482c02507e38909e934a9f2b7ea06612eaea5ce0).


---




[GitHub] spark issue #19140: [SPARK-21890] Credentials not being passed to add the to...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19140
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19140: [SPARK-21890] Credentials not being passed to add the to...

2017-09-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19140
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81477/
Test PASSed.


---




[GitHub] spark issue #19140: [SPARK-21890] Credentials not being passed to add the to...

2017-09-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19140
  
**[Test build #81477 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81477/testReport)**
 for PR 19140 at commit 
[`98f0ff2`](https://github.com/apache/spark/commit/98f0ff2a655c398e5b502ce2b340dfac88b385e9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2

2017-09-06 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/19136
  
Thanks for pinging me. I left comments on the older PR, since other 
discussion was already there. If you'd prefer comments here, just let me know.


---



