[GitHub] spark issue #19124: [SPARK-21912][SQL] ORC/Parquet table should not create i...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19124 Thank you for your reviewing and helping this PR, @tejasapatil , @viirya , and @HyukjinKwon , too! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r137449077

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala ---
@@ -524,6 +525,50 @@ class DDLParserSuite extends PlanTest with SharedSQLContext {
     assert(e.message.contains("you can only specify one of them."))
   }
 
+  test("insert overwrite directory") {
+    val v1 = "INSERT OVERWRITE DIRECTORY '/tmp/file' USING parquet SELECT 1 as a"
+    parser.parsePlan(v1) match {
+      case InsertIntoDir(_, storage, provider, query, overwrite) =>
+        assert(storage.locationUri != None && storage.locationUri.get.toString == "/tmp/file")
--- End diff --

Nit:
```Scala
assert(storage.locationUri.isDefined && storage.locationUri.get.toString == "/tmp/file")
```
[GitHub] spark issue #19124: [SPARK-21912][SQL] ORC/Parquet table should not create i...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19124 @gatorsmile . Thank you for your help! This PR was almost entirely made by you.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r137448976

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLParserSuite.scala ---
@@ -32,7 +32,8 @@ import org.apache.spark.sql.catalyst.dsl.plans.DslLogicalPlan
 import org.apache.spark.sql.catalyst.expressions.JsonTuple
 import org.apache.spark.sql.catalyst.parser.ParseException
 import org.apache.spark.sql.catalyst.plans.PlanTest
-import org.apache.spark.sql.catalyst.plans.logical.{Generate, LogicalPlan, Project, ScriptTransformation}
+import org.apache.spark.sql.catalyst.plans.logical.{Generate, InsertIntoDir, LogicalPlan,
+  Project, ScriptTransformation}
--- End diff --

We do not have a limit of characters. If it is too long, our style is
```Scala
import org.apache.spark.sql.catalyst.plans.logical.{Generate, InsertIntoDir, LogicalPlan}
import org.apache.spark.sql.catalyst.plans.logical.{Project, ScriptTransformation}
```
[GitHub] spark pull request #19124: [SPARK-21912][SQL] ORC/Parquet table should not c...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19124
[GitHub] spark issue #19124: [SPARK-21912][SQL] ORC/Parquet table should not create i...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19124 Thanks! Merged to master.
[GitHub] spark issue #19124: [SPARK-21912][SQL] ORC/Parquet table should not create i...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19124 LGTM
[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17451 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81493/ Test FAILed.
[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17451 Merged build finished. Test FAILed.
[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17451

**[Test build #81493 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81493/testReport)** for PR 17451 at commit [`b4d928d`](https://github.com/apache/spark/commit/b4d928d41b9d1d97c512d1f6c5381db4589cd793).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSub...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19151
[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19086 If we follow our current way, rename becomes a special case; all the other commands follow a different resolution path. Just curious, which company are you from? I am trying to see the impact of this PR.
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19151 Thanks! Merged to master.
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19151 LGTM
[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19149 **[Test build #81495 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81495/testReport)** for PR 19149 at commit [`1a22533`](https://github.com/apache/spark/commit/1a22533e21fd98e815ad425e6e46228b97e55386).
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19150 `BlockGeneratorSuite` and `StreamTest` are fixed and tested.
[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19136 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81490/ Test PASSed.
[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19136 Merged build finished. Test PASSed.
[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19136

**[Test build #81490 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81490/testReport)** for PR 19136 at commit [`89cbfb7`](https://github.com/apache/spark/commit/89cbfb7c98325852c5c97d321d83fe91154b129e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `public class DataSourceV2Options `
  * `public abstract class DataSourceV2Reader `
  * `class RowToUnsafeRowReadTask implements ReadTask `
  * `class RowToUnsafeDataReader implements DataReader `
  * `class DataSourceRDDPartition(val index: Int, val readTask: ReadTask[UnsafeRow])`
  * `class DataSourceRDD(`
  * `case class DataSourceV2Relation(`
  * `case class DataSourceV2ScanExec(`
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19150 **[Test build #81494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81494/testReport)** for PR 19150 at commit [`678d1b2`](https://github.com/apache/spark/commit/678d1b214f2d8cec72b47782177e406cbab9f5ee).
[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...
Github user keypointt commented on the issue: https://github.com/apache/spark/pull/17451

```
>>> from pyspark.ml.feature import Word2Vec
>>> sent = ("a b " * 100 + "a c " * 10).split(" ")
>>> doc = spark.createDataFrame([(sent,), (sent,)], ["sentence"])
>>> word2Vec = Word2Vec(vectorSize=5, seed=42, inputCol="sentence", outputCol="model")
>>> model = word2Vec.fit(doc)
```
Above is the setup, and I created the `vec` below. It fits into `model.findSynonyms` nicely:
```
>>> from pyspark.ml.linalg import Vectors
>>> vec = Vectors.dense([0.267, -0.2691, 0.058, -0.0801, 0.1821, 0.4162, 0.0259, -0.2163, 0.1787, 0.0764])
>>> model.findSynonyms(vec, 2)
DataFrame[word: string, similarity: double]
```
But `vec` cannot be passed to `model.findSynonymsArray`, even though its type is `<class 'pyspark.ml.linalg.DenseVector'>`:
```
>>> model.findSynonymsArray(vec, 2)
word: [0.267,-0.2691,0.058,-0.0801,0.1821,0.4162,0.0259,-0.2163,0.1787,0.0764]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/renxin/Documents/workspace/spark/python/pyspark/ml/feature.py", line 2951, in findSynonymsArray
    tuples = self._java_obj.findSynonymsArray(word, num)
  File "/Users/renxin/Documents/workspace/spark/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py", line 1160, in __call__
  File "/Users/renxin/Documents/workspace/spark/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/Users/renxin/Documents/workspace/spark/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py", line 324, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o65.findSynonymsArray. Trace:
py4j.Py4JException: Method findSynonymsArray([class java.util.ArrayList, class java.lang.Integer]) does not exist
	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
	at py4j.Gateway.invoke(Gateway.java:274)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:214)
	at java.lang.Thread.run(Thread.java:745)
>>> type(vec)
<class 'pyspark.ml.linalg.DenseVector'>
```
Here `vec` is taken as `java.util.ArrayList`. Does `self._java_obj.findSynonymsArray(word, num)` behave differently from `self._call_java("findSynonyms", word, num)` for the Vector type?

Thank you, Holden
[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17451 **[Test build #81493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81493/testReport)** for PR 17451 at commit [`b4d928d`](https://github.com/apache/spark/commit/b4d928d41b9d1d97c512d1f6c5381db4589cd793).
[GitHub] spark issue #17451: [SPARK-19866][ML][PySpark] Add local version of Word2Vec...
Github user keypointt commented on the issue: https://github.com/apache/spark/pull/17451 `self._java_obj.findSynonymsArray` is definitely a much nicer and more elegant solution.
[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19077 **[Test build #81492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81492/testReport)** for PR 19077 at commit [`0c6647c`](https://github.com/apache/spark/commit/0c6647cca3868a24f07c077bd9e37d436b49f5e8).
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19150 Oh, I see. Thank you so much. I'll add that.
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19150 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81485/ Test FAILed.
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19150

**[Test build #81485 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81485/testReport)** for PR 19150 at commit [`ab339b3`](https://github.com/apache/spark/commit/ab339b31b311035ebb75e8f079000d306cab16b8).
* This patch **fails from timeout after a configured wait of \`250m\`**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class DriverSuite extends SparkFunSuite with TimeLimits `
  * `class AsyncRDDActionsSuite extends SparkFunSuite with BeforeAndAfterAll with TimeLimits `
  * `class DAGSchedulerSuite extends SparkFunSuite with LocalSparkContext with TimeLimits `
  * `class EventLoopSuite extends SparkFunSuite with TimeLimits `
  * `trait StreamTest extends QueryTest with SharedSQLContext with TimeLimits with BeforeAndAfterAll `
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19150 Merged build finished. Test FAILed.
[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137442821

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +48,13 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
--- End diff --

Thanks, I will add a check.
[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137442763

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Yeah, I think it's acceptable.
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19150

It looks like `stop ensures correct shutdown` in `BlockGeneratorSuite` depends on interrupting - http://www.scalatest.org/release_notes/3.0.0:

> If you were relying on the default behavior of interrupting a thread on the JVM in ScalaTest 2.2.x, you'll need to define an implicit val referring to a `ThreadSignaler`

It looks like ScalaTest changed the default for good reasons, but we should explicitly set `ThreadSignaler` to keep the previous behaviour, to be conservative.

```scala
implicit val defaultSignaler: Signaler = ThreadSignaler
```

I just double checked that this passes the pending tests.
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19150 Thank you, @jerryshao . I'll fix the typo.
[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137441814

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -48,6 +48,13 @@ public long size() {
   }
 
   /**
+   * Reset the size of the memory block.
+   */
--- End diff --

It is dangerous to reset to an invalid size. We should add a check here or put a WARNING in the method comment.
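One way to read the suggested check is sketched below. This is a plain-Python toy, not the Java `MemoryBlock` class from the PR: the class, the `reset_size` method and the notion of "invalid" used here (negative, or larger than the backing capacity) are all assumptions made for illustration, since the thread does not say what the final check looks like.

```python
class MemoryBlock:
    """Toy stand-in for a memory block with a fixed capacity and a logical size."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity  # bytes actually backing the block
        self._size = capacity      # logical size exposed to callers

    def size(self) -> int:
        return self._size

    def reset_size(self, new_size: int) -> None:
        """Reset the logical size, rejecting values the backing storage cannot hold."""
        if new_size < 0 or new_size > self._capacity:
            raise ValueError(
                f"invalid size {new_size}; must be in [0, {self._capacity}]")
        self._size = new_size


block = MemoryBlock(capacity=64)
block.reset_size(48)          # fine: fits inside the backing storage
try:
    block.reset_size(128)     # larger than the backing storage
    rejected = False
except ValueError:
    rejected = True
print(block.size(), rejected)
```

Failing loudly keeps a bad size from silently propagating into later reads and writes; the alternative mentioned in the comment, a WARNING in the method documentation, leaves the responsibility with the caller.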
[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r137439954

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -47,23 +48,29 @@ private boolean shouldPool(long size) {
 
   @Override
   public MemoryBlock allocate(long size) throws OutOfMemoryError {
-    if (shouldPool(size)) {
+    long alignedSize = ByteArrayMethods.roundNumberOfBytesToNearestWord(size);
--- End diff --

Maybe minor, but some small allocations will now be counted by the pooling mechanism although they were not before, e.g. `POOLING_THRESHOLD_BYTES` - 1.
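The boundary case in this comment can be checked with a little arithmetic. The sketch below is plain Python rather than Spark's Java code; it assumes an 8-byte word and a 1 MiB threshold, and `round_to_word` and `should_pool` are hypothetical stand-ins for `ByteArrayMethods.roundNumberOfBytesToNearestWord` and `shouldPool`.

```python
WORD_BYTES = 8                         # assumed word size
POOLING_THRESHOLD_BYTES = 1024 * 1024  # assumed threshold, for illustration only


def round_to_word(num_bytes: int) -> int:
    """Round num_bytes up to the next multiple of WORD_BYTES."""
    remainder = num_bytes % WORD_BYTES
    return num_bytes if remainder == 0 else num_bytes + (WORD_BYTES - remainder)


def should_pool(size: int) -> bool:
    """Pool only allocations at or above the threshold (assumed rule)."""
    return size >= POOLING_THRESHOLD_BYTES


requested = POOLING_THRESHOLD_BYTES - 1
aligned = round_to_word(requested)

pooled_before = should_pool(requested)  # decision made on the raw size
pooled_after = should_pool(aligned)     # decision made on the aligned size
print(requested, aligned, pooled_before, pooled_after)
```

Under these assumptions only requests within one word below the threshold change category, which is consistent with the comment calling the effect minor.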
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19151 Merged build finished. Test PASSed.
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19151 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81489/ Test PASSed.
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19151

**[Test build #81489 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81489/testReport)** for PR 19151 at commit [`f05f281`](https://github.com/apache/spark/commit/f05f281eb5fda2b68e7e5f7a1a61a87a7a4bc467).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19135 Is it better to do batch unrolling? I.e., we could check memory usage and request memory every 10 records or so, instead of doing it for every record.
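The batching idea can be sketched as follows. This is a plain-Python illustration, not the Spark implementation: `reserve` is a hypothetical callback standing in for the memory request, `size_of` is a hypothetical estimator of the buffered size in bytes, and the interval of 10 records is taken from the comment.

```python
from typing import Callable, Iterable, List


def unroll(records: Iterable[bytes],
           reserve: Callable[[int], bool],
           size_of: Callable[[List[bytes]], int],
           check_interval: int = 10) -> List[bytes]:
    """Buffer records, requesting memory only every check_interval records."""
    buffered: List[bytes] = []
    reserved = 0
    for count, record in enumerate(records, start=1):
        buffered.append(record)
        if count % check_interval == 0:
            # Ask only for the growth since the last successful request.
            needed = size_of(buffered) - reserved
            if needed > 0:
                if not reserve(needed):
                    raise MemoryError("could not reserve unroll memory")
                reserved += needed
    return buffered


calls = []


def fake_reserve(num_bytes: int) -> bool:
    calls.append(num_bytes)  # record each request so they can be counted
    return True


data = [b"x" * 4 for _ in range(100)]
result = unroll(data, fake_reserve, lambda buf: sum(len(r) for r in buf))
print(len(result), len(calls))
```

The trade-off is that between two checks the buffer can outgrow what has been reserved by up to `check_interval - 1` records, so a larger interval saves calls at the cost of looser accounting.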
[GitHub] spark issue #19077: [SPARK-21860][core]Improve memory reuse for heap memory ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19077 **[Test build #81491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81491/testReport)** for PR 19077 at commit [`729df24`](https://github.com/apache/spark/commit/729df248bf44818202d1ca61b30ab43daf8aea8d).
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19151 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81486/ Test PASSed.
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19151 Merged build finished. Test PASSed.
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19151

**[Test build #81486 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81486/testReport)** for PR 19151 at commit [`4fc4d05`](https://github.com/apache/spark/commit/4fc4d05fd8dfa5397f790051196893d2b6fb2ca5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19136#discussion_r137435478

--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/SchemaRequiredDataSourceV2.java ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.sql.sources.v2.reader.DataSourceV2Reader;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * A variant of `DataSourceV2` which requires users to provide a schema when reading data. A data
+ * source can inherit both `DataSourceV2` and `SchemaRequiredDataSourceV2` if it supports both schema
+ * inference and user-specified schemas.
--- End diff --

cc @rdblue for the new API of schema reference.
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19150 LGTM, there's a typo in PR description, "Timeouts is deprecated." not "TimeLimits".
[GitHub] spark issue #19152: [SPARK-21915][ML][PySpark] Model 1 and Model 2 ParamMaps...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19152 Can one of the admins verify this patch?
[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19136 **[Test build #81490 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81490/testReport)** for PR 19136 at commit [`89cbfb7`](https://github.com/apache/spark/commit/89cbfb7c98325852c5c97d321d83fe91154b129e).
[GitHub] spark pull request #19152: [SPARK-21915][ML][PySpark] Model 1 and Model 2 Pa...
GitHub user marktab opened a pull request: https://github.com/apache/spark/pull/19152

[SPARK-21915][ML][PySpark] Model 1 and Model 2 ParamMaps Missing

@dongjoon-hyun @HyukjinKwon

Error in PySpark example code: /examples/src/main/python/ml/estimator_transformer_param_example.py

The original Scala code says
println("Model 2 was fit using parameters: " + model2.parent.extractParamMap)

The parent is lr.

There is no method for accessing parent as is done in Scala.

This code has been tested in Python, and returns values consistent with Scala.

## What changes were proposed in this pull request?

Proposing to call the lr variable instead of model1 or model2.

## How was this patch tested?

This patch was tested with Spark 2.1.0 comparing the Scala and PySpark results. PySpark returns nothing at present for those two print lines.

The output for model2 in PySpark should be

{Param(parent='LogisticRegression_4187be538f744d5a9090', name='tol', doc='the convergence tolerance for iterative algorithms (>= 0).'): 1e-06,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='elasticNetParam', doc='the ElasticNet mixing parameter, in range [0, 1]. For alpha = 0, the penalty is an L2 penalty. For alpha = 1, it is an L1 penalty.'): 0.0,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='predictionCol', doc='prediction column name.'): 'prediction',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='featuresCol', doc='features column name.'): 'features',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='labelCol', doc='label column name.'): 'label',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='probabilityCol', doc='Column name for predicted class conditional probabilities. Note: Not all models output well-calibrated probability estimates! These probabilities should be treated as confidences, not precise probabilities.'): 'myProbability',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='rawPredictionCol', doc='raw prediction (a.k.a. confidence) column name.'): 'rawPrediction',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='family', doc='The name of family which is a description of the label distribution to be used in the model. Supported options: auto, binomial, multinomial'): 'auto',
Param(parent='LogisticRegression_4187be538f744d5a9090', name='fitIntercept', doc='whether to fit an intercept term.'): True,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='threshold', doc='Threshold in binary classification prediction, in range [0, 1]. If threshold and thresholds are both set, they must match.e.g. if threshold is p, then thresholds must be equal to [1-p, p].'): 0.55,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='aggregationDepth', doc='suggested depth for treeAggregate (>= 2).'): 2,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='maxIter', doc='max number of iterations (>= 0).'): 30,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='regParam', doc='regularization parameter (>= 0).'): 0.1,
Param(parent='LogisticRegression_4187be538f744d5a9090', name='standardization', doc='whether to standardize the training features before fitting the model.'): True}

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/marktab/spark branch-2.2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19152.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19152

commit a2ccb8a83d13d39c95f0ac1cac1c74dca064
Author: MarkTab marktab.net
Date: 2017-09-07T02:20:59Z

    Model 1 and Model 2 ParamMaps Missing

    @dongjoon-hyun @HyukjinKwon

    Error in PySpark example code: [https://github.com/apache/spark/blob/master/examples/src/main/python/ml/estimator_transformer_param_example.py]

    The original Scala code says
    println("Model 2 was fit using parameters: " + model2.parent.extractParamMap)

    The parent is lr.

    There is no method for accessing parent as is done in Scala.

    This code has been tested in Python, and returns values consistent with Scala.
[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19144 @srowen Can this PR pass through?
[GitHub] spark issue #19145: [spark-21933][yarn] Spark Streaming request more executo...
Github user klion26 commented on the issue: https://github.com/apache/spark/pull/19145 @HyukjinKwon I am sorry about that; I have changed the title format.
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19150 Thank you for the review and approval!
[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 @jiangxb1987 OK, I can test it later. The following picture was taken while running k-means with the source data stored in off-heap memory; you can see that the CPU time consumed by `reserveUnrollMemoryForThisTask` is very high. ![pic](https://user-images.githubusercontent.com/12733256/30142120-a3a3dd42-9344-11e7-9ae3-1c36bedf8939.png)
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18975 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81482/
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18975 Merged build finished. Test PASSed.
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18975 **[Test build #81482 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81482/testReport)** for PR 18975 at commit [`6c24b1b`](https://github.com/apache/spark/commit/6c24b1be90fdf0e65c80ae24f81c75d34f7e1542). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #19141: [SPARK-21384] [YARN] Spark + YARN fails with Loca...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19141#discussion_r137427105 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -565,7 +565,6 @@ private[spark] class Client( distribute(jarsArchive.toURI.getPath, resType = LocalResourceType.ARCHIVE, destName = Some(LOCALIZED_LIB_DIR)) - jarsArchive.delete() --- End diff -- I agree with Marcelo; this is a valid concern, and we should not introduce such a regression here.
[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81480/
[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Merged build finished. Test PASSed.
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18975 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81481/
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18975 Merged build finished. Test PASSed.
[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18659 **[Test build #81480 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81480/testReport)** for PR 18659 at commit [`4f6c950`](https://github.com/apache/spark/commit/4f6c95092066ee31a670ca827fbb892ac66df870). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18975 **[Test build #81481 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81481/testReport)** for PR 18975 at commit [`28fcb39`](https://github.com/apache/spark/commit/28fcb39028d93ec6ecea9eecf289c0e88b6c9ae6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81479/
[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Merged build finished. Test PASSed.
[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Simple Python Vectorized UDF...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18659 **[Test build #81479 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81479/testReport)** for PR 18659 at commit [`fdea603`](https://github.com/apache/spark/commit/fdea603ae0ac6a8c27ec8161920f8c77549784e8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #19141: [SPARK-21384] [YARN] Spark + YARN fails with Loca...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19141#discussion_r137425482 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -565,7 +565,6 @@ private[spark] class Client( distribute(jarsArchive.toURI.getPath, resType = LocalResourceType.ARCHIVE, destName = Some(LOCALIZED_LIB_DIR)) - jarsArchive.delete() --- End diff -- You're undoing the fix for SPARK-20741. If this is causing a problem and you want to fix it, you need to skip the deletion only when the specific scenario that's causing the problem happens.
[GitHub] spark issue #19145: add logic to test whether the complete container has bee...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19145 Could you change the title to the form `[SPARK-][COMPONENT] Title`, as described in http://spark.apache.org/contributing.html?
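The requested title form can be checked mechanically. The sketch below is a hypothetical helper, not part of any Spark tooling: it assumes the JIRA tag is `SPARK-` followed by digits (as in titles elsewhere in this digest, such as `[SPARK-21939][TEST] ...`) and that component tags are upper case. The names `TITLE_PATTERN` and `is_well_formed_title` are illustrative only.

```python
import re

# Hypothetical strict pattern: a JIRA tag, then one or more upper-case
# component tags (each optionally preceded by a single space), then the title.
TITLE_PATTERN = re.compile(r"^\[SPARK-\d+\](?:\s?\[[A-Z][A-Z0-9 ]*\])+\s+\S")


def is_well_formed_title(title):
    """Return True if `title` starts with [SPARK-<digits>][COMPONENT] and has text."""
    return TITLE_PATTERN.match(title) is not None


# Titles modelled on subjects that appear in this digest.
examples = [
    "[SPARK-21939][TEST] Use TimeLimits instead of Timeouts",
    "[SPARK-21384] [YARN] Spark + YARN fails",
    "[spark-21933][yarn] Spark Streaming request more executors",
    "add logic to test whether the complete container has been allocated",
]
results = [is_well_formed_title(t) for t in examples]
```

This check is deliberately strict: a title with a leading marker before the JIRA tag, such as `[DO NOT MERGE][SPARK-15689][SQL] ...`, would be rejected, so a real checker would likely need to be more lenient.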
[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17096 Merged build finished. Test PASSed.
[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17096 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81488/
[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17096 **[Test build #81488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81488/testReport)** for PR 17096 at commit [`830b4fe`](https://github.com/apache/spark/commit/830b4fe1f71befb97debd9286306b3f872eb1c09). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19149 Merged build finished. Test FAILed.
[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19149 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81483/
[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19149 **[Test build #81483 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81483/testReport)** for PR 19149 at commit [`e5501e1`](https://github.com/apache/spark/commit/e5501e1f46317a82b915d952f4ee192e5eb8e61d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19141: [SPARK-21384] [YARN] Spark + YARN fails with LocalFileSy...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19141 OK to test. (I may not have permission to trigger the Jenkins test.)
[GitHub] spark pull request #18981: Fixed pandoc dependency issue in python/setup.py
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18981
[GitHub] spark issue #19141: [SPARK-21384] [YARN] Spark + YARN fails with LocalFileSy...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19141 I see, thanks for the explanation.
[GitHub] spark issue #18981: Fixed pandoc dependency issue in python/setup.py
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18981 Merged to master and branch-2.2.
[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 Is it not OK to follow Spark's current behavior? (It will be different from Hive.) I made this PR because we are migrating from Hive to Spark and many of our users rely on this function.
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19151 **[Test build #81489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81489/testReport)** for PR 19151 at commit [`f05f281`](https://github.com/apache/spark/commit/f05f281eb5fda2b68e7e5f7a1a61a87a7a4bc467).
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18982 Hmmm, I can reproduce the error with Python 3; I'll look into it tomorrow.
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18982 No problem @holdenk, I updated using `transform()` on the test. See if it looks ok to you now (pending Jenkins). Thanks!
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18982 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81487/
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18982 Merged build finished. Test FAILed.
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18982 **[Test build #81487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81487/testReport)** for PR 18982 at commit [`482c025`](https://github.com/apache/spark/commit/482c02507e38909e934a9f2b7ea06612eaea5ce0). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17096 **[Test build #81488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81488/testReport)** for PR 17096 at commit [`830b4fe`](https://github.com/apache/spark/commit/830b4fe1f71befb97debd9286306b3f872eb1c09).
[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/17096 Jenkins retest this please.
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19151 **[Test build #81486 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81486/testReport)** for PR 19151 at commit [`4fc4d05`](https://github.com/apache/spark/commit/4fc4d05fd8dfa5397f790051196893d2b6fb2ca5).
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18982 **[Test build #81487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81487/testReport)** for PR 18982 at commit [`482c025`](https://github.com/apache/spark/commit/482c02507e38909e934a9f2b7ea06612eaea5ce0).
[GitHub] spark issue #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery s...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19151 cc @gatorsmile for review. Thanks.
[GitHub] spark pull request #19151: [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSub...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19151 [SPARK-21835][SQL][FOLLOW-UP] RewritePredicateSubquery should not produce unresolved query plans ## What changes were proposed in this pull request? This is a follow-up of #19050 to deal with the `ExistenceJoin` case. ## How was this patch tested? Added test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 SPARK-21835-followup Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19151.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19151 commit 4fc4d05fd8dfa5397f790051196893d2b6fb2ca5 Author: Liang-Chi Hsieh Date: 2017-09-07T00:04:07Z Deal with ExistenceJoin case.
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18982 Jenkins retest this please
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19150 **[Test build #81485 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81485/testReport)** for PR 19150 at commit [`ab339b3`](https://github.com/apache/spark/commit/ab339b31b311035ebb75e8f079000d306cab16b8).
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18982 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81484/
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18982 **[Test build #81484 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81484/testReport)** for PR 18982 at commit [`482c025`](https://github.com/apache/spark/commit/482c02507e38909e934a9f2b7ea06612eaea5ce0). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18982 Merged build finished. Test FAILed.
[GitHub] spark pull request #19150: [SPARK-21939][TEST] Use TimeLimits instead of Tim...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/19150 [SPARK-21939][TEST] Use TimeLimits instead of Timeouts ## What changes were proposed in this pull request? Since ScalaTest 3.0.0, `org.scalatest.concurrent.Timeouts` is deprecated. This PR replaces the deprecated one with `org.scalatest.concurrent.TimeLimits`. ```scala -import org.scalatest.concurrent.Timeouts._ +import org.scalatest.concurrent.TimeLimits._ ``` ## How was this patch tested? Pass the existing test suites. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-21939 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19150.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19150 commit ab339b31b311035ebb75e8f079000d306cab16b8 Author: Dongjoon Hyun Date: 2017-09-06T23:22:11Z [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18982 **[Test build #81484 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81484/testReport)** for PR 18982 at commit [`482c025`](https://github.com/apache/spark/commit/482c02507e38909e934a9f2b7ea06612eaea5ce0).
[GitHub] spark issue #19140: [SPARK-21890] Credentials not being passed to add the to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19140 Merged build finished. Test PASSed.
[GitHub] spark issue #19140: [SPARK-21890] Credentials not being passed to add the to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19140 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81477/
[GitHub] spark issue #19140: [SPARK-21890] Credentials not being passed to add the to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19140 **[Test build #81477 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81477/testReport)** for PR 19140 at commit [`98f0ff2`](https://github.com/apache/spark/commit/98f0ff2a655c398e5b502ce2b340dfac88b385e9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/19136 Thanks for pinging me. I left comments on the older PR, since other discussion was already there. If you'd prefer comments here, just let me know.