[GitHub] spark issue #21496: docs: fix typo
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21496 **[Test build #91688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91688/testReport)** for PR 21496 at commit [`fea9616`](https://github.com/apache/spark/commit/fea9616fb35e3fcf886073767da040aef3a408e0). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21496: docs: fix typo
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21496 retest this please
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 @mallman It does work fine with "name.First".
[GitHub] spark pull request #21518: [SPARK-24502][SQL] flaky test: UnsafeRowSerialize...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21518
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Merged build finished. Test PASSed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3925/ Test PASSed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/35/ Test PASSed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21535 Merged build finished. Test PASSed.
[GitHub] spark issue #21518: [SPARK-24502][SQL] flaky test: UnsafeRowSerializerSuite
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21518 thanks, merging to master/2.3!
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21535 **[Test build #91687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91687/testReport)** for PR 21535 at commit [`b8c7238`](https://github.com/apache/spark/commit/b8c7238aec9d6d79b8528eb3f47c0de7a48d23e8).
[GitHub] spark issue #21496: docs: fix typo
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21496 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91683/ Test FAILed.
[GitHub] spark issue #21535: [SPARK-23596][SQL][WIP] Test interpreted path on Dataset...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21535 cc @cloud-fan @hvanhovell @kiszk @mgaido91 @maropu
[GitHub] spark issue #21496: docs: fix typo
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21496 Merged build finished. Test FAILed.
[GitHub] spark pull request #21535: [SPARK-23596][SQL][WIP] Test interpreted path on ...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/21535

[SPARK-23596][SQL][WIP] Test interpreted path on Dataset and DataFrame test suites

## What changes were proposed in this pull request?

We have completed a significant subset of the object-related Expressions to provide an interpreted fallback. This PR modifies the Dataset tests to also test the interpreted code paths.

One concern right now is that by testing the interpreted code paths too, we will at least double the current test time. Alternatively, we could test the interpreted code paths for just a few test suites, such as DatasetSuite and DataFrameSuite.

This is in WIP status for now, to discuss the approach and the test scope of the interpreted code paths.

## How was this patch tested?

Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-23596

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21535.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21535

commit b8c7238aec9d6d79b8528eb3f47c0de7a48d23e8
Author: Liang-Chi Hsieh
Date: 2018-06-12T05:00:06Z

    Test interpreted path on Dataset and DataFrame test suites.
[GitHub] spark issue #21496: docs: fix typo
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21496 **[Test build #91683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91683/testReport)** for PR 21496 at commit [`fea9616`](https://github.com/apache/spark/commit/fea9616fb35e3fcf886073767da040aef3a408e0).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21357 **[Test build #91686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91686/testReport)** for PR 21357 at commit [`8ad2a3f`](https://github.com/apache/spark/commit/8ad2a3f8112662a865ee1dbaf7c5269197c3ee4f).
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21357 retest this, please
[GitHub] spark issue #21534: [SPARK-24526][build] Spaces in the build dir causes fail...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21534 Can one of the admins verify this patch?
[GitHub] spark pull request #21534: [SPARK-24526][build] Spaces in the build dir caus...
GitHub user trystanleftwich opened a pull request: https://github.com/apache/spark/pull/21534

[SPARK-24526][build] Spaces in the build dir causes failures in the build/mvn script

## What changes were proposed in this pull request?

Fix the call to ${MVN_BIN} to be wrapped in quotes so it will handle having spaces in the path.

## How was this patch tested?

Ran the following to confirm that using the build/mvn tool with a space in the build dir now works without error:

```
mkdir /tmp/test\ spaces
cd /tmp/test\ spaces
git clone https://github.com/apache/spark.git
cd spark
# Remove all mvn references in PATH so the script will download mvn to the local dir
./build/mvn -DskipTests clean package
```

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/trystanleftwich/spark SPARK-24526

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21534.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21534

commit bb12f3e2ad74f9d4c89e1c7adab4d306fa87b101
Author: trystanleftwich
Date: 2018-06-12T04:44:33Z

    [SPARK-24526][build] Spaces in the build dir causes failures in the build/mvn script
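For readers who want to see why the quotes matter, here is a minimal sketch of the word-splitting problem, written in Python with `shlex` (which follows POSIX shell splitting rules) rather than in the script's own shell. The install path below is hypothetical and is not taken from the PR.

```python
import shlex

# Hypothetical location of the downloaded mvn binary, containing a space.
mvn_bin = "/tmp/test spaces/spark/build/apache-maven/bin/mvn"

# Unquoted expansion: the command line is split at the space inside the path.
unquoted = shlex.split(f"{mvn_bin} -DskipTests clean package")

# Quoted expansion, analogous to "${MVN_BIN}": the path stays a single word.
quoted = shlex.split(f'"{mvn_bin}" -DskipTests clean package')

print(unquoted[0])  # /tmp/test
print(quoted[0])    # /tmp/test spaces/spark/build/apache-maven/bin/mvn
```

In the unquoted case the shell would try to execute `/tmp/test`, which is presumably the failure the PR title refers to.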
[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21469#discussion_r194613720

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala ---
@@ -112,14 +122,19 @@ trait StateStoreWriter extends StatefulOperator { self: SparkPlan =>
         val storeMetrics = store.metrics
         longMetric("numTotalStateRows") += storeMetrics.numKeys
         longMetric("stateMemory") += storeMetrics.memoryUsedBytes
    -    storeMetrics.customMetrics.foreach { case (metric, value) =>
    -      longMetric(metric.name) += value
    +    storeMetrics.customMetrics.foreach {
    +      case (metric: StateStoreCustomAverageMetric, value) =>
    +        longMetric(metric.name).set(value * 1.0d)
--- End diff --

It would be better to think about the actual benefit of exposing the value, rather than about where to expose it. If we define it as a count and aggregate by summation, the aggregated value will be `(partition count * versions)`, which end users may find hard to interpret.

I'm afraid that exposing this to StreamingQuery as an average is not trivial, especially since SQLMetric is defined as `AccumulatorV2[Long, Long]`, so only a single Long value can be passed. Under that restriction, we couldn't define a `merge` operation for an average metric.
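To illustrate the `merge` restriction mentioned above, here is a small Python sketch, not Spark code. `AverageState` is a hypothetical helper that carries both a sum and a count; a single number per partition is not enough to merge averages correctly.

```python
from dataclasses import dataclass


@dataclass
class AverageState:
    """Hypothetical mergeable average: needs two numbers, not one."""
    total: int = 0
    count: int = 0

    def add(self, value: int) -> None:
        self.total += value
        self.count += 1

    def merge(self, other: "AverageState") -> None:
        self.total += other.total
        self.count += other.count

    @property
    def value(self) -> float:
        return self.total / self.count if self.count else 0.0


# Two partitions with a different number of observations each.
part_a, part_b = AverageState(), AverageState()
for v in (10, 10, 10):
    part_a.add(v)
part_b.add(40)

# Merging single values: neither summing nor averaging the averages is right.
summed = part_a.value + part_b.value           # 50.0
naive = (part_a.value + part_b.value) / 2      # 25.0

# Merging (sum, count) pairs gives the true average over all four values.
merged = AverageState()
merged.merge(part_a)
merged.merge(part_b)
print(summed, naive, merged.value)             # 50.0 25.0 17.5
```

With only one Long per accumulator, the count needed for the last step has nowhere to live, which matches the concern raised in the comment.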
[GitHub] spark issue #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF should assi...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21427

> But we marked this as experimental ...

That's also special in this case: we marked it as experimental in 2.3.1. Not many behavior changes are similar to this one. To highlight:
1. It was not marked as experimental in the first release.
2. It missed 2.3.1, so the old behavior will be there for some time, until the next release (2.3.2 or 2.4.0).
3. It turns runnable code into a failure, and the old behavior is fairly self-consistent (by-position match). It is not like turning failures into runnable code or fixing a correctness bug.

To summarize:
1. I agree the new behavior makes more sense; we should have done that in the first place.
2. This is a special case, as I mentioned above. We should be a little more conservative here.
3. Adding a config is not hard. Maybe @ueshin can build the framework first for passing configs to the Python worker?
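As a rough illustration of the two behaviors being compared, here is a plain-Python sketch that does not use PySpark or pandas. The function names and the sample rows are hypothetical. It shows why by-name matching can turn previously runnable code into a failure.

```python
# Declared output schema, in order.
schema = ["id", "v"]


def assign_by_position(schema, row):
    """Old-style behavior in this sketch: ignore returned names, use order."""
    return dict(zip(schema, row.values()))


def assign_by_name(schema, row):
    """New-style behavior in this sketch: look each schema column up by name."""
    return {name: row[name] for name in schema}


# Same names as the schema, but returned in a different order.
reordered = {"v": 1.5, "id": 7}
print(assign_by_position(schema, reordered))  # {'id': 1.5, 'v': 7}
print(assign_by_name(schema, reordered))      # {'id': 7, 'v': 1.5}

# Names that do not match the schema at all.
renamed = {"a": 7, "b": 1.5}
print(assign_by_position(schema, renamed))    # {'id': 7, 'v': 1.5}
try:
    assign_by_name(schema, renamed)
except KeyError as err:
    print("by-name match fails:", err)
```

The last case ran under by-position matching and fails under by-name matching, which is the kind of change the comment suggests guarding with a config.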
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91684/ Test FAILed.
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test FAILed.
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91684/testReport)** for PR 21320 at commit [`7f67ec0`](https://github.com/apache/spark/commit/7f67ec0a82dd09dd867d5882dda0965fcab28974).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #21482: [SPARK-24393][SQL] SQL builtin: isinf
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21482#discussion_r194611249

--- Diff: python/pyspark/sql/functions.py ---
@@ -468,6 +468,18 @@ def input_file_name():
         return Column(sc._jvm.functions.input_file_name())
     
     
    +@since(2.4)
    +def isinf(col):
--- End diff --

Shall we expose this to column.py too?
[GitHub] spark pull request #21482: [SPARK-24393][SQL] SQL builtin: isinf
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21482#discussion_r194610745

--- Diff: R/pkg/NAMESPACE ---
@@ -281,6 +281,8 @@ exportMethods("%<=>%",
                   "initcap",
                   "input_file_name",
                   "instr",
    +              "isInf",
    +              "isinf",
--- End diff --

Ah, I got it now. I believe we should settle on one of the two, though. I roughly remember that we keep functions in this_naming_style in functions[.py|.R|.scala], e.g. [SPARK-10621](https://issues.apache.org/jira/browse/SPARK-10621). Shall we stick to `isinf` then?
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3924/ Test PASSed.
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21319 **[Test build #91685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91685/testReport)** for PR 21319 at commit [`91fdedc`](https://github.com/apache/spark/commit/91fdedc4d91a7abde5f6b64dbfcf354b67d89a48).
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Merged build finished. Test PASSed.
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/34/ Test PASSed.
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21319 retest this please
[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21501#discussion_r194606928

--- Diff: python/pyspark/ml/feature.py ---
@@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           typeConverter=TypeConverters.toListString)
         caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
    +    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
    +                   "is true", typeConverter=TypeConverters.toString)
     
         @keyword_only
    -    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
    +    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False,
    +                 locale=None):
             """
    -        __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false)
    +        __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false,
    +        locale=None)
--- End diff --

Please add \ to the end of L2592 and use the right indentation here. Unfortunately, we need this to make the doc display correctly. See https://github.com/apache/spark/blob/master/python/pyspark/ml/feature.py#L3112.
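A small Python sketch of what the trailing backslash does inside a docstring. The two functions are hypothetical and are not the PySpark source. In a regular (non-raw) string literal, a backslash followed by a newline is removed, so the signature becomes one logical line. Presumably this is what the doc tooling needs, though that is an assumption here.

```python
import inspect


def with_backslash(inputCol=None, outputCol=None):
    """
    with_backslash(self, inputCol=None, \
                   outputCol=None)

    The backslash joins the two source lines of the signature.
    """


def without_backslash(inputCol=None, outputCol=None):
    """
    without_backslash(self, inputCol=None,
                      outputCol=None)

    Here the signature stays split across two docstring lines.
    """


def first_doc_line(func):
    # inspect.cleandoc drops leading blank lines and the common indentation.
    return inspect.cleandoc(func.__doc__).splitlines()[0]


# Only the first variant keeps the whole signature on its first line.
print("outputCol" in first_doc_line(with_backslash))
print("outputCol" in first_doc_line(without_backslash))
```

The continuation line's leading spaces remain in the string, which is likely why the review also asks for the right indentation.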
[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21501#discussion_r194607278

--- Diff: python/pyspark/ml/feature.py ---
@@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           typeConverter=TypeConverters.toListString)
         caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
    +    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
    +                   "is true", typeConverter=TypeConverters.toString)
     
         @keyword_only
    -    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
    +    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False,
    +                 locale=None):
             """
    -        __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false)
    +        __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false,
    +        locale=None)
             """
             super(StopWordsRemover, self).__init__()
             self._java_obj = self._new_java_obj("org.apache.spark.ml.feature.StopWordsRemover",
                                                 self.uid)
             self._setDefault(stopWords=StopWordsRemover.loadDefaultStopWords("english"),
    -                         caseSensitive=False)
    +                         caseSensitive=False, locale=StopWordsRemover.defaultLocale())
--- End diff --

You already have the `_java_obj`; calling `_java_obj.getLocale()` would give you the default locale. Then Python users only need `stopWordsRemover.getLocale()` to get the default value. In the param doc, we should make it clear that the default is the JVM default locale.
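A minimal sketch of the suggestion above, using stand-in classes rather than the real PySpark wrappers. `FakeJavaStopWordsRemover` and `StopWordsRemoverSketch` are hypothetical names. The idea is that the Python side reads its default from the backing object instead of exposing a separate `defaultLocale()` helper.

```python
class FakeJavaStopWordsRemover:
    """Stand-in for the JVM-side object; not a real Spark class."""

    def __init__(self, jvm_default_locale):
        self._locale = jvm_default_locale

    def getLocale(self):
        return self._locale


class StopWordsRemoverSketch:
    """Hypothetical Python wrapper that delegates its default to the JVM side."""

    def __init__(self, java_obj, locale=None):
        self._java_obj = java_obj
        # Default comes from the backing object, so no extra public API is needed.
        self._default_locale = self._java_obj.getLocale()
        self._locale = locale

    def getLocale(self):
        if self._locale is not None:
            return self._locale
        return self._default_locale


jvm_side = FakeJavaStopWordsRemover("en_US")
print(StopWordsRemoverSketch(jvm_side).getLocale())                  # en_US
print(StopWordsRemoverSketch(jvm_side, locale="tr_TR").getLocale())  # tr_TR
```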
[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21501#discussion_r194606981

--- Diff: python/pyspark/ml/feature.py ---
@@ -2582,25 +2582,31 @@ class StopWordsRemover(JavaTransformer, HasInputCol, HasOutputCol, JavaMLReadabl
                           typeConverter=TypeConverters.toListString)
         caseSensitive = Param(Params._dummy(), "caseSensitive", "whether to do a case sensitive " +
                               "comparison over the stop words", typeConverter=TypeConverters.toBoolean)
    +    locale = Param(Params._dummy(), "locale", "locale of the input. ignored when case sensitive " +
    +                   "is true", typeConverter=TypeConverters.toString)
     
         @keyword_only
    -    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
    +    def __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False,
    +                 locale=None):
             """
    -        __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false)
    +        __init__(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false,
    +        locale=None)
             """
             super(StopWordsRemover, self).__init__()
             self._java_obj = self._new_java_obj("org.apache.spark.ml.feature.StopWordsRemover",
                                                 self.uid)
             self._setDefault(stopWords=StopWordsRemover.loadDefaultStopWords("english"),
    -                         caseSensitive=False)
    +                         caseSensitive=False, locale=StopWordsRemover.defaultLocale())
             kwargs = self._input_kwargs
             self.setParams(**kwargs)
     
         @keyword_only
         @since("1.6.0")
    -    def setParams(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False):
    +    def setParams(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=False,
    +                  locale=None):
             """
    -        setParams(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false)
    +        setParams(self, inputCol=None, outputCol=None, stopWords=None, caseSensitive=false,
    +        locale=None)
--- End diff --

ditto
[GitHub] spark pull request #21501: [SPARK-15064][ML] Locale support in StopWordsRemo...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21501#discussion_r194606418

--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala ---
@@ -84,7 +86,28 @@ class StopWordsRemover @Since("1.5.0") (@Since("1.5.0") override val uid: String
       @Since("1.5.0")
       def getCaseSensitive: Boolean = $(caseSensitive)
     
    -  setDefault(stopWords -> StopWordsRemover.loadDefaultStopWords("english"), caseSensitive -> false)
    +  /**
    +   * Locale of the input for case insensitive matching. Ignored when [[caseSensitive]]
    +   * is true.
    +   * Default: Locale.getDefault.toString
    +   * @see `StopWordsRemover.getDefaultLocale()`
--- End diff --

I feel it is unnecessary to expose it as a public API. This is the same as `Locale.getDefault.toString` or `stopWordsRemover.getLocale` when nothing is set. See my comments on the Python API.
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21357 Merged build finished. Test FAILed.
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21357 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91681/ Test FAILed.
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21357 **[Test build #91681 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91681/testReport)** for PR 21357 at commit [`8ad2a3f`](https://github.com/apache/spark/commit/8ad2a3f8112662a865ee1dbaf7c5269197c3ee4f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #21486: [SPARK-24387][Core] Heartbeat-timeout executor is...
Github user Ngone51 commented on a diff in the pull request: https://github.com/apache/spark/pull/21486#discussion_r194606075

--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -197,14 +197,14 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, clock: Clock)
             if (now - lastSeenMs > executorTimeoutMs) {
               logWarning(s"Removing executor $executorId with no recent heartbeats: " +
                 s"${now - lastSeenMs} ms exceeds timeout $executorTimeoutMs ms")
    -          scheduler.executorLost(executorId, SlaveLost("Executor heartbeat " +
    -            s"timed out after ${now - lastSeenMs} ms"))
                 // Asynchronously kill the executor to avoid blocking the current thread
               killExecutorThread.submit(new Runnable {
                 override def run(): Unit = Utils.tryLogNonFatalError {
                   // Note: we want to get an executor back after expiring this one,
                   // so do not simply call `sc.killExecutor` here (SPARK-8119)
                   sc.killAndReplaceExecutor(executorId)
--- End diff --

To be more specific, `killAndReplaceExecutor#killExecutors` will block until we get a response from the cluster manager or time out after 120s (the default RPC timeout config).
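The blocking behavior described above is why the diff hands the call to a separate thread. Here is a rough Python analogue using `concurrent.futures`. The function `kill_and_replace_executor` is a hypothetical stand-in that just sleeps, and the 0.2 s delay is an arbitrary substitute for the real RPC wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def kill_and_replace_executor(executor_id):
    """Hypothetical blocking call; the sleep stands in for waiting on an RPC reply."""
    time.sleep(0.2)
    return f"replaced {executor_id}"


# Single worker thread, loosely mirroring a dedicated kill-executor thread.
kill_pool = ThreadPoolExecutor(max_workers=1)

start = time.monotonic()
future = kill_pool.submit(kill_and_replace_executor, "exec-1")
submit_elapsed = time.monotonic() - start   # the caller is not blocked here

result = future.result()                    # only this call waits for completion
total_elapsed = time.monotonic() - start
kill_pool.shutdown()

print(f"submit returned after {submit_elapsed:.3f}s, result after {total_elapsed:.3f}s")
print(result)
```

Anything the caller does between `submit` and `result` runs while the slow call is still in flight, so the order of side effects around the asynchronous call matters.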
[GitHub] spark issue #21045: [SPARK-23931][SQL] Adds arrays_zip function to sparksql
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21045 LGTM. I'm fine with this function name `arrays_zip` but wondering if others all agree on it too.
[GitHub] spark issue #21370: [SPARK-24215][PySpark] Implement _repr_html_ for datafra...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21370

@xuanyuanking Just for your reference, the description of this PR could be improved to something like:

> This PR adds eager execution to the `__repr__` and `_repr_html_` of the DataFrame APIs in PySpark. When eager evaluation is enabled, `_repr_html_` returns a rich HTML version of the top-K rows of the DataFrame output. If `_repr_html_` is not called by the REPL, `__repr__` will return the plain text of the top-K rows.
>
> This PR adds three new external SQL confs for controlling the behavior of eager evaluation:
> - spark.sql.repl.eagerEval.enabled: Enables eager evaluation or not. When true, the top K rows of Dataset will be displayed if and only if the REPL supports the eager evaluation. Currently, the eager evaluation is only supported in PySpark. For notebooks like Jupyter, the HTML table (generated by `_repr_html_`) will be returned. For the plain Python REPL, the returned outputs are formatted like dataframe.show().
> - spark.sql.repl.eagerEval.maxNumRows: The max number of rows that are returned by eager evaluation. This only takes effect when spark.sql.repl.eagerEval.enabled is set to true.
> - spark.sql.repl.eagerEval.truncate: The max number of characters of each row that is returned by eager evaluation. This only takes effect when spark.sql.repl.eagerEval.enabled is set to true.
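To make the proposed description concrete, here is a toy Python sketch of how the three confs could drive `__repr__` and `_repr_html_`. `ToyFrame` is a hypothetical class, not PySpark's DataFrame. The fallback values of 20 and the per-cell truncation are assumptions of this sketch.

```python
import html

ENABLED = "spark.sql.repl.eagerEval.enabled"
MAX_ROWS = "spark.sql.repl.eagerEval.maxNumRows"
TRUNCATE = "spark.sql.repl.eagerEval.truncate"


class ToyFrame:
    """Toy stand-in for a DataFrame; conf is a plain dict of string settings."""

    def __init__(self, columns, rows, conf):
        self.columns = columns
        self.rows = rows
        self.conf = conf

    def _eager(self):
        return self.conf.get(ENABLED, "false") == "true"

    def _top_cells(self):
        max_rows = int(self.conf.get(MAX_ROWS, "20"))  # fallback is an assumption
        width = int(self.conf.get(TRUNCATE, "20"))     # fallback is an assumption
        return [[str(c)[:width] for c in row] for row in self.rows[:max_rows]]

    def __repr__(self):
        if not self._eager():
            return "ToyFrame[" + ", ".join(self.columns) + "]"
        lines = [" | ".join(self.columns)]
        lines += [" | ".join(cells) for cells in self._top_cells()]
        return "\n".join(lines)

    def _repr_html_(self):
        if not self._eager():
            return None  # notebooks then fall back to the plain repr
        head = "".join(f"<th>{html.escape(c)}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in cells) + "</tr>"
            for cells in self._top_cells()
        )
        return f"<table><tr>{head}</tr>{body}</table>"


rows = [(1, "alice"), (2, "bob"), (3, "carol")]
eager = ToyFrame(["id", "name"], rows, {ENABLED: "true", MAX_ROWS: "2", TRUNCATE: "3"})
lazy = ToyFrame(["id", "name"], rows, {})
print(repr(eager))
print(repr(lazy))
```

With eager evaluation off, the sketch returns `None` from `_repr_html_` so that a notebook front end would use the ordinary text representation instead.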
[GitHub] spark pull request #21096: [SPARK-24011][CORE][WIP] cache rdd's immediate pa...
Github user Ngone51 closed the pull request at: https://github.com/apache/spark/pull/21096
[GitHub] spark pull request #20996: [SPARK-23884][CORE] hasLaunchedTask should be tru...
Github user Ngone51 closed the pull request at: https://github.com/apache/spark/pull/20996
[GitHub] spark issue #21370: [SPARK-24215][PySpark] Implement _repr_html_ for datafra...
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21370

```
Test coverage is the most critical when we refactor the existing code and add new features. Hopefully, when you submit new PRs in the future, could you also improve this part?
```

Of course, I'll do this in a follow-up PR and answer all of Xiao's questions tonight. Thanks for all your comments.
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Merged build finished. Test FAILed.
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91680/ Test FAILed.
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21319 **[Test build #91680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91680/testReport)** for PR 21319 at commit [`91fdedc`](https://github.com/apache/spark/commit/91fdedc4d91a7abde5f6b64dbfcf354b67d89a48). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91684/testReport)** for PR 21320 at commit [`7f67ec0`](https://github.com/apache/spark/commit/7f67ec0a82dd09dd867d5882dda0965fcab28974). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/33/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3923/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91679/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91679/testReport)** for PR 21320 at commit [`89febc8`](https://github.com/apache/spark/commit/89febc8e978d606e32911088e9589462805b8697). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21533 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3922/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21533 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21533 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/32/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21533 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21496: docs: fix typo
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21496 **[Test build #91683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91683/testReport)** for PR 21496 at commit [`fea9616`](https://github.com/apache/spark/commit/fea9616fb35e3fcf886073767da040aef3a408e0). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21533 **[Test build #91682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91682/testReport)** for PR 21533 at commit [`f922fd8`](https://github.com/apache/spark/commit/f922fd8c995164cada4a8b72e92c369a827def16). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21533: [SPARK-24195][Core] Bug fix for local:/ path in SparkCon...
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21533 cc @felixcheung. Please take a look at this when you have time. Thanks.
[GitHub] spark pull request #21533: [SPARK-24195][Core] Bug fix for local:/ path in S...
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/21533

[SPARK-24195][Core] Bug fix for local:/ path in SparkContext.addFile

## What changes were proposed in this pull request?

The changes in [SPARK-6300](https://issues.apache.org/jira/browse/SPARK-6300) essentially changed schemePath to

```
new File(path).getCanonicalFile.toURI.toString
```

This is a problem when the path uses the `local:` scheme, because `java.io.File` does not handle it. For example:

    new File("local:///home/user/demo/logger.config").getCanonicalFile.toURI.toString
    res1: String = file:/user/anotheruser/local:/home/user/demo/logger.config

## How was this patch tested?

Added a test in `SparkContextSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuanyuanking/spark SPARK-24195

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21533.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21533

commit f922fd8c995164cada4a8b72e92c369a827def16
Author: Yuanjian Li
Date: 2018-06-12T01:51:44Z

    bug fix for local:/ path in sc.addFile
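The behaviour described in this pull request can be reproduced with a minimal sketch, assuming a Unix-like file system. The names `SchemePathSketch` and `schemeCorrectedPath` are hypothetical and exist only for illustration; this is not the code from the PR. The sketch shows one possible guard: check the URI scheme first and hand only scheme-less paths to `java.io.File`.

```scala
import java.io.File
import java.net.URI

object SchemePathSketch {
  // Hypothetical helper (not the code from the PR): canonicalize only
  // scheme-less paths through java.io.File and leave any URI that already
  // carries a scheme (local:, hdfs:, file:, ...) untouched.
  def schemeCorrectedPath(path: String): String =
    new URI(path).getScheme match {
      case null => new File(path).getCanonicalFile.toURI.toString
      case _    => path
    }

  def main(args: Array[String]): Unit = {
    val localPath = "local:///home/user/demo/logger.config"

    // java.io.File sees a relative path whose first component is "local:",
    // so the result is resolved against the current working directory.
    println(new File(localPath).getCanonicalFile.toURI.toString)

    // With the scheme check, the local: URI passes through unchanged.
    println(schemeCorrectedPath(localPath))

    // A plain path is still canonicalized into a file: URI.
    println(schemeCorrectedPath("/tmp/demo.txt"))
  }
}
```

Note that `new URI(path)` throws `URISyntaxException` for strings that are not valid URIs (for example, paths containing spaces), so a real implementation would need to handle that case.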
[GitHub] spark issue #21496: docs: fix typo
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21496 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21496: docs: fix typo
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21496 **[Test build #4199 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4199/testReport)** for PR 21496 at commit [`fea9616`](https://github.com/apache/spark/commit/fea9616fb35e3fcf886073767da040aef3a408e0). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21521: [SPARK-23732][docs] Fix source links in generated...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21521 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21258: [SPARK-23933][SQL] Add map_from_arrays function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21258 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91677/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21258: [SPARK-23933][SQL] Add map_from_arrays function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21258 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21357 **[Test build #91681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91681/testReport)** for PR 21357 at commit [`8ad2a3f`](https://github.com/apache/spark/commit/8ad2a3f8112662a865ee1dbaf7c5269197c3ee4f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21258: [SPARK-23933][SQL] Add map_from_arrays function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21258 **[Test build #91677 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91677/testReport)** for PR 21258 at commit [`38d0868`](https://github.com/apache/spark/commit/38d086877385324ae872652e9dbeb484a0915557). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21521: [SPARK-23732][docs] Fix source links in generated scalad...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21521 Merged to master and branch-2.3. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21357 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...
Github user arunmahadevan commented on a diff in the pull request: https://github.com/apache/spark/pull/21469#discussion_r194592510

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala ---
    @@ -112,14 +122,19 @@ trait StateStoreWriter extends StatefulOperator { self: SparkPlan =>
         val storeMetrics = store.metrics
         longMetric("numTotalStateRows") += storeMetrics.numKeys
         longMetric("stateMemory") += storeMetrics.memoryUsedBytes
    -    storeMetrics.customMetrics.foreach { case (metric, value) =>
    -      longMetric(metric.name) += value
    +    storeMetrics.customMetrics.foreach {
    +      case (metric: StateStoreCustomAverageMetric, value) =>
    +        longMetric(metric.name).set(value * 1.0d)
    --- End diff --

Not sure if SQLAppStatusListener comes into play for reporting query progress (e.g. StreamingQueryWrapper.lastProgress). https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala#L193

Based on my understanding, SQLMetric is an accumulator, so what is returned is the merged value of the accumulators across all tasks. The merge operation in SQLMetric just adds the values, so it makes sense only for count or size values. For now we would be able to display the (min, med, max) values in the UI but not in the "query status". I was thinking that if we make it a count metric, it may work (similar to the number of state rows). I am fine either way.
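The point about additive merging can be illustrated with a toy model. This is a minimal sketch, not Spark's `SQLMetric`: `SumMetric` and `AvgMetric` are hypothetical classes, and the per-task values are made up. It shows that when merge simply adds, per-task averages combine into their sum, and that carrying a sum and a count is one way (an assumption here, not something proposed in the thread) to keep an average mergeable.

```scala
object MergeByAddSketch {
  // Toy stand-in for an accumulator whose merge simply adds values.
  // Illustrative only; this is not Spark's SQLMetric class.
  final case class SumMetric(value: Long) {
    def merge(other: SumMetric): SumMetric = SumMetric(value + other.value)
  }

  // One way to keep an average mergeable: carry the sum and the count.
  final case class AvgMetric(sum: Long, count: Long) {
    def merge(other: AvgMetric): AvgMetric =
      AvgMetric(sum + other.sum, count + other.count)
    def average: Double = if (count == 0) 0.0 else sum.toDouble / count
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical per-task averages reported by three tasks.
    val perTaskAverages = Seq(10L, 20L, 30L)

    // Additive merge yields the sum of the averages, not an average.
    val merged = perTaskAverages.map(SumMetric(_)).reduce(_ merge _)
    println(merged.value) // prints 60

    // Merging (sum, count) pairs recovers the average across tasks.
    val avg = perTaskAverages.map(v => AvgMetric(v, 1L)).reduce(_ merge _)
    println(avg.average) // prints 20.0
  }
}
```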
[GitHub] spark issue #21532: [SPARK-24524][SQL]Improve aggregateMetrics: reduce memor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21532 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91676/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21532: [SPARK-24524][SQL]Improve aggregateMetrics: reduce memor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21532 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21532: [SPARK-24524][SQL]Improve aggregateMetrics: reduce memor...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21532 **[Test build #91676 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91676/testReport)** for PR 21532 at commit [`f58b944`](https://github.com/apache/spark/commit/f58b94411d6564d66338f97b9e753cd3267dd0cf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21527: [SPARK-24519] MapStatus has 2000 hardcoded
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21527 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21527: [SPARK-24519] MapStatus has 2000 hardcoded
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21527 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91675/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21527: [SPARK-24519] MapStatus has 2000 hardcoded
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21527 **[Test build #91675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91675/testReport)** for PR 21527 at commit [`4c8acfa`](https://github.com/apache/spark/commit/4c8acfa5899ccbdafeb630f38ce44b23332b80f2). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21319 **[Test build #91680 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91680/testReport)** for PR 21319 at commit [`91fdedc`](https://github.com/apache/spark/commit/91fdedc4d91a7abde5f6b64dbfcf354b67d89a48). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/31/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21319: [SPARK-24267][SQL] explicitly keep DataSourceReader in D...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21319 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3921/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF should assi...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21427

> we can't just change the behavior. We think the old behavior doesn't make sense and users should change their code, but users may not think in this way.

I think this basically means we will need a configuration for every behaviour change, whether it's a bug or not. If we have failed to explain why users might think the old behaviour makes sense, how about elaborating on that rather than assuming hypothetically that some users might?
[GitHub] spark issue #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF should assi...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21427 Okay, I get that this can make it smoother to go ahead. I am okay with it.
[GitHub] spark pull request #19364: [SPARK-22144][SQL] ExchangeCoordinator combine th...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19364 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21427: [SPARK-24324][PYTHON] Pandas Grouped Map UDF should assi...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21427 But we marked this as experimental. If we treat old APIs and new experimental APIs in the same way, I wonder why we have the distinction. One thing I am less clear about is what kind of scenario we are worried about. I reread the discussion here and I still don't know which case we are worried about breaking.
[GitHub] spark issue #21523: [SPARK-24506][UI] Add UI filters also to thriftserver ta...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21523 @mgaido91 Please fix the PR title and description to reflect the new changes you made. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19364: [SPARK-22144][SQL] ExchangeCoordinator combine the parti...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19364 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21517: Testing k9s change - please ignore (13)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21517 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/29/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21357 Kindly pinging @tdas again, and cc @jose-torres @jerryshao @HyukjinKwon @arunmahadevan for review.
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21517: Testing k9s change - please ignore (13)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21517 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21222 Kindly pinging @tdas again.
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/30/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 Kindly pinging @tdas again.