[spark] branch master updated: [SPARK-27489][WEBUI] UI updates to show executor resource information
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 4c8f114  [SPARK-27489][WEBUI] UI updates to show executor resource information
4c8f114 is described below

commit 4c8f114783241e780c0ad227bfb3191657637df7
Author: Thomas Graves
AuthorDate: Wed Sep 4 09:45:44 2019 +0800

    [SPARK-27489][WEBUI] UI updates to show executor resource information

    ### What changes were proposed in this pull request?

    We are adding other resource type support to the executors and Spark. We should show the resource information for each executor on the UI Executors page. This also adds a toggle button to show the resources column; it is off by default.

    ![executorui1](https://user-images.githubusercontent.com/4563792/63891432-c815b580-c9aa-11e9-9f41-62975649efbc.png)
    ![Screenshot from 2019-08-28 14-56-26](https://user-images.githubusercontent.com/4563792/63891516-fd220800-c9aa-11e9-9fe4-89fcdca37306.png)

    ### Why are the changes needed?

    To show the user what resources the executors have, such as GPUs, FPGAs, etc.

    ### Does this PR introduce any user-facing change?

    Yes, it introduces UI and REST API changes to show the resources.

    ### How was this patch tested?

    Unit tests and manual UI tests on YARN and standalone modes.

    Closes #25613 from tgravescs/SPARK-27489-gpu-ui-latest.
Authored-by: Thomas Graves
Signed-off-by: Gengliang Wang
---
 .../spark/ui/static/executorspage-template.html    |   1 +
 .../org/apache/spark/ui/static/executorspage.js    |  26 -
 .../deploy/history/HistoryAppStatusStore.scala     |   2 +-
 .../apache/spark/deploy/master/WorkerInfo.scala    |   7 +-
 .../apache/spark/resource/ResourceAllocator.scala  |  28 ++---
 .../spark/resource/ResourceInformation.scala       |  10 ++
 .../spark/scheduler/ExecutorResourceInfo.scala     |   8 +-
 .../spark/scheduler/cluster/ExecutorData.scala     |  18 +--
 .../spark/scheduler/cluster/ExecutorInfo.scala     |  19 ++-
 .../apache/spark/status/AppStatusListener.scala    |   1 +
 .../scala/org/apache/spark/status/LiveEntity.scala |   5 +-
 .../scala/org/apache/spark/status/api/v1/api.scala |   4 +-
 .../scala/org/apache/spark/util/JsonProtocol.scala |  25 +++-
 .../application_list_json_expectation.json         |  15 +++
 .../completed_app_list_json_expectation.json       |  15 +++
 .../executor_list_json_expectation.json            |   3 +-
 ...ist_with_executor_metrics_json_expectation.json |  12 +-
 .../executor_memory_usage_expectation.json         |  15 ++-
 .../executor_node_blacklisting_expectation.json    |  15 ++-
 ...de_blacklisting_unblacklisting_expectation.json |  15 ++-
 .../executor_resource_information_expectation.json | 130 +
 .../limit_app_list_json_expectation.json           |  30 ++---
 .../minDate_app_list_json_expectation.json         |  20 +++-
 .../minEndDate_app_list_json_expectation.json      |  15 +++
 .../spark-events/application_1555004656427_0144    |   9 ++
 .../scala/org/apache/spark/SparkContextSuite.scala |  14 ++-
 .../spark/deploy/history/HistoryServerSuite.scala  |   1 +
 .../org/apache/spark/util/JsonProtocolSuite.scala  |  12 +-
 dev/.rat-excludes                                  |   1 +
 29 files changed, 394 insertions(+), 82 deletions(-)

diff --git a/core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html b/core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html
index b236857..31ef045 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html
+++ b/core/src/main/resources/org/apache/spark/ui/static/executorspage-template.html
@@ -92,6 +92,7 @@ limitations under the License.
       Off Heap Storage Memory
       Disk Used
       Cores
+      Resources
       Active Tasks
       Failed Tasks
       Complete Tasks
diff --git a/core/src/main/resources/org/apache/spark/ui/static/executorspage.js b/core/src/main/resources/org/apache/spark/ui/static/executorspage.js
index 17cb68c..11d7c77 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/executorspage.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/executorspage.js
@@ -39,6 +39,19 @@ function formatStatus(status, type, row) {
   return "Dead"
 }

+function formatResourceCells(resources) {
+  var result = ""
+  var count = 0
+  $.each(resources, function (name, resInfo) {
+    if (count > 0) {
+      result += ", "
+    }
+    result += name + ': [' + resInfo.addresses.join(", ") + ']'
+    count += 1
+  });
+  return result
+}
+
 jQuery.extend(jQuery.fn.dataTableExt.oSort, {
   "title-numeric-pre": function (a) {
     var x = a.match(/ti
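The new `formatResourceCells` JavaScript helper renders each executor's resource map as `name: [addr, addr]` pairs. For clarity, the same formatting logic can be sketched in Python (a hypothetical `format_resource_cells` helper, not part of Spark):

```python
def format_resource_cells(resources):
    # resources maps a resource name to {"addresses": [...]}, as in the REST API.
    # Render each entry as "name: [a, b]" and join the entries with commas.
    return ", ".join(
        "%s: [%s]" % (name, ", ".join(info["addresses"]))
        for name, info in resources.items()
    )

print(format_resource_cells({"gpu": {"addresses": ["0", "1"]},
                             "fpga": {"addresses": ["f0"]}}))
# gpu: [0, 1], fpga: [f0]
```

With an empty map the cell stays blank, matching the default (hidden) column.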
[spark-website] branch asf-site updated: Add Gengliang Wang to committer list
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 047e311  Add Gengliang Wang to committer list
047e311 is described below

commit 047e311fdd9195b9da3b34cdb25b506f61df7c18
Author: Gengliang Wang
AuthorDate: Mon Sep 2 14:43:23 2019 +0800

    Add Gengliang Wang to committer list

    Author: Gengliang Wang
    Author: Gengliang Wang

    Closes #218 from gengliangwang/new-committer.
---
 committers.md        | 1 +
 site/committers.html | 4
 2 files changed, 5 insertions(+)

diff --git a/committers.md b/committers.md
index fffb042..bc4205e 100644
--- a/committers.md
+++ b/committers.md
@@ -71,6 +71,7 @@ navigation:
 |Takuya Ueshin|Databricks|
 |Marcelo Vanzin|Cloudera|
 |Shivaram Venkataraman|University of Wisconsin, Madison|
+|Gengliang Wang|Databricks|
 |Yuming Wang|eBay|
 |Zhenhua Wang|Huawei|
 |Patrick Wendell|Databricks|
diff --git a/site/committers.html b/site/committers.html
index c897a37..025b505 100644
--- a/site/committers.html
+++ b/site/committers.html
@@ -455,6 +455,10 @@
   University of Wisconsin, Madison

+  Gengliang Wang
+  Databricks
+
+
   Yuming Wang
   eBay

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (beae14d -> 99ea324)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from beae14d  [SPARK-30104][SQL] Fix catalog resolution for 'global_temp'
  add 99ea324  [SPARK-27506][SQL] Allow deserialization of Avro data using compatible schemas

No new revisions were added by this update.

Summary of changes:
 docs/sql-data-sources-avro.md                      |  9 +
 .../apache/spark/sql/avro/AvroDataToCatalyst.scala |  9 -
 .../org/apache/spark/sql/avro/AvroOptions.scala    |  8
 .../org/apache/spark/sql/avro/functions.scala      |  7 ++--
 .../apache/spark/sql/avro/AvroFunctionsSuite.scala | 45 +-
 python/pyspark/sql/avro/functions.py               |  7 ++--
 6 files changed, 76 insertions(+), 9 deletions(-)
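SPARK-27506 lets Avro data written with one schema be deserialized with a different but compatible reader schema. The underlying idea is Avro schema resolution: the reader may ignore writer-only fields and must supply defaults for fields the writer did not record. A toy illustration of that resolution rule (plain dicts, not Spark's or the Avro library's implementation):

```python
def resolve(record, reader_schema):
    """Project a decoded record onto a reader schema, Avro-resolution style:
    keep only fields the reader knows, fill missing ones from defaults."""
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            raise ValueError("no value or default for field %r" % name)
    return out

# Writer recorded an extra field; reader adds a field with a default.
reader = {"fields": [{"name": "id"}, {"name": "tag", "default": "none"}]}
print(resolve({"id": 7, "dropped": True}, reader))
# {'id': 7, 'tag': 'none'}
```

A reader field with no matching writer field and no default is an incompatibility, which is why the toy raises in that case.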
[spark] branch master updated (39c0696 -> cada5be)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 39c0696  [MINOR] Fix google style guide address
  add cada5be  [SPARK-30230][SQL] Like ESCAPE syntax can not use '_' and '%'

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala            | 4
 .../org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala            | 4
 2 files changed, 8 insertions(+)
[spark] branch master updated (982f72f -> 5114389)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 982f72f  [SPARK-30238][SQL] hive partition pruning can only support string and integral types
  add 5114389  [SPARK-30107][SQL] Expose nested schema pruning to all V2 sources

No new revisions were added by this update.

Summary of changes:
 .../sql/execution/datasources/SchemaPruning.scala  | 36 --
 .../execution/datasources/v2/FileScanBuilder.scala |  8 +++-
 .../execution/datasources/v2/PushDownUtils.scala   | 36 ++
 .../datasources/v2/V2ScanRelationPushDown.scala    | 31 ++-
 .../datasources/v2/orc/OrcScanBuilder.scala        |  2 +
 .../v2/parquet/ParquetScanBuilder.scala            |  2 +
 .../spark/sql/FileBasedDataSourceSuite.scala       | 17 +
 .../execution/datasources/SchemaPruningSuite.scala | 44 ++
 .../sql/execution/datasources/csv/CSVSuite.scala   | 18 +
 9 files changed, 140 insertions(+), 54 deletions(-)
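Nested schema pruning means a query that only touches, say, `address.city` should read just that subfield from the file rather than the whole `address` struct. The shape of that transformation can be sketched with nested dicts standing in for struct types (illustrative only, not the `SchemaPruning` rule itself):

```python
def prune(schema, paths):
    """Keep only the fields of `schema` named by dotted `paths`.
    `schema` maps field name -> nested dict (struct) or None (leaf)."""
    kept = {}
    for path in paths:
        head, _, rest = path.partition(".")
        if head not in schema:
            continue
        if rest and isinstance(schema[head], dict):
            # Recurse into the struct, keeping only the requested subfields.
            kept.setdefault(head, {}).update(prune(schema[head], [rest]))
        else:
            kept[head] = schema[head]
    return kept

full = {"name": None, "address": {"city": None, "zip": None, "geo": {"lat": None}}}
print(prune(full, ["name", "address.city"]))
# {'name': None, 'address': {'city': None}}
```

The pruned schema is what gets pushed down to the file reader, so `zip` and `geo` are never materialized.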
[spark] branch master updated (b86d4bb -> 187f3c1)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from b86d4bb  [SPARK-30001][SQL] ResolveRelations should handle both V1 and V2 tables
  add 187f3c1  [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax

No new revisions were added by this update.

Summary of changes:
 docs/sql-keywords.md                               |  1 +
 .../apache/spark/sql/catalyst/parser/SqlBase.g4    |  5 +-
 .../apache/spark/sql/catalyst/dsl/package.scala    |  3 +-
 .../catalyst/expressions/regexpExpressions.scala   | 33 ++---
 .../spark/sql/catalyst/optimizer/expressions.scala |  5 +-
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  9 ++-
 .../spark/sql/catalyst/util/StringUtils.scala      | 10 +--
 .../expressions/RegexpExpressionsSuite.scala       | 78 ++
 .../catalyst/parser/ExpressionParserSuite.scala    | 12
 .../parser/TableIdentifierParserSuite.scala        |  2 +
 .../spark/sql/catalyst/util/StringUtilsSuite.scala | 35 --
 .../sql/dynamicpruning/PartitionPruning.scala      |  2 +-
 12 files changed, 167 insertions(+), 28 deletions(-)
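In a LIKE pattern, `_` matches any single character and `%` any sequence; `ESCAPE` designates a character that makes the following wildcard literal. A hedged Python sketch of translating such a pattern to a regex, illustrating the semantics rather than Spark's `StringUtils` implementation:

```python
import re

def like_to_regex(pattern, escape="\\"):
    """Translate a SQL LIKE pattern (with a configurable ESCAPE char) to a regex."""
    out, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped char is taken literally
            i += 2
            continue
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))
        i += 1
    return "^" + "".join(out) + "$"

# With ESCAPE '/', the pattern '10/%%' means: literal "10%" followed by anything.
print(bool(re.match(like_to_regex("10/%%", escape="/"), "10% off")))
# True
```

Without the escape, both `%` signs would be wildcards and `100 off` would also match; with it, only strings starting with the literal `10%` do.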
[spark] branch master updated (dba673f -> 7435146)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from dba673f  [SPARK-29489][ML][PYSPARK] ml.evaluation support log-loss
  add 7435146  [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/parser/SqlBase.g4    |  2 +-
 .../spark/sql/catalyst/parser/AstBuilder.scala     | 48 ++
 .../sql/catalyst/plans/logical/statements.scala    | 19 ++
 .../spark/sql/catalyst/parser/DDLParserSuite.scala | 76 ++
 .../catalyst/analysis/ResolveSessionCatalog.scala  | 20 +-
 .../spark/sql/execution/SparkSqlParser.scala       | 49 --
 .../spark/sql/connector/DataSourceV2SQLSuite.scala | 17 +
 .../spark/sql/execution/SparkSqlParserSuite.scala  | 62 --
 8 files changed, 180 insertions(+), 113 deletions(-)
[spark] branch master updated (9e77d48 -> 1296bbb)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 9e77d48  [SPARK-21492][SQL][FOLLOW UP] Reimplement UnsafeExternalRowSorter in database style iterator
  add 1296bbb  [SPARK-29504][WEBUI] Toggle full job description on click

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/org/apache/spark/ui/static/webui.js | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)
[spark] branch master updated (bdf0c60 -> 5b628f8)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from bdf0c60  [SPARK-28752][BUILD][FOLLOWUP] Fix to install `rouge` instead of `rogue`
  add 5b628f8  Revert "[SPARK-26081][SPARK-2]"

No new revisions were added by this update.

Summary of changes:
 .../datasources/csv/CsvOutputWriter.scala          | 22 ++---
 .../datasources/json/JsonOutputWriter.scala        | 18 ++--
 .../datasources/text/TextOutputWriter.scala        | 16 +---
 .../streaming/ManifestFileCommitProtocol.scala     | 23 ++---
 .../sql/execution/datasources/csv/CSVSuite.scala   |  9 --
 .../sql/execution/datasources/json/JsonSuite.scala |  9 --
 .../sql/execution/datasources/text/TextSuite.scala |  9 --
 .../spark/sql/streaming/FileStreamSinkSuite.scala  | 98 +++---
 8 files changed, 73 insertions(+), 131 deletions(-)
[spark] branch master updated (9351e3e -> b182ed8)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 9351e3e  Revert "[SPARK-29991][INFRA] Support `test-hive1.2` in PR Builder"
  add b182ed8  [SPARK-29724][SPARK-29726][WEBUI][SQL] Support JDBC/ODBC tab for HistoryServer WebUI

No new revisions were added by this update.

Summary of changes:
 .../spark/deploy/history/FsHistoryProvider.scala   |   5 +-
 .../spark/status/AppHistoryServerPlugin.scala      |   5 +
 .../sql/execution/ui/SQLHistoryServerPlugin.scala  |   3 +
 .../org.apache.spark.status.AppHistoryServerPlugin |   1 +
 .../sql/hive/thriftserver/HiveThriftServer2.scala  | 203 ++---
 .../SparkExecuteStatementOperation.scala           |  19 +-
 .../thriftserver/SparkGetCatalogsOperation.scala   |  10 +-
 .../thriftserver/SparkGetColumnsOperation.scala    |  10 +-
 .../thriftserver/SparkGetFunctionsOperation.scala  |  10 +-
 .../thriftserver/SparkGetSchemasOperation.scala    |  10 +-
 .../thriftserver/SparkGetTableTypesOperation.scala |  10 +-
 .../thriftserver/SparkGetTablesOperation.scala     |  10 +-
 .../thriftserver/SparkGetTypeInfoOperation.scala   |  10 +-
 .../hive/thriftserver/SparkSQLSessionManager.scala |   4 +-
 .../ui/HiveThriftServer2AppStatusStore.scala       | 132 +
 .../ui/HiveThriftServer2EventManager.scala         | 113
 .../ui/HiveThriftServer2HistoryServerPlugin.scala} |  16 +-
 .../ui/HiveThriftServer2Listener.scala             | 315 +
 .../hive/thriftserver/ui/ThriftServerPage.scala    |  30 +-
 .../thriftserver/ui/ThriftServerSessionPage.scala  |  13 +-
 .../sql/hive/thriftserver/ui/ThriftServerTab.scala |  15 +-
 .../ui/HiveThriftServer2ListenerSuite.scala        | 164 +++
 .../thriftserver/ui/ThriftServerPageSuite.scala    |  68 +++--
 23 files changed, 883 insertions(+), 293 deletions(-)
 create mode 100644 sql/hive-thriftserver/src/main/resources/META-INF/services/org.apache.spark.status.AppHistoryServerPlugin
 create mode 100644 sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/HiveThriftServer2AppStatusStore.scala
 create mode 100644 sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/HiveThriftServer2EventManager.scala
 copy sql/{core/src/main/scala/org/apache/spark/sql/execution/ui/SQLHistoryServerPlugin.scala => hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/HiveThriftServer2HistoryServerPlugin.scala} (75%)
 create mode 100644 sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/HiveThriftServer2Listener.scala
 create mode 100644 sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/ui/HiveThriftServer2ListenerSuite.scala
[spark] branch master updated (d8b0914 -> a36a723)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from d8b0914  [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar"
  add a36a723  [SPARK-29215][SQL] current namespace should be tracked in SessionCatalog if the current catalog is session catalog

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala     |  4 +-
 .../sql/connector/catalog/CatalogManager.scala     | 60 -
 .../connector/catalog/CatalogManagerSuite.scala    | 76 ++
 .../org/apache/spark/sql/DataFrameWriterV2.scala   |  6 +-
 4 files changed, 110 insertions(+), 36 deletions(-)
[spark] branch master updated (82d0aa3 -> 3c4044e)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 82d0aa3  [SPARK-30762] Add dtype=float32 support to vector_to_array UDF
  add 3c4044e  [SPARK-30703][SQL][DOCS] Add a document for the ANSI mode

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml                           |  11 +-
 ...{sql-keywords.md => sql-ref-ansi-compliance.md} | 125 -
 docs/sql-ref-arithmetic-ops.md                     |  22
 3 files changed, 132 insertions(+), 26 deletions(-)
 rename docs/{sql-keywords.md => sql-ref-ansi-compliance.md} (82%)
 delete mode 100644 docs/sql-ref-arithmetic-ops.md
[spark] branch branch-3.0 updated: [SPARK-30703][SQL][DOCS] Add a document for the ANSI mode
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 82258aa  [SPARK-30703][SQL][DOCS] Add a document for the ANSI mode
82258aa is described below

commit 82258aa4794d15bfe9cfd2b4bced790ef1d35e45
Author: Takeshi Yamamuro
AuthorDate: Thu Feb 13 10:53:55 2020 -0800

    [SPARK-30703][SQL][DOCS] Add a document for the ANSI mode

    ### What changes were proposed in this pull request?

    This pr intends to add a document for the ANSI mode. Screenshots:

    https://user-images.githubusercontent.com/692303/74386041-5934f780-4e38-11ea-8162-26e524e11c65.png
    https://user-images.githubusercontent.com/692303/74386040-589c6100-4e38-11ea-8a64-899788eaf55f.png
    https://user-images.githubusercontent.com/692303/74386039-5803ca80-4e38-11ea-949f-049208d2203d.png
    https://user-images.githubusercontent.com/692303/74386036-563a0700-4e38-11ea-9ec3-87a8f6771cf0.png

    ### Why are the changes needed?

    For better document coverage and usability.

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    N/A

    Closes #27489 from maropu/SPARK-30703.
Authored-by: Takeshi Yamamuro
Signed-off-by: Gengliang Wang
(cherry picked from commit 3c4044ea77fe3b1268b52744cd4f1ae61f17a9a8)
Signed-off-by: Gengliang Wang
---
 docs/_data/menu-sql.yaml                           |  11 +-
 ...{sql-keywords.md => sql-ref-ansi-compliance.md} | 125 -
 docs/sql-ref-arithmetic-ops.md                     |  22
 3 files changed, 132 insertions(+), 26 deletions(-)
 rename docs/{sql-keywords.md => sql-ref-ansi-compliance.md} (82%)
 delete mode 100644 docs/sql-ref-arithmetic-ops.md

diff --git a/docs/_data/menu-sql.yaml b/docs/_data/menu-sql.yaml
index 241ec39..1e343f6 100644
--- a/docs/_data/menu-sql.yaml
+++ b/docs/_data/menu-sql.yaml
@@ -80,6 +80,15 @@
     url: sql-ref-null-semantics.html
 - text: NaN Semantics
   url: sql-ref-nan-semantics.html
+- text: ANSI Compliance
+  url: sql-ref-ansi-compliance.html
+  subitems:
+    - text: Arithmetic Operations
+      url: sql-ref-ansi-compliance.html#arithmetic-operations
+    - text: Type Conversion
+      url: sql-ref-ansi-compliance.html#type-conversion
+    - text: SQL Keywords
+      url: sql-ref-ansi-compliance.html#sql-keywords
 - text: SQL Syntax
   url: sql-ref-syntax.html
   subitems:
@@ -214,5 +223,3 @@
     url: sql-ref-syntax-aux-resource-mgmt-list-file.html
 - text: LIST JAR
   url: sql-ref-syntax-aux-resource-mgmt-list-jar.html
-- text: Arithmetic operations
-  url: sql-ref-arithmetic-ops.html
diff --git a/docs/sql-keywords.md b/docs/sql-ref-ansi-compliance.md
similarity index 82%
rename from docs/sql-keywords.md
rename to docs/sql-ref-ansi-compliance.md
index 9e4a3c5..d023835 100644
--- a/docs/sql-keywords.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -1,7 +1,7 @@
 ---
 layout: global
-title: Spark SQL Keywords
-displayTitle: Spark SQL Keywords
+title: ANSI Compliance
+displayTitle: ANSI Compliance
 license: |
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
@@ -19,6 +19,127 @@ license: |
   limitations under the License.
 ---

+Spark SQL has two options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (see the table below for details).
+When `spark.sql.ansi.enabled` is set to `true`, Spark SQL follows the standard in basic behaviours (e.g., arithmetic operations, type conversion, and SQL parsing).
+Moreover, Spark SQL has an independent option to control implicit casting behaviours when inserting rows in a table.
+The casting behaviours are defined as store assignment rules in the standard.
+When `spark.sql.storeAssignmentPolicy` is set to `ANSI`, Spark SQL complies with the ANSI store assignment rules.
+
+Property Name / Default / Meaning:
+
+spark.sql.ansi.enabled (default: false)
+  When true, Spark tries to conform to the ANSI SQL specification:
+  1. Spark will throw a runtime exception if an overflow occurs in any operation on integral/decimal field.
+  2. Spark will forbid using the reserved keywords of ANSI SQL as identifiers in the SQL parser.
+
+spark.sql.storeAssignmentPolicy (default: ANSI)
+  When inserting a value into a column with different data type, Spark will perform type coercion.
+  Currently, we support 3 policies for the type coercion rules: ANSI, legacy and strict. With ANSI policy,
+  Spark performs the type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL.
+  It disallows certain unreasonable type conversions such as converting string to int or double to boolean.
+  With legacy policy, Spark allows the type c
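The overflow rule above is the most visible difference between the two modes: with `spark.sql.ansi.enabled=true`, integral overflow raises a runtime error instead of silently wrapping the way Java 32-bit arithmetic does. A small Python sketch contrasting the two behaviours for int addition (an illustration of the semantics, not Spark's code):

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def add_int(a, b, ansi_enabled):
    """Add two 32-bit ints under either ANSI (fail on overflow) or legacy (wrap) mode."""
    result = a + b
    if INT_MIN <= result <= INT_MAX:
        return result
    if ansi_enabled:
        raise ArithmeticError("integer overflow")  # ANSI mode: fail loudly
    # Legacy mode: wrap around modulo 2^32, like Java's int addition.
    return (result - INT_MIN) % 2**32 + INT_MIN

print(add_int(INT_MAX, 1, ansi_enabled=False))
# -2147483648
```

The same addition with `ansi_enabled=True` raises instead of returning the wrapped value.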
[spark] branch master updated: [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 78721fd  [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach
78721fd is described below

commit 78721fd8c5cc2b09807c7a3196244cac60bcfbeb
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Wed Mar 4 10:05:26 2020 -0800

    [SPARK-31014][CORE] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach

    ### What changes were proposed in this pull request?

    This patch addresses a spot missed in SPARK-30964 (#27716). SPARK-30964 added a secondary index that defines the relationship between parent and children, making it faster to operate on all children of a given parent. While SPARK-30964 handled the addition and deletion of the secondary index in InstanceList properly, it missed adding code to handle deletion of the secondary index in CountingRemoveIfForEach, resulting in leaked indices. This patch adds the deletion of the secondary index in CountingRemoveIfForEach.

    ### Why are the changes needed?

    Described above.

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    N/A, as the relevant field and class are marked as private, so the fix cannot be checked at a higher level. I'm not sure we want to adjust the scope to add a test.

    Closes #27765 from HeartSaVioR/SPARK-31014.
Authored-by: Jungtaek Lim (HeartSaVioR)
Signed-off-by: Gengliang Wang
---
 .../apache/spark/util/kvstore/InMemoryStore.java   | 30 +++---
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java b/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java
index f1bebb4..e929c6c 100644
--- a/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java
+++ b/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java
@@ -177,7 +177,7 @@ public class InMemoryStore implements KVStore {
    * iterators. https://bugs.openjdk.java.net/browse/JDK-8078645
    */
   private static class CountingRemoveIfForEach<T> implements BiConsumer<Comparable<Object>, T> {
-    private final ConcurrentMap<Comparable<Object>, T> data;
+    private final InstanceList<T> instanceList;
     private final Predicate<? super T> filter;

     /**
@@ -189,17 +189,15 @@ public class InMemoryStore implements KVStore {
      */
     private int count = 0;

-    CountingRemoveIfForEach(
-        ConcurrentMap<Comparable<Object>, T> data,
-        Predicate<? super T> filter) {
-      this.data = data;
+    CountingRemoveIfForEach(InstanceList<T> instanceList, Predicate<? super T> filter) {
+      this.instanceList = instanceList;
       this.filter = filter;
     }

     @Override
     public void accept(Comparable<Object> key, T value) {
       if (filter.test(value)) {
-        if (data.remove(key, value)) {
+        if (instanceList.delete(key, value)) {
           count++;
         }
       }
@@ -253,7 +251,7 @@ public class InMemoryStore implements KVStore {
         return count;
       } else {
         Predicate<? super T> filter = getPredicate(ti.getAccessor(index), indexValues);
-        CountingRemoveIfForEach<T> callback = new CountingRemoveIfForEach<>(data, filter);
+        CountingRemoveIfForEach<T> callback = new CountingRemoveIfForEach<>(this, filter);

         // Go through all the values in `data` and delete objects that meets the predicate `filter`.
         // This can be slow when there is a large number of entries in `data`.
@@ -278,7 +276,22 @@ public class InMemoryStore implements KVStore {
     public boolean delete(Object key) {
       boolean entryExists = data.remove(asKey(key)) != null;
-      if (entryExists && hasNaturalParentIndex) {
+      if (entryExists) {
+        deleteParentIndex(key);
+      }
+      return entryExists;
+    }
+
+    public boolean delete(Object key, T value) {
+      boolean entryExists = data.remove(asKey(key), value);
+      if (entryExists) {
+        deleteParentIndex(key);
+      }
+      return entryExists;
+    }
+
+    private void deleteParentIndex(Object key) {
+      if (hasNaturalParentIndex) {
         for (NaturalKeys v : parentToChildrenMap.values()) {
           if (v.remove(asKey(key))) {
             // `v` can be empty after removing the natural key and we can remove it from
           }
         }
       }
-      return entryExists;
     }

     public int size() {
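The essence of the fix is that every delete path must maintain both the primary map and the parent-to-children secondary index; deleting from only one leaks index entries. A Python sketch of that invariant (hypothetical class and field names, not the Java code above):

```python
class IndexedStore:
    """Toy store with a primary map plus a parent -> children secondary index."""

    def __init__(self):
        self.data = {}      # key -> (parent, value)
        self.children = {}  # parent -> set of child keys (the secondary index)

    def put(self, key, parent, value):
        self.data[key] = (parent, value)
        self.children.setdefault(parent, set()).add(key)

    def delete(self, key):
        # The fix: a single delete path that always updates BOTH structures.
        parent, _ = self.data.pop(key)
        kids = self.children[parent]
        kids.discard(key)
        if not kids:
            # Drop empty index buckets so they don't accumulate (leak).
            del self.children[parent]

store = IndexedStore()
store.put("stage-1-task-1", "stage-1", "ok")
store.delete("stage-1-task-1")
print(store.children)
# {}
```

Before the fix, the bulk-removal callback deleted from `data` only, so `children` kept growing even as entries disappeared; routing all deletes through one method restores the invariant.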
[spark] branch master updated (3995728 -> 314442a)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 3995728 [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655 add 314442a [SQL][MINOR][TESTS] Remove GivenWhenThen trait from HiveComparisonTest No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/hive/execution/HiveComparisonTest.scala | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-30906][SQL][TESTS][FOLLOW-UP] Set the configuration against TestHive explicitly in HiveSerDeSuite
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 95df63c [SPARK-30906][SQL][TESTS][FOLLOW-UP] Set the configuration against TestHive explicitly in HiveSerDeSuite 95df63c is described below commit 95df63c45fd9c8159ddee00014023eda10df93dc Author: HyukjinKwon AuthorDate: Wed Feb 26 18:01:26 2020 -0800 [SPARK-30906][SQL][TESTS][FOLLOW-UP] Set the configuration against TestHive explicitly in HiveSerDeSuite ### What changes were proposed in this pull request? After https://github.com/apache/spark/pull/27659 (see https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7-hive-2.3/253/), the tests below fail consistently, specifically in one job https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7-hive-2.3/ in Jenkins ``` org.apache.spark.sql.hive.execution.HiveSerDeSuite.Test the default fileformat for Hive-serde tables ``` The profile is same as PR builder but seems it fails specifically in this machine. Several configurations used in `HiveSerDeSuite` are not being set presumably due to the inconsistency between `SQLConf.get` and the active Spark session described in the https://github.com/apache/spark/pull/27387, and as a side effect of the cloned session at https://github.com/apache/spark/pull/27659. This PR proposes to explicitly set the configuration against `TestHive` by using `withExistingConf` at `withSQLConf` ### Why are the changes needed? To make `spark-master-test-sbt-hadoop-2.7-hive-2.3` job pass. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Cannot reproduce in my local. Presumably it cannot be reproduced in the PR builder. We should see if the tests pass at `spark-master-test-sbt-hadoop-2.7-hive-2.3` job after this PR is merged. Closes #27705 from HyukjinKwon/SPARK-30906. 
Authored-by: HyukjinKwon Signed-off-by: Gengliang Wang (cherry picked from commit cd3ef2249fc6bd60e4ba62ee046dcf4115864206) Signed-off-by: Gengliang Wang --- .../scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala| 4 1 file changed, 4 insertions(+) diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala index 9a1190a..d2d3502 100644 --- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala +++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala @@ -82,6 +82,10 @@ class HiveSerDeSuite extends HiveComparisonTest with PlanTest with BeforeAndAfte }.head } + // Make sure we set the config values to TestHive.conf. + override def withSQLConf(pairs: (String, String)*)(f: => Unit): Unit = +SQLConf.withExistingConf(TestHive.conf)(super.withSQLConf(pairs: _*)(f)) + test("Test the default fileformat for Hive-serde tables") { withSQLConf("hive.default.fileformat" -> "orc", SQLConf.LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT_ENABLED.key -> "true") { - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
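The fix pins the test helper to `TestHive.conf` rather than whatever conf `SQLConf.get` resolves to, so the overrides reach the configuration the code under test actually reads. The underlying scoping pattern — apply key/value overrides to one specific conf object, run the body, then restore prior values in a `finally` — can be sketched generically (a hedged illustration; `withConf` and the plain `Map` stand in for Spark's `SQLConf`, they are not Spark API):

```java
import java.util.HashMap;
import java.util.Map;

public class WithConfDemo {
    // Temporarily apply overrides to a specific conf map, run the body,
    // then restore the previous values -- even if the body throws.
    static void withConf(Map<String, String> conf,
                         Map<String, String> overrides,
                         Runnable body) {
        Map<String, String> saved = new HashMap<>();
        overrides.forEach((k, v) -> {
            saved.put(k, conf.get(k));  // value may be null if the key was unset
            conf.put(k, v);
        });
        try {
            body.run();
        } finally {
            saved.forEach((k, old) -> {
                if (old == null) {
                    conf.remove(k);
                } else {
                    conf.put(k, old);
                }
            });
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.default.fileformat", "textfile");
        withConf(conf, Map.of("hive.default.fileformat", "orc"), () ->
            System.out.println(conf.get("hive.default.fileformat")));  // prints orc
        System.out.println(conf.get("hive.default.fileformat"));       // prints textfile
    }
}
```

The bug class the PR addresses is precisely a mismatch between the map being mutated and the map being read: if the helper writes to one conf while the session under test reads another, the overrides silently never take effect.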
[spark] branch master updated (825d3dc -> cd3ef22)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 825d3dc [SPARK-30841][SQL][DOC] Add version information to the configuration of SQL add cd3ef22 [SPARK-30906][SQL][TESTS][FOLLOW-UP] Set the configuration against TestHive explicitly in HiveSerDeSuite No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala| 4 1 file changed, 4 insertions(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (80a8947 -> 515eb9d)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 80a8947 [SPARK-31052][TEST][CORE] Fix flaky test "DAGSchedulerSuite.shuffle fetch failed on speculative task, but original task succeed" add 515eb9d [SPARK-31013][CORE][WEBUI] InMemoryStore: improve removeAllByIndexValues over natural key index No new revisions were added by this update. Summary of changes: .../org/apache/spark/util/kvstore/InMemoryStore.java | 16 +++- .../apache/spark/util/kvstore/InMemoryStoreSuite.java | 18 -- 2 files changed, 23 insertions(+), 11 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SQL][DOCS][MINOR] Fix typos and wrong phrases in docs
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new ed7924a [SQL][DOCS][MINOR] Fix typos and wrong phrases in docs ed7924a is described below commit ed7924a86e2c4dd6b38075de68038e1997b2b30e Author: Takeshi Yamamuro AuthorDate: Thu Mar 5 16:54:59 2020 -0800 [SQL][DOCS][MINOR] Fix typos and wrong phrases in docs ### What changes were proposed in this pull request? This PR intends to fix typos and phrases in the `/docs` directory. To find them, I run the Intellij typo checker. ### Why are the changes needed? For better documents. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? N/A Closes #27819 from maropu/TypoFix-20200306. Authored-by: Takeshi Yamamuro Signed-off-by: Gengliang Wang (cherry picked from commit ffec7a1964984a7f911ac0da125134c098f273ba) Signed-off-by: Gengliang Wang --- docs/sql-pyspark-pandas-with-arrow.md| 2 +- docs/sql-ref-ansi-compliance.md | 2 +- docs/sql-ref-null-semantics.md | 2 +- docs/sql-ref-syntax-aux-show-functions.md| 2 +- docs/sql-ref-syntax-ddl-alter-table.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- docs/sql-ref-syntax-qry-select-limit.md | 2 +- 7 files changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/sql-pyspark-pandas-with-arrow.md b/docs/sql-pyspark-pandas-with-arrow.md index 63ba0ba..e8abb9f 100644 --- a/docs/sql-pyspark-pandas-with-arrow.md +++ b/docs/sql-pyspark-pandas-with-arrow.md @@ -91,7 +91,7 @@ specify the type hints of `pandas.Series` and `pandas.DataFrame` as below: -In the following sections, it describes the cominations of the supported type hints. For simplicity, +In the following sections, it describes the combinations of the supported type hints. For simplicity, `pandas.DataFrame` variant is omitted. 
### Series to Series diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md index 267184a..27e60b4 100644 --- a/docs/sql-ref-ansi-compliance.md +++ b/docs/sql-ref-ansi-compliance.md @@ -60,7 +60,7 @@ The following subsections present behaviour changes in arithmetic operations, ty ### Arithmetic Operations In Spark SQL, arithmetic operations performed on numeric types (with the exception of decimal) are not checked for overflows by default. -This means that in case an operation causes overflows, the result is the same that the same operation returns in a Java/Scala program (e.g., if the sum of 2 integers is higher than the maximum value representable, the result is a negative number). +This means that in case an operation causes overflows, the result is the same with the corresponding operation in a Java/Scala program (e.g., if the sum of 2 integers is higher than the maximum value representable, the result is a negative number). On the other hand, Spark SQL returns null for decimal overflows. When `spark.sql.ansi.enabled` is set to `true` and an overflow occurs in numeric and interval arithmetic operations, it throws an arithmetic exception at runtime. diff --git a/docs/sql-ref-null-semantics.md b/docs/sql-ref-null-semantics.md index 3cbc15c..37b4081 100644 --- a/docs/sql-ref-null-semantics.md +++ b/docs/sql-ref-null-semantics.md @@ -605,7 +605,7 @@ SELECT name, age FROM unknown_age; In Spark, EXISTS and NOT EXISTS expressions are allowed inside a WHERE clause. These are boolean expressions which return either `TRUE` or `FALSE`. In other words, EXISTS is a membership condition and returns `TRUE` -when the subquery it refers to returns one or more rows. Similary, NOT EXISTS +when the subquery it refers to returns one or more rows. Similarly, NOT EXISTS is a non-membership condition and returns TRUE when no rows or zero rows are returned from the subquery. 
diff --git a/docs/sql-ref-syntax-aux-show-functions.md b/docs/sql-ref-syntax-aux-show-functions.md index 701d427..d6f9df9 100644 --- a/docs/sql-ref-syntax-aux-show-functions.md +++ b/docs/sql-ref-syntax-aux-show-functions.md @@ -22,7 +22,7 @@ license: | ### Description Returns the list of functions after applying an optional regex pattern. Given number of functions supported by Spark is quite large, this statement -in conjuction with [describe function](sql-ref-syntax-aux-describe-function.html) +in conjunction with [describe function](sql-ref-syntax-aux-describe-function.html) may be used to quickly find the function and understand its usage. The `LIKE` clause is optional and supported only for compatibility with other systems. diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index a921478..373fa8d 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b
[spark] branch master updated: [SQL][DOCS][MINOR] Fix typos and wrong phrases in docs
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ffec7a1 [SQL][DOCS][MINOR] Fix typos and wrong phrases in docs ffec7a1 is described below commit ffec7a1964984a7f911ac0da125134c098f273ba Author: Takeshi Yamamuro AuthorDate: Thu Mar 5 16:54:59 2020 -0800 [SQL][DOCS][MINOR] Fix typos and wrong phrases in docs ### What changes were proposed in this pull request? This PR intends to fix typos and phrases in the `/docs` directory. To find them, I run the Intellij typo checker. ### Why are the changes needed? For better documents. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? N/A Closes #27819 from maropu/TypoFix-20200306. Authored-by: Takeshi Yamamuro Signed-off-by: Gengliang Wang --- docs/sql-pyspark-pandas-with-arrow.md| 2 +- docs/sql-ref-ansi-compliance.md | 2 +- docs/sql-ref-null-semantics.md | 2 +- docs/sql-ref-syntax-aux-show-functions.md| 2 +- docs/sql-ref-syntax-ddl-alter-table.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- docs/sql-ref-syntax-qry-select-limit.md | 2 +- 7 files changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/sql-pyspark-pandas-with-arrow.md b/docs/sql-pyspark-pandas-with-arrow.md index 63ba0ba..e8abb9f 100644 --- a/docs/sql-pyspark-pandas-with-arrow.md +++ b/docs/sql-pyspark-pandas-with-arrow.md @@ -91,7 +91,7 @@ specify the type hints of `pandas.Series` and `pandas.DataFrame` as below: -In the following sections, it describes the cominations of the supported type hints. For simplicity, +In the following sections, it describes the combinations of the supported type hints. For simplicity, `pandas.DataFrame` variant is omitted. 
### Series to Series diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md index 267184a..27e60b4 100644 --- a/docs/sql-ref-ansi-compliance.md +++ b/docs/sql-ref-ansi-compliance.md @@ -60,7 +60,7 @@ The following subsections present behaviour changes in arithmetic operations, ty ### Arithmetic Operations In Spark SQL, arithmetic operations performed on numeric types (with the exception of decimal) are not checked for overflows by default. -This means that in case an operation causes overflows, the result is the same that the same operation returns in a Java/Scala program (e.g., if the sum of 2 integers is higher than the maximum value representable, the result is a negative number). +This means that in case an operation causes overflows, the result is the same with the corresponding operation in a Java/Scala program (e.g., if the sum of 2 integers is higher than the maximum value representable, the result is a negative number). On the other hand, Spark SQL returns null for decimal overflows. When `spark.sql.ansi.enabled` is set to `true` and an overflow occurs in numeric and interval arithmetic operations, it throws an arithmetic exception at runtime. diff --git a/docs/sql-ref-null-semantics.md b/docs/sql-ref-null-semantics.md index 3cbc15c..37b4081 100644 --- a/docs/sql-ref-null-semantics.md +++ b/docs/sql-ref-null-semantics.md @@ -605,7 +605,7 @@ SELECT name, age FROM unknown_age; In Spark, EXISTS and NOT EXISTS expressions are allowed inside a WHERE clause. These are boolean expressions which return either `TRUE` or `FALSE`. In other words, EXISTS is a membership condition and returns `TRUE` -when the subquery it refers to returns one or more rows. Similary, NOT EXISTS +when the subquery it refers to returns one or more rows. Similarly, NOT EXISTS is a non-membership condition and returns TRUE when no rows or zero rows are returned from the subquery. 
diff --git a/docs/sql-ref-syntax-aux-show-functions.md b/docs/sql-ref-syntax-aux-show-functions.md index 701d427..d6f9df9 100644 --- a/docs/sql-ref-syntax-aux-show-functions.md +++ b/docs/sql-ref-syntax-aux-show-functions.md @@ -22,7 +22,7 @@ license: | ### Description Returns the list of functions after applying an optional regex pattern. Given number of functions supported by Spark is quite large, this statement -in conjuction with [describe function](sql-ref-syntax-aux-describe-function.html) +in conjunction with [describe function](sql-ref-syntax-aux-describe-function.html) may be used to quickly find the function and understand its usage. The `LIKE` clause is optional and supported only for compatibility with other systems. diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index a921478..373fa8d 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -260,7 +260,7 @@ ALTER TABLE dbx.tab1 PARTITION (a='1', b='2') SET LOCATION
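The reworded sentence in `sql-ref-ansi-compliance.md` describes Java/Scala wrap-around semantics: with `spark.sql.ansi.enabled` left at its default, an overflowing integer sum behaves like plain JVM `int` arithmetic. A minimal illustration of the behavior the doc refers to:

```java
public class OverflowDemo {
    public static void main(String[] args) {
        // Plain Java int arithmetic wraps around silently on overflow;
        // the doc change notes that non-ANSI Spark SQL matches this,
        // producing a negative number rather than raising an error.
        int sum = Integer.MAX_VALUE + 1;
        System.out.println(sum);  // prints -2147483648
    }
}
```

With `spark.sql.ansi.enabled=true`, the same overflow in Spark SQL instead throws an arithmetic exception at runtime, as the surrounding doc text states.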
[spark] branch master updated (a372f76 -> 621e37e)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a372f76 [SPARK-30898][SQL] The behavior of MakeDecimal should not depend on SQLConf.get add 621e37e [SPARK-28880][SQL] Support ANSI nested bracketed comments No new revisions were added by this update. Summary of changes: .../apache/spark/sql/catalyst/parser/SqlBase.g4| 22 +++- .../sql/catalyst/parser/PlanParserSuite.scala | 98 + .../test/resources/sql-tests/inputs/comments.sql | 90 +++ .../sql-tests/inputs/postgreSQL/comments.sql | 5 +- .../resources/sql-tests/results/comments.sql.out | 121 + .../sql-tests/results/postgreSQL/comments.sql.out | 47 +--- 6 files changed, 331 insertions(+), 52 deletions(-) create mode 100644 sql/core/src/test/resources/sql-tests/inputs/comments.sql create mode 100644 sql/core/src/test/resources/sql-tests/results/comments.sql.out - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-30684 ][WEBUI][FollowUp] A new approach for SPARK-30684
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 00c761d [SPARK-30684 ][WEBUI][FollowUp] A new approach for SPARK-30684 00c761d is described below commit 00c761d15e2628c8c5e1a9dad5ae122f292eae28 Author: Gengliang Wang AuthorDate: Sun Feb 9 14:18:51 2020 -0800 [SPARK-30684 ][WEBUI][FollowUp] A new approach for SPARK-30684 ### What changes were proposed in this pull request? Simplify the changes for adding metrics description for WholeStageCodegen in https://github.com/apache/spark/pull/27405 ### Why are the changes needed? In https://github.com/apache/spark/pull/27405, the UI changes can be made without using the function `adjustPositionOfOperationName` to adjust the position of operation name and mark as an operation-name class. I suggest we make simpler changes so that it would be easier for future development. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Manual test with the queries provided in https://github.com/apache/spark/pull/27405 ``` sc.parallelize(1 to 10).toDF.sort("value").filter("value > 1").selectExpr("value * 2").show sc.parallelize(1 to 10).toDF.sort("value").filter("value > 1").selectExpr("value * 2").write.format("json").mode("overwrite").save("/tmp/test_output") sc.parallelize(1 to 10).toDF.write.format("json").mode("append").save("/tmp/test_output") ``` ![image](https://user-images.githubusercontent.com/1097932/74073629-e3f09f00-49bf-11ea-90dc-1edb5ca29e5e.png) Closes #27490 from gengliangwang/wholeCodegenUI. 
Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang (cherry picked from commit b877aac14657832d1b896ea57e06b0d0fd15ee01) Signed-off-by: Gengliang Wang --- .../sql/execution/ui/static/spark-sql-viz.css | 8 ++ .../spark/sql/execution/ui/static/spark-sql-viz.js | 31 +- .../spark/sql/execution/ui/SparkPlanGraph.scala| 18 ++--- 3 files changed, 11 insertions(+), 46 deletions(-) diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css index eff0142..2018838 100644 --- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css +++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css @@ -18,6 +18,7 @@ #plan-viz-graph .label { font-weight: normal; text-shadow: none; + color: black; } #plan-viz-graph svg g.cluster rect { @@ -32,13 +33,8 @@ stroke-width: 1px; } -/* This declaration is needed to define the width of rectangles */ -#plan-viz-graph svg text :first-child { - font-weight: bold; -} - /* Highlight the SparkPlan node name */ -#plan-viz-graph svg text .operator-name { +#plan-viz-graph svg text :first-child { font-weight: bold; } diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js index e6ce641..c834914 100644 --- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js +++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js @@ -34,7 +34,6 @@ function renderPlanViz() { preprocessGraphLayout(g); var renderer = new dagreD3.render(); renderer(graph, g); - adjustPositionOfOperationName(); // Round corners on rectangles svg @@ -82,7 +81,7 @@ function setupTooltipForSparkPlanNode(nodeId) { * and sizes of graph elements, e.g. padding, font style, shape. 
*/ function preprocessGraphLayout(g) { - g.graph().ranksep = "90"; + g.graph().ranksep = "70"; var nodes = g.nodes(); for (var i = 0; i < nodes.length; i++) { var node = g.node(nodes[i]); @@ -129,34 +128,6 @@ function resizeSvg(svg) { .attr("height", height); } - -/* Helper function to adjust the position of operation name and mark as a operation-name class. */ -function adjustPositionOfOperationName() { - $("#plan-viz-graph svg text") - .each(function() { -var tspans = $(this).find("tspan"); - -if (tspans[0].textContent.trim() !== "") { - var isOperationNameOnly = - $(tspans).filter(function(i, n) { -return i !== 0 && n.textContent.trim() !== ""; - }).length === 0; - - if (isOperationNameOnly) { -// If the only te
[spark] branch master updated (339c0f9 -> b877aac)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 339c0f9 [SPARK-30510][SQL][DOCS] Publicly document Spark SQL configuration options add b877aac [SPARK-30684 ][WEBUI][FollowUp] A new approach for SPARK-30684 No new revisions were added by this update. Summary of changes: .../sql/execution/ui/static/spark-sql-viz.css | 8 ++ .../spark/sql/execution/ui/static/spark-sql-viz.js | 31 +- .../spark/sql/execution/ui/SparkPlanGraph.scala| 18 ++--- 3 files changed, 11 insertions(+), 46 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-30881][SQL][DOCS] Revise the doc of spark.sql.sources.parallelPartitionDiscovery.threshold
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new b38b237 [SPARK-30881][SQL][DOCS] Revise the doc of spark.sql.sources.parallelPartitionDiscovery.threshold b38b237 is described below commit b38b237ac22c08ce6aadb041b2bf1b78f5f5db75 Author: Gengliang Wang AuthorDate: Thu Feb 20 00:59:22 2020 -0800 [SPARK-30881][SQL][DOCS] Revise the doc of spark.sql.sources.parallelPartitionDiscovery.threshold ### What changes were proposed in this pull request? Revise the doc of SQL configuration `spark.sql.sources.parallelPartitionDiscovery.threshold`. ### Why are the changes needed? The doc of configuration "spark.sql.sources.parallelPartitionDiscovery.threshold" is not accurate on the part "This applies to Parquet, ORC, CSV, JSON and LibSVM data sources". We should revise it as effective on all the file-based data sources. ### Does this PR introduce any user-facing change? No ### How was this patch tested? None. It's just doc. Closes #27639 from gengliangwang/reviseParallelPartitionDiscovery. 
Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang (cherry picked from commit 92d5d40c8efffd90eb04308b6dad77d3d1e1be14) Signed-off-by: Gengliang Wang --- .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala| 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala index 612fa86..9cbaaee 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala @@ -878,8 +878,8 @@ object SQLConf { buildConf("spark.sql.sources.parallelPartitionDiscovery.threshold") .doc("The maximum number of paths allowed for listing files at driver side. If the number " + "of detected paths exceeds this value during partition discovery, it tries to list the " + -"files with another Spark distributed job. This applies to Parquet, ORC, CSV, JSON and " + -"LibSVM data sources.") +"files with another Spark distributed job. This configuration is effective only when " + +"using file-based sources such as Parquet, JSON and ORC.") .intConf .checkValue(parallel => parallel >= 0, "The maximum number of paths allowed for listing " + "files at driver side must not be negative") - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (7c4ad63 -> 92d5d40)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 7c4ad63 [SPARK-29148][CORE][FOLLOW-UP] Don't dynamic allocation warning when it's disabled add 92d5d40 [SPARK-30881][SQL][DOCS] Revise the doc of spark.sql.sources.parallelPartitionDiscovery.threshold No new revisions were added by this update. Summary of changes: .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala| 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (94284c8 -> 2d59ca4)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 94284c8 [SPARK-30587][SQL][TESTS] Add test suites for CSV and JSON v1 add 2d59ca4 [SPARK-30475][SQL] File source V2: Push data filters for file listing No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/v2/avro/AvroScan.scala| 8 +++-- .../org/apache/spark/sql/avro/AvroSuite.scala | 29 + .../datasources/PruneFileSourcePartitions.scala| 22 +++-- .../sql/execution/datasources/v2/FileScan.scala| 16 +++--- .../sql/execution/datasources/v2/csv/CSVScan.scala | 8 +++-- .../execution/datasources/v2/json/JsonScan.scala | 8 +++-- .../sql/execution/datasources/v2/orc/OrcScan.scala | 8 +++-- .../datasources/v2/parquet/ParquetScan.scala | 8 +++-- .../execution/datasources/v2/text/TextScan.scala | 8 +++-- .../spark/sql/FileBasedDataSourceSuite.scala | 36 ++ 10 files changed, 120 insertions(+), 31 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (f8d5957 -> c0e9f9f)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f8d5957 [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider add c0e9f9f [SPARK-30459][SQL] Fix ignoreMissingFiles/ignoreCorruptFiles in data source v2 No new revisions were added by this update. Summary of changes: .../datasources/v2/FilePartitionReader.scala | 9 ++ .../spark/sql/FileBasedDataSourceSuite.scala | 37 ++ 2 files changed, 27 insertions(+), 19 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (5137346 -> 4f9d3dc)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 5137346 [SPARK-30412][SQL][TESTS] Eliminate warnings in Java tests regarding to deprecated Spark SQL API add 4f9d3dc [SPARK-30384][WEBUI] Needs to improve the Column name and Add tooltips for the Fair Scheduler Pool Table No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/ui/jobs/PoolTable.scala | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (d32ed25 -> e645125)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d32ed25 [SPARK-30144][ML][PYSPARK] Make MultilayerPerceptronClassificationModel extend MultilayerPerceptronParams add e645125 [SPARK-30267][SQL] Avro arrays can be of any List No new revisions were added by this update. Summary of changes: .../apache/spark/sql/avro/AvroDeserializer.scala | 5 +- .../sql/avro/AvroCatalystDataConversionSuite.scala | 67 ++ 2 files changed, 69 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (1160457 -> 0d589f4)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 1160457 [SPARK-30429][SQL] Optimize catalogString and usage in ValidateExternalType.errMsg to avoid OOM add 0d589f4 [SPARK-30267][SQL][FOLLOWUP] Use while loop in Avro Array Deserializer No new revisions were added by this update. Summary of changes: .../main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (1c8526d -> 0f46325)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 1c8526d [SPARK-28093][FOLLOW-UP] Remove migration guide of TRIM changes add 0f46325 [SPARK-31128][WEBUI] Fix Uncaught TypeError in streaming statistics page No new revisions were added by this update. Summary of changes: .../org/apache/spark/ui/static/streaming-page.js | 39 -- 1 file changed, 21 insertions(+), 18 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31128][WEBUI] Fix Uncaught TypeError in streaming statistics page
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 61ede3a [SPARK-31128][WEBUI] Fix Uncaught TypeError in streaming statistics page 61ede3a is described below commit 61ede3ac7cb799fadc2549fa61b5375bc7872e7f Author: Gengliang Wang AuthorDate: Thu Mar 12 20:01:17 2020 -0700 [SPARK-31128][WEBUI] Fix Uncaught TypeError in streaming statistics page ### What changes were proposed in this pull request? There is a minor issue in https://github.com/apache/spark/pull/26201 In the streaming statistics page, the following error appears in the console after clicking the timeline graph: ``` streaming-page.js:211 Uncaught TypeError: Cannot read property 'top' of undefined at SVGCircleElement. (streaming-page.js:211) at SVGCircleElement.__onclick (d3.min.js:1) ``` ![image](https://user-images.githubusercontent.com/1097932/76479745-14b26280-63ca-11ea-9079-0065321795f9.png) This PR fixes it. ### Why are the changes needed? It fixes a JavaScript execution error. ### Does this PR introduce any user-facing change? No; the error only shows up in the console. ### How was this patch tested? Manual test. Closes #27883 from gengliangwang/fixSelector. 
Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang (cherry picked from commit 0f463258c214c67360f2b46a4e2292d992b3bf23) Signed-off-by: Gengliang Wang --- .../org/apache/spark/ui/static/streaming-page.js | 39 -- 1 file changed, 21 insertions(+), 18 deletions(-) diff --git a/core/src/main/resources/org/apache/spark/ui/static/streaming-page.js b/core/src/main/resources/org/apache/spark/ui/static/streaming-page.js index ed3e65c3..528b9df 100644 --- a/core/src/main/resources/org/apache/spark/ui/static/streaming-page.js +++ b/core/src/main/resources/org/apache/spark/ui/static/streaming-page.js @@ -190,29 +190,32 @@ function drawTimeline(id, data, minX, maxX, minY, maxY, unitY, batchInterval) { .attr("r", function(d) { return isFailedBatch(d.x) ? "2" : "3";}); }) .on("click", function(d) { -if (lastTimeout != null) { -window.clearTimeout(lastTimeout); -} -if (lastClickedBatch != null) { -clearBatchRow(lastClickedBatch); -lastClickedBatch = null; -} -lastClickedBatch = d.x; -highlightBatchRow(lastClickedBatch); -lastTimeout = window.setTimeout(function () { -lastTimeout = null; +var batchSelector = $("#batch-" + d.x); +// If there is a corresponding batch row, scroll down to it and highlight it. 
+if (batchSelector.length > 0) { +if (lastTimeout != null) { +window.clearTimeout(lastTimeout); +} if (lastClickedBatch != null) { clearBatchRow(lastClickedBatch); lastClickedBatch = null; } -}, 3000); // Clean up after 3 seconds - -var batchSelector = $("#batch-" + d.x); -var topOffset = batchSelector.offset().top - 15; -if (topOffset < 0) { -topOffset = 0; +lastClickedBatch = d.x; +highlightBatchRow(lastClickedBatch); +lastTimeout = window.setTimeout(function () { +lastTimeout = null; +if (lastClickedBatch != null) { +clearBatchRow(lastClickedBatch); +lastClickedBatch = null; +} +}, 3000); // Clean up after 3 seconds + +var topOffset = batchSelector.offset().top - 15; +if (topOffset < 0) { +topOffset = 0; +} +$('html,body').animate({scrollTop: topOffset}, 200); } -$('html,body').animate({scrollTop: topOffset}, 200); }); } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
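The essence of the fix above is a guard-before-use pattern: the batch-row selector is resolved once, and the highlight/scroll logic runs only if it actually matched an element. A minimal runnable sketch of that pattern — `scrollOffsetFor` and the plain-object rows are invented for illustration; jQuery and the real DOM are not involved:

```javascript
// Guard-before-use: resolve the lookup once, act only if it matched.
// `rows.find(...)` stands in for jQuery's `$("#batch-" + d.x)` selector;
// `undefined` models an empty selection (jQuery's `length === 0`).
function scrollOffsetFor(rows, id) {
  const row = rows.find(r => r.id === id); // may be undefined
  if (!row) {
    // The pre-fix code read `row.top` unconditionally here and threw
    // "Cannot read property 'top' of undefined".
    return null;
  }
  // Clamp to 0, mirroring the patch's `if (topOffset < 0) topOffset = 0`.
  return Math.max(row.top - 15, 0);
}

const rows = [{ id: 1, top: 10 }, { id: 2, top: 200 }];
console.log(scrollOffsetFor(rows, 2)); // 185
console.log(scrollOffsetFor(rows, 9)); // null (missing row: no crash)
```

The same check is why the patch moves the timeout bookkeeping inside the `if (batchSelector.length > 0)` block: none of it is meaningful when no batch row exists.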
[spark] branch branch-2.4 updated (1c65b1d -> 775e958)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 1c65b1d [SPARK-31422][CORE][FOLLOWUP] Fix a test case compilation error add 775e958 [SPARK-31420][WEBUI][2.4] Infinite timeline redraw in job details page No new revisions were added by this update. Summary of changes: .../org/apache/spark/ui/static/timeline-view.css | 12 - .../org/apache/spark/ui/static/timeline-view.js| 29 --- .../spark/ui/static/vis-timeline-graph2d.min.css | 2 + .../ui/static/vis-timeline-graph2d.min.css.map | 1 + .../spark/ui/static/vis-timeline-graph2d.min.js| 60 ++ .../ui/static/vis-timeline-graph2d.min.js.map | 1 + .../org/apache/spark/ui/static/vis.min.css | 1 - .../org/apache/spark/ui/static/vis.min.js | 45 .../main/scala/org/apache/spark/ui/UIUtils.scala | 5 +- .../org/apache/spark/ui/UISeleniumSuite.scala | 2 +- dev/.rat-excludes | 6 ++- licenses-binary/LICENSE-vis-timeline.txt | 23 + licenses-binary/LICENSE-vis.txt| 22 licenses/LICENSE-vis-timeline.txt | 23 + licenses/LICENSE-vis.txt | 22 15 files changed, 140 insertions(+), 114 deletions(-) create mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis-timeline-graph2d.min.css create mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis-timeline-graph2d.min.css.map create mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis-timeline-graph2d.min.js create mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis-timeline-graph2d.min.js.map delete mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis.min.css delete mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis.min.js create mode 100644 licenses-binary/LICENSE-vis-timeline.txt delete mode 100644 licenses-binary/LICENSE-vis.txt create mode 100644 licenses/LICENSE-vis-timeline.txt delete mode 100644 licenses/LICENSE-vis.txt - To unsubscribe, e-mail: 
commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (e415a42 -> 2a27366)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from e415a42 [SPARK-31439][SQL] Fix perf regression of fromJavaDate add 2a27366 [SPARK-31420][WEBUI][3.0] Infinite timeline redraw in job details page No new revisions were added by this update. Summary of changes: .../org/apache/spark/ui/static/timeline-view.css | 12 - .../org/apache/spark/ui/static/timeline-view.js| 29 --- .../spark/ui/static/vis-timeline-graph2d.min.css | 2 + .../ui/static/vis-timeline-graph2d.min.css.map | 1 + .../spark/ui/static/vis-timeline-graph2d.min.js| 60 ++ .../ui/static/vis-timeline-graph2d.min.js.map | 1 + .../org/apache/spark/ui/static/vis.min.css | 1 - .../org/apache/spark/ui/static/vis.min.js | 45 .../main/scala/org/apache/spark/ui/UIUtils.scala | 5 +- .../org/apache/spark/ui/UISeleniumSuite.scala | 2 +- dev/.rat-excludes | 6 ++- licenses-binary/LICENSE-vis-timeline.txt | 23 + licenses-binary/LICENSE-vis.txt| 22 licenses/LICENSE-vis-timeline.txt | 23 + licenses/LICENSE-vis.txt | 22 15 files changed, 140 insertions(+), 114 deletions(-) create mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis-timeline-graph2d.min.css create mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis-timeline-graph2d.min.css.map create mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis-timeline-graph2d.min.js create mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis-timeline-graph2d.min.js.map delete mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis.min.css delete mode 100644 core/src/main/resources/org/apache/spark/ui/static/vis.min.js create mode 100644 licenses-binary/LICENSE-vis-timeline.txt delete mode 100644 licenses-binary/LICENSE-vis.txt create mode 100644 licenses/LICENSE-vis-timeline.txt delete mode 100644 licenses/LICENSE-vis.txt - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For 
additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31411][UI] Show submitted time and duration in job details page
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 28e1a4f [SPARK-31411][UI] Show submitted time and duration in job details page 28e1a4f is described below commit 28e1a4fa93f32bf4b94e25d06c5419f978c6eced Author: Gengliang Wang AuthorDate: Mon Apr 13 17:12:26 2020 -0700 [SPARK-31411][UI] Show submitted time and duration in job details page ### What changes were proposed in this pull request? Show the submitted time and duration of a job on its details page. ### Why are the changes needed? When we check job details from the SQL execution page, it is more convenient to get the submission time and duration directly from the job page, instead of having to find the info on the job list page. ### Does this PR introduce any user-facing change? Yes. After the changes, the job details page shows the submitted time and duration. ### How was this patch tested? Manual check. ![image](https://user-images.githubusercontent.com/1097932/78974997-0a1de280-7ac8-11ea-8072-ce7a001b1b0c.png) Closes #28179 from gengliangwang/addSubmittedTimeAndDuration. 
Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang --- .../org/apache/spark/ui/jobs/AllJobsPage.scala | 11 ++- .../org/apache/spark/ui/jobs/JobDataUtil.scala | 38 ++ .../scala/org/apache/spark/ui/jobs/JobPage.scala | 8 + 3 files changed, 49 insertions(+), 8 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala b/core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala index 3fa4ecc..68c8d82 100644 --- a/core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala +++ b/core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala @@ -442,15 +442,10 @@ private[ui] class JobDataSource( } private def jobRow(jobData: v1.JobData): JobTableRowData = { -val duration: Option[Long] = { - jobData.submissionTime.map { start => -val end = jobData.completionTime.map(_.getTime()).getOrElse(System.currentTimeMillis()) -end - start.getTime() - } -} -val formattedDuration = duration.map(d => UIUtils.formatDuration(d)).getOrElse("Unknown") +val duration: Option[Long] = JobDataUtil.getDuration(jobData) +val formattedDuration = JobDataUtil.getFormattedDuration(jobData) val submissionTime = jobData.submissionTime -val formattedSubmissionTime = submissionTime.map(UIUtils.formatDate).getOrElse("Unknown") +val formattedSubmissionTime = JobDataUtil.getFormattedSubmissionTime(jobData) val (lastStageName, lastStageDescription) = lastStageNameAndDescription(store, jobData) val jobDescription = diff --git a/core/src/main/scala/org/apache/spark/ui/jobs/JobDataUtil.scala b/core/src/main/scala/org/apache/spark/ui/jobs/JobDataUtil.scala new file mode 100644 index 000..f357fec --- /dev/null +++ b/core/src/main/scala/org/apache/spark/ui/jobs/JobDataUtil.scala @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.ui.jobs + +import org.apache.spark.status.api.v1.JobData +import org.apache.spark.ui.UIUtils + +private[ui] object JobDataUtil { + def getDuration(jobData: JobData): Option[Long] = { +jobData.submissionTime.map { start => +val end = jobData.completionTime.map(_.getTime()).getOrElse(System.currentTimeMillis()) +end - start.getTime() +} + } + + def getFormattedDuration(jobData: JobData): String = { +val duration = getDuration(jobData) +duration.map(d => UIUtils.formatDuration(d)).getOrElse("Unknown") + } + + def getFormattedSubmissionTime(jobData: JobData): String = { +jobData.submissionTime.map(UIUtils.formatDate).getOrElse("Unknown") + } +} diff --git a/core/src/main/scala/org/apache/spark/ui/jobs/JobPage.scala b/core/src/main/scala/org/apache/spark/ui/jobs/JobPage.scala index 348baab..eacc6ce 100644 --- a/core/src/main/sc
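The new `JobDataUtil` helper centralizes the duration math shown in the diff above. A simplified, self-contained sketch of that logic, using plain epoch milliseconds instead of `java.util.Date` and taking `now` as a parameter (rather than calling the clock directly) so it stays testable; the `"N ms"` formatting is a stand-in for `UIUtils.formatDuration`:

```javascript
// Mirrors JobDataUtil.getDuration: duration is defined only when the job
// has a submission time; an unfinished job measures up to `now`.
function getDuration(submissionTime, completionTime, now) {
  if (submissionTime == null) return null;           // never submitted
  const end = completionTime != null ? completionTime : now;
  return end - submissionTime;
}

// Mirrors getFormattedDuration's fallback: "Unknown" without a submission time.
function formatDuration(duration) {
  return duration == null ? "Unknown" : `${duration} ms`;
}

console.log(formatDuration(getDuration(1000, 4500, 9999))); // "3500 ms"
console.log(formatDuration(getDuration(1000, null, 6000))); // "5000 ms" (still running)
console.log(formatDuration(getDuration(null, null, 6000))); // "Unknown"
```

Extracting this into one helper is what lets `AllJobsPage` and the new `JobPage` rows share identical formatting instead of duplicating the computation.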
[spark] branch master updated (bbb3cd9 -> 28e1a4f)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from bbb3cd9 [SPARK-31391][SQL][TEST] Add AdaptiveTestUtils to ease the test of AQE add 28e1a4f [SPARK-31411][UI] Show submitted time and duration in job details page No new revisions were added by this update. Summary of changes: .../org/apache/spark/ui/jobs/AllJobsPage.scala | 11 ++- .../org/apache/spark/ui/jobs/JobDataUtil.scala | 38 -- .../scala/org/apache/spark/ui/jobs/JobPage.scala | 8 + 3 files changed, 25 insertions(+), 32 deletions(-) copy common/kvstore/src/test/java/org/apache/spark/util/kvstore/LevelDBIteratorSuite.java => core/src/main/scala/org/apache/spark/ui/jobs/JobDataUtil.scala (54%) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31420][WEBUI] Infinite timeline redraw in job details page
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 04f04e0 [SPARK-31420][WEBUI] Infinite timeline redraw in job details page 04f04e0 is described below commit 04f04e0ea74fd26a7bd39e833752ce277997f4dc Author: Kousuke Saruta AuthorDate: Mon Apr 13 23:23:00 2020 -0700 [SPARK-31420][WEBUI] Infinite timeline redraw in job details page ### What changes were proposed in this pull request? Upgrade vis.js to fix an infinite re-drawing issue. As reported, old releases of vis.js have that issue; fortunately, the latest version resolves it. With the latest release of vis.js there are some performance issues with the original `timeline-view.js` and `timeline-view.css`, so I also changed them. ### Why are the changes needed? For better UX. ### Does this PR introduce any user-facing change? No. Appearance and functionality are unchanged. ### How was this patch tested? I confirmed that infinite redrawing no longer happens on a JobPage on which I had previously reproduced the issue. With the original version of vis.js, I reproduced the issue under the following conditions. * Use the history server and load core/src/test/resources/spark-events. * Visit the JobPage for job2 in application_1553914137147_0018. * Zoom out to 80% on Safari / Chrome / Firefox. Reproduction may depend on the OS and browser version. Closes #28192 from sarutak/upgrade-visjs. 
Authored-by: Kousuke Saruta Signed-off-by: Gengliang Wang --- .../org/apache/spark/ui/static/timeline-view.css | 12 - .../org/apache/spark/ui/static/timeline-view.js| 29 --- .../spark/ui/static/vis-timeline-graph2d.min.css | 2 + .../ui/static/vis-timeline-graph2d.min.css.map | 1 + .../spark/ui/static/vis-timeline-graph2d.min.js| 60 ++ .../ui/static/vis-timeline-graph2d.min.js.map | 1 + .../org/apache/spark/ui/static/vis.min.css | 1 - .../org/apache/spark/ui/static/vis.min.js | 45 .../main/scala/org/apache/spark/ui/UIUtils.scala | 5 +- .../org/apache/spark/ui/UISeleniumSuite.scala | 2 +- dev/.rat-excludes | 6 ++- licenses-binary/LICENSE-vis-timeline.txt | 23 + licenses-binary/LICENSE-vis.txt| 22 licenses/LICENSE-vis-timeline.txt | 23 + licenses/LICENSE-vis.txt | 22 15 files changed, 140 insertions(+), 114 deletions(-) diff --git a/core/src/main/resources/org/apache/spark/ui/static/timeline-view.css b/core/src/main/resources/org/apache/spark/ui/static/timeline-view.css index 3f31403..c9bf83c 100644 --- a/core/src/main/resources/org/apache/spark/ui/static/timeline-view.css +++ b/core/src/main/resources/org/apache/spark/ui/static/timeline-view.css @@ -238,18 +238,6 @@ tr.corresponding-item-hover > td, tr.corresponding-item-hover > th { background-color: #D6FFE4 !important; } -#application-timeline.collapsed { - display: none; -} - -#job-timeline.collapsed { - display: none; -} - -#task-assignment-timeline.collapsed { - display: none; -} - .control-panel { margin-bottom: 5px; } diff --git a/core/src/main/resources/org/apache/spark/ui/static/timeline-view.js b/core/src/main/resources/org/apache/spark/ui/static/timeline-view.js index b2cd616..a63ee86 100644 --- a/core/src/main/resources/org/apache/spark/ui/static/timeline-view.js +++ b/core/src/main/resources/org/apache/spark/ui/static/timeline-view.js @@ -26,7 +26,7 @@ function drawApplicationTimeline(groupArray, eventObjArray, startTime, offset) { editable: false, align: 'left', showCurrentTime: false, -min: startTime, 
+start: startTime, zoomable: false, moment: function (date) { return vis.moment(date).utcOffset(offset); @@ -50,7 +50,7 @@ function drawApplicationTimeline(groupArray, eventObjArray, startTime, offset) { }; $(this).click(function() { -var jobPagePath = $(getSelectorForJobEntry(this)).find("a.name-link").attr("href") +var jobPagePath = $(getSelectorForJobEntry(this)).find("a.name-link").attr("href"); window.location.href = jobPagePath }); @@ -75,6 +75,9 @@ function drawApplicationTimeline(groupArray, eventObjArray, startTime, offset) { $("#application-timeline").toggleClass('collapsed'); +var visibilityState = status ? "" : "none"; +$("#application-timeline").css("display", visibilityState); + // Switch the class of the arrow from open to closed. $(this).find('.expand
[spark] branch branch-3.0 updated: [SPARK-31004][WEBUI][SS] Show message for empty Streaming Queries instead of empty timelines and histograms
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new b1976ac [SPARK-31004][WEBUI][SS] Show message for empty Streaming Queries instead of empty timelines and histograms b1976ac is described below commit b1976ac43ce8cdda58916c300fd5b01677fd5973 Author: Kousuke Saruta AuthorDate: Fri Mar 13 12:58:49 2020 -0700 [SPARK-31004][WEBUI][SS] Show message for empty Streaming Queries instead of empty timelines and histograms ### What changes were proposed in this pull request? `StreamingQueryStatisticsPage` now shows the message "No visualization information available because there is no batches" instead of showing empty timelines and histograms for empty streaming queries. [Before this change was applied] ![before-fix-for-empty-streaming-query](https://user-images.githubusercontent.com/4736016/75642391-b32e1d80-5c7e-11ea-9c07-e2f0f1b5b4f9.png) [After this change was applied] ![after-fix-for-empty-streaming-query2](https://user-images.githubusercontent.com/4736016/75694583-1904be80-5cec-11ea-9b13-dc7078775188.png) ### Why are the changes needed? Empty charts are ugly and a little confusing. It is better to clearly say "No visualization information available". Also, this change fixes a JS error shown in the capture above. The error occurs because `drawTimeline` in `streaming-page.js` is called even though `formattedDate` is `undefined` for empty streaming queries. ### Does this PR introduce any user-facing change? Yes; screen captures are shown above. ### How was this patch tested? Manually tested by creating an empty streaming query as follows. ``` val df = spark.readStream.format("socket").options(Map("host"->"", "port"->"...")).load df.writeStream.format("console").start ``` This streaming query fails because the host does not exist, so it has no batches. 
Closes #27755 from sarutak/fix-for-empty-batches. Authored-by: Kousuke Saruta Signed-off-by: Gengliang Wang (cherry picked from commit 680981587d3219aea7caadb56df6519c8b3acb7f) Signed-off-by: Gengliang Wang --- .../sql/streaming/ui/StreamingQueryStatisticsPage.scala| 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala b/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala index 65052d9..227e5e5 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala @@ -229,14 +229,15 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab) 0L, "ms") -val table = -// scalastyle:off +val table = if (query.lastProgress != null) { + // scalastyle:off Timelines -Histograms +Histograms + @@ -285,7 +286,12 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab) -// scalastyle:on +} else { + +No visualization information available. + + // scalastyle:on +} generateTimeToValues(operationDurationData) ++ generateFormattedTimeTipStrings(batchToTimestamps) ++ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
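The heart of this patch is a branch on `query.lastProgress`: build the timelines/histograms table only when the query has completed at least one batch, and otherwise emit a placeholder message. A hypothetical, much-simplified sketch of that branch — the real page builds Scala XML; `renderStatistics` and the string output here are invented for illustration:

```javascript
// Branch on whether any progress exists, mirroring the patch's
// `if (query.lastProgress != null)` check. A null `lastProgress`
// models a streaming query that has never completed a batch.
function renderStatistics(lastProgress) {
  if (lastProgress == null) {
    // Placeholder instead of empty charts; also avoids calling the
    // chart-drawing code with undefined data (the JS error the PR fixes).
    return "<div>No visualization information available.</div>";
  }
  return `<table><!-- timelines and histograms for batch ${lastProgress.batchId} --></table>`;
}

console.log(renderStatistics(null));
console.log(renderStatistics({ batchId: 0 }));
```

Guarding at render time, rather than inside each chart-drawing helper, keeps the "no data" state in one place.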
[spark] branch master updated (0f46325 -> 6809815)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 0f46325 [SPARK-31128][WEBUI] Fix Uncaught TypeError in streaming statistics page add 6809815 [SPARK-31004][WEBUI][SS] Show message for empty Streaming Queries instead of empty timelines and histograms No new revisions were added by this update. Summary of changes: .../sql/streaming/ui/StreamingQueryStatisticsPage.scala| 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6809815 -> 2a4fed0)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6809815 [SPARK-31004][WEBUI][SS] Show message for empty Streaming Queries instead of empty timelines and histograms add 2a4fed0 [SPARK-30654][WEBUI] Bootstrap4 WebUI upgrade No new revisions were added by this update. Summary of changes: .../apache/spark/ui/static/bootstrap.bundle.min.js | 7 + .../spark/ui/static/bootstrap.bundle.min.js.map| 1 + .../org/apache/spark/ui/static/bootstrap.min.css | 879 + .../apache/spark/ui/static/bootstrap.min.css.map | 1 + .../spark/ui/static/dataTables.bootstrap.css | 319 .../spark/ui/static/dataTables.bootstrap.min.js| 8 - .../static/dataTables.bootstrap4.1.10.20.min.css | 1 + .../ui/static/dataTables.bootstrap4.1.10.20.min.js | 11 + .../spark/ui/static/executorspage-template.html| 26 +- .../org/apache/spark/ui/static/executorspage.js| 8 +- .../ui/static/jquery.dataTables.1.10.18.min.js | 166 ...8.min.css => jquery.dataTables.1.10.20.min.css} | 2 +- .../ui/static/jquery.dataTables.1.10.20.min.js | 180 + .../org/apache/spark/ui/static/stagepage.js| 14 +- .../spark/ui/static/stagespage-template.html | 50 +- .../org/apache/spark/ui/static/streaming-page.js | 2 +- .../resources/org/apache/spark/ui/static/table.js | 12 +- .../apache/spark/ui/static/webui-dataTables.css| 47 +- .../resources/org/apache/spark/ui/static/webui.css | 126 ++- .../apache/spark/deploy/history/HistoryPage.scala | 4 +- .../spark/deploy/history/HistoryServer.scala | 2 +- .../spark/deploy/master/ui/ApplicationPage.scala | 12 +- .../apache/spark/deploy/master/ui/MasterPage.scala | 26 +- .../apache/spark/deploy/worker/ui/LogPage.scala| 4 +- .../apache/spark/deploy/worker/ui/WorkerPage.scala | 12 +- .../scala/org/apache/spark/ui/PagedTable.scala | 35 +- .../main/scala/org/apache/spark/ui/UIUtils.scala | 74 +- .../spark/ui/exec/ExecutorThreadDumpPage.scala | 73 +- 
.../org/apache/spark/ui/exec/ExecutorsTab.scala| 14 +- .../org/apache/spark/ui/jobs/AllJobsPage.scala | 9 +- .../org/apache/spark/ui/jobs/AllStagesPage.scala | 2 +- .../scala/org/apache/spark/ui/jobs/JobPage.scala | 2 +- .../scala/org/apache/spark/ui/jobs/PoolTable.scala | 28 +- .../scala/org/apache/spark/ui/jobs/StagePage.scala | 13 +- .../org/apache/spark/ui/jobs/StageTable.scala | 2 +- .../org/apache/spark/ui/storage/RDDPage.scala | 12 +- .../scala/org/apache/spark/ui/UIUtilsSuite.scala | 7 +- dev/.rat-excludes | 12 +- licenses-binary/LICENSE-bootstrap.txt | 29 +- licenses-binary/LICENSE-datatables.txt | 4 +- licenses/LICENSE-bootstrap.txt | 29 +- licenses/LICENSE-datatables.txt| 2 +- .../apache/spark/deploy/mesos/ui/DriverPage.scala | 4 +- .../spark/deploy/mesos/ui/MesosClusterPage.scala | 4 +- .../sql/execution/ui/static/spark-sql-viz.css | 1 + .../spark/sql/execution/ui/static/spark-sql-viz.js | 9 +- .../spark/sql/execution/ui/AllExecutionsPage.scala | 4 +- .../spark/sql/execution/ui/ExecutionPage.scala | 2 +- .../hive/thriftserver/ui/ThriftServerPage.scala| 6 +- .../thriftserver/ui/ThriftServerSessionPage.scala | 2 +- .../spark/streaming/ui/AllBatchesTable.scala | 2 +- .../org/apache/spark/streaming/ui/BatchPage.scala | 4 +- .../apache/spark/streaming/ui/StreamingPage.scala | 8 +- 53 files changed, 647 insertions(+), 1666 deletions(-) create mode 100644 core/src/main/resources/org/apache/spark/ui/static/bootstrap.bundle.min.js create mode 100644 core/src/main/resources/org/apache/spark/ui/static/bootstrap.bundle.min.js.map mode change 100755 => 100644 core/src/main/resources/org/apache/spark/ui/static/bootstrap.min.css create mode 100644 core/src/main/resources/org/apache/spark/ui/static/bootstrap.min.css.map delete mode 100644 core/src/main/resources/org/apache/spark/ui/static/dataTables.bootstrap.css delete mode 100644 core/src/main/resources/org/apache/spark/ui/static/dataTables.bootstrap.min.js create mode 100644 
core/src/main/resources/org/apache/spark/ui/static/dataTables.bootstrap4.1.10.20.min.css create mode 100644 core/src/main/resources/org/apache/spark/ui/static/dataTables.bootstrap4.1.10.20.min.js delete mode 100644 core/src/main/resources/org/apache/spark/ui/static/jquery.dataTables.1.10.18.min.js rename core/src/main/resources/org/apache/spark/ui/static/{jquery.dataTables.1.10.18.min.css => jquery.dataTables.1.10.20.min.css} (99%) create mode 100644 core/src/main/resources/org/apache/spark/ui/static/jquery.dataTables.1.10.20.min.js - To unsubscrib
[spark] branch branch-3.0 updated: [SPARK-31004][WEBUI][SS] Show message for empty Streaming Queries instead of empty timelines and histograms
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new b1976ac [SPARK-31004][WEBUI][SS] Show message for empty Streaming Queries instead of empty timelines and histograms b1976ac is described below commit b1976ac43ce8cdda58916c300fd5b01677fd5973 Author: Kousuke Saruta AuthorDate: Fri Mar 13 12:58:49 2020 -0700 [SPARK-31004][WEBUI][SS] Show message for empty Streaming Queries instead of empty timelines and histograms ### What changes were proposed in this pull request? `StreamingQueryStatisticsPage` shows a message "No visualization information available because there is no batches" instead of showing empty timelines and histograms for empty streaming queries. [Before this change applied] ![before-fix-for-empty-streaming-query](https://user-images.githubusercontent.com/4736016/75642391-b32e1d80-5c7e-11ea-9c07-e2f0f1b5b4f9.png) [After this change applied] ![after-fix-for-empty-streaming-query2](https://user-images.githubusercontent.com/4736016/75694583-1904be80-5cec-11ea-9b13-dc7078775188.png) ### Why are the changes needed? Empty charts are ugly and a little bit confusing. It's better to clearly say "No visualization information available". Also, this change fixes a JS error shown in the capture above. This error occurs because `drawTimeline` in `streaming-page.js` is called even though `formattedDate` will be `undefined` for empty streaming queries. ### Does this PR introduce any user-facing change? Yes. screen captures are shown above. ### How was this patch tested? Manually tested by creating an empty streaming query like as follows. ``` val df = spark.readStream.format("socket").options(Map("host"->"", "port"->"...")).load df.writeStream.format("console").start ``` This streaming query will fail because of `non-existing hostname` and has no batches. 
Closes #27755 from sarutak/fix-for-empty-batches. Authored-by: Kousuke Saruta Signed-off-by: Gengliang Wang (cherry picked from commit 680981587d3219aea7caadb56df6519c8b3acb7f) Signed-off-by: Gengliang Wang --- .../sql/streaming/ui/StreamingQueryStatisticsPage.scala| 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala b/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala index 65052d9..227e5e5 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatisticsPage.scala @@ -229,14 +229,15 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab) 0L, "ms") -val table = -// scalastyle:off +val table = if (query.lastProgress != null) { + // scalastyle:off Timelines -Histograms +Histograms + @@ -285,7 +286,12 @@ private[ui] class StreamingQueryStatisticsPage(parent: StreamingQueryTab) -// scalastyle:on +} else { + +No visualization information available. + + // scalastyle:on +} generateTimeToValues(operationDurationData) ++ generateFormattedTimeTipStrings(batchToTimestamps) ++ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
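The guard the patch adds can be sketched as follows. This is an illustrative stand-in, not Spark's actual `streaming-page.js` code: the function and field names (`renderStatistics`, `batches`, `formattedDate`, `drawTimeline`) are assumptions based on the commit message, which describes `drawTimeline` being called with an `undefined` `formattedDate` when a query has no batches.

```javascript
// Sketch of the fix's behavior: with zero batches there is no formattedDate
// to feed into the timeline code, so return a placeholder message instead
// of invoking the chart-drawing callback at all.
function renderStatistics(batches, drawTimeline) {
  if (!batches || batches.length === 0) {
    return "No visualization information available.";
  }
  // Only reached when there is at least one batch, so formattedDate is defined.
  batches.forEach(function (b) { drawTimeline(b.formattedDate, b.value); });
  return "rendered " + batches.length + " batch(es)";
}
```

The point of the design is that the empty case is decided once, up front, rather than letting each chart helper discover `undefined` inputs and fail with a JS error.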
[spark] branch branch-3.0 updated (53221cd -> 49dd32d)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 53221cd [SPARK-31244][K8S][TEST] Use Minio instead of Ceph in K8S DepsTestsSuite add 49dd32d [SPARK-31161][WEBUI][3.0] Refactor the on-click timeline action in streaming-page.js No new revisions were added by this update. Summary of changes: .../org/apache/spark/ui/static/streaming-page.js | 70 +- .../apache/spark/streaming/ui/StreamingPage.scala | 9 ++- 2 files changed, 50 insertions(+), 29 deletions(-)
[spark] branch master updated (2c39502 -> d98df76)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 2c39502 [SPARK-31253][SQL][FOLLOWUP] Add metrics to AQE shuffle reader add d98df76 [SPARK-31325][SQL][WEB UI] Control a plan explain mode in the events of SQL listeners via SQLConf No new revisions were added by this update. Summary of changes: docs/sql-migration-guide.md| 3 ++ .../org/apache/spark/sql/internal/SQLConf.scala| 13 +++ .../apache/spark/sql/execution/SQLExecution.scala | 5 ++- .../execution/adaptive/AdaptiveSparkPlanExec.scala | 3 +- .../adaptive/AdaptiveQueryExecSuite.scala | 42 +- .../execution/ui/SQLAppStatusListenerSuite.scala | 37 +++ 6 files changed, 100 insertions(+), 3 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-31584][WEBUI] Fix NullPointerException when parsing event log with InMemoryStore
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 0058212 [SPARK-31584][WEBUI] Fix NullPointerException when parsing event log with InMemoryStore 0058212 is described below commit 0058212dc60129357ad90e2c04bcb22094a9dd84 Author: Baohe Zhang AuthorDate: Tue Apr 28 17:27:13 2020 -0700 [SPARK-31584][WEBUI] Fix NullPointerException when parsing event log with InMemoryStore ### What changes were proposed in this pull request? https://github.com/apache/spark/pull/27716 introduced parent index for InMemoryStore. When the method "deleteParentIndex(Object key)" in InMemoryStore.java is called and the key is not contained in "NaturalKeys v", a java.lang.NullPointerException will be thrown. This patch fixed the issue by updating the if condition. ### Why are the changes needed? Fixed a minor bug. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Added a unit test for deleteParentIndex. Closes #28378 from baohe-zhang/SPARK-31584.
Authored-by: Baohe Zhang Signed-off-by: Gengliang Wang (cherry picked from commit 3808014a2fa763866ad55776ccb397de4d13afcc) Signed-off-by: Gengliang Wang --- .../apache/spark/util/kvstore/InMemoryStore.java | 2 +- .../org/apache/spark/util/kvstore/CustomType2.java | 50 ++ .../spark/util/kvstore/InMemoryStoreSuite.java | 42 ++ 3 files changed, 93 insertions(+), 1 deletion(-) diff --git a/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java b/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java index e929c6c..42e090b 100644 --- a/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java +++ b/common/kvstore/src/main/java/org/apache/spark/util/kvstore/InMemoryStore.java @@ -293,7 +293,7 @@ public class InMemoryStore implements KVStore { private void deleteParentIndex(Object key) { if (hasNaturalParentIndex) { for (NaturalKeys v : parentToChildrenMap.values()) { - if (v.remove(asKey(key))) { + if (v.remove(asKey(key)) != null) { // `v` can be empty after removing the natural key and we can remove it from // `parentToChildrenMap`. However, `parentToChildrenMap` is a ConcurrentMap and such // checking and deleting can be slow. diff --git a/common/kvstore/src/test/java/org/apache/spark/util/kvstore/CustomType2.java b/common/kvstore/src/test/java/org/apache/spark/util/kvstore/CustomType2.java new file mode 100644 index 000..3bb66bb --- /dev/null +++ b/common/kvstore/src/test/java/org/apache/spark/util/kvstore/CustomType2.java @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.util.kvstore; + +public class CustomType2 { + + @KVIndex(parent = "parentId") + public String key; + + @KVIndex("id") + public String id; + + @KVIndex("parentId") + public String parentId; + + @Override + public boolean equals(Object o) { +if (o instanceof CustomType2) { + CustomType2 other = (CustomType2) o; + return id.equals(other.id) && parentId.equals(other.parentId); +} +return false; + } + + @Override + public int hashCode() { +return id.hashCode() ^ parentId.hashCode(); + } + + @Override + public String toString() { +return "CustomType2[key=" + key + ",id=" + id + ",parentId=" + parentId; + } + +} diff --git a/common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java b/common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java index da52676..35656fb 100644 --- a/common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java +++ b/common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java @@ -210,4 +210,46 @@ public cl
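The pattern the patch restores can be sketched outside Java. The actual fix is in Java's `InMemoryStore`, where `ConcurrentHashMap.remove` returns `null` for an absent key and unboxing that `null` raised the NullPointerException; the JavaScript analog below is a hypothetical illustration of the same idea — when removing a child key from every parent bucket, the removal call must report "not present" as a normal result that the caller branches on, never as an implicit error.

```javascript
// Hypothetical parent-to-children index: each parent maps to a Map of child
// keys. Deleting a child key from every bucket must tolerate buckets that do
// not contain the key (Map.prototype.delete returns false in that case, the
// counterpart of the `!= null` check added to the Java code).
function deleteParentIndex(parentToChildren, childKey) {
  var removedFrom = 0;
  Object.keys(parentToChildren).forEach(function (parent) {
    if (parentToChildren[parent].delete(childKey)) {
      removedFrom += 1;
    }
  });
  return removedFrom;
}
```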
[spark] branch master updated (d34cb59 -> 3808014)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d34cb59 [SPARK-31556][SQL][DOCS] Document LIKE clause in SQL Reference add 3808014 [SPARK-31584][WEBUI] Fix NullPointerException when parsing event log with InMemoryStore No new revisions were added by this update. Summary of changes: .../apache/spark/util/kvstore/InMemoryStore.java | 2 +- .../kvstore/{CustomType1.java => CustomType2.java} | 24 + .../spark/util/kvstore/InMemoryStoreSuite.java | 42 ++ 3 files changed, 52 insertions(+), 16 deletions(-) copy common/kvstore/src/test/java/org/apache/spark/util/kvstore/{CustomType1.java => CustomType2.java} (70%)
[spark] branch branch-3.0 updated: [SPARK-31558][UI] Code clean up in spark-sql-viz.js
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 27a9c1d [SPARK-31558][UI] Code clean up in spark-sql-viz.js 27a9c1d is described below commit 27a9c1da6f9ced760c33ff624c2eadc5f6a85dd9 Author: Gengliang Wang AuthorDate: Sat Apr 25 13:43:52 2020 -0700 [SPARK-31558][UI] Code clean up in spark-sql-viz.js ### What changes were proposed in this pull request? 1. Remove console.log(), which seems unnecessary in the releases. 2. Replace the double equals to triple equals 3. Reuse jquery selector. ### Why are the changes needed? For better code quality. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Existing tests + manual test. Closes #28333 from gengliangwang/removeLog. Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang (cherry picked from commit f59ebdef5b9c03dab99835527554ba8bebe38548) Signed-off-by: Gengliang Wang --- .../org/apache/spark/sql/execution/ui/static/spark-sql-viz.js | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js index bded921..724cec1 100644 --- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js +++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js @@ -61,7 +61,7 @@ function planVizContainer() { return d3.select("#plan-viz-graph"); } * node, it will display the details of this SparkPlan node in the right. 
*/ function setupTooltipForSparkPlanNode(nodeId) { - var nodeTooltip = d3.select("#plan-meta-data-" + nodeId).text() + var nodeTooltip = d3.select("#plan-meta-data-" + nodeId).text(); d3.select("svg g .node_" + nodeId) .on('mouseover', function(d) { var domNode = d3.select(this).node(); @@ -127,10 +127,8 @@ function preprocessGraphLayout(g) { */ function resizeSvg(svg) { var allClusters = svg.selectAll("g rect")[0]; - console.log(allClusters); var startX = -PlanVizConstants.svgMarginX + toFloat(d3.min(allClusters, function(e) { - console.log(e); return getAbsolutePosition(d3.select(e)).x; })); var startY = -PlanVizConstants.svgMarginY + @@ -183,7 +181,7 @@ function getAbsolutePosition(d3selection) { // Climb upwards to find how our parents are translated obj = d3.select(obj.node().parentNode); // Stop when we've reached the graph container itself -if (obj.node() == planVizContainer().node()) { +if (obj.node() === planVizContainer().node()) { break; } } @@ -215,8 +213,8 @@ function postprocessForAdditionalMetrics() { checkboxNode.click(function() { onClickAdditionalMetricsCheckbox($(this)); }); - var isChecked = window.localStorage.getItem("stageId-and-taskId-checked") == "true"; - $("#stageId-and-taskId-checkbox").prop("checked", isChecked); + var isChecked = window.localStorage.getItem("stageId-and-taskId-checked") === "true"; + checkboxNode.prop("checked", isChecked); onClickAdditionalMetricsCheckbox(checkboxNode); } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31558][UI] Code clean up in spark-sql-viz.js
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new f59ebde [SPARK-31558][UI] Code clean up in spark-sql-viz.js f59ebde is described below commit f59ebdef5b9c03dab99835527554ba8bebe38548 Author: Gengliang Wang AuthorDate: Sat Apr 25 13:43:52 2020 -0700 [SPARK-31558][UI] Code clean up in spark-sql-viz.js ### What changes were proposed in this pull request? 1. Remove console.log(), which seems unnecessary in the releases. 2. Replace the double equals to triple equals 3. Reuse jquery selector. ### Why are the changes needed? For better code quality. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Existing tests + manual test. Closes #28333 from gengliangwang/removeLog. Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang --- .../org/apache/spark/sql/execution/ui/static/spark-sql-viz.js | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js index bb393d9..301183f 100644 --- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js +++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js @@ -61,7 +61,7 @@ function planVizContainer() { return d3.select("#plan-viz-graph"); } * node, it will display the details of this SparkPlan node in the right. 
*/ function setupTooltipForSparkPlanNode(nodeId) { - var nodeTooltip = d3.select("#plan-meta-data-" + nodeId).text() + var nodeTooltip = d3.select("#plan-meta-data-" + nodeId).text(); d3.select("svg g .node_" + nodeId) .each(function(d) { var domNode = d3.select(this).node(); @@ -122,10 +122,8 @@ function preprocessGraphLayout(g) { */ function resizeSvg(svg) { var allClusters = svg.selectAll("g rect")[0]; - console.log(allClusters); var startX = -PlanVizConstants.svgMarginX + toFloat(d3.min(allClusters, function(e) { - console.log(e); return getAbsolutePosition(d3.select(e)).x; })); var startY = -PlanVizConstants.svgMarginY + @@ -178,7 +176,7 @@ function getAbsolutePosition(d3selection) { // Climb upwards to find how our parents are translated obj = d3.select(obj.node().parentNode); // Stop when we've reached the graph container itself -if (obj.node() == planVizContainer().node()) { +if (obj.node() === planVizContainer().node()) { break; } } @@ -210,8 +208,8 @@ function postprocessForAdditionalMetrics() { checkboxNode.click(function() { onClickAdditionalMetricsCheckbox($(this)); }); - var isChecked = window.localStorage.getItem("stageId-and-taskId-checked") == "true"; - $("#stageId-and-taskId-checkbox").prop("checked", isChecked); + var isChecked = window.localStorage.getItem("stageId-and-taskId-checked") === "true"; + checkboxNode.prop("checked", isChecked); onClickAdditionalMetricsCheckbox(checkboxNode); } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
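The reason the patch swaps `==` for `===` can be shown in isolation. This is a generic sketch, not Spark's code: `localStorage.getItem` returns a string (or `null` when the key is unset), so a strict comparison against `"true"` is the unambiguous check, while loose equality coerces types and can make unrelated values compare equal.

```javascript
// Strict check mirroring the patched localStorage test: only the literal
// string "true" counts as checked; null (key unset) and anything else do not.
function isCheckboxStored(storedValue) {
  return storedValue === "true";
}

// Coercion surprise avoided by ===: loose equality converts both sides to
// numbers, so 0 and the empty string compare equal under == but not under ===.
var looseEqual = (0 == "");
var strictEqual = (0 === "");
```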
[spark] branch branch-3.0 updated: [SPARK-31440][SQL] Improve SQL Rest API
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new d668d67 [SPARK-31440][SQL] Improve SQL Rest API d668d67 is described below commit d668d679e93dc4f84bdce15f9f6bdb19552496f1 Author: Eren Avsarogullari AuthorDate: Mon May 18 23:21:32 2020 -0700 [SPARK-31440][SQL] Improve SQL Rest API ### What changes were proposed in this pull request? SQL Rest API exposes query execution metrics as Public API. This PR aims to apply following improvements on SQL Rest API by aligning Spark-UI. **Proposed Improvements:** 1- Support Physical Operations and group metrics per physical operation by aligning Spark UI. 2- Support `wholeStageCodegenId` for Physical Operations 3- `nodeId` can be useful for grouping metrics and sorting physical operations (according to execution order) to differentiate same operators (if used multiple times during the same query execution) and their metrics. 4- Filter `empty` metrics by aligning with Spark UI - SQL Tab. Currently, Spark UI does not show empty metrics. 5- Remove line breakers(`\n`) from `metricValue`. 6- `planDescription` can be `optional` Http parameter to avoid network cost where there is specially complex jobs creating big-plans. 7- `metrics` attribute needs to be exposed at the bottom order as `nodes`. Specially, this can be useful for the user where `nodes` array size is high. 8- `edges` attribute is being exposed to show relationship between `nodes`. 9- Reverse order on `metricDetails` aims to match with Spark UI by supporting Physical Operators' execution order. ### Why are the changes needed? Proposed improvements provides more useful (e.g: physical operations and metrics correlation, grouping) and clear (e.g: filtering blank metrics, removing line breakers) result for the end-user. ### Does this PR introduce any user-facing change? Yes. 
Please find both current and improved versions of the results as attached for following SQL Rest Endpoint: ``` curl -X GET http://localhost:4040/api/v1/applications/$appId/sql/$executionId?details=true ``` **Current version:** https://issues.apache.org/jira/secure/attachment/12999821/current_version.json **Improved version:** https://issues.apache.org/jira/secure/attachment/13000621/improved_version.json ### Backward Compatibility SQL Rest API will be started to expose with `Spark 3.0` and `3.0.0-preview2` (released on 12/23/19) does not cover this API so if PR can catch 3.0 release, this will not have any backward compatibility issue. ### How was this patch tested? 1. New Unit tests are added. 2. Also, patch has been tested manually through both **Spark Core** and **History Server** Rest APIs. Closes #28208 from erenavsarogullari/SPARK-31440. Authored-by: Eren Avsarogullari Signed-off-by: Gengliang Wang (cherry picked from commit ab4cf49a1ca5a2578ecd0441ad6db8e88593e648) Signed-off-by: Gengliang Wang --- .../spark/sql/execution/ui/SparkPlanGraph.scala| 6 +- .../spark/status/api/v1/sql/SqlResource.scala | 90 +++-- .../org/apache/spark/status/api/v1/sql/api.scala | 15 +- .../spark/status/api/v1/sql/SqlResourceSuite.scala | 205 + 4 files changed, 293 insertions(+), 23 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala index 274a5a4..a798fe0 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala @@ -153,7 +153,7 @@ object SparkPlanGraph { * @param name the name of this SparkPlan node * @param metrics metrics that this SparkPlan node will track */ -private[ui] class SparkPlanGraphNode( +class SparkPlanGraphNode( val id: Long, val name: String, val desc: String, @@ -193,7 +193,7 @@ private[ui] class 
SparkPlanGraphNode( /** * Represent a tree of SparkPlan for WholeStageCodegen. */ -private[ui] class SparkPlanGraphCluster( +class SparkPlanGraphCluster( id: Long, name: String, desc: String, @@ -229,7 +229,7 @@ private[ui] class SparkPlanGraphCluster( * Represent an edge in the SparkPlan tree. `fromId` is the child node id, and `toId` is the parent * node id. */ -private[ui] case class SparkPlanGraphEdge(fromId: Long, toId: Long) { +case class SparkPlanGraphEdge(fromId: Long, toId: Long) { def makeDotEdge: String = s""" $fromId->$toId;\n""" } diff --git a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala b/sql/core/src/main/sca
[spark] branch master updated: [SPARK-31440][SQL] Improve SQL Rest API
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ab4cf49 [SPARK-31440][SQL] Improve SQL Rest API ab4cf49 is described below commit ab4cf49a1ca5a2578ecd0441ad6db8e88593e648 Author: Eren Avsarogullari AuthorDate: Mon May 18 23:21:32 2020 -0700 [SPARK-31440][SQL] Improve SQL Rest API ### What changes were proposed in this pull request? The SQL Rest API exposes query execution metrics as a public API. This PR aims to apply the following improvements to the SQL Rest API, aligning it with the Spark UI. **Proposed Improvements:** 1- Support physical operations and group metrics per physical operation, as the Spark UI does. 2- Support `wholeStageCodegenId` for physical operations. 3- `nodeId` can be useful for grouping metrics and sorting physical operations (according to execution order) to differentiate the same operators (if used multiple times during the same query execution) and their metrics. 4- Filter out `empty` metrics, aligning with the Spark UI's SQL tab; currently, the Spark UI does not show empty metrics. 5- Remove line breaks (`\n`) from `metricValue`. 6- Make `planDescription` an `optional` HTTP parameter to avoid network cost for especially complex jobs that create big plans. 7- Expose the `metrics` attribute after `nodes`; this can be especially useful when the `nodes` array is large. 8- Expose an `edges` attribute to show the relationships between `nodes`. 9- Reverse the order of `metricDetails` to match the Spark UI by following the physical operators' execution order. ### Why are the changes needed? The proposed improvements provide more useful (e.g. physical operation and metrics correlation, grouping) and clearer (e.g. filtering blank metrics, removing line breaks) results for the end user. ### Does this PR introduce any user-facing change? Yes.
Please find attached both the current and improved versions of the results for the following SQL Rest endpoint: ``` curl -X GET http://localhost:4040/api/v1/applications/$appId/sql/$executionId?details=true ``` **Current version:** https://issues.apache.org/jira/secure/attachment/12999821/current_version.json **Improved version:** https://issues.apache.org/jira/secure/attachment/13000621/improved_version.json ### Backward Compatibility The SQL Rest API will first be exposed in `Spark 3.0`, and `3.0.0-preview2` (released on 12/23/19) does not include this API, so if this PR makes the 3.0 release it will not have any backward-compatibility issue. ### How was this patch tested? 1. New unit tests were added. 2. The patch has also been tested manually through both the **Spark Core** and **History Server** Rest APIs. Closes #28208 from erenavsarogullari/SPARK-31440. Authored-by: Eren Avsarogullari Signed-off-by: Gengliang Wang --- .../spark/sql/execution/ui/SparkPlanGraph.scala| 6 +- .../spark/status/api/v1/sql/SqlResource.scala | 90 +++-- .../org/apache/spark/status/api/v1/sql/api.scala | 15 +- .../spark/status/api/v1/sql/SqlResourceSuite.scala | 205 + 4 files changed, 293 insertions(+), 23 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala index 274a5a4..a798fe0 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala @@ -153,7 +153,7 @@ object SparkPlanGraph { * @param name the name of this SparkPlan node * @param metrics metrics that this SparkPlan node will track */ -private[ui] class SparkPlanGraphNode( +class SparkPlanGraphNode( val id: Long, val name: String, val desc: String, @@ -193,7 +193,7 @@ private[ui] class SparkPlanGraphNode( /** * Represent a tree of SparkPlan for WholeStageCodegen.
*/ -private[ui] class SparkPlanGraphCluster( +class SparkPlanGraphCluster( id: Long, name: String, desc: String, @@ -229,7 +229,7 @@ private[ui] class SparkPlanGraphCluster( * Represent an edge in the SparkPlan tree. `fromId` is the child node id, and `toId` is the parent * node id. */ -private[ui] case class SparkPlanGraphEdge(fromId: Long, toId: Long) { +case class SparkPlanGraphEdge(fromId: Long, toId: Long) { def makeDotEdge: String = s""" $fromId->$toId;\n""" } diff --git a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala index 346e07f..c7599f8 100644 --- a/sql/core/src/main/scala/o
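The improved response shape described above (operator `nodes` carrying their `metrics`, plus `edges` linking them) can be sketched with a small consumer. The field names below follow the PR description; the authoritative schema is in the linked `improved_version.json` attachment, so treat this as an illustrative sketch rather than the exact API contract.

```javascript
// Illustrative shape of one execution returned by
// GET /api/v1/applications/{appId}/sql/{executionId}?details=true
// after this change. Field names follow the PR description above.
const execution = {
  id: 12,
  status: "COMPLETED",
  // `nodes` carry per-operator metrics; empty metrics are filtered out.
  nodes: [
    { nodeId: 2, nodeName: "Scan parquet", wholeStageCodegenId: 1,
      metrics: [{ name: "number of output rows", value: "1000" }] },
    { nodeId: 1, nodeName: "Filter", wholeStageCodegenId: 1,
      metrics: [{ name: "number of output rows", value: "10" }] },
  ],
  // `edges` expose the parent/child relationship between nodes.
  edges: [{ fromId: 2, toId: 1 }],
};

// Sort operators by nodeId, then index metrics per operator --
// the grouping and ordering this PR makes possible.
function metricsByOperator(exec) {
  return [...exec.nodes]
    .sort((a, b) => a.nodeId - b.nodeId)
    .map(n => ({ operator: n.nodeName, metrics: n.metrics }));
}

console.log(metricsByOperator(execution));
```

Because metrics are grouped under each node rather than returned as one flat list, a client can correlate a metric with its physical operator without re-parsing the plan description.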
[spark] branch master updated (c560428 -> ab4cf49)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from c560428 [SPARK-20732][CORE] Decommission cache blocks to other executors when an executor is decommissioned add ab4cf49 [SPARK-31440][SQL] Improve SQL Rest API No new revisions were added by this update. Summary of changes: .../spark/sql/execution/ui/SparkPlanGraph.scala| 6 +- .../spark/status/api/v1/sql/SqlResource.scala | 90 +++-- .../org/apache/spark/status/api/v1/sql/api.scala | 15 +- .../spark/status/api/v1/sql/SqlResourceSuite.scala | 205 + 4 files changed, 293 insertions(+), 23 deletions(-) create mode 100644 sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceSuite.scala - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31081][UI][SQL] Make display of stageId/stageAttemptId/taskId of sql metrics toggleable
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new d278960 [SPARK-31081][UI][SQL] Make display of stageId/stageAttemptId/taskId of sql metrics toggleable d278960 is described below commit d2789600233fa5e6c01b0c973f8f31ce5a206438 Author: Kousuke Saruta AuthorDate: Tue Mar 24 13:37:13 2020 -0700 [SPARK-31081][UI][SQL] Make display of stageId/stageAttemptId/taskId of sql metrics toggleable ### What changes were proposed in this pull request? This is another solution for `SPARK-31081` and #27849. I added a checkbox which can toggle the display of stageId/taskId in the SQL DAG page. Mainly, I implemented the toggleable texts in boxes with the HTML label feature provided by `dagre-d3`. The additional metrics are enclosed by `` to control the appearance of the text. The exception is additional metrics in clusters: we can use an HTML label for a cluster, but the layout would break, so I chose a normal text label for clusters. Due to that, this solution contains slightly tricky code in `spark-sql-viz.js` to manipulate the metric texts and generate DOMs. ### Why are the changes needed? Metrics became harder to read after #26843, and users may not be interested in the extra info (stageId/stageAttemptId/taskId) when they are not debugging. #27849 controls the appearance via a new configuration property, but providing a checkbox is more flexible. ### Does this PR introduce any user-facing change? Yes. [Additional metrics shown] ![with-checked](https://user-images.githubusercontent.com/4736016/77244214-0f6cd780-6c56-11ea-9275-a30758dd5339.png) [Additional metrics hidden] ![without-checked](https://user-images.githubusercontent.com/4736016/77244219-14ca2200-6c56-11ea-9874-33a466085fce.png) ### How was this patch tested? Tested manually with a simple DataFrame operation.
* The appearance of additional metrics in the boxes is controlled by the newly added checkbox. * No errors found with the JS debugger. * The checked/not-checked state is preserved after reloading. Closes #27927 from sarutak/SPARK-31081. Authored-by: Kousuke Saruta Signed-off-by: Gengliang Wang (cherry picked from commit 999c9ed10c2362d89afd3bbb48e35f3c7ac3cf89) Signed-off-by: Gengliang Wang --- .../sql/execution/ui/static/spark-sql-viz.css | 2 +- .../spark/sql/execution/ui/static/spark-sql-viz.js | 74 +- .../spark/sql/execution/ui/ExecutionPage.scala | 4 ++ .../spark/sql/execution/ui/SparkPlanGraph.scala| 10 +-- 4 files changed, 83 insertions(+), 7 deletions(-) diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css index 2018838..ffc9acd 100644 --- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css +++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css @@ -34,7 +34,7 @@ } /* Highlight the SparkPlan node name */ -#plan-viz-graph svg text :first-child { +#plan-viz-graph svg text :first-child:not(.stageId-and-taskId-metrics) { font-weight: bold; } diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js index c834914..b23ae9a 100644 --- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js +++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js @@ -47,6 +47,7 @@ function renderPlanViz() { } resizeSvg(svg); + postprocessForAdditionalMetrics(); } /* * @@ -75,6 +76,10 @@ function setupTooltipForSparkPlanNode(nodeId) { }) } +// labelSeparator should be a non-graphical character in order not to affect the width of boxes.
+var labelSeparator = "\x01"; +var stageAndTaskMetricsPattern = "^(.*)(\\(stage.*attempt.*task[^)]*\\))(.*)$"; + /* * Helper function to pre-process the graph layout. * This step is necessary for certain styles that affect the positioning @@ -84,8 +89,29 @@ function preprocessGraphLayout(g) { g.graph().ranksep = "70"; var nodes = g.nodes(); for (var i = 0; i < nodes.length; i++) { - var node = g.node(nodes[i]); - node.padding = "5"; +var node = g.node(nodes[i]); +node.padding = "5"; + +var firstSearator; +var secondSeparator; +var splitter; +if (node.isCluster) { + firstSearator = secondSeparator = labelSeparator; + splitter = "\\n"; +} else { + f
[spark] branch master updated: [SPARK-31081][UI][SQL] Make display of stageId/stageAttemptId/taskId of sql metrics toggleable
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 999c9ed1 [SPARK-31081][UI][SQL] Make display of stageId/stageAttemptId/taskId of sql metrics toggleable 999c9ed1 is described below commit 999c9ed10c2362d89afd3bbb48e35f3c7ac3cf89 Author: Kousuke Saruta AuthorDate: Tue Mar 24 13:37:13 2020 -0700 [SPARK-31081][UI][SQL] Make display of stageId/stageAttemptId/taskId of sql metrics toggleable ### What changes were proposed in this pull request? This is another solution for `SPARK-31081` and #27849. I added a checkbox which can toggle the display of stageId/taskId in the SQL DAG page. Mainly, I implemented the toggleable texts in boxes with the HTML label feature provided by `dagre-d3`. The additional metrics are enclosed by `` to control the appearance of the text. The exception is additional metrics in clusters: we can use an HTML label for a cluster, but the layout would break, so I chose a normal text label for clusters. Due to that, this solution contains slightly tricky code in `spark-sql-viz.js` to manipulate the metric texts and generate DOMs. ### Why are the changes needed? Metrics became harder to read after #26843, and users may not be interested in the extra info (stageId/stageAttemptId/taskId) when they are not debugging. #27849 controls the appearance via a new configuration property, but providing a checkbox is more flexible. ### Does this PR introduce any user-facing change? Yes. [Additional metrics shown] ![with-checked](https://user-images.githubusercontent.com/4736016/77244214-0f6cd780-6c56-11ea-9275-a30758dd5339.png) [Additional metrics hidden] ![without-checked](https://user-images.githubusercontent.com/4736016/77244219-14ca2200-6c56-11ea-9874-33a466085fce.png) ### How was this patch tested? Tested manually with a simple DataFrame operation.
* The appearance of additional metrics in the boxes is controlled by the newly added checkbox. * No errors found with the JS debugger. * The checked/not-checked state is preserved after reloading. Closes #27927 from sarutak/SPARK-31081. Authored-by: Kousuke Saruta Signed-off-by: Gengliang Wang --- .../sql/execution/ui/static/spark-sql-viz.css | 2 +- .../spark/sql/execution/ui/static/spark-sql-viz.js | 74 +- .../spark/sql/execution/ui/ExecutionPage.scala | 4 ++ .../spark/sql/execution/ui/SparkPlanGraph.scala| 10 +-- 4 files changed, 83 insertions(+), 7 deletions(-) diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css index 94a6bd8..6ba79dc 100644 --- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css +++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css @@ -35,7 +35,7 @@ } /* Highlight the SparkPlan node name */ -#plan-viz-graph svg text :first-child { +#plan-viz-graph svg text :first-child:not(.stageId-and-taskId-metrics) { font-weight: bold; } diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js index 6244b2c..0fb7dab 100644 --- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js +++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js @@ -47,6 +47,7 @@ function renderPlanViz() { } resizeSvg(svg); + postprocessForAdditionalMetrics(); } /* * @@ -70,6 +71,10 @@ function setupTooltipForSparkPlanNode(nodeId) { }) } +// labelSeparator should be a non-graphical character in order not to affect the width of boxes.
+var labelSeparator = "\x01"; +var stageAndTaskMetricsPattern = "^(.*)(\\(stage.*attempt.*task[^)]*\\))(.*)$"; + /* * Helper function to pre-process the graph layout. * This step is necessary for certain styles that affect the positioning @@ -79,8 +84,29 @@ function preprocessGraphLayout(g) { g.graph().ranksep = "70"; var nodes = g.nodes(); for (var i = 0; i < nodes.length; i++) { - var node = g.node(nodes[i]); - node.padding = "5"; +var node = g.node(nodes[i]); +node.padding = "5"; + +var firstSearator; +var secondSeparator; +var splitter; +if (node.isCluster) { + firstSearator = secondSeparator = labelSeparator; + splitter = "\\n"; +} else { + firstSearator = ""; + secondSeparator = ""; + splitter = ""; +} +
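The `stageAndTaskMetricsPattern` regex added above can be exercised on its own. The label below is an illustrative example, not Spark's exact metric text; it only demonstrates how the pattern splits a label into the text before, the `(stage ... attempt ... task ...)` portion, and the text after, so that the middle group can be wrapped in a toggleable element.

```javascript
// Same pattern string as in spark-sql-viz.js above.
var stageAndTaskMetricsPattern = "^(.*)(\\(stage.*attempt.*task[^)]*\\))(.*)$";
var re = new RegExp(stageAndTaskMetricsPattern);

// Hypothetical metric label; real labels come from Spark's SQL metrics.
var label = "duration: 10 ms (stage 2 (attempt 0): task 3)";
var m = label.match(re);

var before = m[1];        // text preceding the stage/task info
var stageTaskInfo = m[2]; // the "(stage ... attempt ... task ...)" portion
var after = m[3];         // text following it (empty here)
console.log(before + "[" + stageTaskInfo + "]" + after);
```

Splitting the label this way is what lets the checkbox show or hide only the stage/task portion while leaving the rest of the metric text untouched.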
[spark] branch master updated (92877c4 -> db5e5fc)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 92877c4 [SPARK-31765][WEBUI] Upgrade HtmlUnit >= 2.37.0 add db5e5fc Revert "[SPARK-31765][WEBUI] Upgrade HtmlUnit >= 2.37.0" No new revisions were added by this update. Summary of changes: core/pom.xml | 2 +- core/src/main/scala/org/apache/spark/ui/JettyUtils.scala | 7 +-- core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala | 3 +-- pom.xml | 10 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- streaming/pom.xml | 2 +- 7 files changed, 11 insertions(+), 17 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (64ffc66 -> 9fdc2a08)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 64ffc66 [SPARK-31786][K8S][BUILD] Upgrade kubernetes-client to 4.9.2 add 9fdc2a08 [SPARK-31793][SQL] Reduce the memory usage in file scan location metadata No new revisions were added by this update. Summary of changes: .../main/scala/org/apache/spark/util/Utils.scala | 18 +++ .../scala/org/apache/spark/util/UtilsSuite.scala | 8 +++ .../spark/sql/execution/DataSourceScanExec.scala | 7 -- .../sql/execution/datasources/v2/FileScan.scala| 6 +++-- .../DataSourceScanExecRedactionSuite.scala | 26 ++ 5 files changed, 61 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31765][WEBUI] Upgrade HtmlUnit >= 2.37.0
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 92877c4 [SPARK-31765][WEBUI] Upgrade HtmlUnit >= 2.37.0 92877c4 is described below commit 92877c4ef2ad113c156b7d9c359f396187c78fa3 Author: Kousuke Saruta AuthorDate: Thu May 21 11:43:25 2020 -0700 [SPARK-31765][WEBUI] Upgrade HtmlUnit >= 2.37.0 ### What changes were proposed in this pull request? This PR upgrades HtmlUnit. Selenium and Jetty are also upgraded because of dependencies. ### Why are the changes needed? Recently, a security issue affecting HtmlUnit was reported: https://nvd.nist.gov/vuln/detail/CVE-2020-5529 According to the report, arbitrary code can be run by malicious users. HtmlUnit is only used for tests, so the impact might not be large, but it's better to upgrade it just in case. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing test cases. Closes #28585 from sarutak/upgrade-htmlunit.
Authored-by: Kousuke Saruta Signed-off-by: Gengliang Wang --- core/pom.xml | 2 +- core/src/main/scala/org/apache/spark/ui/JettyUtils.scala | 7 ++- core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala | 3 ++- pom.xml | 10 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- streaming/pom.xml | 2 +- 7 files changed, 17 insertions(+), 11 deletions(-) diff --git a/core/pom.xml b/core/pom.xml index b0f6888..14b217d 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -334,7 +334,7 @@ org.seleniumhq.selenium - selenium-htmlunit-driver + htmlunit-driver test diff --git a/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala b/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala index 4b4788f..f1962ef 100644 --- a/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala +++ b/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala @@ -23,6 +23,7 @@ import javax.servlet.DispatcherType import javax.servlet.http._ import scala.language.implicitConversions +import scala.util.Try import scala.xml.Node import org.eclipse.jetty.client.HttpClient @@ -500,7 +501,11 @@ private[spark] case class ServerInfo( threadPool match { case pool: QueuedThreadPool => // Workaround for SPARK-30385 to avoid Jetty's acceptor thread shrink. -pool.setIdleTimeout(0) +// As of Jetty 9.4.21, the implementation of +// QueuedThreadPool#setIdleTimeout is changed and IllegalStateException +// will be thrown if we try to set idle timeout after the server has started. +// But this workaround works for Jetty 9.4.28 by ignoring the exception. 
+Try(pool.setIdleTimeout(0)) case _ => } server.stop() diff --git a/core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala b/core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala index 3ec9385..e96d82a 100644 --- a/core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala +++ b/core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala @@ -24,6 +24,7 @@ import javax.servlet.http.{HttpServletRequest, HttpServletResponse} import scala.io.Source import scala.xml.Node +import com.gargoylesoftware.css.parser.CSSParseException import com.gargoylesoftware.htmlunit.DefaultCssErrorHandler import org.json4s._ import org.json4s.jackson.JsonMethods @@ -33,7 +34,6 @@ import org.scalatest._ import org.scalatest.concurrent.Eventually._ import org.scalatest.time.SpanSugar._ import org.scalatestplus.selenium.WebBrowser -import org.w3c.css.sac.CSSParseException import org.apache.spark._ import org.apache.spark.LocalSparkContext._ @@ -784,6 +784,7 @@ class UISeleniumSuite extends SparkFunSuite with WebBrowser with Matchers with B eventually(timeout(10.seconds), interval(50.milliseconds)) { goToUi(sc, "/jobs") + val jobDesc = driver.findElement(By.cssSelector("div[class='application-timeline-content']")) jobDesc.getAttribute("data-title") should include ("collect at console:25") diff --git a/pom.xml b/pom.xml index fd4cebc..29f7fec 100644 --- a/pom.xml +++ b/pom.xml @@ -139,7 +139,7 @@ com.twitter 1.6.0 -9.4.18.v20190429 +9.4.28.v20200408 3.1.0 0.9.5 2.4.0 @@ -187,8 +187,8 @@ 0.12.0 4.7.1 1.1 -2.52.0 -2.22 +3.141.59 +2.40.0 @@ -591,8 +591,8 @@ org.seleniumhq.selenium -
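The `Try(pool.setIdleTimeout(0))` workaround above is a best-effort call: the idle-timeout reset is attempted, and the `IllegalStateException` that newer Jetty versions throw after the server has started is simply swallowed instead of aborting shutdown. A minimal JavaScript sketch of the same pattern, with hypothetical stand-ins for the Jetty objects:

```javascript
// Best-effort wrapper: run fn, report success/failure, never propagate.
// This mirrors Scala's Try(...) used in the diff above.
function bestEffort(fn) {
  try { fn(); return true; } catch (e) { return false; }
}

// Hypothetical stand-in for Jetty's QueuedThreadPool: once started,
// setIdleTimeout throws instead of applying the new value.
const pool = {
  started: true,
  setIdleTimeout(ms) {
    if (this.started) throw new Error("IllegalStateException: already started");
    this.timeout = ms;
  },
};

// Equivalent of Try(pool.setIdleTimeout(0)): the exception is swallowed,
// so shutdown can proceed whether or not the timeout was applied.
const applied = bestEffort(() => pool.setIdleTimeout(0));
console.log(applied); // false -- the call failed but did not propagate
```

The design choice is deliberate: the timeout reset is only an optimization (the SPARK-30385 acceptor-shrink workaround), so failing silently is preferable to failing the stop path.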
[spark] branch master updated (f556946 -> f5360e7)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f556946 [SPARK-32800][SQL] Remove ExpressionSet from the 2.13 branch add f5360e7 [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API No new revisions were added by this update. Summary of changes: .../status/api/v1/sql/ApiSqlRootResource.scala | 9 +- .../spark/status/api/v1/sql/SqlResourceSuite.scala | 5 +- .../v1/sql/SqlResourceWithActualMetricsSuite.scala | 127 + 3 files changed, 136 insertions(+), 5 deletions(-) create mode 100644 sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated (4072665 -> d33052b)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 4072665 [SPARK-32865][DOC] python section in quickstart page doesn't display SPARK_VERSION correctly add d33052b [SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2 No new revisions were added by this update. Summary of changes: .../datasources/v2/DataSourceV2ScanExec.scala | 12 +++ .../spark/sql/sources/v2/DataSourceV2Suite.scala | 23 ++ 2 files changed, 35 insertions(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new d33052b [SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2 d33052b is described below commit d33052b8456818a71724638248cfb91f1aacf1d4 Author: mingjial AuthorDate: Mon Sep 14 15:51:13 2020 +0800 [SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2 ### What changes were proposed in this pull request? Override the doCanonicalize function of the DataSourceV2ScanExec class. ### Why are the changes needed? The query plan of DataSourceV2 fails to reuse any exchange. This change can make DataSourceV2's plan more optimized and reuse exchanges, as DataSourceV1 and parquet do. Direct reason: the equals function of DataSourceV2ScanExec returns 'false' when comparing the same V2 scans (same table, outputs and pushedFilters). Actual cause: with the query plan's default doCanonicalize function, the pushedFilters of DataSourceV2ScanExec are not canonicalized correctly. Essentially, the expressionIds of predicate columns are not normalized. [Spark 32708](https://issues.apache.org/jira/browse/SPARK-32708#) was not caught by my [tests](https://github.com/apache/spark/blob/5b1b9b39eb612cbf9ec67efd4e364adafcff66c4/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala#L392) previously added for [SPARK-32609] because the above issue happens only when the same filtered columns have different expression ids (e.g. joining table t1 with t1). ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? A unit test was added. Closes #29564 from mingjialiu/branch-2.4.
Authored-by: mingjial
Signed-off-by: Gengliang Wang
---
 .../datasources/v2/DataSourceV2ScanExec.scala    | 12 +++
 .../spark/sql/sources/v2/DataSourceV2Suite.scala | 23 ++
 2 files changed, 35 insertions(+)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
index 9b70eec..2fe563a 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
@@ -22,6 +22,7 @@ import scala.collection.JavaConverters._
 import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.QueryPlan
 import org.apache.spark.sql.catalyst.plans.physical
 import org.apache.spark.sql.catalyst.plans.physical.SinglePartition
 import org.apache.spark.sql.execution.{ColumnarBatchScan, LeafExecNode, WholeStageCodegenExec}
@@ -52,6 +53,17 @@ case class DataSourceV2ScanExec(
     case _ => false
   }

+  override def doCanonicalize(): DataSourceV2ScanExec = {
+    DataSourceV2ScanExec(
+      output.map(QueryPlan.normalizeExprId(_, output)),
+      source,
+      options,
+      QueryPlan.normalizePredicates(
+        pushedFilters,
+        AttributeSeq(pushedFilters.flatMap(_.references).distinct)),
+      reader)
+  }
+
   override def hashCode(): Int = {
     Seq(output, source, options).hashCode()
   }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala b/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala
index ef0a8bd..92e0ac9 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala
@@ -393,6 +393,29 @@ class DataSourceV2Suite extends QueryTest with SharedSQLContext {
     }
   }
 }
+
+  test("SPARK-32708: same columns with different ExprIds should be equal after canonicalization ") {
+    def getV2ScanExec(query: DataFrame): DataSourceV2ScanExec = {
+      query.queryExecution.executedPlan.collect {
+        case d: DataSourceV2ScanExec => d
+      }.head
+    }
+
+    val df1 = spark.read.format(classOf[AdvancedDataSourceV2].getName).load()
+    val q1 = df1.select('i).filter('i > 6)
+    val df2 = spark.read.format(classOf[AdvancedDataSourceV2].getName).load()
+    val q2 = df2.select('i).filter('i > 6)
+    val scan1 = getV2ScanExec(q1)
+    val scan2 = getV2ScanExec(q2)
+    assert(scan1.sameResult(scan2))
+    assert(scan1.doCanonicalize().equals(scan2.doCanonicalize()))
+
+    val q3 = df2.select('i).filter('i > 5)
+    val scan3 = getV2ScanExec(q3)
+    assert(!scan1.sameResult(scan3))
+    assert(!scan1.doCanonicalize().equals(scan3.doCanonicalize()))
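The mechanism behind the fix above can be sketched outside Spark. This is a minimal illustration, not Spark's actual classes: the hypothetical `Attr` and `Scan` types stand in for Catalyst attributes and `DataSourceV2ScanExec`, and `doCanonicalize` here mimics what `QueryPlan.normalizeExprId` / `normalizePredicates` do — replace per-instance expression IDs with stable, position-based IDs so that two scans that differ only in ID allocation compare equal.

```scala
// Hypothetical stand-ins for Catalyst attributes and a V2 scan node.
case class Attr(name: String, exprId: Long)

case class Scan(output: Seq[Attr], pushedFilters: Seq[(Attr, Int)]) {
  // Replace each attribute's exprId with its ordinal position in `output`,
  // so equality no longer depends on which plan instance allocated the IDs.
  def doCanonicalize: Scan = {
    val pos: Map[Attr, Int] = output.zipWithIndex.toMap
    def norm(a: Attr): Attr = a.copy(exprId = pos(a).toLong)
    Scan(output.map(norm), pushedFilters.map { case (a, v) => (norm(a), v) })
  }
}

// Same table, same filter (i > 6), but the attribute `i` received a different
// ExprId in each plan -- analogous to joining table t1 with itself.
val scan1 = Scan(Seq(Attr("i", 10L)), Seq(Attr("i", 10L) -> 6))
val scan2 = Scan(Seq(Attr("i", 42L)), Seq(Attr("i", 42L) -> 6))

assert(scan1 != scan2)                               // raw ExprIds differ
assert(scan1.doCanonicalize == scan2.doCanonicalize) // equal once normalized

// A genuinely different filter (i > 5) still compares unequal afterwards.
val scan3 = Scan(Seq(Attr("i", 42L)), Seq(Attr("i", 42L) -> 5))
assert(scan1.doCanonicalize != scan3.doCanonicalize)
```

Because exchange reuse keys on canonicalized-plan equality, normalizing the IDs in both `output` and the pushed filters is exactly what lets the optimizer recognize the two scans as the same work.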
[spark] branch master updated (de0dc52 -> 794b48c)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from de0dc52  [SPARK-32813][SQL] Get default config of ParquetSource vectorized reader if no active SparkSession
     add 794b48c  [SPARK-32204][SPARK-32182][DOCS][FOLLOW-UP] Use IPython instead of ipython to check if installed in dev/lint-python

No new revisions were added by this update.

Summary of changes:
 dev/lint-python | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated (3ae1520 -> e029e89)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 3ae1520  [SPARK-33131][SQL] Fix grouping sets with having clause can not resolve qualified col name
     add e029e89  [SPARK-33145][WEBUI] Fix when `Succeeded Jobs` has many child url elements, they will extend over the edge of the page

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/ui/static/spark-sql-viz.css       | 5 +
 .../main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)
[spark] branch master updated: [SPARK-33145][WEBUI] Fix when `Succeeded Jobs` has many child url elements, they will extend over the edge of the page
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new e029e89  [SPARK-33145][WEBUI] Fix when `Succeeded Jobs` has many child url elements, they will extend over the edge of the page

e029e89 is described below

commit e029e891abeb37f383e4d5237edf693c8ad53bed
Author: neko
AuthorDate: Fri Oct 16 23:13:22 2020 +0800

    [SPARK-33145][WEBUI] Fix when `Succeeded Jobs` has many child url elements, they will extend over the edge of the page

    ### What changes were proposed in this pull request?
    On the Execution web page, when `Succeeded Jobs` (or `Failed Jobs`) has many child url elements, they extend over the edge of the page.

    ### Why are the changes needed?
    To make the page more friendly.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Manual test; the result is shown below:
    ![fixed](https://user-images.githubusercontent.com/52202080/95977319-50734600-0e4b-11eb-93c0-b8deb565bcd8.png)

    Closes #30035 from akiyamaneko/sql_execution_job_overflow.
Authored-by: neko
Signed-off-by: Gengliang Wang
---
 .../org/apache/spark/sql/execution/ui/static/spark-sql-viz.css       | 5 +
 .../main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css
index 9a32b79..dbdbf9f 100644
--- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css
+++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.css
@@ -52,3 +52,8 @@
 .tooltip-inner {
   word-wrap:break-word;
 }
+
+/* Breaks the long job url list when showing Details for Query in SQL */
+.job-url {
+  word-wrap: break-word;
+}
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
index 76bc7fa..b15c70a 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
@@ -45,7 +45,7 @@ class ExecutionPage(parent: SQLTab) extends WebUIPage("execution") with Logging
           if (jobStatus == status) Some(jobId) else None
         }
         if (jobs.nonEmpty) {
-
+          {label} {jobs.toSeq.sorted.map { jobId =>
            {jobId.toString}
[spark] branch master updated (21e0dd04 -> d0dfe49)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 21e0dd04  [SPARK-32119][FOLLOWUP][DOC] Update monitoring doc following the improvement in SPARK-32119
     add d0dfe49   [MINOR][INFRA] Rename master.yml to build_and_test.yml

No new revisions were added by this update.

Summary of changes:
 .github/workflows/{master.yml => build_and_test.yml} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename .github/workflows/{master.yml => build_and_test.yml} (100%)