[spark] branch master updated (5d5866b -> b80309b)

2020-05-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5d5866b  [SPARK-31672][SQL] Fix loading of timestamps before 1582-10-15 from dictionary encoded Parquet columns
 add b80309b  [SPARK-31674][CORE][DOCS] Make Prometheus metric endpoints experimental

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/metrics/sink/PrometheusServlet.scala  | 3 +++
 .../scala/org/apache/spark/status/api/v1/PrometheusResource.scala | 3 +++
 docs/monitoring.md| 4 ++--
 3 files changed, 8 insertions(+), 2 deletions(-)





[spark] branch branch-3.0 updated: [SPARK-31674][CORE][DOCS] Make Prometheus metric endpoints experimental

2020-05-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
new e2bf140  [SPARK-31674][CORE][DOCS] Make Prometheus metric endpoints experimental
e2bf140 is described below

commit e2bf140c68ef38216167e0872b964c3964ca0d9f
Author: Dongjoon Hyun 
AuthorDate: Sun May 10 22:32:26 2020 -0700

[SPARK-31674][CORE][DOCS] Make Prometheus metric endpoints experimental

### What changes were proposed in this pull request?

This PR aims to make the new Prometheus-format metric endpoints experimental in Apache Spark 3.0.0.

### Why are the changes needed?

Although the new metrics are disabled by default, we had better mark them explicitly as experimental in Apache Spark 3.0.0 since the output format is not finalized yet. We can finalize it in Apache Spark 3.1.0.

### Does this PR introduce _any_ user-facing change?

Only the documentation change is visible to users.

### How was this patch tested?

Manually checked the code, since this is a documentation and class-annotation change.
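
For reference, a minimal sketch (not part of this patch) of how the experimental endpoint can be exercised once it is enabled. The localhost host, the default UI port 4040, and the `--conf spark.ui.prometheus.enabled=true` submit flag are assumptions based on the documented configuration:

```scala
// Scrape the experimental executor-metrics endpoint described in docs/monitoring.md.
// Assumptions (not from this patch): the driver UI listens on the default port 4040 and
// the application was submitted with --conf spark.ui.prometheus.enabled=true.
import scala.io.Source

object ScrapePrometheusEndpoint {
  def main(args: Array[String]): Unit = {
    val url = "http://localhost:4040/metrics/executors/prometheus"
    val body = Source.fromURL(url).mkString
    // Prometheus text exposition format: one "metric_name{labels} value" line per metric.
    body.split("\n").take(10).foreach(println)
  }
}
```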

Closes #28495 from dongjoon-hyun/SPARK-31674.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit b80309bdb4d26556bd3da6a61cac464cdbdd1fe1)
Signed-off-by: Dongjoon Hyun 
---
 .../main/scala/org/apache/spark/metrics/sink/PrometheusServlet.scala  | 3 +++
 .../scala/org/apache/spark/status/api/v1/PrometheusResource.scala | 3 +++
 docs/monitoring.md| 4 ++--
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/metrics/sink/PrometheusServlet.scala b/core/src/main/scala/org/apache/spark/metrics/sink/PrometheusServlet.scala
index 7c33bce..011c7bc 100644
--- a/core/src/main/scala/org/apache/spark/metrics/sink/PrometheusServlet.scala
+++ b/core/src/main/scala/org/apache/spark/metrics/sink/PrometheusServlet.scala
@@ -24,15 +24,18 @@ import com.codahale.metrics.MetricRegistry
 import org.eclipse.jetty.servlet.ServletContextHandler
 
 import org.apache.spark.{SecurityManager, SparkConf}
+import org.apache.spark.annotation.Experimental
 import org.apache.spark.ui.JettyUtils._
 
 /**
+ * :: Experimental ::
  * This exposes the metrics of the given registry with Prometheus format.
  *
  * The output is consistent with /metrics/json result in terms of item ordering
 * and with the previous result of Spark JMX Sink + Prometheus JMX Converter combination
  * in terms of key string format.
  */
+@Experimental
 private[spark] class PrometheusServlet(
 val property: Properties,
 val registry: MetricRegistry,
diff --git a/core/src/main/scala/org/apache/spark/status/api/v1/PrometheusResource.scala b/core/src/main/scala/org/apache/spark/status/api/v1/PrometheusResource.scala
index f9fb78e..2a5f151 100644
--- a/core/src/main/scala/org/apache/spark/status/api/v1/PrometheusResource.scala
+++ b/core/src/main/scala/org/apache/spark/status/api/v1/PrometheusResource.scala
@@ -23,15 +23,18 @@ import org.eclipse.jetty.servlet.{ServletContextHandler, ServletHolder}
 import org.glassfish.jersey.server.ServerProperties
 import org.glassfish.jersey.servlet.ServletContainer
 
+import org.apache.spark.annotation.Experimental
 import org.apache.spark.ui.SparkUI
 
 /**
+ * :: Experimental ::
  * This aims to expose Executor metrics like REST API which is documented in
  *
  *https://spark.apache.org/docs/3.0.0/monitoring.html#executor-metrics
  *
 * Note that this is based on ExecutorSummary which is different from ExecutorSource.
  */
+@Experimental
 @Path("/executors")
 private[v1] class PrometheusResource extends ApiRequestContext {
   @GET
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 7e41c9d..4da0f8e 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -715,7 +715,7 @@ A list of the available metrics, with a short description:
 Executor-level metrics are sent from each executor to the driver as part of the Heartbeat to describe the performance metrics of Executor itself like JVM heap memory, GC information.
 Executor metric values and their measured memory peak values per executor are exposed via the REST API in JSON format and in Prometheus format.
 The JSON end point is exposed at: `/applications/[app-id]/executors`, and the Prometheus endpoint at: `/metrics/executors/prometheus`.
-The Prometheus endpoint is conditional to a configuration parameter: `spark.ui.prometheus.enabled=true` (the default is `false`).
+The Prometheus endpoint is experimental and conditional to a configuration parameter: `spark.ui.prometheus.enabled=true` (the default is `false`).
 In addition, aggregated per-stage peak values of the executor memory metrics are written to the event log if `spark.eventLog.logStageExecutorMetrics` is true.
 Executor 

[spark] branch branch-3.0 updated: [SPARK-31672][SQL] Fix loading of timestamps before 1582-10-15 from dictionary encoded Parquet columns

2020-05-10 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 5c6a4fc  [SPARK-31672][SQL] Fix loading of timestamps before 1582-10-15 from dictionary encoded Parquet columns
5c6a4fc is described below

commit 5c6a4fc8a71fcca9110c8c18ebd44d935514fcc1
Author: Max Gekk 
AuthorDate: Mon May 11 04:58:08 2020 +

[SPARK-31672][SQL] Fix loading of timestamps before 1582-10-15 from dictionary encoded Parquet columns

Modified the `decodeDictionaryIds()` method of `VectorizedColumnReader` to handle `TimestampType` specifically when the passed parameter `rebaseDateTime` is true. In that case, decoded milliseconds/microseconds are rebased from the hybrid calendar to the Proleptic Gregorian calendar using `RebaseDateTime.rebaseJulianToGregorianMicros()`.
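
A rough sketch of the shape of that rebase step, reusing the Spark-internal helpers that appear in the diff below (`DateTimeUtils.fromMillis` and `RebaseDateTime.rebaseJulianToGregorianMicros`, assumed to live in `org.apache.spark.sql.catalyst.util`); illustrative only, not code added by this commit:

```scala
// Illustrative rebase of dictionary-decoded timestamp values, mirroring the logic
// described above. Package location of the helpers is assumed (catalyst.util).
import org.apache.spark.sql.catalyst.util.{DateTimeUtils, RebaseDateTime}

object RebaseSketch {
  // TIMESTAMP_MILLIS dictionary entries come out as Julian-calendar milliseconds.
  def rebaseMillisEntry(julianMillis: Long): Long = {
    val julianMicros = DateTimeUtils.fromMillis(julianMillis)   // millis -> micros
    RebaseDateTime.rebaseJulianToGregorianMicros(julianMicros)  // hybrid -> Proleptic Gregorian
  }

  // TIMESTAMP_MICROS dictionary entries are already microseconds and are rebased directly.
  def rebaseMicrosEntry(julianMicros: Long): Long =
    RebaseDateTime.rebaseJulianToGregorianMicros(julianMicros)
}
```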

This fixes the bug of loading timestamps before the cutover day from dictionary-encoded columns in Parquet files. The code below forces dictionary encoding:
```scala
scala> spark.conf.set("spark.sql.legacy.parquet.rebaseDateTimeInWrite.enabled", true)
scala> spark.conf.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MICROS")
scala> Seq.tabulate(8)(_ => "1001-01-01 01:02:03.123").toDF("tsS")
  .select($"tsS".cast("timestamp").as("ts")).repartition(1)
  .write
  .option("parquet.enable.dictionary", true)
  .parquet(path)
```
Load the timestamps back:
```scala
scala> spark.read.parquet(path).show(false)
+-----------------------+
|ts                     |
+-----------------------+
|1001-01-07 00:32:20.123|
...
|1001-01-07 00:32:20.123|
+-----------------------+
```
```
Expected values **must be 1001-01-01 01:02:03.123**, not 1001-01-07 00:32:20.123.

Yes. After the changes:
```scala
scala> spark.read.parquet(path).show(false)
+-----------------------+
|ts                     |
+-----------------------+
|1001-01-01 01:02:03.123|
...
|1001-01-01 01:02:03.123|
+-----------------------+
```
```

Modified the test `SPARK-31159: rebasing timestamps in write` in `ParquetIOSuite` to check reading from dictionary-encoded columns.
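
As a side note, the six-day date shift above can be reproduced with a small JDK-only sketch (not part of this patch): build the instant from wall-clock fields under the legacy hybrid Julian+Gregorian calendar, then render the same instant under the Proleptic Gregorian calendar used by Spark 3.0.

```scala
// Standalone illustration of the Julian -> Proleptic Gregorian skew that the rebase
// corrects. JDK only; no Spark classes involved.
import java.time.{Instant, ZoneOffset}
import java.util.{GregorianCalendar, TimeZone}

object CalendarSkew {
  def main(args: Array[String]): Unit = {
    // GregorianCalendar with the default 1582-10-15 cutover models the hybrid calendar
    // that legacy java.sql types (and old Parquet writers) used.
    val hybrid = new GregorianCalendar(TimeZone.getTimeZone("UTC"))
    hybrid.clear()
    hybrid.set(1001, 0, 1, 1, 2, 3) // 1001-01-01 01:02:03 under the hybrid (Julian) calendar
    val millis = hybrid.getTimeInMillis

    // Rendering the same millis with java.time (Proleptic Gregorian) prints
    // 1001-01-07T01:02:03Z -- the six-day date shift seen in the output above.
    println(Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC))
  }
}
```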

Closes #28489 from MaxGekk/fix-ts-rebase-parquet-dict-enc.

Authored-by: Max Gekk 
Signed-off-by: Wenchen Fan 
(cherry picked from commit 5d5866be12259c40972f7404f64d830cab87401f)
Signed-off-by: Wenchen Fan 
---
 .../parquet/VectorizedColumnReader.java| 31 +--
 .../datasources/parquet/ParquetIOSuite.scala   | 65 --
 2 files changed, 64 insertions(+), 32 deletions(-)

diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java b/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java
index 03056f5..11ce11d 100644
--- a/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java
+++ b/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java
@@ -159,7 +159,11 @@ public class VectorizedColumnReader {
 isSupported = originalType != OriginalType.DATE || !rebaseDateTime;
 break;
   case INT64:
-isSupported = originalType != OriginalType.TIMESTAMP_MILLIS;
+if (originalType == OriginalType.TIMESTAMP_MICROS) {
+  isSupported = !rebaseDateTime;
+} else {
+  isSupported = originalType != OriginalType.TIMESTAMP_MILLIS;
+}
 break;
   case FLOAT:
   case DOUBLE:
@@ -313,17 +317,36 @@ public class VectorizedColumnReader {
   case INT64:
 if (column.dataType() == DataTypes.LongType ||
 DecimalType.is64BitDecimalType(column.dataType()) ||
-originalType == OriginalType.TIMESTAMP_MICROS) {
+(originalType == OriginalType.TIMESTAMP_MICROS && !rebaseDateTime)) {
   for (int i = rowId; i < rowId + num; ++i) {
 if (!column.isNullAt(i)) {
  column.putLong(i, dictionary.decodeToLong(dictionaryIds.getDictId(i)));
 }
   }
 } else if (originalType == OriginalType.TIMESTAMP_MILLIS) {
+  if (rebaseDateTime) {
+for (int i = rowId; i < rowId + num; ++i) {
+  if (!column.isNullAt(i)) {
+long julianMillis = dictionary.decodeToLong(dictionaryIds.getDictId(i));
+long julianMicros = DateTimeUtils.fromMillis(julianMillis);
+long gregorianMicros = RebaseDateTime.rebaseJulianToGregorianMicros(julianMicros);
+column.putLong(i, gregorianMicros);
+  }
+}
+  } else {
+for (int i = rowId; i < rowId + num; ++i) {
+  if (!column.isNullAt(i)) {
+

[spark] branch master updated (9f768fa -> 5d5866b)

2020-05-10 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 9f768fa  [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps
 add 5d5866b  [SPARK-31672][SQL] Fix loading of timestamps before 1582-10-15 from dictionary encoded Parquet columns

No new revisions were added by this update.

Summary of changes:
 .../parquet/VectorizedColumnReader.java| 31 +--
 .../datasources/parquet/ParquetIOSuite.scala   | 65 --
 2 files changed, 64 insertions(+), 32 deletions(-)





[spark] branch master updated (a75dc80 -> 9f768fa)

2020-05-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from a75dc80  [SPARK-31636][SQL][DOCS] Remove HTML syntax in SQL reference
 add 9f768fa  [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/RandomDataGenerator.scala | 23 +++---
 1 file changed, 20 insertions(+), 3 deletions(-)





[spark] branch branch-3.0 updated: [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps

2020-05-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 6f7c719  [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps
6f7c719 is described below

commit 6f7c71947073f147bc35da196139d5ceb6fbdf45
Author: Max Gekk 
AuthorDate: Sun May 10 14:22:12 2020 -0500

[SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps

### What changes were proposed in this pull request?
Shift non-existing dates in the Proleptic Gregorian calendar by 1 day. The reason is that `RowEncoderSuite` generates random dates/timestamps in the hybrid calendar, and some of them, like 1000-02-29, don't exist in the Proleptic Gregorian calendar because 1000 is not a leap year there.

### Why are the changes needed?
This makes `RowEncoderSuite` much more stable.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By running `RowEncoderSuite` and by setting a non-existing date manually:
```scala
val date = new java.sql.Date(1000 - 1900, 1, 29)
Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MILLIS_PER_DAY))
```
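
A self-contained variant of that manual check (JDK only, with `MILLIS_PER_DAY` inlined as an assumption; not part of the patch), showing why the one-day shift is needed:

```scala
// 1000 is a leap year in the Julian calendar but not in the Proleptic Gregorian one,
// so the hybrid-calendar date 1000-02-29 has no LocalDate counterpart and
// toLocalDate throws. The generator's fallback shifts such dates by one day.
import java.sql.Date
import scala.util.Try

object LeapYearSkew {
  def main(args: Array[String]): Unit = {
    val millisPerDay = 86400000L                 // inlined DateTimeConstants.MILLIS_PER_DAY
    val date = new Date(1000 - 1900, 1, 29)      // 1000-02-29 in the hybrid calendar
    println(Try(date.toLocalDate))               // Failure(java.time.DateTimeException)
    // Keep the date if it converts cleanly, otherwise shift it by one day.
    val valid = Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + millisPerDay))
    println(valid)                               // 1000-03-01
  }
}
```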

Closes #28486 from MaxGekk/fix-RowEncoderSuite.

Authored-by: Max Gekk 
Signed-off-by: Sean Owen 
(cherry picked from commit 9f768fa9916dec3cc695e3f28ec77148d81d335f)
Signed-off-by: Sean Owen 
---
 .../org/apache/spark/sql/RandomDataGenerator.scala | 23 +++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
index a7c20c3..5a4d23d 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
@@ -18,9 +18,10 @@
 package org.apache.spark.sql
 
 import java.math.MathContext
+import java.sql.{Date, Timestamp}
 
 import scala.collection.mutable
-import scala.util.Random
+import scala.util.{Random, Try}
 
 import org.apache.spark.sql.catalyst.CatalystTypeConverters
 import org.apache.spark.sql.catalyst.util.DateTimeConstants.MILLIS_PER_DAY
@@ -172,7 +173,15 @@ object RandomDataGenerator {
  // January 1, 1970, 00:00:00 GMT for "-12-31 23:59:59.99".
   milliseconds = rand.nextLong() % 25340232959L
 }
-DateTimeUtils.toJavaDate((milliseconds / MILLIS_PER_DAY).toInt)
+val date = DateTimeUtils.toJavaDate((milliseconds / MILLIS_PER_DAY).toInt)
+// The generated `date` is based on the hybrid calendar Julian + Gregorian since
+// 1582-10-15 but it should be valid in Proleptic Gregorian calendar too which is used
+// by Spark SQL since version 3.0 (see SPARK-26651). We try to convert `date` to
+// a local date in Proleptic Gregorian calendar to satisfy this requirement.
+// Some years are leap years in Julian calendar but not in Proleptic Gregorian calendar.
+// As the consequence of that, 29 February of such years might not exist in Proleptic
+// Gregorian calendar. When this happens, we shift the date by one day.
+Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MILLIS_PER_DAY))
   }
 Some(generator)
   case TimestampType =>
@@ -188,7 +197,15 @@ object RandomDataGenerator {
   milliseconds = rand.nextLong() % 25340232959L
 }
 // DateTimeUtils.toJavaTimestamp takes microsecond.
-DateTimeUtils.toJavaTimestamp(milliseconds * 1000)
+val ts = DateTimeUtils.toJavaTimestamp(milliseconds * 1000)
+// The generated `ts` is based on the hybrid calendar Julian + Gregorian since
+// 1582-10-15 but it should be valid in Proleptic Gregorian calendar too which is used
+// by Spark SQL since version 3.0 (see SPARK-26651). We try to convert `ts` to
+// a local timestamp in Proleptic Gregorian calendar to satisfy this requirement.
+// Some years are leap years in Julian calendar but not in Proleptic Gregorian calendar.
+// As the consequence of that, 29 February of such years might not exist in Proleptic
+// Gregorian calendar. When this happens, we shift the timestamp `ts` by one day.
+Try { ts.toLocalDateTime; ts }.getOrElse(new Timestamp(ts.getTime + MILLIS_PER_DAY))
   }
 Some(generator)
   case CalendarIntervalType => Some(() => {



[spark] branch master updated (ce63bef -> a75dc80)

2020-05-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ce63bef  [SPARK-31662][SQL] Fix loading of dates before 1582-10-15 from dictionary encoded Parquet columns
 add a75dc80  [SPARK-31636][SQL][DOCS] Remove HTML syntax in SQL reference

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml   |  20 +-
 docs/sql-ref-ansi-compliance.md|  18 +-
 docs/sql-ref-datatypes.md  |   4 +-
 docs/sql-ref-functions-builtin.md  |   2 +-
 docs/sql-ref-functions-udf-aggregate.md| 101 
 docs/sql-ref-functions-udf-hive.md |  12 +-
 docs/sql-ref-functions-udf-scalar.md   |  28 +-
 docs/sql-ref-identifier.md |  37 ++-
 docs/sql-ref-literals.md   | 282 +
 docs/sql-ref-null-semantics.md |  44 ++--
 docs/sql-ref-syntax-aux-analyze-table.md   |  64 ++---
 docs/sql-ref-syntax-aux-cache-cache-table.md   |  98 +++
 docs/sql-ref-syntax-aux-cache-clear-cache.md   |  16 +-
 docs/sql-ref-syntax-aux-cache-refresh.md   |  24 +-
 docs/sql-ref-syntax-aux-cache-uncache-table.md |  31 +--
 docs/sql-ref-syntax-aux-conf-mgmt-reset.md |  10 +-
 docs/sql-ref-syntax-aux-conf-mgmt-set.md   |  31 +--
 docs/sql-ref-syntax-aux-describe-database.md   |  21 +-
 docs/sql-ref-syntax-aux-describe-function.md   |  30 +--
 docs/sql-ref-syntax-aux-describe-query.md  |  44 ++--
 docs/sql-ref-syntax-aux-describe-table.md  |  62 ++---
 docs/sql-ref-syntax-aux-refresh-table.md   |  31 +--
 docs/sql-ref-syntax-aux-resource-mgmt-add-file.md  |  21 +-
 docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md   |  21 +-
 docs/sql-ref-syntax-aux-resource-mgmt-list-file.md |  14 +-
 docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md  |  14 +-
 docs/sql-ref-syntax-aux-show-columns.md|   2 +-
 docs/sql-ref-syntax-aux-show-create-table.md   |  27 +-
 docs/sql-ref-syntax-aux-show-databases.md  |  32 +--
 docs/sql-ref-syntax-aux-show-functions.md  |  60 ++---
 docs/sql-ref-syntax-aux-show-partitions.md |  47 ++--
 docs/sql-ref-syntax-aux-show-table.md  |  60 ++---
 docs/sql-ref-syntax-aux-show-tables.md |  41 ++-
 docs/sql-ref-syntax-aux-show-tblproperties.md  |  51 ++--
 docs/sql-ref-syntax-aux-show-views.md  |  45 ++--
 docs/sql-ref-syntax-aux-show.md|   4 +-
 docs/sql-ref-syntax-ddl-alter-database.md  |  17 +-
 docs/sql-ref-syntax-ddl-alter-table.md | 256 ---
 docs/sql-ref-syntax-ddl-alter-view.md  | 124 -
 docs/sql-ref-syntax-ddl-create-database.md |  39 +--
 docs/sql-ref-syntax-ddl-create-function.md |  85 +++
 docs/sql-ref-syntax-ddl-create-table-datasource.md | 100 
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md |  99 
 docs/sql-ref-syntax-ddl-create-table-like.md   |  73 +++---
 docs/sql-ref-syntax-ddl-create-table.md|  10 +-
 docs/sql-ref-syntax-ddl-create-view.md |  82 +++---
 docs/sql-ref-syntax-ddl-drop-database.md   |  42 ++-
 docs/sql-ref-syntax-ddl-drop-function.md   |  55 ++--
 docs/sql-ref-syntax-ddl-drop-table.md  |  45 ++--
 docs/sql-ref-syntax-ddl-drop-view.md   |  49 ++--
 docs/sql-ref-syntax-ddl-repair-table.md|  25 +-
 docs/sql-ref-syntax-ddl-truncate-table.md  |  43 ++--
 docs/sql-ref-syntax-dml-insert-into.md |  90 +++
 ...f-syntax-dml-insert-overwrite-directory-hive.md |  75 +++---
 ...ql-ref-syntax-dml-insert-overwrite-directory.md |  74 +++---
 docs/sql-ref-syntax-dml-insert-overwrite-table.md  |  87 +++
 docs/sql-ref-syntax-dml-insert.md  |   8 +-
 docs/sql-ref-syntax-dml-load.md|  67 ++---
 docs/sql-ref-syntax-dml.md |   4 +-
 docs/sql-ref-syntax-qry-explain.md |  58 ++---
 docs/sql-ref-syntax-qry-sampling.md|  20 +-
 docs/sql-ref-syntax-qry-select-clusterby.md|  33 ++-
 docs/sql-ref-syntax-qry-select-cte.md  |  35 ++-
 docs/sql-ref-syntax-qry-select-distribute-by.md|  33 ++-
 docs/sql-ref-syntax-qry-select-groupby.md  | 261 ++-
 docs/sql-ref-syntax-qry-select-having.md   |  54 ++--
 docs/sql-ref-syntax-qry-select-hints.md|  56 ++--
 docs/sql-ref-syntax-qry-select-inline-table.md |  35 +--
 docs/sql-ref-syntax-qry-select-join.md | 185 ++
 docs/sql-ref-syntax-qry-select-like.md |  51 ++--
 docs/sql-ref-syntax-qry-select-limit.md|  41 ++-
 docs/sql-ref-syntax-qry-select-orderby.md