This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new d1aa7d6c4427 [SPARK-54713][SQL] Add vector similarity/distance 
function expressions support
d1aa7d6c4427 is described below

commit d1aa7d6c4427be8444450af1c1894704575dccb2
Author: zhidongqu-db <[email protected]>
AuthorDate: Wed Jan 7 16:03:47 2026 +0800

    [SPARK-54713][SQL] Add vector similarity/distance function expressions 
support
    
    ### What changes were proposed in this pull request?
    
    This PR adds support for vector distance/similarity functions to Spark SQL, 
enabling efficient computation of common distance/similarity metrics on 
embedding vectors.
    
    Similarity Functions
    - vector_cosine_similarity(vector1, vector2) - Returns the cosine 
similarity between two vectors (range: -1.0 to 1.0)
    - vector_inner_product(vector1, vector2) - Returns the inner product (dot 
product) between two vectors
    
    Distance Functions
    - vector_l2_distance(vector1, vector2) - Returns the Euclidean (L2) 
distance between two vectors
    
    Key implementation details:
    - Type Safety: All functions accept only ARRAY<FLOAT> inputs and return 
FLOAT. No implicit type casting is performed - passing ARRAY<DOUBLE> or 
ARRAY<INT> results in a DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE error.
    - Dimension Validation: Vectors must have matching dimensions; mismatches
throw a VECTOR_DIMENSION_MISMATCH error with a clear message.
    - NULL Handling: NULL array inputs return NULL. Arrays containing NULL 
elements also return NULL.
    - Edge Cases: Empty vectors return NULL for cosine similarity (undefined)
and 0.0 for inner product and L2 distance. Zero-magnitude vectors return NULL
for cosine similarity (see the SQL sketch after this list).
    - Code Generation: Optimized codegen with manual loop unrolling (8 elements
at a time) to enable potential JIT SIMD vectorization. This could evolve to
use the Java Vector API in the future.
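
    For reference, a short SQL sketch of the NULL and edge-case behavior described
above. The queries mirror cases in the new golden-file tests, and the expected
results follow the description in this list:

    ```
    -- An array containing a NULL element yields a NULL result
    SELECT vector_inner_product(array(1.0F, CAST(NULL AS FLOAT), 3.0F), array(1.0F, 2.0F, 3.0F));
    -- Returns: NULL

    -- Zero-magnitude vector: cosine similarity is undefined, so NULL is returned
    SELECT vector_cosine_similarity(array(0.0F, 0.0F, 0.0F), array(1.0F, 2.0F, 3.0F));
    -- Returns: NULL

    -- Empty vectors (cast to ARRAY<FLOAT>) return 0.0 for inner product and L2 distance
    SELECT vector_l2_distance(CAST(array() AS ARRAY<FLOAT>), CAST(array() AS ARRAY<FLOAT>));
    -- Returns: 0.0
    ```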
    
    This PR only includes SQL language support; the DataFrame API needs to be
added in a separate PR.
    
    ### Why are the changes needed?
    
    Vector distance/similarity functions are fundamental operations for:
    - Similarity search: Finding similar items in recommendation systems
    - Clustering: K-means and other distance-based clustering algorithms
    - Categorization: Categorizing text/images into labels based on embeddings
    
    These functions are commonly available in other OLAP/OLTP systems
(BigQuery, PostgreSQL pgvector, DuckDB, etc.) and are essential for modern
ML/AI workloads in Spark SQL.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, this PR introduces 3 new SQL functions:
    
    ```
    -- Cosine similarity (1.0 = identical, 0.0 = orthogonal, -1.0 = opposite)
    SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F));
    -- Returns: 0.97463185

    -- Inner product (dot product)
    SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F));
    -- Returns: 32.0

    -- L2 (Euclidean) distance
    SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F));
    -- Returns: 5.196152
    ```
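
    As a quick sanity check (not part of the patch), the cosine similarity value
above can be reproduced with plain SQL arithmetic:

    ```
    -- dot(v1, v2) = 1*4 + 2*5 + 3*6 = 32
    -- |v1| = sqrt(1 + 4 + 9) = sqrt(14), |v2| = sqrt(16 + 25 + 36) = sqrt(77)
    SELECT 32.0 / SQRT(14.0 * 77.0);
    -- Returns: ~0.9746318, matching vector_cosine_similarity above
    ```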
    
    ### How was this patch tested?
    
    SQL Golden File Tests: Added vector-distance.sql with test coverage:
    - Basic functionality tests for all three functions
    - Mathematical correctness validation (identical vectors, orthogonal 
vectors, known values)
    - Empty vector handling
    - Zero-magnitude vector handling
    - NULL array input handling
    - NULL element within array handling
    - Dimension mismatch error cases (see the sketch after this list)
    - Type mismatch error cases (ARRAY<DOUBLE>, ARRAY<INT>)
    - Large vector tests (16 elements to exercise the unrolled loop path)
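
    For illustration, a dimension-mismatch case from the test file and the error
it is expected to raise. The message text is assembled from the new
VECTOR_DIMENSION_MISMATCH entry in error-conditions.json, so the exact rendering
may differ slightly:

    ```
    -- Mismatched dimensions raise a runtime error rather than returning NULL
    SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F));
    -- Error: [VECTOR_DIMENSION_MISMATCH] Vectors passed to `vector_l2_distance`
    -- must have the same dimension, but got 3 and 2. (SQLSTATE 22000)
    ```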
    
    Expression Schema Tests: Updated sql-expression-schema.md via 
ExpressionsSchemaSuite.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Yes, code assistance with Claude Opus 4.5 in combination with manual
editing by the author.
    
    Closes #53481 from zhidongqu-db/distance-functions.
    
    Authored-by: zhidongqu-db <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
---
 .../src/main/resources/error/error-conditions.json |   6 +
 .../scala/org/apache/spark/sql/functions.scala     |   1 +
 .../sql/catalyst/expressions/ExpressionInfo.java   |   3 +-
 .../expressions/VectorFunctionImplUtils.java       | 247 +++++++
 .../sql/catalyst/analysis/FunctionRegistry.scala   |   5 +
 .../catalyst/expressions/vectorExpressions.scala   | 198 ++++++
 .../spark/sql/errors/QueryExecutionErrors.scala    |  12 +
 .../sql-functions/sql-expression-schema.md         |   3 +
 .../analyzer-results/vector-distance.sql.out       | 656 ++++++++++++++++++
 .../resources/sql-tests/inputs/vector-distance.sql | 132 ++++
 .../sql-tests/results/vector-distance.sql.out      | 746 +++++++++++++++++++++
 .../sql/expressions/ExpressionInfoSuite.scala      |   2 +-
 12 files changed, 2009 insertions(+), 2 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-conditions.json 
b/common/utils/src/main/resources/error/error-conditions.json
index b101782e28c2..18ef49f01bef 100644
--- a/common/utils/src/main/resources/error/error-conditions.json
+++ b/common/utils/src/main/resources/error/error-conditions.json
@@ -7483,6 +7483,12 @@
     ],
     "sqlState" : "22023"
   },
+  "VECTOR_DIMENSION_MISMATCH" : {
+    "message" : [
+      "Vectors passed to <functionName> must have the same dimension, but got 
<leftDim> and <rightDim>."
+    ],
+    "sqlState" : "22000"
+  },
   "VIEW_ALREADY_EXISTS" : {
     "message" : [
       "Cannot create view <relationName> because it already exists.",
diff --git a/sql/api/src/main/scala/org/apache/spark/sql/functions.scala 
b/sql/api/src/main/scala/org/apache/spark/sql/functions.scala
index 78675364d841..c4eaca3ac7a6 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/functions.scala
@@ -75,6 +75,7 @@ import org.apache.spark.util.SparkClassUtils
  * @groupname csv_funcs CSV functions
  * @groupname json_funcs JSON functions
  * @groupname variant_funcs VARIANT functions
+ * @groupname vector_funcs Vector functions
  * @groupname xml_funcs XML functions
  * @groupname url_funcs URL functions
  * @groupname partition_transforms Partition transform functions
diff --git 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionInfo.java
 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionInfo.java
index dd56c650c073..592230cc5b10 100644
--- 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionInfo.java
+++ 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/ExpressionInfo.java
@@ -48,7 +48,8 @@ public class ExpressionInfo {
             "collection_funcs", "predicate_funcs", "conditional_funcs", 
"conversion_funcs",
             "csv_funcs", "datetime_funcs", "generator_funcs", "hash_funcs", 
"json_funcs",
             "lambda_funcs", "map_funcs", "math_funcs", "misc_funcs", 
"string_funcs", "struct_funcs",
-            "window_funcs", "xml_funcs", "table_funcs", "url_funcs", 
"variant_funcs", "st_funcs"));
+            "window_funcs", "xml_funcs", "table_funcs", "url_funcs", 
"variant_funcs",
+            "vector_funcs", "st_funcs"));
 
     private static final Set<String> validSources =
             new HashSet<>(Arrays.asList("built-in", "hive", "python_udf", 
"scala_udf", "sql_udf",
diff --git 
a/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/VectorFunctionImplUtils.java
 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/VectorFunctionImplUtils.java
new file mode 100644
index 000000000000..59811298360f
--- /dev/null
+++ 
b/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/VectorFunctionImplUtils.java
@@ -0,0 +1,247 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions;
+
+import org.apache.spark.sql.catalyst.util.ArrayData;
+import org.apache.spark.sql.errors.QueryExecutionErrors;
+import org.apache.spark.unsafe.types.UTF8String;
+
+/**
+ * A utility class for vector similarity/distance function implementations.
+ */
+public class VectorFunctionImplUtils {
+
+  /**
+   * Computes the cosine similarity between two float vectors.
+   * Returns NULL if either vector contains NULL elements, has zero magnitude, 
or is empty.
+   * Throws an exception if vectors have different dimensions.
+   * Uses manual loop unrolling (8 elements at a time) for speculative SIMD 
optimization.
+   */
+  public static Float vectorCosineSimilarity(ArrayData left, ArrayData right, 
UTF8String funcName) {
+    int leftLen = left.numElements();
+    int rightLen = right.numElements();
+
+    if (leftLen != rightLen) {
+      throw QueryExecutionErrors.vectorDimensionMismatchError(
+          funcName.toString(), leftLen, rightLen);
+    }
+
+    if (leftLen == 0) {
+      return null;
+    }
+
+    double dotProduct = 0.0;
+    double norm1Sq = 0.0;
+    double norm2Sq = 0.0;
+
+    int i = 0;
+    int simdLimit = (leftLen / 8) * 8;
+
+    // Manual unroll loop - process 8 floats at a time for speculative SIMD 
optimization
+    while (i < simdLimit) {
+      // Check for nulls in batch
+      if (left.isNullAt(i) || left.isNullAt(i + 1) ||
+          left.isNullAt(i + 2) || left.isNullAt(i + 3) ||
+          left.isNullAt(i + 4) || left.isNullAt(i + 5) ||
+          left.isNullAt(i + 6) || left.isNullAt(i + 7) ||
+          right.isNullAt(i) || right.isNullAt(i + 1) ||
+          right.isNullAt(i + 2) || right.isNullAt(i + 3) ||
+          right.isNullAt(i + 4) || right.isNullAt(i + 5) ||
+          right.isNullAt(i + 6) || right.isNullAt(i + 7)) {
+        return null;
+      }
+
+      float a0 = left.getFloat(i), a1 = left.getFloat(i + 1);
+      float a2 = left.getFloat(i + 2), a3 = left.getFloat(i + 3);
+      float a4 = left.getFloat(i + 4), a5 = left.getFloat(i + 5);
+      float a6 = left.getFloat(i + 6), a7 = left.getFloat(i + 7);
+
+      float b0 = right.getFloat(i), b1 = right.getFloat(i + 1);
+      float b2 = right.getFloat(i + 2), b3 = right.getFloat(i + 3);
+      float b4 = right.getFloat(i + 4), b5 = right.getFloat(i + 5);
+      float b6 = right.getFloat(i + 6), b7 = right.getFloat(i + 7);
+
+      dotProduct += (double) (a0 * b0 + a1 * b1 + a2 * b2 + a3 * b3 +
+                              a4 * b4 + a5 * b5 + a6 * b6 + a7 * b7);
+      norm1Sq += (double) (a0 * a0 + a1 * a1 + a2 * a2 + a3 * a3 +
+                           a4 * a4 + a5 * a5 + a6 * a6 + a7 * a7);
+      norm2Sq += (double) (b0 * b0 + b1 * b1 + b2 * b2 + b3 * b3 +
+                           b4 * b4 + b5 * b5 + b6 * b6 + b7 * b7);
+      i += 8;
+    }
+
+    // Handle remaining elements
+    while (i < leftLen) {
+      if (left.isNullAt(i) || right.isNullAt(i)) {
+        return null;
+      }
+      float a = left.getFloat(i);
+      float b = right.getFloat(i);
+      dotProduct += (double) (a * b);
+      norm1Sq += (double) (a * a);
+      norm2Sq += (double) (b * b);
+      i++;
+    }
+
+    double normProduct = Math.sqrt(norm1Sq * norm2Sq);
+    if (normProduct == 0.0) {
+      return null;
+    }
+    return (float) (dotProduct / normProduct);
+  }
+
+  /**
+   * Computes the inner product (dot product) between two float vectors.
+   * Returns NULL if either vector contains NULL elements.
+   * Returns 0.0 for empty vectors.
+   * Throws an exception if vectors have different dimensions.
+   * Uses manual loop unrolling (8 elements at a time) for speculative SIMD 
optimization.
+   */
+  public static Float vectorInnerProduct(ArrayData left, ArrayData right, 
UTF8String funcName) {
+    int leftLen = left.numElements();
+    int rightLen = right.numElements();
+
+    if (leftLen != rightLen) {
+      throw QueryExecutionErrors.vectorDimensionMismatchError(
+          funcName.toString(), leftLen, rightLen);
+    }
+
+    if (leftLen == 0) {
+      return 0.0f;
+    }
+
+    double dotProduct = 0.0;
+
+    int i = 0;
+    int simdLimit = (leftLen / 8) * 8;
+
+    // Manual unroll loop - process 8 floats at a time for speculative SIMD 
optimization
+    while (i < simdLimit) {
+      // Check for nulls in batch
+      if (left.isNullAt(i) || left.isNullAt(i + 1) ||
+          left.isNullAt(i + 2) || left.isNullAt(i + 3) ||
+          left.isNullAt(i + 4) || left.isNullAt(i + 5) ||
+          left.isNullAt(i + 6) || left.isNullAt(i + 7) ||
+          right.isNullAt(i) || right.isNullAt(i + 1) ||
+          right.isNullAt(i + 2) || right.isNullAt(i + 3) ||
+          right.isNullAt(i + 4) || right.isNullAt(i + 5) ||
+          right.isNullAt(i + 6) || right.isNullAt(i + 7)) {
+        return null;
+      }
+
+      float a0 = left.getFloat(i), a1 = left.getFloat(i + 1);
+      float a2 = left.getFloat(i + 2), a3 = left.getFloat(i + 3);
+      float a4 = left.getFloat(i + 4), a5 = left.getFloat(i + 5);
+      float a6 = left.getFloat(i + 6), a7 = left.getFloat(i + 7);
+
+      float b0 = right.getFloat(i), b1 = right.getFloat(i + 1);
+      float b2 = right.getFloat(i + 2), b3 = right.getFloat(i + 3);
+      float b4 = right.getFloat(i + 4), b5 = right.getFloat(i + 5);
+      float b6 = right.getFloat(i + 6), b7 = right.getFloat(i + 7);
+
+      dotProduct += (double) (a0 * b0 + a1 * b1 + a2 * b2 + a3 * b3 +
+                              a4 * b4 + a5 * b5 + a6 * b6 + a7 * b7);
+      i += 8;
+    }
+
+    // Handle remaining elements
+    while (i < leftLen) {
+      if (left.isNullAt(i) || right.isNullAt(i)) {
+        return null;
+      }
+      float a = left.getFloat(i);
+      float b = right.getFloat(i);
+      dotProduct += (double) (a * b);
+      i++;
+    }
+
+    return (float) dotProduct;
+  }
+
+  /**
+   * Computes the Euclidean (L2) distance between two float vectors.
+   * Returns NULL if either vector contains NULL elements.
+   * Returns 0.0 for empty vectors.
+   * Throws an exception if vectors have different dimensions.
+   * Uses manual loop unrolling (8 elements at a time) for speculative SIMD 
optimization.
+   */
+  public static Float vectorL2Distance(ArrayData left, ArrayData right, 
UTF8String funcName) {
+    int leftLen = left.numElements();
+    int rightLen = right.numElements();
+
+    if (leftLen != rightLen) {
+      throw QueryExecutionErrors.vectorDimensionMismatchError(
+          funcName.toString(), leftLen, rightLen);
+    }
+
+    if (leftLen == 0) {
+      return 0.0f;
+    }
+
+    double sumSq = 0.0;
+
+    int i = 0;
+    int simdLimit = (leftLen / 8) * 8;
+
+    // Manual unroll loop - process 8 floats at a time for speculative SIMD 
optimization
+    while (i < simdLimit) {
+      // Check for nulls in batch
+      if (left.isNullAt(i) || left.isNullAt(i + 1) ||
+          left.isNullAt(i + 2) || left.isNullAt(i + 3) ||
+          left.isNullAt(i + 4) || left.isNullAt(i + 5) ||
+          left.isNullAt(i + 6) || left.isNullAt(i + 7) ||
+          right.isNullAt(i) || right.isNullAt(i + 1) ||
+          right.isNullAt(i + 2) || right.isNullAt(i + 3) ||
+          right.isNullAt(i + 4) || right.isNullAt(i + 5) ||
+          right.isNullAt(i + 6) || right.isNullAt(i + 7)) {
+        return null;
+      }
+
+      float a0 = left.getFloat(i), a1 = left.getFloat(i + 1);
+      float a2 = left.getFloat(i + 2), a3 = left.getFloat(i + 3);
+      float a4 = left.getFloat(i + 4), a5 = left.getFloat(i + 5);
+      float a6 = left.getFloat(i + 6), a7 = left.getFloat(i + 7);
+
+      float b0 = right.getFloat(i), b1 = right.getFloat(i + 1);
+      float b2 = right.getFloat(i + 2), b3 = right.getFloat(i + 3);
+      float b4 = right.getFloat(i + 4), b5 = right.getFloat(i + 5);
+      float b6 = right.getFloat(i + 6), b7 = right.getFloat(i + 7);
+
+      float d0 = a0 - b0, d1 = a1 - b1, d2 = a2 - b2, d3 = a3 - b3;
+      float d4 = a4 - b4, d5 = a5 - b5, d6 = a6 - b6, d7 = a7 - b7;
+
+      sumSq += (double) (d0 * d0 + d1 * d1 + d2 * d2 + d3 * d3 +
+                         d4 * d4 + d5 * d5 + d6 * d6 + d7 * d7);
+      i += 8;
+    }
+
+    // Handle remaining elements
+    while (i < leftLen) {
+      if (left.isNullAt(i) || right.isNullAt(i)) {
+        return null;
+      }
+      float a = left.getFloat(i);
+      float b = right.getFloat(i);
+      float diff = a - b;
+      sumSq += (double) (diff * diff);
+      i++;
+    }
+
+    return (float) Math.sqrt(sumSq);
+  }
+}
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index cc8ac07126cd..4faac1a8b6df 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -553,6 +553,11 @@ object FunctionRegistry {
     expression[KllSketchGetRankFloat]("kll_sketch_get_rank_float"),
     expression[KllSketchGetRankDouble]("kll_sketch_get_rank_double"),
 
+    // vector functions
+    expression[VectorCosineSimilarity]("vector_cosine_similarity"),
+    expression[VectorInnerProduct]("vector_inner_product"),
+    expression[VectorL2Distance]("vector_l2_distance"),
+
     // string functions
     expression[Ascii]("ascii"),
     expression[Chr]("char", true),
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/vectorExpressions.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/vectorExpressions.scala
new file mode 100644
index 000000000000..0b658eb7d5ea
--- /dev/null
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/vectorExpressions.scala
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult
+import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.DataTypeMismatch
+import org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke
+import org.apache.spark.sql.errors.QueryErrorsBase
+import org.apache.spark.sql.types.{ArrayType, FloatType, StringType}
+
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+    _FUNC_(array1, array2) - Returns the cosine similarity between two float 
vectors.
+    The vectors must have the same dimension.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F));
+       0.97463185
+  """,
+  since = "4.2.0",
+  group = "vector_funcs"
+)
+// scalastyle:on line.size.limit
+case class VectorCosineSimilarity(left: Expression, right: Expression)
+    extends RuntimeReplaceable with QueryErrorsBase {
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    (left.dataType, right.dataType) match {
+      case (ArrayType(FloatType, _), ArrayType(FloatType, _)) =>
+        TypeCheckResult.TypeCheckSuccess
+      case (ArrayType(FloatType, _), _) =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> ordinalNumber(1),
+            "requiredType" -> toSQLType(ArrayType(FloatType)),
+            "inputSql" -> toSQLExpr(right),
+            "inputType" -> toSQLType(right.dataType)))
+      case _ =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> ordinalNumber(0),
+            "requiredType" -> toSQLType(ArrayType(FloatType)),
+            "inputSql" -> toSQLExpr(left),
+            "inputType" -> toSQLType(left.dataType)))
+    }
+  }
+
+  override lazy val replacement: Expression = StaticInvoke(
+    classOf[VectorFunctionImplUtils],
+    FloatType,
+    "vectorCosineSimilarity",
+    Seq(left, right, Literal(prettyName)),
+    Seq(ArrayType(FloatType), ArrayType(FloatType), StringType))
+
+  override def prettyName: String = "vector_cosine_similarity"
+
+  override def children: Seq[Expression] = Seq(left, right)
+
+  override protected def withNewChildrenInternal(
+      newChildren: IndexedSeq[Expression]): VectorCosineSimilarity = {
+    copy(left = newChildren(0), right = newChildren(1))
+  }
+}
+
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+    _FUNC_(array1, array2) - Returns the inner product (dot product) between 
two float vectors.
+    The vectors must have the same dimension.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F));
+       32.0
+  """,
+  since = "4.2.0",
+  group = "vector_funcs"
+)
+// scalastyle:on line.size.limit
+case class VectorInnerProduct(left: Expression, right: Expression)
+    extends RuntimeReplaceable with QueryErrorsBase {
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    (left.dataType, right.dataType) match {
+      case (ArrayType(FloatType, _), ArrayType(FloatType, _)) =>
+        TypeCheckResult.TypeCheckSuccess
+      case (ArrayType(FloatType, _), _) =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> ordinalNumber(1),
+            "requiredType" -> toSQLType(ArrayType(FloatType)),
+            "inputSql" -> toSQLExpr(right),
+            "inputType" -> toSQLType(right.dataType)))
+      case _ =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> ordinalNumber(0),
+            "requiredType" -> toSQLType(ArrayType(FloatType)),
+            "inputSql" -> toSQLExpr(left),
+            "inputType" -> toSQLType(left.dataType)))
+    }
+  }
+
+  override lazy val replacement: Expression = StaticInvoke(
+    classOf[VectorFunctionImplUtils],
+    FloatType,
+    "vectorInnerProduct",
+    Seq(left, right, Literal(prettyName)),
+    Seq(ArrayType(FloatType), ArrayType(FloatType), StringType))
+
+  override def prettyName: String = "vector_inner_product"
+
+  override def children: Seq[Expression] = Seq(left, right)
+
+  override protected def withNewChildrenInternal(
+      newChildren: IndexedSeq[Expression]): VectorInnerProduct = {
+    copy(left = newChildren(0), right = newChildren(1))
+  }
+}
+
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+    _FUNC_(array1, array2) - Returns the Euclidean (L2) distance between two 
float vectors.
+    The vectors must have the same dimension.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F));
+       5.196152
+  """,
+  since = "4.2.0",
+  group = "vector_funcs"
+)
+// scalastyle:on line.size.limit
+case class VectorL2Distance(left: Expression, right: Expression)
+    extends RuntimeReplaceable with QueryErrorsBase {
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    (left.dataType, right.dataType) match {
+      case (ArrayType(FloatType, _), ArrayType(FloatType, _)) =>
+        TypeCheckResult.TypeCheckSuccess
+      case (ArrayType(FloatType, _), _) =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> ordinalNumber(1),
+            "requiredType" -> toSQLType(ArrayType(FloatType)),
+            "inputSql" -> toSQLExpr(right),
+            "inputType" -> toSQLType(right.dataType)))
+      case _ =>
+        DataTypeMismatch(
+          errorSubClass = "UNEXPECTED_INPUT_TYPE",
+          messageParameters = Map(
+            "paramIndex" -> ordinalNumber(0),
+            "requiredType" -> toSQLType(ArrayType(FloatType)),
+            "inputSql" -> toSQLExpr(left),
+            "inputType" -> toSQLType(left.dataType)))
+    }
+  }
+
+  override lazy val replacement: Expression = StaticInvoke(
+    classOf[VectorFunctionImplUtils],
+    FloatType,
+    "vectorL2Distance",
+    Seq(left, right, Literal(prettyName)),
+    Seq(ArrayType(FloatType), ArrayType(FloatType), StringType))
+
+  override def prettyName: String = "vector_l2_distance"
+
+  override def children: Seq[Expression] = Seq(left, right)
+
+  override protected def withNewChildrenInternal(
+      newChildren: IndexedSeq[Expression]): VectorL2Distance = {
+    copy(left = newChildren(0), right = newChildren(1))
+  }
+}
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
index 13b948391622..0bde732ea5d3 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
@@ -3222,4 +3222,16 @@ private[sql] object QueryExecutionErrors extends 
QueryErrorsBase with ExecutionE
         "functionName" -> toSQLId(function),
         "k" -> toSQLValue(k, IntegerType)))
   }
+
+  def vectorDimensionMismatchError(
+    function: String,
+    leftDim: Int,
+    rightDim: Int): RuntimeException = {
+    new SparkRuntimeException(
+      errorClass = "VECTOR_DIMENSION_MISMATCH",
+      messageParameters = Map(
+        "functionName" -> toSQLId(function),
+        "leftDim" -> leftDim.toString,
+        "rightDim" -> rightDim.toString))
+  }
 }
diff --git a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md 
b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
index cd4538162b8d..6ec3db92f4f2 100644
--- a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
+++ b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
@@ -421,6 +421,9 @@
 | org.apache.spark.sql.catalyst.expressions.UrlEncode | url_encode | SELECT 
url_encode('https://spark.apache.org') | 
struct<url_encode(https://spark.apache.org):string> |
 | org.apache.spark.sql.catalyst.expressions.Uuid | uuid | SELECT uuid() | 
struct<uuid():string> |
 | org.apache.spark.sql.catalyst.expressions.ValidateUTF8 | validate_utf8 | 
SELECT validate_utf8('Spark') | struct<validate_utf8(Spark):string> |
+| org.apache.spark.sql.catalyst.expressions.VectorCosineSimilarity | 
vector_cosine_similarity | SELECT vector_cosine_similarity(array(1.0F, 2.0F, 
3.0F), array(4.0F, 5.0F, 6.0F)) | struct<vector_cosine_similarity(array(1.0, 
2.0, 3.0), array(4.0, 5.0, 6.0)):float> |
+| org.apache.spark.sql.catalyst.expressions.VectorInnerProduct | 
vector_inner_product | SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), 
array(4.0F, 5.0F, 6.0F)) | struct<vector_inner_product(array(1.0, 2.0, 3.0), 
array(4.0, 5.0, 6.0)):float> |
+| org.apache.spark.sql.catalyst.expressions.VectorL2Distance | 
vector_l2_distance | SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), 
array(4.0F, 5.0F, 6.0F)) | struct<vector_l2_distance(array(1.0, 2.0, 3.0), 
array(4.0, 5.0, 6.0)):float> |
 | org.apache.spark.sql.catalyst.expressions.WeekDay | weekday | SELECT 
weekday('2009-07-30') | struct<weekday(2009-07-30):int> |
 | org.apache.spark.sql.catalyst.expressions.WeekOfYear | weekofyear | SELECT 
weekofyear('2008-02-20') | struct<weekofyear(2008-02-20):int> |
 | org.apache.spark.sql.catalyst.expressions.WidthBucket | width_bucket | 
SELECT width_bucket(5.3, 0.2, 10.6, 5) | struct<width_bucket(5.3, 0.2, 10.6, 
5):bigint> |
diff --git 
a/sql/core/src/test/resources/sql-tests/analyzer-results/vector-distance.sql.out
 
b/sql/core/src/test/resources/sql-tests/analyzer-results/vector-distance.sql.out
new file mode 100644
index 000000000000..42239b329730
--- /dev/null
+++ 
b/sql/core/src/test/resources/sql-tests/analyzer-results/vector-distance.sql.out
@@ -0,0 +1,656 @@
+-- Automatically generated by SQLQueryTestSuite
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 
6.0F))
+-- !query analysis
+Project [vector_cosine_similarity(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0)) 
AS vector_cosine_similarity(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 0.0F, 0.0F), array(1.0F, 0.0F, 
0.0F))
+-- !query analysis
+Project [vector_cosine_similarity(array(1.0, 0.0, 0.0), array(1.0, 0.0, 0.0)) 
AS vector_cosine_similarity(array(1.0, 0.0, 0.0), array(1.0, 0.0, 0.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 0.0F), array(0.0F, 1.0F))
+-- !query analysis
+Project [vector_cosine_similarity(array(1.0, 0.0), array(0.0, 1.0)) AS 
vector_cosine_similarity(array(1.0, 0.0), array(0.0, 1.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 0.0F), array(-1.0F, 0.0F))
+-- !query analysis
+Project [vector_cosine_similarity(array(1.0, 0.0), array(-1.0, 0.0)) AS 
vector_cosine_similarity(array(1.0, 0.0), array(-1.0, 0.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F))
+-- !query analysis
+Project [vector_inner_product(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0)) AS 
vector_inner_product(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 0.0F), array(0.0F, 1.0F))
+-- !query analysis
+Project [vector_inner_product(array(1.0, 0.0), array(0.0, 1.0)) AS 
vector_inner_product(array(1.0, 0.0), array(0.0, 1.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_inner_product(array(3.0F, 4.0F), array(3.0F, 4.0F))
+-- !query analysis
+Project [vector_inner_product(array(3.0, 4.0), array(3.0, 4.0)) AS 
vector_inner_product(array(3.0, 4.0), array(3.0, 4.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F))
+-- !query analysis
+Project [vector_l2_distance(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0)) AS 
vector_l2_distance(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F), array(1.0F, 2.0F))
+-- !query analysis
+Project [vector_l2_distance(array(1.0, 2.0), array(1.0, 2.0)) AS 
vector_l2_distance(array(1.0, 2.0), array(1.0, 2.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_l2_distance(array(0.0F, 0.0F), array(3.0F, 4.0F))
+-- !query analysis
+Project [vector_l2_distance(array(0.0, 0.0), array(3.0, 4.0)) AS 
vector_l2_distance(array(0.0, 0.0), array(3.0, 4.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_cosine_similarity(array(), array())
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array()\"",
+    "inputType" : "\"ARRAY<VOID>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(), array())\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 49,
+    "fragment" : "vector_cosine_similarity(array(), array())"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(CAST(array() AS ARRAY<FLOAT>), CAST(array() AS 
ARRAY<FLOAT>))
+-- !query analysis
+Project [vector_inner_product(cast(array() as array<float>), cast(array() as 
array<float>)) AS vector_inner_product(array(), array())#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_l2_distance(CAST(array() AS ARRAY<FLOAT>), CAST(array() AS 
ARRAY<FLOAT>))
+-- !query analysis
+Project [vector_l2_distance(cast(array() as array<float>), cast(array() as 
array<float>)) AS vector_l2_distance(array(), array())#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_cosine_similarity(array(0.0F, 0.0F, 0.0F), array(1.0F, 2.0F, 
3.0F))
+-- !query analysis
+Project [vector_cosine_similarity(array(0.0, 0.0, 0.0), array(1.0, 2.0, 3.0)) 
AS vector_cosine_similarity(array(0.0, 0.0, 0.0), array(1.0, 2.0, 3.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_cosine_similarity(NULL, array(1.0F, 2.0F, 3.0F))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(NULL, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 62,
+    "fragment" : "vector_cosine_similarity(NULL, array(1.0F, 2.0F, 3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), NULL)
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1.0, 2.0, 3.0), NULL)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 62,
+    "fragment" : "vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), NULL)"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(NULL, array(1.0F, 2.0F, 3.0F))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(NULL, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 58,
+    "fragment" : "vector_inner_product(NULL, array(1.0F, 2.0F, 3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), NULL)
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(array(1.0, 2.0, 3.0), NULL)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 58,
+    "fragment" : "vector_inner_product(array(1.0F, 2.0F, 3.0F), NULL)"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(NULL, array(1.0F, 2.0F, 3.0F))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(NULL, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 56,
+    "fragment" : "vector_l2_distance(NULL, array(1.0F, 2.0F, 3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), NULL)
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(array(1.0, 2.0, 3.0), NULL)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 56,
+    "fragment" : "vector_l2_distance(array(1.0F, 2.0F, 3.0F), NULL)"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, CAST(NULL AS FLOAT), 3.0F), 
array(1.0F, 2.0F, 3.0F))
+-- !query analysis
+Project [vector_cosine_similarity(array(1.0, cast(null as float), 3.0), 
array(1.0, 2.0, 3.0)) AS vector_cosine_similarity(array(1.0, CAST(NULL AS 
FLOAT), 3.0), array(1.0, 2.0, 3.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, CAST(NULL AS FLOAT), 3.0F), 
array(1.0F, 2.0F, 3.0F))
+-- !query analysis
+Project [vector_inner_product(array(1.0, cast(null as float), 3.0), array(1.0, 
2.0, 3.0)) AS vector_inner_product(array(1.0, CAST(NULL AS FLOAT), 3.0), 
array(1.0, 2.0, 3.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, CAST(NULL AS FLOAT), 3.0F), array(1.0F, 
2.0F, 3.0F))
+-- !query analysis
+Project [vector_l2_distance(array(1.0, cast(null as float), 3.0), array(1.0, 
2.0, 3.0)) AS vector_l2_distance(array(1.0, CAST(NULL AS FLOAT), 3.0), 
array(1.0, 2.0, 3.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F))
+-- !query analysis
+Project [vector_cosine_similarity(array(1.0, 2.0, 3.0), array(1.0, 2.0)) AS 
vector_cosine_similarity(array(1.0, 2.0, 3.0), array(1.0, 2.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F))
+-- !query analysis
+Project [vector_inner_product(array(1.0, 2.0, 3.0), array(1.0, 2.0)) AS 
vector_inner_product(array(1.0, 2.0, 3.0), array(1.0, 2.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F))
+-- !query analysis
+Project [vector_l2_distance(array(1.0, 2.0, 3.0), array(1.0, 2.0)) AS 
vector_l2_distance(array(1.0, 2.0, 3.0), array(1.0, 2.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 
6.0D))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1.0, 2.0, 3.0)\"",
+    "inputType" : "\"ARRAY<DOUBLE>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1.0, 2.0, 3.0), array(4.0, 
5.0, 6.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 81,
+    "fragment" : "vector_cosine_similarity(array(1.0D, 2.0D, 3.0D), 
array(4.0D, 5.0D, 6.0D))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1, 2, 3), array(4, 5, 6))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1, 2, 3)\"",
+    "inputType" : "\"ARRAY<INT>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1, 2, 3), array(4, 5, 6))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 63,
+    "fragment" : "vector_cosine_similarity(array(1, 2, 3), array(4, 5, 6))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 6.0D))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1.0, 2.0, 3.0)\"",
+    "inputType" : "\"ARRAY<DOUBLE>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(array(1.0, 2.0, 3.0), array(4.0, 5.0, 
6.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 77,
+    "fragment" : "vector_inner_product(array(1.0D, 2.0D, 3.0D), array(4.0D, 
5.0D, 6.0D))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1, 2, 3), array(4, 5, 6))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1, 2, 3)\"",
+    "inputType" : "\"ARRAY<INT>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(array(1, 2, 3), array(4, 5, 6))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 59,
+    "fragment" : "vector_inner_product(array(1, 2, 3), array(4, 5, 6))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 6.0D))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1.0, 2.0, 3.0)\"",
+    "inputType" : "\"ARRAY<DOUBLE>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(array(1.0, 2.0, 3.0), array(4.0, 5.0, 
6.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 75,
+    "fragment" : "vector_l2_distance(array(1.0D, 2.0D, 3.0D), array(4.0D, 
5.0D, 6.0D))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1, 2, 3), array(4, 5, 6))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1, 2, 3)\"",
+    "inputType" : "\"ARRAY<INT>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(array(1, 2, 3), array(4, 5, 6))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 57,
+    "fragment" : "vector_l2_distance(array(1, 2, 3), array(4, 5, 6))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(1.0D, 2.0D, 
3.0D))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1.0, 2.0, 3.0)\"",
+    "inputType" : "\"ARRAY<DOUBLE>\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1.0, 2.0, 3.0), array(1.0, 
2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 81,
+    "fragment" : "vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), 
array(1.0D, 2.0D, 3.0D))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity('not an array', array(1.0F, 2.0F, 3.0F))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(not an array, array(1.0, 2.0, 
3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 72,
+    "fragment" : "vector_cosine_similarity('not an array', array(1.0F, 2.0F, 
3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), 'not an array')
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1.0, 2.0, 3.0), not an 
array)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 72,
+    "fragment" : "vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), 'not an 
array')"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(123, 456)
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"123\"",
+    "inputType" : "\"INT\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(123, 456)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 41,
+    "fragment" : "vector_cosine_similarity(123, 456)"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product('not an array', array(1.0F, 2.0F, 3.0F))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(not an array, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 68,
+    "fragment" : "vector_inner_product('not an array', array(1.0F, 2.0F, 
3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), 'not an array')
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(array(1.0, 2.0, 3.0), not an array)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 68,
+    "fragment" : "vector_inner_product(array(1.0F, 2.0F, 3.0F), 'not an 
array')"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance('not an array', array(1.0F, 2.0F, 3.0F))
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(not an array, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 66,
+    "fragment" : "vector_l2_distance('not an array', array(1.0F, 2.0F, 3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), 'not an array')
+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(array(1.0, 2.0, 3.0), not an array)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 66,
+    "fragment" : "vector_l2_distance(array(1.0F, 2.0F, 3.0F), 'not an array')"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(
+    array(1.0F, 2.0F, 3.0F, 4.0F, 5.0F, 6.0F, 7.0F, 8.0F, 9.0F, 10.0F, 11.0F, 
12.0F, 13.0F, 14.0F, 15.0F, 16.0F),
+    array(16.0F, 15.0F, 14.0F, 13.0F, 12.0F, 11.0F, 10.0F, 9.0F, 8.0F, 7.0F, 
6.0F, 5.0F, 4.0F, 3.0F, 2.0F, 1.0F)
+)
+-- !query analysis
+Project [vector_inner_product(array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 
9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0), array(16.0, 15.0, 14.0, 13.0, 
12.0, 11.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0)) AS 
vector_inner_product(array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 
11.0, 12.0, 13.0, 14.0, 15.0, 16.0), array(16.0, 15.0, 14.0, 13.0, 12.0, 11.0, 
10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0))#x]
++- OneRowRelation
+
+
+-- !query
+SELECT vector_l2_distance(
+    array(1.0F, 2.0F, 3.0F, 4.0F, 5.0F, 6.0F, 7.0F, 8.0F, 9.0F, 10.0F, 11.0F, 
12.0F, 13.0F, 14.0F, 15.0F, 16.0F),
+    array(16.0F, 15.0F, 14.0F, 13.0F, 12.0F, 11.0F, 10.0F, 9.0F, 8.0F, 7.0F, 
6.0F, 5.0F, 4.0F, 3.0F, 2.0F, 1.0F)
+)
+-- !query analysis
+Project [vector_l2_distance(array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 
10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0), array(16.0, 15.0, 14.0, 13.0, 12.0, 
11.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0)) AS 
vector_l2_distance(array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 
11.0, 12.0, 13.0, 14.0, 15.0, 16.0), array(16.0, 15.0, 14.0, 13.0, 12.0, 11.0, 
10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0))#x]
++- OneRowRelation
diff --git a/sql/core/src/test/resources/sql-tests/inputs/vector-distance.sql 
b/sql/core/src/test/resources/sql-tests/inputs/vector-distance.sql
new file mode 100644
index 000000000000..24035963260b
--- /dev/null
+++ b/sql/core/src/test/resources/sql-tests/inputs/vector-distance.sql
@@ -0,0 +1,132 @@
+-- Tests for vector distance functions: vector_cosine_similarity, 
vector_inner_product, vector_l2_distance
+
+-- Basic functionality tests
+
+-- vector_cosine_similarity: basic test
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 
6.0F));
+
+-- vector_cosine_similarity: identical vectors (similarity = 1.0)
+SELECT vector_cosine_similarity(array(1.0F, 0.0F, 0.0F), array(1.0F, 0.0F, 
0.0F));
+
+-- vector_cosine_similarity: orthogonal vectors (similarity = 0.0)
+SELECT vector_cosine_similarity(array(1.0F, 0.0F), array(0.0F, 1.0F));
+
+-- vector_cosine_similarity: opposite vectors (similarity = -1.0)
+SELECT vector_cosine_similarity(array(1.0F, 0.0F), array(-1.0F, 0.0F));
+
+-- vector_inner_product: basic test (1*4 + 2*5 + 3*6 = 32)
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F));
+
+-- vector_inner_product: orthogonal vectors (product = 0)
+SELECT vector_inner_product(array(1.0F, 0.0F), array(0.0F, 1.0F));
+
+-- vector_inner_product: self product (squared L2 norm: 3^2 + 4^2 = 25)
+SELECT vector_inner_product(array(3.0F, 4.0F), array(3.0F, 4.0F));
+
+-- vector_l2_distance: basic test (sqrt((4-1)^2 + (5-2)^2 + (6-3)^2) = 
sqrt(27))
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F));
+
+-- vector_l2_distance: identical vectors (distance = 0)
+SELECT vector_l2_distance(array(1.0F, 2.0F), array(1.0F, 2.0F));
+
+-- vector_l2_distance: 3-4-5 triangle (distance = 5)
+SELECT vector_l2_distance(array(0.0F, 0.0F), array(3.0F, 4.0F));
+
+-- Edge cases
+
+-- Empty vectors: cosine similarity returns NULL
+SELECT vector_cosine_similarity(array(), array());
+
+-- Empty vectors: inner product returns 0.0
+SELECT vector_inner_product(CAST(array() AS ARRAY<FLOAT>), CAST(array() AS 
ARRAY<FLOAT>));
+
+-- Empty vectors: L2 distance returns 0.0
+SELECT vector_l2_distance(CAST(array() AS ARRAY<FLOAT>), CAST(array() AS 
ARRAY<FLOAT>));
+
+-- Zero magnitude vector: cosine similarity returns NULL
+SELECT vector_cosine_similarity(array(0.0F, 0.0F, 0.0F), array(1.0F, 2.0F, 
3.0F));
+
+-- NULL array input: cosine similarity returns NULL
+SELECT vector_cosine_similarity(NULL, array(1.0F, 2.0F, 3.0F));
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), NULL);
+
+-- NULL array input: inner product returns NULL
+SELECT vector_inner_product(NULL, array(1.0F, 2.0F, 3.0F));
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), NULL);
+
+-- NULL array input: L2 distance returns NULL
+SELECT vector_l2_distance(NULL, array(1.0F, 2.0F, 3.0F));
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), NULL);
+
+-- Array containing NULL element: returns NULL
+SELECT vector_cosine_similarity(array(1.0F, CAST(NULL AS FLOAT), 3.0F), 
array(1.0F, 2.0F, 3.0F));
+SELECT vector_inner_product(array(1.0F, CAST(NULL AS FLOAT), 3.0F), 
array(1.0F, 2.0F, 3.0F));
+SELECT vector_l2_distance(array(1.0F, CAST(NULL AS FLOAT), 3.0F), array(1.0F, 
2.0F, 3.0F));
+
+-- Dimension mismatch errors
+
+-- vector_cosine_similarity: dimension mismatch
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F));
+
+-- vector_inner_product: dimension mismatch
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F));
+
+-- vector_l2_distance: dimension mismatch
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F));
+
+-- Type mismatch errors (only ARRAY<FLOAT> is accepted)
+
+-- vector_cosine_similarity: ARRAY<DOUBLE> not accepted
+SELECT vector_cosine_similarity(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 
6.0D));
+
+-- vector_cosine_similarity: ARRAY<INT> not accepted
+SELECT vector_cosine_similarity(array(1, 2, 3), array(4, 5, 6));
+
+-- vector_inner_product: ARRAY<DOUBLE> not accepted
+SELECT vector_inner_product(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 6.0D));
+
+-- vector_inner_product: ARRAY<INT> not accepted
+SELECT vector_inner_product(array(1, 2, 3), array(4, 5, 6));
+
+-- vector_l2_distance: ARRAY<DOUBLE> not accepted
+SELECT vector_l2_distance(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 6.0D));
+
+-- vector_l2_distance: ARRAY<INT> not accepted
+SELECT vector_l2_distance(array(1, 2, 3), array(4, 5, 6));
+
+-- Mixed type errors (first arg correct, second wrong)
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(1.0D, 2.0D, 
3.0D));
+
+-- Non-array argument errors (argument is not an array at all)
+
+-- vector_cosine_similarity: first argument is not an array
+SELECT vector_cosine_similarity('not an array', array(1.0F, 2.0F, 3.0F));
+
+-- vector_cosine_similarity: second argument is not an array
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), 'not an array');
+
+-- vector_cosine_similarity: both arguments are not arrays
+SELECT vector_cosine_similarity(123, 456);
+
+-- vector_inner_product: first argument is not an array
+SELECT vector_inner_product('not an array', array(1.0F, 2.0F, 3.0F));
+
+-- vector_inner_product: second argument is not an array
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), 'not an array');
+
+-- vector_l2_distance: first argument is not an array
+SELECT vector_l2_distance('not an array', array(1.0F, 2.0F, 3.0F));
+
+-- vector_l2_distance: second argument is not an array
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), 'not an array');
+
+-- Large vectors (test SIMD unroll path with 16 elements)
+SELECT vector_inner_product(
+    array(1.0F, 2.0F, 3.0F, 4.0F, 5.0F, 6.0F, 7.0F, 8.0F, 9.0F, 10.0F, 11.0F, 12.0F, 13.0F, 14.0F, 15.0F, 16.0F),
+    array(16.0F, 15.0F, 14.0F, 13.0F, 12.0F, 11.0F, 10.0F, 9.0F, 8.0F, 7.0F, 6.0F, 5.0F, 4.0F, 3.0F, 2.0F, 1.0F)
+);
+
+SELECT vector_l2_distance(
+    array(1.0F, 2.0F, 3.0F, 4.0F, 5.0F, 6.0F, 7.0F, 8.0F, 9.0F, 10.0F, 11.0F, 12.0F, 13.0F, 14.0F, 15.0F, 16.0F),
+    array(16.0F, 15.0F, 14.0F, 13.0F, 12.0F, 11.0F, 10.0F, 9.0F, 8.0F, 7.0F, 6.0F, 5.0F, 4.0F, 3.0F, 2.0F, 1.0F)
+);
diff --git a/sql/core/src/test/resources/sql-tests/results/vector-distance.sql.out b/sql/core/src/test/resources/sql-tests/results/vector-distance.sql.out
new file mode 100644
index 000000000000..14f503dde7f0
--- /dev/null
+++ b/sql/core/src/test/resources/sql-tests/results/vector-distance.sql.out
@@ -0,0 +1,746 @@
+-- Automatically generated by SQLQueryTestSuite
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F))
+-- !query schema
+struct<vector_cosine_similarity(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0)):float>
+-- !query output
+0.97463185
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 0.0F, 0.0F), array(1.0F, 0.0F, 0.0F))
+-- !query schema
+struct<vector_cosine_similarity(array(1.0, 0.0, 0.0), array(1.0, 0.0, 0.0)):float>
+-- !query output
+1.0
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 0.0F), array(0.0F, 1.0F))
+-- !query schema
+struct<vector_cosine_similarity(array(1.0, 0.0), array(0.0, 1.0)):float>
+-- !query output
+0.0
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 0.0F), array(-1.0F, 0.0F))
+-- !query schema
+struct<vector_cosine_similarity(array(1.0, 0.0), array(-1.0, 0.0)):float>
+-- !query output
+-1.0
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F))
+-- !query schema
+struct<vector_inner_product(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0)):float>
+-- !query output
+32.0
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 0.0F), array(0.0F, 1.0F))
+-- !query schema
+struct<vector_inner_product(array(1.0, 0.0), array(0.0, 1.0)):float>
+-- !query output
+0.0
+
+
+-- !query
+SELECT vector_inner_product(array(3.0F, 4.0F), array(3.0F, 4.0F))
+-- !query schema
+struct<vector_inner_product(array(3.0, 4.0), array(3.0, 4.0)):float>
+-- !query output
+25.0
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), array(4.0F, 5.0F, 6.0F))
+-- !query schema
+struct<vector_l2_distance(array(1.0, 2.0, 3.0), array(4.0, 5.0, 6.0)):float>
+-- !query output
+5.196152
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F), array(1.0F, 2.0F))
+-- !query schema
+struct<vector_l2_distance(array(1.0, 2.0), array(1.0, 2.0)):float>
+-- !query output
+0.0
+
+
+-- !query
+SELECT vector_l2_distance(array(0.0F, 0.0F), array(3.0F, 4.0F))
+-- !query schema
+struct<vector_l2_distance(array(0.0, 0.0), array(3.0, 4.0)):float>
+-- !query output
+5.0
+
+
+-- !query
+SELECT vector_cosine_similarity(array(), array())
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array()\"",
+    "inputType" : "\"ARRAY<VOID>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(), array())\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 49,
+    "fragment" : "vector_cosine_similarity(array(), array())"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(CAST(array() AS ARRAY<FLOAT>), CAST(array() AS ARRAY<FLOAT>))
+-- !query schema
+struct<vector_inner_product(array(), array()):float>
+-- !query output
+0.0
+
+
+-- !query
+SELECT vector_l2_distance(CAST(array() AS ARRAY<FLOAT>), CAST(array() AS ARRAY<FLOAT>))
+-- !query schema
+struct<vector_l2_distance(array(), array()):float>
+-- !query output
+0.0
+
+
+-- !query
+SELECT vector_cosine_similarity(array(0.0F, 0.0F, 0.0F), array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<vector_cosine_similarity(array(0.0, 0.0, 0.0), array(1.0, 2.0, 3.0)):float>
+-- !query output
+NULL
+
+
+-- !query
+SELECT vector_cosine_similarity(NULL, array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(NULL, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 62,
+    "fragment" : "vector_cosine_similarity(NULL, array(1.0F, 2.0F, 3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), NULL)
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1.0, 2.0, 3.0), NULL)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 62,
+    "fragment" : "vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), NULL)"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(NULL, array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(NULL, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 58,
+    "fragment" : "vector_inner_product(NULL, array(1.0F, 2.0F, 3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), NULL)
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(array(1.0, 2.0, 3.0), NULL)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 58,
+    "fragment" : "vector_inner_product(array(1.0F, 2.0F, 3.0F), NULL)"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(NULL, array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(NULL, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 56,
+    "fragment" : "vector_l2_distance(NULL, array(1.0F, 2.0F, 3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), NULL)
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"NULL\"",
+    "inputType" : "\"VOID\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(array(1.0, 2.0, 3.0), NULL)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 56,
+    "fragment" : "vector_l2_distance(array(1.0F, 2.0F, 3.0F), NULL)"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, CAST(NULL AS FLOAT), 3.0F), array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<vector_cosine_similarity(array(1.0, CAST(NULL AS FLOAT), 3.0), array(1.0, 2.0, 3.0)):float>
+-- !query output
+NULL
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, CAST(NULL AS FLOAT), 3.0F), array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<vector_inner_product(array(1.0, CAST(NULL AS FLOAT), 3.0), array(1.0, 2.0, 3.0)):float>
+-- !query output
+NULL
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, CAST(NULL AS FLOAT), 3.0F), array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<vector_l2_distance(array(1.0, CAST(NULL AS FLOAT), 3.0), array(1.0, 2.0, 3.0)):float>
+-- !query output
+NULL
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkRuntimeException
+{
+  "errorClass" : "VECTOR_DIMENSION_MISMATCH",
+  "sqlState" : "22000",
+  "messageParameters" : {
+    "functionName" : "`vector_cosine_similarity`",
+    "leftDim" : "3",
+    "rightDim" : "2"
+  }
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkRuntimeException
+{
+  "errorClass" : "VECTOR_DIMENSION_MISMATCH",
+  "sqlState" : "22000",
+  "messageParameters" : {
+    "functionName" : "`vector_inner_product`",
+    "leftDim" : "3",
+    "rightDim" : "2"
+  }
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), array(1.0F, 2.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.SparkRuntimeException
+{
+  "errorClass" : "VECTOR_DIMENSION_MISMATCH",
+  "sqlState" : "22000",
+  "messageParameters" : {
+    "functionName" : "`vector_l2_distance`",
+    "leftDim" : "3",
+    "rightDim" : "2"
+  }
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 6.0D))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1.0, 2.0, 3.0)\"",
+    "inputType" : "\"ARRAY<DOUBLE>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1.0, 2.0, 3.0), array(4.0, 
5.0, 6.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 81,
+    "fragment" : "vector_cosine_similarity(array(1.0D, 2.0D, 3.0D), 
array(4.0D, 5.0D, 6.0D))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1, 2, 3), array(4, 5, 6))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1, 2, 3)\"",
+    "inputType" : "\"ARRAY<INT>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1, 2, 3), array(4, 5, 6))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 63,
+    "fragment" : "vector_cosine_similarity(array(1, 2, 3), array(4, 5, 6))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 6.0D))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1.0, 2.0, 3.0)\"",
+    "inputType" : "\"ARRAY<DOUBLE>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(array(1.0, 2.0, 3.0), array(4.0, 5.0, 
6.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 77,
+    "fragment" : "vector_inner_product(array(1.0D, 2.0D, 3.0D), array(4.0D, 
5.0D, 6.0D))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1, 2, 3), array(4, 5, 6))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1, 2, 3)\"",
+    "inputType" : "\"ARRAY<INT>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(array(1, 2, 3), array(4, 5, 6))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 59,
+    "fragment" : "vector_inner_product(array(1, 2, 3), array(4, 5, 6))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0D, 2.0D, 3.0D), array(4.0D, 5.0D, 6.0D))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1.0, 2.0, 3.0)\"",
+    "inputType" : "\"ARRAY<DOUBLE>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(array(1.0, 2.0, 3.0), array(4.0, 5.0, 
6.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 75,
+    "fragment" : "vector_l2_distance(array(1.0D, 2.0D, 3.0D), array(4.0D, 
5.0D, 6.0D))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1, 2, 3), array(4, 5, 6))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1, 2, 3)\"",
+    "inputType" : "\"ARRAY<INT>\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(array(1, 2, 3), array(4, 5, 6))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 57,
+    "fragment" : "vector_l2_distance(array(1, 2, 3), array(4, 5, 6))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), array(1.0D, 2.0D, 3.0D))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"array(1.0, 2.0, 3.0)\"",
+    "inputType" : "\"ARRAY<DOUBLE>\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1.0, 2.0, 3.0), array(1.0, 
2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 81,
+    "fragment" : "vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), 
array(1.0D, 2.0D, 3.0D))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity('not an array', array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(not an array, array(1.0, 2.0, 
3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 72,
+    "fragment" : "vector_cosine_similarity('not an array', array(1.0F, 2.0F, 
3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), 'not an array')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(array(1.0, 2.0, 3.0), not an 
array)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 72,
+    "fragment" : "vector_cosine_similarity(array(1.0F, 2.0F, 3.0F), 'not an 
array')"
+  } ]
+}
+
+
+-- !query
+SELECT vector_cosine_similarity(123, 456)
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"123\"",
+    "inputType" : "\"INT\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_cosine_similarity(123, 456)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 41,
+    "fragment" : "vector_cosine_similarity(123, 456)"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product('not an array', array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(not an array, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 68,
+    "fragment" : "vector_inner_product('not an array', array(1.0F, 2.0F, 
3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(array(1.0F, 2.0F, 3.0F), 'not an array')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_inner_product(array(1.0, 2.0, 3.0), not an array)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 68,
+    "fragment" : "vector_inner_product(array(1.0F, 2.0F, 3.0F), 'not an 
array')"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance('not an array', array(1.0F, 2.0F, 3.0F))
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "first",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(not an array, array(1.0, 2.0, 3.0))\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 66,
+    "fragment" : "vector_l2_distance('not an array', array(1.0F, 2.0F, 3.0F))"
+  } ]
+}
+
+
+-- !query
+SELECT vector_l2_distance(array(1.0F, 2.0F, 3.0F), 'not an array')
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE",
+  "sqlState" : "42K09",
+  "messageParameters" : {
+    "inputSql" : "\"not an array\"",
+    "inputType" : "\"STRING\"",
+    "paramIndex" : "second",
+    "requiredType" : "\"ARRAY<FLOAT>\"",
+    "sqlExpr" : "\"vector_l2_distance(array(1.0, 2.0, 3.0), not an array)\""
+  },
+  "queryContext" : [ {
+    "objectType" : "",
+    "objectName" : "",
+    "startIndex" : 8,
+    "stopIndex" : 66,
+    "fragment" : "vector_l2_distance(array(1.0F, 2.0F, 3.0F), 'not an array')"
+  } ]
+}
+
+
+-- !query
+SELECT vector_inner_product(
+    array(1.0F, 2.0F, 3.0F, 4.0F, 5.0F, 6.0F, 7.0F, 8.0F, 9.0F, 10.0F, 11.0F, 12.0F, 13.0F, 14.0F, 15.0F, 16.0F),
+    array(16.0F, 15.0F, 14.0F, 13.0F, 12.0F, 11.0F, 10.0F, 9.0F, 8.0F, 7.0F, 6.0F, 5.0F, 4.0F, 3.0F, 2.0F, 1.0F)
+)
+-- !query schema
+struct<vector_inner_product(array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0), array(16.0, 15.0, 14.0, 13.0, 12.0, 11.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0)):float>
+-- !query output
+816.0
+
+
+-- !query
+SELECT vector_l2_distance(
+    array(1.0F, 2.0F, 3.0F, 4.0F, 5.0F, 6.0F, 7.0F, 8.0F, 9.0F, 10.0F, 11.0F, 12.0F, 13.0F, 14.0F, 15.0F, 16.0F),
+    array(16.0F, 15.0F, 14.0F, 13.0F, 12.0F, 11.0F, 10.0F, 9.0F, 8.0F, 7.0F, 6.0F, 5.0F, 4.0F, 3.0F, 2.0F, 1.0F)
+)
+-- !query schema
+struct<vector_l2_distance(array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0), array(16.0, 15.0, 14.0, 13.0, 12.0, 11.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0)):float>
+-- !query output
+36.878178
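
    As a quick cross-check of the two 16-element results above (not part of the golden file):
    both vectors have squared norm 1^2 + 2^2 + ... + 16^2 = 1496, and the previous query gives
    the inner product 816, so by ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * <a, b> the L2 distance
    should be sqrt(1496 + 1496 - 2 * 816) = sqrt(1360), which matches 36.878178 when stored
    as FLOAT.
    
    ```
    -- Illustrative check only; uses DOUBLE arithmetic, then compare with the FLOAT result above.
    SELECT sqrt(1496.0D + 1496.0D - 2.0D * 816.0D);
    -- ~36.87817782917155
    ```
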
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala
index 33f128409f7e..2b1d4e7e7df4 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala
@@ -60,7 +60,7 @@ class ExpressionInfoSuite extends SparkFunSuite with SharedSparkSession {
       "predicate_funcs", "conditional_funcs", "conversion_funcs", "csv_funcs", "datetime_funcs",
       "generator_funcs", "hash_funcs", "json_funcs", "lambda_funcs", "map_funcs", "math_funcs",
       "misc_funcs", "string_funcs", "struct_funcs", "window_funcs", "xml_funcs", "table_funcs",
-      "url_funcs", "variant_funcs", "st_funcs").sorted
+      "url_funcs", "variant_funcs", "vector_funcs", "st_funcs").sorted
     val invalidGroupName = "invalid_group_funcs"
     checkError(
       exception = intercept[SparkIllegalArgumentException] {
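
    With the new expressions registered under the "vector_funcs" group, one way to confirm
    they are visible in a session is a pattern match against the function registry. This is
    an illustrative snippet, not part of the patch or its tests, and the listed output is an
    expectation rather than captured output.
    
    ```
    SHOW FUNCTIONS LIKE 'vector_*';
    -- expected to include: vector_cosine_similarity, vector_inner_product, vector_l2_distance
    ```
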


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
