This is an automated email from the ASF dual-hosted git repository.

paulk pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/groovy-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new fe422ae  Oracle 23ai blog post (add PCA plot)
fe422ae is described below

commit fe422ae9d356baa10a9a9c9e7a91f3f0f9fa3b0c
Author: Paul King <pa...@asert.com.au>
AuthorDate: Mon Jul 1 17:13:48 2024 +1000

    Oracle 23ai blog post (add PCA plot)
---
 site/src/site/blog/groovy-oracle23ai.adoc       |  25 +++++++++++++++++++++---
 site/src/site/blog/img/iris_closest10points.png | Bin 0 -> 110968 bytes
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/site/src/site/blog/groovy-oracle23ai.adoc b/site/src/site/blog/groovy-oracle23ai.adoc
index 473a6ec..6e0d421 100644
--- a/site/src/site/blog/groovy-oracle23ai.adoc
+++ b/site/src/site/blog/groovy-oracle23ai.adoc
@@ -168,7 +168,7 @@ Iris-setosa Iris-setosa 100
 Iris-virginica Iris-virginica 100
 Iris-versicolor Iris-versicolor 100
 Iris-versicolor Iris-versicolor 100
-Iris-versicolor Iris-versicolor 70
+*Iris-versicolor Iris-versicolor 70*
 Iris-virginica Iris-virginica 100
 Iris-virginica Iris-virginica 100
 Iris-setosa Iris-setosa 100
@@ -187,8 +187,27 @@ Iris-virginica Iris-virginica 100
 Iris-virginica Iris-virginica 100
 ----
 
-Only one result was incorrect. Since we randomly shuffled the data,
-we might get a different number of incorrect results for other runs.
+Only one result was incorrect (first *bold* line above).
+Since we randomly shuffled the data, we might get a
+different number of incorrect results for other runs.
+
+We can visualize how the distance query works
+by plotting the closest 10 points in a 3D plot.
+We'll do this for the points returned for the 70%
+confidence case (second *bold* line above):
+
+image:img/iris_closest10points.png[closest 10 points]
+
+This is a Principal Component Analysis (PCA) plot
+which projects our 4 dimensions (Petal width and length,
+Sepal width and length) down onto 3 dimensions.
+
+The large red dot is the projection for our test query characteristics.
+The small dots are the unselected points in our dataset.
+The medium dots are the dots returned by our `vector_distance`
+query.
+7 Versicolor points (blue) were returned and 3 Virginica points (orange) were returned.
+We know the result was Versicolor for that data point.
 
 == More Information
 
diff --git a/site/src/site/blog/img/iris_closest10points.png b/site/src/site/blog/img/iris_closest10points.png
new file mode 100644
index 0000000..e522ecc
Binary files /dev/null and b/site/src/site/blog/img/iris_closest10points.png differ
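
Editorial note: the text added in this commit says the 10 closest points for one test case were 7 Versicolor and 3 Virginica, and that the case was reported with 70% confidence. A minimal sketch of how such a figure could arise, assuming the confidence is simply the share of the 10 nearest neighbors that carry the winning label. This is not the blog post's Groovy code; the function name `knn_vote` and the voting rule are illustrative assumptions.

```python
from collections import Counter


def knn_vote(neighbor_labels):
    """Return (predicted_label, confidence_percent) for a list of neighbor labels.

    Assumption: the prediction is the most common label among the neighbors,
    and the confidence is that label's share of the votes, as a whole percent.
    """
    counts = Counter(neighbor_labels)
    label, votes = counts.most_common(1)[0]
    return label, 100 * votes // len(neighbor_labels)


# Labels of the 10 closest points as described in the commit text:
# 7 Versicolor and 3 Virginica.
neighbors = ["Iris-versicolor"] * 7 + ["Iris-virginica"] * 3

prediction, confidence = knn_vote(neighbors)
print(prediction, confidence)  # prints: Iris-versicolor 70
```

Under this assumption, a row showing 100 would correspond to all 10 neighbors sharing one label.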