Re: [PR] HDDS-14303. Updating spark integration doc [ozone-site]

via GitHub Fri, 30 Jan 2026 17:13:01 -0800


SaketaChalamchala commented on code in PR #243:
URL: https://github.com/apache/ozone-site/pull/243#discussion_r2748627906



##########
docs/04-user-guide/03-integrations/06-spark.md:
##########
@@ -1,3 +1,166 @@
-# Spark
+---
+sidebar_label: Spark
+---
 
-**TODO:** File a subtask under [HDDS-9858](https://issues.apache.org/jira/browse/HDDS-9858) and complete this page or section.
+# Using Apache Spark with Ozone
+
+Apache Spark is a widely used unified analytics engine for large-scale data processing. Ozone can serve as a scalable storage layer for Spark applications, allowing you to read and write data directly from/to Ozone clusters using familiar Spark APIs.
+
+## Overview
+
+Spark interacts with Ozone primarily through the OzoneFileSystem (ofs) connector, which allows access using the `ofs://` URI scheme. You can also use the older `o3fs://` scheme, though `ofs://` is generally recommended, especially in CDP environments.

Review Comment:
   I think we want to move away from the `o3fs` protocol, so the doc should not
present it as an option. We should also mention that Spark can interact with
Ozone through the `s3a` protocol, which makes porting cloud-native Spark
applications to Ozone easier.
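
   As a rough illustration of what the page could show for both access paths,
here is a minimal sketch. The helper names (`ofs_uri`, `s3a_uri`,
`s3a_spark_conf`) are hypothetical, as are the service ID, volume, bucket,
endpoint, and credentials used with them. The configuration keys are the
standard Hadoop S3A properties with Spark's `spark.hadoop.` prefix; whether
path-style access should be the recommended setting for the Ozone S3 Gateway is
an assumption to verify before it goes in the doc.

```python
# Sketch only: helper names and all example values are hypothetical.

def ofs_uri(om_service, volume, bucket, key=""):
    """Build an ofs:// path: ofs://<om-service>/<volume>/<bucket>/<key>."""
    return f"ofs://{om_service}/{volume}/{bucket}/{key}".rstrip("/")


def s3a_uri(bucket, key=""):
    """Build an s3a:// path: s3a://<bucket>/<key>."""
    return f"s3a://{bucket}/{key}".rstrip("/")


def s3a_spark_conf(endpoint, access_key, secret_key):
    """Spark settings that point the Hadoop S3A connector at an S3 endpoint.

    The endpoint is assumed to be an Ozone S3 Gateway address.
    Path-style access is assumed here; verify it for your deployment.
    """
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
    }


if __name__ == "__main__":
    # Print the two URI forms and the S3A settings for placeholder values.
    print(ofs_uri("ozone1", "vol1", "bucket1", "data/events.parquet"))
    print(s3a_uri("bucket1", "data/events.parquet"))
    for name, value in s3a_spark_conf(
        "http://s3g.example.com:9878", "ACCESS_KEY", "SECRET_KEY"
    ).items():
        print(f"{name}={value}")
```

   Each key/value pair from `s3a_spark_conf` could be passed to
`SparkSession.builder.config(key, value)` or as `--conf key=value` on
`spark-submit`, after which an existing `s3a://` job should only need its
bucket name and endpoint changed.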



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

