Jefffrey commented on code in PR #19750:
URL: https://github.com/apache/datafusion/pull/19750#discussion_r2726379801


##########
ci/scripts/check_examples_docs.sh:
##########
@@ -17,48 +17,57 @@
 # specific language governing permissions and limitations
 # under the License.
 
-set -euo pipefail
-
-EXAMPLES_DIR="datafusion-examples/examples"
-README="datafusion-examples/README.md"
-
-# ffi examples are skipped because they were not part of the recent example
-# consolidation work and do not follow the new grouping and execution pattern.
-# They are not documented in the README using the new structure, so including
-# them here would cause false CI failures.
-SKIP_LIST=("ffi")
-
-missing=0
+# Generates documentation for DataFusion examples using the Rust-based
+# documentation generator and verifies that the committed README.md
+# is up to date.
+#
+# The README is generated from documentation comments in:
+#   datafusion-examples/examples/<group>/main.rs
+#
+# This script is intended to be run in CI to ensure that example
+# documentation stays in sync with the code.
+#
+# To update the README locally, run this script and replace README.md
+# with the generated output.
 
-skip() {
-    local value="$1"
-    for item in "${SKIP_LIST[@]}"; do
-        if [[ "$item" == "$value" ]]; then
-            return 0
-        fi
-    done
-    return 1
-}
+set -euo pipefail
 
-# collect folder names
-folders=$(find "$EXAMPLES_DIR" -mindepth 1 -maxdepth 1 -type d -exec basename {} \;)
+ROOT_DIR="$(git rev-parse --show-toplevel)"
+EXAMPLES_DIR="$ROOT_DIR/datafusion-examples"
+README="$EXAMPLES_DIR/README.md"
+README_NEW="$EXAMPLES_DIR/README-NEW.md"
 
-# collect group names from README headers
-groups=$(grep "^### Group:" "$README" | sed -E 's/^### Group: `([^`]+)`.*/\1/')
+echo "▶ Generating examples README (Rust generator)…"
+cargo run --quiet \
+  --manifest-path "$EXAMPLES_DIR/Cargo.toml" \
+  --bin examples-docs \
+  > "$README_NEW"
 
-for folder in $folders; do
-    if skip "$folder"; then
-        echo "Skipped group: $folder"
-        continue
-    fi
+echo "▶ Formatting generated README with Prettier…"
+npx [email protected] \

Review Comment:
   Something to look at in a followup issue is unifying our prettier versions somehow 🤔
   
   
https://github.com/apache/datafusion/blob/8023947fadd1d3e5fa1fc4a84fc01647f6b507b9/dev/update_config_docs.sh#L241-L242
   
   
https://github.com/apache/datafusion/blob/4d63f8c9277705a6062625bc099151d0e4995692/dev/update_function_docs.sh#L116-L117
   
   
https://github.com/apache/datafusion/blob/4d63f8c9277705a6062625bc099151d0e4995692/ci/scripts/doc_prettier_check.sh#L23-L27
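
   One possible shape for that followup, as a rough sketch only: keep the version in a single shell snippet that each docs script sources, so the separate `npx` invocations cannot drift apart. The `PRETTIER_VERSION` variable, the `prettier_pkg` and `run_prettier` helpers, and the `0.0.0` value are all made up for illustration; none of them exist in the repo today.

```shell
# Hypothetical shared snippet (not in the repo): one place that pins Prettier.
# 0.0.0 is a placeholder for illustration; the real pin would go here.
PRETTIER_VERSION="${PRETTIER_VERSION:-0.0.0}"

# Print the npm package spec that every docs script would hand to npx.
prettier_pkg() {
  printf 'prettier@%s\n' "$PRETTIER_VERSION"
}

# Example caller that forwards its arguments to the pinned Prettier.
# It is only defined here, never invoked, so sourcing this file does not
# touch the network.
run_prettier() {
  npx "$(prettier_pkg)" "$@"
}
```

   Each of the scripts linked above could then source this file and call `run_prettier` instead of spelling out its own version.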



##########
datafusion-examples/README.md:
##########
@@ -73,15 +73,15 @@ cargo run --example dataframe -- dataframe
 
 | Subcommand            | File Path                                                                                             | Description                                   |
 | --------------------- | ----------------------------------------------------------------------------------------------------- | --------------------------------------------- |
-| csv_sql_streaming     | [`custom_data_source/csv_sql_streaming.rs`](examples/custom_data_source/csv_sql_streaming.rs)         | Run a streaming SQL query against CSV data    |
 | csv_json_opener       | [`custom_data_source/csv_json_opener.rs`](examples/custom_data_source/csv_json_opener.rs)             | Use low-level FileOpener APIs for CSV/JSON    |
+| csv_sql_streaming     | [`custom_data_source/csv_sql_streaming.rs`](examples/custom_data_source/csv_sql_streaming.rs)         | Run a streaming SQL query against CSV data    |
 | custom_datasource     | [`custom_data_source/custom_datasource.rs`](examples/custom_data_source/custom_datasource.rs)         | Query a custom TableProvider                  |
 | custom_file_casts     | [`custom_data_source/custom_file_casts.rs`](examples/custom_data_source/custom_file_casts.rs)         | Implement custom casting rules                |
 | custom_file_format    | [`custom_data_source/custom_file_format.rs`](examples/custom_data_source/custom_file_format.rs)       | Write to a custom file format                 |
 | default_column_values | [`custom_data_source/default_column_values.rs`](examples/custom_data_source/default_column_values.rs) | Custom default values using metadata          |
 | file_stream_provider  | [`custom_data_source/file_stream_provider.rs`](examples/custom_data_source/file_stream_provider.rs)   | Read/write via FileStreamProvider for streams |
 
-## Data IO Examples
+## Data Io Examples

Review Comment:
   Would be nice if we could fix these capitalization cases
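
   To illustrate the kind of rule that would avoid `Data Io`, here is a small sketch. It assumes the heading is derived from a folder name such as `data_io`, which is a guess about how the generator works. The real fix would belong in the Rust generator; the `title_case_group` helper and its acronym list are hypothetical and only show the logic.

```shell
# Hypothetical helper: turn a folder name such as `data_io` into a heading.
# The acronym list below is an assumption for illustration only.
# The body runs in a subshell, so the IFS and `set -f` changes stay local.
title_case_group() (
  IFS=_
  set -f   # disable globbing while the name is split on underscores
  out=""
  for word in $1; do
    case "$word" in
      io|sql|udf)
        # Known acronyms are upper-cased as a whole.
        word=$(printf '%s' "$word" | tr '[:lower:]' '[:upper:]') ;;
      *)
        # Every other word only gets its first letter capitalized.
        first=$(printf '%s' "$word" | cut -c1 | tr '[:lower:]' '[:upper:]')
        rest=$(printf '%s' "$word" | cut -c2-)
        word="$first$rest" ;;
    esac
    out="${out:+$out }$word"
  done
  printf '%s\n' "$out"
)
```

   With this rule, `title_case_group data_io` prints `Data IO`, while names without acronyms are title-cased word by word.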



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

