kosiew opened a new issue, #20007:
URL: https://github.com/apache/datafusion/issues/20007

   ### Problem
   
   ClickBench setup knowledge is currently scattered across multiple locations:
   1. `HITS_VIEW_DDL` constant in `benchmarks/src/clickbench.rs` with inline comments
   2. View creation SQL in `datafusion/sqllogictest/test_files/clickbench.slt`
   3. Brief mention in `benchmarks/README.md` (without critical setup details)
   
   This makes it difficult for users to understand:
   - Why the EventDate column needs special handling
   - When and why to use the `binary_as_string` option
   - How to set up ClickBench correctly for DataFusion
   
   ### Background
   
   Related to #19881. The fix in that PR introduces a view that converts EventDate from UInt16 (days since the Unix epoch) to a proper DATE type. However, the knowledge needed to run ClickBench effectively is now duplicated across files.
   
   "I worry that we are spreading the knowledge needed to run DataFusion on ClickBench effectively all over the place. For example, this view definition is now copied twice."
   - [PR comment](https://github.com/apache/datafusion/pull/19881#discussion_r2725546501)
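
To make the EventDate conversion described above concrete, here is a minimal Python sketch of the underlying arithmetic. It is not DataFusion's implementation (the view does this in SQL); the function name `event_date_from_days` and the sample value are hypothetical and chosen only for illustration. It assumes the stored integer counts days since 1970-01-01.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)


def event_date_from_days(days: int) -> date:
    """Convert a UInt16 day count (days since the Unix epoch) to a calendar date.

    Hypothetical helper for illustration; it mirrors what a cast from the raw
    UInt16 column to DATE is expected to produce.
    """
    if not 0 <= days <= 0xFFFF:
        raise ValueError(f"{days} does not fit in a UInt16")
    return UNIX_EPOCH + timedelta(days=days)


# Illustrative value only, not taken from the dataset.
print(event_date_from_days(15901))  # 2013-07-15
```

Without such a conversion, a date comparison in a query would be evaluated against the raw integer (for example `15901`) rather than a calendar date.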
   
   ### Proposed Solution
   
   Add comprehensive documentation to the existing ClickBench section in `benchmarks/README.md` that serves as the single source of truth. This documentation should cover:
   
   1. **EventDate UInt16 → DATE transformation** - Why it's needed and how it works
   2. **binary_as_string option** - When and why it's required
   3. **Complete setup example** - Copy-pasteable SQL showing the full setup
   4. **Clarifications** - Differences between full dataset and test subsets
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

