janhoy opened a new issue, #10:
URL: https://github.com/apache/solr-orbit-workloads/issues/10

   Port the OSB `http_logs` workload: ~31 GB of web server logs, widely used as 
a benchmark across the search engine community. It is a time-series-heavy, 
large-scale log workload that complements `eventdata`.
   
   > ⚠️ **Prerequisite:** Source data carries an HP copyright with a 
non-commercial restriction. **Get legal clearance from ASF Infra/Legal before 
any work begins.**
   
   ## Tasks
   - Obtain legal clearance
   - Convert OSB workload using `solr-orbit convert-workload`
   - Define operations: time-range queries, status faceting, bulk indexing
   - Add 1k sample corpus for test-mode
   - Check whether any operations belong in `common_operations/` rather than 
this workload
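
As a rough illustration of the time-range and status-faceting operations listed above, the following Python sketch builds a request body for Solr's JSON Request API. The field names `timestamp` and `status`, the facet label `status_counts`, and both helper functions are assumptions made for illustration. They are not taken from the converted workload, and the sketch says nothing about how `solr-orbit` itself declares operations.

```python
import json
from datetime import datetime, timezone


def solr_ts(dt):
    """Format a timezone-aware datetime as the UTC ISO-8601 string used in Solr date ranges."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def time_range_status_facet(start, end, ts_field="timestamp", status_field="status"):
    """Build a JSON Request API body: filter on a time range, facet on status.

    ts_field and status_field are hypothetical field names; adjust them to the
    schema produced by the actual port.
    """
    return {
        "query": "*:*",
        # Inclusive range filter on the (assumed) date field.
        "filter": [f"{ts_field}:[{solr_ts(start)} TO {solr_ts(end)}]"],
        # Return no documents, only the facet counts.
        "limit": 0,
        "facet": {
            "status_counts": {"type": "terms", "field": status_field},
        },
    }


if __name__ == "__main__":
    # Arbitrary example window; not derived from the dataset.
    body = time_range_status_facet(
        datetime(2024, 1, 1, tzinfo=timezone.utc),
        datetime(2024, 1, 2, tzinfo=timezone.utc),
    )
    print(json.dumps(body, indent=2))
```

A body like this could be POSTed to a collection's `/query` endpoint. Whether the ported workload should express the operation this way depends on what `convert-workload` emits.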
   
   **Depends on:** apache/solr-orbit-workloads#3 (ASF dataset hosting must be 
resolved before corpus files can be finalised)
   
   ## References
   - OSB workload: 
https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/http_logs
   - Creating workloads: 
https://github.com/apache/solr-orbit/blob/main/CREATE_WORKLOAD_GUIDE.md


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


