I found the demo setup in the "docker" directory not beginner-friendly. It takes some effort to digest what's there, and it's hard to play with. I'm proposing a scenario-based quickstart setup:
- Scenario 1: DeltaStreamer write
  - sample raw dataset, local FS
  - run deltastreamer with local Spark or Flink, writing to COW or MOR
- Scenario 2: meta sync
  - sample hoodie table (COW or MOR), local FS
  - run hive sync with local Hive server
- Scenario 3: SQL read
  - sample hoodie table (COW or MOR), local FS
  - run local Trino/Presto queries
- More scenarios: incremental read, clustering, etc.

In all scenarios, users can choose between a release version and the local version of Hudi.

This is not meant to replace the current "docker" demo. It can live under a "quickstart" dir and aims to be a more focused, quick sandbox.

A typical dev flow is:
1. change some code
2. run mvn install -DskipTests
3. play with the affected scenarios to verify the change

Any thoughts or comments? Thank you.