Re: [DISCUSS] scenario-based quickstart demo
+1. Agreed, we don't have recipes for each feature as such. This would
benefit users who are interested in a particular feature.

On Tue, Jul 6, 2021 at 2:17 AM Vinoth Chandar wrote:

> Hi Raymond,
>
> Are you suggesting a fix to the dev workflow or general site/quickstart
> docs?
>
> Agree, that the current doc is all-at-once and at least better docs on
> incrementally testing parts could be useful.
> It takes a while to learn what to skip and what not to.
>
> Thanks
> Vinoth
>
> On Sat, Jul 3, 2021 at 2:11 PM Raymond Xu wrote:
>
> > I found the demo setup in the "docker" directory not beginner friendly.
> > It took some effort to digest what's there and it's hard to play with.
> > Proposing some scenario-based quickstart setup
> >
> > - Scenario 1: DeltaStreamer write
> >   - sample raw dataset, local FS
> >   - run deltastreamer with local Spark or Flink write to COW or MOR
> > - Scenario 2: meta sync
> >   - sample hoodie table (COW or MOR), local FS
> >   - run hive sync with local Hive server
> > - Scenario 3: SQL read
> >   - sample hoodie table (COW or MOR), local FS
> >   - run local Trino/Presto queries
> > - More scenarios: incremental read, clustering, etc
> >
> > In all scenarios, users can choose between a release version and the
> > local version of Hudi.
> >
> > Not meant to replace the current "docker" demo. It can be under a
> > "quickstart" dir and aims to be more focused quick sandbox. A typical
> > dev flow is
> >   1. changed some code
> >   2. run mvn install -DskipTests
> >   3. play with affected scenarios to verify the change
> >
> > Any thoughts or comments? Thank you.

--
Regards,
-Sivabalan
Re: [DISCUSS] scenario-based quickstart demo
Hi Raymond,

Are you suggesting a fix to the dev workflow or to the general
site/quickstart docs?

Agreed that the current doc is all-at-once, and at the least, better docs on
incrementally testing parts could be useful.
It takes a while to learn what to skip and what not to.

Thanks
Vinoth

On Sat, Jul 3, 2021 at 2:11 PM Raymond Xu wrote:

> I found the demo setup in the "docker" directory not beginner friendly. It
> took some effort to digest what's there and it's hard to play with.
> Proposing some scenario-based quickstart setup
>
> - Scenario 1: DeltaStreamer write
>   - sample raw dataset, local FS
>   - run deltastreamer with local Spark or Flink write to COW or MOR
> - Scenario 2: meta sync
>   - sample hoodie table (COW or MOR), local FS
>   - run hive sync with local Hive server
> - Scenario 3: SQL read
>   - sample hoodie table (COW or MOR), local FS
>   - run local Trino/Presto queries
> - More scenarios: incremental read, clustering, etc
>
> In all scenarios, users can choose between a release version and the local
> version of Hudi.
>
> Not meant to replace the current "docker" demo. It can be under a
> "quickstart" dir and aims to be more focused quick sandbox. A typical dev
> flow is
>   1. changed some code
>   2. run mvn install -DskipTests
>   3. play with affected scenarios to verify the change
>
> Any thoughts or comments? Thank you.
[DISCUSS] scenario-based quickstart demo
I found the demo setup in the "docker" directory not beginner friendly. It
took some effort to digest what's there and it's hard to play with.
Proposing some scenario-based quickstart setup:

- Scenario 1: DeltaStreamer write
  - sample raw dataset, local FS
  - run deltastreamer with local Spark or Flink, writing to COW or MOR
- Scenario 2: meta sync
  - sample hoodie table (COW or MOR), local FS
  - run hive sync with local Hive server
- Scenario 3: SQL read
  - sample hoodie table (COW or MOR), local FS
  - run local Trino/Presto queries
- More scenarios: incremental read, clustering, etc.

In all scenarios, users can choose between a release version and the local
version of Hudi.

This is not meant to replace the current "docker" demo. It can live under a
"quickstart" dir and aims to be a more focused, quick sandbox. A typical dev
flow is:
  1. change some code
  2. run mvn install -DskipTests
  3. play with the affected scenarios to verify the change

Any thoughts or comments? Thank you.
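
To make the proposal a bit more concrete, here is a rough Python sketch of how the scenarios and the dev flow above could be described as data. It is only an illustration: the Scenario class, the scenario keys ("deltastreamer-write", "meta-sync", "sql-read") and the dev_flow helper are hypothetical names, nothing like this exists in the Hudi repo today. It only builds a list of steps; it does not invoke Maven, Spark, Hive, or Trino/Presto.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One focused quickstart scenario (hypothetical structure)."""
    key: str           # short identifier, invented for this sketch
    sample_input: str  # what the scenario starts from
    action: str        # what the user runs


# The three scenarios listed in the proposal; the keys are assumptions.
SCENARIOS = {
    s.key: s
    for s in [
        Scenario("deltastreamer-write",
                 "sample raw dataset, local FS",
                 "run deltastreamer with local Spark or Flink, writing to COW or MOR"),
        Scenario("meta-sync",
                 "sample hoodie table (COW or MOR), local FS",
                 "run hive sync with local Hive server"),
        Scenario("sql-read",
                 "sample hoodie table (COW or MOR), local FS",
                 "run local Trino/Presto queries"),
    ]
}


def dev_flow(affected, use_local_build=True):
    """Return the ordered steps for verifying a change.

    use_local_build=True models choosing the local version of Hudi, which
    requires building first; False models using a release version.
    """
    steps = []
    if use_local_build:
        # Step 2 of the dev flow in the proposal.
        steps.append("mvn install -DskipTests")
    for key in affected:
        scenario = SCENARIOS[key]  # raises KeyError for an unknown scenario
        steps.append(f"{scenario.key}: {scenario.action} "
                     f"(input: {scenario.sample_input})")
    return steps


if __name__ == "__main__":
    for step in dev_flow(["deltastreamer-write", "meta-sync"]):
        print(step)
```

Keeping each scenario as a small, independent entry is what would let a developer run only the ones affected by a change, instead of walking through the whole "docker" demo.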