This is an automated email from the ASF dual-hosted git repository. fjy pushed a commit to branch 0.15.0-incubating in repository https://gitbox.apache.org/repos/asf/incubator-druid.git
The following commit(s) were added to refs/heads/0.15.0-incubating by this push:
     new f695425  Update tutorial to delete data (#7577) (#7732)
f695425 is described below

commit f695425c5b4c79ed945dbe58e2d32d74e5b47503
Author: Jihoon Son <jihoon...@apache.org>
AuthorDate: Wed May 22 10:48:56 2019 -0700

    Update tutorial to delete data (#7577) (#7732)

    * Update tutorial to delete data

    * update tutorial, remove old ways to drop data

    * PR comments
---
 .../content/tutorials/img/tutorial-deletion-02.png | Bin 200459 -> 810422 bytes
 .../content/tutorials/img/tutorial-deletion-03.png | Bin 0 -> 805673 bytes
 docs/content/tutorials/tutorial-delete-data.md     |  58 +++++++++++----------
 .../tutorial/deletion-disable-segments.json        |   7 +++
 4 files changed, 37 insertions(+), 28 deletions(-)

diff --git a/docs/content/tutorials/img/tutorial-deletion-02.png b/docs/content/tutorials/img/tutorial-deletion-02.png
index fdea20f..9b84f0c 100644
Binary files a/docs/content/tutorials/img/tutorial-deletion-02.png and b/docs/content/tutorials/img/tutorial-deletion-02.png differ
diff --git a/docs/content/tutorials/img/tutorial-deletion-03.png b/docs/content/tutorials/img/tutorial-deletion-03.png
new file mode 100644
index 0000000..e6fb1f3
Binary files /dev/null and b/docs/content/tutorials/img/tutorial-deletion-03.png differ
diff --git a/docs/content/tutorials/tutorial-delete-data.md b/docs/content/tutorials/tutorial-delete-data.md
index 41812f7..46fbbdc 100644
--- a/docs/content/tutorials/tutorial-delete-data.md
+++ b/docs/content/tutorials/tutorial-delete-data.md
@@ -29,8 +29,6 @@ This tutorial demonstrates how to delete existing data.
 
 For this tutorial, we'll assume you've already downloaded Apache Druid (incubating) as described in the [single-machine quickstart](index.html) and have it running on your local machine.
 
-Completing [Tutorial: Configuring retention](../tutorials/tutorial-retention.html) first is highly recommended, as we will be using retention rules in this tutorial.
-
 ## Load initial data
 
 In this tutorial, we will use the Wikipedia edits data, with an indexing spec that creates hourly segments. This spec is located at `quickstart/tutorial/deletion-index.json`, and it creates a datasource called `deletion-tutorial`.
@@ -47,30 +45,25 @@ When the load finishes, open [http://localhost:8888/unified-console.html#datasou
 
 Permanent deletion of a Druid segment has two steps:
 
-1. The segment must first be marked as "unused". This occurs when a segment is dropped by retention rules, and when a user manually disables a segment through the Coordinator API. This tutorial will cover both cases.
+1. The segment must first be marked as "unused". This occurs when a user manually disables a segment through the Coordinator API.
 2. After segments have been marked as "unused", a Kill Task will delete any "unused" segments from Druid's metadata store as well as deep storage.
 
-Let's drop some segments now, first with load rules, then manually.
-
-## Drop some data with load rules
-
-As with the previous retention tutorial, there are currently 24 segments in the `deletion-tutorial` datasource.
-
-click the blue pencil icon next to `Cluster default: loadForever` for the `deletion-tutorial` datasource.
+Let's drop some segments now, by using the coordinator API to drop data by interval and segmentIds.
 
-A rule configuration window will appear.
+## Disable segments by interval
 
-Now click the `+ New rule` button twice.
+Let's disable segments in a specified interval. This will mark all segments in the interval as "unused", but not remove them from deep storage.
+Let's disable segments in interval `2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z` i.e. between hour 18 and 20.
 
-In the upper rule box, select `Load` and `by interval`, and then enter `2015-09-12T12:00:00.000Z/2015-09-13T00:00:00.000Z` in field next to `by interval`. Replicants can remain at 2 in the `_default_tier`.
-
-In the lower rule box, select `Drop` and `forever`.
+```bash
+curl -X 'POST' -H 'Content-Type:application/json' -d '{ "interval" : "2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z" }' http://localhost:8081/druid/coordinator/v1/datasources/deletion-tutorial/markUnused
+```
 
-Now click `Next` and enter `tutorial` for both the user and changelog comment field.
+After that command completes, you should see that the segment for hour 18 and 19 have been disabled:
 
-This will cause the first 12 segments of `deletion-tutorial` to be dropped. However, these dropped segments are not removed from deep storage.
+![Segments 2](../tutorials/img/tutorial-deletion-02.png "Segments 2")
 
-You can see that all 24 segments are still present in deep storage by listing the contents of `apache-druid-#{DRUIDVERSION}/var/druid/segments/deletion-tutorial`:
+Note that the hour 18 and 19 segments are still present in deep storage:
 
 ```bash
 $ ls -l1 var/druid/segments/deletion-tutorial/
@@ -100,9 +93,9 @@ $ ls -l1 var/druid/segments/deletion-tutorial/
 2015-09-12T23:00:00.000Z_2015-09-13T00:00:00.000Z
 ```
 
-## Manually disable a segment
+## Disable segments by segment IDs
 
-Let's manually disable a segment now. This will mark a segment as "unused", but not remove it from deep storage.
+Let's disable some segments by their segmentID. This will again mark the segments as "unused", but not remove them from deep storage. You can see the full segmentID for a segment from UI as explained below.
 
 In the [segments view](http://localhost:8888/unified-console.html#segments), click the arrow on the left side of one of the remaining segments to expand the segment entry:
 
@@ -110,17 +103,29 @@ In the [segments view](http://localhost:8888/unified-console.html#segments), cli
 
 The top of the info box shows the full segment ID, e.g. `deletion-tutorial_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2019-02-28T01:11:51.606Z` for the segment of hour 14.
 
-Let's disable the hour 14 segment by sending the following DELETE request to the Coordinator, where {SEGMENT-ID} is the full segment ID shown in the info box:
+Let's disable the hour 13 and 14 segments by sending a POST request to the Coordinator with this payload
+
+```json
+{
+  "segmentIds":
+  [
+    "deletion-tutorial_2015-09-12T13:00:00.000Z_2015-09-12T14:00:00.000Z_2019-05-01T17:38:46.961Z",
+    "deletion-tutorial_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2019-05-01T17:38:46.961Z"
+  ]
+}
+```
+
+This payload json has been provided at `quickstart/tutorial/deletion-disable-segments.json`. Submit the POST request to Coordinator like this:
 
 ```bash
-curl -XDELETE http://localhost:8081/druid/coordinator/v1/datasources/deletion-tutorial/segments/{SEGMENT-ID}
+curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/tutorial/deletion-disable-segments.json http://localhost:8081/druid/coordinator/v1/datasources/deletion-tutorial/markUnused
 ```
 
-After that command completes, you should see that the segment for hour 14 has been disabled:
+After that command completes, you should see that the segments for hour 13 and 14 have been disabled:
 
-![Segments 2](../tutorials/img/tutorial-deletion-02.png "Segments 2")
+![Segments 3](../tutorials/img/tutorial-deletion-03.png "Segments 3")
 
-Note that the hour 14 segment is still in deep storage:
+Note that the hour 13 and 14 segments are still in deep storage:
 
 ```bash
 $ ls -l1 var/druid/segments/deletion-tutorial/
@@ -165,12 +170,9 @@ After this task completes, you can see that the disabled segments have now been
 ```bash
 $ ls -l1 var/druid/segments/deletion-tutorial/
 2015-09-12T12:00:00.000Z_2015-09-12T13:00:00.000Z
-2015-09-12T13:00:00.000Z_2015-09-12T14:00:00.000Z
 2015-09-12T15:00:00.000Z_2015-09-12T16:00:00.000Z
 2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z
 2015-09-12T17:00:00.000Z_2015-09-12T18:00:00.000Z
-2015-09-12T18:00:00.000Z_2015-09-12T19:00:00.000Z
-2015-09-12T19:00:00.000Z_2015-09-12T20:00:00.000Z
 2015-09-12T20:00:00.000Z_2015-09-12T21:00:00.000Z
 2015-09-12T21:00:00.000Z_2015-09-12T22:00:00.000Z
 2015-09-12T22:00:00.000Z_2015-09-12T23:00:00.000Z
diff --git a/examples/quickstart/tutorial/deletion-disable-segments.json b/examples/quickstart/tutorial/deletion-disable-segments.json
new file mode 100644
index 0000000..920e071
--- /dev/null
+++ b/examples/quickstart/tutorial/deletion-disable-segments.json
@@ -0,0 +1,7 @@
+{
+  "segmentIds":
+  [
+    "deletion-tutorial_2015-09-12T13:00:00.000Z_2015-09-12T14:00:00.000Z_2019-05-01T17:38:46.961Z",
+    "deletion-tutorial_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2019-05-01T17:38:46.961Z"
+  ]
+}

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org
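
Reviewer's sketch, not part of the commit: the two `markUnused` requests that the updated tutorial issues (one by interval, one by segment IDs) could be wrapped in small shell helpers along these lines. The function names and the `COORDINATOR_URL` variable are hypothetical and exist only for illustration. The endpoint path and the `interval` / `segmentIds` payload keys are taken from the diff above. The helpers assume IDs and intervals contain no characters that need JSON escaping, which holds for the values shown in the tutorial.

```bash
#!/bin/sh
# Illustrative helpers only; these names are hypothetical and not part of Druid.
# COORDINATOR_URL defaults to the address used by the tutorial's curl commands.
COORDINATOR_URL="${COORDINATOR_URL:-http://localhost:8081}"

# Build the JSON body that marks every segment in an interval as unused.
# $1: ISO-8601 interval, e.g. 2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z
interval_payload() {
  printf '{"interval":"%s"}' "$1"
}

# Build the JSON body that marks the listed segment IDs as unused.
# Runs in a subshell (parentheses) so the loop variables do not leak.
segment_ids_payload() (
  body=""
  for id in "$@"; do
    if [ -n "$body" ]; then
      body="$body,"
    fi
    body="$body\"$id\""
  done
  printf '{"segmentIds":[%s]}' "$body"
)

# POST a payload to the markUnused endpoint of a datasource.
# $1: datasource name, $2: JSON payload
mark_unused() {
  curl -sS -X POST -H 'Content-Type: application/json' \
    -d "$2" \
    "$COORDINATOR_URL/druid/coordinator/v1/datasources/$1/markUnused"
}

# Example usage against a running tutorial cluster (commented out on purpose):
# mark_unused deletion-tutorial \
#   "$(interval_payload '2015-09-12T18:00:00.000Z/2015-09-12T20:00:00.000Z')"
```

Keeping payload construction separate from the HTTP call makes it possible to inspect the JSON before anything is sent to the Coordinator.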