writer-jill commented on code in PR #12344:
URL: https://github.com/apache/druid/pull/12344#discussion_r890211338
##########
docs/design/segments.md:
##########
@@ -23,231 +23,198 @@ title: "Segments"
 -->
 
-Apache Druid stores its index in *segment files*, which are partitioned by
-time. In a basic setup, one segment file is created for each time
+Apache Druid stores its index in *segment files* partitioned by
+time. In a basic setup, Druid creates one segment file for each time
 interval, where the time interval is configurable in the
 `segmentGranularity` parameter of the
-[`granularitySpec`](../ingestion/ingestion-spec.md#granularityspec). For Druid to
-operate well under heavy query load, it is important for the segment
+[`granularitySpec`](../ingestion/ingestion-spec.md#granularityspec).
+
+For Druid to operate well under heavy query load, it is important for the segment
 file size to be within the recommended range of 300MB-700MB. If your
 segment files are larger than this range, then consider either
 changing the granularity of the time interval or partitioning your
-data and tweaking the `targetRowsPerSegment` in your `partitionsSpec`
-(a good starting point for this parameter is 5 million rows). See the
-sharding section below and the 'Partitioning specification' section of
+data and adjusting the `targetRowsPerSegment` in your `partitionsSpec`.
+A good starting point for this parameter is 5 million rows.
+
+See the Sharding section below and the 'Partitioning specification' section of
 the [Batch ingestion](../ingestion/hadoop.md#partitionsspec) documentation
-for more information.
+for more guidance.
 
-### A segment file's core data structures
+## Segment file structure
 
-Here we describe the internal structure of segment files, which is
-essentially *columnar*: the data for each column is laid out in
-separate data structures. By storing each column separately, Druid can
-decrease query latency by scanning only those columns actually needed
-for a query. There are three basic column types: the timestamp
-column, dimension columns, and metric columns, as illustrated in the
-image below:
+Segment files are *columnar*: the data for each column is laid out in
+separate data structures. By storing each column separately, Druid decreases query latency by scanning only those columns actually needed for a query. There are three basic column types: timestamp, dimensions, and metrics:
 
 
-The timestamp and metric columns are simple: behind the scenes each of
-these is an array of integer or floating point values compressed with
-LZ4. Once a query knows which rows it needs to select, it simply
-decompresses these, pulls out the relevant rows, and applies the
-desired aggregation operator. As with all columns, if a query doesn’t
-require a column, then that column’s data is just skipped over.
+Timestamp and metrics type columns are arrays of integer or floating point values compressed with
+[LZ4](https://github.com/lz4/lz4-java). Once a query identifies which rows to select, it decompresses them, pulls out the relevant rows, and applies the
+desired aggregation operator. If a query doesn’t require a column, Druid skips over that column's data.
 
-Dimensions columns are different because they support filter and
+Dimension columns are different because they support filter and
 group-by operations, so each dimension requires the following
 three data structures:
 
-1. A dictionary that maps values (which are always treated as strings) to integer IDs,
-2. A list of the column’s values, encoded using the dictionary in 1, and
-3. For each distinct value in the column, a bitmap that indicates which rows contain that value.
-
-
-Why these three data structures? The dictionary simply maps string
-values to integer ids so that the values in \(2\) and \(3\) can be
-represented compactly. The bitmaps in \(3\) -- also known as *inverted
-indexes* allow for quick filtering operations (specifically, bitmaps
-are convenient for quickly applying AND and OR operators). Finally,
-the list of values in \(2\) is needed for *group by* and *TopN*
-queries. In other words, queries that solely aggregate metrics based
-on filters do not need to touch the list of dimension values stored in \(2\).
+- Dictionary: Maps values (which are always treated as strings) to integer IDs, allowing compact representation of the list and bitmap values.
+- List: The column’s values, encoded using the dictionary. Required for GroupBy and TopN queries. These operators allow queries that solely aggregate metrics based on filters to run without accessing the list of values.
+- Bitmap: One bitmap for each distinct value in the column, to indicate which rows contain that value. Bitmaps allow for quick filtering operations because they are convenient for quickly applying AND and OR operators. Also known as inverted indexes.
 
-To get a concrete sense of these data structures, consider the ‘page’
-column from the example data above. The three data structures that
-represent this dimension are illustrated in the diagram below.
+To get a better sense of these data structures, consider the ‘page’ column from the given example data as represented by the following data structures:
 
 ```
-1: Dictionary that encodes column values
-  {
+1: Dictionary
+   {
     "Justin Bieber": 0,
     "Ke$ha":         1
-  }
+   }
 
-2: Column data
-  [0,
+2: List of column data
+   [0,
    0,
    1,
    1]
 
-3: Bitmaps - one for each unique value of the column
-  value="Justin Bieber": [1,1,0,0]
-  value="Ke$ha":         [0,0,1,1]
+3: Bitmaps
+   value="Justin Bieber": [1,1,0,0]
+   value="Ke$ha":         [0,0,1,1]
 ```
 
-Note that the bitmap is different from the first two data structures:
-whereas the first two grow linearly in the size of the data (in the
-worst case), the size of the bitmap section is the product of data
-size * column cardinality. Compression will help us here though
-because we know that for each row in 'column data', there will only be a
-single bitmap that has non-zero entry. This means that high cardinality
-columns will have extremely sparse, and therefore highly compressible,
-bitmaps. Druid exploits this using compression algorithms that are
-specially suited for bitmaps, such as roaring bitmap compression.
+Note that the bitmap is different from the dictionary and list data structures: the dictionary and list grow linearly with the size of the data, but the size of the bitmap section is the product of data size * column cardinality.

Review Comment:
   @paul-rogers Can you please suggest a correct version of this sentence? We mention cardinality in relation to bitmap growth so I'm not certain of the difference between dictionary, list, and bitmap structures.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
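
For reference while the sentence is being reworded, here is a minimal Python sketch of the three structures for the 'page' column. It is an illustration only, not Druid code: the function name `encode_dimension` is hypothetical, and the bitmaps are plain uncompressed lists rather than roaring bitmaps. Under those assumptions it shows that the dictionary has one entry per distinct value, the list has one entry per row, and the uncompressed bitmaps hold rows * cardinality bits, of which only one bit per row is set.

```python
def encode_dimension(values):
    """Hypothetical helper: build a dictionary, an encoded list, and
    uncompressed per-value bitmaps for a single dimension column."""
    # Dictionary: each distinct value gets an integer ID (sorted for determinism).
    dictionary = {v: i for i, v in enumerate(sorted(set(values)))}
    # List: one dictionary ID per row.
    encoded = [dictionary[v] for v in values]
    # Bitmaps: one bitmap per distinct value, one bit per row.
    bitmaps = {v: [1 if row == v else 0 for row in values] for v in dictionary}
    return dictionary, encoded, bitmaps


page = ["Justin Bieber", "Justin Bieber", "Ke$ha", "Ke$ha"]
dictionary, encoded, bitmaps = encode_dimension(page)

rows = len(page)
cardinality = len(dictionary)
total_bits = sum(len(b) for b in bitmaps.values())  # rows * cardinality
set_bits = sum(sum(b) for b in bitmaps.values())    # exactly one per row

print(dictionary)            # {'Justin Bieber': 0, 'Ke$ha': 1}
print(encoded)               # [0, 0, 1, 1]
print(bitmaps)               # one [1,1,0,0]-style bitmap per value
print(total_bits, set_bits)  # 8 4
```

Because only `rows` of the `rows * cardinality` bits are ever set, the bitmaps become sparser as cardinality grows, which is why the original text says compression helps here.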
