[ https://issues.apache.org/jira/browse/KAFKA-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811608#comment-15811608 ]
ASF GitHub Bot commented on KAFKA-4353:
---------------------------------------

GitHub user rnpridgeon opened a pull request:

    https://github.com/apache/kafka/pull/2334

    subset of KAFKA-4353: Add uuid

    Alternatively, I am open to using Avro's 'Fixed' type with a 16-byte size. However, most requests I have seen ask for UUID to be represented as type String, so I went with that to start.

    Also, I just realized that IntelliJ's 'optimize imports' squashed the import list. A follow-up commit with the expanded import list is coming soon.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rnpridgeon/kafka add_uuid

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2334.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2334

----
commit 8413f7d558e6c9b1a4bfe580c6557e4fc337ed97
Author: rnpridgeon <ryan.n.pridg...@gmail.com>
Date:   2017-01-09T11:28:24Z

    Add UUID logical type

commit fdbce61dd1eefcaf08276a19d3e424e33d78cbba
Author: rnpridgeon <ryan.n.pridg...@gmail.com>
Date:   2017-01-09T11:49:19Z

    Add UUID logical type (fix tests)

----

> Add semantic types to Kafka Connect
> -----------------------------------
>
>                 Key: KAFKA-4353
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4353
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>    Affects Versions: 0.10.0.1
>            Reporter: Randall Hauch
>            Assignee: Ewen Cheslack-Postava
>
> Kafka Connect's schema system defines several _core types_ that consist of:
> * STRUCT
> * ARRAY
> * MAP
> plus these _primitive types_:
> * INT8
> * INT16
> * INT32
> * INT64
> * FLOAT32
> * FLOAT64
> * BOOLEAN
> * STRING
> * BYTES
> The {{Schema}} for these core types defines several attributes but does
> not have a name.
> Kafka Connect also defines several _logical types_ that are specializations
> of the primitive types and _do_ have schema names _and_ are automatically
> mapped to/from Java objects:
> || Schema Name || Primitive Type || Java value class || Description ||
> | o.k.c.d.Decimal | {{BYTES}} | {{java.math.BigDecimal}} | An arbitrary-precision signed decimal number. |
> | o.k.c.d.Date | {{INT32}} | {{java.util.Date}} | A date representing a calendar day with no time of day or timezone. The {{java.util.Date}} value's hours, minutes, seconds, and milliseconds are set to 0. The underlying representation is an integer representing the number of standardized days (based on a number of milliseconds with 24 hours/day, 60 minutes/hour, 60 seconds/minute, 1000 milliseconds/second) since the Unix epoch. |
> | o.k.c.d.Time | {{INT32}} | {{java.util.Date}} | A time representing a specific point in a day, not tied to any specific date. Only the {{java.util.Date}} value's hours, minutes, seconds, and milliseconds can be non-zero. This effectively makes it a point in time during the first day after the Unix epoch. The underlying representation is an integer representing the number of milliseconds after midnight. |
> | o.k.c.d.Timestamp | {{INT64}} | {{java.util.Date}} | A timestamp representing an absolute time, without timezone information. The underlying representation is a long representing the number of milliseconds since the Unix epoch. |
> where "o.k.c.d" is short for {{org.apache.kafka.connect.data}}. [~ewencp] has
> stated in the past that adding more logical types is challenging and generally
> undesirable, since everyone who uses Kafka Connect values has to deal with
> every new logical type.
> This proposal adds standard _semantic_ types that are somewhere between the
> core types and logical types. Basically, they are just predefined schemas
> that have names and are based on the primitive types.
> However, there is no mapping to any form other than the primitive.
> The purpose of semantic types is to provide hints as to how the values _can_
> be treated. Of course, clients are free to ignore the hints of some or all of
> the built-in semantic types, and in these cases would treat the values as the
> primitive value with no extra semantics. This behavior makes it much easier
> to add new semantic types over time without risking incompatibilities.
> Really, any source connector can define custom semantic types, but there is
> tremendous value in having a library of standard, well-known semantic types,
> including:
> || Schema Name || Primitive Type || Description ||
> | o.k.c.d.Uuid | {{STRING}} | A UUID in string form. |
> | o.k.c.d.Json | {{STRING}} | A JSON document, array, or scalar in string form. |
> | o.k.c.d.Xml | {{STRING}} | An XML document in string form. |
> | o.k.c.d.BitSet | {{STRING}} | A string of zero or more {{0}} or {{1}} characters. |
> | o.k.c.d.ZonedTime | {{STRING}} | An ISO-8601 formatted representation of a time (with fractional seconds) with timezone or offset from UTC. |
> | o.k.c.d.ZonedTimestamp | {{STRING}} | An ISO-8601 formatted representation of a timestamp with timezone or offset from UTC. |
> | o.k.c.d.EpochDays | {{INT64}} | A date with no time or timezone information, represented as the number of days since (or before) the epoch, January 1, 1970, at 00:00:00 UTC. |
> | o.k.c.d.Year | {{INT32}} | The year number. |
> | o.k.c.d.MilliTime | {{INT32}} | Number of milliseconds past midnight. |
> | o.k.c.d.MicroTime | {{INT64}} | Number of microseconds past midnight. |
> | o.k.c.d.NanoTime | {{INT64}} | Number of nanoseconds past midnight. |
> | o.k.c.d.MilliTimestamp | {{INT64}} | Number of milliseconds past epoch. |
> | o.k.c.d.MicroTimestamp | {{INT64}} | Number of microseconds past epoch. |



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
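
For readers following the issue, the "hint" behaviour described in the quoted proposal can be sketched roughly as below. This is only an illustration under stated assumptions: it uses plain JDK classes rather than the Kafka Connect API, the class SemanticTypeSketch and the method interpret are hypothetical names, and the schema name is the Uuid name proposed in the issue, not something that exists in a released Kafka version.

```java
import java.util.UUID;

public class SemanticTypeSketch {
    // Schema name proposed in the issue for a UUID carried as a STRING
    // (a proposal, not an existing Kafka Connect class).
    static final String UUID_SCHEMA_NAME = "org.apache.kafka.connect.data.Uuid";

    // Hypothetical helper, not a Kafka Connect API: a consumer that understands
    // the Uuid hint converts the string to java.util.UUID; for any other
    // (or missing) schema name the primitive STRING value is returned unchanged.
    static Object interpret(String schemaName, String value) {
        if (UUID_SCHEMA_NAME.equals(schemaName)) {
            return UUID.fromString(value);
        }
        return value;
    }

    public static void main(String[] args) {
        String raw = "123e4567-e89b-12d3-a456-426614174000";
        Object aware = interpret(UUID_SCHEMA_NAME, raw); // hint honoured
        Object unaware = interpret(null, raw);           // hint ignored
        System.out.println(aware.getClass().getName());
        System.out.println(unaware.getClass().getName());
    }
}
```

A client that does not recognise the schema name still receives a valid string, which is presumably why the proposal argues that new semantic types can be added without breaking existing consumers.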