[ 
https://issues.apache.org/jira/browse/ATLAS-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barbara Eckman updated ATLAS-2915:
----------------------------------
    Description: 
Currently the base types in Atlas do not include AWS Kinesis Stream objects. It 
would be nice to add a typedef for a kinesis stream, inheriting from DataSet.  
Attributes would include:
 * streamType string, e.g. "Single Region Stream".
 * awsRegion string: the AWS region in which the kinesis stream endpoint is 
deployed
 * shardCount int: number of shards (uniquely identified sequences of data 
records) in the stream
 * streamEnvironment enum.  Valid values are "unknown", "production", 
"staging", "QA" and "development"
 * containsPII boolean: does this stream's data contain Personally Identifiable 
Information?
 * aggregationFormat enum. Indicates if/how the records are aggregated within a 
single kinesis record. Valid values are "none" or "kpl".
 * contentType enum: serialization format used by the producer of the stream.  
Valid values are "unknown", "avro", "bson", "csv", "json", "key-value", "kryo", 
"protobuf", "raw" [i.e. no consistent schema], "sdp" [confluent-style avro with 
envelope that specifies schema id surrounding the payload], "thrift", "tlv", 
"xml", "other".
 * schemaURL string: A URL to the data schema used by the producer, to 
facilitate consumption.
 * avroSchemas: array of avro schema objects (see ATLAS-2694) associated with 
the kinesis stream.
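
The attribute list above could be sketched as an Atlas typedef payload roughly as follows. This is only an illustration, not the typedef proposed in this issue: the entity name aws_kinesis_stream, the aws_kinesis_* enum names, the avro_schema type name (assumed from ATLAS-2694), the typeVersion, and the cardinality/optional flags are all assumptions. Only the attribute names, the enum values, and the DataSet supertype come from the description.

```python
import json


def enum_def(name, values):
    """Build an Atlas-style enum definition from a list of string values."""
    return {
        "category": "ENUM",
        "name": name,
        "elementDefs": [{"ordinal": i, "value": v} for i, v in enumerate(values)],
    }


def attr(name, type_name, cardinality="SINGLE"):
    """Build an Atlas-style attribute definition (all flags are assumed defaults)."""
    return {
        "name": name,
        "typeName": type_name,
        "cardinality": cardinality,
        "isOptional": True,
        "isUnique": False,
        "isIndexable": False,
    }


# Enum names are hypothetical; the values are taken from the issue description.
enums = [
    enum_def("aws_kinesis_stream_environment",
             ["unknown", "production", "staging", "QA", "development"]),
    enum_def("aws_kinesis_aggregation_format", ["none", "kpl"]),
    enum_def("aws_kinesis_content_type",
             ["unknown", "avro", "bson", "csv", "json", "key-value", "kryo",
              "protobuf", "raw", "sdp", "thrift", "tlv", "xml", "other"]),
]

# Entity name is hypothetical; DataSet as supertype is from the description.
entity = {
    "category": "ENTITY",
    "name": "aws_kinesis_stream",
    "superTypes": ["DataSet"],
    "typeVersion": "1.0",  # assumed
    "attributeDefs": [
        attr("streamType", "string"),
        attr("awsRegion", "string"),
        attr("shardCount", "int"),
        attr("streamEnvironment", "aws_kinesis_stream_environment"),
        attr("containsPII", "boolean"),
        attr("aggregationFormat", "aws_kinesis_aggregation_format"),
        attr("contentType", "aws_kinesis_content_type"),
        attr("schemaURL", "string"),
        # "avro_schema" is an assumed type name for the ATLAS-2694 objects.
        attr("avroSchemas", "array<avro_schema>", cardinality="SET"),
    ],
}

typedefs = {"enumDefs": enums, "entityDefs": [entity]}

if __name__ == "__main__":
    print(json.dumps(typedefs, indent=2))
```

Running the script prints the JSON document, which could then be adapted to whatever model file or type-registration mechanism the final patch uses.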

 

  was:
Currently the base types in Atlas do not include AWS Kinesis Stream objects. It 
would be nice to add a typedef for a kinesis stream.  Attributes would include:
 * streamType string, e.g. "Single Region Stream".
 * awsRegion string: the AWS region in which the kinesis stream endpoint is 
deployed
 * shardCount int: number of shards (uniquely identified sequences of data 
records) in the stream
 * streamEnvironment enum.  Valid values are "unknown", "production", 
"staging", "QA" and "development"
 * containsPII boolean: does this stream's data contain Personally Identifiable 
Information?
 * aggregationFormat enum. Indicates if/how the records are aggregated within a 
single kinesis record. Valid values are "none" or "kpl".
 * contentType enum: serialization format used by the producer of the stream.  
Valid values are "unknown", "avro", "bson", "csv", "json", "key-value", "kryo", 
"protobuf", "raw" [i.e. no consistent schema], "sdp" [confluent-style avro with 
envelope that specifies schema id surrounding the payload], "thrift", "tlv", 
"xml", "other".
 * schemaURL string: A URL to the data schema used by the producer, to 
facilitate consumption.
 * avroSchemas: array of avro schema objects (see ATLAS-2694) associated with 
the kinesis stream.

 


> AWS Kinesis Stream Typedef for Atlas
> ------------------------------------
>
>                 Key: ATLAS-2915
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2915
>             Project: Atlas
>          Issue Type: New Feature
>            Reporter: Barbara Eckman
>            Priority: Major
>
> Currently the base types in Atlas do not include AWS Kinesis Stream objects. 
> It would be nice to add a typedef for a kinesis stream, inheriting from 
> DataSet.  Attributes would include:
>  * streamType string, e.g. "Single Region Stream".
>  * awsRegion string: the AWS region in which the kinesis stream endpoint is 
> deployed
>  * shardCount int: number of shards (uniquely identified sequences of data 
> records) in the stream
>  * streamEnvironment enum.  Valid values are "unknown", "production", 
> "staging", "QA" and "development"
>  * containsPII boolean: does this stream's data contain Personally 
> Identifiable Information?
>  * aggregationFormat enum. Indicates if/how the records are aggregated within 
> a single kinesis record. Valid values are "none" or "kpl".
>  * contentType enum: serialization format used by the producer of the stream. 
>  Valid values are "unknown", "avro", "bson", "csv", "json", "key-value", 
> "kryo", "protobuf", "raw" [i.e. no consistent schema], "sdp" [confluent-style 
> avro with envelope that specifies schema id surrounding the payload], 
> "thrift", "tlv", "xml", "other".
>  * schemaURL string: A URL to the data schema used by the producer, to 
> facilitate consumption.
>  * avroSchemas: array of avro schema objects (see ATLAS-2694) associated with 
> the kinesis stream.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
