This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch k8s-quickstart
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git
commit c39fe6caf88405557d3e360ba82d915849afd7e1 Author: Xiang Fu <[email protected]> AuthorDate: Wed Sep 4 12:54:15 2019 -0700 Adding example for kubernetes deployment on GKE --- kubernetes/README.md | 6 + kubernetes/skaffold/README.md | 92 +++++ kubernetes/skaffold/cleanup.sh | 20 ++ kubernetes/skaffold/gke-storageclass-kafka-pd.yml | 9 + .../gke-storageclass-pinot-controller-pd.yml | 9 + .../skaffold/gke-storageclass-pinot-server-pd.yml | 9 + kubernetes/skaffold/gke-storageclass-zk-pd.yml | 9 + kubernetes/skaffold/kafka.yml | 389 +++++++++++++++++++++ kubernetes/skaffold/pinot-broker.yml | 69 ++++ kubernetes/skaffold/pinot-controller.yml | 87 +++++ kubernetes/skaffold/pinot-example-loader.yml | 24 ++ kubernetes/skaffold/pinot-server.yml | 82 +++++ kubernetes/skaffold/query-pinot-data.sh | 9 + kubernetes/skaffold/setup.sh | 33 ++ kubernetes/skaffold/skaffold.yaml | 17 + kubernetes/skaffold/zookeeper.yml | 61 ++++ 16 files changed, 925 insertions(+) diff --git a/kubernetes/README.md b/kubernetes/README.md new file mode 100644 index 0000000..78c64ff --- /dev/null +++ b/kubernetes/README.md @@ -0,0 +1,6 @@ +# Pinot Quickstart on Kubernetes + + +## [Skaffold](https://skaffold.dev/docs/getting-started) + +See the `skaffold` directory for an example of setting up a Pinot cluster using the `skaffold` tooling. diff --git a/kubernetes/skaffold/README.md b/kubernetes/skaffold/README.md new file mode 100644 index 0000000..019b607 --- /dev/null +++ b/kubernetes/skaffold/README.md @@ -0,0 +1,92 @@ +# Pinot Quickstart on Google Kubernetes Engine (GKE) + +## Prerequisites + +- kubectl (https://kubernetes.io/docs/tasks/tools/install-kubectl/) +- Google Cloud SDK (https://cloud.google.com/sdk/install) +- Skaffold (https://skaffold.dev/docs/getting-started/#installing-skaffold) +- Enable your Google Cloud account and create a project, e.g. `pinot-demo`. + - `pinot-demo` is used as the example value for the `${GCLOUD_PROJECT}` variable in the scripts below.
+ - `[email protected]` will be used as example value for `${GCLOUD_EMAIL}`. +- Configure kubectl to connect to the Kubernetes cluster. + +## Create a cluster on GKE + +Below script will: +- Create a gCloud cluster `pinot-quickstart` +- Request 1 server of type `n1-standard-2` for zookeeper, kafka, pinot controller, pinot broker. +- Request 1 server of type `n1-standard-8` for Pinot server. + +Please fill both environment variables: `${GCLOUD_PROJECT}` and `${GCLOUD_EMAIL}` with your gcloud project and gcloud account email in below script. +``` +GCLOUD_PROJECT=[your gcloud project name] +GCLOUD_EMAIL=[Your gcloud account email] +./setup.sh +``` + +E.g. +``` +GCLOUD_PROJECT=pinot-demo [email protected] +./setup.sh +``` + +Feel free to modify the script to pick your preferred sku, e.g. `n1-highmem-32` for Pinot server. + + +## How to connect to an existing cluster +Simply run below command to get the credential for the cluster you just created or your existing cluster. +Please modify the Env variables `${GCLOUD_PROJECT}`, `${GCLOUD_ZONE}`, `${GCLOUD_CLUSTER}` accordingly in below script. +``` +GCLOUD_PROJECT=pinot-demo +GCLOUD_ZONE=us-west1-b +GCLOUD_CLUSTER=pinot-quickstart +gcloud container clusters get-credentials ${GCLOUD_CLUSTER} --zone ${GCLOUD_ZONE} --project ${GCLOUD_PROJECT} +``` + +Look for cluster status +``` +kubectl get all -n pinot-quickstart -o wide +``` + +## How to setup a Pinot cluster for demo + +The script requests: + - Create persistent disk for deep storage and mount it. + - Zookeeper + - Kafka + - Pinot Controller + - Pinot Server + - Create Pods for + - Zookeeper + - Kafka + - Pinot Controller + - Pinot Broker + - Pinot Server + - Pinot Example Loader + + +``` +skaffold run -f skaffold.yaml +``` + +## How to load sample data +By default the `skaffold run -f skaffold.yaml` command will create an example table consuming from Kafka stream. + +Please refer to file `pinot-example-loader.yml` for more details. 
+ +## How to query Pinot data + +Use the script below to set up local port-forwarding and open the Pinot query console in your web browser: +``` +./query-pinot-data.sh +``` + +## How to delete a cluster +The script below will delete the `pinot-quickstart` cluster and its persistent disks. + +Note that you need to replace the gcloud project name if you are using a different one: +``` +GCLOUD_PROJECT=[your gcloud project name] +./cleanup.sh +``` diff --git a/kubernetes/skaffold/cleanup.sh b/kubernetes/skaffold/cleanup.sh new file mode 100755 index 0000000..7d8ad71 --- /dev/null +++ b/kubernetes/skaffold/cleanup.sh @@ -0,0 +1,20 @@ +#!/usr/bin/env bash +set -e + +if [[ -z "${GCLOUD_PROJECT}" ]] +then + echo "Please set \$GCLOUD_PROJECT variable. E.g. GCLOUD_PROJECT=pinot-demo ./cleanup.sh" + exit 1 +fi + +GCLOUD_ZONE=us-west1-b +GCLOUD_CLUSTER=pinot-quickstart + +gcloud container clusters delete ${GCLOUD_CLUSTER} --zone=${GCLOUD_ZONE} --project=${GCLOUD_PROJECT} -q + +for diskname in `gcloud compute disks list --zones=${GCLOUD_ZONE} --project ${GCLOUD_PROJECT} |grep gke-${GCLOUD_CLUSTER}|awk -F ' ' '{print $1}'`; +do +echo $diskname; +gcloud compute disks delete $diskname --zone=${GCLOUD_ZONE} --project ${GCLOUD_PROJECT} -q +done + diff --git a/kubernetes/skaffold/gke-storageclass-kafka-pd.yml b/kubernetes/skaffold/gke-storageclass-kafka-pd.yml new file mode 100644 index 0000000..c3838fd --- /dev/null +++ b/kubernetes/skaffold/gke-storageclass-kafka-pd.yml @@ -0,0 +1,9 @@ +kind: StorageClass +apiVersion: storage.k8s.io/v1 +metadata: + name: kafka + namespace: pinot-quickstart +provisioner: kubernetes.io/gce-pd +reclaimPolicy: Retain +parameters: + type: pd-ssd diff --git a/kubernetes/skaffold/gke-storageclass-pinot-controller-pd.yml b/kubernetes/skaffold/gke-storageclass-pinot-controller-pd.yml new file mode 100644 index 0000000..b6aa635 --- /dev/null +++ b/kubernetes/skaffold/gke-storageclass-pinot-controller-pd.yml @@ -0,0 +1,9 @@ +kind: StorageClass +apiVersion: storage.k8s.io/v1
+metadata: + name: pinot-controller + namespace: pinot-quickstart +provisioner: kubernetes.io/gce-pd +reclaimPolicy: Retain +parameters: + type: pd-standard diff --git a/kubernetes/skaffold/gke-storageclass-pinot-server-pd.yml b/kubernetes/skaffold/gke-storageclass-pinot-server-pd.yml new file mode 100644 index 0000000..9d8a055 --- /dev/null +++ b/kubernetes/skaffold/gke-storageclass-pinot-server-pd.yml @@ -0,0 +1,9 @@ +kind: StorageClass +apiVersion: storage.k8s.io/v1 +metadata: + name: pinot-server + namespace: pinot-quickstart +provisioner: kubernetes.io/gce-pd +reclaimPolicy: Retain +parameters: + type: pd-ssd diff --git a/kubernetes/skaffold/gke-storageclass-zk-pd.yml b/kubernetes/skaffold/gke-storageclass-zk-pd.yml new file mode 100644 index 0000000..a8bd4f3 --- /dev/null +++ b/kubernetes/skaffold/gke-storageclass-zk-pd.yml @@ -0,0 +1,9 @@ +kind: StorageClass +apiVersion: storage.k8s.io/v1 +metadata: + name: zookeeper + namespace: pinot-quickstart +provisioner: kubernetes.io/gce-pd +reclaimPolicy: Retain +parameters: + type: pd-ssd diff --git a/kubernetes/skaffold/kafka.yml b/kubernetes/skaffold/kafka.yml new file mode 100644 index 0000000..187f815 --- /dev/null +++ b/kubernetes/skaffold/kafka.yml @@ -0,0 +1,389 @@ +kind: ConfigMap +metadata: + name: broker-config + namespace: pinot-quickstart +apiVersion: v1 +data: + init.sh: |- + #!/bin/bash + set -e + set -x + cp /etc/kafka-configmap/log4j.properties /etc/kafka/ + + KAFKA_BROKER_ID=${HOSTNAME##*-} + SEDS=("s/#init#broker.id=#init#/broker.id=$KAFKA_BROKER_ID/") + LABELS="kafka-broker-id=$KAFKA_BROKER_ID" + ANNOTATIONS="" + + hash kubectl 2>/dev/null || { + SEDS+=("s/#init#broker.rack=#init#/#init#broker.rack=# kubectl not found in path/") + } && { + ZONE=$(kubectl get node "$NODE_NAME" -o=go-template='{{index .metadata.labels "failure-domain.beta.kubernetes.io/zone"}}') + if [ $? 
-ne 0 ]; then + SEDS+=("s/#init#broker.rack=#init#/#init#broker.rack=# zone lookup failed, see -c init-config logs/") + elif [ "x$ZONE" == "x<no value>" ]; then + SEDS+=("s/#init#broker.rack=#init#/#init#broker.rack=# zone label not found for node $NODE_NAME/") + else + SEDS+=("s/#init#broker.rack=#init#/broker.rack=$ZONE/") + LABELS="$LABELS kafka-broker-rack=$ZONE" + fi + + OUTSIDE_HOST=$(kubectl get node "$NODE_NAME" -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}') + if [ $? -ne 0 ]; then + echo "Outside (i.e. cluster-external access) host lookup command failed" + else + OUTSIDE_PORT=3240${KAFKA_BROKER_ID} + SEDS+=("s|#init#advertised.listeners=OUTSIDE://#init#|advertised.listeners=OUTSIDE://${OUTSIDE_HOST}:${OUTSIDE_PORT}|") + ANNOTATIONS="$ANNOTATIONS kafka-listener-outside-host=$OUTSIDE_HOST kafka-listener-outside-port=$OUTSIDE_PORT" + fi + + if [ ! -z "$LABELS" ]; then + kubectl -n $POD_NAMESPACE label pod $POD_NAME $LABELS || echo "Failed to label $POD_NAMESPACE.$POD_NAME - RBAC issue?" + fi + if [ ! -z "$ANNOTATIONS" ]; then + kubectl -n $POD_NAMESPACE annotate pod $POD_NAME $ANNOTATIONS || echo "Failed to annotate $POD_NAMESPACE.$POD_NAME - RBAC issue?" + fi + } + printf '%s\n' "${SEDS[@]}" | sed -f - /etc/kafka-configmap/server.properties > /etc/kafka/server.properties.tmp + [ $? -eq 0 ] && mv /etc/kafka/server.properties.tmp /etc/kafka/server.properties + + server.properties: |- + ############################# Log Basics ############################# + + # A comma seperated list of directories under which to store log files + # Overrides log.dir + log.dirs=/var/lib/kafka/data/topics + + # The default number of log partitions per topic. More partitions allow greater + # parallelism for consumption, but this will also result in more files across + # the brokers. 
+ num.partitions=12 + + default.replication.factor=1 + + min.insync.replicas=1 + + auto.create.topics.enable=true + + # The number of threads per data directory to be used for log recovery at startup and flushing at shutdown. + # This value is recommended to be increased for installations with data dirs located in RAID array. + #num.recovery.threads.per.data.dir=1 + + ############################# Server Basics ############################# + + # The id of the broker. This must be set to a unique integer for each broker. + #init#broker.id=#init# + + #init#broker.rack=#init# + + ############################# Socket Server Settings ############################# + + # The address the socket server listens on. It will get the value returned from + # java.net.InetAddress.getCanonicalHostName() if not configured. + # FORMAT: + # listeners = listener_name://host_name:port + # EXAMPLE: + # listeners = PLAINTEXT://your.host.name:9092 + #listeners=PLAINTEXT://:9092 + listeners=OUTSIDE://:9094,PLAINTEXT://:9092 + + # Hostname and port the broker will advertise to producers and consumers. If not set, + # it uses the value for "listeners" if configured. Otherwise, it will use the value + # returned from java.net.InetAddress.getCanonicalHostName(). + #advertised.listeners=PLAINTEXT://your.host.name:9092 + #init#advertised.listeners=OUTSIDE://#init#,PLAINTEXT://:9092 + + # Maps listener names to security protocols, the default is for them to be the same. 
See the config documentation for more details + #listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL + listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL,OUTSIDE:PLAINTEXT + inter.broker.listener.name=PLAINTEXT + + compression.type=gzip + inter.broker.protocol.version=0.10.2.0 + log.message.format.version=0.10.2.0 + + # The number of threads that the server uses for receiving requests from the network and sending responses to the network + #num.network.threads=3 + + # The number of threads that the server uses for processing requests, which may include disk I/O + #num.io.threads=8 + + # The send buffer (SO_SNDBUF) used by the socket server + #socket.send.buffer.bytes=102400 + + # The receive buffer (SO_RCVBUF) used by the socket server + #socket.receive.buffer.bytes=102400 + + # The maximum size of a request that the socket server will accept (protection against OOM) + #socket.request.max.bytes=104857600 + + ############################# Internal Topic Settings ############################# + # The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state" + # For anything other than development testing, a value greater than 1 is recommended for to ensure availability such as 3. + offsets.topic.replication.factor=1 + #transaction.state.log.replication.factor=1 + #transaction.state.log.min.isr=1 + + ############################# Log Flush Policy ############################# + + # Messages are immediately written to the filesystem but by default we only fsync() to sync + # the OS cache lazily. The following configurations control the flush of data to disk. + # There are a few important trade-offs here: + # 1. Durability: Unflushed data may be lost if you are not using replication. + # 2. 
Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush. + # 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks. + # The settings below allow one to configure the flush policy to flush data after a period of time or + # every N messages (or both). This can be done globally and overridden on a per-topic basis. + + # The number of messages to accept before forcing a flush of data to disk + #log.flush.interval.messages=10000 + + # The maximum amount of time a message can sit in a log before we force a flush + #log.flush.interval.ms=1000 + + ############################# Log Retention Policy ############################# + + # The following configurations control the disposal of log segments. The policy can + # be set to delete segments after a period of time, or after a given size has accumulated. + # A segment will be deleted whenever *either* of these criteria are met. Deletion always happens + # from the end of the log. + + # https://cwiki.apache.org/confluence/display/KAFKA/KIP-186%3A+Increase+offsets+retention+default+to+7+days + offsets.retention.minutes=10080 + + # The minimum age of a log file to be eligible for deletion due to age + log.retention.hours=-1 + + # A size-based retention policy for logs. Segments are pruned from the log unless the remaining + # segments drop below log.retention.bytes. Functions independently of log.retention.hours. + #log.retention.bytes=1073741824 + + # The maximum size of a log segment file. When this size is reached a new log segment will be created. + #log.segment.bytes=1073741824 + + # The interval at which log segments are checked to see if they can be deleted according + # to the retention policies + #log.retention.check.interval.ms=300000 + + ############################# Zookeeper ############################# + + # Zookeeper connection string (see zookeeper docs for details). 
+ # This is a comma separated host:port pairs, each corresponding to a zk + # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002". + # You can also append an optional chroot string to the urls to specify the + # root directory for all kafka znodes. + zookeeper.connect=zookeeper:2181 + + # Timeout in ms for connecting to zookeeper + #zookeeper.connection.timeout.ms=6000 + + + ############################# Group Coordinator Settings ############################# + + # The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance. + # The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms. + # The default value for this is 3 seconds. + # We override this to 0 here as it makes for a better out-of-the-box experience for development and testing. + # However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup. 
+ #group.initial.rebalance.delay.ms=0 + + log4j.properties: |- + # Unspecified loggers and loggers with additivity=true output to server.log and stdout + # Note that INFO only applies to unspecified loggers, the log level of the child logger is used otherwise + log4j.rootLogger=INFO, stdout + + log4j.appender.stdout=org.apache.log4j.ConsoleAppender + log4j.appender.stdout.layout=org.apache.log4j.PatternLayout + log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n + + log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender + log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH + log4j.appender.kafkaAppender.File=${kafka.logs.dir}/server.log + log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout + log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n + + log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender + log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH + log4j.appender.stateChangeAppender.File=${kafka.logs.dir}/state-change.log + log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout + log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n + + log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender + log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH + log4j.appender.requestAppender.File=${kafka.logs.dir}/kafka-request.log + log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout + log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n + + log4j.appender.cleanerAppender=org.apache.log4j.DailyRollingFileAppender + log4j.appender.cleanerAppender.DatePattern='.'yyyy-MM-dd-HH + log4j.appender.cleanerAppender.File=${kafka.logs.dir}/log-cleaner.log + log4j.appender.cleanerAppender.layout=org.apache.log4j.PatternLayout + log4j.appender.cleanerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n + + log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender 
+ log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH + log4j.appender.controllerAppender.File=${kafka.logs.dir}/controller.log + log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout + log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n + + log4j.appender.authorizerAppender=org.apache.log4j.DailyRollingFileAppender + log4j.appender.authorizerAppender.DatePattern='.'yyyy-MM-dd-HH + log4j.appender.authorizerAppender.File=${kafka.logs.dir}/kafka-authorizer.log + log4j.appender.authorizerAppender.layout=org.apache.log4j.PatternLayout + log4j.appender.authorizerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n + + # Change the two lines below to adjust ZK client logging + log4j.logger.org.I0Itec.zkclient.ZkClient=INFO + log4j.logger.org.apache.zookeeper=INFO + + # Change the two lines below to adjust the general broker logging level (output to server.log and stdout) + log4j.logger.kafka=INFO + log4j.logger.org.apache.kafka=INFO + + # Change to DEBUG or TRACE to enable request logging + log4j.logger.kafka.request.logger=WARN, requestAppender + log4j.additivity.kafka.request.logger=false + + # Uncomment the lines below and change log4j.logger.kafka.network.RequestChannel$ to TRACE for additional output + # related to the handling of requests + #log4j.logger.kafka.network.Processor=TRACE, requestAppender + #log4j.logger.kafka.server.KafkaApis=TRACE, requestAppender + #log4j.additivity.kafka.server.KafkaApis=false + log4j.logger.kafka.network.RequestChannel$=WARN, requestAppender + log4j.additivity.kafka.network.RequestChannel$=false + + log4j.logger.kafka.controller=TRACE, controllerAppender + log4j.additivity.kafka.controller=false + + log4j.logger.kafka.log.LogCleaner=INFO, cleanerAppender + log4j.additivity.kafka.log.LogCleaner=false + + log4j.logger.state.change.logger=TRACE, stateChangeAppender + log4j.additivity.state.change.logger=false + + # Change to DEBUG to enable audit log for the authorizer + 
log4j.logger.kafka.authorizer.logger=WARN, authorizerAppender + log4j.additivity.kafka.authorizer.logger=false + +--- +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: kafka + namespace: pinot-quickstart +spec: + selector: + matchLabels: + app: kafka + serviceName: kafka + replicas: 1 + updateStrategy: + type: RollingUpdate + podManagementPolicy: Parallel + template: + metadata: + labels: + app: kafka + annotations: + spec: + terminationGracePeriodSeconds: 30 + initContainers: + - name: init-config + image: solsson/kafka-initutils@sha256:2cdb90ea514194d541c7b869ac15d2d530ca64889f56e270161fe4e5c3d076ea + env: + - name: NODE_NAME + valueFrom: + fieldRef: + fieldPath: spec.nodeName + - name: POD_NAME + valueFrom: + fieldRef: + fieldPath: metadata.name + - name: POD_NAMESPACE + valueFrom: + fieldRef: + fieldPath: metadata.namespace + command: ['/bin/bash', '/etc/kafka-configmap/init.sh'] + volumeMounts: + - name: configmap + mountPath: /etc/kafka-configmap + - name: config + mountPath: /etc/kafka + - name: extensions + mountPath: /opt/kafka/libs/extensions + containers: + - name: kafka + image: solsson/kafka:2.1.0@sha256:ac3f06d87d45c7be727863f31e79fbfdcb9c610b51ba9cf03c75a95d602f15e1 + env: + - name: CLASSPATH + value: /opt/kafka/libs/extensions/* + - name: KAFKA_LOG4J_OPTS + value: -Dlog4j.configuration=file:/etc/kafka/log4j.properties + - name: JMX_PORT + value: "5555" + ports: + - name: inside + containerPort: 9092 + - name: outside + containerPort: 9094 + - name: jmx + containerPort: 5555 + command: + - ./bin/kafka-server-start.sh + - /etc/kafka/server.properties + lifecycle: + preStop: + exec: + command: ["sh", "-ce", "kill -s TERM 1; while $(kill -0 1 2>/dev/null); do sleep 1; done"] + resources: + requests: + cpu: 400m + memory: 2048Mi + limits: + # This limit was intentionally set low as a reminder that + # the entire Yolean/kubernetes-kafka is meant to be tweaked + # before you run production workloads + memory: 4096Mi + readinessProbe: + tcpSocket: 
+ port: 9092 + timeoutSeconds: 1 + volumeMounts: + - name: config + mountPath: /etc/kafka + - name: data + mountPath: /var/lib/kafka/data + - name: extensions + mountPath: /opt/kafka/libs/extensions + volumes: + - name: configmap + configMap: + name: broker-config + - name: config + emptyDir: {} + - name: extensions + emptyDir: {} + nodeSelector: + cloud.google.com/gke-nodepool: default-pool + volumeClaimTemplates: + - metadata: + name: data + spec: + accessModes: [ "ReadWriteOnce" ] + storageClassName: kafka + resources: + requests: + storage: 5Gi +--- +apiVersion: v1 +kind: Service +metadata: + name: kafka + namespace: pinot-quickstart +spec: + ports: + # [podname].kafka.pinot-quickstart.svc.cluster.local + - port: 9092 + clusterIP: None + selector: + app: kafka diff --git a/kubernetes/skaffold/pinot-broker.yml b/kubernetes/skaffold/pinot-broker.yml new file mode 100644 index 0000000..dfd92d0 --- /dev/null +++ b/kubernetes/skaffold/pinot-broker.yml @@ -0,0 +1,69 @@ +apiVersion: v1 +kind: List +items: + - apiVersion: v1 + kind: ConfigMap + metadata: + name: pinot-broker-config + namespace: pinot-quickstart + data: + pinot-broker.conf: |- + pinot.broker.client.queryPort=8099 + pinot.broker.routing.table.builder.class=random + pinot.set.instance.id.to.hostname=true + + - apiVersion: apps/v1 + kind: StatefulSet + metadata: + name: pinot-broker + namespace: pinot-quickstart + spec: + selector: + matchLabels: + app: pinot-broker + serviceName: pinot-broker + replicas: 3 + updateStrategy: + type: RollingUpdate + podManagementPolicy: Parallel + template: + metadata: + labels: + app: pinot-broker + spec: + terminationGracePeriodSeconds: 30 + containers: + - image: winedepot/pinot:kafka2 + imagePullPolicy: Always + name: pinot-broker + args: [ + "StartBroker", + "-clusterName", "pinot-quickstart", + "-zkAddress", "zookeeper:2181/pinot", + "-configFileName", "/var/pinot/broker/config/pinot-broker.conf" + ] + ports: + - containerPort: 8099 + protocol: TCP + volumeMounts: + - 
name: config + mountPath: /var/pinot/broker/config + restartPolicy: Always + volumes: + - name: config + configMap: + name: pinot-broker-config + nodeSelector: + cloud.google.com/gke-nodepool: default-pool + - apiVersion: v1 + kind: Service + metadata: + name: pinot-broker + namespace: pinot-quickstart + spec: + ports: + # [podname].pinot-broker.pinot-quickstart.svc.cluster.local + - port: 8099 + clusterIP: None + selector: + app: pinot-broker diff --git a/kubernetes/skaffold/pinot-controller.yml b/kubernetes/skaffold/pinot-controller.yml new file mode 100644 index 0000000..ce9b83b --- /dev/null +++ b/kubernetes/skaffold/pinot-controller.yml @@ -0,0 +1,87 @@ +apiVersion: v1 +kind: List +items: + - apiVersion: v1 + kind: ConfigMap + metadata: + name: pinot-controller-config + namespace: pinot-quickstart + data: + pinot-controller.conf: |- + controller.helix.cluster.name=pinot-quickstart + controller.host=pinot-controller-0.pinot-controller + controller.port=9000 + controller.vip.host=pinot-controller + controller.vip.port=9000 + controller.data.dir=/var/pinot/controller/data + controller.zk.str=zookeeper:2181/pinot + pinot.set.instance.id.to.hostname=true + - apiVersion: apps/v1 + kind: StatefulSet + metadata: + name: pinot-controller + namespace: pinot-quickstart + spec: + selector: + matchLabels: + app: pinot-controller + serviceName: pinot-controller + replicas: 1 + updateStrategy: + type: RollingUpdate + podManagementPolicy: Parallel + template: + metadata: + labels: + app: pinot-controller + spec: + terminationGracePeriodSeconds: 30 + containers: + - image: winedepot/pinot:kafka2 + imagePullPolicy: Always + name: pinot-controller + args: [ + "StartController", + "-configFileName", "/var/pinot/controller/config/pinot-controller.conf" + ] + ports: + - containerPort: 9000 + protocol: TCP + volumeMounts: + - name: config + mountPath: /var/pinot/controller/config + - name: pinot-controller-storage + mountPath: /var/pinot/controller/data + initContainers: + - name: 
create-zk-root-path + image: solsson/kafka:2.1.0@sha256:ac3f06d87d45c7be727863f31e79fbfdcb9c610b51ba9cf03c75a95d602f15e1 + command: ["bin/zookeeper-shell.sh", "zookeeper:2181", "create", "/pinot", "\"\""] + restartPolicy: Always + volumes: + - name: config + configMap: + name: pinot-controller-config + nodeSelector: + cloud.google.com/gke-nodepool: default-pool + volumeClaimTemplates: + - metadata: + name: pinot-controller-storage + spec: + accessModes: + - ReadWriteOnce + storageClassName: "pinot-controller" + resources: + requests: + storage: 10G + - apiVersion: v1 + kind: Service + metadata: + name: pinot-controller + namespace: pinot-quickstart + spec: + ports: + # [podname].pinot-controller.pinot-quickstart.svc.cluster.local + - port: 9000 + clusterIP: None + selector: + app: pinot-controller diff --git a/kubernetes/skaffold/pinot-example-loader.yml b/kubernetes/skaffold/pinot-example-loader.yml new file mode 100644 index 0000000..2b402c8 --- /dev/null +++ b/kubernetes/skaffold/pinot-example-loader.yml @@ -0,0 +1,24 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: pinot-example-loader + namespace: pinot-quickstart +spec: + template: + spec: + containers: + - name: loading-data-to-kafka + image: winedepot/pinot:kafka2 + args: [ "StreamAvroIntoKafka", "-avroFile", "sample_data/airlineStats_data.avro", "-kafkaTopic", "flights-realtime", "-kafkaBrokerList", "kafka:9092", "-zkAddress", "zookeeper:2181" ] + - name: pinot-add-example-schema + image: winedepot/pinot:kafka2 + args: [ "AddSchema", "-schemaFile", "sample_data/airlineStats_schema.json", "-controllerHost", "pinot-controller", "-controllerPort", "9000", "-exec" ] + - name: pinot-add-example-realtime-table + image: winedepot/pinot:kafka2 + args: [ "AddTable", "-filePath", "sample_data/docker/airlineStats_realtime_table_config.json", "-controllerHost", "pinot-controller", "-controllerPort", "9000", "-exec" ] + restartPolicy: OnFailure + nodeSelector: +
cloud.google.com/gke-nodepool: default-pool + backoffLimit: 3 + + diff --git a/kubernetes/skaffold/pinot-server.yml b/kubernetes/skaffold/pinot-server.yml new file mode 100644 index 0000000..895498b --- /dev/null +++ b/kubernetes/skaffold/pinot-server.yml @@ -0,0 +1,82 @@ +apiVersion: v1 +kind: List +items: + - apiVersion: v1 + kind: ConfigMap + metadata: + name: pinot-server-config + namespace: pinot-quickstart + data: + pinot-server.conf: |- + pinot.server.netty.port=8098 + pinot.server.adminapi.port=8097 + pinot.server.instance.dataDir=/var/pinot/server/data/index + pinot.server.instance.segmentTarDir=/var/pinot/server/data/segmentTar + pinot.set.instance.id.to.hostname=true + - apiVersion: apps/v1 + kind: StatefulSet + metadata: + name: pinot-server + namespace: pinot-quickstart + spec: + selector: + matchLabels: + app: pinot-server + serviceName: pinot-server + replicas: 1 + updateStrategy: + type: RollingUpdate + podManagementPolicy: Parallel + template: + metadata: + labels: + app: pinot-server + spec: + terminationGracePeriodSeconds: 30 + containers: + - image: winedepot/pinot:kafka2 + imagePullPolicy: Always + name: pinot-server + args: [ + "StartServer", + "-clusterName", "pinot-quickstart", + "-zkAddress", "zookeeper:2181/pinot", + "-configFileName", "/var/pinot/server/config/pinot-server.conf" + ] + ports: + - containerPort: 8098 + protocol: TCP + volumeMounts: + - name: config + mountPath: /var/pinot/server/config + - name: pinot-server-storage + mountPath: /var/pinot/server/data + restartPolicy: Always + volumes: + - name: config + configMap: + name: pinot-server-config + nodeSelector: + cloud.google.com/gke-nodepool: pinot-server-pool + volumeClaimTemplates: + - metadata: + name: pinot-server-storage + spec: + accessModes: + - ReadWriteOnce + storageClassName: "pinot-server" + resources: + requests: + storage: 10G + - apiVersion: v1 + kind: Service + metadata: + name: pinot-server + namespace: pinot-quickstart + spec: + ports: + # 
[podname].pinot-server.pinot-quickstart.svc.cluster.local + - port: 8098 + clusterIP: None + selector: + app: pinot-server diff --git a/kubernetes/skaffold/query-pinot-data.sh b/kubernetes/skaffold/query-pinot-data.sh new file mode 100755 index 0000000..d7b000c --- /dev/null +++ b/kubernetes/skaffold/query-pinot-data.sh @@ -0,0 +1,9 @@ +#!/usr/bin/env bash +if ! nc -z localhost 9000; then + kubectl port-forward service/pinot-controller 9000:9000 -n pinot-quickstart > /dev/null & +fi +sleep 2 +open http://localhost:9000/query +# Just for blocking +tail -f /dev/null +pkill -f "kubectl port-forward service/pinot-controller 9000:9000 -n pinot-quickstart" diff --git a/kubernetes/skaffold/setup.sh b/kubernetes/skaffold/setup.sh new file mode 100755 index 0000000..9a3fec1 --- /dev/null +++ b/kubernetes/skaffold/setup.sh @@ -0,0 +1,33 @@ +#!/usr/bin/env bash +set -e +if [[ -z "${GCLOUD_EMAIL}" ]] || [[ -z "${GCLOUD_PROJECT}" ]] +then + echo "Please set both \$GCLOUD_EMAIL and \$GCLOUD_PROJECT variables. E.g.
GCLOUD_PROJECT=pinot-demo [email protected] ./setup.sh" + exit 1 +fi + +GCLOUD_ZONE=us-west1-b +GCLOUD_CLUSTER=pinot-quickstart +GCLOUD_MACHINE_TYPE=n1-standard-2 +GCLOUD_NUM_NODES=1 +gcloud container clusters create ${GCLOUD_CLUSTER} \ + --num-nodes=${GCLOUD_NUM_NODES} \ + --machine-type=${GCLOUD_MACHINE_TYPE} \ + --zone=${GCLOUD_ZONE} \ + --project=${GCLOUD_PROJECT} +gcloud container clusters get-credentials ${GCLOUD_CLUSTER} --zone ${GCLOUD_ZONE} --project ${GCLOUD_PROJECT} + + +GCLOUD_MACHINE_TYPE=n1-standard-8 +gcloud container node-pools create pinot-server-pool \ + --cluster=${GCLOUD_CLUSTER} \ + --machine-type=${GCLOUD_MACHINE_TYPE} \ + --num-nodes=${GCLOUD_NUM_NODES} \ + --zone=${GCLOUD_ZONE} \ + --project=${GCLOUD_PROJECT} + + +kubectl create namespace pinot-quickstart + +kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user ${GCLOUD_EMAIL} +kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=pinot-quickstart:default diff --git a/kubernetes/skaffold/skaffold.yaml b/kubernetes/skaffold/skaffold.yaml new file mode 100644 index 0000000..1a6052d --- /dev/null +++ b/kubernetes/skaffold/skaffold.yaml @@ -0,0 +1,17 @@ +apiVersion: skaffold/v1beta7 +kind: Config +build: + artifacts: +deploy: + kubectl: + manifests: + - ./gke-storageclass-zk-pd.yml + - ./gke-storageclass-kafka-pd.yml + - ./gke-storageclass-pinot-controller-pd.yml + - ./gke-storageclass-pinot-server-pd.yml + - ./zookeeper.yml + - ./kafka.yml + - ./pinot-controller.yml + - ./pinot-broker.yml + - ./pinot-server.yml + - ./pinot-example-loader.yml diff --git a/kubernetes/skaffold/zookeeper.yml b/kubernetes/skaffold/zookeeper.yml new file mode 100644 index 0000000..abb010f --- /dev/null +++ b/kubernetes/skaffold/zookeeper.yml @@ -0,0 +1,61 @@ +apiVersion: v1 +kind: Service +metadata: + name: zookeeper + namespace: pinot-quickstart +spec: + ports: + - name: zookeeper + port: 2181 + selector: + app: zookeeper +---
+apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: zookeeper + namespace: pinot-quickstart +spec: + selector: + matchLabels: + app: zookeeper + serviceName: "zookeeper" + template: + metadata: + labels: + app: zookeeper + spec: + terminationGracePeriodSeconds: 10 + containers: + - name: zookeeper + image: solsson/kafka:2.1.0@sha256:ac3f06d87d45c7be727863f31e79fbfdcb9c610b51ba9cf03c75a95d602f15e1 + env: + - name: JMX_PORT + value: "5555" + ports: + - name: zookeeper + containerPort: 2181 + - name: jmx + containerPort: 5555 + volumeMounts: + - name: zookeeper-storage + mountPath: /tmp/zookeeper + command: + - ./bin/zookeeper-server-start.sh + - config/zookeeper.properties + readinessProbe: + tcpSocket: + port: 2181 + timeoutSeconds: 1 + nodeSelector: + cloud.google.com/gke-nodepool: default-pool + volumeClaimTemplates: + - metadata: + name: zookeeper-storage + spec: + accessModes: + - ReadWriteOnce + storageClassName: "zookeeper" + resources: + requests: + storage: 10G --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
