[
https://issues.apache.org/jira/browse/BIGTOP-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292260#comment-14292260
]
Mark Grover commented on BIGTOP-989:
------------------------------------
Thanks Mani! This is looking great! I have a few more comments; my apologies
for missing them on the first pass.
1. do-component-build checks for presence of GIT_REPO and if it's set, creates
a temporary tarball. I don't think that's how the current Bigtop build workflow
works. We don't ever define GIT_REPO, so I'd suggest getting rid of the bottom
else clause completely. Would you agree?
2. I noticed that zookeeper-server is now a dependency of kafka-server and
zookeeper is a dependency of the kafka package. I think this is much better than
the previous patch. And while I totally agree with the dependency of kafka on
zookeeper, I don't feel that kafka-server needs to depend on zookeeper-server.
The reason is that kafka just needs to find a zookeeper ensemble, and for that
you don't need the ensemble running on the same nodes as kafka. So simply
having a dependency of kafka on zookeeper (which enables it to connect to a
zookeeper ensemble, whether on the same node or a different one) should be
enough. If people want to run a zookeeper server on the same node, they will
follow the zookeeper documentation to install, set up and run a zookeeper
server on that node, and end up installing the zookeeper-server package on it
anyway. We don't have to install it automatically.
3. I noticed that when we create the kafka user, we do:
{code}
getent passwd kafka > /dev/null || useradd -c "kafka" -s /bin/bash -g kafka -d /var/lib/kafka kafka 2> /dev/null || :
{code}
The long name here is 'kafka'. It's not a big deal, but historically our
convention has been for the long name to start with an upper-case letter, which
would make it 'Kafka'. See
[spark|https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/spark/SPECS/spark.spec#L130],
[hadoop|https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/hadoop/SPECS/hadoop.spec#L539]
and
[hive|https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/hive/SPECS/hive.spec#L287],
for example.
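To make the suggestion concrete, here is a minimal sketch of the same command with the capitalized long name. Only the {{-c}} argument changes; wrapping it in a function named {{ensure_kafka_user}} is purely for illustration (the name is hypothetical, and in the spec file this would sit directly in the scriptlet):

```shell
# Sketch only: same command as in the patch, with the long name capitalized.
# The function wrapper and its name are illustrative, not part of the patch.
ensure_kafka_user() {
  # Create the kafka user only if it does not already exist;
  # the trailing "|| :" keeps the scriptlet from failing the install.
  getent passwd kafka > /dev/null || \
    useradd -c "Kafka" -s /bin/bash -g kafka -d /var/lib/kafka kafka 2> /dev/null || :
}
```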
4. In the .mk file, I noticed:
{code}
KAFKA_BASE_VERSION=0.8.1.1
KAFKA_PKG_VERSION=0.9.1
{code}
Is that intentional?
And this is more of a question than a suggestion:
In line 125 of install_kafka.sh, we have:
{code}
for file in kafka-console-consumer.sh kafka-console-producer.sh kafka-run-class.sh kafka-topics.sh
{code}
That looks fragile in case new binaries get added to kafka's bin directory. Are
there any {{.sh}} files in that directory that we don't want to create shell
wrappers for? Is that why we are spelling them out?
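If there is no such reason, a glob-based loop might be more robust. Below is a rough sketch of what I have in mind, not a drop-in replacement: the function name {{make_wrappers}}, its arguments, and the wrapper body are all made up for illustration and are not taken from install_kafka.sh. The optional skip list is only there in case some scripts really shouldn't get wrappers.

```shell
# Hypothetical sketch: create a wrapper for every *.sh found in Kafka's bin
# directory instead of spelling the scripts out one by one.
#   $1 = directory containing the upstream scripts (at build time)
#   $2 = directory to write the wrappers into
#   $3 = directory where the real scripts will live at run time
#   $4 = optional space-separated list of script names to skip
make_wrappers() {
  local bin_dir=$1 wrapper_dir=$2 target_dir=$3 skip=${4:-}
  local file name
  mkdir -p "$wrapper_dir"
  for file in "$bin_dir"/*.sh; do
    [ -e "$file" ] || continue        # the glob matched nothing
    name=$(basename "$file")
    case " $skip " in
      *" $name "*) continue ;;        # explicitly excluded script
    esac
    # Unquoted heredoc: $target_dir and $name expand now, \$@ stays literal.
    cat > "$wrapper_dir/$name" <<EOF
#!/bin/bash
exec "$target_dir/$name" "\$@"
EOF
    chmod 755 "$wrapper_dir/$name"
  done
}
```

With something like this, a call such as {{make_wrappers "$BUILD_DIR/bin" "$PREFIX/usr/bin" /usr/lib/kafka/bin}} (variable names and paths again hypothetical) would pick up new scripts automatically, and anything we deliberately leave out would be visible in the skip list.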
> Add Apache Kafka to Apache Bigtop
> ----------------------------------
>
> Key: BIGTOP-989
> URL: https://issues.apache.org/jira/browse/BIGTOP-989
> Project: Bigtop
> Issue Type: New Feature
> Components: debian
> Affects Versions: 0.6.0
> Reporter: Diederik van Liere
> Labels: features
> Fix For: backlog
>
> Attachments: BIGTOP-989-1.patch, BIGTOP-989-2.patch,
> BIGTOP-989-3.patch, BIGTOP-989-4.patch, BIGTOP-989-5.patch, BIGTOP-989.patch
>
>