[ 
https://issues.apache.org/jira/browse/BIGTOP-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292260#comment-14292260
 ] 

Mark Grover commented on BIGTOP-989:
------------------------------------

Thanks Mani! This is looking great! I have a few more comments; my apologies, I 
missed them in the first pass.

1. do-component-build checks for the presence of GIT_REPO and, if it's set, 
creates a temporary tarball. I don't think that's how the current Bigtop build 
workflow works. We never define GIT_REPO, so I'd suggest getting rid of the 
bottom else clause completely. Would you agree?
2. I noticed that zookeeper-server is now a dependency of kafka-server and 
zookeeper is a dependency of the kafka package. I think this is much better 
than the previous patch. And, while I totally agree with the dependency of 
kafka on zookeeper, I don't feel that kafka-server needs to depend on 
zookeeper-server. The reason is that kafka just needs to find a zookeeper 
ensemble; for that, you don't need to have the ensemble running on the same 
nodes as kafka. So, simply having a dependency of kafka on zookeeper (which 
enables it to connect to a zookeeper ensemble, whether on the same node or a 
different node) should be enough. If people want to run a zookeeper server on 
the same node, they will follow the zookeeper documentation to install, set up 
and run a zookeeper server on that node, and end up installing the 
zookeeper-server package on it anyway. We don't have to install it 
automatically.
3. I noticed that when we create the kafka user, we do:
{code}
getent passwd kafka > /dev/null || \
  useradd -c "kafka" -s /bin/bash -g kafka -d /var/lib/kafka kafka 2> /dev/null || :
{code}
The long name here is 'kafka'. It's not a big deal, but historically our 
convention has been to start the long name with an upper case letter, which 
would make it 'Kafka'. See 
[spark|https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/spark/SPECS/spark.spec#L130], 
[hadoop|https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/hadoop/SPECS/hadoop.spec#L539] 
and 
[hive|https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/hive/SPECS/hive.spec#L287], 
for example.
4. In the .mk file, I noticed:
{code}
KAFKA_BASE_VERSION=0.8.1.1
KAFKA_PKG_VERSION=0.9.1
{code}
Is that intentional?
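
On point 3, here is a minimal sketch of the same user-creation step with the 
long name capitalized. The function name {{create_kafka_user}} and its optional 
command prefix are hypothetical, added only so the command can be dry-run; the 
useradd flags are the ones already in the patch.

```shell
# Sketch only: create_kafka_user and its prefix argument are hypothetical.
# The useradd flags are the ones from the patch, with -c changed to "Kafka".
create_kafka_user() {
    # Any arguments are placed in front of useradd, e.g. "echo" for a dry run.
    getent passwd kafka > /dev/null || \
        "$@" useradd -c "Kafka" -s /bin/bash -g kafka -d /var/lib/kafka kafka 2> /dev/null || :
}
```

Calling {{create_kafka_user echo}} prints the useradd command instead of 
running it, provided the kafka user does not exist yet.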

And this is more of a question than a suggestion:
On line 125 of install_kafka.sh, we have:
{code}
for file in kafka-console-consumer.sh kafka-console-producer.sh \
  kafka-run-class.sh kafka-topics.sh
{code}
That looks fragile in case new binaries get added to kafka's bin directory. Are 
there any {{.sh}} files in that directory that we don't want to create shell 
wrappers for? Is that why we are spelling them out?
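
If there is no such exclusion, one possible alternative is to glob the bin 
directory instead of naming each script. The sketch below is an illustration 
only: the function name {{generate_wrappers}}, its two arguments and the 
wrapper body are assumptions, not what install_kafka.sh currently does.

```shell
# Sketch only: generate_wrappers, its arguments and the wrapper contents are
# assumptions for illustration, not taken from install_kafka.sh.
generate_wrappers() {
    lib_dir=$1   # directory containing kafka's bin/ subdirectory
    bin_dir=$2   # directory where the wrappers are written
    mkdir -p "$bin_dir"
    for script in "$lib_dir"/bin/*.sh; do
        # If nothing matches, the glob stays literal; skip that case.
        [ -e "$script" ] || continue
        name=$(basename "$script")
        wrapper="$bin_dir/$name"
        # Write a wrapper that forwards all arguments to the real script.
        cat > "$wrapper" <<EOF
#!/bin/bash
exec "$lib_dir/bin/$name" "\$@"
EOF
        chmod 755 "$wrapper"
    done
}
```

With this shape, a new {{.sh}} file in kafka's bin directory gets a wrapper 
without any change to the install script. If some scripts must be excluded, a 
short skip list inside the loop would still be smaller than the full list.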

> Add Apache Kafka  to Apache Bigtop
> ----------------------------------
>
>                 Key: BIGTOP-989
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-989
>             Project: Bigtop
>          Issue Type: New Feature
>          Components: debian
>    Affects Versions: 0.6.0
>            Reporter: Diederik van Liere
>              Labels: features
>             Fix For: backlog
>
>         Attachments: BIGTOP-989-1.patch, BIGTOP-989-2.patch, 
> BIGTOP-989-3.patch, BIGTOP-989-4.patch, BIGTOP-989-5.patch, BIGTOP-989.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
