[jira] [Commented] (ATLAS-515) Ability to initialize Kafka topics with more than 1 replica

2016-06-09 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322330#comment-15322330
 ] 

Hemanth Yamijala commented on ATLAS-515:


The patch still does not handle a kerberized Kafka environment. However, I 
believe it paves the way for doing so more easily. For instance, even with 
this patch, one can run the scripts manually after kinit'ing as the Kafka 
principal, and that should work.

> Ability to initialize Kafka topics with more than 1 replica
> ---
>
> Key: ATLAS-515
> URL: https://issues.apache.org/jira/browse/ATLAS-515
> Project: Atlas
>  Issue Type: Sub-task
>Reporter: Hemanth Yamijala
>Assignee: Hemanth Yamijala
> Fix For: 0.7-incubating
>
> Attachments: ATLAS-515-1.patch, ATLAS-515-poc.patch, ATLAS-515.patch
>
>
> Atlas uses two Kafka topics, ATLAS_HOOK and ATLAS_ENTITIES, for communicating 
> with its upstream and downstream dependencies. If not explicitly created, 
> these topics are set up with a single replica. From an HA perspective, this 
> becomes a problem. This JIRA requests a mechanism for creating topics with 
> more than 1 replica. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ATLAS-515) Ability to initialize Kafka topics with more than 1 replica

2016-06-06 Thread ATLAS QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316660#comment-15316660
 ] 

ATLAS QA commented on ATLAS-515:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12808398/ATLAS-515.patch
  against master revision d661964.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

+1 checkstyle.  The patch generated 0 code style errors.

{color:red}-1 findbugs{color}.  The patch appears to introduce 378 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests:
  org.apache.atlas.repository.typestore.GraphBackedTypeStoreTest

Test results: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningsrepository.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningscommon.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningsnotification.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningswebapp.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningsclient.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningstypesystem.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningsfalcon-bridge.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningssqoop-bridge.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningshive-bridge.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningsstorm-bridge.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningshdfs-model.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningsauthorization.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningscatalog.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ATLAS-Build/289//artifact/patchprocess/newPatchFindbugsWarningstitan.html
Console output: https://builds.apache.org/job/PreCommit-ATLAS-Build/289//console

This message is automatically generated.



[jira] [Commented] (ATLAS-515) Ability to initialize Kafka topics with more than 1 replica

2016-06-03 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313956#comment-15313956
 ] 

Hemanth Yamijala commented on ATLAS-515:


As part of ATLAS-801, I am investigating options to fix this JIRA. (The intent 
of ATLAS-801 is to minimize the possibility of data loss due to Kafka 
unavailability, hence the relation to this JIRA.)

Broadly, there are two ways to do this:
* As part of Atlas server startup
* As part of Hook startup

If topics are not created up-front, the first message *produced* creates the 
topic, provided {{auto.create.topics.enable}} is set to true. (Default is 
true.) The number of replicas is then determined by 
{{default.replication.factor}}. (Default is 1.) 

For ATLAS_HOOK, it is the hooks that produce messages. Also, I think it may be 
safer to assume that hooks like Hive will be activated first rather than Atlas, 
because of the relative importance of host components over Atlas. Hence, IMO, 
it is important to try and create topics as part of Hook startup. 

One thing that helps here is that topic creation is atomic. Hence, even if 
multiple hooks try to create topics at the same time, one of them will cleanly 
succeed and the others will fail. (I confirmed this offline with one of the 
Kafka committers.)
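
To make the atomicity argument concrete, here is a minimal sketch that does not talk to Kafka at all. {{InMemoryTopicRegistry}} and {{AtomicTopicCreationDemo}} are hypothetical names introduced only for this illustration; the registry stands in for whatever store makes topic creation atomic, and it uses {{ConcurrentHashMap.putIfAbsent}} to get the same "exactly one creator wins" behaviour.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for a topic registry. This is NOT a Kafka API.
class InMemoryTopicRegistry {
    private final ConcurrentHashMap<String, Integer> replicasByTopic = new ConcurrentHashMap<>();

    // Atomic create: returns true only for the single caller that registers the topic.
    boolean createTopic(String name, int replicas) {
        return replicasByTopic.putIfAbsent(name, replicas) == null;
    }
}

class AtomicTopicCreationDemo {
    // Starts `hooks` threads that all try to create ATLAS_HOOK at the same time
    // and returns how many of them succeeded.
    static int raceToCreate(int hooks) {
        InMemoryTopicRegistry registry = new InMemoryTopicRegistry();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(hooks);
        for (int i = 0; i < hooks; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // hold every thread until all are ready
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (registry.createTopic("ATLAS_HOOK", 3)) {
                    winners.incrementAndGet();
                }
            });
        }
        start.countDown(); // release all threads at once
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("successful creators: " + raceToCreate(8));
    }
}
```

However many simulated hooks race, only one call to {{createTopic}} reports success, which is the property the startup logic relies on.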

So, the logic can roughly be:
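{code}
// Hook startup (pseudocode)
if (!topicExists("ATLAS_HOOK")) {
    // Replica count comes from Atlas configuration; the property name is a placeholder.
    int replicas = ApplicationProperties.get().getInt("num-replicas");
    try {
        createTopic("ATLAS_HOOK", /* numPartitions */ 1, replicas);
    } catch (Exception e) {
        // Another hook may have created the topic first; log and carry on.
        log(e);
    }
}
{code}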

The same thing must be done for ATLAS_ENTITIES in the Atlas server, since the 
server is the producer for this topic.
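
A runnable version of this startup logic might look like the sketch below. It is only an illustration under stated assumptions: {{TopicAdmin}}, {{FakeTopicAdmin}} and {{TopicInitializer}} are hypothetical names, and the in-memory fake replaces the real Kafka admin calls, which are not decided in this comment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction over whatever Kafka admin calls Atlas ends up using.
interface TopicAdmin {
    boolean topicExists(String topic);
    void createTopic(String topic, int partitions, int replicas);
}

// In-memory implementation used only to exercise the logic.
class FakeTopicAdmin implements TopicAdmin {
    final Map<String, Integer> replicasByTopic = new HashMap<>();

    public boolean topicExists(String topic) {
        return replicasByTopic.containsKey(topic);
    }

    public void createTopic(String topic, int partitions, int replicas) {
        if (replicasByTopic.containsKey(topic)) {
            throw new IllegalStateException("Topic already exists: " + topic);
        }
        replicasByTopic.put(topic, replicas);
    }
}

class TopicInitializer {
    // Creates the topic if it is missing; returns true if this call created it.
    static boolean ensureTopic(TopicAdmin admin, String topic, int replicas) {
        if (admin.topicExists(topic)) {
            return false; // an existing topic is left untouched
        }
        try {
            admin.createTopic(topic, 1, replicas);
            return true;
        } catch (RuntimeException e) {
            // Lost a race with another creator, or creation failed: log and continue.
            System.err.println("Could not create " + topic + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        TopicAdmin admin = new FakeTopicAdmin();
        System.out.println(ensureTopic(admin, "ATLAS_HOOK", 2));     // hook side
        System.out.println(ensureTopic(admin, "ATLAS_ENTITIES", 2)); // server side
        System.out.println(ensureTopic(admin, "ATLAS_HOOK", 3));     // already present
    }
}
```

The third call returns false and leaves the replica count at 2, mirroring the point that a topic is never modified once it exists.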

One issue to be aware of is that the number of replicas configured in Atlas 
must be less than or equal to the number of brokers available. If not, topic 
creation throws an exception and the topic would again end up being 
auto-created with the defaults. Likewise, changes to the number of replicas 
will not be managed once the topic is created. The replication factor would 
then need to be changed manually using instructions like 
[these|http://kafka.apache.org/documentation.html#basic_ops_increase_replication_factor],
 which in general appears to be an involved process; hence I am reluctant to 
support this from within Atlas.
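
One way to surface the broker-count constraint early is to check the configured value before attempting creation. The sketch below assumes the number of live brokers is already known to the caller; {{ReplicaCheck}} and {{validateReplicas}} are hypothetical names, not existing Atlas or Kafka code.

```java
class ReplicaCheck {
    // Returns the requested replica count if the cluster can hold it,
    // otherwise throws with a message that names both numbers.
    static int validateReplicas(int requested, int liveBrokers) {
        if (requested < 1) {
            throw new IllegalArgumentException(
                "Replica count must be at least 1, got " + requested);
        }
        if (requested > liveBrokers) {
            throw new IllegalArgumentException(
                "Replica count " + requested + " exceeds available brokers " + liveBrokers);
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(validateReplicas(2, 3)); // fits: 2 replicas on 3 brokers
        try {
            validateReplicas(3, 1); // cannot fit: 3 replicas on 1 broker
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast here gives an operator a clear message at startup, rather than a topic that silently falls back to a single replica.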

The other option, creating all topics via the Atlas server, would work as 
well, provided hooks are not launched first. This has the advantage of running 
in a more controlled environment, namely the Atlas server. The issue is the 
restriction we are imposing: if the topic has already been created, we can't 
modify it.

Thoughts?
