[ 
https://issues.apache.org/jira/browse/ATLAS-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15055902#comment-15055902
 ] 

Suma Shivaprasad edited comment on ATLAS-182 at 12/14/15 12:45 PM:
-------------------------------------------------------------------

Initial review comments. Thanks to [~yhemanth] for the review.

1. pom.xml - The following dependencies could be removed from the Storm hook 
pom since the parent pom already adds them:

+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-api</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-log4j12</artifactId>
+        </dependency>


2. pom.xml - The httpConnector port needs to be changed to 31000 and the stop 
port to 31001. Please refer to 
https://github.com/apache/incubator-atlas/blob/master/webapp/pom.xml

3. What is the use of the "endTime" attribute in "Topology"? Should we remove 
"endTime"? I didn't see it being used anywhere.

4. Is the Topology "id" used to indicate a run id or an instance id? I didn't 
understand why we need to capture lineage between two "DataSet"s across 
different runs of the same Topology. We could just capture it at the Topology 
level and leave out the "instance" id.

5. Is there anything in the Topology conf that is of interest or needs to be 
searchable? "conf" ~ (map(string, string), optional) could be huge for a Storm 
topology.

6. The "name" attribute could be removed from the Kafka, HBase, HDFS and JMS 
Data Sets since it's already part of "DataSet".

7. We need to document that the Hive Data Model must be created before the 
Storm Data Model.

8. Can JMS_TOPIC be removed, since it's not used in the Hook?

9. HBASE_TABLE and HDFS_DATA_SET could be renamed to STORM_SINK_HBASE_TABLE and 
STORM_SINK_HDFS_PATH. We are planning to have a generic model for these anyway, 
and the current names (and maybe the model as well) would then conflict with 
it. This will also need a migration story once we have the generic models.

10. We should also add "clusterName" to the Kafka topic, HDFS and HBase path.
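
For item 2, the Jetty settings in question might look roughly like the fragment 
below. This is only a sketch, not the actual patch: it assumes the 
jetty-maven-plugin is what starts the web server for the hook's tests, that the 
stop port is meant to be 31001 (310001 is not a valid port number), and the 
stopKey value is a placeholder.

```xml
<!-- Sketch only: plugin coordinates and stopKey are assumptions, not taken from the patch -->
<plugin>
    <groupId>org.eclipse.jetty</groupId>
    <artifactId>jetty-maven-plugin</artifactId>
    <configuration>
        <!-- HTTP port the test web server listens on -->
        <httpConnector>
            <port>31000</port>
        </httpConnector>
        <!-- Port and key used to shut the server down after the tests -->
        <stopKey>placeholder-stop-key</stopKey>
        <stopPort>31001</stopPort>
    </configuration>
</plugin>
```

Keeping these values the same as in the webapp pom linked in item 2 keeps the 
two modules consistent.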



> Add data model for Storm topology elements
> ------------------------------------------
>
>                 Key: ATLAS-182
>                 URL: https://issues.apache.org/jira/browse/ATLAS-182
>             Project: Atlas
>          Issue Type: Sub-task
>    Affects Versions: 0.6-incubating
>            Reporter: Venkatesh Seetharam
>            Assignee: Venkatesh Seetharam
>             Fix For: 0.6-incubating
>
>         Attachments: ATLAS-182-v1.patch, ATLAS-182.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)