[jira] [Created] (ATLAS-1138) UI: Show entity type with the entity name

2016-08-25 Thread Shwetha G S (JIRA)
Shwetha G S created ATLAS-1138:
--

 Summary: UI: Show entity type with the entity name
 Key: ATLAS-1138
 URL: https://issues.apache.org/jira/browse/ATLAS-1138
 Project: Atlas
  Issue Type: Bug
Reporter: Shwetha G S


When Atlas has entities from different components, such as a MySQL table, an 
HDFS path, and a Hive table, the entity names may be the same because they 
represent the same data. In the lineage graph and the entity details page, all 
of these show up with the same name, which is confusing for the user. 
Displaying the entity type along with the name will make it less confusing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ATLAS-1131) Atlas hooks produce excess Kafka logs if Kafka topic is not created already

2016-08-25 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436927#comment-15436927
 ] 

Manikumar Reddy commented on ATLAS-1131:


The Kafka producer logs these WARN messages when it is not able to fetch topic 
metadata. 
Possible reasons include the topic metadata or the partition leader being 
unavailable, or the topic not existing.

Kafka's default settings should be fine for production deployments. Also, the 
WARN messages should not be a problem on a healthy cluster. The Kafka producer's 
metadata fetch logic is complex and difficult to change.

During integration/unit tests, if we want to reduce the log volume, we can 
decrease max.block.ms to 30 and increase retry.backoff.ms to 1000. 
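
A rough illustration of why these two settings cut the log volume is sketched below. The sample log in the issue shows warnings about 100 ms apart, which is consistent with one warning per retry backoff interval. The property names are Kafka producer settings; the default values used here (60000 ms and 100 ms), the helper name, and the one-warning-per-interval model are assumptions made for illustration, not a description of Kafka's actual retry logic.

```python
# Hypothetical helper: a crude model, not Kafka's real metadata logic.
# Assumption: roughly one metadata-fetch WARN is logged per
# retry.backoff.ms interval while a send blocks for up to max.block.ms.

# Assumed producer defaults (60 s block time, 100 ms backoff).
DEFAULT_PROPS = {"max.block.ms": 60000, "retry.backoff.ms": 100}

# Test-only overrides, with the values quoted in the comment above.
TEST_PROPS = {"max.block.ms": 30, "retry.backoff.ms": 1000}


def estimated_warnings_per_send(props):
    """Estimate WARN lines logged while one send waits for metadata."""
    attempts = props["max.block.ms"] // props["retry.backoff.ms"]
    # Model assumption: at least one fetch is attempted before giving up.
    return max(1, attempts)


if __name__ == "__main__":
    print("defaults:", estimated_warnings_per_send(DEFAULT_PROPS))
    print("test overrides:", estimated_warnings_per_send(TEST_PROPS))
```

Under this model the overrides bound each blocked send to about one warning, at the cost of sends failing much sooner when the topic is missing, which is why they are suggested for tests only.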


> Atlas hooks produce excess Kafka logs if Kafka topic is not created already
> ---
>
> Key: ATLAS-1131
> URL: https://issues.apache.org/jira/browse/ATLAS-1131
> Project: Atlas
>  Issue Type: Bug
>Reporter: Vimal Sharma
>Assignee: Vimal Sharma
> Attachments: ATLAS-1131.patch
>
>
> Hooks for Atlas publish messages to a Kafka topic named ATLAS_HOOK. If this 
> topic is not present and Atlas does not have permission to create this topic, 
> Kafka produces excessive logs as below (example in /tmp/hive/hive.log).
> 2016-08-22 06:43:47,655 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1177 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:47,756 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1178 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:47,858 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1179 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:47,961 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1180 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,062 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1181 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,165 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1182 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,265 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1183 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,366 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1184 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,467 WARN  [kafka-producer-network-thread | producer-1]: 
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while 
> fetching metadata with correlation id 1185 : 
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}





[jira] [Commented] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436374#comment-15436374
 ] 

Suma Shivaprasad commented on ATLAS-597:


This change will require all the existing entities in the repository to have 
the clusterId in qualifiedName instead of clusterName and will require a 
migration before upgrade.
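
The kind of rewrite such a migration would perform on each stored qualifiedName could look like the sketch below. This is only an illustration: the function name, the example ids, and the name-to-id mapping are hypothetical, and a real migration would run against the Atlas repository rather than on plain strings.

```python
# Hypothetical migration helper; names and ids are illustrative only.

def migrate_qualified_name(qualified_name, cluster_ids):
    """Replace a trailing @clusterName with @clusterId.

    cluster_ids maps each known cluster name to its constant id.
    Values without an '@' suffix, or with an unknown cluster, are
    returned unchanged, so re-running the migration is harmless as
    long as no cluster id is also a cluster name.
    """
    prefix, sep, cluster_name = qualified_name.rpartition("@")
    if not sep or cluster_name not in cluster_ids:
        return qualified_name
    return prefix + "@" + cluster_ids[cluster_name]


if __name__ == "__main__":
    ids = {"primary": "c-001"}  # assumed mapping: clusterName -> clusterId
    for name in ["sales.orders@primary", "sales.orders@c-001"]:
        print(name, "->", migrate_qualified_name(name, ids))
```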

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table has a qualifiedName 
>  . Hence all entities will be recreated if they 
> are renamed unless a migration on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change





[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436374#comment-15436374
 ] 

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:55 AM:
-

This change will require all the existing entities in the repository to have 
the clusterId in qualifiedName instead of clusterName and will require a 
migration before ATLAS upgrade.


was (Author: suma.shivaprasad):
This change will require all the existing entities in the repository to have 
the clusterId in qualifiedName instead of clusterName and will require a 
migration before upgrade.

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table has a qualifiedName 
>  . Hence all entities will be recreated if they 
> are renamed unless a migration on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change





[jira] [Updated] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated ATLAS-597:
---
Description: 
The cluster name, which is used in qualified names to dedup entities, can 
easily be changed by users. For example, hive_table has a qualifiedName 
 . Hence all entities will be recreated if the 
cluster is renamed, unless a mass update on the repository is done with the new 
cluster name.

Hence the cluster should be represented by an id which is constant and tied to 
the clusterName, which could change.

  was:
Cluster name could easily be changed by users which is used in qualified names 
to dedup entities. For eg: hive_table as  . Hence 
all entities will be recreated if they are renamed unless a mass update on 
repository is done with new cluster name.

Hence cluster  should be represented by an id which is constant and tied to 
clusterName which could change


> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table has a qualifiedName 
>  . Hence all entities will be recreated if they 
> are renamed unless a mass update on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change





[jira] [Commented] (ATLAS-1136) %spark.r interpreter is not working in Zeppelin 0.6.1

2016-08-25 Thread Pari Margu (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436367#comment-15436367
 ] 

Pari Margu commented on ATLAS-1136:
---

I have a Spark 1.6.2 cluster with Hadoop YARN and Oozie. I have installed 
Zeppelin 0.6.1 (binary package with all interpreters: 
zeppelin-0.6.1-bin-all.tgz). I am trying to run the following SparkR script 
with the %spark.r interpreter:
%spark.r
# Creating SparkContext and connecting to Cloudant DB
sc1 <- sparkR.init(sparkEnv = list("cloudant.host"="host_name", "cloudant.username"="user_name", "cloudant.password"="password", "jsonstore.rdd.schemaSampleSize"="-1"))
# Database to be connected to extract the data
database <- "sensordata"
# Creating Spark SQL Context
sqlContext <- sparkRSQL.init(sc)
# Creating DataFrame for the "sensordata" Cloudant DB
sensorDataDF <- read.df(sqlContext, database, header='true', source = "com.cloudant.spark", inferSchema='true')
# Get basic information about the DataFrame (sensorDataDF)
printSchema(sensorDataDF)
I am getting the following error(log):
ERROR [2016-08-25 03:28:37,336] ({Thread-77} JobProgressPoller.java[run]:54) - Can not get or update progress
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:373)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:111)
    at org.apache.zeppelin.notebook.Paragraph.progress(Paragraph.java:237)
    at org.apache.zeppelin.scheduler.JobProgressPoller.run(JobProgressPoller.java:51)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_getProgress(RemoteInterpreterService.java:296)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.getProgress(RemoteInterpreterService.java:281)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:370)
    ... 3 more
Help would be much appreciated.

> %spark.r interpreter is not working in Zeppelin 0.6.1
> -
>
> Key: ATLAS-1136
> URL: https://issues.apache.org/jira/browse/ATLAS-1136
> Project: Atlas
>  Issue Type: Bug
>Reporter: Pari Margu
>






[jira] [Created] (ATLAS-1136) %spark.r interpreter is not working in Zeppelin 0.6.1

2016-08-25 Thread Pari Margu (JIRA)
Pari Margu created ATLAS-1136:
-

 Summary: %spark.r interpreter is not working in Zeppelin 0.6.1
 Key: ATLAS-1136
 URL: https://issues.apache.org/jira/browse/ATLAS-1136
 Project: Atlas
  Issue Type: Bug
Reporter: Pari Margu








[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299530#comment-15299530
 ] 

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:45 AM:
-

Cluster entity

The cluster entity will have the following properties

name
description
list aliases - This will hold older cluster names or previous aliases
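
One way to picture the three properties above, and how a rename could move the old name into the aliases, is sketched here. The class and method names are hypothetical and this is not the Atlas type definition; it only illustrates the intended relationship between the current name and the aliases.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterEntity:
    """Illustrative model only; not the Atlas type definition."""
    name: str
    description: str = ""
    aliases: List[str] = field(default_factory=list)

    def rename(self, new_name: str) -> None:
        """Make new_name current and keep the old name as an alias."""
        if new_name == self.name:
            return
        if self.name not in self.aliases:
            self.aliases.append(self.name)
        # If the cluster is renamed back to an older name, that name
        # becomes current again and is dropped from the aliases.
        if new_name in self.aliases:
            self.aliases.remove(new_name)
        self.name = new_name


if __name__ == "__main__":
    cluster = ClusterEntity(name="primary", description="example cluster")
    cluster.rename("prod")
    print(cluster.name, cluster.aliases)
```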
 

Hook changes for entity creates/updates/deletes

This also means that the hooks will have to pass the qualifiedName in entity 
creates/updates/deletes as "dbName.tableName@clusterId" instead of 
"dbName.tableName@clusterName".

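A minimal sketch of how a hook might assemble the new form of the qualifiedName is given below. The function name and the example id are hypothetical; the only part taken from the comment is the "dbName.tableName@clusterId" layout.

```python
# Hypothetical helper showing the "dbName.tableName@clusterId" layout.

def table_qualified_name(db_name, table_name, cluster_id):
    """Build a table qualifiedName keyed by the constant cluster id."""
    for part in (db_name, table_name, cluster_id):
        if not part:
            raise ValueError("all name parts must be non-empty")
    return "{}.{}@{}".format(db_name, table_name, cluster_id)


if __name__ == "__main__":
    # "c-001" is an assumed id format, used only for this example.
    print(table_qualified_name("sales", "orders", "c-001"))
```

Because the id never changes, the value stays the same across cluster renames, so the hooks would not create duplicate entities.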

was (Author: suma.shivaprasad):
The cluster entity will have the following properties

name
description
list aliases - This will hold older cluster names or previous aliases
 
This also means that the hooks will have to pass the qualifiedName in entity 
creates/updates such as  "dbName.tableName@clusterId" instead of the 
"dbName.tableName@clusterName"

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table as  
> . Hence all entities will be recreated if they are renamed unless a mass 
> update on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change





[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436357#comment-15436357
 ] 

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:43 AM:
-

The cluster entities can be updated through the entity API . 


was (Author: suma.shivaprasad):
The cluster entities can be updated through the entity API 

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table as  
> . Hence all entities will be recreated if they are renamed unless a mass 
> update on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change





[jira] [Updated] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated ATLAS-597:
---
Description: 
Cluster name could easily be changed by users which is used in qualified names 
to dedup entities. For eg: hive_table as  . Hence 
all entities will be recreated if they are renamed unless a mass update on 
repository is done with new cluster name.

Hence cluster  should be represented by an id which is constant and tied to 
clusterName which could change

  was:
Cluster name could easily be changed by users which is used in qualified names 
to dedup entities. For eg: hive_table as  . Hence 
all entities will be recreated if they are created again unless a mass update 
on repository is done with new cluster name.

Hence cluster  should be represented by an id which is constant and tied to 
clusterName which could change


> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table as  
> . Hence all entities will be recreated if they are renamed unless a mass 
> update on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change





[jira] [Commented] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436357#comment-15436357
 ] 

Suma Shivaprasad commented on ATLAS-597:


The cluster entities can be updated through the entity API 

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table as  
> . Hence all entities will be recreated if they are created again unless a 
> mass update on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change





[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272560#comment-15272560
 ] 

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:36 AM:
-

This requires Cluster to be modeled as an entity.


was (Author: suma.shivaprasad):
This requires Cluster to be modeled as an entity that is updated periodically

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table as  
> . Hence all entities will be recreated if they are created again unless a 
> mass update on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change





[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming

2016-08-25 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299530#comment-15299530
 ] 

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:36 AM:
-

The cluster entity will have the following properties

name
description
list aliases - This will hold older cluster names or previous aliases
 
This also means that the hooks will have to pass the qualifiedName in entity 
creates/updates such as  "dbName.tableName@clusterId" instead of the 
"dbName.tableName@clusterName"


was (Author: suma.shivaprasad):
The cluster entity will have the following properties

name
description
list aliases - This will hold other cluster names
 
The plan is to keep an in-memory copy and be able to do a lookup by any of the 
cluster names, i.e. the aliases, and also by the primary/current name, which is 
stored in the "name" attribute.
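
The in-memory lookup described in this earlier version of the comment could be sketched as a single dictionary keyed by every known name. The function name and the dict shape of a cluster record are assumptions for illustration, mirroring the listed properties.

```python
# Hypothetical in-memory index; the record shape is assumed.

def build_cluster_index(clusters):
    """Map every current name and alias to its cluster record.

    clusters is an iterable of dicts with a 'name' key and an
    optional 'aliases' list.
    """
    index = {}
    for cluster in clusters:
        for key in [cluster["name"]] + list(cluster.get("aliases", [])):
            # A name must resolve to exactly one cluster.
            if key in index and index[key] is not cluster:
                raise ValueError("name used by two clusters: " + key)
            index[key] = cluster
    return index


if __name__ == "__main__":
    prod = {"name": "prod", "aliases": ["primary"]}
    index = build_cluster_index([prod])
    print(index["primary"]["name"])
```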

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
>  Issue Type: Sub-task
>Affects Versions: 0.7-incubating
>Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified 
> names to dedup entities. For eg: hive_table as  
> . Hence all entities will be recreated if they are created again unless a 
> mass update on repository is done with new cluster name.
> Hence cluster  should be represented by an id which is constant and tied to 
> clusterName which could change


