[jira] [Created] (ATLAS-1138) UI: Show entity type with the entity name
Shwetha G S created ATLAS-1138:
--

Summary: UI: Show entity type with the entity name
Key: ATLAS-1138
URL: https://issues.apache.org/jira/browse/ATLAS-1138
Project: Atlas
Issue Type: Bug
Reporter: Shwetha G S

When Atlas has entities from different components, such as a MySQL table, an HDFS path, and a Hive table, the entity names may be the same because they represent the same data. In the lineage graph and on the entity details page, all of these show up with the same name, which is confusing for the user. Displaying the entity type with the name will make it less confusing.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ATLAS-1131) Atlas hooks produce excess Kafka logs if Kafka topic is not created already
[ https://issues.apache.org/jira/browse/ATLAS-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436927#comment-15436927 ]

Manikumar Reddy commented on ATLAS-1131:

The Kafka producer logs these WARN messages when it is unable to fetch topic metadata. Possible reasons include unavailable topic metadata, an unavailable leader, or a topic that does not exist.

The Kafka default settings should be fine for production deployments, and the WARN messages should not be a problem on a healthy cluster. The Kafka producer's metadata fetch logic is complex, and changing it would be difficult.

If we want to reduce the log volume during integration/unit tests, we can decrease max.block.ms to 30 and increase retry.backoff.ms to 1000.

> Atlas hooks produce excess Kafka logs if Kafka topic is not created already
> ---
>
> Key: ATLAS-1131
> URL: https://issues.apache.org/jira/browse/ATLAS-1131
> Project: Atlas
> Issue Type: Bug
> Reporter: Vimal Sharma
> Assignee: Vimal Sharma
> Attachments: ATLAS-1131.patch
>
>
> Hooks for Atlas publish messages to a Kafka topic named ATLAS_HOOK. If this
> topic is not present and Atlas does not have permission to create this topic,
> Kafka produces excessive logs as below (example in /tmp/hive/hive.log).
> 2016-08-22 06:43:47,655 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1177 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:47,756 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1178 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:47,858 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1179 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:47,961 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1180 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,062 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1181 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,165 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1182 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,265 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1183 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,366 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1184 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
> 2016-08-22 06:43:48,467 WARN [kafka-producer-network-thread | producer-1]:
> clients.NetworkClient (NetworkClient.java:handleResponse(600)) - Error while
> fetching metadata with correlation id 1185 :
> {ATLAS_HOOK=UNKNOWN_TOPIC_OR_PARTITION}
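The test-only tuning suggested in the comment above could be sketched roughly as follows. This is only an illustration: the helper names are hypothetical, and the "atlas.kafka." prefix is an assumption about how the overrides would be placed in an Atlas properties file. The two values come from the comment; the producer setting is spelled retry.backoff.ms in Kafka, where its default is 100 ms, which is consistent with the roughly 100 ms spacing of the quoted WARN lines.

```python
# Test-only Kafka producer overrides, taken from the comment in this thread.
TEST_PRODUCER_OVERRIDES = {
    "max.block.ms": 30,        # value suggested in the comment
    "retry.backoff.ms": 1000,  # Kafka's default is 100 ms
}


def render_properties(overrides, prefix="atlas.kafka."):
    """Render overrides as properties-file lines.

    The prefix is an assumption made for this sketch, not a confirmed
    Atlas convention for every deployment.
    """
    return "\n".join(
        f"{prefix}{key}={value}" for key, value in sorted(overrides.items())
    )


def warnings_per_second(retry_backoff_ms):
    """Rough estimate, assuming one failed metadata fetch per backoff interval."""
    return 1000 / retry_backoff_ms


if __name__ == "__main__":
    print(render_properties(TEST_PRODUCER_OVERRIDES))
    print(warnings_per_second(100), "->", warnings_per_second(1000))
```

Under that one-retry-per-interval assumption, raising the backoff from 100 ms to 1000 ms cuts the warning rate by about a factor of ten while the topic is missing.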
[jira] [Commented] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436374#comment-15436374 ]

Suma Shivaprasad commented on ATLAS-597:

This change will require all the existing entities in the repository to have the clusterId in qualifiedName instead of clusterName and will require a migration before upgrade.

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table has a qualifiedName
> . Hence all entities will be recreated if they
> are renamed unless a migration on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
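As a rough illustration of the migration mentioned in this comment, the sketch below rewrites the cluster suffix of a qualifiedName from a cluster name to a cluster id. It assumes the "dbName.tableName@clusterName" format quoted elsewhere in this thread; the function name, the name-to-id mapping, and the id value "c-001" are hypothetical and not part of Atlas.

```python
def migrate_qualified_name(qualified_name, cluster_ids):
    """Replace the '@clusterName' suffix with '@clusterId'.

    cluster_ids maps every known cluster name to its constant id
    (a hypothetical mapping built for this sketch).
    """
    prefix, separator, suffix = qualified_name.rpartition("@")
    if not separator:
        raise ValueError(f"no cluster suffix in {qualified_name!r}")
    if suffix in cluster_ids.values():
        # Already migrated; leaving it untouched keeps the migration re-runnable.
        return qualified_name
    if suffix not in cluster_ids:
        raise KeyError(f"unknown cluster name {suffix!r}")
    return f"{prefix}@{cluster_ids[suffix]}"


if __name__ == "__main__":
    ids = {"primary": "c-001"}
    print(migrate_qualified_name("sales.orders@primary", ids))
```

Failing loudly on an unknown cluster name is a deliberate choice in this sketch: an entity that silently keeps its old name would be recreated later, which is the duplication the issue is trying to avoid.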
[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436374#comment-15436374 ]

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:55 AM:
-

This change will require all the existing entities in the repository to have the clusterId in qualifiedName instead of clusterName and will require a migration before ATLAS upgrade.

was (Author: suma.shivaprasad):
This change will require all the existing entities in the repository to have the clusterId in qualifiedName instead of clusterName and will require a migration before upgrade.

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table has a qualifiedName
> . Hence all entities will be recreated if they
> are renamed unless a migration on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
[jira] [Updated] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suma Shivaprasad updated ATLAS-597:
---
Description:
Cluster name could easily be changed by users which is used in qualified names to dedup entities. For eg: hive_table has a qualifiedName. Hence all entities will be recreated if they are renamed unless a mass update on repository is done with new cluster name.
Hence cluster should be represented by an id which is constant and tied to clusterName which could change

was:
Cluster name could easily be changed by users which is used in qualified names to dedup entities. For eg: hive_table as . Hence all entities will be recreated if they are renamed unless a mass update on repository is done with new cluster name.
Hence cluster should be represented by an id which is constant and tied to clusterName which could change

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table has a qualifiedName
> . Hence all entities will be recreated if they
> are renamed unless a mass update on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
[jira] [Commented] (ATLAS-1136) %spark.r interpreter is not working in Zeppelin 0.6.1
[ https://issues.apache.org/jira/browse/ATLAS-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436367#comment-15436367 ]

Pari Margu commented on ATLAS-1136:
---

I have a Spark 1.6.2 cluster with Hadoop YARN and Oozie. I have installed Zeppelin 0.6.1 (binary package with all interpreters: zeppelin-0.6.1-bin-all.tgz).

When I try to run the following SparkR script with the %spark.r interpreter:

%spark.r
# Creating SparkContext and connecting to Cloudant DB
sc1 <- sparkR.init(sparkEnv = list("cloudant.host"="host_name","cloudant.username"="user_name","cloudant.password"="password", "jsonstore.rdd.schemaSampleSize"="-1"))
# Database to be connected to extract the data
database <- "sensordata"
# Creating Spark SQL Context
sqlContext <- sparkRSQL.init(sc)
# Creating DataFrame for the "sensordata" Cloudant DB
sensorDataDF <- read.df(sqlContext, database, header='true', source = "com.cloudant.spark",inferSchema='true')
# Get basic information about the DataFrame(sensorDataDF)
printSchema(sensorDataDF)

I get the following error (log):

ERROR [2016-08-25 03:28:37,336] ( {Thread-77} JobProgressPoller.java[run]:54) - Can not get or update progress
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:373)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:111)
    at org.apache.zeppelin.notebook.Paragraph.progress(Paragraph.java:237)
    at org.apache.zeppelin.scheduler.JobProgressPoller.run(JobProgressPoller.java:51)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_getProgress(RemoteInterpreterService.java:296)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.getProgress(RemoteInterpreterService.java:281)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getProgress(RemoteInterpreter.java:370)
    ... 3 more

Help would be much appreciated.

> %spark.r interpreter is not working in Zeppelin 0.6.1
> -
>
> Key: ATLAS-1136
> URL: https://issues.apache.org/jira/browse/ATLAS-1136
> Project: Atlas
> Issue Type: Bug
> Reporter: Pari Margu
>
[jira] [Created] (ATLAS-1136) %spark.r interpreter is not working in Zeppelin 0.6.1
Pari Margu created ATLAS-1136:
-

Summary: %spark.r interpreter is not working in Zeppelin 0.6.1
Key: ATLAS-1136
URL: https://issues.apache.org/jira/browse/ATLAS-1136
Project: Atlas
Issue Type: Bug
Reporter: Pari Margu
[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299530#comment-15299530 ]

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:45 AM:
-

Cluster entity

The cluster entity will have the following properties
name
description
list aliases - This will hold older cluster names or previous aliases

Hook changes for Entity updates/creates/deletes

This also means that the hooks will have to pass the qualifiedName in entity creates/updates/deletes as "dbName.tableName@clusterId" instead of the "dbName.tableName@clusterName"

was (Author: suma.shivaprasad):
The cluster entity will have the following properties
name
description
list aliases - This will hold older cluster names or previous aliases

This also means that the hooks will have to pass the qualifiedName in entity creates/updates such as "dbName.tableName@clusterId" instead of the "dbName.tableName@clusterName"

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table as
> . Hence all entities will be recreated if they are renamed unless a mass
> update on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
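The cluster entity and the hook change described in this comment might look roughly like the sketch below. The property names name, description and aliases come from the comment; the class itself, the cluster_id field name, the rename helper and the sample values are hypothetical and only illustrate why a constant id keeps the qualifiedName stable across renames.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    cluster_id: str                 # constant identifier (hypothetical field name)
    name: str                       # current cluster name, may change
    description: str = ""
    aliases: List[str] = field(default_factory=list)  # older cluster names

    def rename(self, new_name: str) -> None:
        """Change the current name and remember the old one as an alias."""
        if new_name == self.name:
            return
        if self.name not in self.aliases:
            self.aliases.append(self.name)
        if new_name in self.aliases:
            self.aliases.remove(new_name)
        self.name = new_name


def qualified_name(db_name: str, table_name: str, cluster: Cluster) -> str:
    """Build 'dbName.tableName@clusterId' as a hook would pass it."""
    return f"{db_name}.{table_name}@{cluster.cluster_id}"


if __name__ == "__main__":
    cluster = Cluster(cluster_id="c-001", name="primary")
    before = qualified_name("sales", "orders", cluster)
    cluster.rename("prod")
    after = qualified_name("sales", "orders", cluster)
    print(before, after, cluster.aliases)
```

Because the qualifiedName depends only on the id, renaming the cluster leaves it unchanged, so entities would not be recreated after a rename.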
[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436357#comment-15436357 ]

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:43 AM:
-

The cluster entities can be updated through the entity API .

was (Author: suma.shivaprasad):
The cluster entities can be updated through the entity API

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table as
> . Hence all entities will be recreated if they are renamed unless a mass
> update on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
[jira] [Updated] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suma Shivaprasad updated ATLAS-597:
---
Description:
Cluster name could easily be changed by users which is used in qualified names to dedup entities. For eg: hive_table as. Hence all entities will be recreated if they are renamed unless a mass update on repository is done with new cluster name.
Hence cluster should be represented by an id which is constant and tied to clusterName which could change

was:
Cluster name could easily be changed by users which is used in qualified names to dedup entities. For eg: hive_table as . Hence all entities will be recreated if they are created again unless a mass update on repository is done with new cluster name.
Hence cluster should be represented by an id which is constant and tied to clusterName which could change

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table as
> . Hence all entities will be recreated if they are renamed unless a mass
> update on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
[jira] [Commented] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436357#comment-15436357 ]

Suma Shivaprasad commented on ATLAS-597:

The cluster entities can be updated through the entity API

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table as
> . Hence all entities will be recreated if they are created again unless a
> mass update on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272560#comment-15272560 ]

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:36 AM:
-

This requires Cluster to be modeled as an entity.

was (Author: suma.shivaprasad):
This requires Cluster to be modeled as an entity that is updated periodically

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table as
> . Hence all entities will be recreated if they are created again unless a
> mass update on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
[jira] [Comment Edited] (ATLAS-597) Handle Cluster renaming
[ https://issues.apache.org/jira/browse/ATLAS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299530#comment-15299530 ]

Suma Shivaprasad edited comment on ATLAS-597 at 8/25/16 6:36 AM:
-

The cluster entity will have the following properties
name
description
list aliases - This will hold older cluster names or previous aliases

This also means that the hooks will have to pass the qualifiedName in entity creates/updates such as "dbName.tableName@clusterId" instead of the "dbName.tableName@clusterName"

was (Author: suma.shivaprasad):
The cluster entity will have the following properties
name
description
list aliases - This will hold other cluster names

The plan is to keep an in-memory copy and be able to do a lookup by any of the cluster names, i.e. aliases, and also by the primary/current name which is stored in the "name" attribute

> Handle Cluster renaming
> ---
>
> Key: ATLAS-597
> URL: https://issues.apache.org/jira/browse/ATLAS-597
> Project: Atlas
> Issue Type: Sub-task
> Affects Versions: 0.7-incubating
> Reporter: Suma Shivaprasad
>
> Cluster name could easily be changed by users which is used in qualified
> names to dedup entities. For eg: hive_table as
> . Hence all entities will be recreated if they are created again unless a
> mass update on repository is done with new cluster name.
> Hence cluster should be represented by an id which is constant and tied to
> clusterName which could change
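The in-memory lookup mentioned in the earlier version of this comment (resolving a cluster by its current name or by any alias) could be sketched as below. The class and method names and the sample ids are hypothetical; this only illustrates the idea and does not reflect how Atlas implements it.

```python
class ClusterIndex:
    """In-memory map from every known cluster name (current or alias) to its id."""

    def __init__(self):
        self._id_by_name = {}

    def register(self, cluster_id, name, aliases=()):
        """Index a cluster under its current name and all of its aliases."""
        for known_name in (name, *aliases):
            existing = self._id_by_name.get(known_name)
            if existing is not None and existing != cluster_id:
                # Two clusters sharing a name would make the lookup ambiguous.
                raise ValueError(f"{known_name!r} already maps to {existing!r}")
            self._id_by_name[known_name] = cluster_id

    def resolve(self, any_name):
        """Return the cluster id for a current name or alias, or None."""
        return self._id_by_name.get(any_name)


if __name__ == "__main__":
    index = ClusterIndex()
    index.register("c-001", "prod", aliases=["primary"])
    print(index.resolve("prod"), index.resolve("primary"), index.resolve("other"))
```

A hook that still sends an older cluster name could use such a lookup to find the constant id before building the qualifiedName.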