[jira] [Commented] (ATLAS-872) Add Multitenancy support to Atlas
[ https://issues.apache.org/jira/browse/ATLAS-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319173#comment-15319173 ]

CASSIO DOS SANTOS commented on ATLAS-872:
-----------------------------------------

We have implemented multi-tenancy on top of Atlas but, if nothing else, it is not efficient, and we need to move to a different design where we store a separate graph per tenant. I've discussed a "non-disruptive" design for this with Neeru, based on a ThreadLocal variable that could be set by intercepting calls to the Atlas API (from a configured servlet filter, maybe) to capture the tenant ID passed by the calling application in the HTTP request header; the code would then check that variable and apply it accordingly when submitting requests to the underlying graph storage layer (via the new AAG layer, or via direct access to Titan, HBase, etc.). This may also involve changes to the type cache provider, depending on how it loads data from storage. If anyone thinks this approach may not work or could be problematic to implement for some reason, or has other ideas or needs more details, please share your thoughts here.

> Add Multitenancy support to Atlas
> ---------------------------------
>
>              Key: ATLAS-872
>              URL: https://issues.apache.org/jira/browse/ATLAS-872
>          Project: Atlas
>       Issue Type: New Feature
> Affects Versions: 0.7-incubating
>         Reporter: Neeru Gupta
>         Assignee: Neeru Gupta
>          Fix For: trunk
>
> Atlas currently does not support multi-tenancy. As part of this feature, we
> will add support to honor requests coming from multiple tenants. Individual
> tenant data should remain isolated from one another.
> All the unique constraints should be applied per tenant and not globally.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
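The ThreadLocal approach floated in the comment above could be sketched as follows. This is a minimal illustration only: `TenantContext` and the `X-Tenant-Id` header name are assumptions made for the example, not actual Atlas APIs.

```java
// Minimal sketch of the ThreadLocal-based tenant context described above.
// TenantContext and the X-Tenant-Id header are hypothetical names.
class TenantContext {
    private static final ThreadLocal<String> CURRENT_TENANT = new ThreadLocal<>();

    static void set(String tenantId) { CURRENT_TENANT.set(tenantId); }

    static String get() { return CURRENT_TENANT.get(); }

    // Must be called when the request completes (e.g. in the filter's finally
    // block) so pooled request threads do not leak a stale tenant ID.
    static void clear() { CURRENT_TENANT.remove(); }
}

// A servlet filter would populate it roughly like this (shown as comments to
// keep this sketch free of a servlet-api dependency):
//   String tenantId = httpRequest.getHeader("X-Tenant-Id");
//   TenantContext.set(tenantId);
//   try { chain.doFilter(request, response); }
//   finally { TenantContext.clear(); }
```

The graph storage layer would then read `TenantContext.get()` when routing a request to the per-tenant graph.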
[jira] [Commented] (ATLAS-819) All user defined types should have a set of common attributes
[ https://issues.apache.org/jira/browse/ATLAS-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299053#comment-15299053 ]

CASSIO DOS SANTOS commented on ATLAS-819:
-----------------------------------------

Can you confirm that this is not going to be the default for "all user defined types" as stated in the description, but instead an optional base class, or rather a couple of optional base classes, one with the read-only system attributes and one with 'name' and 'description', as per Dave's comments, with which I fully agree? We have such classes in our application, and other applications are likely to have something similar, so making this optional would give more flexibility to application developers. In my experience, it is not uncommon to have classes for which even the read-only system attributes are not required or desired (unnecessary overhead), because they represent objects that may need to be referenced from multiple objects but are otherwise handled as "lightweight" sub-objects of another root/parent object.

> All user defined types should have a set of common attributes
> -------------------------------------------------------------
>
>       Key: ATLAS-819
>       URL: https://issues.apache.org/jira/browse/ATLAS-819
>   Project: Atlas
> Issue Type: Bug
>  Reporter: Hemanth Yamijala
>
> It would be very convenient if all user defined types had a conventional set
> of common attributes, including:
> * name
> * description
> * owner
> * created at
> * modified at
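The two optional supertypes suggested in the comment above could look roughly like this. A sketch only: none of these class or attribute names are actual Atlas type definitions.

```java
// Hypothetical sketch of the two optional base classes proposed above.
class SystemAttributes {
    String owner;
    long createdAt;   // read-only, set by the server
    long modifiedAt;  // read-only, set by the server
}

class NamedAsset extends SystemAttributes {
    String name;
    String description;
}

// A "lightweight" sub-object can opt out of both supertypes entirely,
// avoiding the attribute overhead when it is neither required nor desired.
class GeoPoint {
    double latitude;
    double longitude;
}
```

Keeping the supertypes optional lets each application decide, per type, whether the common attributes are worth carrying.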
[jira] [Created] (ATLAS-737) Ability to retrieve the size of the result of a DSL search without having to retrieve the result
CASSIO DOS SANTOS created ATLAS-737:
---------------------------------------

    Summary: Ability to retrieve the size of the result of a DSL search without having to retrieve the result
        Key: ATLAS-737
        URL: https://issues.apache.org/jira/browse/ATLAS-737
    Project: Atlas
 Issue Type: Sub-task
   Reporter: CASSIO DOS SANTOS

This can be implemented analogously to a "select count(*)" in SQL, and is all the more relevant given the added ability to paginate a search result.
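The proposed contract could be sketched as follows: a `count()` call that returns only the result size, alongside the existing paginated fetch. The interface and the in-memory stub are hypothetical illustrations, not Atlas APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the requested capability: count(dsl) is the
// "SELECT COUNT(*)" analogue; search(dsl, offset, limit) is the paginated fetch.
interface DslSearchService {
    List<String> search(String dslQuery, int offset, int limit);
    long count(String dslQuery);
}

// Trivial in-memory stand-in, for illustration only.
class InMemorySearchService implements DslSearchService {
    private final List<String> rows;

    InMemorySearchService(List<String> rows) { this.rows = rows; }

    @Override
    public List<String> search(String dslQuery, int offset, int limit) {
        int from = Math.min(offset, rows.size());
        int to = Math.min(offset + limit, rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }

    @Override
    public long count(String dslQuery) {
        return rows.size(); // size only; no result rows are materialized for the caller
    }
}
```

A caller paginating a large result would first call `count()` to size the page controls, then fetch pages with `search()`.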
[jira] [Commented] (ATLAS-541) Soft deletes
[ https://issues.apache.org/jira/browse/ATLAS-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227027#comment-15227027 ]

CASSIO DOS SANTOS commented on ATLAS-541:
-----------------------------------------

I think that enabling/disabling soft delete at the type level would give a better level of flexibility. A similar approach could be adopted when versioning support is added.

> Soft deletes
> ------------
>
>       Key: ATLAS-541
>       URL: https://issues.apache.org/jira/browse/ATLAS-541
>   Project: Atlas
> Issue Type: New Feature
>  Reporter: Shwetha G S
>  Assignee: Shwetha G S
>
> We don't have graph versioning currently, and hard deletes are not acceptable
> for data governance. This JIRA tracks the proposal for soft deletes, which can
> mark an entity as deleted; by default, search should return only active
> entities. However, there should be an option to retrieve deleted entities.
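Per-type soft delete, as suggested in the comment above, could be sketched like this: the type registration records whether deletes are soft, and `delete()` either marks the entity DELETED or removes it. All names here are illustrative, not Atlas classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of type-level soft delete.
class EntityStore {
    enum Status { ACTIVE, DELETED }

    private final Map<String, Boolean> softDeleteByType = new HashMap<>();
    private final Map<String, String> entityTypes = new HashMap<>();
    private final Map<String, Status> entities = new HashMap<>();

    void registerType(String typeName, boolean softDelete) {
        softDeleteByType.put(typeName, softDelete);
    }

    void create(String id, String typeName) {
        entityTypes.put(id, typeName);
        entities.put(id, Status.ACTIVE);
    }

    void delete(String id) {
        boolean soft = softDeleteByType.getOrDefault(entityTypes.get(id), true);
        if (soft) {
            entities.put(id, Status.DELETED); // retained, hidden from default search
        } else {
            entities.remove(id);              // hard delete
            entityTypes.remove(id);
        }
    }

    Status status(String id) { return entities.get(id); }
}
```

Default search would then filter on `Status.ACTIVE`, with an explicit option to include DELETED entities, as the issue proposes.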
[jira] [Commented] (ATLAS-517) Upgrade titan to 1.x
[ https://issues.apache.org/jira/browse/ATLAS-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196005#comment-15196005 ]

CASSIO DOS SANTOS commented on ATLAS-517:
-----------------------------------------

[~yhemanth] I more than understand the dependency conundrum you have to deal with at the platform level, but am I right to assume that getting Atlas to run on multiple platforms or environments is a valid project goal? I can file a JIRA for HBase 1.2.x/1.3 support as you suggested, but we're also evaluating other short-term alternatives like Cassandra, so it's not yet clear what priority we would assign to that work. When we get to work on the changes to support Titan 1.x, could we try to further isolate/minimize the dependencies on Titan by adding an intermediate, generic TinkerPop 3 based layer, so that support for different versions of different graph stores could be more easily plugged in in the future? If you have any ideas or plans along those lines, we'd like to learn more about them; this is an area we'd be very interested in contributing to. Releasing on Titan 1.x instead of 0.5.4 would also spare us from having to deal with a data migration.

> Upgrade titan to 1.x
> --------------------
>
>              Key: ATLAS-517
>              URL: https://issues.apache.org/jira/browse/ATLAS-517
>          Project: Atlas
>       Issue Type: Wish
> Affects Versions: trunk
>         Reporter: Nigel Jones
>
> Titan 0.5.4 currently ships with, and is supported by, Atlas.
> This itself officially supports:
> - Cassandra 1.2.z, 2.0.z
> - HBase 0.94.z, 0.96.z, 0.98.z
> - ElasticSearch 1.0.z, 1.1.z, 1.2.z
> - Solr 4.8.1
> - Tinkerpop 2.5.z
> source: http://s3.thinkaurelius.com/docs/titan/0.5.4/version-compat.html
> As of 24 Feb 2015, Titan 1.0.0 is current and supports:
> - Cassandra 1.2.z, 2.0.z, 2.1.z (ADDS support for 2.1)
> - HBase 0.94.z, 0.96.z, 0.98.z, 1.0.z (ADDS support for 1.0)
> - ElasticSearch 1.0.z, 1.1.z, 1.2.z (DROPS these, ADDS 1.5)
> - Solr 5.2.z (DROPS 4.8.1, ADDS 5.2.z)
> - Tinkerpop 3.0.z (DROPS 2.5, ADDS 3.0)
> In addition, in the Titan community 1.1 is now being built, and there are
> discussions around Tinkerpop 3.1 support, as well as Hadoop 2.
> source: https://groups.google.com/forum/#!searchin/aureliusgraphs/1.1/aureliusgraphs/e5L5M6MQozY/QHXtx5hFAwAJ
> I would like to be able to use current versions of Titan as my graph store in
> order to benefit from:
> - a platform on which to better integrate with Tinkerpop (see separate issue
>   to be raised)
> - improvements in indexing
> - more recent HBase and Cassandra support for underlying storage
> Given Titan 1.1 is imminent, I would be inclined to aim for that as a target,
> and perhaps we should start experimenting with Titan 1.0, since there have
> been API changes.
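The intermediate graph layer suggested in the comment above could be sketched as a small interface that Atlas code programs against, with Titan (or any other TinkerPop 3 compatible store) as one pluggable implementation behind it. The names below are illustrative, not actual Atlas classes; a trivial in-memory implementation stands in for a real backend.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical graph-abstraction interface: Atlas code would depend only on
// this, isolating it from any particular Titan (or TinkerPop) version.
interface GraphStore {
    int addVertex(Map<String, Object> properties);
    void addEdge(int from, int to, String label);
    Object vertexProperty(int id, String key);
}

// In-memory stand-in implementation, for illustration only. A real plug-in
// would delegate these calls to Titan/TinkerPop, HBase, Cassandra, etc.
class InMemoryGraphStore implements GraphStore {
    private final List<Map<String, Object>> vertices = new ArrayList<>();
    private final List<int[]> edges = new ArrayList<>();

    @Override
    public int addVertex(Map<String, Object> properties) {
        vertices.add(new HashMap<>(properties));
        return vertices.size() - 1; // vertex id
    }

    @Override
    public void addEdge(int from, int to, String label) {
        edges.add(new int[] { from, to });
    }

    @Override
    public Object vertexProperty(int id, String key) {
        return vertices.get(id).get(key);
    }
}
```

Swapping graph stores, or Titan versions, would then mean swapping the `GraphStore` implementation rather than touching Atlas core code.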
[jira] [Commented] (ATLAS-487) Externalize tag in search method
[ https://issues.apache.org/jira/browse/ATLAS-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194226#comment-15194226 ]

CASSIO DOS SANTOS commented on ATLAS-487:
-----------------------------------------

[~yhemanth] I think that [ATLAS-491|https://issues.apache.org/jira/browse/ATLAS-491] is related, but it is more generic and at a higher level of abstraction than the multi-tenancy considered in this thread, where hierarchies are not involved. I wonder if that could be implemented more efficiently (and more securely) via some "lower-level" partitioning of the data, possibly at the graph database layer? Ideally, the storage layer under Atlas would provide that type of capability, so that we would not be required to do any query rewriting.

> Externalize tag in search method
> --------------------------------
>
>       Key: ATLAS-487
>       URL: https://issues.apache.org/jira/browse/ATLAS-487
>   Project: Atlas
> Issue Type: Improvement
>  Reporter: Prasad S Madugundu
>  Priority: Critical
>
> Tagging metadata (or adding traits to metadata) can be used for
> classification of metadata, and for metadata partitioning for multi-tenancy
> purposes or partitioning based on the organization hierarchy. In these use
> cases, it would be ideal if I could pass the trait as a separate parameter to
> the search method, instead of including the tag as a predicate in the query
> string.
> If I have a complex query that retrieves metadata from multiple types, the
> query becomes more complex if I need to add predicates for the tags for all
> the types used in the query.
> Externalizing the tag from the search query would also lead to better
> structured client code, because I could add the classification or partition
> to the query without modifying the query.
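The externalized-tag idea in the issue above could be sketched as follows: the caller passes the trait as a separate argument, and the service applies it to the query on the caller's behalf. This is a deliberately naive illustration under assumptions; the class name, the string-level rewrite, and the use of a DSL `isa` clause are hypothetical, and (as the comment above argues) a real implementation would ideally filter at the storage layer rather than rewrite queries at all.

```java
// Hypothetical sketch: tag supplied separately from the DSL query string.
class SearchRequest {
    final String dslQuery;
    final String tag; // e.g. a tenant or org-hierarchy trait, applied server-side

    SearchRequest(String dslQuery, String tag) {
        this.dslQuery = dslQuery;
        this.tag = tag;
    }

    // Naive string-level rewrite, for illustration only; a real implementation
    // would operate on the parsed query or push the filter down to storage.
    String effectiveQuery() {
        return tag == null ? dslQuery : dslQuery + " isa " + tag;
    }
}
```

The client code benefit described in the issue falls out directly: the caller's query string never changes when the classification or partition changes.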
[jira] [Commented] (ATLAS-517) Upgrade titan to 1.x
[ https://issues.apache.org/jira/browse/ATLAS-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193853#comment-15193853 ]

CASSIO DOS SANTOS commented on ATLAS-517:
-----------------------------------------

[~yhemanth] I wanted to check whether there has been any activity or discussion on that front on your side, and whether there are any updates to your plan. Our cloud platform supports more recent stable versions of some of the products Atlas depends on, like Titan 1.x and HBase 1.2.x, and having to deploy with older versions prevents us from leveraging many of the management services available in the newer versions, not to mention that some of those older versions are no longer supported by their providers. I understand that in this particular case you have the additional challenge of moving to Java 8, but in the more general case, the ability to more quickly validate and support newer versions of the underlying data stores may become more critical in environments like the cloud. I've noticed that the Atlas code has some mechanisms in place to address that on top of what Titan provides. If getting Atlas supported on top of Titan 1.x is going to require more work and take longer, one temporary option that may work for us is to get Titan 0.5.4 to work on top of HBase 1.2.x; that could require some changes to the Atlas code, but those would likely be much more localized and less disruptive than a full port to the latest version of Titan.
[jira] [Commented] (ATLAS-511) Ability to run multiple instances of Atlas Server with automatic failover to one active server
[ https://issues.apache.org/jira/browse/ATLAS-511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185159#comment-15185159 ]

CASSIO DOS SANTOS commented on ATLAS-511:
-----------------------------------------

Hemanth Yamijala, you're right about what I meant by "on demand"; lazy loading is perhaps a better way to refer to it. That said, as Venkata Madugundu has considered in his comments, evaluating the performance impact of turning the cache off is something we could perhaps help with in the short term. I'd like to know your thoughts on other options, such as: a distributed cache; a cache with a "refresh-if-obsolete" policy that checks the type's timestamp in the backend store to decide whether a cached type needs to be refreshed; or a "refresh type" broadcast to all instances. I agree we should use JIRAs separate from this one to cover those different alternatives, and perhaps have some investigative or proof-of-concept work done in parallel on some of them.

> Ability to run multiple instances of Atlas Server with automatic failover to
> one active server
> -----------------------------------------------------------------------------
>
>         Key: ATLAS-511
>         URL: https://issues.apache.org/jira/browse/ATLAS-511
>     Project: Atlas
>  Issue Type: Sub-task
>    Reporter: Hemanth Yamijala
>    Assignee: Hemanth Yamijala
> Attachments: HADesign.pdf
>
> One of the most important components that only supports active-standby mode
> currently is the Atlas server, which hosts the API/UI for Atlas. As described
> in the [HA Documentation|http://atlas.incubator.apache.org/0.6.0-incubating/HighAvailability.html],
> we are currently limited to running only one instance of the Atlas server
> behind a proxy service. If the running instance goes down, a manual process
> is required to bring up another instance.
> In this JIRA, we propose adding the ability to run multiple Atlas server
> instances. However, as a first step, only one of them will be actively
> processing requests. To have consistent terminology, let us call that server
> the *master*. Any requests sent to the other servers will be redirected to
> the master.
> When the master suffers a partition, one of the other servers must
> automatically become the master and start processing requests. What this mode
> brings us over the current system is the ability to automatically fail over
> the Atlas server instance without any manual intervention. Note that this can
> arguably be called an [active/active setup|https://en.wikipedia.org/wiki/High-availability_cluster].
> ATLAS-488 was raised to support multiple active Atlas server instances. While
> that would be ideal, we have to learn more about the underlying system's
> behavior before we can get there, and hopefully we can take smaller steps to
> improve the system systematically. The method proposed here is similar to
> what is adopted in many other Hadoop components, including the HDFS NameNode,
> HBase HMaster, etc.
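The "refresh-if-obsolete" policy floated in the comment above could be sketched like this: each cached type carries the timestamp it was loaded with, and a lookup compares that to the backend's current timestamp, reloading only when stale. All names are illustrative assumptions, not Atlas classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a refresh-if-obsolete type cache.
class TypeCache {
    static final class Entry {
        final String definition;
        final long loadedAt;
        Entry(String definition, long loadedAt) {
            this.definition = definition;
            this.loadedAt = loadedAt;
        }
    }

    interface TypeStore {
        long lastModified(String typeName); // cheap timestamp probe
        String load(String typeName);       // full definition fetch
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final TypeStore store;

    TypeCache(TypeStore store) { this.store = store; }

    String get(String typeName) {
        long backendTs = store.lastModified(typeName);
        Entry e = cache.get(typeName);
        if (e == null || e.loadedAt < backendTs) {          // missing or stale
            e = new Entry(store.load(typeName), backendTs); // refresh from store
            cache.put(typeName, e);
        }
        return e.definition;
    }
}
```

The trade-off is one cheap timestamp probe per lookup in exchange for never serving a stale definition, which is what makes it attractive across multiple server instances.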
[jira] [Commented] (ATLAS-511) Ability to run multiple instances of Atlas Server with automatic failover to one active server
[ https://issues.apache.org/jira/browse/ATLAS-511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183156#comment-15183156 ]

CASSIO DOS SANTOS commented on ATLAS-511:
-----------------------------------------

A couple of questions: Should "TitanGraphProvider solr 5 index added" be marked in bold on page 2? Will the failover time grow as the number of types that need to be loaded increases (and possibly due to other factors), in particular if Atlas is used in a multi-tenant environment, and in that case should you consider options like on-demand type cache initialization or a distributed cache?
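The on-demand (lazy) type cache initialization raised in the question above could be sketched as follows: nothing is loaded at startup or failover, and each type is fetched from the backend on first use. This moves type loads off the failover path at the cost of a first-access penalty. The class name and loader shape are hypothetical, not Atlas code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of lazy (on-demand) type loading.
class LazyTypeRegistry {
    private final Map<String, String> types = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // backend fetch, e.g. from the graph store

    LazyTypeRegistry(Function<String, String> loader) { this.loader = loader; }

    // Nothing is loaded at startup/failover; each type is fetched on first use.
    String get(String typeName) {
        return types.computeIfAbsent(typeName, loader);
    }

    int loadedCount() { return types.size(); }
}
```

With this shape, failover time is independent of the number of types, which is the property the multi-tenant concern above cares about.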