Re: Review Request 74964: Atlas - Upgrade Nimbus-JOSE-JWT to 9.37.3

2024-06-18 Thread Priyanshi Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74964/#review226564
---


Ship it!

- Priyanshi Shah


On April 24, 2024, 7:10 p.m., Mandar Ambawane wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74964/
> ---
> 
> (Updated April 24, 2024, 7:10 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Priyanshi Shah, and Sheetal Shah.
> 
> 
> Bugs: ATLAS-4855
> https://issues.apache.org/jira/browse/ATLAS-4855
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Atlas currently pulls in nimbus-jose-jwt 9.8.1. Upgrade it to 9.37.3.
> 
> 
> Diffs
> -
> 
>   webapp/pom.xml 7d2d4c952 
> 
> 
> Diff: https://reviews.apache.org/r/74964/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Mandar Ambawane
> 
>



Re: Review Request 74961: ATLAS-4854 : Atlas - Upgrade Spring Security to 5.8.11

2024-06-18 Thread Sheetal Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74961/#review226563
---


Ship it!

- Sheetal Shah


On June 17, 2024, 1:17 p.m., Priyanshi Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74961/
> ---
> 
> (Updated June 17, 2024, 1:17 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Mandar Ambawane, Pinal Shah, and 
> Sheetal Shah.
> 
> 
> Bugs: ATLAS-4854
> https://issues.apache.org/jira/browse/ATLAS-4854
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Atlas currently uses Spring Security 5.8.5. Upgrade it to 5.8.11.
> 
> 
> Diffs
> -
> 
>   pom.xml 6e6724275 
> 
> 
> Diff: https://reviews.apache.org/r/74961/diff/1/
> 
> 
> Testing
> ---
> 
> Manual testing done; pre-commit (PC) build run.
> PC link: 
> https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/1651/
> 
> 
> Thanks,
> 
> Priyanshi Shah
> 
>



Re: Review Request 74965: ATLAS-4844 : Atlas - Upgrade Common Configuration2 to 2.10.1

2024-06-18 Thread Sheetal Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74965/#review226562
---


Ship it!

- Sheetal Shah


On June 17, 2024, 1:16 p.m., Priyanshi Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74965/
> ---
> 
> (Updated June 17, 2024, 1:16 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Mandar Ambawane, Pinal Shah, and 
> Sheetal Shah.
> 
> 
> Bugs: ATLAS-4844
> https://issues.apache.org/jira/browse/ATLAS-4844
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Atlas currently uses Commons Configuration2 2.8.0. Upgrade it to 2.10.1.
> 
> 
> Diffs
> -
> 
>   pom.xml 6e6724275 
> 
> 
> Diff: https://reviews.apache.org/r/74965/diff/1/
> 
> 
> Testing
> ---
> 
> Manual testing done; pre-commit (PC) build run.
> PC link: 
> https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/1651/
> 
> 
> Thanks,
> 
> Priyanshi Shah
> 
>



Re: Review Request 74964: Atlas - Upgrade Nimbus-JOSE-JWT to 9.37.3

2024-06-18 Thread Sheetal Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74964/#review226561
---


Ship it!

- Sheetal Shah


On April 25, 2024, 12:40 a.m., Mandar Ambawane wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74964/
> ---
> 
> (Updated April 25, 2024, 12:40 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Priyanshi Shah, and Sheetal Shah.
> 
> 
> Bugs: ATLAS-4855
> https://issues.apache.org/jira/browse/ATLAS-4855
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Atlas currently pulls in nimbus-jose-jwt 9.8.1. Upgrade it to 9.37.3.
> 
> 
> Diffs
> -
> 
>   webapp/pom.xml 7d2d4c952 
> 
> 
> Diff: https://reviews.apache.org/r/74964/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Mandar Ambawane
> 
>



possible to remove 'incubator-atlas' repo

2024-06-18 Thread PJ Fanning
Hi,

I noticed that you have this git repo:
https://github.com/apache/incubator-atlas

It hasn't been updated in 7 years. Your main repo is:
https://github.com/apache/atlas

My guess is that the 'incubator-atlas' repo was left behind when the project 
left the Apache Incubator. Repos are normally renamed, but your team might 
have created a new repo based on the old one.

Keeping the out-of-date 'incubator-atlas' repo around is a little confusing. 
Would it be possible to get it deleted?

I am not subscribed to this list so please include me explicitly on any replies.

Regards,
PJ


Re: Review Request 75054: ATLAS-4882: Export/Import: Export exits with 'Found 0 entities'

2024-06-18 Thread Pinal Shah


> On June 18, 2024, 8:04 a.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java
> > Line 41 (original), 41 (patched)
> > 
> >
> > Is removing ".has('__guid')" not needed in line #41, #42, #45?

The above queries are used in VertexExtractor. The extractor strategy is 
selected by the following condition:

    return (atlasEntityDef == null || atlasEntityDef.getRelationshipAttributeDefs().size() == 0)
            ? extractors.get(VERTEX_BASED_EXTRACT)
            : extractors.get(RELATION_BASED_EXTRACT);

I was not able to verify this, since I could not find a use case that 
triggers the VertexExtractor strategy.
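
To make the condition concrete, here is a minimal, self-contained sketch of 
that selection rule. The class name, the string keys, and the plain list of 
relationship-attribute names used in place of AtlasEntityDef are all 
assumptions made for illustration; this is not the Atlas implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ExtractStrategySketch {
    // Hypothetical stand-ins for the strategy keys quoted above.
    public static final String VERTEX_BASED_EXTRACT   = "vertex-based";
    public static final String RELATION_BASED_EXTRACT = "relation-based";

    // relationshipAttributeNames stands in for
    // atlasEntityDef.getRelationshipAttributeDefs(); null models a missing entity def.
    public static String chooseStrategy(List<String> relationshipAttributeNames) {
        return (relationshipAttributeNames == null || relationshipAttributeNames.isEmpty())
                ? VERTEX_BASED_EXTRACT
                : RELATION_BASED_EXTRACT;
    }

    public static void main(String[] args) {
        System.out.println(chooseStrategy(null));                          // missing entity def
        System.out.println(chooseStrategy(Collections.emptyList()));       // no relationship attributes
        System.out.println(chooseStrategy(Arrays.asList("tables")));       // has relationship attributes
    }
}
```

Under this rule the vertex-based path is taken only for a type that is 
unresolved or has no relationship attributes, which may be why a use case 
for it is hard to find.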


- Pinal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75054/#review226559
---


On June 18, 2024, 7:54 a.m., Pinal Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/75054/
> ---
> 
> (Updated June 18, 2024, 7:54 a.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Jayendra Parab, Madhan Neethiraj, 
> and Radhika PC.
> 
> 
> Bugs: ATLAS-4882
> https://issues.apache.org/jira/browse/ATLAS-4882
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> please check the jira description 
> https://issues.apache.org/jira/browse/ATLAS-4882
> 
> 
> Diffs
> -
> 
>   repository/src/main/java/org/apache/atlas/repository/impexp/StartEntityFetchByExportRequest.java d01d6775a 
>   repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java 5b10c353e 
> 
> 
> Diff: https://reviews.apache.org/r/75054/diff/1/
> 
> 
> Testing
> ---
> 
> Manually verified
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>



[jira] [Updated] (ATLAS-4882) Export/Import: Export exits with "Found 0 entities"

2024-06-18 Thread Pinal Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pinal Shah updated ATLAS-4882:
--
Description: 
*Issue:*
Export during ingestion fails and logs "Found 0 entities".
Here, ingestion means Atlas is consuming messages.

*Steps to Repro:*
 * Make sure the backend has more than 1M entities
 * Start creating tables under `db1@cm`
 * Start an export for `db1@cm`
 * 
{code:java}
curl -v -X POST -u admin:admin -H "Content-Type: application/json" \
  "http://<>/api/atlas/admin/export" \
  -d '{"itemsToExport":[{"typeName":"hive_db","uniqueAttributes":{"qualifiedName":"db1@cm"}}],"options":{"fetchType":"full","replicatedTo":"cm"}}' \
  > export1.zip{code}

 * It fails after some time.

*When is the issue seen?* 
It occurs when there is a huge amount of data in the backend and Atlas is 
consuming messages linked to the entity whose export is running.

*Analysis to find Root cause:*
 * when there is a huge amount of data in the backend, export FAILS
 * when there is a huge amount of data in the backend but fewer tables under it, 
export also FAILS
 * if background consumption stops, export PASSES
 * if consumption is of entities other than those requested in the export, 
export PASSES
 * the export query that finds the starting object is shown below; its has 
clause, which checks that a property exists, is expensive

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}
 - has('__guid') queries Solr: [(35x_t <> null)]:vertex_index
 - the time taken, from the Solr logs, is below

{code:java}
2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=50=35x_t:*+=50=javabin=2}
 hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO  (qtp1158676965-14) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=100=35x_t:*+=50=javabin=2}
 hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO  (qtp1158676965-20) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=150=35x_t:*+=50=javabin=2}
 hits=1682363 status=0 QTime=4465     {code}
 - running the same query through the Gremlin shell while ingestion is 
happening does not fail
 - time taken for the above Gremlin query in code during ingestion              : 214825ms
 - time taken for the above Gremlin query in the Gremlin shell during ingestion : 104641ms
 - time taken for the above Gremlin query with no ingestion                     : 181682ms

The root cause is still unknown.

*WorkAround:*
 - Remove the .has('__guid') clause from the query below; the query is then 
very quick and the issue is not reproducible.

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}
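
As a rough illustration of the workaround, the sketch below drops the 
.has('__guid') existence check from the query string quoted above and prints 
the result. The class and method names are hypothetical, and the string 
replacement is only a way to show the resulting query; it is not how Atlas 
builds or patches its Gremlin queries.

```java
public class ExportStartQuerySketch {
    // Start-entity query as quoted in the description, with the property-existence check.
    public static final String WITH_GUID_CHECK =
            "g.V().has('__typeName','hive_db')"
            + ".has('Referenceable.qualifiedName','db6@cm')"
            + ".has('__guid').values('__guid')";

    // Hypothetical helper: removes only the has('__guid') existence step.
    // The value-based has(...) filters and the final values('__guid') are kept.
    public static String withoutGuidExistenceCheck(String query) {
        return query.replace(".has('__guid')", "");
    }

    public static void main(String[] args) {
        System.out.println(withoutGuidExistenceCheck(WITH_GUID_CHECK));
    }
}
```

The remaining has() steps match on exact values, while the removed step is 
the one the analysis above traces to the expensive [(35x_t <> null)] Solr 
query.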
*Tests:*
 * upgraded the TinkerPop and JanusGraph versions, but that didn't help
 * querying an invalid or non-existent property doesn't throw any exception

  was:
*Issue:*
Export during ingestion fails giving Found 0 entities in the logs
Ingestion meaning Atlas is consuming messages

*When is the issue seen?* 
It occurs when there is huge amount of data in backend and Atlas is consuming 
messages linked to entity of which export is running

*Analysis to find Root cause:*
 * when there is huge amount of data in backend, export FAILS
 * when there is huge amount of data in backend but less tables under it, then 
also export FAILS
 * if background consumption stops, export PASS
 * if consumption is of different entities then requested in export, export PASS
 * export query to find starting object uses below query, where has clause to 
check property is expensive

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}
 - has('__guid') queries solr [(35x_t <> null)]:vertex_index
 - below is the timetaken in the solr logs

{code:java}
2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 

Review Request 75054: ATLAS-4882: Export/Import: Export exits with 'Found 0 entities'

2024-06-18 Thread Pinal Shah

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75054/
---

Review request for atlas, Ashutosh Mestry, Jayendra Parab, Madhan Neethiraj, 
and Radhika PC.


Bugs: ATLAS-4882
https://issues.apache.org/jira/browse/ATLAS-4882


Repository: atlas


Description
---

please check the jira description 
https://issues.apache.org/jira/browse/ATLAS-4882


Diffs
-

  repository/src/main/java/org/apache/atlas/repository/impexp/StartEntityFetchByExportRequest.java d01d6775a 
  repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java 5b10c353e 


Diff: https://reviews.apache.org/r/75054/diff/1/


Testing
---

Manually verified


Thanks,

Pinal Shah



Re: Review Request 75054: ATLAS-4882: Export/Import: Export exits with 'Found 0 entities'

2024-06-18 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75054/#review226559
---


Fix it, then Ship it!





repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java
Line 41 (original), 41 (patched)


Is removing ".has('__guid')" not needed in line #41, #42, #45?


- Madhan Neethiraj


On June 18, 2024, 7:54 a.m., Pinal Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/75054/
> ---
> 
> (Updated June 18, 2024, 7:54 a.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Jayendra Parab, Madhan Neethiraj, 
> and Radhika PC.
> 
> 
> Bugs: ATLAS-4882
> https://issues.apache.org/jira/browse/ATLAS-4882
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> please check the jira description 
> https://issues.apache.org/jira/browse/ATLAS-4882
> 
> 
> Diffs
> -
> 
>   repository/src/main/java/org/apache/atlas/repository/impexp/StartEntityFetchByExportRequest.java d01d6775a 
>   repository/src/main/java/org/apache/atlas/util/AtlasGremlin3QueryProvider.java 5b10c353e 
> 
> 
> Diff: https://reviews.apache.org/r/75054/diff/1/
> 
> 
> Testing
> ---
> 
> Manually verified
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>



Review Request 75053: ATLAS-4881: minor improvements in notification processing

2024-06-18 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75053/
---

Review request for atlas, Ashutosh Mestry, chaitali, Disha Talreja, Jayendra 
Parab, Pinal Shah, Radhika Kundam, Sarath Subramanian, Sheetal Shah, and 
Sidharth Mishra.


Bugs: ATLAS-4881
https://issues.apache.org/jira/browse/ATLAS-4881


Repository: atlas


Description
---

- reduced noise in the Atlas server log file by changing log levels during 
notification processing from info to debug
- updated error handling during notification processing to not retry if the 
failure was due to invalid data (such as entity-type not found)
- updated the metrics log to include the total time taken to process a 
notification
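
The second and third points can be pictured with a small sketch like the one 
below. It is only an illustration under stated assumptions: the class, the 
InvalidDataException type, the Handler interface, and the maxAttempts 
parameter are all hypothetical and are not taken from 
NotificationHookConsumer.

```java
public class NotificationRetrySketch {
    /** Hypothetical marker for failures caused by invalid data (e.g. entity-type not found). */
    public static class InvalidDataException extends RuntimeException {
        public InvalidDataException(String message) { super(message); }
    }

    /** Hypothetical handler for a single notification message. */
    public interface Handler {
        void handle(String message) throws Exception;
    }

    /** Returns the number of attempts made; only possibly transient failures are retried. */
    public static int process(String message, Handler handler, int maxAttempts) {
        long startNanos = System.nanoTime();
        int  attempts   = 0;

        while (attempts < maxAttempts) {
            attempts++;
            try {
                handler.handle(message);
                break; // processed successfully
            } catch (InvalidDataException e) {
                break; // the same data would fail again, so do not retry
            } catch (Exception e) {
                // possibly transient failure: loop around and retry
            }
        }

        // Total time covers all attempts for this notification.
        long totalMillis = (System.nanoTime() - startNanos) / 1_000_000;
        System.out.println("attempts=" + attempts + ", totalTimeMs=" + totalMillis);
        return attempts;
    }

    public static void main(String[] args) {
        process("bad",   m -> { throw new InvalidDataException("entity-type not found"); }, 3);
        process("flaky", m -> { throw new IllegalStateException("backend unavailable"); }, 3);
        process("ok",    m -> { }, 3);
    }
}
```

With these assumptions, an invalid-data failure stops after one attempt, 
while any other failure is retried up to maxAttempts times.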


Diffs
-

  common/src/main/java/org/apache/atlas/utils/AtlasPerfMetrics.java c72b2c3e2 
  webapp/src/main/java/org/apache/atlas/notification/NotificationHookConsumer.java 7b02ac449 
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/HiveDbDDLPreprocessor.java dcff0939d 
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/HivePreprocessor.java 083e343b0 
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/HiveTableDDLPreprocessor.java 83d4d7c1a 
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/PreprocessorContext.java f930d9f35 


Diff: https://reviews.apache.org/r/75053/diff/1/


Testing
---

- verified that notification preprocessing doesn't print logs in info level
- pre-commit tests run: 
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/1654/


Thanks,

Madhan Neethiraj



[jira] [Updated] (ATLAS-4882) Export/Import: Export exits with "Found 0 entities"

2024-06-18 Thread Pinal Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pinal Shah updated ATLAS-4882:
--
Description: 
*Issue:*
Export during ingestion fails and logs "Found 0 entities".
Here, ingestion means Atlas is consuming messages.

*When is the issue seen?* 
It occurs when there is a huge amount of data in the backend and Atlas is 
consuming messages linked to the entity whose export is running.

*Analysis to find Root cause:*
 * when there is a huge amount of data in the backend, export FAILS
 * when there is a huge amount of data in the backend but fewer tables under it, 
export also FAILS
 * if background consumption stops, export PASSES
 * if consumption is of entities other than those requested in the export, 
export PASSES
 * the export query that finds the starting object is shown below; its has 
clause, which checks that a property exists, is expensive

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}
 - has('__guid') queries Solr: [(35x_t <> null)]:vertex_index
 - the time taken, from the Solr logs, is below

{code:java}
2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=50=35x_t:*+=50=javabin=2}
 hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO  (qtp1158676965-14) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=100=35x_t:*+=50=javabin=2}
 hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO  (qtp1158676965-20) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=150=35x_t:*+=50=javabin=2}
 hits=1682363 status=0 QTime=4465     {code}
 - running the same query through the Gremlin shell while ingestion is 
happening does not fail
 - time taken for the above Gremlin query in code during ingestion              : 214825ms
 - time taken for the above Gremlin query in the Gremlin shell during ingestion : 104641ms
 - time taken for the above Gremlin query with no ingestion                     : 181682ms

The root cause is still unknown.

*WorkAround:*
 - Remove the .has('__guid') clause from the query below; the query is then 
very quick and the issue is not reproducible.

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}

  was:
*Issue:*
Export during ingestion fails giving Found 0 entities in the logs
Ingestion meaning Atlas is consuming messages

*When is the issue seen?* 
It occurs when there is huge amount of data in backend and Atlas is consuming 
messages linked to entity of which export is running

*Analysis to find Root cause:*
 * when there is huge amount of data in backend, export FAILS
 * when there is huge amount of data in backend but less tables under it, then 
also export FAILS
 * if background consumption stops, export PASS
 * if consumption is of different entities then requested in export, export PASS
 * export query to find starting object uses below query, where has clause to 
check property is expensive

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}

 - has('__guid') queries solr [(35x_t <> null)]:vertex_index
 - below is the timetaken in the solr logs 

{code:java}
2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=50=35x_t:*+=50=javabin=2}
 hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO  (qtp1158676965-14) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=100=35x_t:*+=50=javabin=2}
 hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO  (qtp1158676965-20) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 

[jira] [Updated] (ATLAS-4882) Export/Import: Export exits with "Found 0 entities"

2024-06-18 Thread Pinal Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pinal Shah updated ATLAS-4882:
--
Description: 
*Issue:*
Export during ingestion fails and logs "Found 0 entities".
Here, ingestion means Atlas is consuming messages.

*When is the issue seen?* 
It occurs when there is a huge amount of data in the backend and Atlas is 
consuming messages linked to the entity whose export is running.

*Analysis to find Root cause:*
 * when there is a huge amount of data in the backend, export FAILS
 * when there is a huge amount of data in the backend but fewer tables under it, 
export also FAILS
 * if background consumption stops, export PASSES
 * if consumption is of entities other than those requested in the export, 
export PASSES
 * the export query that finds the starting object is shown below; its has 
clause, which checks that a property exists, is expensive

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}

 - has('__guid') queries Solr: [(35x_t <> null)]:vertex_index
 - the time taken, from the Solr logs, is below

{code:java}
2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=50=35x_t:*+=50=javabin=2}
 hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO  (qtp1158676965-14) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=100=35x_t:*+=50=javabin=2}
 hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO  (qtp1158676965-20) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=150=35x_t:*+=50=javabin=2}
 hits=1682363 status=0 QTime=4465     {code}
 - running the same query through the Gremlin shell while ingestion is 
happening does not fail
 - time taken for the above Gremlin query in code during ingestion              : 214825ms
 - time taken for the above Gremlin query in the Gremlin shell during ingestion : 104641ms
 - time taken for the above Gremlin query with no ingestion                     : 181682ms

The root cause is still unknown.

*WorkAround:*
 - Remove the .has('__guid') clause from the query below; the query is then 
very quick.

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}

  was:
*Issue:*
Export during ingestion fails giving Found 0 entities in the logs
Ingestion meaning Atlas is consuming messages

*When is the issue seen?* 
It occurs when there is huge amount of data in backend and Atlas is consuming 
messages linked to entity of which export is running

*Analysis to find Root cause:*
 * when there is huge amount of data in backend, export FAILS
 * when there is huge amount of data in backend but less tables under it, then 
also export FAILS
 * if background consumption stops, export PASS
 * if consumption is of different entities then requested in export, export PASS
 * export query to find starting object uses below query, where has clause to 
check property is expensive
  
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid')
- has('__guid') queries [(35x_t <> null)]:vertex_index , checked timetaken in 
the solr logs 

2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params=\{q=*:*&_stateVer_=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params=\{q=*:*&_stateVer_=vertex_index:12=id=50=35x_t:*+=50=javabin=2}
 hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO  (qtp1158676965-14) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params=\{q=*:*&_stateVer_=vertex_index:12=id=100=35x_t:*+=50=javabin=2}
 hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO  (qtp1158676965-20) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  

[jira] [Updated] (ATLAS-4882) Export/Import: Export exits with "Found 0 entities"

2024-06-18 Thread Pinal Shah (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pinal Shah updated ATLAS-4882:
--
Description: 
*Issue:*
Export during ingestion fails and logs "Found 0 entities".
Here, ingestion means Atlas is consuming messages.

*When is the issue seen?* 
It occurs when there is a huge amount of data in the backend and Atlas is 
consuming messages linked to the entity whose export is running.

*Analysis to find Root cause:*
 * when there is a huge amount of data in the backend, export FAILS
 * when there is a huge amount of data in the backend but fewer tables under it, 
export also FAILS
 * if background consumption stops, export PASSES
 * if consumption is of entities other than those requested in the export, 
export PASSES
 * the export query that finds the starting object is shown below; its has 
clause, which checks that a property exists, is expensive

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}
 - has('__guid') queries Solr: [(35x_t <> null)]:vertex_index
 - the time taken, from the Solr logs, is below

{code:java}
2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=50=35x_t:*+=50=javabin=2}
 hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO  (qtp1158676965-14) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=100=35x_t:*+=50=javabin=2}
 hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO  (qtp1158676965-20) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=150=35x_t:*+=50=javabin=2}
 hits=1682363 status=0 QTime=4465     {code}
 - running the same query through the Gremlin shell while ingestion is 
happening does not fail
 - time taken for the above Gremlin query in code during ingestion              : 214825ms
 - time taken for the above Gremlin query in the Gremlin shell during ingestion : 104641ms
 - time taken for the above Gremlin query with no ingestion                     : 181682ms

The root cause is still unknown.

*WorkAround:*
 - Remove the .has('__guid') clause from the query below; the query is then 
very quick and the issue is not reproducible.

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}
*Tests:*
 * upgraded the TinkerPop and JanusGraph versions, but that didn't help
 * querying an invalid or non-existent property doesn't throw any exception

  was:
*Issue:*
Export during ingestion fails giving Found 0 entities in the logs
Ingestion meaning Atlas is consuming messages

*When is the issue seen?* 
It occurs when there is huge amount of data in backend and Atlas is consuming 
messages linked to entity of which export is running

*Analysis to find Root cause:*
 * when there is huge amount of data in backend, export FAILS
 * when there is huge amount of data in backend but less tables under it, then 
also export FAILS
 * if background consumption stops, export PASS
 * if consumption is of different entities then requested in export, export PASS
 * export query to find starting object uses below query, where has clause to 
check property is expensive

{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}
 - has('__guid') queries solr [(35x_t <> null)]:vertex_index
 - below is the timetaken in the solr logs

{code:java}
2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=50=35x_t:*+=50=javabin=2}
 hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO  (qtp1158676965-14) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=:=vertex_index:12=id=100=35x_t:*+=50=javabin=2}
 hits=1682216 status=0 QTime=1962

[jira] [Created] (ATLAS-4882) Export/Import: Export exits with "Found 0 entities"

2024-06-18 Thread Pinal Shah (Jira)
Pinal Shah created ATLAS-4882:
-

 Summary: Export/Import: Export exits with "Found 0 entities" 
 Key: ATLAS-4882
 URL: https://issues.apache.org/jira/browse/ATLAS-4882
 Project: Atlas
  Issue Type: Bug
  Components:  atlas-core
Reporter: Pinal Shah
Assignee: Pinal Shah


*Issue:*
Export during ingestion fails and logs "Found 0 entities".
Here, ingestion means Atlas is consuming messages.

*When is the issue seen?* 
It occurs when there is a huge amount of data in the backend and Atlas is 
consuming messages linked to the entity whose export is running.

*Analysis to find Root cause:*
 * when there is a huge amount of data in the backend, export FAILS
 * when there is a huge amount of data in the backend but fewer tables under it, 
export also FAILS
 * if background consumption stops, export PASSES
 * if consumption is of entities other than those requested in the export, 
export PASSES
 * the export query that finds the starting object is shown below; its has 
clause, which checks that a property exists, is expensive
  
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid')
- has('__guid') queries Solr: [(35x_t <> null)]:vertex_index; the time taken 
was checked in the Solr logs 

2024-06-14 02:38:56.218 INFO  (qtp1158676965-19) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=*:*&_stateVer_=vertex_index:12=id=0=35x_t:*+=50=javabin=2}
 hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO  (qtp1158676965-16) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=*:*&_stateVer_=vertex_index:12=id=50=35x_t:*+=50=javabin=2}
 hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO  (qtp1158676965-14) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=*:*&_stateVer_=vertex_index:12=id=100=35x_t:*+=50=javabin=2}
 hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO  (qtp1158676965-20) [c:vertex_index s:shard1 
r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request 
[vertex_index_shard1_replica_n1]  webapp=/solr path=/select 
params={q=*:*&_stateVer_=vertex_index:12=id=150=35x_t:*+=50=javabin=2}
 hits=1682363 status=0 QTime=4465

- ran the same query through the gremlin shell while ingestion was happening; 
it doesn't fail
- time taken for above gremlin query in code during ingestion          : 214825ms
- time taken for above gremlin query in gremlin shell during ingestion : 104641ms
- time taken for above gremlin query with no ingestion                 : 181682ms
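The QTime values in the Solr log lines quoted above can be totalled with a few lines of Python. This is only a sketch for reading such logs, not part of Atlas: the function name summarize_qtime is made up for illustration, and the sample lines are the four entries from this report, joined onto one line each with the params field left out.

```python
import re

# The four Solr request entries from the report, one per line.
# The params={...} field is omitted here; only hits/status/QTime are parsed.
SOLR_LOG = """\
2024-06-14 02:38:56.218 INFO (qtp1158676965-19) webapp=/solr path=/select hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO (qtp1158676965-16) webapp=/solr path=/select hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO (qtp1158676965-14) webapp=/solr path=/select hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO (qtp1158676965-20) webapp=/solr path=/select hits=1682363 status=0 QTime=4465
"""

# Matches the trailing "hits=<n> status=<n> QTime=<ms>" part of a request entry.
REQUEST_PATTERN = re.compile(r"hits=(\d+) status=(\d+) QTime=(\d+)")


def summarize_qtime(log_text):
    """Return request count, total QTime and max QTime (ms) found in log_text."""
    qtimes = [int(m.group(3)) for m in REQUEST_PATTERN.finditer(log_text)]
    return {
        "requests": len(qtimes),
        "total_qtime_ms": sum(qtimes),
        "max_qtime_ms": max(qtimes, default=0),
    }


summary = summarize_qtime(SOLR_LOG)
print(summary)
```

For the four entries above, the total QTime comes to 11441 ms, with the slowest single request at 4465 ms.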


Workaround

- Remove the .has('__guid') clause from the query below; without it the query 
is very quick
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid')



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ATLAS-4881) Reduce logs printed during notification processing

2024-06-18 Thread Madhan Neethiraj (Jira)
Madhan Neethiraj created ATLAS-4881:
---

 Summary: Reduce logs printed during notification processing
 Key: ATLAS-4881
 URL: https://issues.apache.org/jira/browse/ATLAS-4881
 Project: Atlas
  Issue Type: Improvement
  Components:  atlas-core
Reporter: Madhan Neethiraj
Assignee: Madhan Neethiraj


While processing notifications from Hive hook, Atlas server can print several 
info level logs about its internal performance improvements, like the following:
{noformat}
INFO  - [NotificationHookConsumer thread-0:] ~ setting 
hive_column_lineage.name=QUERY:db1.tbl1@cl1:1559003302000->:INSERT:db1.tbl1@cl1:1544966827000:col1.
 topic-offset=170, partition=0 (HivePreprocessor$HiveProcessPreprocessor:176)
INFO  - [NotificationHookConsumer thread-0:] ~ moved 269 referred-entities to 
end of entities-list (firstEntity:typeName=hive_table, 
qualifiedName=db1.tbl2@cl1). topic-offset=2143, partition=0 
(PreprocessorContext:369){noformat}
To avoid unnecessary overhead and the noise in the log file, these messages 
should be logged at debug level.
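The effect of moving such messages to debug level can be illustrated with a small sketch. Atlas is a Java application, so this is not its code: the example uses Python's standard logging module as a stand-in, and the names build_logger and process_notification are hypothetical. It shows that a message emitted at debug level is dropped when the logger runs at INFO and only appears when DEBUG is enabled.

```python
import io
import logging


def build_logger(level):
    """Create an isolated logger at the given level that writes to a string buffer."""
    stream = io.StringIO()
    logger = logging.getLogger("notification_demo_" + logging.getLevelName(level))
    logger.setLevel(level)
    logger.propagate = False  # keep output out of the root logger
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    logger.addHandler(handler)
    return logger, stream


def process_notification(logger, moved, offset):
    """Hypothetical stand-in for a preprocessing step that reports what it did."""
    # Lazy %-style arguments: the message is only formatted if DEBUG is enabled.
    logger.debug(
        "moved %d referred-entities to end of entities-list. topic-offset=%d",
        moved,
        offset,
    )


# At INFO level (a typical production setting) the message is suppressed.
info_logger, info_out = build_logger(logging.INFO)
process_notification(info_logger, 269, 2143)

# At DEBUG level the same call produces the message.
debug_logger, debug_out = build_logger(logging.DEBUG)
process_notification(debug_logger, 269, 2143)

print(repr(info_out.getvalue()))
print(debug_out.getvalue(), end="")
```

Passing the values as arguments rather than pre-formatting the string also avoids the formatting cost when the level is disabled, which is the overhead the issue refers to.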





[jira] [Commented] (ATLAS-4878) utility to analyze hook notifications

2024-06-18 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17855802#comment-17855802
 ] 

ASF subversion and git services commented on ATLAS-4878:


Commit 978087a882348f1fc1b6002a0aeb29192d8cc00a in atlas's branch 
refs/heads/master from Madhan Neethiraj
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=978087a88 ]

ATLAS-4878: utility to analyze hook notifications


> utility to analyze hook notifications
> -
>
> Key: ATLAS-4878
> URL: https://issues.apache.org/jira/browse/ATLAS-4878
> Project: Atlas
>  Issue Type: Improvement
>  Components:  atlas-core
>Reporter: Madhan Neethiraj
>Assignee: Madhan Neethiraj
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: ATLAS-4878.patch
>
>
> A utility that analyzes notifications received from hooks and gathers the 
> following details will be useful in troubleshooting:
>  # number of notifications per notification type (CREATE, UPDATE, 
> PARTIAL_UPDATE, DELETE, ..)
>  # number of entities referenced in notifications per entity type
>  # number of entity operations performed while processing the notifications 
> (create/update/delete)
>  
> For example, the following details, obtained by analyzing 114k notifications 
> from Hive hook, show that 94% of entities processed are of type hive_column 
> or hive_column_lineage:
> {noformat}
> {
>   "notifications": 114755,
>   "entities": 598435,
>   "notificationEntities": 2575347,
>   "notificationByType": {
>     "ENTITY_CREATE_V2": 49428,
>     "ENTITY_FULL_UPDATE_V2": 1597,
>     "ENTITY_PARTIAL_UPDATE_V2": 36561,
>     "ENTITY_DELETE_V2": 27169
>   },
>   "notificationEntityByType": {
>     "hdfs_path": 16417,
>     "hive_db": 20471,
>     "hive_table": 57143,
>     "hive_storagedesc": 30018,
>     "hive_column": 685384,
>     "hive_process": 41512,
>     "hive_column_lineage": 1724402
>   },
>   "entityOperations": {
>     "CREATE": 598435,
>     "UPDATE": 1913182,
>     "PARTIAL_UPDATE": 36561,
>     "DELETE": 27169
>   },
>   "entityOperationsByType": {
>     "CREATE": {
>       "hdfs_path": 10940,
>       "hive_db": 224,
>       "hive_table": 22154,
>       "hive_storagedesc": 15280,
>       "hive_column": 332332,
>       "hive_process": 23462,
>       "hive_column_lineage": 194043
>     },
>     "UPDATE": {
>       "hdfs_path": 5477,
>       "hive_column": 319559,
>       "hive_column_lineage": 1530359,
>       "hive_db": 20203,
>       "hive_process": 18050,
>       "hive_storagedesc": 13204,
>       "hive_table": 6330
>     },
>     "PARTIAL_UPDATE": {
>       "hive_column": 33493,
>       "hive_storagedesc": 1534,
>       "hive_table": 1534
>     },
>     "DELETE": {
>       "hive_db": 44,
>       "hive_table": 27125
>     }
>   }
> } {noformat}
>  
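The 94% figure quoted in the issue description can be rechecked from the notificationEntityByType counts. The sketch below is not the utility from the patch; the helper name share_of_types is made up for illustration, and the numbers are copied from the sample output above.

```python
# Counts copied from "notificationEntityByType" in the sample output.
notification_entity_by_type = {
    "hdfs_path": 16417,
    "hive_db": 20471,
    "hive_table": 57143,
    "hive_storagedesc": 30018,
    "hive_column": 685384,
    "hive_process": 41512,
    "hive_column_lineage": 1724402,
}


def share_of_types(counts, type_names):
    """Return the fraction of all counted entities that belong to type_names."""
    total = sum(counts.values())
    selected = sum(counts[name] for name in type_names)
    return selected / total


total_entities = sum(notification_entity_by_type.values())
column_share = share_of_types(
    notification_entity_by_type, ("hive_column", "hive_column_lineage")
)

print(total_entities)         # matches "notificationEntities" in the output
print(f"{column_share:.1%}")  # share of hive_column + hive_column_lineage
```

The per-type counts add up to the reported notificationEntities value (2575347), and the two column-related types account for roughly 93.6% of them, consistent with the 94% stated in the description.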


