Re: Review Request 74608: ATLAS-4797 : Implement custom audit filters in Atlas

2023-10-25 Thread Sidharth Mishra

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74608/#review225896
---




intg/src/main/java/org/apache/atlas/model/audit/EntityAuditEventV2.java
Lines 122 (patched)


Kindly format with the correct indentation



repository/src/main/java/org/apache/atlas/repository/ogm/AtlasRuleDTO.java
Lines 39 (patched)


Kindly format with the correct indentation



repository/src/main/java/org/apache/atlas/repository/ogm/AtlasRuleDTO.java
Lines 117 (patched)


Kindly format with the correct indentation.



repository/src/main/java/org/apache/atlas/rulesengine/AtlasEntityAuditFilterService.java
Lines 209 (patched)


isValid can be avoided here, as shown below, for better readability.

private void validateRuleAction(String action) throws AtlasBaseException {
    if (StringUtils.isNotEmpty(action)) {
        for (RuleAction.Result res : RuleAction.Result.values()) {
            if (res.name().equals(action)) {
                return;
            }
        }
        throw new AtlasBaseException(AtlasErrorCode.INVALID_RULE_ACTION);
    }
}



repository/src/main/java/org/apache/atlas/rulesengine/AtlasEntityAuditFilterService.java
Lines 224 (patched)


This is a very big function and should be broken down, for example:

private void validateRuleExprFormat(AtlasRule.RuleExpr ruleExpr) throws AtlasBaseException {
    if (ruleExpr == null) {
        return;
    }

    List<AtlasRule.RuleExprObject> allExpressions = ruleExpr.getRuleExprObjList();
    List<List<String>> recordedTypeNamesList = new ArrayList<>();

    for (AtlasRule.RuleExprObject ruleExprObj : allExpressions) {
        validateTypeName(ruleExprObj.getTypeName(), recordedTypeNamesList);

        AtlasRule.Condition condition = ruleExprObj.getCondition();
        List criterion = ruleExprObj.getCriterion();
        validateConditionAndCriteria(condition, criterion);

        validateAttributes(ruleExprObj, condition, criterion);
    }
}

private void validateTypeName(String typeName, List<List<String>> recordedTypeNamesList) throws AtlasBaseException {
    if (Strings.isNullOrEmpty(typeName) || "null".equals(typeName)) {
        throw new AtlasBaseException(AtlasErrorCode.MISSING_MANDATORY_TYPENAME_IN_RULE_EXPR);
    }

    if (isDuplicateTypeNameValue(recordedTypeNamesList, typeName)) {
        throw new AtlasBaseException(AtlasErrorCode.DUPLICATE_TYPENAME_IN_RULE_EXPR, typeName);
    }

    recordedTypeNamesList.add(Arrays.asList(typeName.split(",")));

    getEntityTypes(typeName);
}

private void validateConditionAndCriteria(AtlasRule.Condition condition, List criterion) throws AtlasBaseException {
    if (condition == null && CollectionUtils.isEmpty(criterion)) {
        throw new AtlasBaseException(AtlasErrorCode.MISSING_CRITERIA_CONDITION, "condition and criteria");
    }
}

// ruleExprObj is passed in so that the else branch can read its operator and attribute fields
private void validateAttributes(AtlasRule.RuleExprObject ruleExprObj, AtlasRule.Condition condition, List criterion) throws AtlasBaseException {
    Set<AtlasEntityType> entityTypes = getEntityTypes(ruleExprObj.getTypeName());

    for (AtlasEntityType entityType : entityTypes) {
        if (condition != null && CollectionUtils.isNotEmpty(criterion)) {
            validateCriteriaList(entityType, criterion);
        } else {
            validateExpression(entityType, ruleExprObj.getOperator(), ruleExprObj.getAttributeName(), ruleExprObj.getAttributeValue());
        }
    }
}



repository/src/main/java/org/apache/atlas/rulesengine/AtlasEntityAuditFilterService.java
Lines 292 (patched)


Kindly consider refactoring this, for example:

private void validateExternalAttribute(String attrName, String attrValue, AtlasRule.RuleExprObject.Operator operator) throws AtlasBaseException {
    if (ATTR_OPERATION_TYPE.equals(attrName)) {
        EntityAuditEventV2.EntityAuditActionV2[] enumConstants = EntityAuditEventV2.EntityAuditActionV2.class.getEnumConstants();

        if (isValidOperator(enumConstants, operator, attrValue)) {
            return;
        }
    }

    throw new AtlasBaseException(AtlasErrorCode.INVALID_OPERATOR_ON_ATTRIBUTE, operator.getSymbol(), ATTR_OPERATION_TYPE);
}

private boolean isValidOperator(EntityAuditEventV2.EntityAuditActionV2[] enumConstants, AtlasRule.RuleExprObject.Operator operator, String attrValue) {
    switch (operator) {
        case EQ:
            return 

Re: Review Request 74452: ATLAS-4754 : Download Search with Basic Search gives java.io.FileNotFoundException

2023-10-25 Thread Jayendra Parab

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74452/#review225895
---


Ship it!




Ship It!

- Jayendra Parab


On May 23, 2023, 7:56 a.m., Mandar Ambawane wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74452/
> ---
> 
> (Updated May 23, 2023, 7:56 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Pinal Shah, and Sheetal Shah.
> 
> 
> Bugs: ATLAS-4754
> https://issues.apache.org/jira/browse/ATLAS-4754
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Default value for the download directory will be read from system property 
> "user.dir" instead of property "atlas.home"
> 
> 
> Diffs
> -
> 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/searchdownload/SearchResultDownloadTask.java
>  fd90fd440 
> 
> 
> Diff: https://reviews.apache.org/r/74452/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Mandar Ambawane
> 
>
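
The change described in the quoted review request (defaulting the download directory from the "user.dir" system property rather than "atlas.home") could look roughly like the sketch below. This is not the code from the patch: the class name, the method name, and the sub-directory name are assumptions made for illustration. The only facts relied on are that System.getProperty returns null for a property that is not set, and that the JVM always defines "user.dir".

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class DownloadDirDefaults {
    // Hypothetical sub-directory name; not taken from the Atlas patch.
    static final String DOWNLOAD_SUBDIR = "search_result_downloads";

    // Returns the configured directory if one is given, otherwise a
    // directory under the JVM working directory ("user.dir").
    static Path resolveDownloadDir(String configuredDir) {
        if (configuredDir != null && !configuredDir.trim().isEmpty()) {
            return Paths.get(configuredDir.trim());
        }
        // "user.dir" is always defined by the JVM, whereas a custom
        // property such as "atlas.home" is null unless it was set.
        String base = System.getProperty("user.dir");
        return Paths.get(base, DOWNLOAD_SUBDIR);
    }

    public static void main(String[] args) {
        System.out.println(resolveDownloadDir(null));
        System.out.println(resolveDownloadDir("/tmp/atlas-downloads"));
    }
}
```

An explicitly configured value still takes precedence in this sketch; only the fallback changes.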



[jira] [Updated] (ATLAS-4804) Migrated Data: Process Entity Name not set to QualifiedName for impala_process and impala_process_execution

2023-10-25 Thread Paresh Devalia (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paresh Devalia updated ATLAS-4804:
--
Description: 
*Background*

In Atlas, process names ({_}impala_process.name{_}) previously used _queryText_ 
as the name. From the latest version onwards, _name_ and _qualifiedName_ are 
the same.

*Solution*

Add Java patch that updates the name property.

*Impact of Not Doing this Update*

The _queryText_ in _impala_process_ and _impala_process_execution_ entities 
tends to be large. The name field is part of {_}AtlasEntityHeader{_}. Fetching 
search results and displaying lineage are some of the flows that include these 
entities.

*Issue occurred*
Caused by: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
Error from server at 
[http://cdhworker11.cinconet.local:8993/solr/vertex_index_shard3_replica_n14:] 
Exception writing document id muydq8 to the index; possible analysis error: 
Document contains at least one immense term in field="cgsl_s" (whose UTF8 
encoding is longer than the max length 32766), all of which were skipped. 
Please correct the analyzer to not produce such terms. The prefix of the first 
immense term is: '[105, 110, 115, 101, 114, 116, 32, 105, 110, 116, 111, 32, 
103, 105, 103, 121, 97, 46, 115, 116, 103, 95, 108, 97, 98, 95, 103, 105, 103, 
121]...', original message: bytes can be at most 32766 in length; got 41069. 
Perhaps the document has an indexed string field (solr.StrField) which is too 
large
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:557)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1046)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
at org.janusgraph.diskstorage.solr.Solr6Index.commitChanges(Solr6Index.java:633)
at org.janusgraph.diskstorage.solr.Solr6Index.restore(Solr6Index.java:593)
... 9 more

  was:
*Background*

In Atlas, process names ({_}impala_process.name{_}) previously used _queryText_ 
as the name. From 7.1.7 SP2 onwards, _name_ and _qualifiedName_ are the same.

*Solution*

Add Java patch that updates the name property.

*Impact of Not Doing this Update*

The _queryText_ in _impala_process_ and _impala_process_execution_ entities 
tends to be large. The name field is part of {_}AtlasEntityHeader{_}. Fetching 
search results and displaying lineage are some of the flows that include these 
entities.

*Issue occurred*
Caused by: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
Error from server at 
[http://cdhworker11.cinconet.local:8993/solr/vertex_index_shard3_replica_n14:] 
Exception writing document id muydq8 to the index; possible analysis error: 
Document contains at least one immense term in field="cgsl_s" (whose UTF8 
encoding is longer than the max length 32766), all of which were skipped. 
Please correct the analyzer to not produce such terms. The prefix of the first 
immense term is: '[105, 110, 115, 101, 114, 116, 32, 105, 110, 116, 111, 32, 
103, 105, 103, 121, 97, 46, 115, 116, 103, 95, 108, 97, 98, 95, 103, 105, 103, 
121]...', original message: bytes can be at most 32766 in length; got 41069. 
Perhaps the document has an indexed string field (solr.StrField) which is too 
large
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:557)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1046)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
at org.janusgraph.diskstorage.solr.Solr6Index.commitChanges(Solr6Index.java:633)
at org.janusgraph.diskstorage.solr.Solr6Index.restore(Solr6Index.java:593)
... 9 more


> Migrated Data: Process Entity Name not set to QualifiedName for 
> impala_process and impala_process_execution
> ---
>
> Key: ATLAS-4804
> URL: https://issues.apache.org/jira/browse/ATLAS-4804
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Paresh Devalia
>Assignee: Paresh Devalia
>Priority: Major
>
> *Background*
> In 

[jira] [Created] (ATLAS-4804) Migrated Data: Process Entity Name not set to QualifiedName for impala_process and impala_process_execution

2023-10-25 Thread Paresh Devalia (Jira)
Paresh Devalia created ATLAS-4804:
-

 Summary: Migrated Data: Process Entity Name not set to 
QualifiedName for impala_process and impala_process_execution
 Key: ATLAS-4804
 URL: https://issues.apache.org/jira/browse/ATLAS-4804
 Project: Atlas
  Issue Type: Bug
  Components:  atlas-core
Reporter: Paresh Devalia
Assignee: Paresh Devalia


*Background*

In Atlas, process names ({_}impala_process.name{_}) previously used _queryText_ 
as the name. From 7.1.7 SP2 onwards, _name_ and _qualifiedName_ are the same.

*Solution*

Add Java patch that updates the name property.
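
As a rough illustration of what such a patch does, the sketch below copies qualifiedName into name for the two affected entity types. It is not the Atlas patch itself: entities are modelled as plain maps, and the class and method names are hypothetical. A real patch would work against the Atlas graph store instead.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ProcessNamePatchSketch {
    // Entity types named in this issue.
    static final Set<String> PATCHED_TYPES =
            new HashSet<>(Arrays.asList("impala_process", "impala_process_execution"));

    // Sets "name" to "qualifiedName" for affected entities and
    // returns how many entities were changed.
    static int apply(List<Map<String, Object>> entities) {
        int updated = 0;
        for (Map<String, Object> entity : entities) {
            if (!PATCHED_TYPES.contains(entity.get("typeName"))) {
                continue; // other types are left untouched
            }
            Object qualifiedName = entity.get("qualifiedName");
            if (qualifiedName != null && !qualifiedName.equals(entity.get("name"))) {
                entity.put("name", qualifiedName);
                updated++;
            }
        }
        return updated;
    }
}
```

Skipping entities whose name already equals qualifiedName keeps the sketch safe to run more than once.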

*Impact of Not Doing this Update*

The _queryText_ in _impala_process_ and _impala_process_execution_ entities 
tends to be large. The name field is part of {_}AtlasEntityHeader{_}. Fetching 
search results and displaying lineage are some of the flows that include these 
entities.

*Issue occurred*
Caused by: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
Error from server at 
[http://cdhworker11.cinconet.local:8993/solr/vertex_index_shard3_replica_n14:] 
Exception writing document id muydq8 to the index; possible analysis error: 
Document contains at least one immense term in field="cgsl_s" (whose UTF8 
encoding is longer than the max length 32766), all of which were skipped. 
Please correct the analyzer to not produce such terms. The prefix of the first 
immense term is: '[105, 110, 115, 101, 114, 116, 32, 105, 110, 116, 111, 32, 
103, 105, 103, 121, 97, 46, 115, 116, 103, 95, 108, 97, 98, 95, 103, 105, 103, 
121]...', original message: bytes can be at most 32766 in length; got 41069. 
Perhaps the document has an indexed string field (solr.StrField) which is too 
large
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:557)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1046)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
at org.janusgraph.diskstorage.solr.Solr6Index.commitChanges(Solr6Index.java:633)
at org.janusgraph.diskstorage.solr.Solr6Index.restore(Solr6Index.java:593)
... 9 more
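
The byte prefix printed in the error above can be decoded to see what kind of value was rejected. The sketch below is only a reading aid, with hypothetical class and method names. It decodes the quoted prefix as UTF-8 and checks a value against the 32766-byte limit stated in the message.

```java
import java.nio.charset.StandardCharsets;

public class ImmenseTermCheck {
    // Maximum UTF-8 length of a single term, as quoted in the error message.
    static final int MAX_TERM_BYTES = 32766;

    // Turns the decimal byte values from the error message back into text.
    static String decodePrefix(int[] codes) {
        byte[] bytes = new byte[codes.length];
        for (int i = 0; i < codes.length; i++) {
            bytes[i] = (byte) codes[i];
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // True if the value's UTF-8 encoding is longer than the limit.
    static boolean exceedsTermLimit(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length > MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        int[] prefix = {105, 110, 115, 101, 114, 116, 32, 105, 110, 116, 111, 32,
                        103, 105, 103, 121, 97, 46, 115, 116, 103, 95, 108, 97, 98,
                        95, 103, 105, 103, 121};
        System.out.println(decodePrefix(prefix)); // prints "insert into gigya.stg_lab_gigy"
    }
}
```

The decoded prefix is the start of an INSERT statement, which is consistent with the description above: the oversized value appears to be query text stored in the name field.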



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (ATLAS-4803) Optimize Edge fetch

2023-10-25 Thread Paresh Devalia (Jira)
Paresh Devalia created ATLAS-4803:
-

 Summary: Optimize Edge fetch
 Key: ATLAS-4803
 URL: https://issues.apache.org/jira/browse/ATLAS-4803
 Project: Atlas
  Issue Type: Bug
  Components:  atlas-core
Reporter: Paresh Devalia
Assignee: Paresh Devalia


CU Kafka lag was not decreasing for the ATLAS_HOOK topics, and the create 
Entity API was taking 50-60 seconds per request.

The Hive_table count was 10 million records.

The Impala_lineage_column count was 26 million records.

The issue can be reproduced on an in-house cluster.


