[jira] [Updated] (ATLAS-4804) Migrated Data: Process Entity Name not set to QualifiedName for impala_process and impala_process_execution

2023-11-01 Thread Paresh Devalia (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paresh Devalia updated ATLAS-4804:
--
Attachment: ATLAS-4804-Migrated-Data-Process-Entity-Name-not-set.patch

> Migrated Data: Process Entity Name not set to QualifiedName for 
> impala_process and impala_process_execution
> ---
>
> Key: ATLAS-4804
> URL: https://issues.apache.org/jira/browse/ATLAS-4804
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Paresh Devalia
>Assignee: Paresh Devalia
>Priority: Major
> Attachments: 
> ATLAS-4804-Migrated-Data-Process-Entity-Name-not-set.patch
>
>
> *Background*
> Atlas previously used _queryText_ as the process name 
> ({_}impala_process.name{_}). From the latest version onwards, _name_ and 
> _qualifiedName_ are the same.
> *Solution*
> Add a Java patch that updates the _name_ property.
> *Impact of Not Doing this Update*
> The _queryText_ in _impala_process_ and _impala_process_execution_ entities 
> tends to be large. The _name_ field is part of {_}AtlasEntityHeader{_}. 
> Fetching search results and displaying lineage are some of the flows that 
> include these entities.
> *Issue occurred*
> Caused by: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
> Error from server at 
> [http://cdhworker11.cinconet.local:8993/solr/vertex_index_shard3_replica_n14:]
>  Exception writing document id muydq8 to the index; possible analysis error: 
> Document contains at least one immense term in field="cgsl_s" (whose UTF8 
> encoding is longer than the max length 32766), all of which were skipped. 
> Please correct the analyzer to not produce such terms. The prefix of the 
> first immense term is: '[105, 110, 115, 101, 114, 116, 32, 105, 110, 116, 
> 111, 32, 103, 105, 103, 121, 97, 46, 115, 116, 103, 95, 108, 97, 98, 95, 103, 
> 105, 103, 121]...', original message: bytes can be at most 32766 in length; 
> got 41069. Perhaps the document has an indexed string field (solr.StrField) 
> which is too large
> at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
> at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
> at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:557)
> at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1046)
> at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
> at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
> at org.janusgraph.diskstorage.solr.Solr6Index.commitChanges(Solr6Index.java:633)
> at org.janusgraph.diskstorage.solr.Solr6Index.restore(Solr6Index.java:593)
> ... 9 more
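
The byte prefix in the error above can be decoded to see what kind of value hit the limit. The following is a minimal sketch, not part of the attached patch: the class and method names are hypothetical, and the 32766 limit is taken from the error message itself. Decoding the prefix shows the start of a SQL statement, which is consistent with _queryText_ having been stored in an indexed string field.

```java
import java.nio.charset.StandardCharsets;

public class ImmenseTermCheck {
    // Maximum UTF-8 length of a single indexed term, as quoted in the error message.
    static final int MAX_TERM_BYTES = 32766;

    // Decode the numeric byte prefix that Solr prints for the offending term.
    static String decodePrefix(int[] codes) {
        byte[] bytes = new byte[codes.length];
        for (int i = 0; i < codes.length; i++) {
            bytes[i] = (byte) codes[i];
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // True when the UTF-8 encoding of the value is longer than the term limit.
    static boolean exceedsTermLimit(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length > MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        // Prefix copied from the stack trace in this issue.
        int[] prefix = {105, 110, 115, 101, 114, 116, 32, 105, 110, 116,
                        111, 32, 103, 105, 103, 121, 97, 46, 115, 116,
                        103, 95, 108, 97, 98, 95, 103, 105, 103, 121};
        System.out.println(decodePrefix(prefix)); // prints "insert into gigya.stg_lab_gigy"
    }
}
```

A check like `exceedsTermLimit` could be run over candidate _name_ values before indexing to find entities that would trigger the same exception.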



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (ATLAS-4804) Migrated Data: Process Entity Name not set to QualifiedName for impala_process and impala_process_execution

2023-11-01 Thread Paresh Devalia (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paresh Devalia updated ATLAS-4804:
--
External issue URL:   (was: https://reviews.apache.org/r/74708/)

> Migrated Data: Process Entity Name not set to QualifiedName for 
> impala_process and impala_process_execution
> ---
>
> Key: ATLAS-4804
> URL: https://issues.apache.org/jira/browse/ATLAS-4804
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Paresh Devalia
>Assignee: Paresh Devalia
>Priority: Major
>





[jira] [Updated] (ATLAS-4804) Migrated Data: Process Entity Name not set to QualifiedName for impala_process and impala_process_execution

2023-11-01 Thread Paresh Devalia (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paresh Devalia updated ATLAS-4804:
--
External issue URL: https://reviews.apache.org/r/74708/

> Migrated Data: Process Entity Name not set to QualifiedName for 
> impala_process and impala_process_execution
> ---
>
> Key: ATLAS-4804
> URL: https://issues.apache.org/jira/browse/ATLAS-4804
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Paresh Devalia
>Assignee: Paresh Devalia
>Priority: Major
>





[jira] [Updated] (ATLAS-4804) Migrated Data: Process Entity Name not set to QualifiedName for impala_process and impala_process_execution

2023-10-25 Thread Paresh Devalia (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paresh Devalia updated ATLAS-4804:
--
Description: 
*Background*

Atlas previously used _queryText_ as the process name 
({_}impala_process.name{_}). From the latest version onwards, _name_ and 
_qualifiedName_ are the same.

*Solution*

Add a Java patch that updates the _name_ property.
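
The core of such an update can be sketched as follows. This is only an illustration under stated assumptions, not the attached patch: a plain `Map` stands in for an entity's attributes, the class and method names are hypothetical, and the sample qualified name is made up. It shows the single rule described in the background: set _name_ to the value of _qualifiedName_ when they differ.

```java
import java.util.HashMap;
import java.util.Map;

public class ProcessNamePatchSketch {
    // Hypothetical helper: the map is a stand-in for entity attributes, not an Atlas API.
    // Returns true when the name was changed.
    static boolean alignNameWithQualifiedName(Map<String, Object> attributes) {
        Object qualifiedName = attributes.get("qualifiedName");
        if (qualifiedName == null || qualifiedName.equals(attributes.get("name"))) {
            return false; // nothing to update
        }
        attributes.put("name", qualifiedName);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        attrs.put("qualifiedName", "example_process@cluster"); // made-up value
        attrs.put("queryText", "insert into some_table select ...");
        attrs.put("name", attrs.get("queryText")); // migrated data: name holds the query text

        boolean changed = alignNameWithQualifiedName(attrs);
        System.out.println(changed + " " + attrs.get("name"));
    }
}
```

The helper is idempotent: running it again on an already updated entity leaves the attributes unchanged and returns false, which is convenient for a patch that may be re-run.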

*Impact of Not Doing this Update*

The _queryText_ in _impala_process_ and _impala_process_execution_ entities 
tends to be large. The _name_ field is part of {_}AtlasEntityHeader{_}. 
Fetching search results and displaying lineage are some of the flows that 
include these entities.

*Issue occurred*
Caused by: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
Error from server at 
[http://cdhworker11.cinconet.local:8993/solr/vertex_index_shard3_replica_n14:] 
Exception writing document id muydq8 to the index; possible analysis error: 
Document contains at least one immense term in field="cgsl_s" (whose UTF8 
encoding is longer than the max length 32766), all of which were skipped. 
Please correct the analyzer to not produce such terms. The prefix of the first 
immense term is: '[105, 110, 115, 101, 114, 116, 32, 105, 110, 116, 111, 32, 
103, 105, 103, 121, 97, 46, 115, 116, 103, 95, 108, 97, 98, 95, 103, 105, 103, 
121]...', original message: bytes can be at most 32766 in length; got 41069. 
Perhaps the document has an indexed string field (solr.StrField) which is too 
large
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:557)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1046)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
at org.janusgraph.diskstorage.solr.Solr6Index.commitChanges(Solr6Index.java:633)
at org.janusgraph.diskstorage.solr.Solr6Index.restore(Solr6Index.java:593)
... 9 more

  was:
*Background*

Atlas previously used _queryText_ as the process name 
({_}impala_process.name{_}). From 7.1.7 SP2 onwards, _name_ and 
_qualifiedName_ are the same.

*Solution*

Add a Java patch that updates the _name_ property.

*Impact of Not Doing this Update*

The _queryText_ in _impala_process_ and _impala_process_execution_ entities 
tends to be large. The _name_ field is part of {_}AtlasEntityHeader{_}. 
Fetching search results and displaying lineage are some of the flows that 
include these entities.

*Issue occurred*
Caused by: org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: 
Error from server at 
[http://cdhworker11.cinconet.local:8993/solr/vertex_index_shard3_replica_n14:] 
Exception writing document id muydq8 to the index; possible analysis error: 
Document contains at least one immense term in field="cgsl_s" (whose UTF8 
encoding is longer than the max length 32766), all of which were skipped. 
Please correct the analyzer to not produce such terms. The prefix of the first 
immense term is: '[105, 110, 115, 101, 114, 116, 32, 105, 110, 116, 111, 32, 
103, 105, 103, 121, 97, 46, 115, 116, 103, 95, 108, 97, 98, 95, 103, 105, 103, 
121]...', original message: bytes can be at most 32766 in length; got 41069. 
Perhaps the document has an indexed string field (solr.StrField) which is too 
large
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:125)
at org.apache.solr.client.solrj.impl.CloudSolrClient.getRouteException(CloudSolrClient.java:46)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.directUpdate(BaseCloudSolrClient.java:557)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1046)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
at org.janusgraph.diskstorage.solr.Solr6Index.commitChanges(Solr6Index.java:633)
at org.janusgraph.diskstorage.solr.Solr6Index.restore(Solr6Index.java:593)
... 9 more


> Migrated Data: Process Entity Name not set to QualifiedName for 
> impala_process and impala_process_execution
> ---
>
> Key: ATLAS-4804
> URL: https://issues.apache.org/jira/browse/ATLAS-4804
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Reporter: Paresh Devalia
>Assignee: Paresh Devalia
>Priority: Major
>
> *Background*
> In Atla