[ 
https://issues.apache.org/jira/browse/ATLAS-690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated ATLAS-690:
-----------------------------------
    Attachment: ATLAS-690-4.patch

I think this patch is ready for review.

It took a while to get to the bottom of the failing delete tests, which seem to 
have exposed behavior in Titan composite indexes that I cannot yet explain. 
Essentially, adding a composite index with two property keys causes lookups 
involving both keys to return stale results when the value of one of the keys 
changes. 

In our case, when we added a composite index on the unique attribute and the 
state attribute, changing the state attribute to deleted did not cause the 
index to be updated. Hence, when the actual object was being updated, the 
search by unique attribute and old state was still returning results - which 
caused the test failures. The same behavior occurs if the two attributes are 
added independently to two different indexes. 

For now, I've fixed this by adding an index on only the unique attribute and 
removing the system composite index on the state attribute. The performance 
fixes are still retained with this change. I've written to the Titan mailing 
list seeking guidance on understanding the issue. The mail thread is here: 
https://groups.google.com/forum/#!msg/aureliusgraphs/bx7T843mzXU/2IAEOHH4BAAJ
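
To illustrate the idea behind the workaround, here is a toy in-memory model in 
plain Java. It is not Titan or Atlas code, and every class and method name in 
it is hypothetical. It assumes the index is keyed on the unique attribute 
alone and that the state condition is checked against the vertex itself after 
the lookup, so a state change never needs an index update.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model only: NOT the Titan or Atlas API. All names here are hypothetical.
final class UniqueAttributeIndex {
    // A vertex reduced to the two properties discussed above.
    static final class Vertex {
        final String uniqueValue;
        String state; // e.g. "ACTIVE" or "DELETED"

        Vertex(String uniqueValue, String state) {
            this.uniqueValue = uniqueValue;
            this.state = state;
        }
    }

    // Index keyed on the unique attribute alone; state is not part of the key.
    private final Map<String, Vertex> byUniqueValue = new HashMap<>();

    void add(Vertex v) {
        byUniqueValue.put(v.uniqueValue, v);
    }

    // Look up by unique attribute, then filter on the vertex's current state.
    Optional<Vertex> find(String uniqueValue, String state) {
        return Optional.ofNullable(byUniqueValue.get(uniqueValue))
                .filter(v -> v.state.equals(state));
    }
}
```

In this model, once a vertex's state is set to deleted, a search by unique 
attribute and the old state finds nothing, because the state is read from the 
vertex rather than from an index entry.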

I've added tests to ensure that the system and user type indexes are added 
properly. This will help catch any regressions in the future.

One nice side effect is that the test {{GraphRepoMapperScaleTest}}, which was 
taking about 6 mins recently, has come down to about 1m20s with these fixes. 
This also suggests that the timing of these tests can be tracked to identify 
performance regressions.
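
As a rough sketch of what tracking test timing could look like, the helper 
below times a task and compares the result against a budget. The class name, 
method names and the idea of a fixed budget are my assumptions; nothing like 
this exists in the patch.

```java
import java.time.Duration;

// Hypothetical helper, not part of Atlas or its test suite.
final class TimingBudget {
    // Runs the task once and returns how long it took.
    static Duration time(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return Duration.ofNanos(System.nanoTime() - start);
    }

    // True when the measured duration does not exceed the allowed budget.
    static boolean withinBudget(Duration elapsed, Duration budget) {
        return elapsed.compareTo(budget) <= 0;
    }
}
```

A test could wrap its body in {{TimingBudget.time}} and fail, or just log a 
warning, when {{withinBudget}} returns false for a budget chosen with some 
headroom over the current run time.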

I'm still waiting for [~ssainath]'s cluster runs at scale, but it is probably 
better to do them with this patch. At any rate, local tests indicate 
improvements good enough to move forward with review.

> Read timed out exceptions when tables are imported into Atlas.
> --------------------------------------------------------------
>
>                 Key: ATLAS-690
>                 URL: https://issues.apache.org/jira/browse/ATLAS-690
>             Project: Atlas
>          Issue Type: Bug
>         Environment: Atlas with External Kafka/  HBase / Solr
> atlas.notification.hook.numthreads=5
> ATLAS_HOOK created with 5 partitions
>            Reporter: Sharmadha Sainath
>            Assignee: Hemanth Yamijala
>            Priority: Blocker
>         Attachments: ATLAS-690-3.patch, ATLAS-690-4.patch
>
>
> When 1000 tables are imported into Atlas using the Hive hook, read timed out 
> exceptions occur. This happened with the latest Atlas build, commit id 
> 922a83c9a10e857d54855463225e9a5c375bc2b9:
>    • Hive ingestion was completed in 1 minute 50 secs.
>    • Atlas ingestion took more than an hour.
> In the last 1000-table run done on Atlas, with commit id 
> b9575f29df3cc014f1b076abf52d88249bf4d0ef:
>    • Hive ingestion was completed in 3 minutes.
>    • Atlas ingestion was completed within 5 minutes.
> The exception stack trace:
> Error handling message org.apache.atlas.notification.hook.HookNotification$EntityUpdateRequest@7474dd2d (NotificationHookConsumer:224)
> com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
> at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
> at com.sun.jersey.api.client.Client.handle(Client.java:652)
> at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
> at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
> at com.sun.jersey.api.client.WebResource$Builder.method(WebResource.java:634)
> at org.apache.atlas.AtlasClient.callAPIWithResource(AtlasClient.java:911)
> at org.apache.atlas.AtlasClient.callAPIWithRetries(AtlasClient.java:565)
> at org.apache.atlas.AtlasClient.callAPI(AtlasClient.java:935)
> at org.apache.atlas.AtlasClient.updateEntities(AtlasClient.java:530)
> at org.apache.atlas.notification.NotificationHookConsumer$HookConsumer.run(NotificationHookConsumer.java:216)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:152)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:689)
> at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1324)
> at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
> at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
> at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)



