[ https://issues.apache.org/jira/browse/ARROW-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Philipp Moritz resolved ARROW-1410.
-----------------------------------
    Resolution: Fixed

Issue resolved by pull request 992
[https://github.com/apache/arrow/pull/992]

> Plasma object store occasionally pauses for a long time
> -------------------------------------------------------
>
>                 Key: ARROW-1410
>                 URL: https://issues.apache.org/jira/browse/ARROW-1410
>             Project: Apache Arrow
>          Issue Type: Improvement
>         Environment: Ubuntu 16.04
>            Reporter: Robert Nishihara
>            Assignee: Robert Nishihara
>
> The problem can be reproduced as follows. First start a plasma store with
> {code}
> plasma_store -s /tmp/s1 -m 500000000000
> {code}
> Then continuously put in objects using a script like the following.
> {code}
> import pyarrow.plasma as plasma
> import numpy as np
> client = plasma.connect('/tmp/s1', '', 0)
> for i in range(20000):
>     print(i)
>     object_id = plasma.ObjectID(np.random.bytes(20))
>     client.create(object_id, np.random.randint(0, 100000000))
>     client.seal(object_id)
> {code}
> As the loop counters are being printed, you will see long pauses. The problem
> is the fact that we are mmapping pages with the MAP_POPULATE flag. Though
> this can be used to improve performance of subsequent object creations, it
> isn't worth the long pauses. We may want to find a way to populate the pages
> in the background.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
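
For illustration only, the trade-off described in the issue can be observed outside Plasma with a minimal sketch like the one below. It is not the Plasma store's code and not the change made in pull request 992. The helper name time_anonymous_mmap is hypothetical, and the sketch assumes Linux with Python 3.10 or later, where the mmap module exposes MAP_POPULATE; elsewhere the flag falls back to 0 and both runs behave the same. With MAP_POPULATE the kernel prefaults the pages when the mapping is created, so the cost tends to show up in the mmap call itself; without it the cost is spread over the first write to each page. Actual timings depend on the machine and are not shown.

```python
import mmap
import time

# MAP_POPULATE is Linux-specific; Python exposes it from 3.10 on.
# Assumption: fall back to 0 (no effect) where it is unavailable.
MAP_POPULATE = getattr(mmap, "MAP_POPULATE", 0)


def time_anonymous_mmap(size, populate):
    """Return (seconds to create the mapping, seconds to touch every page)."""
    # With fileno -1, Python maps anonymous memory on Unix.
    flags = mmap.MAP_PRIVATE
    if populate:
        flags |= MAP_POPULATE

    start = time.perf_counter()
    region = mmap.mmap(-1, size, flags=flags)
    created = time.perf_counter()

    # Write one byte per page so that every page is faulted in.
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1
    touched = time.perf_counter()

    region.close()
    return created - start, touched - created


if __name__ == "__main__":
    size = 256 * 1024 * 1024  # 256 MiB, far smaller than the store in the report
    for populate in (True, False):
        create_s, touch_s = time_anonymous_mmap(size, populate)
        print(f"populate={populate}: mmap {create_s:.4f}s, first touch {touch_s:.4f}s")
```

Running the script prints the two timings for each setting, which gives a rough sense of where the page-fault cost lands in each case.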