[jira] [Created] (ARROW-8873) Usage model for Object IDs. Object IDs don't disappear after delete

2020-05-20 Thread Abe Mammen (Jira)
Abe Mammen created ARROW-8873:
-

 Summary: Usage model for Object IDs. Object IDs don't disappear 
after delete
 Key: ARROW-8873
 URL: https://issues.apache.org/jira/browse/ARROW-8873
 Project: Apache Arrow
  Issue Type: Test
  Components: C++, Python
Affects Versions: 0.17.0
Reporter: Abe Mammen


I have an environment that uses Arrow + Plasma to send requests between Python 
clients and a C++ server that responds with search results etc.

I use a sequence-number-based approach for Object ID creation so that it is 
understood on both sides. All of that works well. Each request from the client 
creates a unique Object ID, then creates and seals the object. On the other 
end, a get against that Object ID retrieves the request payload, and the server 
then releases and deletes the Object ID. A similar scheme is used for responses 
from the server to the client (search results etc.): the server creates its own 
unique Object ID, understood by the client, and creates and seals the object; 
the Python client does a get and deletes the Object ID (there appears to be no 
release method in Python). I have also experimented with deleting the Plasma 
buffer.
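
To make the ID scheme concrete, here is a minimal sketch of how a 
sequence-number-based ID could be derived identically on both sides. It is an 
illustration only, not the code used in this environment: the helper name 
MakeObjectIdBytes, the "req:"/"rsp:" tags, and the hex layout are all 
hypothetical. The only fixed point is that Plasma Object IDs are 20 bytes long; 
the resulting string could then be handed to plasma::ObjectID::from_binary.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Plasma Object IDs are 20 bytes long.
constexpr std::size_t kIdSize = 20;

// Hypothetical helper: builds a 20-byte ID from a direction tag and a
// sequence number. Layout (an assumption): 4-byte tag + 16 hex digits.
std::string MakeObjectIdBytes(bool is_request, std::uint64_t seq) {
  char buf[kIdSize + 1];  // 20 ID bytes plus the terminating NUL
  std::snprintf(buf, sizeof(buf), "%s%016llx",
                is_request ? "req:" : "rsp:",
                static_cast<unsigned long long>(seq));
  return std::string(buf, kIdSize);
}
```

With this layout, request number 1 maps to "req:0000000000000001" and response 
number 255 maps to "rsp:00000000000000ff", so client and server can compute the 
same ID from the sequence number alone.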

The end result is that as transactions build up, server-side memory use goes 
way up, and I can see that a good number of the objects aren't deleted from the 
Plasma store until the server exits. I have also nulled out the search-result 
part, so that is not what is accumulating. I have not done a memory profile, 
but wanted to get some feedback on what might be wrong here.

Is there a better way to use Object IDs, for example? And what might be causing 
the huge memory usage? In this example, I had ~4M transactions between clients 
and the server, which hit a memory usage of 10+ GB; that is in the ballpark of 
the total size of all the payloads. Besides doing release-deletes on Object 
IDs, is there a better way to purge and remove these objects?

Any help is appreciated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8709) ArrayToJSON

2020-05-05 Thread Abe Mammen (Jira)
Abe Mammen created ARROW-8709:
-

 Summary: ArrayToJSON
 Key: ARROW-8709
 URL: https://issues.apache.org/jira/browse/ARROW-8709
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Affects Versions: 0.17.0
Reporter: Abe Mammen


Analogous to ArrayFromJSON it would be good to have the reverse. Any plans?





[jira] [Created] (ARROW-8636) plasma client delete (of objectid) causes an exception and abort

2020-04-29 Thread Abe Mammen (Jira)
Abe Mammen created ARROW-8636:
-

 Summary: plasma client delete (of objectid) causes an exception 
and abort
 Key: ARROW-8636
 URL: https://issues.apache.org/jira/browse/ARROW-8636
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Abe Mammen


I built Arrow from this git repo. In C++, this call:
{quote}ARROW_CHECK_OK(client.Delete(vector{objectId}));{quote}
fails with the following output and also kills the plasma-store-server:
{quote}Check failed: _s.ok() Operation failed: client.Delete(vector{objectId})
Bad status: IOError: Encountered unexpected EOF
0 libarrow.18.0.0.dylib 0x0001070ed3c4 _ZN5arrow4util7CerrLog14PrintBackTraceEv + 52
1 libarrow.18.0.0.dylib 0x0001070ed2e2 _ZN5arrow4util7CerrLogD2Ev + 98
2 libarrow.18.0.0.dylib 0x0001070ed245 _ZN5arrow4util7CerrLogD1Ev + 21
3 libarrow.18.0.0.dylib 0x0001070ed26c _ZN5arrow4util7CerrLogD0Ev + 28
4 libarrow.18.0.0.dylib 0x0001070ed152 _ZN5arrow4util8ArrowLogD2Ev + 82
5 libarrow.18.0.0.dylib 0x0001070ed185 _ZN5arrow4util8ArrowLogD1Ev + 21
6 purge_plasma_messages 0x00010431fe91 main + 2369
7 libdyld.dylib 0x7fff6650b7fd start + 1
8 ??? 0x0001 0x0 + 1
Abort trap: 6{quote}
What could I be doing wrong? Here is the code:

// Headers required by the code below (the header names were stripped from
// the original post).
#include <iostream>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

#include <arrow/util/logging.h>
#include <plasma/client.h>

using namespace std;
using namespace plasma;

int main(int argc, char** argv)
{
  // Start up and connect a Plasma client.
  PlasmaClient client;
  ARROW_CHECK_OK(client.Connect("/tmp/plasma_store"));

  // List every object currently held in the store.
  std::unordered_map<ObjectID, std::unique_ptr<ObjectTableEntry>> objectTable;
  ARROW_CHECK_OK(client.List(&objectTable));

  cout << "# of objects = " << objectTable.size() << endl;

  // Print each entry, then ask the store to delete it.
  for (auto it = objectTable.begin(); it != objectTable.end(); ++it) {
    ObjectID objectId = it->first;
    auto objectEntry = it->second.get();
    string idString = objectId.binary();
    cout << "object id = " << idString <<
        ", device = " << objectEntry->device_num <<
        ", data_size = " << objectEntry->data_size <<
        ", metadata_size = " << objectEntry->metadata_size <<
        ", ref_count = " << objectEntry->ref_count <<
        endl;
    ARROW_CHECK_OK(client.Delete(vector<ObjectID>{objectId}));
  }
  ARROW_CHECK_OK(client.Disconnect());
}


