[ 
https://issues.apache.org/jira/browse/ARROW-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Qi  closed ARROW-1792.
-------------------------
    Resolution: Later

> [Plasma C++] continuous write tensor failed
> -------------------------------------------
>
>                 Key: ARROW-1792
>                 URL: https://issues.apache.org/jira/browse/ARROW-1792
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Plasma (C++)
>         Environment: ubuntu 14.04 gcc 4.8.4
>            Reporter: Lu Qi 
>   Original Estimate: 288h
>  Remaining Estimate: 288h
>
> Start the Plasma store with "plasma_store -m 8000000000 -s /tmp/plasma",
> then write tensors from Python with:
> {code:python}
> import numpy as np
> import pyarrow as pa
> import pyarrow.plasma as plasma
>
> for i in range(10):
>         client = plasma.connect("/tmp/plasma", "", 0)
>         x = np.random.rand(1000,1000,5*256).astype("float32")    # write 5 GB
>         object_id = pa.plasma.ObjectID(random_object_id())
>         tensor = pa.Tensor.from_numpy(x)
>         data_size = pa.get_tensor_size(tensor)
>         buf = client.create(object_id, data_size)
>         stream = pa.FixedSizeBufferWriter(buf)
>         stream.set_memcopy_threads(6)
>         pa.write_tensor(tensor, stream)
>         client.seal(object_id)
> #        client.release(object_id)
> #        client.disconnect()
>         print(i)
> {code}
> This fails with:
> pyarrow.lib.PlasmaStoreFull: object does not fit in the plasma store
> If I add "client.release(object_id)", the error is:
> /arrow/cpp/src/plasma/client.cc296 Check failed: object_entry != 
> objects_in_use_.end()
> Sometimes the error is instead:
>   buf = client.create(object_id, data_size)
>   File "pyarrow/plasma.pyx", line 301, in pyarrow.plasma.PlasmaClient.create 
> (/arrow/python/build/temp.linux-x86_64-2.7/plasma.cxx:4382)
>   File "pyarrow/error.pxi", line 79, in pyarrow.lib.check_status 
> (/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:7888)
> pyarrow.lib.ArrowIOError: Broken pipe
> After adding "client.disconnect()" it seems to work, but the code below,
> which reuses a single client connection, still fails:
> {code:python}
> client = plasma.connect("/tmp/plasma", "", 0)
> for i in range(10):
>         x = np.random.rand(1000,1000,5*256).astype("float32")    # write 5 GB
>         object_id = pa.plasma.ObjectID(random_object_id())
>         tensor = pa.Tensor.from_numpy(x)
>         data_size = pa.get_tensor_size(tensor)
>         buf = client.create(object_id, data_size)
>         stream = pa.FixedSizeBufferWriter(buf)
>         stream.set_memcopy_threads(6)
>         pa.write_tensor(tensor, stream)
>         client.seal(object_id)
> #        client.release(object_id)
> #        client.disconnect()
>         print(i)
> {code}
> In addition, I have filed a separate issue about the memory eviction policy [ARROW-1795].
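
The sizes quoted in the report can be checked with a short calculation. This is only a sketch: it does arithmetic on the numbers given in the description (the -m value passed to plasma_store and the array shape), it does not talk to Plasma, and the variable names are illustrative. It also ignores the small metadata overhead that pa.get_tensor_size adds on top of the raw data.

```python
import math

# Numbers taken from the report; the names are illustrative only.
store_capacity = 8_000_000_000      # bytes, from "plasma_store -m 8000000000"
shape = (1000, 1000, 5 * 256)       # array shape used in the loop
itemsize = 4                        # bytes per float32 element

# Raw payload of one tensor, excluding tensor metadata.
tensor_bytes = math.prod(shape) * itemsize

# How many such objects can be resident in the store at once.
objects_that_fit = store_capacity // tensor_bytes

print(tensor_bytes)       # 5120000000
print(objects_that_fit)   # 1
```

Under these numbers only one such object fits at a time, so a second create can succeed only after the store has reclaimed the space held by the first.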



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
