[jira] [Updated] (ARROW-5891) [C++][Gandiva] Remove duplicates in function registries
[ https://issues.apache.org/jira/browse/ARROW-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prudhvi Porandla updated ARROW-5891:
---
Description: Each precompiled function should have at most one "NativeFunction" entry in the registry. No two function signatures can refer to the same precompiled function. Also add a unit test that checks for duplicates.

(was: Each precompiled function should have at most one "NativeFunction" entry in the registry. Also add a unit test that checks for duplicates.)

> [C++][Gandiva] Remove duplicates in function registries
> ---
>
> Key: ARROW-5891
> URL: https://issues.apache.org/jira/browse/ARROW-5891
> Project: Apache Arrow
> Issue Type: Task
> Components: C++ - Gandiva
> Reporter: Prudhvi Porandla
> Assignee: Prudhvi Porandla
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Each precompiled function should have at most one "NativeFunction" entry in
> the registry. No two function signatures can refer to the same precompiled
> function. Also add a unit test that checks for duplicates.

-- This message was sent by Atlassian JIRA (v7.6.14#76016)
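The duplicate check requested in the description above could look roughly like the following. This is a minimal standalone sketch, not Gandiva's actual registry or test code: `NativeFunctionEntry` and `FindDuplicatePrecompiledNames` are hypothetical names introduced only for illustration, and the entry is reduced to a signature string plus the name of the precompiled function it points to.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a registry entry: a signature plus the name of the
// precompiled function that implements it. Not Gandiva's real NativeFunction.
struct NativeFunctionEntry {
  std::string signature;         // e.g. "add(int32, int32)"
  std::string precompiled_name;  // e.g. "add_int32_int32"
};

// Returns every precompiled name that is referenced by more than one entry.
// Each offending name is reported once, in order of its first repetition.
std::vector<std::string> FindDuplicatePrecompiledNames(
    const std::vector<NativeFunctionEntry>& registry) {
  std::unordered_set<std::string> seen;
  std::unordered_set<std::string> reported;
  std::vector<std::string> duplicates;
  for (const auto& entry : registry) {
    const bool first_time = seen.insert(entry.precompiled_name).second;
    if (!first_time && reported.insert(entry.precompiled_name).second) {
      duplicates.push_back(entry.precompiled_name);
    }
  }
  return duplicates;
}
```

A unit test would then assert that the returned list is empty for the full registry, so that any newly introduced duplicate fails the build with the offending names in hand.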
[jira] [Commented] (ARROW-5871) [Python] Can't import pyarrow 0.14.0 due to mismatching libcrypt
[ https://issues.apache.org/jira/browse/ARROW-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884841#comment-16884841 ]

Suvayu Ali commented on ARROW-5871:
---
Hi [~wesmckinn], I was able to build arrow-cpp and pyarrow from source from the maint-0.14.x branch. I have not done any further testing, such as installing the wheel on different platforms, but the crash described above does not happen when I do a simple import.

> [Python] Can't import pyarrow 0.14.0 due to mismatching libcrypt
> ---
>
> Key: ARROW-5871
> URL: https://issues.apache.org/jira/browse/ARROW-5871
> Project: Apache Arrow
> Issue Type: Bug
> Components: Packaging
> Affects Versions: 0.14.0
> Environment: 5.1.16-300.fc30.x86_64
> Python 3.7.3
> libxcrypt-4.4.6-2.fc30.x86_64
> Reporter: Suvayu Ali
> Priority: Major
> Fix For: 1.0.0
>
> In a freshly created virtual environment, after I install pyarrow 0.14.0
> (using pip), importing pyarrow from the Python prompt leads to a crash:
> {code:java}
> $ mktmpenv
> [..]
> This is a temporary environment. It will be deleted when you run 'deactivate'.
> $ pip install pyarrow
> Collecting pyarrow
>   Using cached https://files.pythonhosted.org/packages/8f/fa/407667d763c25c3d9977e1d19038df3b4a693f37789c4fe1fe5c74a6bc55/pyarrow-0.14.0-cp37-cp37m-manylinux2010_x86_64.whl
> Collecting numpy>=1.14 (from pyarrow)
>   Using cached https://files.pythonhosted.org/packages/fc/d1/45be1144b03b6b1e24f9a924f23f66b4ad030d834ad31fb9e5581bd328af/numpy-1.16.4-cp37-cp37m-manylinux1_x86_64.whl
> Collecting six>=1.0.0 (from pyarrow)
>   Using cached https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
> Installing collected packages: numpy, six, pyarrow
> Successfully installed numpy-1.16.4 pyarrow-0.14.0 six-1.12.0
> $ python --version
> Python 3.7.3
> $ python -m pyarrow
> Traceback (most recent call last):
>   File "/usr/lib64/python3.7/runpy.py", line 183, in _run_module_as_main
>     mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
>   File "/usr/lib64/python3.7/runpy.py", line 142, in _get_module_details
>     return _get_module_details(pkg_main_name, error)
>   File "/usr/lib64/python3.7/runpy.py", line 109, in _get_module_details
>     __import__(pkg_name)
>   File "/home/user/.virtualenvs/tmp-8a4d52e7bb62853/lib/python3.7/site-packages/pyarrow/__init__.py", line 49, in
>     from pyarrow.lib import cpu_count, set_cpu_count
> ImportError: libcrypt.so.1: cannot open shared object file: No such file or directory
> {code}
> This is surprising because I have older versions of pyarrow (up to 0.13.0)
> working, and libcrypt on my system (Fedora 30, Python 3.7) is libcrypt.so.2!
[jira] [Commented] (ARROW-5853) [Python] Expose boolean filter kernel on Array
[ https://issues.apache.org/jira/browse/ARROW-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884816#comment-16884816 ]

Blake Haugen commented on ARROW-5853:
---
I would like to work on this issue to get familiar with the codebase, so my apologies if this is a dumb question. I think I have something partially working, but I am wondering what the expected behavior is if you pass something like an integer array as the filter. When I passed an integer array of 0s and 1s, I got a segfault. My C++ is pretty rusty, but from poking around in 'filter.cc' I suspect it may be related to the 'checked_pointer_cast' in 'FilterKernel::Call'. It doesn't look like there are any C++ tests with an array other than BooleanArray. Should this function check that the Array is boolean and throw a type error if it isn't, or should it be doing a different cast?

> [Python] Expose boolean filter kernel on Array
> ---
>
> Key: ARROW-5853
> URL: https://issues.apache.org/jira/browse/ARROW-5853
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
>
> Expose the filter kernel (https://issues.apache.org/jira/browse/ARROW-1558)
> on the Python Array class.
> Could be done as a {{.filter(mask)}} method and/or in {{\_\_getitem\_\_}}.
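One of the two options raised in the comment above, checking the mask's type before the downcast and returning a type error, can be sketched as follows. This is an illustrative standalone example under stated assumptions, not the code in 'filter.cc': `Array`, `BooleanArray`, `Int32Array`, `Status` and `Filter` here are simplified hypothetical stand-ins, not Arrow's real classes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for Arrow's array classes; only what the sketch needs.
enum class TypeId { kBool, kInt32 };

struct Array {
  virtual ~Array() = default;
  virtual TypeId type_id() const = 0;
};

struct BooleanArray : Array {
  explicit BooleanArray(std::vector<bool> v) : values(std::move(v)) {}
  TypeId type_id() const override { return TypeId::kBool; }
  std::vector<bool> values;
};

struct Int32Array : Array {
  explicit Int32Array(std::vector<int> v) : values(std::move(v)) {}
  TypeId type_id() const override { return TypeId::kInt32; }
  std::vector<int> values;
};

// Minimal status type: an empty message means success.
struct Status {
  std::string message;
  bool ok() const { return message.empty(); }
};

// Keeps values[i] where mask[i] is true. The mask's type is checked before the
// downcast, so a non-boolean mask yields an error instead of undefined behavior.
Status Filter(const Int32Array& values, const std::shared_ptr<Array>& mask,
              std::vector<int>* out) {
  if (mask == nullptr || mask->type_id() != TypeId::kBool) {
    return Status{"TypeError: filter mask must be a boolean array"};
  }
  const auto& bool_mask = static_cast<const BooleanArray&>(*mask);
  if (bool_mask.values.size() != values.values.size()) {
    return Status{"Invalid: mask and values must have the same length"};
  }
  out->clear();
  for (std::size_t i = 0; i < values.values.size(); ++i) {
    if (bool_mask.values[i]) out->push_back(values.values[i]);
  }
  return Status{};
}
```

With this shape, passing an integer array of 0s and 1s as the mask produces a non-OK status that a Python binding could surface as an exception, rather than reading the integer buffer as if it were a bitmap.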
[jira] [Commented] (ARROW-5929) [Java] Define API for ExtensionVector whose data must be serialized prior to being sent via IPC
[ https://issues.apache.org/jira/browse/ARROW-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884803#comment-16884803 ] Liya Fan commented on ARROW-5929: - [~wesmckinn], many thanks for opening this issue. I think this is a useful feature covering many important use cases. > [Java] Define API for ExtensionVector whose data must be serialized prior to > being sent via IPC > --- > > Key: ARROW-5929 > URL: https://issues.apache.org/jira/browse/ARROW-5929 > Project: Apache Arrow > Issue Type: Improvement > Components: Java >Reporter: Wes McKinney >Assignee: Liya Fan >Priority: Major > > As being discussed on the mailing list, a possible use case for > ExtensionVector involves having the Arrow buffers contain pointer-type values > referring to memory outside of the Arrow memory heap. In IPC, such vectors > would need to be serialized to a wholly Arrow-resident form, such as a > VarBinaryVector. We do not have an API to allow for this, so this JIRA > proposes to add new functions that can indicate to the IPC layer that an > ExtensionVector requires additional serialization to a native Arrow type (in > such case, the extension type metadata would be discarded) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (ARROW-5929) [Java] Define API for ExtensionVector whose data must be serialized prior to being sent via IPC
[ https://issues.apache.org/jira/browse/ARROW-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liya Fan reassigned ARROW-5929: --- Assignee: Liya Fan > [Java] Define API for ExtensionVector whose data must be serialized prior to > being sent via IPC > --- > > Key: ARROW-5929 > URL: https://issues.apache.org/jira/browse/ARROW-5929 > Project: Apache Arrow > Issue Type: Improvement > Components: Java >Reporter: Wes McKinney >Assignee: Liya Fan >Priority: Major > > As being discussed on the mailing list, a possible use case for > ExtensionVector involves having the Arrow buffers contain pointer-type values > referring to memory outside of the Arrow memory heap. In IPC, such vectors > would need to be serialized to a wholly Arrow-resident form, such as a > VarBinaryVector. We do not have an API to allow for this, so this JIRA > proposes to add new functions that can indicate to the IPC layer that an > ExtensionVector requires additional serialization to a native Arrow type (in > such case, the extension type metadata would be discarded) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (ARROW-5943) [GLib][Gandiva] Add support for function aliases
[ https://issues.apache.org/jira/browse/ARROW-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yosuke Shiro resolved ARROW-5943. - Resolution: Fixed Issue resolved by pull request 4874 [https://github.com/apache/arrow/pull/4874] > [GLib][Gandiva] Add support for function aliases > > > Key: ARROW-5943 > URL: https://issues.apache.org/jira/browse/ARROW-5943 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Sutou Kouhei >Assignee: Sutou Kouhei >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (ARROW-5947) [Rust] [DataFusion] Remove serde_json dependency
[ https://issues.apache.org/jira/browse/ARROW-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sutou Kouhei resolved ARROW-5947. - Resolution: Fixed Fix Version/s: 1.0.0 Issue resolved by pull request 4879 [https://github.com/apache/arrow/pull/4879] > [Rust] [DataFusion] Remove serde_json dependency > > > Key: ARROW-5947 > URL: https://issues.apache.org/jira/browse/ARROW-5947 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust, Rust - DataFusion >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > I added a dependency to serde_json early on so that I could serialize logical > query plans because I wanted a way to pass them between processes. However, > this was just a short term hack and is non-standard. I would like to remove > this now. > I am now using gRPC in another project and serializing plans that way based > on the Gandiva protobuf def. I will start a discussion on the mailing list in > the next 1-2 weeks about pushing some changes into the Arrow repo related to > this. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (ARROW-5949) [Rust] Implement DictionaryArray
David Atienza created ARROW-5949:
---
Summary: [Rust] Implement DictionaryArray
Key: ARROW-5949
URL: https://issues.apache.org/jira/browse/ARROW-5949
Project: Apache Arrow
Issue Type: New Feature
Components: Rust
Reporter: David Atienza

I am pretty new to the codebase, but I have seen that DictionaryArray is not implemented in the Rust implementation. I went through the list of issues and could not find any work on this. Is there any blocker?

The specification is a bit [short|https://arrow.apache.org/docs/format/Layout.html#dictionary-encoding] or even [non-existent|https://arrow.apache.org/docs/format/Metadata.html#dictionary-encoding], so I am not sure how to implement it myself.
[jira] [Created] (ARROW-5948) [Rust] [DataFusion] create_logical_plan should not call optimizer
Andy Grove created ARROW-5948: - Summary: [Rust] [DataFusion] create_logical_plan should not call optimizer Key: ARROW-5948 URL: https://issues.apache.org/jira/browse/ARROW-5948 Project: Apache Arrow Issue Type: Improvement Components: Rust, Rust - DataFusion Affects Versions: 0.14.0 Reporter: Andy Grove Assignee: Andy Grove Fix For: 1.0.0 ExecutionContext.create_logical_plan currently returns an optimized plan. There is already a separate method on ExecutionContext for creating an optimized plan and it would be better to have create_logical_plan return the unoptimized plan. This helps with testing and also helps for my use case where I want to pass the logical plan to another node before it gets optimized (it is not currently possible to optimize a plan twice, and this is causing me some issues) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (ARROW-5947) [Rust] [DataFusion] Remove serde_json dependency
[ https://issues.apache.org/jira/browse/ARROW-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5947: -- Labels: pull-request-available (was: ) > [Rust] [DataFusion] Remove serde_json dependency > > > Key: ARROW-5947 > URL: https://issues.apache.org/jira/browse/ARROW-5947 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust, Rust - DataFusion >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > > I added a dependency to serde_json early on so that I could serialize logical > query plans because I wanted a way to pass them between processes. However, > this was just a short term hack and is non-standard. I would like to remove > this now. > I am now using gRPC in another project and serializing plans that way based > on the Gandiva protobuf def. I will start a discussion on the mailing list in > the next 1-2 weeks about pushing some changes into the Arrow repo related to > this. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (ARROW-5947) [Rust] [DataFusion] Remove serde_json dependency
Andy Grove created ARROW-5947: - Summary: [Rust] [DataFusion] Remove serde_json dependency Key: ARROW-5947 URL: https://issues.apache.org/jira/browse/ARROW-5947 Project: Apache Arrow Issue Type: Improvement Components: Rust, Rust - DataFusion Reporter: Andy Grove Assignee: Andy Grove I added a dependency to serde_json early on so that I could serialize logical query plans because I wanted a way to pass them between processes. However, this was just a short term hack and is non-standard. I would like to remove this now. I am now using gRPC in another project and serializing plans that way based on the Gandiva protobuf def. I will start a discussion on the mailing list in the next 1-2 weeks about pushing some changes into the Arrow repo related to this. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (ARROW-5946) [Rust] [DataFusion] Projection push down with aggregate producing incorrect results
[ https://issues.apache.org/jira/browse/ARROW-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-5946:
---
Labels: pull-request-available (was: )

> [Rust] [DataFusion] Projection push down with aggregate producing incorrect results
> ---
>
> Key: ARROW-5946
> URL: https://issues.apache.org/jira/browse/ARROW-5946
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust, Rust - DataFusion
> Affects Versions: 0.14.0
> Reporter: Andy Grove
> Assignee: Andy Grove
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
>
> I was testing some queries with the 0.14 release and noticed that the
> projected schema for a table scan is completely wrong (however, the results of
> the query are not necessarily wrong).
>
> {code:java}
> // schema for nyctaxi csv files
> let schema = Schema::new(vec![
>     Field::new("VendorID", DataType::Utf8, true),
>     Field::new("tpep_pickup_datetime", DataType::Utf8, true),
>     Field::new("tpep_dropoff_datetime", DataType::Utf8, true),
>     Field::new("passenger_count", DataType::Utf8, true),
>     Field::new("trip_distance", DataType::Float64, true),
>     Field::new("RatecodeID", DataType::Utf8, true),
>     Field::new("store_and_fwd_flag", DataType::Utf8, true),
>     Field::new("PULocationID", DataType::Utf8, true),
>     Field::new("DOLocationID", DataType::Utf8, true),
>     Field::new("payment_type", DataType::Utf8, true),
>     Field::new("fare_amount", DataType::Float64, true),
>     Field::new("extra", DataType::Float64, true),
>     Field::new("mta_tax", DataType::Float64, true),
>     Field::new("tip_amount", DataType::Float64, true),
>     Field::new("tolls_amount", DataType::Float64, true),
>     Field::new("improvement_surcharge", DataType::Float64, true),
>     Field::new("total_amount", DataType::Float64, true),
> ]);
> let mut ctx = ExecutionContext::new();
> ctx.register_csv("tripdata", "file.csv", , true);
> let optimized_plan = ctx.create_logical_plan(
>     "SELECT passenger_count, MIN(fare_amount), MAX(fare_amount) \
>      FROM tripdata GROUP BY passenger_count").unwrap();
> {code}
> The projected schema in the table scan has the first two columns from the
> schema (VendorID and tpep_pickup_datetime) rather than passenger_count and
> fare_amount.
[jira] [Updated] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object
[ https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shengjun.li updated ARROW-5924:
---
Description:
cmake_modules/DefineOptions.cmake
  define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
  define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)

The correct sequence is as follows:
(1) plasma_client.Create(object_id, size, nullptr, 0, , 1); // where device_num > 0
(2) plasma_client.Seal(object_id);
(3) buff = nullptr;
(4) plasma_client.Release(object_id);
(5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), because CloseIpcBuffer is called in the destructor of class CudaBuffer.
If a user does not do that promptly, CloseIpcBuffer will be blocked.
The following error may then occur when another object is created:
IOError: Cuda Driver API call in /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with code 208: cuIpcOpenMemHandle(, *handle, CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

was:
cmake_modules/DefineOptions.cmake
  define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
  define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)

The correct sequence is as follows:
(1) plasma_client.Create(object_id, size, nullptr, 0, , 1); // where device_num > 0
(2) plasma_client.Seal(object_id);
(3) buff = nullptr;
(4) plasma_client.Release(object_id);
(5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), because CloseIpcBuffer is called in the destructor of class CudaBuffer.
If a user does not do that promptly, CloseIpcBuffer will be blocked.
The following error may then occur when another object is created:
IOError: Cuda Driver API call in /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with code 208: cuIpcOpenMemHandle(, *handle, CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
thread 1:
{
  std::shared_ptr buff;
  plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
  plasma_client1.Seal(object_id1);
  // buff is not set to nullptr here
  plasma_client1.Release(object_id1);
  plasma_client1.Delete(object_id1);
  // ... do something else, or nothing at all
} // buff is released automatically here.
thread 2:
{
  std::shared_ptr buff;
  plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
  // If the address allocated by the server is the one just released for object_id1, an error occurs!
}

> [C++][Plasma] It is not convenient to release a GPU object
> ---
>
> Key: ARROW-5924
> URL: https://issues.apache.org/jira/browse/ARROW-5924
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++ - Plasma
> Affects Versions: 0.14.0
> Reporter: shengjun.li
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.14.1
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> cmake_modules/DefineOptions.cmake
>   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
>   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)
> The correct sequence is as follows:
> (1) plasma_client.Create(object_id, size, nullptr, 0, , 1); // where device_num > 0
> (2) plasma_client.Seal(object_id);
> (3) buff = nullptr;
> (4) plasma_client.Release(object_id);
> (5) plasma_client.Delete(object_id);
> buff must be set to nullptr (step 3) just before releasing the object (step 4),
> because CloseIpcBuffer is called in the destructor of class CudaBuffer.
> If a user does not do that promptly, CloseIpcBuffer will be blocked.
> The following error may then occur when another object is created:
> IOError: Cuda Driver API call in
> /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with
> code 208: cuIpcOpenMemHandle(, *handle,
> CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
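The ordering issue described in the report above comes down to when a `std::shared_ptr`-owned buffer runs its destructor relative to the client's Release and Delete calls. The sketch below only records the order of events and touches neither CUDA nor Plasma: `FakeGpuBuffer`, `FakeClient`, `RecommendedOrder` and `DeferredOrder` are hypothetical names, and the only property taken from the report is that the IPC handle is closed in the buffer's destructor.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins: they only log the order of events.
using EventLog = std::vector<std::string>;

struct FakeGpuBuffer {
  explicit FakeGpuBuffer(EventLog* log) : log_(log) {}
  // Mirrors the report: the IPC handle is closed in the buffer's destructor.
  ~FakeGpuBuffer() { log_->push_back("close_ipc_buffer"); }
  EventLog* log_;
};

struct FakeClient {
  explicit FakeClient(EventLog* log) : log_(log) {}
  void Create(std::shared_ptr<FakeGpuBuffer>* out) {
    log_->push_back("create");
    *out = std::make_shared<FakeGpuBuffer>(log_);
  }
  void Seal() { log_->push_back("seal"); }
  void Release() { log_->push_back("release"); }
  void Delete() { log_->push_back("delete"); }
  EventLog* log_;
};

// Steps (1)-(5) from the report: the buffer is dropped before Release.
EventLog RecommendedOrder() {
  EventLog log;
  FakeClient client(&log);
  std::shared_ptr<FakeGpuBuffer> buff;
  client.Create(&buff);
  client.Seal();
  buff = nullptr;  // step (3): the destructor runs here
  client.Release();
  client.Delete();
  return log;
}

// The problematic variant: the buffer outlives Release and Delete and is only
// destroyed when its enclosing scope ends.
EventLog DeferredOrder() {
  EventLog log;
  FakeClient client(&log);
  {
    std::shared_ptr<FakeGpuBuffer> buff;
    client.Create(&buff);
    client.Seal();
    client.Release();
    client.Delete();
  }  // the destructor runs here, after the object was already deleted
  return log;
}
```

In the deferred variant the close happens after the delete, which matches the window in which, per the report, another client can be handed the same device address while the old IPC handle is still open.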
[jira] [Updated] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object
[ https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shengjun.li updated ARROW-5924:
---
Description:
cmake_modules/DefineOptions.cmake
 define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
 define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1); // where device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before the object is released (step 4), because CloseIpcBuffer is called in the destructor of class CudaBuffer.
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then the following error may occur when another object is created.
 IOError: Cuda Driver API call in /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with code 208: cuIpcOpenMemHandle(, *handle, CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.

thread 1:
 {
 std::shared_ptr buff;
 plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
 plasma_client1.Seal(object_id1);
 // buff is not set to nullptr here
 plasma_client1.Release(object_id1);
 plasma_client1.Delete(object_id1);
 // ... do something else, or nothing at all
 }
 // buff is released automatically here.

thread 2:
 {
 std::shared_ptr buff;
 plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
 // If the address allocated by the server is the one just released for object_id1, the error occurs.
 }

was:
cmake_modules/DefineOptions.cmake
 define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA toolkit)" ON)
 define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1); // where device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before the object is released (step 4), because CloseIpcBuffer is called in the destructor of class CudaBuffer.
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then the following error may occur when another object is created:
 IOError: Cuda Driver API call in /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with code 208: cuIpcOpenMemHandle(, *handle, CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

To prevent the risk, we can call CloseIpcBuffer manually.

> [C++][Plasma] It is not convenient to release a GPU object
> ---
>
> Key: ARROW-5924
> URL: https://issues.apache.org/jira/browse/ARROW-5924
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++ - Plasma
> Affects Versions: 0.14.0
> Reporter: shengjun.li
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.14.1
> Time Spent: 2h 10m
> Remaining Estimate: 0h
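The ordering concern described in ARROW-5924 can be illustrated with a small, self-contained sketch. Everything in it is a hypothetical stand-in: FakeCudaBuffer, FakePlasmaClient, EventLog and RunSequence are not part of Plasma or Arrow, and no CUDA call is made. The sketch only assumes what the issue states, namely that the buffer's destructor is where CloseIpcBuffer runs, and records the order of events with and without step (3).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the real
// Plasma client or Arrow CudaBuffer classes.
using EventLog = std::vector<std::string>;

// Mimics a buffer whose destructor closes the IPC memory handle,
// as the issue describes for class CudaBuffer.
class FakeCudaBuffer {
 public:
  explicit FakeCudaBuffer(EventLog* log) : log_(log) {}
  ~FakeCudaBuffer() { log_->push_back("CloseIpcBuffer"); }

 private:
  EventLog* log_;
};

// Mimics the client calls named in the issue; each call only records itself.
class FakePlasmaClient {
 public:
  explicit FakePlasmaClient(EventLog* log) : log_(log) {}

  std::shared_ptr<FakeCudaBuffer> Create() {
    log_->push_back("Create");
    return std::make_shared<FakeCudaBuffer>(log_);
  }
  void Seal() { log_->push_back("Seal"); }
  void Release() { log_->push_back("Release"); }
  void Delete() { log_->push_back("Delete"); }

 private:
  EventLog* log_;
};

// Runs steps (1)-(5). When reset_before_release is false, step (3) is
// skipped and the buffer is destroyed only when the inner scope ends.
EventLog RunSequence(bool reset_before_release) {
  EventLog log;
  {
    FakePlasmaClient client(&log);
    std::shared_ptr<FakeCudaBuffer> buff = client.Create();  // step 1
    client.Seal();                                           // step 2
    if (reset_before_release) {
      buff = nullptr;                                        // step 3
    }
    client.Release();                                        // step 4
    client.Delete();                                         // step 5
  }  // without step 3, the buffer is destroyed here, after Delete
  return log;
}
```

Calling RunSequence(true) records CloseIpcBuffer before Release, while RunSequence(false) records it only after Delete. The second ordering corresponds to the window in which, according to the report, another client may be handed the same address.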
[jira] [Updated] (ARROW-5944) [C++][Gandiva] Remove 'div' alias for 'divide'
[ https://issues.apache.org/jira/browse/ARROW-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pindikura Ravindra updated ARROW-5944:
---
Component/s: C++ - Gandiva

> [C++][Gandiva] Remove 'div' alias for 'divide'
> ---
>
> Key: ARROW-5944
> URL: https://issues.apache.org/jira/browse/ARROW-5944
> Project: Apache Arrow
> Issue Type: Task
> Components: C++ - Gandiva
> Reporter: Prudhvi Porandla
> Assignee: Prudhvi Porandla
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.14.1
> Time Spent: 20m
> Remaining Estimate: 0h
>
> div and divide are two different operators.
[jira] [Resolved] (ARROW-5944) [C++][Gandiva] Remove 'div' alias for 'divide'
[ https://issues.apache.org/jira/browse/ARROW-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pindikura Ravindra resolved ARROW-5944.
---
Resolution: Fixed
Fix Version/s: 0.14.1

Issue resolved by pull request 4876
[https://github.com/apache/arrow/pull/4876]

> [C++][Gandiva] Remove 'div' alias for 'divide'
> ---
>
> Key: ARROW-5944
> URL: https://issues.apache.org/jira/browse/ARROW-5944
> Project: Apache Arrow
> Issue Type: Task
> Reporter: Prudhvi Porandla
> Assignee: Prudhvi Porandla
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.14.1
> Time Spent: 10m
> Remaining Estimate: 0h
>
> div and divide are two different operators.