[jira] [Updated] (ARROW-5891) [C++][Gandiva] Remove duplicates in function registries

2019-07-14 Thread Prudhvi Porandla (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prudhvi Porandla updated ARROW-5891:

Description: Each precompiled function should have at most one 
"NativeFunction" entry in the registry. No two function signatures can refer 
to the same precompiled function. Also add a UnitTest which checks if there are 
duplicates  (was: Each precompiled function should have at most one 
"NativeFunction" entry in the registry. Also add a UnitTest which checks if 
there are duplicates)

> [C++][Gandiva] Remove duplicates in function registries
> ---
>
> Key: ARROW-5891
> URL: https://issues.apache.org/jira/browse/ARROW-5891
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++ - Gandiva
>Reporter: Prudhvi Porandla
>Assignee: Prudhvi Porandla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Each precompiled function should have at most one "NativeFunction" entry in 
> the registry. No two function signatures can refer to the same precompiled 
> function. Also add a UnitTest which checks if there are duplicates
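
A minimal sketch of the kind of duplicate check described above, written in Python for brevity rather than as Gandiva's actual C++ unit test. The registry layout (pairs of function signature and precompiled function name) and every name below are assumptions made for illustration only.

```python
from collections import defaultdict


def find_duplicate_precompiled(entries):
    """Map each precompiled name used by more than one entry to its signatures."""
    by_name = defaultdict(list)
    for signature, precompiled_name in entries:
        by_name[precompiled_name].append(signature)
    return {name: sigs for name, sigs in by_name.items() if len(sigs) > 1}


# Hypothetical registry: the third entry reuses a precompiled function on purpose.
registry = [
    ("add(int32, int32) -> int32", "add_int32_int32"),
    ("add(int64, int64) -> int64", "add_int64_int64"),
    ("plus(int32, int32) -> int32", "add_int32_int32"),
]

duplicates = find_duplicate_precompiled(registry)
```

A unit test in this style would simply assert that the returned mapping is empty for the real registry.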



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5871) [Python] Can't import pyarrow 0.14.0 due to mismatching libcrypt

2019-07-14 Thread Suvayu Ali (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884841#comment-16884841
 ] 

Suvayu Ali commented on ARROW-5871:
---

Hi [~wesmckinn], I was able to build arrow-cpp and pyarrow from source from the 
maint-0.14.x branch.  Although I have not done any testing, like installing the 
wheel on different platforms, the above crash does not happen when I do a 
simple import.

> [Python] Can't import pyarrow 0.14.0 due to mismatching libcrypt
> 
>
> Key: ARROW-5871
> URL: https://issues.apache.org/jira/browse/ARROW-5871
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging
>Affects Versions: 0.14.0
> Environment: 5.1.16-300.fc30.x86_64
> Python 3.7.3
> libxcrypt-4.4.6-2.fc30.x86_64
>Reporter: Suvayu Ali
>Priority: Major
> Fix For: 1.0.0
>
>
> In a freshly created virtual environment, after I install pyarrow 0.14.0 
> (using pip), importing pyarrow from the python prompt leads to a crash:
> {code:java}
> $ mktmpenv
> [..]
> This is a temporary environment. It will be deleted when you run 'deactivate'.
> $ pip install pyarrow
> Collecting pyarrow
> Using cached 
> https://files.pythonhosted.org/packages/8f/fa/407667d763c25c3d9977e1d19038df3b4a693f37789c4fe1fe5c74a6bc55/pyarrow-0.14.0-cp37-cp37m-manylinux2010_x86_64.whl
> Collecting numpy>=1.14 (from pyarrow)
> Using cached 
> https://files.pythonhosted.org/packages/fc/d1/45be1144b03b6b1e24f9a924f23f66b4ad030d834ad31fb9e5581bd328af/numpy-1.16.4-cp37-cp37m-manylinux1_x86_64.whl
> Collecting six>=1.0.0 (from pyarrow)
> Using cached 
> https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
> Installing collected packages: numpy, six, pyarrow
> Successfully installed numpy-1.16.4 pyarrow-0.14.0 six-1.12.0
> $ python --version
> Python 3.7.3
> $ python -m pyarrow
> Traceback (most recent call last):
> File "/usr/lib64/python3.7/runpy.py", line 183, in _run_module_as_main
> mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
> File "/usr/lib64/python3.7/runpy.py", line 142, in _get_module_details
> return _get_module_details(pkg_main_name, error)
> File "/usr/lib64/python3.7/runpy.py", line 109, in _get_module_details
> __import__(pkg_name)
> File 
> "/home/user/.virtualenvs/tmp-8a4d52e7bb62853/lib/python3.7/site-packages/pyarrow/__init__.py",
>  line 49, in <module>
> from pyarrow.lib import cpu_count, set_cpu_count
> ImportError: libcrypt.so.1: cannot open shared object file: No such file or 
> directory{code}
> This is surprising because I have older versions of pyarrow (up to 0.13.0) 
> working, and libcrypt on my system (Fedora 30, Python 3.7) is libcrypt.so.2!
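
One quick way to see which libcrypt the system exposes is to ask Python's standard `ctypes.util.find_library`. This is only a diagnostic sketch, not a fix: the helper name `find_libcrypt` is made up here, and the result depends entirely on the machine it runs on (it may be `None` if nothing is found).

```python
import ctypes.util


def find_libcrypt():
    """Return the library name resolved for 'crypt', or None if not found."""
    return ctypes.util.find_library("crypt")


resolved = find_libcrypt()
print("libcrypt resolved to:", resolved)
```

If the printed name differs from the `libcrypt.so.1` that the wheel asks for, that would be consistent with the ImportError shown above.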





[jira] [Commented] (ARROW-5853) [Python] Expose boolean filter kernel on Array

2019-07-14 Thread Blake Haugen (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884816#comment-16884816
 ] 

Blake Haugen commented on ARROW-5853:
-

I would like to work on this issue to get familiar with the codebase, so my 
apologies if this is a dumb question. I think I have something partially 
working, but I am wondering what the expected behavior is if you pass something 
like an integer array as the filter.

When I pass it an integer array of 0s and 1s, I get a segfault. My C++ is 
pretty rusty, but from poking around in 'filter.cc' I suspect it may be 
related to the 'checked_pointer_cast' in 'FilterKernel::Call'.

It doesn't look like there are any C++ tests with an array other than 
BooleanArray. Should this function check that the Array is boolean and throw a 
type error if it isn't, or should this be doing a different cast?
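
The first option in the question (reject a non-boolean mask with a type error instead of casting blindly) can be sketched in plain Python. This is not the pyarrow or Arrow C++ API: `filter_values` and its list-based inputs are hypothetical stand-ins for a kernel that validates the mask type before using it.

```python
def filter_values(values, mask):
    """Keep values[i] where mask[i] is True; reject non-boolean masks."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    for item in mask:
        # Checked explicitly: the integers 0 and 1 must not pass as a mask.
        if not isinstance(item, bool):
            raise TypeError("mask must be boolean, got %s" % type(item).__name__)
    return [value for value, keep in zip(values, mask) if keep]


kept = filter_values([10, 20, 30], [True, False, True])

try:
    filter_values([10, 20, 30], [1, 0, 1])
    rejected = False
except TypeError:
    rejected = True
```

With a check like this, an integer mask fails with a clear error rather than reaching the cast at all.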

> [Python] Expose boolean filter kernel on Array
> --
>
> Key: ARROW-5853
> URL: https://issues.apache.org/jira/browse/ARROW-5853
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Joris Van den Bossche
>Priority: Major
>
> Expose the filter kernel (https://issues.apache.org/jira/browse/ARROW-1558) 
> on the python Array class.
> Could be done as {{.filter(mask)}} method and/or in {{\_\_getitem\_\_}}.





[jira] [Commented] (ARROW-5929) [Java] Define API for ExtensionVector whose data must be serialized prior to being sent via IPC

2019-07-14 Thread Liya Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884803#comment-16884803
 ] 

Liya Fan commented on ARROW-5929:
-

[~wesmckinn], many thanks for opening this issue.

I think this is a useful feature covering many important use cases.

> [Java] Define API for ExtensionVector whose data must be serialized prior to 
> being sent via IPC
> ---
>
> Key: ARROW-5929
> URL: https://issues.apache.org/jira/browse/ARROW-5929
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java
>Reporter: Wes McKinney
>Assignee: Liya Fan
>Priority: Major
>
> As is being discussed on the mailing list, a possible use case for 
> ExtensionVector involves having the Arrow buffers contain pointer-type values 
> referring to memory outside of the Arrow memory heap. In IPC, such vectors 
> would need to be serialized to a wholly Arrow-resident form, such as a 
> VarBinaryVector. We do not have an API to allow for this, so this JIRA 
> proposes to add new functions that can indicate to the IPC layer that an 
> ExtensionVector requires additional serialization to a native Arrow type (in 
> which case the extension type metadata would be discarded).





[jira] [Assigned] (ARROW-5929) [Java] Define API for ExtensionVector whose data must be serialized prior to being sent via IPC

2019-07-14 Thread Liya Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liya Fan reassigned ARROW-5929:
---

Assignee: Liya Fan

> [Java] Define API for ExtensionVector whose data must be serialized prior to 
> being sent via IPC
> ---
>
> Key: ARROW-5929
> URL: https://issues.apache.org/jira/browse/ARROW-5929
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java
>Reporter: Wes McKinney
>Assignee: Liya Fan
>Priority: Major
>
> As is being discussed on the mailing list, a possible use case for 
> ExtensionVector involves having the Arrow buffers contain pointer-type values 
> referring to memory outside of the Arrow memory heap. In IPC, such vectors 
> would need to be serialized to a wholly Arrow-resident form, such as a 
> VarBinaryVector. We do not have an API to allow for this, so this JIRA 
> proposes to add new functions that can indicate to the IPC layer that an 
> ExtensionVector requires additional serialization to a native Arrow type (in 
> which case the extension type metadata would be discarded).





[jira] [Resolved] (ARROW-5943) [GLib][Gandiva] Add support for function aliases

2019-07-14 Thread Yosuke Shiro (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yosuke Shiro resolved ARROW-5943.
-
Resolution: Fixed

Issue resolved by pull request 4874
[https://github.com/apache/arrow/pull/4874]

> [GLib][Gandiva] Add support for function aliases
> 
>
> Key: ARROW-5943
> URL: https://issues.apache.org/jira/browse/ARROW-5943
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Sutou Kouhei
>Assignee: Sutou Kouhei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-5947) [Rust] [DataFusion] Remove serde_json dependency

2019-07-14 Thread Sutou Kouhei (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sutou Kouhei resolved ARROW-5947.
-
   Resolution: Fixed
Fix Version/s: 1.0.0

Issue resolved by pull request 4879
[https://github.com/apache/arrow/pull/4879]

> [Rust] [DataFusion] Remove serde_json dependency
> 
>
> Key: ARROW-5947
> URL: https://issues.apache.org/jira/browse/ARROW-5947
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust, Rust - DataFusion
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I added a dependency to serde_json early on so that I could serialize logical 
> query plans because I wanted a way to pass them between processes. However, 
> this was just a short term hack and is non-standard. I would like to remove 
> this now.
> I am now using gRPC in another project and serializing plans that way based 
> on the Gandiva protobuf def. I will start a discussion on the mailing list in 
> the next 1-2 weeks about pushing some changes into the Arrow repo related to 
> this.





[jira] [Created] (ARROW-5949) [Rust] Implement DictionaryArray

2019-07-14 Thread David Atienza (JIRA)
David Atienza created ARROW-5949:


 Summary: [Rust] Implement DictionaryArray
 Key: ARROW-5949
 URL: https://issues.apache.org/jira/browse/ARROW-5949
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Rust
Reporter: David Atienza


I am pretty new to the codebase, but I have seen that DictionaryArray is not 
implemented in the Rust implementation.

I went through the list of issues and I could not see any work on this. Is 
there any blocker?

 

The specification is a bit 
[short|https://arrow.apache.org/docs/format/Layout.html#dictionary-encoding] or 
even 
[non-existent|https://arrow.apache.org/docs/format/Metadata.html#dictionary-encoding],
 so I am not sure how to implement it myself.





[jira] [Created] (ARROW-5948) [Rust] [DataFusion] create_logical_plan should not call optimizer

2019-07-14 Thread Andy Grove (JIRA)
Andy Grove created ARROW-5948:
-

 Summary: [Rust] [DataFusion] create_logical_plan should not call 
optimizer
 Key: ARROW-5948
 URL: https://issues.apache.org/jira/browse/ARROW-5948
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust, Rust - DataFusion
Affects Versions: 0.14.0
Reporter: Andy Grove
Assignee: Andy Grove
 Fix For: 1.0.0


ExecutionContext.create_logical_plan currently returns an optimized plan.

There is already a separate method on ExecutionContext for creating an 
optimized plan and it would be better to have create_logical_plan return the 
unoptimized plan. This helps with testing and also helps for my use case where 
I want to pass the logical plan to another node before it gets optimized (it is 
not currently possible to optimize a plan twice, and this is causing me some 
issues)





[jira] [Updated] (ARROW-5947) [Rust] [DataFusion] Remove serde_json dependency

2019-07-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5947:
--
Labels: pull-request-available  (was: )

> [Rust] [DataFusion] Remove serde_json dependency
> 
>
> Key: ARROW-5947
> URL: https://issues.apache.org/jira/browse/ARROW-5947
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust, Rust - DataFusion
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
>
> I added a dependency to serde_json early on so that I could serialize logical 
> query plans because I wanted a way to pass them between processes. However, 
> this was just a short term hack and is non-standard. I would like to remove 
> this now.
> I am now using gRPC in another project and serializing plans that way based 
> on the Gandiva protobuf def. I will start a discussion on the mailing list in 
> the next 1-2 weeks about pushing some changes into the Arrow repo related to 
> this.





[jira] [Created] (ARROW-5947) [Rust] [DataFusion] Remove serde_json dependency

2019-07-14 Thread Andy Grove (JIRA)
Andy Grove created ARROW-5947:
-

 Summary: [Rust] [DataFusion] Remove serde_json dependency
 Key: ARROW-5947
 URL: https://issues.apache.org/jira/browse/ARROW-5947
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust, Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove


I added a dependency to serde_json early on so that I could serialize logical 
query plans because I wanted a way to pass them between processes. However, 
this was just a short term hack and is non-standard. I would like to remove 
this now.

I am now using gRPC in another project and serializing plans that way based on 
the Gandiva protobuf def. I will start a discussion on the mailing list in the 
next 1-2 weeks about pushing some changes into the Arrow repo related to this.





[jira] [Updated] (ARROW-5946) [Rust] [DataFusion] Projection push down with aggregate producing incorrect results

2019-07-14 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-5946:
--
Labels: pull-request-available  (was: )

> [Rust] [DataFusion] Projection push down with aggregate producing incorrect 
> results
> ---
>
> Key: ARROW-5946
> URL: https://issues.apache.org/jira/browse/ARROW-5946
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust, Rust - DataFusion
>Affects Versions: 0.14.0
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> I was testing some queries with the 0.14 release and noticed that the 
> projected schema for a table scan is completely wrong (however the results of 
> the query are not necessarily wrong)
>  
> {code:java}
> // schema for nyctaxi csv files
> let schema = Schema::new(vec![
> Field::new("VendorID", DataType::Utf8, true),
> Field::new("tpep_pickup_datetime", DataType::Utf8, true),
> Field::new("tpep_dropoff_datetime", DataType::Utf8, true),
> Field::new("passenger_count", DataType::Utf8, true),
> Field::new("trip_distance", DataType::Float64, true),
> Field::new("RatecodeID", DataType::Utf8, true),
> Field::new("store_and_fwd_flag", DataType::Utf8, true),
> Field::new("PULocationID", DataType::Utf8, true),
> Field::new("DOLocationID", DataType::Utf8, true),
> Field::new("payment_type", DataType::Utf8, true),
> Field::new("fare_amount", DataType::Float64, true),
> Field::new("extra", DataType::Float64, true),
> Field::new("mta_tax", DataType::Float64, true),
> Field::new("tip_amount", DataType::Float64, true),
> Field::new("tolls_amount", DataType::Float64, true),
> Field::new("improvement_surcharge", DataType::Float64, true),
> Field::new("total_amount", DataType::Float64, true),
> ]);
> let mut ctx = ExecutionContext::new();
> ctx.register_csv("tripdata", "file.csv", , true);
> let optimized_plan = ctx.create_logical_plan(
> "SELECT passenger_count, MIN(fare_amount), MAX(fare_amount) \
> FROM tripdata GROUP BY passenger_count").unwrap();{code}
>  The projected schema in the table scan has the first two columns from the 
> schema (VendorID and tpep_pickup_datetime) rather than passenger_count and 
> fare_amount.
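
To make the expected projection concrete, here is a small Python sketch that looks up the schema positions of the columns the query references. It is not DataFusion's optimizer code; `projection_indices` is a hypothetical helper, and only the column names are taken from the schema in the report.

```python
SCHEMA = [
    "VendorID", "tpep_pickup_datetime", "tpep_dropoff_datetime",
    "passenger_count", "trip_distance", "RatecodeID", "store_and_fwd_flag",
    "PULocationID", "DOLocationID", "payment_type", "fare_amount", "extra",
    "mta_tax", "tip_amount", "tolls_amount", "improvement_surcharge",
    "total_amount",
]


def projection_indices(schema, referenced_columns):
    """Return the sorted schema indices of the columns a query references."""
    return sorted({schema.index(name) for name in referenced_columns})


# Columns referenced by: SELECT passenger_count, MIN(fare_amount), MAX(fare_amount)
indices = projection_indices(
    SCHEMA, ["passenger_count", "fare_amount", "fare_amount"]
)
projected = [SCHEMA[i] for i in indices]
```

Here `indices` comes out as positions 3 and 10, whereas the behavior described in the report corresponds to positions 0 and 1.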





[jira] [Updated] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object

2019-07-14 Thread shengjun.li (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shengjun.li updated ARROW-5924:
---
Description: 
cmake_modules/DefineOptions.cmake
   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), 
because CloseIpcBuffer is called in its destructor (class CudaBuffer).
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then, the following error may occur when another object is created:
     IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)


  was:
cmake_modules/DefineOptions.cmake
   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), 
because CloseIpcBuffer is called in its destructor (class CudaBuffer).
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then, the following error may occur when another object is created:
     IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
thread 1:
{
std::shared_ptr buff;
plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
plasma_client1.Seal(object_id1);
// buff is not set to nullptr here
plasma_client1.Release(object_id1);
plasma_client1.Delete(object_id1);
// ... do something else, or nothing at all
}
// buff is released automatically here.

thread 2:
{
std::shared_ptr buff;
plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
// If the server allocates the address just released for object_id1, an error 
// occurs!
}


> [C++][Plasma] It is not convenient to release a GPU object
> --
>
> Key: ARROW-5924
> URL: https://issues.apache.org/jira/browse/ARROW-5924
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++ - Plasma
>Affects Versions: 0.14.0
>Reporter: shengjun.li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> cmake_modules/DefineOptions.cmake
>    define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
> toolkit)" ON)
>    define_option(ARROW_PLASMA "Build the plasma object store along with 
> Arrow" ON)
> The correct sequence is as follows:
>  (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
> device_num > 0
>  (2) plasma_client.Seal(object_id);
>  (3) buff = nullptr;
>  (4) plasma_client.Release(object_id);
>  (5) plasma_client.Delete(object_id);
> buff must be set to nullptr (step 3) just before releasing the object (step 
> 4), because CloseIpcBuffer is called in its destructor (class CudaBuffer).
> If a user does not do that promptly, CloseIpcBuffer will be blocked.
> Then, the following error may occur when another object is created:
>      IOError: Cuda Driver API call in 
> /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
> code 208: cuIpcOpenMemHandle(, *handle, 
> CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
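
The ordering problem described above can be illustrated with a toy model in Python. Nothing here is Plasma or CUDA code: `FakeGpuStore`, `FakeBuffer`, and `run` are hypothetical, and the model only assumes what the description states, namely that the server may hand out a just-deleted address again while the client still holds an open mapping for it.

```python
class FakeGpuStore:
    """Toy stand-in for a store that hands out device addresses."""

    def __init__(self):
        self.open_handles = set()       # addresses the client still has mapped
        self.free_addresses = [0x1000]  # a single reusable address

    def create(self):
        address = self.free_addresses.pop()
        if address in self.open_handles:
            # Mirrors the failure in the report: opening an address that is
            # still mapped on the client side is an error.
            self.free_addresses.append(address)
            raise IOError("address %#x is still mapped" % address)
        self.open_handles.add(address)
        return FakeBuffer(self, address)

    def delete(self, address):
        # The server recycles the address regardless of client-side mappings.
        self.free_addresses.append(address)


class FakeBuffer:
    def __init__(self, store, address):
        self.store = store
        self.address = address

    def close(self):
        # Plays the role of the close that happens when the buffer is destroyed.
        self.store.open_handles.discard(self.address)


def run(close_before_delete):
    store = FakeGpuStore()
    first = store.create()
    if close_before_delete:
        first.close()  # corresponds to step 3, "buff = nullptr"
    store.delete(first.address)
    try:
        store.create()  # a second object reusing the same address
        return "ok"
    except IOError:
        return "error"


results = (run(True), run(False))
```

In this model, closing the buffer before the delete lets the second create succeed, while skipping it makes the second create fail.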





[jira] [Updated] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object

2019-07-14 Thread shengjun.li (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shengjun.li updated ARROW-5924:
---
Description: 
cmake_modules/DefineOptions.cmake
   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), 
because CloseIpcBuffer is called in its destructor (class CudaBuffer).
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then, the following error may occur when another object is created:
     IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
thread 1:
{
std::shared_ptr buff;
plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
plasma_client1.Seal(object_id1);
// buff is not set to nullptr here
plasma_client1.Release(object_id1);
plasma_client1.Delete(object_id1);
// ... do something else, or nothing at all
}
// buff is released automatically here.

thread 2:
{
std::shared_ptr buff;
plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
// If the server allocates the address just released for object_id1, an error 
// occurs!
}

  was:
cmake_modules/DefineOptions.cmake
   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), 
because CloseIpcBuffer is called in its destructor (class CudaBuffer).
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then, the following error may occur when another object is created:
     IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
thread 1:
{
  std::shared_ptr buff;
  plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
  plasma_client1.Seal(object_id1);
  // buff is not set to nullptr here
  plasma_client1.Release(object_id1);
  plasma_client1.Delete(object_id1);
  // ... do something else, or nothing at all
}
// buff is released automatically here.

thread 2:
{
  std::shared_ptr buff;
  plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
  // If the server allocates the address just released for object_id1, an error 
  // occurs!
}


> [C++][Plasma] It is not convenient to release a GPU object
> --
>
> Key: ARROW-5924
> URL: https://issues.apache.org/jira/browse/ARROW-5924
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++ - Plasma
>Affects Versions: 0.14.0
>Reporter: shengjun.li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> cmake_modules/DefineOptions.cmake
>    define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
> toolkit)" ON)
>    define_option(ARROW_PLASMA "Build the plasma object store along with 
> Arrow" ON)
> The correct sequence is as follows:
>  (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
> device_num > 0
>  (2) plasma_client.Seal(object_id);
>  (3) buff = nullptr;
>  (4) plasma_client.Release(object_id);
>  (5) plasma_client.Delete(object_id);
> buff must be set to nullptr (step 3) just before releasing the object (step 
> 4), because CloseIpcBuffer is called in its destructor (class CudaBuffer).
> If a user does not do that promptly, CloseIpcBuffer will be blocked.
> Then, the following error may occur when another object is created:
>      IOError: Cuda Driver API call in 
> /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
> code 208: cuIpcOpenMemHandle(, *handle, 
> CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
> Here is a sample.
> thread 1:
> {
> std::shared_ptr buff;
> plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
> plasma_client1.Seal(object_id1);
> // buff is not set to nullptr here
> plasma_client1.Release(object_id1);
> plasma_client1.Delete(object_id1);
> // ... do something else, or nothing at all
> }
> // buff is released automatically here.

[jira] [Updated] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object

2019-07-14 Thread shengjun.li (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shengjun.li updated ARROW-5924:
---
Description: 
cmake_modules/DefineOptions.cmake
   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), 
because CloseIpcBuffer is called in its destructor (class CudaBuffer).
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then, the following error may occur when another object is created:
     IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
thread 1:
{
  std::shared_ptr buff;
  plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
  plasma_client1.Seal(object_id1);
  // buff is not set to nullptr here
  plasma_client1.Release(object_id1);
  plasma_client1.Delete(object_id1);
  // ... do something else, or nothing at all
}
// buff is released automatically here.

thread 2:
{
  std::shared_ptr buff;
  plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
  // If the server allocates the address just released for object_id1, an error 
  // occurs!
}

  was:
cmake_modules/DefineOptions.cmake
   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), 
because CloseIpcBuffer is called in its destructor (class CudaBuffer).
 If a user does not do that promptly, CloseIpcBuffer will be blocked.
 Then, the following error may occur when another object is created:
     IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
 thread 1:

{std::shared_ptr buff;   plasma_client1.Create(object_id1, size, nullptr, 0, 
, 1);   plasma_client1.Seal(object_id1);   // buff is not set to nullptr here   
plasma_client1.Release(object_id1);   plasma_client1.Delete(object_id1);   // ... 
do something else, or nothing at all }

// buff is released automatically here.

thread 2:

{   std::shared_ptr buff;   plasma_client2.Create(object_id2, size, nullptr, 0, 
, 1);   // If the server allocates the address just released for object_id1, an 
error occurs! }

 


> [C++][Plasma] It is not convenient to release a GPU object
> --
>
> Key: ARROW-5924
> URL: https://issues.apache.org/jira/browse/ARROW-5924
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++ - Plasma
>Affects Versions: 0.14.0
>Reporter: shengjun.li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> cmake_modules/DefineOptions.cmake
>    define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
> toolkit)" ON)
>    define_option(ARROW_PLASMA "Build the plasma object store along with 
> Arrow" ON)
> The correct sequence is as follows:
>  (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
> device_num > 0
>  (2) plasma_client.Seal(object_id);
>  (3) buff = nullptr;
>  (4) plasma_client.Release(object_id);
>  (5) plasma_client.Delete(object_id);
> buff must be set to nullptr (step 3) just before releasing the object (step 
> 4), because CloseIpcBuffer is called in its destructor (class CudaBuffer).
> If a user does not do that promptly, CloseIpcBuffer will be blocked.
> Then, the following error may occur when another object is created:
>      IOError: Cuda Driver API call in 
> /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
> code 208: cuIpcOpenMemHandle(, *handle, 
> CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
> Here is a sample.
> thread 1:
> {
>   std::shared_ptr buff;
>   plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
>   plasma_client1.Seal(object_id1);
>   // buff is not set to nullptr here
>   plasma_client1.Release(object_id1);
>   plasma_client1.Delete(object_id1);
>   // ... do something else or not to do 

[jira] [Updated] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object

2019-07-14 Thread shengjun.li (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shengjun.li updated ARROW-5924:
---
Description: 
cmake_modules/DefineOptions.cmake
  define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
  define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
(1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
(2) plasma_client.Seal(object_id);
(3) buff = nullptr;
(4) plasma_client.Release(object_id);
(5) plasma_client.Delete(object_id);

buff must be set to nullptr (step 3) just before releasing the object (step 4), 
because CloseIpcBuffer is called in its destructor (class CudaBuffer).
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then, the following error may occur when another object is created:
    IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
thread 1:
{
  std::shared_ptr buff;
  plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
  plasma_client1.Seal(object_id1);
  // buff is not set to nullptr here
  plasma_client1.Release(object_id1);
  plasma_client1.Delete(object_id1);
  // ... do something else, or nothing at all
}
// buff is released automatically here.

thread 2:
{
  std::shared_ptr buff;
  plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
  // If the address allocated by the server is the one just released for
  // object_id1, the error occurs!
}
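
The ordering issue above can be modelled without Plasma or CUDA. The following is only a minimal sketch under stated assumptions: FakeIpcTable, FakeGpuBuffer, CreateBuffer and SecondCreateSucceeds are hypothetical names invented for illustration and are not part of Arrow. The table stands in for the per-process IPC mappings, and the buffer closes its mapping in its destructor, as the report says CudaBuffer does through CloseIpcBuffer.

```cpp
#include <cassert>
#include <memory>
#include <set>

// Hypothetical stand-in for the set of IPC mappings a process holds open.
// Not Arrow or CUDA code.
class FakeIpcTable {
 public:
  // Returns false if the handle is already mapped, loosely mimicking the
  // failed cuIpcOpenMemHandle call quoted in the report.
  bool Open(int handle) { return open_.insert(handle).second; }
  void Close(int handle) { open_.erase(handle); }

 private:
  std::set<int> open_;
};

// Hypothetical buffer that closes its mapping only when it is destroyed.
class FakeGpuBuffer {
 public:
  FakeGpuBuffer(FakeIpcTable* table, int handle)
      : table_(table), handle_(handle) {}
  ~FakeGpuBuffer() { table_->Close(handle_); }

 private:
  FakeIpcTable* table_;
  int handle_;
};

// Maps `handle` and wraps it in a buffer, or returns nullptr if the handle
// is still mapped by an earlier buffer.
std::shared_ptr<FakeGpuBuffer> CreateBuffer(FakeIpcTable* table, int handle) {
  if (!table->Open(handle)) return nullptr;
  return std::make_shared<FakeGpuBuffer>(table, handle);
}

// Returns true if a second create on the same (reused) handle succeeds.
bool SecondCreateSucceeds(bool reset_before_release) {
  FakeIpcTable table;
  const int handle = 42;  // models an address the store hands out again

  // Steps 1-2: create (and seal) the first object.
  std::shared_ptr<FakeGpuBuffer> buff = CreateBuffer(&table, handle);
  assert(buff != nullptr);

  if (reset_before_release) {
    buff = nullptr;  // step 3: destructor runs, mapping is closed
  }
  // Steps 4-5 (Release/Delete) would go here; the store may now reuse
  // the same address for the next object.

  std::shared_ptr<FakeGpuBuffer> next = CreateBuffer(&table, handle);
  return next != nullptr;
}
```

In this model, SecondCreateSucceeds(true) follows the five-step sequence and the second create succeeds, while SecondCreateSucceeds(false) keeps buff alive across the release and the second create fails, which corresponds to the thread 1 / thread 2 sample.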

 

  was:
cmake_modules/DefineOptions.cmake
  define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
  define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)


The correct sequence is as follows:
(1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
(2) plasma_client.Seal(object_id);
(3) buff = nullptr;
(4) plasma_client.Release(object_id);
(5) plasma_client.Delete(object_id);


Set buff to nullptr (step 3) just before releasing the object (step 4), because 
CloseIpcBuffer is called in its destructor (class CudaBuffer).
If a user does not do that promptly, CloseIpcBuffer will be blocked. Then, the 
following error may occur when another object is created:
    IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)


To prevent the risk, we can call CloseIpcBuffer manually.


> [C++][Plasma] It is not convenient to release a GPU object
> --
>
> Key: ARROW-5924
> URL: https://issues.apache.org/jira/browse/ARROW-5924
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++ - Plasma
>Affects Versions: 0.14.0
>Reporter: shengjun.li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> cmake_modules/DefineOptions.cmake
>   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
> toolkit)" ON)
>   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
> ON)
> The correct sequence is as follows:
> (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
> device_num > 0
> (2) plasma_client.Seal(object_id);
> (3) buff = nullptr;
> (4) plasma_client.Release(object_id);
> (5) plasma_client.Delete(object_id);
> Set buff to nullptr (step 3) just before releasing the object (step 4), because 
> CloseIpcBuffer is called in its destructor (class CudaBuffer).
> If a user does not do that promptly, CloseIpcBuffer will be blocked.
> Then, the following error may occur when another object is created.
>     IOError: Cuda Driver API call in 
> /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
> code 208: cuIpcOpenMemHandle(, *handle, 
> CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
> Here is a sample.
> thread 1:
> {
>   std::shared_ptr buff;
>   plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
>   plasma_client1.Seal(object_id1);
>   // buff is not set to nullptr here
>   plasma_client1.Release(object_id1);
>   plasma_client1.Delete(object_id1);
>   // ... do something else, or nothing at all
> }
> // buff is released automatically here.
> thread 2:
> {
>   std::shared_ptr buff;
>   plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
>   // If the address allocated by the server is the one just released for
>   // object_id1, the error occurs!
> }
>  





[jira] [Updated] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object

2019-07-14 Thread shengjun.li (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shengjun.li updated ARROW-5924:
---
Description: 
cmake_modules/DefineOptions.cmake
   define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
   define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
 (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
 (2) plasma_client.Seal(object_id);
 (3) buff = nullptr;
 (4) plasma_client.Release(object_id);
 (5) plasma_client.Delete(object_id);

Set buff to nullptr (step 3) just before releasing the object (step 4), because 
CloseIpcBuffer is called in its destructor (class CudaBuffer).
 If a user does not do that promptly, CloseIpcBuffer will be blocked.
 Then, the following error may occur when another object is created.
     IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
thread 1:
{
  std::shared_ptr buff;
  plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
  plasma_client1.Seal(object_id1);
  // buff is not set to nullptr here
  plasma_client1.Release(object_id1);
  plasma_client1.Delete(object_id1);
  // ... do something else, or nothing at all
}
// buff is released automatically here.

thread 2:
{
  std::shared_ptr buff;
  plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
  // If the address allocated by the server is the one just released for
  // object_id1, the error occurs!
}

 

  was:
cmake_modules/DefineOptions.cmake
  define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
toolkit)" ON)
  define_option(ARROW_PLASMA "Build the plasma object store along with Arrow" 
ON)

The correct sequence is as follows:
(1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
device_num > 0
(2) plasma_client.Seal(object_id);
(3) buff = nullptr;
(4) plasma_client.Release(object_id);
(5) plasma_client.Delete(object_id);

Set buff to nullptr (step 3) just before releasing the object (step 4), because 
CloseIpcBuffer is called in its destructor (class CudaBuffer).
If a user does not do that promptly, CloseIpcBuffer will be blocked.
Then, the following error may occur when another object is created.
    IOError: Cuda Driver API call in 
/home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
code 208: cuIpcOpenMemHandle(, *handle, 
CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)

Here is a sample.
thread 1:
{
  std::shared_ptr buff;
  plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
  plasma_client1.Seal(object_id1);
  // buff is not set to nullptr here
  plasma_client1.Release(object_id1);
  plasma_client1.Delete(object_id1);
  // ... do something else, or nothing at all
}
// buff is released automatically here.

thread 2:
{
  std::shared_ptr buff;
  plasma_client2.Create(object_id2, size, nullptr, 0, , 1);
  // If the address allocated by the server is the one just released for
  // object_id1, the error occurs!
}

 


> [C++][Plasma] It is not convenient to release a GPU object
> --
>
> Key: ARROW-5924
> URL: https://issues.apache.org/jira/browse/ARROW-5924
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++ - Plasma
>Affects Versions: 0.14.0
>Reporter: shengjun.li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> cmake_modules/DefineOptions.cmake
>    define_option(ARROW_CUDA "Build the Arrow CUDA extensions (requires CUDA 
> toolkit)" ON)
>    define_option(ARROW_PLASMA "Build the plasma object store along with 
> Arrow" ON)
> The correct sequence is as follows:
>  (1) plasma_client.Create(object_id, size, nullptr, 0, , 1);  // where 
> device_num > 0
>  (2) plasma_client.Seal(object_id);
>  (3) buff = nullptr;
>  (4) plasma_client.Release(object_id);
>  (5) plasma_client.Delete(object_id);
> Set buff to nullptr (step 3) just before releasing the object (step 4), because 
> CloseIpcBuffer is called in its destructor (class CudaBuffer).
>  If a user does not do that promptly, CloseIpcBuffer will be blocked.
>  Then, the following error may occur when another object is created.
>      IOError: Cuda Driver API call in 
> /home/zilliz/arrow/cpp/src/arrow/gpu/cuda_context.cc at line 156 failed with 
> code 208: cuIpcOpenMemHandle(, *handle, 
> CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS) (nil)
> Here is a sample.
> thread 1:
> {
>   std::shared_ptr buff;
>   plasma_client1.Create(object_id1, size, nullptr, 0, , 1);
>   plasma_client1.Seal(object_id1);
>   // buff is not set to nullptr here
>   plasma_client1.Release(object_id1);
>   plasma_client1.Delete(object_id1);
>   // ... do something else, or nothing at all
> }

[jira] [Updated] (ARROW-5944) [C++][Gandiva] Remove 'div' alias for 'divide'

2019-07-14 Thread Pindikura Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pindikura Ravindra updated ARROW-5944:
--
Component/s: C++ - Gandiva

> [C++][Gandiva] Remove 'div' alias for 'divide' 
> ---
>
> Key: ARROW-5944
> URL: https://issues.apache.org/jira/browse/ARROW-5944
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++ - Gandiva
>Reporter: Prudhvi Porandla
>Assignee: Prudhvi Porandla
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> div and divide are two different operators.





[jira] [Resolved] (ARROW-5944) [C++][Gandiva] Remove 'div' alias for 'divide'

2019-07-14 Thread Pindikura Ravindra (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pindikura Ravindra resolved ARROW-5944.
---
   Resolution: Fixed
Fix Version/s: 0.14.1

Issue resolved by pull request 4876
[https://github.com/apache/arrow/pull/4876]

> [C++][Gandiva] Remove 'div' alias for 'divide' 
> ---
>
> Key: ARROW-5944
> URL: https://issues.apache.org/jira/browse/ARROW-5944
> Project: Apache Arrow
>  Issue Type: Task
>Reporter: Prudhvi Porandla
>Assignee: Prudhvi Porandla
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.14.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> div and divide are two different operators.


