[jira] [Assigned] (ARROW-8504) [C++] Add a method that takes an RLE visitor for a bitmap.

2020-04-29 Thread Micah Kornfield (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Kornfield reassigned ARROW-8504:
--

Assignee: Micah Kornfield

> [C++] Add a method that takes an RLE visitor for a bitmap.
> --
>
> Key: ARROW-8504
> URL: https://issues.apache.org/jira/browse/ARROW-8504
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Micah Kornfield
>Assignee: Micah Kornfield
>Priority: Major
>
> For nullability data, in many cases nulls are not evenly distributed.  In 
> these cases it would be beneficial to have a mechanism for understanding when 
> runs of set/unset bits are encountered.  One example of this is translating a 
> bitmap to Parquet definition levels.
>  
> An implementation path could be to add this as a method on Bitmap that makes 
> an adaptor callback for VisitWords, but I think at least for Parquet an 
> iterator API might be more appropriate (something that is easily 
> stoppable/resumable).
>  
>  
>  
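For illustration, a minimal Python sketch of the run-oriented traversal being proposed (the helper below is hypothetical; the real visitor would be a C++ API working on whole bitmap words rather than individual bits):

{code:python}
import pyarrow as pa

def bitmap_runs(arr):
    # Hypothetical helper: yield (is_valid, run_length) pairs for an array's
    # validity bitmap, e.g. as a starting point for Parquet definition levels.
    run_value, run_length = None, 0
    for is_null in arr.is_null().to_pylist():
        valid = not is_null
        if valid == run_value:
            run_length += 1
        else:
            if run_length:
                yield run_value, run_length
            run_value, run_length = valid, 1
    if run_length:
        yield run_value, run_length

arr = pa.array([1, None, None, 3, 4, None])
print(list(bitmap_runs(arr)))  # [(True, 1), (False, 2), (True, 2), (False, 1)]
{code}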



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8641) [Python] Regression in feather: no longer supports permutation in column selection

2020-04-29 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8641:


 Summary: [Python] Regression in feather: no longer supports 
permutation in column selection
 Key: ARROW-8641
 URL: https://issues.apache.org/jira/browse/ARROW-8641
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Python
Reporter: Joris Van den Bossche


A rather annoying regression (original report: 
https://github.com/pandas-dev/pandas/issues/33878) is that when specifying 
{{columns}} to read, this now fails if the order of the columns is not exactly 
the same as in the file:

{code:python}
In [27]: table = pa.table([[1, 2, 3], [4, 5, 6], [7, 8, 9]], names=['a', 'b', 
'c'])

In [29]: from pyarrow import feather 

In [30]: feather.write_feather(table, "test.feather")   

# this works fine
In [32]: feather.read_table("test.feather", columns=['a', 'b'])
Out[32]: 
pyarrow.Table
a: int64
b: int64

In [33]: feather.read_table("test.feather", columns=['b', 'a'])
---
ArrowInvalid  Traceback (most recent call last)
 in 
> 1 feather.read_table("test.feather", columns=['b', 'a'])

~/scipy/repos/arrow/python/pyarrow/feather.py in read_table(source, columns, 
memory_map)
237 return reader.read_indices(columns)
238 elif all(map(lambda t: t == str, column_types)):
--> 239 return reader.read_names(columns)
240 
241 column_type_names = [t.__name__ for t in column_types]

~/scipy/repos/arrow/python/pyarrow/feather.pxi in 
pyarrow.lib.FeatherReader.read_names()

~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: Schema at index 0 was different: 
b: int64
a: int64
vs
a: int64
b: int64
{code}
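Until this is fixed, one possible workaround (just a sketch; it assumes reading the columns in file order first is acceptable) is to reorder after reading:

{code:python}
import pyarrow as pa
from pyarrow import feather

# Request the columns in file order (or omit `columns` entirely), then
# reorder the resulting table in memory.
table = feather.read_table("test.feather", columns=['a', 'b'])
reordered = pa.table([table.column(name) for name in ['b', 'a']], names=['b', 'a'])
{code}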



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8592) [C++] Docs still list LLVM 7 as compiler used

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-8592:
--
Labels: pull-request-available  (was: )

> [C++] Docs still list LLVM 7 as compiler used
> -
>
> Key: ARROW-8592
> URL: https://issues.apache.org/jira/browse/ARROW-8592
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Documentation
>Reporter: Micah Kornfield
>Assignee: Micah Kornfield
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The docs should say LLVM 8 instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (ARROW-8640) pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not match the passed number (3)

2020-04-29 Thread Anish Biswas (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anish Biswas closed ARROW-8640.
---

> pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not 
> match the passed number (3)
> 
>
> Key: ARROW-8640
> URL: https://issues.apache.org/jira/browse/ARROW-8640
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Anish Biswas
>Priority: Major
>
> {code:python}
> arr1 = pa.array([1, 2, 3, 4, 5])
> arr1.buffers()
> arr2 = pa.array([1.1, 2.2, 3.3, 4.4, 5.5])
> types = pa.array([0, 1, 0, 0, 1, 1, 0], type='int8')
> value_offsets = pa.array([1, 0, 0, 2, 1, 2, 3], type='int32')
> value_offsets.buffers()
> arr = pa.UnionArray.from_dense(types, value_offsets, [arr1, arr2])
> arr4 = pa.UnionArray.from_buffers(
>     pa.struct([pa.field("0", arr1.type), pa.field("1", arr2.type)]),
>     5, arr.buffers()[0:3], children=[arr1, arr2])
> {code}
> The problem arises when I try to produce the UnionArray via buffers: according 
> to the columnar format documentation I need 3 buffers to produce a dense 
> UnionArray. But when I try this, I get the error `Type's expected number of 
> buffers (1) did not match the passed number (3)`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8640) pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not match the passed number (3)

2020-04-29 Thread Anish Biswas (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096164#comment-17096164
 ] 

Anish Biswas commented on ARROW-8640:
-

Ah, I see. Yes, that makes more sense. Thanks for the help! I'll close this 
issue now.

> pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not 
> match the passed number (3)
> 
>
> Key: ARROW-8640
> URL: https://issues.apache.org/jira/browse/ARROW-8640
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Anish Biswas
>Priority: Major
>
> {code:python}
> arr1 = pa.array([1, 2, 3, 4, 5])
> arr1.buffers()
> arr2 = pa.array([1.1, 2.2, 3.3, 4.4, 5.5])
> types = pa.array([0, 1, 0, 0, 1, 1, 0], type='int8')
> value_offsets = pa.array([1, 0, 0, 2, 1, 2, 3], type='int32')
> value_offsets.buffers()
> arr = pa.UnionArray.from_dense(types, value_offsets, [arr1, arr2])
> arr4 = pa.UnionArray.from_buffers(
>     pa.struct([pa.field("0", arr1.type), pa.field("1", arr2.type)]),
>     5, arr.buffers()[0:3], children=[arr1, arr2])
> {code}
> The problem arises when I try to produce the UnionArray via buffers: according 
> to the columnar format documentation I need 3 buffers to produce a dense 
> UnionArray. But when I try this, I get the error `Type's expected number of 
> buffers (1) did not match the passed number (3)`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8640) pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not match the passed number (3)

2020-04-29 Thread Micah Kornfield (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096159#comment-17096159
 ] 

Micah Kornfield commented on ARROW-8640:


{quote}really it is misleading to use pa.UnionArray.from_buffers, you should 
probably use pa.Array.from_buffers.{quote}
I'm not sure what you mean?  If you think there is a more obvious API for Union 
construction, feel free to create a proposal JIRA/PR for discussion.

> pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not 
> match the passed number (3)
> 
>
> Key: ARROW-8640
> URL: https://issues.apache.org/jira/browse/ARROW-8640
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Anish Biswas
>Priority: Major
>
> {code:python}
> arr1 = pa.array([1, 2, 3, 4, 5])
> arr1.buffers()
> arr2 = pa.array([1.1, 2.2, 3.3, 4.4, 5.5])
> types = pa.array([0, 1, 0, 0, 1, 1, 0], type='int8')
> value_offsets = pa.array([1, 0, 0, 2, 1, 2, 3], type='int32')
> value_offsets.buffers()
> arr = pa.UnionArray.from_dense(types, value_offsets, [arr1, arr2])
> arr4 = pa.UnionArray.from_buffers(
>     pa.struct([pa.field("0", arr1.type), pa.field("1", arr2.type)]),
>     5, arr.buffers()[0:3], children=[arr1, arr2])
> {code}
> The problem arises when I try to produce the UnionArray via buffers: according 
> to the columnar format documentation I need 3 buffers to produce a dense 
> UnionArray. But when I try this, I get the error `Type's expected number of 
> buffers (1) did not match the passed number (3)`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-8640) pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not match the passed number (3)

2020-04-29 Thread Micah Kornfield (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Kornfield resolved ARROW-8640.

Resolution: Not A Problem

> pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not 
> match the passed number (3)
> 
>
> Key: ARROW-8640
> URL: https://issues.apache.org/jira/browse/ARROW-8640
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Anish Biswas
>Priority: Major
>
> {code:python}
> arr1 = pa.array([1, 2, 3, 4, 5])
> arr1.buffers()
> arr2 = pa.array([1.1, 2.2, 3.3, 4.4, 5.5])
> types = pa.array([0, 1, 0, 0, 1, 1, 0], type='int8')
> value_offsets = pa.array([1, 0, 0, 2, 1, 2, 3], type='int32')
> value_offsets.buffers()
> arr = pa.UnionArray.from_dense(types, value_offsets, [arr1, arr2])
> arr4 = pa.UnionArray.from_buffers(
>     pa.struct([pa.field("0", arr1.type), pa.field("1", arr2.type)]),
>     5, arr.buffers()[0:3], children=[arr1, arr2])
> {code}
> The problem arises when I try to produce the UnionArray via buffers: according 
> to the columnar format documentation I need 3 buffers to produce a dense 
> UnionArray. But when I try this, I get the error `Type's expected number of 
> buffers (1) did not match the passed number (3)`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8640) pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not match the passed number (3)

2020-04-29 Thread Anish Biswas (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096156#comment-17096156
 ] 

Anish Biswas commented on ARROW-8640:
-

Yup, thanks, that was the trick!
{code:python}
type_arr = pa.union(pa.struct([pa.field("0", arr1.type), pa.field("1", arr2.type)]),
                    "dense", [0, 1])

arr4 = pa.UnionArray.from_buffers(type_arr, 5, arr.buffers()[0:3],
                                  children=[arr1, arr2])
{code}
Creating a union type first and then passing it in as the type worked. Thanks! 
Just a follow-up question: why are we required to wrap each DataType in a field 
and collect them in a struct, instead of passing a list of DataTypes where the 
position determines the type codes?

> pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not 
> match the passed number (3)
> 
>
> Key: ARROW-8640
> URL: https://issues.apache.org/jira/browse/ARROW-8640
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Anish Biswas
>Priority: Major
>
> {code:python}
> arr1 = pa.array([1, 2, 3, 4, 5])
> arr1.buffers()
> arr2 = pa.array([1.1, 2.2, 3.3, 4.4, 5.5])
> types = pa.array([0, 1, 0, 0, 1, 1, 0], type='int8')
> value_offsets = pa.array([1, 0, 0, 2, 1, 2, 3], type='int32')
> value_offsets.buffers()
> arr = pa.UnionArray.from_dense(types, value_offsets, [arr1, arr2])
> arr4 = pa.UnionArray.from_buffers(
>     pa.struct([pa.field("0", arr1.type), pa.field("1", arr2.type)]),
>     5, arr.buffers()[0:3], children=[arr1, arr2])
> {code}
> The problem arises when I try to produce the UnionArray via buffers: according 
> to the columnar format documentation I need 3 buffers to produce a dense 
> UnionArray. But when I try this, I get the error `Type's expected number of 
> buffers (1) did not match the passed number (3)`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8640) pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not match the passed number (3)

2020-04-29 Thread Micah Kornfield (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096145#comment-17096145
 ] 

Micah Kornfield commented on ARROW-8640:


It looks like from_buffers is a method declared on the parent class Array, and 
the type that the sample code is passing in is a struct (which should only 
have the nullability buffer).  To construct a dense union you need to 
instantiate a new UnionType.  I'm not seeing it in the documentation for some 
reason, but I think you can use the union method (definition pasted below).

{{def union(children_fields, mode, type_codes=None):}}
{{    """}}
{{    Create UnionType from children fields.}}
{{    A union is defined by an ordered sequence of types; each slot in the union}}
{{    can have a value chosen from these types.}}
{{    Parameters}}
{{    ----------}}
{{    fields : sequence of Field values}}
{{        Each field must have a UTF8-encoded name, and these field names are}}
{{        part of the type metadata.}}
{{    mode : str}}
{{        Either 'dense' or 'sparse'.}}
{{    type_codes : list of integers, default None}}
{{    Returns}}
{{    -------}}
{{    type : DataType}}
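For instance, an untested sketch (the field names and child types mirror the snippet above) of building the dense union type, which can then be handed to UnionArray.from_buffers:

{code:python}
import pyarrow as pa

# Untested sketch: construct the dense UnionType first; its buffer layout
# (type ids + offsets) is what from_buffers expects for a dense union.
union_type = pa.union([pa.field("0", pa.int64()), pa.field("1", pa.float64())],
                      mode='dense', type_codes=[0, 1])
print(union_type)  # a dense union with two child fields
{code}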

> pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not 
> match the passed number (3)
> 
>
> Key: ARROW-8640
> URL: https://issues.apache.org/jira/browse/ARROW-8640
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Anish Biswas
>Priority: Major
>
> {code:python}
> arr1 = pa.array([1, 2, 3, 4, 5])
> arr1.buffers()
> arr2 = pa.array([1.1, 2.2, 3.3, 4.4, 5.5])
> types = pa.array([0, 1, 0, 0, 1, 1, 0], type='int8')
> value_offsets = pa.array([1, 0, 0, 2, 1, 2, 3], type='int32')
> value_offsets.buffers()
> arr = pa.UnionArray.from_dense(types, value_offsets, [arr1, arr2])
> arr4 = pa.UnionArray.from_buffers(
>     pa.struct([pa.field("0", arr1.type), pa.field("1", arr2.type)]),
>     5, arr.buffers()[0:3], children=[arr1, arr2])
> {code}
> The problem arises when I try to produce the UnionArray via buffers: according 
> to the columnar format documentation I need 3 buffers to produce a dense 
> UnionArray. But when I try this, I get the error `Type's expected number of 
> buffers (1) did not match the passed number (3)`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8640) pyarrow.UnionArray.from_buffers() expected number of buffers (1) did not match the passed number (3)

2020-04-29 Thread Anish Biswas (Jira)
Anish Biswas created ARROW-8640:
---

 Summary: pyarrow.UnionArray.from_buffers() expected number of 
buffers (1) did not match the passed number (3)
 Key: ARROW-8640
 URL: https://issues.apache.org/jira/browse/ARROW-8640
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Anish Biswas


{code:python}
arr1 = pa.array([1, 2, 3, 4, 5])
arr1.buffers()
arr2 = pa.array([1.1, 2.2, 3.3, 4.4, 5.5])
types = pa.array([0, 1, 0, 0, 1, 1, 0], type='int8')
value_offsets = pa.array([1, 0, 0, 2, 1, 2, 3], type='int32')
value_offsets.buffers()
arr = pa.UnionArray.from_dense(types, value_offsets, [arr1, arr2])
arr4 = pa.UnionArray.from_buffers(
    pa.struct([pa.field("0", arr1.type), pa.field("1", arr2.type)]),
    5, arr.buffers()[0:3], children=[arr1, arr2])
{code}

The problem arises when I try to produce the UnionArray via buffers: according 
to the columnar format documentation I need 3 buffers to produce a dense 
UnionArray. But when I try this, I get the error `Type's expected number of 
buffers (1) did not match the passed number (3)`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8639) [C++][Plasma] Require gflags

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-8639:
--
Labels: pull-request-available  (was: )

> [C++][Plasma] Require gflags
> 
>
> Key: ARROW-8639
> URL: https://issues.apache.org/jira/browse/ARROW-8639
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++ - Plasma
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8639) [C++][Plasma] Require gflags

2020-04-29 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-8639:
---

 Summary: [C++][Plasma] Require gflags
 Key: ARROW-8639
 URL: https://issues.apache.org/jira/browse/ARROW-8639
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++ - Plasma
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8638) Arrow Cython API Usage Gives an error when calling CTable API Endpoints

2020-04-29 Thread Vibhatha Lakmal Abeykoon (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vibhatha Lakmal Abeykoon updated ARROW-8638:

Description: 
I am working on using both the Arrow C++ API and the Cython API to support an 
application that I am developing. Here I will describe the issue I experienced 
while trying to follow the example at

[https://arrow.apache.org/docs/python/extending.html]

I am testing on Ubuntu 20.04 LTS, Python version 3.8.2.

These are the steps I followed.

1. Create a virtualenv: python3 -m venv ENVARROW

2. Activate the env: source ENVARROW/bin/activate

3. Install dependencies: pip3 install pyarrow==0.16.0 cython numpy

4. Code and tools:

+*example.pyx*+

 

 
{code:python}
from pyarrow.lib cimport *

def get_array_length(obj):
    # Just an example function accessing both the pyarrow Cython API
    # and the Arrow C++ API
    cdef shared_ptr[CArray] arr = pyarrow_unwrap_array(obj)
    if arr.get() == NULL:
        raise TypeError("not an array")
    return arr.get().length()

def get_table_info(obj):
    cdef shared_ptr[CTable] table = pyarrow_unwrap_table(obj)
    if table.get() == NULL:
        raise TypeError("not a table")
    return table.get().num_columns()
{code}
 

 

+*setup.py*+

 

 
{code:python}
from distutils.core import setup
from Cython.Build import cythonize
import os
import numpy as np
import pyarrow as pa

ext_modules = cythonize("example.pyx")
for ext in ext_modules:
    # The Numpy C headers are currently required
    ext.include_dirs.append(np.get_include())
    ext.include_dirs.append(pa.get_include())
    ext.libraries.extend(pa.get_libraries())
    ext.library_dirs.extend(pa.get_library_dirs())

    if os.name == 'posix':
        ext.extra_compile_args.append('-std=c++11')

    # Try uncommenting the following line on Linux
    # if you get weird linker errors or runtime crashes
    # ext.define_macros.append(("_GLIBCXX_USE_CXX11_ABI", "0"))

setup(ext_modules=ext_modules)
{code}
 

 

+*arrow_array.py*+

 
{code:python}
import example
import pyarrow as pa
import numpy as np

arr = pa.array([1,2,3,4,5])
len = example.get_array_length(arr)
print("Array length {} ".format(len))
{code}
 

+*arrow_table.py*+

 
{code:python}
import example
import pyarrow as pa
import numpy as np
from pyarrow import csv

fn = 'data.csv'
table = csv.read_csv(fn)
print(table)
cols = example.get_table_info(table)
print(cols)
{code}
+*data.csv*+
{code:java}
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
{code}
 

+*Makefile*+

 
{code:java}
install: 
python3 setup.py build_ext --inplace
clean: 
rm -R *.so build *.cpp 
{code}
 

When I try to run either of the Python example scripts (arrow_table.py or 
arrow_array.py), I get the following error.

{code}
File "arrow_array.py", line 1, in <module>
    import example
ImportError: libarrow.so.16: cannot open shared object file: No such file or directory
{code}
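One thing worth checking (a diagnostic sketch, not an official fix): this ImportError usually means the dynamic loader cannot find the libarrow shared libraries bundled in the pyarrow wheel, so printing the library directory and exposing it to the loader before running the scripts may help:

{code:python}
import pyarrow as pa

# Directory that contains libarrow.so.16 inside the installed pyarrow wheel.
# One possible workaround is to export it before running the example scripts:
#   export LD_LIBRARY_PATH="$(python3 -c 'import pyarrow; print(pyarrow.get_library_dirs()[0])'):$LD_LIBRARY_PATH"
print(pa.get_library_dirs()[0])
{code}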
 

 

*Note: I also checked this on RHEL7 with Python 3.6.8 and got a similar response.*

  was:
I am working on using both Arrow C++ API and Cython API to support an 
application that I am developing. But here, I will add the issue I experienced 
when I am trying to follow the example, 

[https://arrow.apache.org/docs/python/extending.html]

I am testing on Ubuntu 20.04 LTS

Python version 3.8.2

These are the steps I followed.
 # Create Virtualenv

python3 -m venv ENVARROW

 

2. Activate ENV

source ENVARROW/bin/activate

 

3. pip3 install pyarrow==0.16.0 cython numpy

 

 4. Code block and Tools,

 

+*example.pyx*+

 

 
{code:java}
from pyarrow.lib cimport *
def get_array_length(obj):
 # Just an example function accessing both the pyarrow Cython API
 # and the Arrow C++ API
 cdef shared_ptr[CArray] arr = pyarrow_unwrap_array(obj)
 if arr.get() == NULL:
 raise TypeError("not an array")
 return arr.get().length()
def get_table_info(obj):
 cdef shared_ptr[CTable] table = pyarrow_unwrap_table(obj)
 if table.get() == NULL:
 raise TypeError("not an table")
 
 return table.get().num_columns() 
{code}
 

 

+*setup.py*+

 

 
{code:java}
from distutils.core import setup
from Cython.Build import cythonize
import os
import numpy as np
import pyarrow as pa

ext_modules = cythonize("example.pyx")
for ext in ext_modules:
 # The Numpy C headers are currently required
 ext.include_dirs.append(np.get_include())
 ext.include_dirs.append(pa.get_include())
 ext.libraries.extend(pa.get_libraries())
 ext.library_dirs.extend(pa.get_library_dirs())
if os.name == 'posix':
 ext.extra_compile_args.append('-std=c++11')
# Try uncommenting the following line on Linux
 # if you get weird linker errors or runtime crashes
 #ext.define_macros.append(("_GLIBCXX_USE_CXX11_ABI", "0"))

setup(ext_modules=ext_modules)
{code}
 

 

+*arrow_array.py*+

 
{code:java}
import example
import pyarrow as pa
import numpy as np
arr = pa.array([1,2,3,4,5])
len = example.get_array_length(arr)
print("Array length {} ".format(len)) 
{code}
 

+*arrow_table.py*+

 
{code:java}
import exam

[jira] [Updated] (ARROW-8638) Arrow Cython API Usage Gives an error when calling CTable API Endpoints

2020-04-29 Thread Vibhatha Lakmal Abeykoon (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vibhatha Lakmal Abeykoon updated ARROW-8638:

Description: 
I am working on using both the Arrow C++ API and the Cython API to support an 
application that I am developing. Here I will describe the issue I experienced 
while trying to follow the example at

[https://arrow.apache.org/docs/python/extending.html]

I am testing on Ubuntu 20.04 LTS, Python version 3.8.2.

These are the steps I followed.

1. Create a virtualenv: python3 -m venv ENVARROW

2. Activate the env: source ENVARROW/bin/activate

3. Install dependencies: pip3 install pyarrow==0.16.0 cython numpy

4. Code and tools:

+*example.pyx*+

 

 
{code:python}
from pyarrow.lib cimport *

def get_array_length(obj):
    # Just an example function accessing both the pyarrow Cython API
    # and the Arrow C++ API
    cdef shared_ptr[CArray] arr = pyarrow_unwrap_array(obj)
    if arr.get() == NULL:
        raise TypeError("not an array")
    return arr.get().length()

def get_table_info(obj):
    cdef shared_ptr[CTable] table = pyarrow_unwrap_table(obj)
    if table.get() == NULL:
        raise TypeError("not a table")
    return table.get().num_columns()
{code}
 

 

+*setup.py*+

 

 
{code:python}
from distutils.core import setup
from Cython.Build import cythonize
import os
import numpy as np
import pyarrow as pa

ext_modules = cythonize("example.pyx")
for ext in ext_modules:
    # The Numpy C headers are currently required
    ext.include_dirs.append(np.get_include())
    ext.include_dirs.append(pa.get_include())
    ext.libraries.extend(pa.get_libraries())
    ext.library_dirs.extend(pa.get_library_dirs())

    if os.name == 'posix':
        ext.extra_compile_args.append('-std=c++11')

    # Try uncommenting the following line on Linux
    # if you get weird linker errors or runtime crashes
    # ext.define_macros.append(("_GLIBCXX_USE_CXX11_ABI", "0"))

setup(ext_modules=ext_modules)
{code}
 

 

+*arrow_array.py*+

 
{code:python}
import example
import pyarrow as pa
import numpy as np

arr = pa.array([1,2,3,4,5])
len = example.get_array_length(arr)
print("Array length {} ".format(len))
{code}
 

+*arrow_table.py*+

 
{code:python}
import example
import pyarrow as pa
import numpy as np
from pyarrow import csv

fn = 'data.csv'
table = csv.read_csv(fn)
print(table)
cols = example.get_table_info(table)
print(cols)
{code}
+*data.csv*+
{code:java}
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
{code}
 

 

When I try to run either of the Python example scripts (arrow_table.py or 
arrow_array.py), I get the following error.

{code}
File "arrow_array.py", line 1, in <module>
    import example
ImportError: libarrow.so.16: cannot open shared object file: No such file or directory
{code}
 

 

*Note: I also checked this on RHEL7 with Python 3.6.8 and got a similar response.*


+*Makefile*+

 

 
{code}
install:
	python3 setup.py build_ext --inplace

clean:
	rm -R *.so build *.cpp
{code}

 

 

 

 

  was:
I am working on using both Arrow C++ API and Cython API to support an 
application that I am developing. But here, I will add the issue I experienced 
when I am trying to follow the example, 

[https://arrow.apache.org/docs/python/extending.html]

I am testing on Ubuntu 20.04 LTS

Python version 3.8.2

These are the steps I followed.
 # Create Virtualenv

python3 -m venv ENVARROW

 

2. Activate ENV

source ENVARROW/bin/activate

 

3. pip3 install pyarrow==0.16.0 cython numpy

 

 4. Code block and Tools,

 

+*example.pyx*+

 

 
{code:java}

{code}
*from pyarrow.lib cimport *
def get_array_length(obj):
# Just an example function accessing both the pyarrow Cython API
# and the Arrow C++ API
cdef shared_ptr[CArray] arr = pyarrow_unwrap_array(obj)
if arr.get() == NULL:
raise TypeError("not an array")
return arr.get().length()def get_table_info(obj):
cdef shared_ptr[CTable] table = pyarrow_unwrap_table(obj)
if table.get() == NULL:
raise TypeError("not an table")

return table.get().num_columns()***

 

+*setup.py*+

 
{code:java}

{code}
*from distutils.core import setup
from Cython.Build import cythonizeimport os
import numpy as np
import pyarrow as pa
ext_modules = cythonize("example.pyx")for ext in ext_modules:
# The Numpy C headers are currently required
ext.include_dirs.append(np.get_include())
ext.include_dirs.append(pa.get_include())
ext.libraries.extend(pa.get_libraries())
ext.library_dirs.extend(pa.get_library_dirs())*
*if os.name == 'posix':
ext.extra_compile_args.append('-std=c++11')*
*# Try uncommenting the following line on Linux
# if you get weird linker errors or runtime crashes
#ext.define_macros.append(("_GLIBCXX_USE_CXX11_ABI", "0"))
setup(ext_modules=ext_modules)***

 

+*arrow_array.py*+

 
{code:java}

{code}
*import example
import pyarrow as pa
import numpy as np
arr = pa.array([1,2,3,4,5])
len = example.get_array_length(arr)
pr

[jira] [Created] (ARROW-8638) Arrow Cython API Usage Gives an error when calling CTable API Endpoints

2020-04-29 Thread Vibhatha Lakmal Abeykoon (Jira)
Vibhatha Lakmal Abeykoon created ARROW-8638:
---

 Summary: Arrow Cython API Usage Gives an error when calling CTable 
API Endpoints
 Key: ARROW-8638
 URL: https://issues.apache.org/jira/browse/ARROW-8638
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Python
Affects Versions: 0.16.0
 Environment: Ubuntu 20.04 with Python 3.8.2
RHEL7 with Python 3.6.8
Reporter: Vibhatha Lakmal Abeykoon
 Fix For: 0.16.0


I am working on using both Arrow C++ API and Cython API to support an 
application that I am developing. But here, I will add the issue I experienced 
when I am trying to follow the example, 

[https://arrow.apache.org/docs/python/extending.html]

I am testing on Ubuntu 20.04 LTS

Python version 3.8.2

These are the steps I followed.
 # Create Virtualenv

python3 -m venv ENVARROW

 

2. Activate ENV

source ENVARROW/bin/activate

 

3. pip3 install pyarrow==0.16.0 cython numpy

 

 4. Code block and Tools,

 

+*example.pyx*+

 

 
{code:java}

{code}
*from pyarrow.lib cimport *
def get_array_length(obj):
# Just an example function accessing both the pyarrow Cython API
# and the Arrow C++ API
cdef shared_ptr[CArray] arr = pyarrow_unwrap_array(obj)
if arr.get() == NULL:
raise TypeError("not an array")
return arr.get().length()def get_table_info(obj):
cdef shared_ptr[CTable] table = pyarrow_unwrap_table(obj)
if table.get() == NULL:
raise TypeError("not an table")

return table.get().num_columns()***

 

+*setup.py*+

 
{code:java}

{code}
*from distutils.core import setup
from Cython.Build import cythonizeimport os
import numpy as np
import pyarrow as pa
ext_modules = cythonize("example.pyx")for ext in ext_modules:
# The Numpy C headers are currently required
ext.include_dirs.append(np.get_include())
ext.include_dirs.append(pa.get_include())
ext.libraries.extend(pa.get_libraries())
ext.library_dirs.extend(pa.get_library_dirs())*
*if os.name == 'posix':
ext.extra_compile_args.append('-std=c++11')*
*# Try uncommenting the following line on Linux
# if you get weird linker errors or runtime crashes
#ext.define_macros.append(("_GLIBCXX_USE_CXX11_ABI", "0"))
setup(ext_modules=ext_modules)***

 

+*arrow_array.py*+

 
{code:java}

{code}
*import example
import pyarrow as pa
import numpy as np
arr = pa.array([1,2,3,4,5])
len = example.get_array_length(arr)
print("Array length {} ".format(len))***

 

+*arrow_table.py*+

 

 
{code:java}

{code}
*import example
import pyarrow as pa
import numpy as np
from pyarrow import csv*
*fn = 'data.csv'*
*table = csv.read_csv(fn)*
*print(table)*
*cols = example.get_table_info(table)*
*print(cols)***

 

+*data.csv*+
{code:java}
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15
{code}
 

 

**When I try to run either of the python example scripts arrow_table.py or 
arrow_array.py, 

I get the following error. 

 
{code:java}
File "arrow_array.py", line 1, in 
 import example
ImportError: libarrow.so.16: cannot open shared object file: No such file or 
directory
{code}
 

 

*Note: I also checked this on RHEL7 with Python 3.6.8 and got a similar response.*


+*Makefile*+

 

 
{code:java}

{code}
*install:*
 *python3 setup.py build_ext --inplace*
*clean:* 
 *rm -R *.so build *.cpp***

 

 

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ARROW-2260) [C++][Plasma] plasma_store should show usage

2020-04-29 Thread Micah Kornfield (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Kornfield reassigned ARROW-2260:
--

Assignee: Christian Hudon

> [C++][Plasma] plasma_store should show usage
> 
>
> Key: ARROW-2260
> URL: https://issues.apache.org/jira/browse/ARROW-2260
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++ - Plasma
>Affects Versions: 0.8.0
>Reporter: Antoine Pitrou
>Assignee: Christian Hudon
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently the options exposed by the {{plasma_store}} executable aren't very 
> discoverable:
> {code:bash}
> $ plasma_store -h
> please specify socket for incoming connections with -s switch
> Abandon
> (pyarrow) antoine@fsol:~/arrow/cpp (ARROW-2135-nan-conversion-when-casting 
> *)$ plasma_store 
> please specify socket for incoming connections with -s switch
> Abandon
> (pyarrow) antoine@fsol:~/arrow/cpp (ARROW-2135-nan-conversion-when-casting 
> *)$ plasma_store --help
> plasma_store: invalid option -- '-'
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-8392) [Java] Fix overflow related corner cases for vector value comparison

2020-04-29 Thread Micah Kornfield (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Kornfield resolved ARROW-8392.

Fix Version/s: 1.0.0
   Resolution: Fixed

Issue resolved by pull request 6892
[https://github.com/apache/arrow/pull/6892]

> [Java] Fix overflow related corner cases for vector value comparison
> 
>
> Key: ARROW-8392
> URL: https://issues.apache.org/jira/browse/ARROW-8392
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> 1. Fix corner cases related to overflow.
> 2. Provide test cases for the corner cases. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (ARROW-8635) [R] test-filesystem.R takes ~40 seconds to run?

2020-04-29 Thread Wes McKinney (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney closed ARROW-8635.
---
Fix Version/s: (was: 1.0.0)
   Resolution: Workaround

Cool thanks. I'm setting

{code}
export AWS_EC2_METADATA_DISABLED=TRUE
{code}

in my development environment

> [R] test-filesystem.R takes ~40 seconds to run?
> ---
>
> Key: ARROW-8635
> URL: https://issues.apache.org/jira/browse/ARROW-8635
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Wes McKinney
>Priority: Major
>
> {code}
> ✔ |  22   | Expressions
> ✔ | 107   | Feather [0.2 s]
> ✔ |   7   | Field
> ✔ |  40   | File system [38.1 s]
> ✔ |   6   | install_arrow()
> ✔ |  26   | JsonTableReader [0.1 s]
> ✔ |  24   | MessageReader
> ✔ |  12   | Message
> ✔ |  31   | Parquet file reading/writing [0.2 s]
> ⠏ |   0   | To/from Pythonvirtualenv: arrow-test
> {code}
> Is this expected? I assume it's related to S3 but that seems like a long 
> time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8637) [Rust] Resolve Issues with `prettytable-rs` dependency

2020-04-29 Thread Mark Hildreth (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hildreth updated ARROW-8637:
-
Summary: [Rust] Resolve Issues with `prettytable-rs` dependency  (was: 
Resolve Issues with `prettytable-rs` dependency)

> [Rust] Resolve Issues with `prettytable-rs` dependency
> --
>
> Key: ARROW-8637
> URL: https://issues.apache.org/jira/browse/ARROW-8637
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Mark Hildreth
>Priority: Major
>
> {{prettytable-rs}} is a dependency of Arrow for creating a string for 
> displaying record batches in a tabular form (see [pretty 
> util|https://github.com/apache/arrow/blob/c546eef41e6ab20c4ca29a2d836987959843896f/rust/arrow/src/util/pretty.rs#L24-L25])
>  The crate, however, has some issues:
>  
> 1.) {{prettytable-rs}} has a dependency on the {{term}} crate. The {{term}} 
> crate is under minimal maintenance, and it is advised to switch to another 
> crate. This will probably pop up in an [informational security 
> advisory|https://rustsec.org/advisories/RUSTSEC-2018-0015] if it's decided 
> one day to audit the crates.
> 2.) The crate also has a dependency on {{encode-unicode}}. While not 
> problematic in its own right, this crate implements some traits which can 
> bring about confusing type inference issues. For example, after adding the 
> {{prettytable-rs}} dependency in arrow, the following error occurred when 
> attempting to compile the parquet crate:
>  
> {{let seed_vec: Vec =}}
> {{    Standard.sample_iter(&mut rng).take(seed_len).collect();}}
>  
> {{error[E0282]: type annotations needed}}
> {{   --> parquet/src/encodings/rle.rs:833:26}}
> {{    |}}
> {{833 | Standard.sample_iter(&mut rng).take(seed_len).collect();}}
> {{    | ^^^ cannot infer type for `T`}}
>  
> Any user of the arrow crate would see a similar style of error.
>  
> There are a few possible ways to resolve this:
>  
> 1.) Hopefully hear from the crate maintainer. There is a [PR 
> open|https://github.com/phsym/prettytable-rs/pull/125] for the encode-unicode 
> issue.
> 2.) Find a different table-generating crate with fewer issues.
> 3.) Fork and fix prettytable-rs.
> 4.) ???
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8637) [Rust] Resolve Issues with `prettytable-rs` dependency

2020-04-29 Thread Mark Hildreth (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Hildreth updated ARROW-8637:
-
Priority: Minor  (was: Major)

> [Rust] Resolve Issues with `prettytable-rs` dependency
> --
>
> Key: ARROW-8637
> URL: https://issues.apache.org/jira/browse/ARROW-8637
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Mark Hildreth
>Priority: Minor
>
> {{prettytable-rs}} is a dependency of Arrow for creating a string for 
> displaying record batches in a tabular form (see [pretty 
> util|https://github.com/apache/arrow/blob/c546eef41e6ab20c4ca29a2d836987959843896f/rust/arrow/src/util/pretty.rs#L24-L25])
>  The crate, however, has some issues:
>  
> 1.) {{prettytable-rs}} has a dependency on the {{term}} crate. The {{term}} 
> crate is under minimal maintenance, and it is advised to switch to another 
> crate. This will probably pop up in an [informational security 
> advisory|https://rustsec.org/advisories/RUSTSEC-2018-0015] if it's decided 
> one day to audit the crates.
> 2.) The crate also has a dependency on {{encode-unicode}}. While not 
> problematic in its own right, this crate implements some traits which can 
> bring about confusing type inference issues. For example, after adding the 
> {{prettytable-rs}} dependency in arrow, the following error occurred when 
> attempting to compile the parquet crate:
>  
> {{let seed_vec: Vec =}}
> {{    Standard.sample_iter(&mut rng).take(seed_len).collect();}}
>  
> {{error[E0282]: type annotations needed}}
> {{   --> parquet/src/encodings/rle.rs:833:26}}
> {{    |}}
> {{833 | Standard.sample_iter(&mut rng).take(seed_len).collect();}}
> {{    | ^^^ cannot infer type for `T`}}
>  
> Any user of the arrow crate would see a similar style of error.
>  
> There are a few possible ways to resolve this:
>  
> 1.) Hopefully hear from the crate maintainer. There is a [PR 
> open|https://github.com/phsym/prettytable-rs/pull/125] for the encode-unicode 
> issue.
> 2.) Find a different table-generating crate with fewer issues.
> 3.) Fork and fix prettytable-rs.
> 4.) ???
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8637) Resolve Issues with `prettytable-rs` dependency

2020-04-29 Thread Mark Hildreth (Jira)
Mark Hildreth created ARROW-8637:


 Summary: Resolve Issues with `prettytable-rs` dependency
 Key: ARROW-8637
 URL: https://issues.apache.org/jira/browse/ARROW-8637
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Mark Hildreth


{{prettytable-rs}} is a dependency of Arrow for creating a string for 
displaying record batches in a tabular form (see [pretty 
util|https://github.com/apache/arrow/blob/c546eef41e6ab20c4ca29a2d836987959843896f/rust/arrow/src/util/pretty.rs#L24-L25])
 The crate, however, has some issues:

 

1.) {{prettytable-rs}} has a dependency on the {{term}} crate. The {{term}} 
crate is under minimal maintenance, and it is advised to switch to another 
crate. This will probably pop up in an [informational security 
advisory|https://rustsec.org/advisories/RUSTSEC-2018-0015] if it's decided one 
day to audit the crates.

2.) The crate also has a dependency on {{encode-unicode}}. While not 
problematic in its own right, this crate implements some traits which can bring 
about confusing type inference issues. For example, after adding the 
{{prettytable-rs}} dependency in arrow, the following error occurred when 
attempting to compile the parquet crate:

 

{{let seed_vec: Vec =}}

{{    Standard.sample_iter(&mut rng).take(seed_len).collect();}}

 

{{error[E0282]: type annotations needed}}
{{   --> parquet/src/encodings/rle.rs:833:26}}
{{    |}}
{{833 | Standard.sample_iter(&mut rng).take(seed_len).collect();}}
{{    | ^^^ cannot infer type for `T`}}

 

Any user of the arrow crate would see a similar style of error.

 

There are a few possible ways to resolve this:

 

1.) Hopefully hear from the crate maintainer. There is a [PR 
open|https://github.com/phsym/prettytable-rs/pull/125] for the encode-unicode 
issue.

2.) Find a different table-generating crate with fewer issues.

3.) Fork and fix prettytable-rs.

4.) ???

 
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8633) [C++] Add ValidateAscii function

2020-04-29 Thread Yuqi Gu (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095996#comment-17095996
 ] 

Yuqi Gu commented on ARROW-8633:


Let me break out the ASCII validation changes into a new PR.

> [C++] Add ValidateAscii function
> 
>
> Key: ARROW-8633
> URL: https://issues.apache.org/jira/browse/ARROW-8633
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Yuqi Gu
>Priority: Major
> Fix For: 1.0.0
>
>
> In some cases, we want to be able to check whether it's safe to use functions 
> that assume ASCII (like {{std::tolower}} or {{std::string::substr}}). This was 
> implemented in a PR for ARROW-6131 that was not merged.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8636) plasma client delete (of objectid) causes an exception and abort

2020-04-29 Thread Abe Mammen (Jira)
Abe Mammen created ARROW-8636:
-

 Summary: plasma client delete (of objectid) causes an exception 
and abort
 Key: ARROW-8636
 URL: https://issues.apache.org/jira/browse/ARROW-8636
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Abe Mammen


Built from this git repo, for C++. When I call
{quote}ARROW_CHECK_OK(client.Delete(vector<ObjectID>{objectId}));{quote}
I get:
{quote}Check failed: _s.ok() Operation failed: client.Delete(vector<ObjectID>{objectId})
Bad status: IOError: Encountered unexpected EOF
0 libarrow.18.0.0.dylib 0x0001070ed3c4 _ZN5arrow4util7CerrLog14PrintBackTraceEv + 52
1 libarrow.18.0.0.dylib 0x0001070ed2e2 _ZN5arrow4util7CerrLogD2Ev + 98
2 libarrow.18.0.0.dylib 0x0001070ed245 _ZN5arrow4util7CerrLogD1Ev + 21
3 libarrow.18.0.0.dylib 0x0001070ed26c _ZN5arrow4util7CerrLogD0Ev + 28
4 libarrow.18.0.0.dylib 0x0001070ed152 _ZN5arrow4util8ArrowLogD2Ev + 82
5 libarrow.18.0.0.dylib 0x0001070ed185 _ZN5arrow4util8ArrowLogD1Ev + 21
6 purge_plasma_messages 0x00010431fe91 main + 2369
7 libdyld.dylib 0x7fff6650b7fd start + 1
8 ??? 0x0001 0x0 + 1
Abort trap: 6{quote}
and it kills the plasma-store-server.

What could I be doing wrong? Here is the code:

#include <iostream>
#include <memory>
#include <unordered_map>
#include <vector>
#include <plasma/client.h>

using namespace std;
using namespace plasma;

int main(int argc, char** argv)
{
  // Start up and connect a Plasma client.
  PlasmaClient client;
  ARROW_CHECK_OK(client.Connect("/tmp/plasma_store"));

  std::unordered_map<ObjectID, std::unique_ptr<ObjectTableEntry>> objectTable;
  ARROW_CHECK_OK(client.List(&objectTable));

  cout << "# of objects = " << objectTable.size() << endl;

  for (auto it = objectTable.begin(); it != objectTable.end(); ++it) {
    ObjectID objectId = it->first;
    auto objectEntry = it->second.get();
    string idString = objectId.binary();
    cout << "object id = " << idString <<
        ", device = " << objectEntry->device_num <<
        ", data_size = " << objectEntry->data_size <<
        ", metadata_size = " << objectEntry->metadata_size <<
        ", ref_count = " << objectEntry->ref_count <<
        endl;
    ARROW_CHECK_OK(client.Delete(vector<ObjectID>{objectId}));
  }
  ARROW_CHECK_OK(client.Disconnect());
}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ARROW-8633) [C++] Add ValidateAscii function

2020-04-29 Thread Yuqi Gu (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuqi Gu reassigned ARROW-8633:
--

Assignee: Yuqi Gu

> [C++] Add ValidateAscii function
> 
>
> Key: ARROW-8633
> URL: https://issues.apache.org/jira/browse/ARROW-8633
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Yuqi Gu
>Priority: Major
> Fix For: 1.0.0
>
>
> In some cases, we want to be able to check whether it's safe to use functions 
> that assume ASCII (like {{std::tolower}} or {{std::string::substr}}). This was 
> implemented in a PR for ARROW-6131 that was not merged.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8634) [Java] Create an example

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-8634:
--
Labels: pull-request-available  (was: )

> [Java] Create an example
> 
>
> Key: ARROW-8634
> URL: https://issues.apache.org/jira/browse/ARROW-8634
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Java
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Java implementation doesn't seem to have any documentation or examples on 
> how to get started with basic operations such as creating an array. Javadocs 
> exist but how do new users even know which class to look for?
> I would like to create an examples module and one simple example as a 
> starting point. I hope to have a PR soon.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8635) [R] test-filesystem.R takes ~40 seconds to run?

2020-04-29 Thread Neal Richardson (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095975#comment-17095975
 ] 

Neal Richardson commented on ARROW-8635:


Have you set this aws-sdk environment variable? 
https://github.com/apache/arrow/blob/master/ci/scripts/r_test.sh#L44-L46 
François found it, and it seems to help.

> [R] test-filesystem.R takes ~40 seconds to run?
> ---
>
> Key: ARROW-8635
> URL: https://issues.apache.org/jira/browse/ARROW-8635
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> {code}
> ✔ |  22   | Expressions
> ✔ | 107   | Feather [0.2 s]
> ✔ |   7   | Field
> ✔ |  40   | File system [38.1 s]
> ✔ |   6   | install_arrow()
> ✔ |  26   | JsonTableReader [0.1 s]
> ✔ |  24   | MessageReader
> ✔ |  12   | Message
> ✔ |  31   | Parquet file reading/writing [0.2 s]
> ⠏ |   0   | To/from Pythonvirtualenv: arrow-test
> {code}
> Is this expected? I assume it's related to S3 but that seems like a long 
> time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8635) [R] test-filesystem.R takes ~40 seconds to run?

2020-04-29 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8635:
---

 Summary: [R] test-filesystem.R takes ~40 seconds to run?
 Key: ARROW-8635
 URL: https://issues.apache.org/jira/browse/ARROW-8635
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Reporter: Wes McKinney
 Fix For: 1.0.0


{code}
✔ |  22   | Expressions
✔ | 107   | Feather [0.2 s]
✔ |   7   | Field
✔ |  40   | File system [38.1 s]
✔ |   6   | install_arrow()
✔ |  26   | JsonTableReader [0.1 s]
✔ |  24   | MessageReader
✔ |  12   | Message
✔ |  31   | Parquet file reading/writing [0.2 s]
⠏ |   0   | To/from Pythonvirtualenv: arrow-test
{code}

Is this expected? I assume it's related to S3 but that seems like a long time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8634) [Java] Create an example

2020-04-29 Thread Andy Grove (Jira)
Andy Grove created ARROW-8634:
-

 Summary: [Java] Create an example
 Key: ARROW-8634
 URL: https://issues.apache.org/jira/browse/ARROW-8634
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Java
Reporter: Andy Grove
Assignee: Andy Grove
 Fix For: 1.0.0


The Java implementation doesn't seem to have any documentation or examples on 
how to get started with basic operations such as creating an array. Javadocs 
exist but how do new users even know which class to look for?

I would like to create an examples module and one simple example as a starting 
point. I hope to have a PR soon.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-7893) [Developer][GLib] Document GLib development workflow when using conda environment on GTK-based Linux systems

2020-04-29 Thread Wes McKinney (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095966#comment-17095966
 ] 

Wes McKinney commented on ARROW-7893:
-

I found the solution to this. The problem occurs when using the {{pkg-config}} 
installed by conda rather than the pkg-config provided by the host Linux 
system. If someone is developing against the glib provided by conda, then using 
the conda-provided pkg-config should work.

So the fact that pkg-config is in 
https://github.com/apache/arrow/blob/master/ci/conda_env_cpp.yml is what's 
causing the problem. Running {{conda remove pkg-config}} resolved the issue I 
cited above. I'm not sure what can be documented better about this, but it's at 
least some conclusion.

> [Developer][GLib] Document GLib development workflow when using conda 
> environment on GTK-based Linux systems
> 
>
> Key: ARROW-7893
> URL: https://issues.apache.org/jira/browse/ARROW-7893
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Documentation, GLib
>Reporter: Wes McKinney
>Assignee: Kouhei Sutou
>Priority: Major
> Fix For: 1.0.0
>
>
> I periodically deal with annoying errors like:
> {code}
> checking for GLIB - version >= 2.32.4... 
> *** 'pkg-config --modversion glib-2.0' returned 2.58.3, but GLIB (2.56.4)
> *** was found! If pkg-config was correct, then it is best
> *** to remove the old version of GLib. You may also be able to fix the error
> *** by modifying your LD_LIBRARY_PATH enviroment variable, or by editing
> *** /etc/ld.so.conf. Make sure you have run ldconfig if that is
> *** required on your system.
> *** If pkg-config was wrong, set the environment variable PKG_CONFIG_PATH
> *** to point to the correct configuration files
> no
> configure: error: GLib isn't available
> make: *** No targets specified and no makefile found.  Stop.
> make: *** No rule to make target 'install'.  Stop.
> Traceback (most recent call last):
>   2: from /home/wesm/code/arrow/c_glib/test/run-test.rb:30:in `'
>   1: from /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in 
> `require'
> /usr/lib/ruby/2.5.0/rubygems/core_ext/kernel_require.rb:59:in `require': 
> cannot load such file -- gi (LoadError)
> {code}
> The problem is that I have one version of glib on my Linux system while 
> another in the activated conda environment, it seems that there is a conflict 
> even though {{$PKG_CONFIG_PATH}} is set to ignore system directories
> https://gist.github.com/wesm/e62bf4517468be78200e8dd6db0fc544



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-8287) [Rust] Arrow examples should use utility to print results

2020-04-29 Thread Andy Grove (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Grove resolved ARROW-8287.
---
Resolution: Fixed

Issue resolved by pull request 6972
[https://github.com/apache/arrow/pull/6972]

> [Rust] Arrow examples should use utility to print results
> -
>
> Key: ARROW-8287
> URL: https://issues.apache.org/jira/browse/ARROW-8287
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/arrow/pull/6773] added a utility for printing 
> record batches and the DataFusion examples were updated to use this. We 
> should now do the same for the Arrow examples. This will require moving the 
> utility method from the datafusion crate to the arrow crate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8622) [Rust] Parquet crate does not compile on aarch64

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-8622:
--
Labels: pull-request-available  (was: )

> [Rust] Parquet crate does not compile on aarch64
> 
>
> Key: ARROW-8622
> URL: https://issues.apache.org/jira/browse/ARROW-8622
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: Paddy Horan
>Assignee: R. Tyler Croy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8633) [C++] Add ValidateAscii function

2020-04-29 Thread Wes McKinney (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095920#comment-17095920
 ] 

Wes McKinney commented on ARROW-8633:
-

Testing (ignore)

> [C++] Add ValidateAscii function
> 
>
> Key: ARROW-8633
> URL: https://issues.apache.org/jira/browse/ARROW-8633
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> In some cases, we want to be able to check whether it's safe to use functions 
> that assume ASCII (like {{std::tolower}} or {{std::string::substr}}). This was 
> implemented in a PR for ARROW-6131 that was not merged.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (ARROW-8633) [C++] Add ValidateAscii function

2020-04-29 Thread Wes McKinney (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-8633:

Comment: was deleted

(was: Testing (ignore))

> [C++] Add ValidateAscii function
> 
>
> Key: ARROW-8633
> URL: https://issues.apache.org/jira/browse/ARROW-8633
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> In some cases, we want to be able to check whether it's safe to use functions 
> that assume ASCII (like {{std::tolower}} or {{std::string::substr}}). This was 
> implemented in a PR for ARROW-6131 that was not merged.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-6945) [Rust] Enable integration tests

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-6945:
--
Labels: pull-request-available  (was: )

> [Rust] Enable integration tests
> ---
>
> Key: ARROW-6945
> URL: https://issues.apache.org/jira/browse/ARROW-6945
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Integration, Rust
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use docker-compose to generate test files using the Java implementation and 
> then have Rust tests read them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8633) [C++] Add ValidateAscii function

2020-04-29 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8633:
---

 Summary: [C++] Add ValidateAscii function
 Key: ARROW-8633
 URL: https://issues.apache.org/jira/browse/ARROW-8633
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++
Reporter: Wes McKinney
 Fix For: 1.0.0


In some cases, we want to be able to check whether it's safe to use functions 
that assume ASCII (like {{std::tolower}} or {{std::string::substr}}). This was 
implemented in a PR for ARROW-6131 that was not merged.
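As a rough illustration of the intended semantics (a sketch only; the actual function would be a C++ utility, likely vectorized), validation amounts to checking that every byte is below 0x80:

{code:python}
def validate_ascii(buf: bytes) -> bool:
    # True if every byte is < 0x80, i.e. the buffer is pure ASCII and
    # byte-oriented helpers such as std::tolower are safe to apply.
    return all(b < 0x80 for b in buf)

assert validate_ascii(b"hello")
assert not validate_ascii("héllo".encode("utf-8"))
{code}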



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-8632) [C++] Fix conversion error warning in array_union_test.cc

2020-04-29 Thread Francois Saint-Jacques (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francois Saint-Jacques resolved ARROW-8632.
---
Resolution: Fixed

Issue resolved by pull request 7062
[https://github.com/apache/arrow/pull/7062]

> [C++] Fix conversion error warning in array_union_test.cc
> -
>
> Key: ARROW-8632
> URL: https://issues.apache.org/jira/browse/ARROW-8632
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.17.0
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/3257/job/c4f2kqcsm04gjd8u#L1074



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8632) [C++] Fix conversion error warning in array_union_test.cc

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-8632:
--
Labels: pull-request-available  (was: )

> [C++] Fix conversion error warning in array_union_test.cc
> -
>
> Key: ARROW-8632
> URL: https://issues.apache.org/jira/browse/ARROW-8632
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.17.0
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/3257/job/c4f2kqcsm04gjd8u#L1074



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8632) [C++] Fix conversion error warning in array_union_test.cc

2020-04-29 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-8632:
---

 Summary: [C++] Fix conversion error warning in array_union_test.cc
 Key: ARROW-8632
 URL: https://issues.apache.org/jira/browse/ARROW-8632
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.17.0
Reporter: Ben Kietzman
Assignee: Ben Kietzman
 Fix For: 1.0.0



https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/3257/job/c4f2kqcsm04gjd8u#L1074




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8631) [C++][Dataset] Add ConvertOptions and ReadOptions to CsvFileFormat

2020-04-29 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-8631:
---

 Summary: [C++][Dataset] Add ConvertOptions and ReadOptions to 
CsvFileFormat
 Key: ARROW-8631
 URL: https://issues.apache.org/jira/browse/ARROW-8631
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Affects Versions: 0.17.0
Reporter: Ben Kietzman
Assignee: Ben Kietzman
 Fix For: 1.0.0


https://github.com/apache/arrow/pull/7033 does not add ConvertOptions 
(including alternate spellings for null/true/false, etc) or ReadOptions 
(block_size, column name customization, etc). These will be helpful but will 
require some discussion to find the optimal way to integrate them with dataset::



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8630) [C++][Dataset] Pass schema including all materialized fields to catch CSV edge cases

2020-04-29 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-8630:
---

 Summary: [C++][Dataset] Pass schema including all materialized 
fields to catch CSV edge cases
 Key: ARROW-8630
 URL: https://issues.apache.org/jira/browse/ARROW-8630
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Affects Versions: 0.17.0
Reporter: Ben Kietzman
Assignee: Ben Kietzman
 Fix For: 1.0.0


see discussion here 
https://github.com/apache/arrow/pull/7033#discussion_r416941674

Fields filtered but not projected will revert to their inferred type, whatever 
their dataset's schema may be. This can cause validated filters to fail due to 
type disagreements



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8629) [Rust] Eliminate indirection of ZST allocations

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-8629:
--
Labels: pull-request-available  (was: )

> [Rust] Eliminate indirection of ZST allocations
> ---
>
> Key: ARROW-8629
> URL: https://issues.apache.org/jira/browse/ARROW-8629
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Affects Versions: 0.17.0
>Reporter: Mahmut Bulut
>Assignee: Mahmut Bulut
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, any array construction without data creates a zero-sized layout and 
> passes it to Rust's allocator API, so the request still falls through to the 
> underlying allocator even though nothing needs to be allocated.
> This issue is two-fold:
>  * First, this creates indirection and is arguably undefined behavior.
>  * Second, it degrades performance for SIMD merging, merging arrays, 
> constructing arrays, and the intermediate arrays created when doing operations 
> over them.
> The solution would be:
>  * Resolve the undefined behavior without a performance downside.
>  * Improve performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-7313) [C++] Add function for retrieving a scalar from an array slot

2020-04-29 Thread Ben Kietzman (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-7313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kietzman updated ARROW-7313:

Fix Version/s: 1.0.0

> [C++] Add function for retrieving a scalar from an array slot
> -
>
> Key: ARROW-7313
> URL: https://issues.apache.org/jira/browse/ARROW-7313
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Affects Versions: 0.15.1
>Reporter: Ben Kietzman
>Assignee: Ben Kietzman
>Priority: Major
> Fix For: 1.0.0
>
>
> It'd be useful to construct scalar values given an array and an index.
> {code}
> /* static */ std::shared_ptr<Scalar> Scalar::FromArray(const Array&, int64_t);
> {code}
> Since this is much less efficient than unboxing the entire array and 
> accessing its buffers directly, it should not be used in hot loops.
> [~kszucs] [~fsaintjacques]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8629) [Rust] Eliminate indirection of ZST allocations

2020-04-29 Thread Mahmut Bulut (Jira)
Mahmut Bulut created ARROW-8629:
---

 Summary: [Rust] Eliminate indirection of ZST allocations
 Key: ARROW-8629
 URL: https://issues.apache.org/jira/browse/ARROW-8629
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Affects Versions: 0.17.0
Reporter: Mahmut Bulut
Assignee: Mahmut Bulut


Currently, any array construction without data creates a zero-sized layout and 
passes it to Rust's allocator API, so the request still falls through to the 
underlying allocator even though nothing needs to be allocated.

This issue is two-fold:
 * First, this creates indirection and is arguably undefined behavior.
 * Second, it degrades performance for SIMD merging, merging arrays, 
constructing arrays, and the intermediate arrays created when doing operations 
over them.

The solution would be:
 * Resolve the undefined behavior without a performance downside.
 * Improve performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8627) [Rust] Invalid mem access in BufferBuilderTrait

2020-04-29 Thread Mahmut Bulut (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahmut Bulut updated ARROW-8627:

Description: 
Currently, there is an invalid memory access through the append_n method: it 
writes to a mutable location while multiple shared references exist. It happens 
when the benchmark code runs `bench_bool`.

Happens on rustc 1.44.0-nightly (45d050cde 2020-04-21).

 

bt shown below:

```
 * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS 
(code=1, address=0x1004e7000)
 * frame #0: 0x000100150d37 
builder-6a49123b1fedb178`_$LT$arrow..array..builder..BufferBuilder$LT$arrow..datatypes..BooleanType$GT$$u20$as$u20$arrow..array..builder..BufferBuilderTrait$LT$arrow..datatypes..BooleanType$GT$$GT$::append_n::h6ae4d34cca93d03c
 + 311
 frame #1: 0x00017303 
builder-6a49123b1fedb178`arrow::array::builder::PrimitiveBuilder$LT$T$GT$::append_slice::h8d33144acea1616b
 + 51
 frame #2: 0x00010001b143 
builder-6a49123b1fedb178`criterion::Bencher$LT$M$GT$::iter::hfcae173a53b56e6f + 
259
 frame #3: 0x00013136 
builder-6a49123b1fedb178`_$LT$criterion..routine..Function$LT$M$C$F$C$T$GT$$u20$as$u20$criterion..routine..Routine$LT$M$C$T$GT$$GT$::warm_up::h5b415f52c0951798
 + 102
 frame #4: 0x0001373b 
builder-6a49123b1fedb178`criterion::routine::Routine::sample::h2802012b9b92a2a5 
+ 203
 frame #5: 0x0001000287a2 
builder-6a49123b1fedb178`criterion::analysis::common::h1eabf5af2afe42e5 + 834
 frame #6: 0x000100023a83 
builder-6a49123b1fedb178`_$LT$criterion..benchmark..Benchmark$LT$M$GT$$u20$as$u20$criterion..benchmark..BenchmarkDefinition$LT$M$GT$$GT$::run::hf631a3f91617ae46
 + 1507
 frame #7: 0x0001000109b8 
builder-6a49123b1fedb178`builder::main::he83c09c3b2c8f318 + 216
 frame #8: 0x000100021c96 
builder-6a49123b1fedb178`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hfb404fc983af2389
 + 6
 frame #9: 0x0001001e9499 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] 
std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h096599b40842db82 
at rt.rs:52:13 [opt]
 frame #10: 0x0001001e948e 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] std::panicking::try::do_call::h1c9f73590350b657 at panicking.rs:331 
[opt]
 frame #11: 0x0001001e948e 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] std::panicking::try::hca6829be93a31f1b at panicking.rs:274 [opt]
 frame #12: 0x0001001e948e 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] std::panic::catch_unwind::hb3c8ad89db0960bd at panic.rs:394 [opt]
 frame #13: 0x0001001e948e 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 at 
rt.rs:51 [opt]
 frame #14: 0x000100010b49 builder-6a49123b1fedb178`main + 41
 frame #15: 0x7fff691c07fd libdyld.dylib`start + 1
 frame #16: 0x7fff691c07fd libdyld.dylib`start + 1

```

  was:
Currently, there is an invalid access happening through the append_n method to 
a mutable location with multiple shared refs. Happens when benchmark code 
executes with `bench_bool`.

bt shown below:

```
 * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS 
(code=1, address=0x1004e7000)
 * frame #0: 0x000100150d37 
builder-6a49123b1fedb178`_$LT$arrow..array..builder..BufferBuilder$LT$arrow..datatypes..BooleanType$GT$$u20$as$u20$arrow..array..builder..BufferBuilderTrait$LT$arrow..datatypes..BooleanType$GT$$GT$::append_n::h6ae4d34cca93d03c
 + 311
 frame #1: 0x00017303 
builder-6a49123b1fedb178`arrow::array::builder::PrimitiveBuilder$LT$T$GT$::append_slice::h8d33144acea1616b
 + 51
 frame #2: 0x00010001b143 
builder-6a49123b1fedb178`criterion::Bencher$LT$M$GT$::iter::hfcae173a53b56e6f + 
259
 frame #3: 0x00013136 
builder-6a49123b1fedb178`_$LT$criterion..routine..Function$LT$M$C$F$C$T$GT$$u20$as$u20$criterion..routine..Routine$LT$M$C$T$GT$$GT$::warm_up::h5b415f52c0951798
 + 102
 frame #4: 0x0001373b 
builder-6a49123b1fedb178`criterion::routine::Routine::sample::h2802012b9b92a2a5 
+ 203
 frame #5: 0x0001000287a2 
builder-6a49123b1fedb178`criterion::analysis::common::h1eabf5af2afe42e5 + 834
 frame #6: 0x000100023a83 
builder-6a49123b1fedb178`_$LT$criterion..benchmark..Benchmark$LT$M$GT$$u20$as$u20$criterion..benchmark..BenchmarkDefinition$LT$M$GT$$GT$::run::hf631a3f91617ae46
 + 1507
 frame #7: 0x0001000109b8 
builder-6a49123b1fedb178`builder::main::he83c09c3b2c8f318 + 216
 frame #8: 0x000100021c96 
builder-6a49123b1fedb178`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hfb404fc983af2389
 + 6
 frame #9: 0x0001001e9499 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] 
std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h096599b40842db82 
at rt.rs:52:13 [opt]
 frame #10: 0x0001001e948e 
builder-6a49123b

[jira] [Updated] (ARROW-8628) [Dev] Wrap docker-compose commands with archery

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-8628:
--
Labels: pull-request-available  (was: )

> [Dev] Wrap docker-compose commands with archery
> ---
>
> Key: ARROW-8628
> URL: https://issues.apache.org/jira/browse/ARROW-8628
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Developer Tools
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Build the image hierarchy automatically.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8628) [Dev] Wrap docker-compose commands with archery

2020-04-29 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8628:
--

 Summary: [Dev] Wrap docker-compose commands with archery
 Key: ARROW-8628
 URL: https://issues.apache.org/jira/browse/ARROW-8628
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Developer Tools
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
 Fix For: 1.0.0


Build the image hierarchy automatically.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8627) [Rust] Invalid mem access in BufferBuilderTrait

2020-04-29 Thread Mahmut Bulut (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahmut Bulut updated ARROW-8627:

Summary: [Rust] Invalid mem access in BufferBuilderTrait  (was: [Rust] 
Invalid access through append_n through BufferBuilderTrait)

> [Rust] Invalid mem access in BufferBuilderTrait
> ---
>
> Key: ARROW-8627
> URL: https://issues.apache.org/jira/browse/ARROW-8627
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust
>Reporter: Mahmut Bulut
>Priority: Major
>
> Currently, there is an invalid memory access through the append_n method: it 
> writes to a mutable location while multiple shared references exist. It 
> happens when the benchmark code runs `bench_bool`.
> bt shown below:
> ```
>  * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS 
> (code=1, address=0x1004e7000)
>  * frame #0: 0x000100150d37 
> builder-6a49123b1fedb178`_$LT$arrow..array..builder..BufferBuilder$LT$arrow..datatypes..BooleanType$GT$$u20$as$u20$arrow..array..builder..BufferBuilderTrait$LT$arrow..datatypes..BooleanType$GT$$GT$::append_n::h6ae4d34cca93d03c
>  + 311
>  frame #1: 0x00017303 
> builder-6a49123b1fedb178`arrow::array::builder::PrimitiveBuilder$LT$T$GT$::append_slice::h8d33144acea1616b
>  + 51
>  frame #2: 0x00010001b143 
> builder-6a49123b1fedb178`criterion::Bencher$LT$M$GT$::iter::hfcae173a53b56e6f 
> + 259
>  frame #3: 0x00013136 
> builder-6a49123b1fedb178`_$LT$criterion..routine..Function$LT$M$C$F$C$T$GT$$u20$as$u20$criterion..routine..Routine$LT$M$C$T$GT$$GT$::warm_up::h5b415f52c0951798
>  + 102
>  frame #4: 0x0001373b 
> builder-6a49123b1fedb178`criterion::routine::Routine::sample::h2802012b9b92a2a5
>  + 203
>  frame #5: 0x0001000287a2 
> builder-6a49123b1fedb178`criterion::analysis::common::h1eabf5af2afe42e5 + 834
>  frame #6: 0x000100023a83 
> builder-6a49123b1fedb178`_$LT$criterion..benchmark..Benchmark$LT$M$GT$$u20$as$u20$criterion..benchmark..BenchmarkDefinition$LT$M$GT$$GT$::run::hf631a3f91617ae46
>  + 1507
>  frame #7: 0x0001000109b8 
> builder-6a49123b1fedb178`builder::main::he83c09c3b2c8f318 + 216
>  frame #8: 0x000100021c96 
> builder-6a49123b1fedb178`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hfb404fc983af2389
>  + 6
>  frame #9: 0x0001001e9499 
> builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
> [inlined] 
> std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h096599b40842db82 
> at rt.rs:52:13 [opt]
>  frame #10: 0x0001001e948e 
> builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
> [inlined] std::panicking::try::do_call::h1c9f73590350b657 at panicking.rs:331 
> [opt]
>  frame #11: 0x0001001e948e 
> builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
> [inlined] std::panicking::try::hca6829be93a31f1b at panicking.rs:274 [opt]
>  frame #12: 0x0001001e948e 
> builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
> [inlined] std::panic::catch_unwind::hb3c8ad89db0960bd at panic.rs:394 [opt]
>  frame #13: 0x0001001e948e 
> builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 at 
> rt.rs:51 [opt]
>  frame #14: 0x000100010b49 builder-6a49123b1fedb178`main + 41
>  frame #15: 0x7fff691c07fd libdyld.dylib`start + 1
>  frame #16: 0x7fff691c07fd libdyld.dylib`start + 1
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8627) [Rust] Invalid access through append_n through BufferBuilderTrait

2020-04-29 Thread Mahmut Bulut (Jira)
Mahmut Bulut created ARROW-8627:
---

 Summary: [Rust] Invalid access through append_n through 
BufferBuilderTrait
 Key: ARROW-8627
 URL: https://issues.apache.org/jira/browse/ARROW-8627
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust
Reporter: Mahmut Bulut


Currently, there is an invalid memory access through the append_n method: it 
writes to a mutable location while multiple shared references exist. It happens 
when the benchmark code runs `bench_bool`.

bt shown below:

```
 * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS 
(code=1, address=0x1004e7000)
 * frame #0: 0x000100150d37 
builder-6a49123b1fedb178`_$LT$arrow..array..builder..BufferBuilder$LT$arrow..datatypes..BooleanType$GT$$u20$as$u20$arrow..array..builder..BufferBuilderTrait$LT$arrow..datatypes..BooleanType$GT$$GT$::append_n::h6ae4d34cca93d03c
 + 311
 frame #1: 0x00017303 
builder-6a49123b1fedb178`arrow::array::builder::PrimitiveBuilder$LT$T$GT$::append_slice::h8d33144acea1616b
 + 51
 frame #2: 0x00010001b143 
builder-6a49123b1fedb178`criterion::Bencher$LT$M$GT$::iter::hfcae173a53b56e6f + 
259
 frame #3: 0x00013136 
builder-6a49123b1fedb178`_$LT$criterion..routine..Function$LT$M$C$F$C$T$GT$$u20$as$u20$criterion..routine..Routine$LT$M$C$T$GT$$GT$::warm_up::h5b415f52c0951798
 + 102
 frame #4: 0x0001373b 
builder-6a49123b1fedb178`criterion::routine::Routine::sample::h2802012b9b92a2a5 
+ 203
 frame #5: 0x0001000287a2 
builder-6a49123b1fedb178`criterion::analysis::common::h1eabf5af2afe42e5 + 834
 frame #6: 0x000100023a83 
builder-6a49123b1fedb178`_$LT$criterion..benchmark..Benchmark$LT$M$GT$$u20$as$u20$criterion..benchmark..BenchmarkDefinition$LT$M$GT$$GT$::run::hf631a3f91617ae46
 + 1507
 frame #7: 0x0001000109b8 
builder-6a49123b1fedb178`builder::main::he83c09c3b2c8f318 + 216
 frame #8: 0x000100021c96 
builder-6a49123b1fedb178`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::hfb404fc983af2389
 + 6
 frame #9: 0x0001001e9499 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] 
std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::h096599b40842db82 
at rt.rs:52:13 [opt]
 frame #10: 0x0001001e948e 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] std::panicking::try::do_call::h1c9f73590350b657 at panicking.rs:331 
[opt]
 frame #11: 0x0001001e948e 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] std::panicking::try::hca6829be93a31f1b at panicking.rs:274 [opt]
 frame #12: 0x0001001e948e 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 
[inlined] std::panic::catch_unwind::hb3c8ad89db0960bd at panic.rs:394 [opt]
 frame #13: 0x0001001e948e 
builder-6a49123b1fedb178`std::rt::lang_start_internal::h434140244059d623 at 
rt.rs:51 [opt]
 frame #14: 0x000100010b49 builder-6a49123b1fedb178`main + 41
 frame #15: 0x7fff691c07fd libdyld.dylib`start + 1
 frame #16: 0x7fff691c07fd libdyld.dylib`start + 1

```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (ARROW-8626) [C++] Implement "round robin" scheduler interface to fixed-size ThreadPool

2020-04-29 Thread Wes McKinney (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095585#comment-17095585
 ] 

Wes McKinney edited comment on ARROW-8626 at 4/29/20, 4:08 PM:
---

This work should also pave the way for other kinds of fairer resource 
allocation schemes. For example, round robin performs poorly when two 
consumers' tasks are greatly unequal in cost. Imagine CPU tasks that take 1000x 
as long for one consumer as for the other -- eventually the first (costly 
per-task) consumer is running on all threads while the second (less costly 
per-task) consumer is left waiting.


was (Author: wesmckinn):
This work should also be able to give way to other kinds of fairer resource 
allocation schemes

> [C++] Implement "round robin" scheduler interface to fixed-size ThreadPool 
> ---
>
> Key: ARROW-8626
> URL: https://issues.apache.org/jira/browse/ARROW-8626
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> Currently, when submitting tasks to a thread pool, they are all commingled in 
> a common queue. When a new task submitter shows up, they must wait in the 
> back of the line behind all other queued tasks.
> A simple alternative to this would be round-robin scheduling, where each new 
> consumer is assigned a unique integer id, and the scheduler / thread pool 
> internally maintains the tasks associated with the consumer in separate 
> queues. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8626) [C++] Implement "round robin" scheduler interface to fixed-size ThreadPool

2020-04-29 Thread Wes McKinney (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095585#comment-17095585
 ] 

Wes McKinney commented on ARROW-8626:
-

This work should also be able to give way to other kinds of fairer resource 
allocation schemes

> [C++] Implement "round robin" scheduler interface to fixed-size ThreadPool 
> ---
>
> Key: ARROW-8626
> URL: https://issues.apache.org/jira/browse/ARROW-8626
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> Currently, when submitting tasks to a thread pool, they are all commingled in 
> a common queue. When a new task submitter shows up, they must wait in the 
> back of the line behind all other queued tasks.
> A simple alternative to this would be round-robin scheduling, where each new 
> consumer is assigned a unique integer id, and the scheduler / thread pool 
> internally maintains the tasks associated with the consumer in separate 
> queues. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8626) [C++] Implement "round robin" scheduler interface to fixed-size ThreadPool

2020-04-29 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8626:
---

 Summary: [C++] Implement "round robin" scheduler interface to 
fixed-size ThreadPool 
 Key: ARROW-8626
 URL: https://issues.apache.org/jira/browse/ARROW-8626
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++
Reporter: Wes McKinney
 Fix For: 1.0.0


Currently, when submitting tasks to a thread pool, they are all commingled in a 
common queue. When a new task submitter shows up, they must wait in the back of 
the line behind all other queued tasks.

A simple alternative to this would be round-robin scheduling, where each new 
consumer is assigned a unique integer id, and the scheduler / thread pool 
internally maintains the tasks associated with the consumer in separate queues. 
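
A minimal single-threaded sketch of that queueing idea (names are hypothetical 
and illustrative only; this is not the Arrow ThreadPool API):

{code:cpp}
#include <deque>
#include <functional>
#include <iostream>
#include <map>

// Each consumer gets its own queue; the scheduler pops at most one task per
// consumer per round, so a newly arriving consumer is not stuck behind the
// whole backlog of the others.
class RoundRobinScheduler {
 public:
  int AddConsumer() {
    queues_[next_id_];  // create an empty queue for the new consumer
    return next_id_++;
  }

  void Submit(int consumer_id, std::function<void()> task) {
    queues_[consumer_id].push_back(std::move(task));
  }

  // Run one full round: at most one task from each non-empty queue.
  void RunOneRound() {
    for (auto& entry : queues_) {
      auto& queue = entry.second;
      if (!queue.empty()) {
        auto task = std::move(queue.front());
        queue.pop_front();
        task();
      }
    }
  }

 private:
  int next_id_ = 0;
  std::map<int, std::deque<std::function<void()>>> queues_;
};

int main() {
  RoundRobinScheduler scheduler;
  int a = scheduler.AddConsumer();
  int b = scheduler.AddConsumer();
  for (int i = 0; i < 3; ++i) {
    scheduler.Submit(a, [i] { std::cout << "A" << i << " "; });
  }
  scheduler.Submit(b, [] { std::cout << "B0 "; });
  scheduler.RunOneRound();  // prints: A0 B0
  scheduler.RunOneRound();  // prints: A1
  std::cout << std::endl;
  return 0;
}
{code}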



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8625) Minimum working example of `UnionArray.from_buffers()` method

2020-04-29 Thread Anish Biswas (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17095564#comment-17095564
 ] 

Anish Biswas commented on ARROW-8625:
-

This is exactly what I need, thank you so much. 

> Minimum working example of `UnionArray.from_buffers()` method
> -
>
> Key: ARROW-8625
> URL: https://issues.apache.org/jira/browse/ARROW-8625
> Project: Apache Arrow
>  Issue Type: Wish
>Reporter: Anish Biswas
>Priority: Major
>
> Hi, it would be great if someone could provide me with a minimum working 
> example of how to arrange the buffers in `pyarrow.UnionArray.from_buffers` 
> method.Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (ARROW-8625) Minimum working example of `UnionArray.from_buffers()` method

2020-04-29 Thread Anish Biswas (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anish Biswas closed ARROW-8625.
---
Resolution: Fixed

> Minimum working example of `UnionArray.from_buffers()` method
> -
>
> Key: ARROW-8625
> URL: https://issues.apache.org/jira/browse/ARROW-8625
> Project: Apache Arrow
>  Issue Type: Wish
>Reporter: Anish Biswas
>Priority: Major
>
> Hi, it would be great if someone could provide me with a minimum working 
> example of how to arrange the buffers in `pyarrow.UnionArray.from_buffers` 
> method.Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (ARROW-8625) Minimum working example of `UnionArray.from_buffers()` method

2020-04-29 Thread Wes McKinney (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1709#comment-1709
 ] 

Wes McKinney commented on ARROW-8625:
-

Could you clarify what you're looking for? This is documented here

https://github.com/apache/arrow/blob/master/docs/source/format/Columnar.rst#buffer-listing-for-each-layout

> Minimum working example of `UnionArray.from_buffers()` method
> -
>
> Key: ARROW-8625
> URL: https://issues.apache.org/jira/browse/ARROW-8625
> Project: Apache Arrow
>  Issue Type: Wish
>Reporter: Anish Biswas
>Priority: Major
>
> Hi, it would be great if someone could provide me with a minimum working 
> example of how to arrange the buffers in `pyarrow.UnionArray.from_buffers` 
> method.Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8625) Minimum working example of `UnionArray.from_buffers()` method

2020-04-29 Thread Anish Biswas (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anish Biswas updated ARROW-8625:

Description: Hi, it would be great if someone could provide me with a 
minimum working example of how to arrange the buffers in 
`pyarrow.UnionArray.from_buffers` method.Thanks!  (was: Hi, it would be great 
if someone could provide me with a minimum working example of how to arrange 
the buffers in `pyarrow.UnionArray.from_buffers` method. Thanks!)

> Minimum working example of `UnionArray.from_buffers()` method
> -
>
> Key: ARROW-8625
> URL: https://issues.apache.org/jira/browse/ARROW-8625
> Project: Apache Arrow
>  Issue Type: Wish
>Reporter: Anish Biswas
>Priority: Major
>
> Hi, it would be great if someone could provide me with a minimum working 
> example of how to arrange the buffers in `pyarrow.UnionArray.from_buffers` 
> method.Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8625) Minimum working example of `UnionArray.from_buffers()` method

2020-04-29 Thread Anish Biswas (Jira)
Anish Biswas created ARROW-8625:
---

 Summary: Minimum working example of `UnionArray.from_buffers()` 
method
 Key: ARROW-8625
 URL: https://issues.apache.org/jira/browse/ARROW-8625
 Project: Apache Arrow
  Issue Type: Wish
Reporter: Anish Biswas


Hi, it would be great if someone could provide me with a minimum working 
example of how to arrange the buffers in `pyarrow.UnionArray.from_buffers` 
method. Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8624) [Website] Install page should mention arrow-dataset packages

2020-04-29 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-8624:
---
Description: 
I've seen a few reports like [https://github.com/apache/arrow/issues/7055], 
where the user reports that they've installed the arrow system packages, we can 
see that they exist, but {{pkg-config}} reports that it doesn't have them. I 
think this is because {{-larrow_dataset}} isn't found. As the output on that 
issue shows, while arrow core headers and libraries are there, arrow_dataset is 
not.

-Searching through the packaging scripts (such as 
[https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in]),
 while there is some metadata about a dataset package, I see that 
ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.- So 
apparently we are building it, but we aren't documenting how to get it.

  was:
I've seen a few reports like https://github.com/apache/arrow/issues/7055, where 
the user reports that they've installed the arrow system packages, we can see 
that they exist, but {{pkg-config}} reports that it doesn't have them. I think 
this is because {{-larrow_dataset}} isn't found. As the output on that issue 
shows, while arrow core headers and libraries are there, arrow_dataset is not.

~~Searching through the packaging scripts (such as 
https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in),
 while there is some metadata about a dataset package, I see that 
ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.~~ So 
apparently we are building it, but we aren't documenting how to get it. 


> [Website] Install page should mention arrow-dataset packages
> 
>
> Key: ARROW-8624
> URL: https://issues.apache.org/jira/browse/ARROW-8624
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Affects Versions: 0.17.0
>Reporter: Neal Richardson
>Priority: Critical
>
> I've seen a few reports like [https://github.com/apache/arrow/issues/7055], 
> where the user reports that they've installed the arrow system packages, we 
> can see that they exist, but {{pkg-config}} reports that it doesn't have 
> them. I think this is because {{-larrow_dataset}} isn't found. As the output 
> on that issue shows, while arrow core headers and libraries are there, 
> arrow_dataset is not.
> -Searching through the packaging scripts (such as 
> [https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in]),
>  while there is some metadata about a dataset package, I see that 
> ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.- So 
> apparently we are building it, but we aren't documenting how to get it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (ARROW-8624) [Packaging] Linux system packages aren't building with ARROW_DATASET=ON

2020-04-29 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson reopened ARROW-8624:


> [Packaging] Linux system packages aren't building with ARROW_DATASET=ON
> ---
>
> Key: ARROW-8624
> URL: https://issues.apache.org/jira/browse/ARROW-8624
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Affects Versions: 0.17.0
>Reporter: Neal Richardson
>Priority: Critical
>
> I've seen a few reports like https://github.com/apache/arrow/issues/7055, 
> where the user reports that they've installed the arrow system packages, we 
> can see that they exist, but {{pkg-config}} reports that it doesn't have 
> them. I think this is because {{-larrow_dataset}} isn't found. As the output 
> on that issue shows, while arrow core headers and libraries are there, 
> arrow_dataset is not.
> Searching through the packaging scripts (such as 
> https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in),
>  while there is some metadata about a dataset package, I see that 
> ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8624) [Website] Install page should mention arrow-dataset packages

2020-04-29 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-8624:
---
Summary: [Website] Install page should mention arrow-dataset packages  
(was: [Packaging] Linux system packages aren't building with ARROW_DATASET=ON)

> [Website] Install page should mention arrow-dataset packages
> 
>
> Key: ARROW-8624
> URL: https://issues.apache.org/jira/browse/ARROW-8624
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Affects Versions: 0.17.0
>Reporter: Neal Richardson
>Priority: Critical
>
> I've seen a few reports like https://github.com/apache/arrow/issues/7055, 
> where the user reports that they've installed the arrow system packages, we 
> can see that they exist, but {{pkg-config}} reports that it doesn't have 
> them. I think this is because {{-larrow_dataset}} isn't found. As the output 
> on that issue shows, while arrow core headers and libraries are there, 
> arrow_dataset is not.
> Searching through the packaging scripts (such as 
> https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in),
>  while there is some metadata about a dataset package, I see that 
> ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8624) [Website] Install page should mention arrow-dataset packages

2020-04-29 Thread Neal Richardson (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neal Richardson updated ARROW-8624:
---
Description: 
I've seen a few reports like https://github.com/apache/arrow/issues/7055, where 
the user reports that they've installed the arrow system packages, we can see 
that they exist, but {{pkg-config}} reports that it doesn't have them. I think 
this is because {{-larrow_dataset}} isn't found. As the output on that issue 
shows, while arrow core headers and libraries are there, arrow_dataset is not.

~~Searching through the packaging scripts (such as 
https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in),
 while there is some metadata about a dataset package, I see that 
ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.~~ So 
apparently we are building it, but we aren't documenting how to get it. 

  was:
I've seen a few reports like https://github.com/apache/arrow/issues/7055, where 
the user reports that they've installed the arrow system packages, we can see 
that they exist, but {{pkg-config}} reports that it doesn't have them. I think 
this is because {{-larrow_dataset}} isn't found. As the output on that issue 
shows, while arrow core headers and libraries are there, arrow_dataset is not.

Searching through the packaging scripts (such as 
https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in),
 while there is some metadata about a dataset package, I see that 
ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. 


> [Website] Install page should mention arrow-dataset packages
> 
>
> Key: ARROW-8624
> URL: https://issues.apache.org/jira/browse/ARROW-8624
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Affects Versions: 0.17.0
>Reporter: Neal Richardson
>Priority: Critical
>
> I've seen a few reports like https://github.com/apache/arrow/issues/7055, 
> where the user reports that they've installed the arrow system packages, we 
> can see that they exist, but {{pkg-config}} reports that it doesn't have 
> them. I think this is because {{-larrow_dataset}} isn't found. As the output 
> on that issue shows, while arrow core headers and libraries are there, 
> arrow_dataset is not.
> ~~Searching through the packaging scripts (such as 
> https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in),
>  while there is some metadata about a dataset package, I see that 
> ARROW_DATASET=ON is not set anywhere, so I don't think we're building it.~~ 
> So apparently we are building it, but we aren't documenting how to get it. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (ARROW-8624) [Packaging] Linux system packages aren't building with ARROW_DATASET=ON

2020-04-29 Thread Wes McKinney (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney closed ARROW-8624.
---
Resolution: Not A Problem

ARROW_DATASET is enabled by ARROW_PYTHON=ON. The dataset libraries are packaged 
separately from the main Arrow libraries, so the user needs to {{yum install 
arrow-dataset-devel}}

> [Packaging] Linux system packages aren't building with ARROW_DATASET=ON
> ---
>
> Key: ARROW-8624
> URL: https://issues.apache.org/jira/browse/ARROW-8624
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Affects Versions: 0.17.0
>Reporter: Neal Richardson
>Priority: Critical
>
> I've seen a few reports like https://github.com/apache/arrow/issues/7055, 
> where the user reports that they've installed the arrow system packages, we 
> can see that they exist, but {{pkg-config}} reports that it doesn't have 
> them. I think this is because {{-larrow_dataset}} isn't found. As the output 
> on that issue shows, while arrow core headers and libraries are there, 
> arrow_dataset is not.
> Searching through the packaging scripts (such as 
> https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in),
>  while there is some metadata about a dataset package, I see that 
> ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8624) [Packaging] Linux system packages aren't building with ARROW_DATASET=ON

2020-04-29 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-8624:
--

 Summary: [Packaging] Linux system packages aren't building with 
ARROW_DATASET=ON
 Key: ARROW-8624
 URL: https://issues.apache.org/jira/browse/ARROW-8624
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging
Affects Versions: 0.17.0
Reporter: Neal Richardson


I've seen a few reports like https://github.com/apache/arrow/issues/7055, where 
the user reports that they've installed the arrow system packages, we can see 
that they exist, but {{pkg-config}} reports that it doesn't have them. I think 
this is because {{-larrow_dataset}} isn't found. As the output on that issue 
shows, while arrow core headers and libraries are there, arrow_dataset is not.

Searching through the packaging scripts (such as 
https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in),
 while there is some metadata about a dataset package, I see that 
ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8623) [C++][Gandiva] Reduce use of Boost, remove Boost headers from header files

2020-04-29 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8623:
---

 Summary: [C++][Gandiva] Reduce use of Boost, remove Boost headers 
from header files
 Key: ARROW-8623
 URL: https://issues.apache.org/jira/browse/ARROW-8623
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, C++ - Gandiva
Reporter: Wes McKinney
 Fix For: 1.0.0


Boost is currently a transitive dependency of many of Gandiva's public header 
files. I suggest the following:

* Do not include Boost transitively in any installed header file
* Reduce usages of Boost altogether

On the latter point, most usages of Boost can be trimmed by having a 
{{hash_combine}} function inside the Arrow codebase. See results of grepping 
the codebase

https://gist.github.com/wesm/190006d91628e6bf7c04deb596a52cff
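
For reference, a standalone {{hash_combine}} in the spirit of 
boost::hash_combine is small; a minimal sketch (illustrative only, not 
necessarily the helper Arrow would add):

{code:cpp}
#include <cstddef>
#include <functional>
#include <iostream>
#include <string>

// Mixes the hash of `value` into `seed`, mirroring the well-known
// boost::hash_combine recipe.
template <typename T>
void hash_combine(std::size_t& seed, const T& value) {
  seed ^= std::hash<T>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

int main() {
  std::size_t seed = 0;
  hash_combine(seed, std::string("gandiva"));
  hash_combine(seed, 42);
  std::cout << seed << std::endl;
  return 0;
}
{code}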

It seems that Boost cannot be easily eliminated altogether at the present 
moment because of a use of Boost.Multiprecision ({{int256_t}}). At some point 
someone may want to implement sufficient 256-bit integer functions so that we 
don't have to depend on Boost.Multiprecision



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8619) [C++] Use distinct Type::type values for interval types

2020-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-8619:
--
Labels: pull-request-available  (was: )

> [C++] Use distinct Type::type values for interval types
> ---
>
> Key: ARROW-8619
> URL: https://issues.apache.org/jira/browse/ARROW-8619
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a breaking API change, but {{MonthIntervalType}} and 
> {{DayTimeIntervalType}} are different data types (and have different value 
> sizes, which is not true of timestamps) and thus should be distinguished in 
> the same way that DATE32 / DATE64 are distinguished, or TIME32 / TIME64 are 
> distinguished
> This may nominally simplify function / kernel dispatch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (ARROW-8622) [Rust] Parquet crate does not compile on aarch64

2020-04-29 Thread Paddy Horan (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paddy Horan reassigned ARROW-8622:
--

Assignee: R. Tyler Croy

> [Rust] Parquet crate does not compile on aarch64
> 
>
> Key: ARROW-8622
> URL: https://issues.apache.org/jira/browse/ARROW-8622
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: Paddy Horan
>Assignee: R. Tyler Croy
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8622) [Rust] Parquet crate does not compile on aarch64

2020-04-29 Thread Paddy Horan (Jira)
Paddy Horan created ARROW-8622:
--

 Summary: [Rust] Parquet crate does not compile on aarch64
 Key: ARROW-8622
 URL: https://issues.apache.org/jira/browse/ARROW-8622
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Rust
Reporter: Paddy Horan






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8621) [Go] Add Module support by creating tags

2020-04-29 Thread Kyle Brandt (Jira)
Kyle Brandt created ARROW-8621:
--

 Summary: [Go] Add Module support by creating tags
 Key: ARROW-8621
 URL: https://issues.apache.org/jira/browse/ARROW-8621
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Go
Reporter: Kyle Brandt


Arrow has a go.mod, but the go modules system expects a certain git tag for Go 
modules to work.

Based on 
[https://github.com/golang/go/wiki/Modules#faqs--multi-module-repositories] I 
believe the tag would be 
{code}
go/arrow/v0.17.0
{code}


Currently:


{code}
$ go get github.com/apache/arrow/go/arrow@v0.17.0 
go get github.com/apache/arrow/go/arrow@v0.17.0: 
github.com/apache/arrow/go/arrow@v0.17.0: invalid version: unknown revision 
go/arrow/v0.17.0
{code}





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (ARROW-8620) arrow header compiler error using nvcc

2020-04-29 Thread Wes McKinney (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney closed ARROW-8620.
---
Resolution: Duplicate

Dup of ARROW-8608

> arrow header compiler error using nvcc
> --
>
> Key: ARROW-8620
> URL: https://issues.apache.org/jira/browse/ARROW-8620
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.17.0
> Environment: ubuntu 18.04
> nvcc 9.1
>Reporter: Committer
>Priority: Minor
>
> I am using the arrow api from the document
> https://arrow.apache.org/docs/developers/cpp/building.html#system-setup
> and it works fine.
> When I combine my CUDA code with the arrow headers, the error appears:
> nvcc main.cu -o -larrow
> Both my arrow code and my CUDA code compile correctly on their own; put them 
> together and the error appears.
>  
> {code:java}
> /usr/local/include/arrow/vendored/variant.hpp:2381:109: error: expansion 
> pattern 'Dummy' contains no argument packs
>  template
>
> /usr/local/include/arrow/vendored/variant.hpp:2381:278: error: template 
> argument 3 is invalid
>  template
>   
>   
> 
> /usr/local/include/arrow/vendored/variant.hpp:2381:316: error: expansion 
> pattern 'Dummy' contains no argument packs
>  template
>   
>   
>   
> 
> /usr/local/include/arrow/vendored/variant.hpp:2381:491: error: template 
> argument 2 is invalid
>  template
>   
>   
>   
>   
>   
>
> /usr/local/include/arrow/vendored/variant.hpp:2381:493: error: template 
> argument 1 is invalid
>  template
>   
>   
>   
>   
>   
>  
> /usr/local/include/arrow/vendored/variant.hpp:2381:493: error: template 
> argument 2 is invalid
> /usr/local/include/arrow/vendored/variant.hpp:2381:507: error: template 
> argument 1 is invalid
>  template  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8620) arrow header compiler error using nvcc

2020-04-29 Thread Ming (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming updated ARROW-8620:

Description: 
I am using the arrow api from the document
https://arrow.apache.org/docs/developers/cpp/building.html#system-setup
and it works fine.
When I combine my CUDA code with the arrow headers, the error appears:
nvcc main.cu -o -larrow
Both my arrow code and my CUDA code compile correctly on their own; put them 
together and the error appears.
 

{code:java}
/usr/local/include/arrow/vendored/variant.hpp:2381:109: error: expansion 
pattern 'Dummy' contains no argument packs
 template 
{code}

> arrow header compiler error using nvcc
> --
>
> Key: ARROW-8620
> URL: https://issues.apache.org/jira/browse/ARROW-8620
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.17.0
> Environment: ubuntu 18.04
> nvcc 9.1
>Reporter: Ming
>Priority: Minor
>
> I am using the arrow api from the document
> https://arrow.apache.org/docs/developers/cpp/building.html#system-setup
> and it works fine.
> When I combine my CUDA code with the arrow headers, the error appears:
> nvcc main.cu -o -larrow
> Both my arrow code and my CUDA code compile correctly on their own; put them 
> together and the error appears.
>  
> {code:java}
> /usr/local/include/arrow/vendored/variant.hpp:2381:109: error: expansion 
> pattern 'Dummy' contains no argument packs
>  template
>
> /usr/local/include/arrow/vendored/variant.hpp:2381:278: error: template 
> argument 3 is invalid
>  template
>   
>   
> 
> /usr/local/include/arrow/vendored/variant.hpp:2381:316: error: expansion 
> pattern 'Dummy' contains no argument packs
>  template
>   
>   
>   
> 
> /usr/local/include/arrow/vendored/variant.hpp:2381:491: error: template 
> argument 2 is invalid
>  template
>   
>   
>   
>   
>   
>
> /usr/local/include/arrow/vendored/variant.hpp:2381:493: error: template 
> argument 1 is invalid
>  template
>   
>   
>   
>   
>   
>  
> /usr/local/include/arrow/vendored/variant.hpp:2381:493: error: template 
> argument 2 is invalid
> /usr/local/include/arrow/vendored/variant.hpp:2381:507: error: template 
> argument 1 is invalid
>  template  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8620) arrow header compiler error using nvcc

2020-04-29 Thread Ming (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming updated ARROW-8620:

Description: 
nvcc main.cu -o -larrow
my code compiles correctly,
but after adding the arrow header (arrow/api.h) to my code the error appears
{code:java}
/usr/local/include/arrow/vendored/variant.hpp:2381:109: error: expansion 
pattern 'Dummy' contains no argument packs
 template 
{code}

> arrow header compiler error using nvcc
> --
>
> Key: ARROW-8620
> URL: https://issues.apache.org/jira/browse/ARROW-8620
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.17.0
> Environment: ubuntu 18.04
> nvcc 9.1
>Reporter: Ming
>Priority: Minor
>
> nvcc main.cu -o -larrow
> my code compiles correctly,
> but after adding the arrow header (arrow/api.h) to my code the error appears
> {code:java}
> /usr/local/include/arrow/vendored/variant.hpp:2381:109: error: expansion 
> pattern 'Dummy' contains no argument packs
>  template
>
> /usr/local/include/arrow/vendored/variant.hpp:2381:278: error: template 
> argument 3 is invalid
>  template
>   
>   
> 
> /usr/local/include/arrow/vendored/variant.hpp:2381:316: error: expansion 
> pattern 'Dummy' contains no argument packs
>  template
>   
>   
>   
> 
> /usr/local/include/arrow/vendored/variant.hpp:2381:491: error: template 
> argument 2 is invalid
>  template
>   
>   
>   
>   
>   
>
> /usr/local/include/arrow/vendored/variant.hpp:2381:493: error: template 
> argument 1 is invalid
>  template
>   
>   
>   
>   
>   
>  
> /usr/local/include/arrow/vendored/variant.hpp:2381:493: error: template 
> argument 2 is invalid
> /usr/local/include/arrow/vendored/variant.hpp:2381:507: error: template 
> argument 1 is invalid
>  template  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-8620) arrow header compiler error using nvcc

2020-04-29 Thread Ming (Jira)
Ming created ARROW-8620:
---

 Summary: arrow header compiler error using nvcc
 Key: ARROW-8620
 URL: https://issues.apache.org/jira/browse/ARROW-8620
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.17.0
 Environment: ubuntu 18.04
nvcc 9.1
Reporter: Ming


{code:java}
/usr/local/include/arrow/vendored/variant.hpp:2381:109: error: expansion 
pattern 'Dummy' contains no argument packs
 template 
{code}

[jira] [Resolved] (ARROW-8597) [Rust] arrow crate lint and readability improvements

2020-04-29 Thread Neville Dipale (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neville Dipale resolved ARROW-8597.
---
Fix Version/s: 1.0.0
   Resolution: Fixed

Issue resolved by pull request 7042
[https://github.com/apache/arrow/pull/7042]

> [Rust] arrow crate lint and readability improvements
> 
>
> Key: ARROW-8597
> URL: https://issues.apache.org/jira/browse/ARROW-8597
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Drazen Urch
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Found a bunch of pedantic issues while reading through the Rust arrow crate 
> code, and then found a few more while running clippy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (ARROW-8609) [C++] ORC JNI bridge crashed on null arrow buffer

2020-04-29 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-8609.

Fix Version/s: 1.0.0
   Resolution: Fixed

Issue resolved by pull request 7048
[https://github.com/apache/arrow/pull/7048]

> [C++] ORC JNI bridge crashed on null arrow buffer
> -
>
> Key: ARROW-8609
> URL: https://issues.apache.org/jira/browse/ARROW-8609
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Yuan Zhou
>Assignee: Yuan Zhou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> https://github.com/apache/arrow/blob/master/cpp/src/jni/orc/jni_wrapper.cpp#L278-L281
> We should check whether the arrow buffer is null, and pass the right value to 
> the constructor. 
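
A minimal sketch of that kind of guard (hypothetical names, not the actual 
jni_wrapper.cpp code): derive a safe (data, size) pair before handing it to the 
constructor, so a null buffer no longer crashes the bridge.

{code:cpp}
#include <arrow/buffer.h>

#include <cstdint>
#include <iostream>
#include <memory>
#include <utility>

// Returns (nullptr, 0) when the buffer is absent instead of dereferencing it.
std::pair<const uint8_t*, int64_t> SafeView(
    const std::shared_ptr<arrow::Buffer>& buffer) {
  if (buffer == nullptr) {
    return {nullptr, 0};
  }
  return {buffer->data(), buffer->size()};
}

int main() {
  std::shared_ptr<arrow::Buffer> missing;  // e.g. an absent validity buffer
  auto view = SafeView(missing);
  std::cout << "size = " << view.second << std::endl;  // prints 0, no crash
  return 0;
}
{code}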



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-8609) [C++] ORC JNI bridge crashed on null arrow buffer

2020-04-29 Thread Krisztian Szucs (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs updated ARROW-8609:
---
Summary: [C++] ORC JNI bridge crashed on null arrow buffer  (was: [C++]orc 
JNI bridge crashed on null arrow buffer)

> [C++] ORC JNI bridge crashed on null arrow buffer
> -
>
> Key: ARROW-8609
> URL: https://issues.apache.org/jira/browse/ARROW-8609
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Yuan Zhou
>Assignee: Yuan Zhou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> https://github.com/apache/arrow/blob/master/cpp/src/jni/orc/jni_wrapper.cpp#L278-L281
> We should check whether the arrow buffer is null, and pass the right value to 
> the constructor. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)