[jira] [Resolved] (ARROW-3517) [C++] MinGW 32bit build causes g++ segv

2018-11-04 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3517.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2899
[https://github.com/apache/arrow/pull/2899]

> [C++] MinGW 32bit build causes g++ segv
> ---
>
> Key: ARROW-3517
> URL: https://issues.apache.org/jira/browse/ARROW-3517
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.11.0
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I'm trying to build MSYS2 packages for 32bit and 64bit: 
> [https://github.com/Alexpux/MINGW-packages/pull/4519]
>  
> 64bit build works well.
>  
> But 32bit build causes g++ segv: 
> [https://dev.azure.com/msys2/mingw/_build/results?buildId=280&view=logs]
>  
> {code}
> 2018-10-11T00:51:09.6355581Z In file included from 
> D:/a/1/s/mingw-w64-arrow/src/apache-arrow-0.11.0/cpp/src/arrow/io/memory.cc:29:0:
> 2018-10-11T00:51:09.6357210Z 
> D:/a/1/s/mingw-w64-arrow/src/apache-arrow-0.11.0/cpp/src/arrow/util/memory.h: 
> In substitution of 'template 
> std::future arrow::internal::ThreadPool::Submit(Function&&, Args&& 
> ...) [with Function = ; Args = ; Result = ]':
> 2018-10-11T00:51:09.6381001Z 
> D:/a/1/s/mingw-w64-arrow/src/apache-arrow-0.11.0/cpp/src/arrow/util/memory.h:63:72:
>required from here
> 2018-10-11T00:51:09.7514314Z 
> D:/a/1/s/mingw-w64-arrow/src/apache-arrow-0.11.0/cpp/src/arrow/util/memory.h:63:72:
>  internal compiler error: Segmentation fault
> 2018-10-11T00:51:09.7536742Zleft + i 
> * chunk_size, chunk_size));
> 2018-10-11T00:51:09.7537294Z  
>^
> 2018-10-11T00:51:09.7537623Z 
> 2018-10-11T00:51:09.7537778Z This application has requested the Runtime to 
> terminate it in an unusual way.
> 2018-10-11T00:51:09.7537913Z Please contact the application's support team 
> for more information.
> 2018-10-11T00:51:09.7538417Z 
> 2018-10-11T00:51:09.7538550Z 
> D:/a/1/s/mingw-w64-arrow/src/apache-arrow-0.11.0/cpp/src/arrow/util/memory.h:63:72:
>  internal compiler error: Aborted
> 2018-10-11T00:51:09.7538640Z 
> 2018-10-11T00:51:09.7538764Z This application has requested the Runtime to 
> terminate it in an unusual way.
> 2018-10-11T00:51:09.7539047Z Please contact the application's support team 
> for more information.
> 2018-10-11T00:51:09.7539167Z g++.exe: internal compiler error: Aborted 
> (program cc1plus)
> 2018-10-11T00:51:09.7539287Z Please submit a full bug report,
> 2018-10-11T00:51:09.7539385Z with preprocessed source if appropriate.
> 2018-10-11T00:51:09.7539504Z See  for 
> instructions.
> 2018-10-11T00:51:09.7539615Z make[2]: *** 
> [src/arrow/CMakeFiles/arrow_objlib.dir/build.make:414: 
> src/arrow/CMakeFiles/arrow_objlib.dir/io/memory.cc.obj] Error 4
> 2018-10-11T00:51:09.7539718Z make[2]: *** Waiting for unfinished jobs
> 2018-10-11T00:51:10.7253905Z make[1]: *** [CMakeFiles/Makefile2:357: 
> src/arrow/CMakeFiles/arrow_objlib.dir/all] Error 2
> 2018-10-11T00:51:10.7284933Z make: *** [Makefile:141: all] Error 2
> {code}
> This is probably a g++ bug, but what should we do? Should we stop using this 
> code in the MinGW 32bit build?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ARROW-3610) [C++] Add interface to turn stl_allocator into arrow::MemoryPool

2018-11-03 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3610.

Resolution: Fixed

Issue resolved by pull request 2837
[https://github.com/apache/arrow/pull/2837]

> [C++] Add interface to turn stl_allocator into arrow::MemoryPool
> 
>
> Key: ARROW-3610
> URL: https://issues.apache.org/jira/browse/ARROW-3610
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> We already support constructing an {{stl_allocator}} from a {{MemoryPool}}; 
> we should also support the reverse conversion. As the STL allocator does not 
> provide a resize, this will always reallocate, even if the underlying 
> allocator supports resizing.
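
The consequence of the missing resize can be sketched in a language-neutral way. The Python model below is only an illustration and is not Arrow's C++ implementation: `SimpleAllocator`, `AllocatorBackedPool`, and all of their method names are hypothetical stand-ins for an STL-style allocator (allocate and deallocate only) and a pool-style interface that also offers a reallocate operation.

```python
class SimpleAllocator:
    """Hypothetical stand-in for an STL-style allocator: allocate and deallocate only."""

    def __init__(self):
        self.live = {}        # handle -> bytearray
        self.next_handle = 0
        self.allocations = 0  # counts calls to allocate()

    def allocate(self, size):
        handle = self.next_handle
        self.next_handle += 1
        self.live[handle] = bytearray(size)
        self.allocations += 1
        return handle

    def deallocate(self, handle):
        del self.live[handle]


class AllocatorBackedPool:
    """Hypothetical pool-style wrapper that adds reallocate() on top of the allocator."""

    def __init__(self, allocator):
        self.allocator = allocator

    def allocate(self, size):
        return self.allocator.allocate(size)

    def free(self, handle):
        self.allocator.deallocate(handle)

    def reallocate(self, handle, new_size):
        # No resize primitive is available, so every call allocates a new
        # block, copies the bytes that fit, and releases the old block.
        old = self.allocator.live[handle]
        new_handle = self.allocator.allocate(new_size)
        keep = min(len(old), new_size)
        self.allocator.live[new_handle][:keep] = old[:keep]
        self.allocator.deallocate(handle)
        return new_handle
```

In this model every `reallocate` call costs one allocation plus a copy, whether the buffer grows or shrinks, which is the trade-off the description points out.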





[jira] [Assigned] (ARROW-3694) [Java] Avoid superfluous string creation when logging level is disabled

2018-11-03 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3694:
--

Assignee: Zhenyuan Zhao

> [Java] Avoid superfluous string creation when logging level is disabled 
> 
>
> Key: ARROW-3694
> URL: https://issues.apache.org/jira/browse/ARROW-3694
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java
>Affects Versions: 0.11.1
>Reporter: Zhenyuan Zhao
>Assignee: Zhenyuan Zhao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are a few places where strings are created unnecessarily for logging 
> purposes.
> [https://github.com/apache/arrow/blob/ed70f051bb0636d994f285c38503b992d08efa00/java/vector/src/main/java/org/apache/arrow/vector/ipc/message/ArrowRecordBatch.java#L75]
> For the above scenario in ArrowRecordBatch, roughly 2/3 of the total CPU time 
> was spent in String.format().
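
The general pattern behind this kind of fix can be sketched as follows. The example uses Python's standard `logging` module purely as an illustration; the actual change is in Arrow's Java code and may look different. The logger name, the message text, and the names `ExpensiveBatch`, `log_eager`, `log_guarded`, and `log_deferred` are all hypothetical.

```python
import logging

logger = logging.getLogger("arrow.example")  # hypothetical logger name
logger.addHandler(logging.NullHandler())
logger.setLevel(logging.INFO)                # DEBUG is disabled


class ExpensiveBatch:
    """Hypothetical object whose string form is costly to build."""

    def __init__(self):
        self.format_calls = 0

    def __str__(self):
        self.format_calls += 1
        return "batch(" + ", ".join(str(i) for i in range(1000)) + ")"


def log_eager(batch):
    # The message string is built before the logger checks the level,
    # so the formatting cost is paid even though nothing is logged.
    logger.debug("Buffers in batch: " + str(batch))


def log_guarded(batch):
    # Check the level first; the string is only built when it will be used.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("Buffers in batch: " + str(batch))


def log_deferred(batch):
    # Passing the object as an argument defers formatting to the logger,
    # which skips it when the level is disabled.
    logger.debug("Buffers in batch: %s", batch)
```

With DEBUG disabled, only `log_eager` ever calls `__str__`; the guarded and deferred variants skip the string construction entirely.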





[jira] [Resolved] (ARROW-3694) [Java] Avoid superfluous string creation when logging level is disabled

2018-11-03 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3694.

Resolution: Fixed

Issue resolved by pull request 2894
[https://github.com/apache/arrow/pull/2894]

> [Java] Avoid superfluous string creation when logging level is disabled 
> 
>
> Key: ARROW-3694
> URL: https://issues.apache.org/jira/browse/ARROW-3694
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java
>Affects Versions: 0.11.1
>Reporter: Zhenyuan Zhao
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are a few places where strings are created unnecessarily for logging 
> purposes.
> [https://github.com/apache/arrow/blob/ed70f051bb0636d994f285c38503b992d08efa00/java/vector/src/main/java/org/apache/arrow/vector/ipc/message/ArrowRecordBatch.java#L75]
> For the above scenario in ArrowRecordBatch, roughly 2/3 of the total CPU time 
> was spent in String.format().





[jira] [Commented] (ARROW-3637) [Go] Implement Stringer for arrays

2018-11-01 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671449#comment-16671449
 ] 

Uwe L. Korn commented on ARROW-3637:


[~sbinet] Feel free to close these issues yourself.

> [Go] Implement Stringer for arrays
> --
>
> Key: ARROW-3637
> URL: https://issues.apache.org/jira/browse/ARROW-3637
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Go
>Reporter: James Walker
>Assignee: Sebastien Binet
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The example in the documentation states:
>  
> {code:java}
> // This example shows how one can slice an array.
> // The initial (float64) array is:
> // [1, 2, 3, (null), 4, 5]
> //
> // and the sub-slice is:
> // [3, (null), 4]
> {code}
> However, the initial array is actually `[1, 2, 3, -1, 4, 5]` and the 
> sub-slice is actually `[3, -1, 4]`.
>  
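
A sketch of how the raw values and the documented output can differ, assuming the null slot is stored as a placeholder value (-1 here) next to a separate validity mask. This is an illustrative Python model, not the Go implementation; `format_array` is a hypothetical helper, and the mask values are assumed for the example.

```python
def format_array(values, valid):
    """Render values, printing "(null)" wherever the validity mask is False."""
    if len(values) != len(valid):
        raise ValueError("values and valid must have the same length")
    parts = [str(v) if ok else "(null)" for v, ok in zip(values, valid)]
    return "[" + ", ".join(parts) + "]"


values = [1, 2, 3, -1, 4, 5]                   # raw slots; -1 is a placeholder
valid = [True, True, True, False, True, True]  # assumed validity mask

print(values)                                 # [1, 2, 3, -1, 4, 5]
print(format_array(values, valid))            # [1, 2, 3, (null), 4, 5]
print(format_array(values[2:5], valid[2:5]))  # [3, (null), 4]
```

Printing the raw slots shows the placeholder, while a formatter that consults the validity mask produces the output the documentation describes.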





[jira] [Created] (ARROW-3670) [C++] Use FindBacktrace to find execinfo.h support

2018-11-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3670:
--

 Summary: [C++] Use FindBacktrace to find execinfo.h support
 Key: ARROW-3670
 URL: https://issues.apache.org/jira/browse/ARROW-3670
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.11.1
Reporter: Uwe L. Korn
 Fix For: 0.12.0


See https://github.com/apache/arrow/issues/2818 and 
https://github.com/apache/arrow/issues/2080#issuecomment-434723969





[jira] [Commented] (ARROW-3663) pyarrow install via pip3 fails with error no module named Cython

2018-11-01 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671321#comment-16671321
 ] 

Uwe L. Korn commented on ARROW-3663:


Calling {{pip3 install -U pip}} will not update what is behind {{pip3}} as this 
is provided by your OS. I recommend that you switch to using virtualenv or 
conda environments for installing. This makes you independent of the {{pip}} 
version of your distribution.
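
One way to follow this advice with only the standard library is Python's `venv` module (available since Python 3.3), used here in place of virtualenv or conda. That substitution, the helper name `create_env`, and the directory name in the usage comment are assumptions made for illustration.

```python
import os
import sys
import venv


def create_env(env_dir, with_pip=True):
    """Create an isolated environment and return the path to its interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return os.path.join(env_dir, bin_dir, exe)


# Example usage (not executed here; "pyarrow-env" is a hypothetical directory):
#   python = create_env("pyarrow-env")
# Then upgrade pip inside the environment and install with that pip:
#   <python> -m pip install -U pip
#   <python> -m pip install pyarrow
```

Because the environment carries its own `pip`, upgrading it there does not depend on the version shipped by the distribution.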

> pyarrow install via pip3 fails with error no module named Cython
> 
>
> Key: ARROW-3663
> URL: https://issues.apache.org/jira/browse/ARROW-3663
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Rajshekhar K
>Assignee: Uwe L. Korn
>Priority: Trivial
>
> Hi Team,
>  
> The issue is reproducible:
> # pip3 install pyarrow 
> Installation fails with "No module named 'Cython'". It seems Cython is not 
> listed in the requirements.
>  
> {code:java}
> Downloading pyarrow-0.10.0.tar.gz (2.1MB): 2.1MB downloaded
> Running setup.py (path:/tmp/pip_build_root/pyarrow/setup.py) egg_info for 
> package pyarrow
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> Complete output from command python setup.py egg_info:
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> 
> Cleaning up...
> {code}
>  
>  
> Tested on Environment: ubuntu14.04
> Pip version:
> {noformat}
> pip 1.5.4 from /usr/lib/python3/dist-packages (python 3.4){noformat}
>  
> Thanks,
>  





[jira] [Resolved] (ARROW-3663) pyarrow install via pip3 fails with error no module named Cython

2018-10-31 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3663.

Resolution: Fixed

> pyarrow install via pip3 fails with error no module named Cython
> 
>
> Key: ARROW-3663
> URL: https://issues.apache.org/jira/browse/ARROW-3663
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Rajshekhar K
>Assignee: Uwe L. Korn
>Priority: Trivial
>
> Hi Team,
>  
> The issue is reproducible:
> # pip3 install pyarrow 
> Installation fails with "No module named 'Cython'". It seems Cython is not 
> listed in the requirements.
>  
> {code:java}
> Downloading pyarrow-0.10.0.tar.gz (2.1MB): 2.1MB downloaded
> Running setup.py (path:/tmp/pip_build_root/pyarrow/setup.py) egg_info for 
> package pyarrow
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> Complete output from command python setup.py egg_info:
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> 
> Cleaning up...
> {code}
>  
>  
> Tested on Environment: ubuntu14.04
> Pip version:
> {noformat}
> pip 1.5.4 from /usr/lib/python3/dist-packages (python 3.4){noformat}
>  
> Thanks,
>  





[jira] [Comment Edited] (ARROW-3663) pyarrow install via pip3 fails with error no module named Cython

2018-10-31 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670126#comment-16670126
 ] 

Uwe L. Korn edited comment on ARROW-3663 at 10/31/18 1:49 PM:
--

Your pip version is too old. You need at least 8.x but your log says 1.5.4. 
Once you have upgraded {{pip}} it should work.


was (Author: xhochy):
Your pip version is too old. You need at least 8.x but your log sys 1.5.4.

> pyarrow install via pip3 fails with error no module named Cython
> 
>
> Key: ARROW-3663
> URL: https://issues.apache.org/jira/browse/ARROW-3663
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Rajshekhar K
>Assignee: Uwe L. Korn
>Priority: Trivial
>
> Hi Team,
>  
> The issue is reproducible:
> # pip3 install pyarrow 
> Installation fails with "No module named 'Cython'". It seems Cython is not 
> listed in the requirements.
>  
> {code:java}
> Downloading pyarrow-0.10.0.tar.gz (2.1MB): 2.1MB downloaded
> Running setup.py (path:/tmp/pip_build_root/pyarrow/setup.py) egg_info for 
> package pyarrow
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> Complete output from command python setup.py egg_info:
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> 
> Cleaning up...
> {code}
>  
>  
> Tested on Environment: ubuntu14.04
> Pip version:
> {noformat}
> pip 1.5.4 from /usr/lib/python3/dist-packages (python 3.4){noformat}
>  
> Thanks,
>  





[jira] [Assigned] (ARROW-3663) pyarrow install via pip3 fails with error no module named Cython

2018-10-31 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3663:
--

Assignee: Uwe L. Korn

> pyarrow install via pip3 fails with error no module named Cython
> 
>
> Key: ARROW-3663
> URL: https://issues.apache.org/jira/browse/ARROW-3663
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Rajshekhar K
>Assignee: Uwe L. Korn
>Priority: Trivial
>
> Hi Team,
>  
> The issue is reproducible:
> # pip3 install pyarrow 
> Installation fails with "No module named 'Cython'". It seems Cython is not 
> listed in the requirements.
>  
> {code:java}
> Downloading pyarrow-0.10.0.tar.gz (2.1MB): 2.1MB downloaded
> Running setup.py (path:/tmp/pip_build_root/pyarrow/setup.py) egg_info for 
> package pyarrow
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> Complete output from command python setup.py egg_info:
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> 
> Cleaning up...
> {code}
>  
>  
> Tested on Environment: ubuntu14.04
> Pip version:
> {noformat}
> pip 1.5.4 from /usr/lib/python3/dist-packages (python 3.4){noformat}
>  
> Thanks,
>  





[jira] [Commented] (ARROW-3663) pyarrow install via pip3 fails with error no module named Cython

2018-10-31 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670126#comment-16670126
 ] 

Uwe L. Korn commented on ARROW-3663:


Your pip version is too old. You need at least 8.x but your log says 1.5.4.

> pyarrow install via pip3 fails with error no module named Cython
> 
>
> Key: ARROW-3663
> URL: https://issues.apache.org/jira/browse/ARROW-3663
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Rajshekhar K
>Priority: Trivial
>
> Hi Team,
>  
> The issue is reproducible:
> # pip3 install pyarrow 
> Installation fails with "No module named 'Cython'". It seems Cython is not 
> listed in the requirements.
>  
> {code:java}
> Downloading pyarrow-0.10.0.tar.gz (2.1MB): 2.1MB downloaded
> Running setup.py (path:/tmp/pip_build_root/pyarrow/setup.py) egg_info for 
> package pyarrow
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> Complete output from command python setup.py egg_info:
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> 
> Cleaning up...
> {code}
>  
>  
> Tested on Environment: ubuntu14.04
> Pip version:
> {noformat}
> pip 1.5.4 from /usr/lib/python3/dist-packages (python 3.4){noformat}
>  
> Thanks,
>  





[jira] [Updated] (ARROW-3663) pyarrow install via pip3 fails with error no module named Cython

2018-10-31 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-3663:
---
Priority: Trivial  (was: Blocker)

> pyarrow install via pip3 fails with error no module named Cython
> 
>
> Key: ARROW-3663
> URL: https://issues.apache.org/jira/browse/ARROW-3663
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.11.1
>Reporter: Rajshekhar K
>Priority: Trivial
>
> Hi Team,
>  
> The issue is reproducible:
> # pip3 install pyarrow 
> Installation fails with "No module named 'Cython'". It seems Cython is not 
> listed in the requirements.
>  
> {code:java}
> Downloading pyarrow-0.10.0.tar.gz (2.1MB): 2.1MB downloaded
> Running setup.py (path:/tmp/pip_build_root/pyarrow/setup.py) egg_info for 
> package pyarrow
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> Complete output from command python setup.py egg_info:
> Traceback (most recent call last):
> File "", line 17, in 
> File "/tmp/pip_build_root/pyarrow/setup.py", line 29, in 
> from Cython.Distutils import build_ext as _build_ext
> ImportError: No module named 'Cython'
> 
> Cleaning up...
> {code}
>  
>  
> Tested on Environment: ubuntu14.04
> Pip version:
> {noformat}
> pip 1.5.4 from /usr/lib/python3/dist-packages (python 3.4){noformat}
>  
> Thanks,
>  





[jira] [Resolved] (ARROW-3658) [Rust] validation of offsets buffer is incorrect for `List`

2018-10-31 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3658.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2877
[https://github.com/apache/arrow/pull/2877]

> [Rust] validation of offsets buffer is incorrect for `List`
> --
>
> Key: ARROW-3658
> URL: https://issues.apache.org/jira/browse/ARROW-3658
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust
>Reporter: Paddy Horan
>Assignee: Paddy Horan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-3661) [Gandiva][GLib] Improve constant name

2018-10-31 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3661.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2881
[https://github.com/apache/arrow/pull/2881]

> [Gandiva][GLib] Improve constant name
> -
>
> Key: ARROW-3661
> URL: https://issues.apache.org/jira/browse/ARROW-3661
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Gandiva, GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Assigned] (ARROW-3108) [C++] arrow::PrettyPrint for Table instances

2018-10-28 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3108:
--

Assignee: Uwe L. Korn

> [C++] arrow::PrettyPrint for Table instances
> 
>
> Key: ARROW-3108
> URL: https://issues.apache.org/jira/browse/ARROW-3108
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: beginner
> Fix For: 0.12.0
>
>
> Extend the {{arrow::PrettyPrint}} functionality to also support 
> {{arrow::Table}} instances in addition to {{RecordBatch}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-3642) [C++] Add arrowConfig.cmake generation

2018-10-28 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3642:
--

 Summary: [C++] Add arrowConfig.cmake generation
 Key: ARROW-3642
 URL: https://issues.apache.org/jira/browse/ARROW-3642
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn


This allows simple usage of Arrow in C++ packages using {{find_package(arrow)}} 
with no additional {{FindArrow.cmake}} in {{cmake_modules}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-3641) [C++/Python] remove public keyword from Cython api functions

2018-10-28 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3641:
--

 Summary: [C++/Python] remove public keyword from Cython api 
functions
 Key: ARROW-3641
 URL: https://issues.apache.org/jira/browse/ARROW-3641
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Python
Reporter: Uwe L. Korn
 Fix For: 0.12.0


Based on a conversation with Stefan Behnel, we should be able to change the 
{{cdef public api}} statements in pyarrow/public-api.pxi to simply {{cdef api}}






[jira] [Resolved] (ARROW-3638) [C++][Python] Move reading from Feather as Table feature to C++ from Python

2018-10-28 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3638.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2853
[https://github.com/apache/arrow/pull/2853]

> [C++][Python] Move reading from Feather as Table feature to C++ from Python
> ---
>
> Key: ARROW-3638
> URL: https://issues.apache.org/jira/browse/ARROW-3638
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Python
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This makes the feature usable from GLib.





[jira] [Resolved] (ARROW-3583) [Python/Java] Create RecordBatch from VectorSchemaRoot

2018-10-27 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3583.

Resolution: Fixed

Issue resolved by pull request 2809
[https://github.com/apache/arrow/pull/2809]

> [Python/Java] Create RecordBatch from VectorSchemaRoot
> --
>
> Key: ARROW-3583
> URL: https://issues.apache.org/jira/browse/ARROW-3583
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Java, Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Besides the naming differences, {{pyarrow.RecordBatch}} is content-wise the 
> same as a {{org.apache.arrow.vector.VectorSchemaRoot}}. This adds a 
> conversion function to create a {{pyarrow.RecordBatch}} referencing these 
> arrays.





[jira] [Resolved] (ARROW-3632) [Packaging] Update deb names in dev/tasks/tasks.yml in dev/release/00-prepare.sh

2018-10-27 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3632.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2849
[https://github.com/apache/arrow/pull/2849]

> [Packaging] Update deb names in dev/tasks/tasks.yml in 
> dev/release/00-prepare.sh
> 
>
> Key: ARROW-3632
> URL: https://issues.apache.org/jira/browse/ARROW-3632
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-3633) [Packaging] Update deb names in dev/tasks/tasks.yml for 0.12.0

2018-10-27 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3633.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2850
[https://github.com/apache/arrow/pull/2850]

> [Packaging] Update deb names in dev/tasks/tasks.yml for 0.12.0
> --
>
> Key: ARROW-3633
> URL: https://issues.apache.org/jira/browse/ARROW-3633
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (ARROW-3610) [C++] Add interface to turn stl_allocator into arrow::MemoryPool

2018-10-27 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3610:
--

Assignee: Uwe L. Korn

> [C++] Add interface to turn stl_allocator into arrow::MemoryPool
> 
>
> Key: ARROW-3610
> URL: https://issues.apache.org/jira/browse/ARROW-3610
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We already support constructing an {{stl_allocator}} from a {{MemoryPool}}; 
> we should also support the reverse conversion. As the STL allocator does not 
> provide a resize, this will always reallocate, even if the underlying 
> allocator supports resizing.





[jira] [Updated] (ARROW-3610) [C++] Add interface to turn stl_allocator into arrow::MemoryPool

2018-10-27 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-3610:
---
Fix Version/s: 0.12.0

> [C++] Add interface to turn stl_allocator into arrow::MemoryPool
> 
>
> Key: ARROW-3610
> URL: https://issues.apache.org/jira/browse/ARROW-3610
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We already support constructing an {{stl_allocator}} from a {{MemoryPool}}; 
> we should also support the reverse conversion. As the STL allocator does not 
> provide a resize, this will always reallocate, even if the underlying 
> allocator supports resizing.





[jira] [Created] (ARROW-3610) [C++] Add interface to turn stl_allocator into arrow::MemoryPool

2018-10-24 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3610:
--

 Summary: [C++] Add interface to turn stl_allocator into 
arrow::MemoryPool
 Key: ARROW-3610
 URL: https://issues.apache.org/jira/browse/ARROW-3610
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++
Reporter: Uwe L. Korn


We already support constructing an {{stl_allocator}} from a {{MemoryPool}}; we 
should also support the reverse conversion. As the STL allocator does not 
provide a resize, this will always reallocate, even if the underlying 
allocator supports resizing.





[jira] [Commented] (ARROW-3585) Update the documentation about Schema & Metadata usage

2018-10-22 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16658758#comment-16658758
 ] 

Uwe L. Korn commented on ARROW-3585:


[~danielil] Assigned to you and also gave you permission to self-assign JIRAs 
in the future.

> Update the documentation about Schema & Metadata usage
> --
>
> Key: ARROW-3585
> URL: https://issues.apache.org/jira/browse/ARROW-3585
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Documentation
>Reporter: Daniel Haviv
>Assignee: Daniel Haviv
>Priority: Trivial
>  Labels: beginner, documentation, easyfix, newbie
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Reusing the Schema object from a Parquet file written by Spark with Pandas 
> fails due to a Schema mismatch.
> The culprit is the metadata part of the schema, which each component fills 
> according to its implementation. More details can be found here: 
> [https://github.com/apache/arrow/issues/2805]
> The documentation should point that out.





[jira] [Assigned] (ARROW-3585) Update the documentation about Schema & Metadata usage

2018-10-22 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3585:
--

Assignee: Daniel Haviv

> Update the documentation about Schema & Metadata usage
> --
>
> Key: ARROW-3585
> URL: https://issues.apache.org/jira/browse/ARROW-3585
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Documentation
>Reporter: Daniel Haviv
>Assignee: Daniel Haviv
>Priority: Trivial
>  Labels: beginner, documentation, easyfix, newbie
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Reusing the Schema object from a Parquet file written by Spark with Pandas 
> fails due to a Schema mismatch.
> The culprit is the metadata part of the schema, which each component fills 
> according to its implementation. More details can be found here: 
> [https://github.com/apache/arrow/issues/2805]
> The documentation should point that out.





[jira] [Resolved] (ARROW-3573) [Rust] with_bitset does not set valid bits correctly

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3573.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2803
[https://github.com/apache/arrow/pull/2803]

> [Rust] with_bitset does not set valid bits correctly
> 
>
> Key: ARROW-3573
> URL: https://issues.apache.org/jira/browse/ARROW-3573
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust
>Reporter: Paddy Horan
>Assignee: Paddy Horan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The boundary check is off a little: 
> `MutableBuffer::new(64).with_bitset(64, false);` will fail.  
> This issue only happens if the arguments to `new` and `with_bitset` are the 
> same and a multiple of 64.
> `write_bytes` is currently writing 1 instead of 255 to set all 
> the bits when `val` is `true`.
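
The two problems described in this issue can be modeled in a few lines. The following is only a sketch in Python, not the Rust code from the pull request: the function name mirrors the issue, but the signature (an explicit `capacity` argument, with `end` counted in bytes) is an assumption made for illustration.

```python
def with_bitset(capacity: int, end: int, val: bool) -> bytearray:
    """Hypothetical model: a buffer of `capacity` bytes whose first
    `end` bytes have every bit set to `val`."""
    # `end == capacity` must be accepted; rejecting it is the off-by-one
    # boundary check the issue describes.
    if end > capacity:
        raise ValueError("end exceeds buffer capacity")
    # Setting all eight bits of a byte needs 0xFF (255); writing 0x01 (1)
    # would set only the lowest bit of each byte.
    fill = 0xFF if val else 0x00
    buf = bytearray(capacity)
    buf[:end] = bytes([fill]) * end
    return buf
```

With this model, `with_bitset(64, 64, False)` succeeds, and `with_bitset(64, 64, True)` yields 64 bytes of `0xFF`.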





[jira] [Resolved] (ARROW-3580) [Gandiva][C++] Build error with g++ 8.2.0

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3580.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2806
[https://github.com/apache/arrow/pull/2806]

> [Gandiva][C++] Build error with g++ 8.2.0
> -
>
> Key: ARROW-3580
> URL: https://issues.apache.org/jira/browse/ARROW-3580
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Gandiva
>Affects Versions: 0.11.0
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Error message1:
> {noformat}
> In file included from 
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/expr_decomposer.cc:27:
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:46:27:
>  error: 'function' in namespace 'std' does not name a template type
>using maker_type = std::function FunctionHolderPtr*)>;
>^~~~
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:46:22:
>  note: 'std::function' is defined in header '<functional>'; did you forget to 
> '#include <functional>'?
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:30:1:
> +#include <functional>
>  
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:46:22:
>using maker_type = std::function FunctionHolderPtr*)>;
>   ^~~
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:47:52:
>  error: 'maker_type' was not declared in this scope
>using map_type = std::unordered_map;
> ^~
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:47:52:
>  note: suggested alternative: 'decltype'
>using map_type = std::unordered_map;
> ^~
> decltype
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:47:62:
>  error: template argument 2 is invalid
>using map_type = std::unordered_map;
>   ^
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:47:62:
>  error: template argument 5 is invalid
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:60:10:
>  error: 'map_type' does not name a type; did you mean 'iswctype'?
>static map_type& makers() {
>   ^~~~
>   iswctype
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h: In 
> static member function 'static gandiva::Status 
> gandiva::FunctionHolderRegistry::Make(const string&, const 
> gandiva::FunctionNode&, gandiva::FunctionHolderPtr*)':
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:51:18:
>  error: 'makers' was not declared in this scope
>  auto found = makers().find(name);
>   ^~
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/function_holder_registry.h:51:18:
>  note: suggested alternative: 'Make'
>  auto found = makers().find(name);
>   ^~
>   Make
> {noformat}
> Error message2:
> {noformat}
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/tree_expr_builder.cc: In static 
> member function 'static gandiva::NodePtr 
> gandiva::TreeExprBuilder::MakeNull(gandiva::DataTypePtr)':
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/tree_expr_builder.cc:78:70: 
> error: 'float_t' was not declared in this scope
>return std::make_shared(data_type, 
> LiteralHolder((float_t)0), true);
>   ^~~
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/tree_expr_builder.cc:78:70: 
> note: suggested alternative: 'float'
>return std::make_shared(data_type, 
> LiteralHolder((float_t)0), true);
>   ^~~
>   float
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/tree_expr_builder.cc:80:70: 
> error: 'double_t' was not declared in this scope
>return std::make_shared(data_type, 
> LiteralHolder((double_t)0), true);
>   ^~~~
> /home/kou/work/cpp/arrow.kou/cpp/src/gandiva/tree_expr_builder.cc:80:70: 
> note: suggested alternative: 'double'
>return std::make_shared(data_type, 
> LiteralHolder((double_t)0), true);
>   ^~~~
>   double
> {noformat}




[jira] [Resolved] (ARROW-3539) [CI/Packaging] Update scripts to build against vendored jemalloc

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3539.

Resolution: Fixed

Issue resolved by pull request 2779
[https://github.com/apache/arrow/pull/2779]

> [CI/Packaging] Update scripts to build against vendored jemalloc
> 
>
> Key: ARROW-3539
> URL: https://issues.apache.org/jira/browse/ARROW-3539
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (ARROW-3582) [CI] Gandiva C++ build is always triggered

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3582:
--

Assignee: Sebastien Binet

> [CI] Gandiva C++ build is always triggered
> --
>
> Key: ARROW-3582
> URL: https://issues.apache.org/jira/browse/ARROW-3582
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Developer Tools, Gandiva
>Reporter: Sebastien Binet
>Assignee: Sebastien Binet
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The `JDK: openjdk8 Compiler: gcc C++` build is always triggered, even when 
> _e.g._ only Go files are modified:
> - https://travis-ci.org/sbinet-gonum/arrow/jobs/444128507





[jira] [Resolved] (ARROW-3582) [CI] Gandiva C++ build is always triggered

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3582.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2808
[https://github.com/apache/arrow/pull/2808]

> [CI] Gandiva C++ build is always triggered
> --
>
> Key: ARROW-3582
> URL: https://issues.apache.org/jira/browse/ARROW-3582
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Developer Tools, Gandiva
>Reporter: Sebastien Binet
>Assignee: Sebastien Binet
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The `JDK: openjdk8 Compiler: gcc C++` build is always triggered, even when 
> _e.g._ only Go files are modified:
> - https://travis-ci.org/sbinet-gonum/arrow/jobs/444128507





[jira] [Commented] (ARROW-2535) [Python] Provide pre-commit hooks that check flake8

2018-10-21 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16658233#comment-16658233
 ] 

Uwe L. Korn commented on ARROW-2535:


I will do {{clang-format}} in a separate PR, limiting the scope of this to 
{{flake8}}

> [Python] Provide pre-commit hooks that check flake8
> ---
>
> Key: ARROW-2535
> URL: https://issues.apache.org/jira/browse/ARROW-2535
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.12.0
>
>
> We should provide pre-commit hooks that users can install (optionally) that 
> check e.g. flake8 and clang-format.





[jira] [Updated] (ARROW-2535) [Python] Provide pre-commit hooks that check flake8

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-2535:
---
Fix Version/s: (was: 0.13.0)
   0.12.0

> [Python] Provide pre-commit hooks that check flake8
> ---
>
> Key: ARROW-2535
> URL: https://issues.apache.org/jira/browse/ARROW-2535
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.12.0
>
>
> We should provide pre-commit hooks that users can install (optionally) that 
> check e.g. flake8 and clang-format.





[jira] [Updated] (ARROW-2535) [Python] Provide pre-commit hooks that check flake8

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-2535:
---
Component/s: (was: C++)

> [Python] Provide pre-commit hooks that check flake8
> ---
>
> Key: ARROW-2535
> URL: https://issues.apache.org/jira/browse/ARROW-2535
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.12.0
>
>
> We should provide pre-commit hooks that users can install (optionally) that 
> check e.g. flake8 and clang-format.





[jira] [Updated] (ARROW-2535) [Python] Provide pre-commit hooks that check flake8

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-2535:
---
Summary: [Python] Provide pre-commit hooks that check flake8  (was: 
[C++/Python] Provide pre-commit hooks that check flake8)

> [Python] Provide pre-commit hooks that check flake8
> ---
>
> Key: ARROW-2535
> URL: https://issues.apache.org/jira/browse/ARROW-2535
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.12.0
>
>
> We should provide pre-commit hooks that users can install (optionally) that 
> check e.g. flake8 and clang-format.





[jira] [Assigned] (ARROW-2535) [Python] Provide pre-commit hooks that check flake8

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-2535:
--

Assignee: Uwe L. Korn

> [Python] Provide pre-commit hooks that check flake8
> ---
>
> Key: ARROW-2535
> URL: https://issues.apache.org/jira/browse/ARROW-2535
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
> Fix For: 0.12.0
>
>
> We should provide pre-commit hooks that users can install (optionally) that 
> check e.g. flake8 and clang-format.





[jira] [Updated] (ARROW-2535) [C++/Python] Provide pre-commit hooks that check flake8

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-2535:
---
Summary: [C++/Python] Provide pre-commit hooks that check flake8  (was: 
[C++/Python] Provide pre-commit hooks that check flake8 et al.)

> [C++/Python] Provide pre-commit hooks that check flake8
> ---
>
> Key: ARROW-2535
> URL: https://issues.apache.org/jira/browse/ARROW-2535
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++, Python
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.13.0
>
>
> We should provide pre-commit hooks that users can install (optionally) that 
> check e.g. flake8 and clang-format.





[jira] [Created] (ARROW-3583) [Python/Java] Create RecordBatch from VectorSchemaRoot

2018-10-21 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3583:
--

 Summary: [Python/Java] Create RecordBatch from VectorSchemaRoot
 Key: ARROW-3583
 URL: https://issues.apache.org/jira/browse/ARROW-3583
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Java, Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn
 Fix For: 0.12.0


Besides the naming differences, {{pyarrow.RecordBatch}} is content-wise the 
same as a {{org.apache.arrow.vector.VectorSchemaRoot}}. This adds a conversion 
function to create a {{pyarrow.RecordBatch}} referencing these arrays.





[jira] [Assigned] (ARROW-3131) [Go] add test for Go-1.11

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3131:
--

Assignee: Sebastien Binet

> [Go] add test for Go-1.11
> -
>
> Key: ARROW-3131
> URL: https://issues.apache.org/jira/browse/ARROW-3131
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Go
>Reporter: Sebastien Binet
>Assignee: Sebastien Binet
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Go-1.11 has been released.
> we should start to test this new stable release.





[jira] [Resolved] (ARROW-3131) [Go] add test for Go-1.11

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3131.

   Resolution: Fixed
Fix Version/s: 0.11.1

Issue resolved by pull request 2487
[https://github.com/apache/arrow/pull/2487]

> [Go] add test for Go-1.11
> -
>
> Key: ARROW-3131
> URL: https://issues.apache.org/jira/browse/ARROW-3131
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Go
>Reporter: Sebastien Binet
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.1
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Go-1.11 has been released.
> we should start to test this new stable release.





[jira] [Updated] (ARROW-3131) [Go] add test for Go-1.11

2018-10-21 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-3131:
---
Fix Version/s: (was: 0.11.1)
   0.12.0

> [Go] add test for Go-1.11
> -
>
> Key: ARROW-3131
> URL: https://issues.apache.org/jira/browse/ARROW-3131
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Go
>Reporter: Sebastien Binet
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Go-1.11 has been released.
> we should start to test this new stable release.





[jira] [Commented] (ARROW-3054) [Packaging] Deploy nightlies built using crossbow to the twosigma conda channel

2018-10-20 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657873#comment-16657873
 ] 

Uwe L. Korn commented on ARROW-3054:


We should use a better-named channel, e.g. {{arrow-nightlies}} (excluding the 
apache part). Just {{apache}} sounds official and like something a user could 
use in production.

> [Packaging] Deploy nightlies built using crossbow to the twosigma conda 
> channel
> ---
>
> Key: ARROW-3054
> URL: https://issues.apache.org/jira/browse/ARROW-3054
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Packaging
>Affects Versions: 0.10.0
>Reporter: Phillip Cloud
>Assignee: Krisztian Szucs
>Priority: Major
> Fix For: 0.12.0
>
>






[jira] [Created] (ARROW-3565) [Python] Pin tensorflow to 1.11.0 in manylinux1 container

2018-10-19 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3565:
--

 Summary: [Python] Pin tensorflow to 1.11.0 in manylinux1 container
 Key: ARROW-3565
 URL: https://issues.apache.org/jira/browse/ARROW-3565
 Project: Apache Arrow
  Issue Type: Task
  Components: Python
Affects Versions: 0.11.0
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn
 Fix For: 0.11.1


Just enough to get {{pyarrow}} in a releasable state.





[jira] [Commented] (ARROW-3554) [C++] Reverse traits for C++

2018-10-18 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655651#comment-16655651
 ] 

Uwe L. Korn commented on ARROW-3554:


We have that in https://github.com/apache/arrow/blob/master/cpp/src/arrow/stl.h

> [C++] Reverse traits for C++
> 
>
> Key: ARROW-3554
> URL: https://issues.apache.org/jira/browse/ARROW-3554
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wolf Vollprecht
>Priority: Minor
>
> This might be more of a question that I would have asked on a chat, so sorry 
> if inappropriate here as an issue.
>  
> I am trying to get the Arrow type from a native C++ type. 
> I would like to use something like
>  
> `arrow_type::type -> UInt8Type` or `arrow_type() -> 
> shared_ptr`
>  
> Is that implemented somewhere?





[jira] [Commented] (ARROW-3546) [Python] Provide testing setup to verify wheel binaries work in one or more common Linux distributions

2018-10-17 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654132#comment-16654132
 ] 

Uwe L. Korn commented on ARROW-3546:


Alpine is not supported by manylinux1 as it's using musl libc instead of glibc.
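
As a rough illustration of the distinction, a test environment could check which C library the running interpreter is linked against before installing a manylinux1 wheel. This is a minimal sketch using the standard library's `platform.libc_ver()`; the helper name `looks_like_glibc` is hypothetical, and the check is a heuristic rather than the compatibility test that pip itself performs.

```python
import platform


def looks_like_glibc() -> bool:
    """Heuristic: report whether the running interpreter appears to be
    linked against glibc, which manylinux1 wheels assume."""
    # libc_ver() returns a (name, version) tuple; any name other than
    # "glibc" (including an empty one) is treated as "not glibc" here.
    name, _version = platform.libc_ver()
    return name == "glibc"


if __name__ == "__main__":
    print("glibc detected" if looks_like_glibc() else "no glibc detected")
```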

> [Python] Provide testing setup to verify wheel binaries work in one or more 
> common Linux distributions
> --
>
> Key: ARROW-3546
> URL: https://issues.apache.org/jira/browse/ARROW-3546
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> To help catch issues like ARROW-3514: install a candidate wheel in a fresh 
> environment, run Arrow test suite with the installed package





[jira] [Updated] (ARROW-3535) [Python] pip install tensorflow install too new numpy in manylinux1 build

2018-10-16 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-3535:
---
Fix Version/s: 0.12.0

> [Python] pip install tensorflow install too new numpy in manylinux1 build
> -
>
> Key: ARROW-3535
> URL: https://issues.apache.org/jira/browse/ARROW-3535
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Blocker
> Fix For: 0.12.0
>
>
> This blocks us from doing a release again. We definitely need to get this 
> split apart before we do another release.
> [~pcmoritz] [~robertnishihara]





[jira] [Commented] (ARROW-3535) [Python] pip install tensorflow install too new numpy in manylinux1 build

2018-10-16 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651745#comment-16651745
 ] 

Uwe L. Korn commented on ARROW-3535:


Currently this installs 1.15.2, which also makes it our minimum NumPy version.

> [Python] pip install tensorflow install too new numpy in manylinux1 build
> -
>
> Key: ARROW-3535
> URL: https://issues.apache.org/jira/browse/ARROW-3535
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Blocker
>
> This blocks us from doing a release again. We definitely need to get this 
> split apart before we do another release.
> [~pcmoritz] [~robertnishihara]





[jira] [Created] (ARROW-3535) [Python] pip install tensorflow install too new numpy in manylinux1 build

2018-10-16 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3535:
--

 Summary: [Python] pip install tensorflow install too new numpy in 
manylinux1 build
 Key: ARROW-3535
 URL: https://issues.apache.org/jira/browse/ARROW-3535
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Uwe L. Korn


This blocks us from doing a release again. We definitely need to get this split 
apart before we do another release.

[~pcmoritz] [~robertnishihara]





[jira] [Created] (ARROW-3534) [Python] Update zlib library in manylinux1 image

2018-10-16 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3534:
--

 Summary: [Python] Update zlib library in manylinux1 image
 Key: ARROW-3534
 URL: https://issues.apache.org/jira/browse/ARROW-3534
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn


Update to the latest release.





[jira] [Created] (ARROW-3533) [Python/Documentation] Use sphinx_rtd_theme instead of Bootstrap

2018-10-16 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3533:
--

 Summary: [Python/Documentation] Use sphinx_rtd_theme instead of 
Bootstrap
 Key: ARROW-3533
 URL: https://issues.apache.org/jira/browse/ARROW-3533
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Documentation, Python
Reporter: Uwe L. Korn


I have received feedback that the Arrow Python API documentation is a bit 
confusing because the ToC/menu is only really visible on the front page, and 
people get confused by the top header. As we are already diverging from the 
main homepage, since that was migrated to a newer Bootstrap version anyway, I 
suggest changing the documentation theme.

As a best practice, I would switch back to the sphinx_rtd_theme, as it 
provides a UX people are used to and happy with. We can customize it later if 
needed, as e.g. dask did: https://github.com/dask/dask-sphinx-theme





[jira] [Created] (ARROW-3530) [Java/Python] Add conversion for pyarrow.Schema from org.apache…pojo.Schema

2018-10-16 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3530:
--

 Summary: [Java/Python] Add conversion for pyarrow.Schema from 
org.apache…pojo.Schema
 Key: ARROW-3530
 URL: https://issues.apache.org/jira/browse/ARROW-3530
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Java, Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn








[jira] [Commented] (ARROW-3514) [Python] zlib deflate exception when writing Parquet file

2018-10-16 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651348#comment-16651348
 ] 

Uwe L. Korn commented on ARROW-3514:


{{auditwheel}} automatically vendors all libraries that should be shipped in 
the wheel and are not part of the base system as defined by the manylinux1 
specification. We should definitely build a newer version of zlib, but should 
still bundle it in the wheel.

> [Python] zlib deflate exception when writing Parquet file
> -
>
> Key: ARROW-3514
> URL: https://issues.apache.org/jira/browse/ARROW-3514
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Python
>Affects Versions: 0.11.0
> Environment: Amazon Linux, CentOS 7, Ubuntu 16.04, zlib 1.2.7/1.2.8, 
> CPython 3.6.
>Reporter: Adam Machanic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The below Python code throws an exception in 0.11.0, but not in 0.10.0.
> I was able to reproduce the issue in Amazon Linux, CentOS 7, and Ubuntu 
> 16.04, but not in Windows 7.
> The Amazon and CentOS machines are both running zlib 1.2.7, and the Ubuntu 
> machine is using 1.2.8.
> Tested with CPython 3.6 in all cases.
> {code:python}
> import io
> import pyarrow
> from pyarrow import parquet
> tbl = pyarrow.Table.from_arrays([pyarrow.array(['abc', 'def'])], ['some_col'])
> f = io.BytesIO()
> parquet.write_table(tbl, f, compression='gzip')
> {code}
> Following is the exception:
> {code}
> Traceback (most recent call last):
>   File "test_pyarrow.py", line 8, in <module>
> parquet.write_table(tbl, f, compression='gzip')
>   File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", 
> line 1125, in write_table
> writer.write_table(table, row_group_size=row_group_size)
>   File "/home/adam/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py", 
> line 376, in write_table
> self.writer.write_table(table, row_group_size=row_group_size)
>   File "pyarrow/_parquet.pyx", line 934, in 
> pyarrow._parquet.ParquetWriter.write_table
>   File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: Arrow error: IOError: zlib deflate failed, output 
> buffer too small
> {code}





[jira] [Commented] (ARROW-3513) [Packaging] Push nightly built development containers to dockerhub

2018-10-15 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16649821#comment-16649821
 ] 

Uwe L. Korn commented on ARROW-3513:


You can request one via INFRA.

> [Packaging] Push nightly built development containers to dockerhub
> --
>
> Key: ARROW-3513
> URL: https://issues.apache.org/jira/browse/ARROW-3513
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Priority: Major
>
> In order to do that, we need a dockerhub account.





[jira] [Commented] (ARROW-3483) [CI] Python 3.6 build failure on Travis-CI

2018-10-10 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645190#comment-16645190
 ] 

Uwe L. Korn commented on ARROW-3483:


This output looks suspicious:

{code}
Adding pyarrow 0.11.1.dev49+gcd6e094 to easy-install.pth file
Installing plasma_store script to 
/home/travis/build/apache/arrow/pyarrow-test-2.7/bin

Installed /home/travis/build/apache/arrow/python
Processing dependencies for pyarrow==0.11.1.dev49+gcd6e094
Searching for futures==3.2.0
Best match: futures 3.2.0
Adding futures 3.2.0 to easy-install.pth file

Using 
/home/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages
Searching for six==1.11.0
Best match: six 1.11.0
Adding six 1.11.0 to easy-install.pth file

Using 
/home/travis/build/apache/arrow/pyarrow-test-2.7/lib/python2.7/site-packages
Searching for numpy==1.15.2
Best match: numpy 1.15.2
Adding numpy 1.15.2 to easy-install.pth file
{code}

> [CI] Python 3.6 build failure on Travis-CI
> --
>
> Key: ARROW-3483
> URL: https://issues.apache.org/jira/browse/ARROW-3483
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration
>Reporter: Antoine Pitrou
>Priority: Major
>
> This seems to have appeared recently:
> https://travis-ci.org/apache/arrow/jobs/439696079#L4242
> {code}
> -- Looking for python3.6m
> -- Found Python lib 
> /home/travis/build/apache/arrow/pyarrow-test-3.6/lib/libpython3.6m.a
> CMake Error at cmake_modules/FindNumPy.cmake:62 (message):
>   NumPy import failure:
>   Traceback (most recent call last):
> File 
> "/home/travis/build/apache/arrow/pyarrow-test-3.6/lib/python3.6/site-packages/numpy/core/__init__.py",
>  line 16, in <module>
>   from . import multiarray
>   ImportError: libpython3.6m.so.1.0: cannot open shared object file: No such
>   file or directory
> {code}





[jira] [Resolved] (ARROW-3353) [Packaging] Build python 3.7 wheels

2018-10-10 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3353.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2740
[https://github.com/apache/arrow/pull/2740]

> [Packaging] Build python 3.7 wheels
> ---
>
> Key: ARROW-3353
> URL: https://issues.apache.org/jira/browse/ARROW-3353
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Follow-up of https://github.com/apache/arrow/pull/2462/files





[jira] [Created] (ARROW-3482) [C++] Build with JEMALLOC by default

2018-10-10 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3482:
--

 Summary: [C++] Build with JEMALLOC by default
 Key: ARROW-3482
 URL: https://issues.apache.org/jira/browse/ARROW-3482
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn
 Fix For: 0.12.0


We already build conda packages and wheels with {{jemalloc}} and have not had 
any user complaints about that for a long time. So it is finally stable and 
should be used by default.





[jira] [Assigned] (ARROW-3353) [Packaging] Build python 3.7 wheels

2018-10-10 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3353:
--

Assignee: Uwe L. Korn

> [Packaging] Build python 3.7 wheels
> ---
>
> Key: ARROW-3353
> URL: https://issues.apache.org/jira/browse/ARROW-3353
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Uwe L. Korn
>Priority: Major
>
> Follow-up of https://github.com/apache/arrow/pull/2462/files





[jira] [Commented] (ARROW-3353) [Packaging] Build python 3.7 wheels

2018-10-10 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644859#comment-16644859
 ] 

Uwe L. Korn commented on ARROW-3353:


I triggered some crossbow jobs for 0.11, let's see if they succeed

> [Packaging] Build python 3.7 wheels
> ---
>
> Key: ARROW-3353
> URL: https://issues.apache.org/jira/browse/ARROW-3353
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Uwe L. Korn
>Priority: Major
>
> Follow-up of https://github.com/apache/arrow/pull/2462/files





[jira] [Resolved] (ARROW-2808) [Python] Add unit tests for ProxyMemoryPool, enable new default MemoryPool to be constructed

2018-10-10 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-2808.

Resolution: Fixed

Issue resolved by pull request 2725
[https://github.com/apache/arrow/pull/2725]

> [Python] Add unit tests for ProxyMemoryPool, enable new default MemoryPool to 
> be constructed
> 
>
> Key: ARROW-2808
> URL: https://issues.apache.org/jira/browse/ARROW-2808
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Antoine Pitrou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I could not find unit tests for ProxyMemoryPool





[jira] [Commented] (ARROW-3475) C++ Int64Builder.Finish(NumericArray)

2018-10-09 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643604#comment-16643604
 ] 

Uwe L. Korn commented on ARROW-3475:


Yes, definitely!

> C++ Int64Builder.Finish(NumericArray)
> 
>
> Key: ARROW-3475
> URL: https://issues.apache.org/jira/browse/ARROW-3475
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wolf Vollprecht
>Priority: Minor
>
> I was intuitively thinking that the following code would work:
> {{Status s;}}
> {{Int64Builder builder;}}
> {{s = builder.Append(1);}}
> {{s = builder.Append(2);}}
> {{std::shared_ptr<NumericArray<Int64Type>> array;}}
> {{builder.Finish(&array);}}
> However, it does not seem to work, as {{Finish}} is not overloaded in the 
> Int64 builder (or the numeric builder).
> Would it make sense to add this interface?





[jira] [Updated] (ARROW-3438) [Packaging] Escaped bulletpoints in changelog

2018-10-05 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-3438:
---
Fix Version/s: (was: 0.11.0)
   0.12.0

> [Packaging] Escaped bulletpoints in changelog
> -
>
> Key: ARROW-3438
> URL: https://issues.apache.org/jira/browse/ARROW-3438
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> https://github.com/apache/arrow/blob/7940ffe559810fec82cb2fbb0b13f5809cb5fe85/CHANGELOG.md





[jira] [Resolved] (ARROW-3438) [Packaging] Escaped bulletpoints in changelog

2018-10-05 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3438.

   Resolution: Fixed
Fix Version/s: 0.11.0

Issue resolved by pull request 2706
[https://github.com/apache/arrow/pull/2706]

> [Packaging] Escaped bulletpoints in changelog
> -
>
> Key: ARROW-3438
> URL: https://issues.apache.org/jira/browse/ARROW-3438
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See 
> https://github.com/apache/arrow/blob/7940ffe559810fec82cb2fbb0b13f5809cb5fe85/CHANGELOG.md





[jira] [Commented] (ARROW-3444) Table.nbytes attribute

2018-10-05 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639374#comment-16639374
 ] 

Uwe L. Korn commented on ARROW-3444:


Have a look at 
https://github.com/xhochy/fletcher/blob/master/fletcher/base.py#L414-L423 where 
I have implemented that for Arrow columns in Pandas ExtensionArrays.

> Table.nbytes attribute
> --
>
> Key: ARROW-3444
> URL: https://issues.apache.org/jira/browse/ARROW-3444
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Dave Hirschfeld
>Priority: Minor
>
> As it says in the title, I think this would be a very handy attribute to have 
> available in Python. You can get it by converting to pandas and using 
> `DataFrame.nbytes` but this is wasteful of both time and memory so it would 
> be good to have this information on the `pyarrow.Table` object itself.
> This could be implemented using the 
> [__sizeof__|https://docs.python.org/3/library/sys.html#sys.getsizeof] protocol
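
The idea in the description, reporting the payload size without converting 
to pandas, can be illustrated with a minimal sketch. {{TinyTable}} is a 
hypothetical stand-in built on the standard library, not the pyarrow API; 
it only shows how an {{nbytes}} property and the {{__sizeof__}} protocol 
could fit together.

```python
import sys
from array import array


class TinyTable:
    """Hypothetical stand-in for a columnar table; not the pyarrow API."""

    def __init__(self, columns):
        # columns: mapping of column name -> array.array buffer
        self.columns = dict(columns)

    @property
    def nbytes(self):
        # Sum the payload size of every column buffer without copying data.
        return sum(buf.itemsize * len(buf) for buf in self.columns.values())

    def __sizeof__(self):
        # Report the plain object overhead plus the column payload,
        # so that sys.getsizeof() reflects the data the table holds.
        return object.__sizeof__(self) + self.nbytes


table = TinyTable({
    "a": array("q", [1, 2, 3]),        # three 64-bit integers
    "b": array("d", [0.5, 1.5, 2.5]),  # three 64-bit floats
})
payload = table.nbytes
reported = sys.getsizeof(table)
```

Here {{payload}} counts only the column buffers, while {{reported}} also 
includes per-object overhead, so the two numbers are not expected to match.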





[jira] [Updated] (ARROW-3443) [Java] Flight reports memory leaks in TestBasicOperation

2018-10-04 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-3443:
---
Fix Version/s: 0.11.0

> [Java] Flight reports memory leaks in TestBasicOperation
> 
>
> Key: ARROW-3443
> URL: https://issues.apache.org/jira/browse/ARROW-3443
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: FlightRPC, Java
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.11.0
>
>
> While running the release verification scripts on Ubuntu 16.04, I get the 
> following error in one of the flight tests:
> {code}
> [INFO] Running org.apache.arrow.flight.TestBasicOperation
> 63 6F 6F 6C 20 74 68 69 6E 67
> get
> put
> hello
> world
> 63 6F 6F 6C 20 74 68 69 6E 67
> [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.131 
> s - in org.apache.arrow.flight.TestBasicOperation
> [INFO] Running org.apache.arrow.flight.example.TestExampleServer
> Starting server.
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.234 
> s <<< FAILURE! - in org.apache.arrow.flight.example.TestExampleServer
> [ERROR] putStream(org.apache.arrow.flight.example.TestExampleServer)  Time 
> elapsed: 0.222 s  <<< ERROR!
> java.lang.IllegalStateException:
> Memory was leaked by query. Memory leaked: (66)
> Allocator(flight-server) 0/66/134/9223372036854775807 (res/actual/peak/limit)
> at 
> org.apache.arrow.flight.example.TestExampleServer.after(TestExampleServer.java:66)
> [INFO] Running org.apache.arrow.flight.perf.TestPerf
> Transferred 1 records totaling 32 bytes at 87,592919 mb/s. 
> 2870244,784388 record/s. 700,971181 batch/s.
> Transferred 1 records totaling 32 bytes at 121,977665 mb/s. 
> 3996964,136267 record/s. 976,138581 batch/s.
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 59.966 s <<< FAILURE! - in org.apache.arrow.flight.perf.TestPerf
> [ERROR] throughput(org.apache.arrow.flight.perf.TestPerf)  Time elapsed: 
> 59.964 s  <<< ERROR!
> java.lang.IllegalStateException:
> Memory was leaked by query. Memory leaked: (133120)
> Allocator(perf-server) 0/133120/267264/9223372036854775807 
> (res/actual/peak/limit)
> at org.apache.arrow.flight.perf.TestPerf.throughput(TestPerf.java:112)
> {code}





[jira] [Created] (ARROW-3443) [Java] Flight reports memory leaks in TestBasicOperation

2018-10-04 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3443:
--

 Summary: [Java] Flight reports memory leaks in TestBasicOperation
 Key: ARROW-3443
 URL: https://issues.apache.org/jira/browse/ARROW-3443
 Project: Apache Arrow
  Issue Type: Improvement
  Components: FlightRPC, Java
Reporter: Uwe L. Korn


While running the release verification scripts on Ubuntu 16.04, I get the 
following error in one of the flight tests:

{code}
[INFO] Running org.apache.arrow.flight.TestBasicOperation
63 6F 6F 6C 20 74 68 69 6E 67
get
put
hello
world
63 6F 6F 6C 20 74 68 69 6E 67
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.131 s 
- in org.apache.arrow.flight.TestBasicOperation
[INFO] Running org.apache.arrow.flight.example.TestExampleServer
Starting server.
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.234 s 
<<< FAILURE! - in org.apache.arrow.flight.example.TestExampleServer
[ERROR] putStream(org.apache.arrow.flight.example.TestExampleServer)  Time 
elapsed: 0.222 s  <<< ERROR!
java.lang.IllegalStateException:
Memory was leaked by query. Memory leaked: (66)
Allocator(flight-server) 0/66/134/9223372036854775807 (res/actual/peak/limit)

at 
org.apache.arrow.flight.example.TestExampleServer.after(TestExampleServer.java:66)

[INFO] Running org.apache.arrow.flight.perf.TestPerf
Transferred 1 records totaling 32 bytes at 87,592919 mb/s. 
2870244,784388 record/s. 700,971181 batch/s.
Transferred 1 records totaling 32 bytes at 121,977665 mb/s. 
3996964,136267 record/s. 976,138581 batch/s.
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 59.966 
s <<< FAILURE! - in org.apache.arrow.flight.perf.TestPerf
[ERROR] throughput(org.apache.arrow.flight.perf.TestPerf)  Time elapsed: 59.964 
s  <<< ERROR!
java.lang.IllegalStateException:
Memory was leaked by query. Memory leaked: (133120)
Allocator(perf-server) 0/133120/267264/9223372036854775807 
(res/actual/peak/limit)

at org.apache.arrow.flight.perf.TestPerf.throughput(TestPerf.java:112)
{code}
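
When triaging output like the above, it can help to pull the counters out of 
the allocator lines. The following is a minimal sketch that assumes the 
lines keep the {{Allocator(<name>) res/actual/peak/limit}} layout shown in 
the log; the helper name {{parse_allocator_line}} and the regular expression 
are illustrative and not part of Arrow.

```python
import re
from typing import NamedTuple, Optional


class AllocatorStats(NamedTuple):
    name: str
    res: int
    actual: int
    peak: int
    limit: int


# Matches e.g. "Allocator(flight-server) 0/66/134/9223372036854775807"
_PATTERN = re.compile(r"Allocator\((?P<name>[^)]+)\)\s+(\d+)/(\d+)/(\d+)/(\d+)")


def parse_allocator_line(line: str) -> Optional[AllocatorStats]:
    """Return the counters from one allocator line, or None if it does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    # groups() is (name, res, actual, peak, limit); convert the four counters.
    res, actual, peak, limit = (int(g) for g in match.groups()[1:])
    return AllocatorStats(match.group("name"), res, actual, peak, limit)


stats = parse_allocator_line(
    "Allocator(flight-server) 0/66/134/9223372036854775807 (res/actual/peak/limit)"
)
```

The limit value in the log equals 2**63 - 1, the largest signed 64-bit 
integer, and the {{actual}} field (66) matches the leaked amount reported on 
the preceding line.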





[jira] [Commented] (ARROW-3443) [Java] Flight reports memory leaks in TestBasicOperation

2018-10-04 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638836#comment-16638836
 ] 

Uwe L. Korn commented on ARROW-3443:


[~jnadeau] Is there any information I could provide to make debugging this 
simpler?

> [Java] Flight reports memory leaks in TestBasicOperation
> 
>
> Key: ARROW-3443
> URL: https://issues.apache.org/jira/browse/ARROW-3443
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: FlightRPC, Java
>Reporter: Uwe L. Korn
>Priority: Major
>
> While running the release verification scripts on Ubuntu 16.04, I get the 
> following error in one of the flight tests:
> {code}
> [INFO] Running org.apache.arrow.flight.TestBasicOperation
> 63 6F 6F 6C 20 74 68 69 6E 67
> get
> put
> hello
> world
> 63 6F 6F 6C 20 74 68 69 6E 67
> [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.131 
> s - in org.apache.arrow.flight.TestBasicOperation
> [INFO] Running org.apache.arrow.flight.example.TestExampleServer
> Starting server.
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.234 
> s <<< FAILURE! - in org.apache.arrow.flight.example.TestExampleServer
> [ERROR] putStream(org.apache.arrow.flight.example.TestExampleServer)  Time 
> elapsed: 0.222 s  <<< ERROR!
> java.lang.IllegalStateException:
> Memory was leaked by query. Memory leaked: (66)
> Allocator(flight-server) 0/66/134/9223372036854775807 (res/actual/peak/limit)
> at 
> org.apache.arrow.flight.example.TestExampleServer.after(TestExampleServer.java:66)
> [INFO] Running org.apache.arrow.flight.perf.TestPerf
> Transferred 1 records totaling 32 bytes at 87,592919 mb/s. 
> 2870244,784388 record/s. 700,971181 batch/s.
> Transferred 1 records totaling 32 bytes at 121,977665 mb/s. 
> 3996964,136267 record/s. 976,138581 batch/s.
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 59.966 s <<< FAILURE! - in org.apache.arrow.flight.perf.TestPerf
> [ERROR] throughput(org.apache.arrow.flight.perf.TestPerf)  Time elapsed: 
> 59.964 s  <<< ERROR!
> java.lang.IllegalStateException:
> Memory was leaked by query. Memory leaked: (133120)
> Allocator(perf-server) 0/133120/267264/9223372036854775807 
> (res/actual/peak/limit)
> at org.apache.arrow.flight.perf.TestPerf.throughput(TestPerf.java:112)
> {code}





[jira] [Resolved] (ARROW-3395) [C++/Python] Add docker container for linting

2018-10-02 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3395.

   Resolution: Fixed
Fix Version/s: 0.11.0

Issue resolved by pull request 2680
[https://github.com/apache/arrow/pull/2680]

> [C++/Python] Add docker container for linting
> -
>
> Key: ARROW-3395
> URL: https://issues.apache.org/jira/browse/ARROW-3395
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add a docker container that runs clang-format and flake8 checks.





[jira] [Created] (ARROW-3395) [C++/Python] Add docker container for linting

2018-10-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3395:
--

 Summary: [C++/Python] Add docker container for linting
 Key: ARROW-3395
 URL: https://issues.apache.org/jira/browse/ARROW-3395
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn


Add a docker container that runs clang-format and flake8 checks.





[jira] [Created] (ARROW-3392) [Python] Support filters in disjunctive normal form in ParquetDataset

2018-10-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3392:
--

 Summary: [Python] Support filters in disjunctive normal form in 
ParquetDataset
 Key: ARROW-3392
 URL: https://issues.apache.org/jira/browse/ARROW-3392
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn
 Fix For: 0.12.0


This allows us to represent any boolean predicate.
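
Disjunctive normal form means an OR over groups of AND-ed conditions. As a 
minimal sketch of how such a filter could be evaluated against a single row, 
assume the filter is a list of lists of {{(column, op, value)}} tuples. 
This encoding and the names {{OPS}} and {{matches}} are assumptions for 
illustration, not the ParquetDataset implementation.

```python
import operator

# Supported comparison operators for this sketch.
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(row, dnf):
    """Return True if `row` satisfies at least one AND-group in `dnf`."""
    return any(
        all(OPS[op](row[column], value) for column, op, value in conjunction)
        for conjunction in dnf
    )


# (year = 2018 AND country = 'FR') OR (country = 'DE')
dnf_filter = [
    [("year", "=", 2018), ("country", "=", "FR")],
    [("country", "=", "DE")],
]

hit = matches({"year": 2018, "country": "DE"}, dnf_filter)
miss = matches({"year": 2017, "country": "FR"}, dnf_filter)
```

The outer list is the OR level and each inner list is an AND level, so a 
purely conjunctive filter is simply an outer list with one inner list.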





[jira] [Created] (ARROW-3391) [Python] Support \0 characters in binary Parquet predicate values

2018-10-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3391:
--

 Summary: [Python] Support \0 characters in binary Parquet 
predicate values
 Key: ARROW-3391
 URL: https://issues.apache.org/jira/browse/ARROW-3391
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe L. Korn
 Fix For: 0.13.0


As we convert the predicate values of a Parquet filter to C-style strings in 
some intermediate steps, we currently disallow binary and string predicate 
values that contain {{\0}} bytes, as they would otherwise produce wrong 
results.
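
The failure mode can be illustrated with {{ctypes}} from the Python standard 
library: once a value is read as a NUL-terminated C string, everything after 
the first {{\0}} byte is lost. This is a minimal sketch of the general 
problem, not the actual conversion code; the helper name {{as_c_string}} is 
hypothetical.

```python
import ctypes

# Two different predicate values that share the prefix before the NUL byte.
first = b"ab\x00cd"
second = b"ab\x00zz"


def as_c_string(value: bytes) -> bytes:
    """Read `value` the way C code reads a NUL-terminated char*."""
    # create_string_buffer copies the bytes and appends a terminating NUL.
    buf = ctypes.create_string_buffer(value)
    # .value stops at the first NUL byte, like strlen()-based C code would.
    return buf.value


# Distinct inputs become indistinguishable after the round trip.
collide = as_c_string(first) == as_c_string(second)
```

Both values are read back as the same two-byte prefix, so a filter on one 
would also match the other.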





[jira] [Created] (ARROW-3388) [Python] boolean Partition keys in ParquetDataset are reconstructed as string

2018-10-01 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3388:
--

 Summary: [Python] boolean Partition keys in ParquetDataset are 
reconstructed as string
 Key: ARROW-3388
 URL: https://issues.apache.org/jira/browse/ARROW-3388
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Uwe L. Korn
 Fix For: 0.12.0


Saving a {{ParquetDataset}} using a boolean column as a partitioning column 
will store {{True/False}} as the values in the path. On reload these columns 
will then be string columns with the values {{'True'}} and {{'False'}}.
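
The type loss can be reproduced in a few lines. This is a minimal sketch 
that assumes partition values are written into {{key=value}} directory names 
and read back as text; the helper names are hypothetical and this is not the 
ParquetDataset code.

```python
def partition_dir(key, value):
    """Build a directory name for one partition value (assumed key=value layout)."""
    return f"{key}={value}"


def parse_partition_dir(name):
    """Split a directory name back into key and raw value; the value stays a string."""
    key, _, raw = name.partition("=")
    return key, raw


path = partition_dir("flag", True)        # the boolean is formatted as text
key, restored = parse_partition_dir(path)

# The path alone no longer says whether the value was a bool or a string.
same_type = isinstance(restored, bool)
```

Without extra type information (for example a stored schema), the reader 
cannot tell a boolean {{True}} from the string {{'True'}}.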





[jira] [Created] (ARROW-3363) [C++/Python] Add helper functions to detect scalar Python types

2018-09-29 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3363:
--

 Summary: [C++/Python] Add helper functions to detect scalar Python 
types
 Key: ARROW-3363
 URL: https://issues.apache.org/jira/browse/ARROW-3363
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn
 Fix For: 0.11.0








[jira] [Assigned] (ARROW-3339) [R] Support for character vectors

2018-09-29 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3339:
--

Assignee: Romain François

> [R] Support for character vectors
> -
>
> Key: ARROW-3339
> URL: https://issues.apache.org/jira/browse/ARROW-3339
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: R
>Reporter: Romain François
>Assignee: Romain François
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-3339) [R] Support for character vectors

2018-09-29 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3339.

   Resolution: Fixed
Fix Version/s: 0.11.0

Issue resolved by pull request 2654
[https://github.com/apache/arrow/pull/2654]

> [R] Support for character vectors
> -
>
> Key: ARROW-3339
> URL: https://issues.apache.org/jira/browse/ARROW-3339
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: R
>Reporter: Romain François
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-3327) [Python] manylinux container confusing

2018-09-27 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3327.

   Resolution: Fixed
Fix Version/s: 0.11.0

Issue resolved by pull request 2642
[https://github.com/apache/arrow/pull/2642]

> [Python] manylinux container confusing
> --
>
> Key: ARROW-3327
> URL: https://issues.apache.org/jira/browse/ARROW-3327
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.10.0
>Reporter: Antoine Pitrou
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The layout of the Docker container for manylinux builds is a bit confusing 
> and error-prone. The Arrow source code is present both in {{/arrow}} and 
> {{/io/arrow}}, and it's easy to get the two diverging if you're not careful.





[jira] [Created] (ARROW-3335) [Python] Add ccache to manylinux1 container

2018-09-26 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3335:
--

 Summary: [Python] Add ccache to manylinux1 container
 Key: ARROW-3335
 URL: https://issues.apache.org/jira/browse/ARROW-3335
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging, Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn
 Fix For: 0.11.0


Should make the recompilation steps a lot faster.





[jira] [Created] (ARROW-3334) [Python] Update conda packages to new numpy requirement

2018-09-26 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3334:
--

 Summary: [Python] Update conda packages to new numpy requirement
 Key: ARROW-3334
 URL: https://issues.apache.org/jira/browse/ARROW-3334
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Packaging, Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn
 Fix For: 0.11.0








[jira] [Resolved] (ARROW-3141) [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14

2018-09-26 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3141.

Resolution: Fixed

Issue resolved by pull request 2634
[https://github.com/apache/arrow/pull/2634]

> [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14
> --
>
> Key: ARROW-3141
> URL: https://issues.apache.org/jira/browse/ARROW-3141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging, Python
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This was introduced by https://github.com/apache/arrow/pull/2104/files
> Two options:
> * Don't build with tensorflow support by default
> * Increase our minimum supported NumPy version to 1.14 overall





[jira] [Assigned] (ARROW-3327) [Python] manylinux container confusing

2018-09-26 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3327:
--

Assignee: Uwe L. Korn

> [Python] manylinux container confusing
> --
>
> Key: ARROW-3327
> URL: https://issues.apache.org/jira/browse/ARROW-3327
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.10.0
>Reporter: Antoine Pitrou
>Assignee: Uwe L. Korn
>Priority: Major
>
> The layout of the Docker container for manylinux builds is a bit confusing 
> and error-prone. The Arrow source code is present both in {{/arrow}} and 
> {{/io/arrow}}, and it's easy to get the two diverging if you're not careful.





[jira] [Commented] (ARROW-3327) [Python] manylinux container confusing

2018-09-26 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628599#comment-16628599
 ] 

Uwe L. Korn commented on ARROW-3327:


This is no longer needed after the Parquet merge. I'll do some cleanup.

> [Python] manylinux container confusing
> --
>
> Key: ARROW-3327
> URL: https://issues.apache.org/jira/browse/ARROW-3327
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.10.0
>Reporter: Antoine Pitrou
>Priority: Major
>
> The layout of the Docker container for manylinux builds is a bit confusing 
> and error-prone. The Arrow source code is present both in {{/arrow}} and 
> {{/io/arrow}}, and it's easy to get the two diverging if you're not careful.





[jira] [Assigned] (ARROW-3141) [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14

2018-09-25 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3141:
--

Assignee: Uwe L. Korn

> [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14
> --
>
> Key: ARROW-3141
> URL: https://issues.apache.org/jira/browse/ARROW-3141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging, Python
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
> Fix For: 0.11.0
>
>
> This was introduced by https://github.com/apache/arrow/pull/2104/files
> Two options:
> * Don't build with tensorflow support by default
> * Increase our minimum supported NumPy version to 1.14 overall





[jira] [Commented] (ARROW-3141) [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14

2018-09-25 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626985#comment-16626985
 ] 

Uwe L. Korn commented on ARROW-3141:


Then I'll up the numpy requirement to 1.14 and make a follow-up JIRA to 
separate it into its own package for 0.12.

> [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14
> --
>
> Key: ARROW-3141
> URL: https://issues.apache.org/jira/browse/ARROW-3141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging, Python
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.11.0
>
>
> This was introduced by https://github.com/apache/arrow/pull/2104/files
> Two options:
> * Don't build with tensorflow support by default
> * Increase our minimum supported NumPy version to 1.14 overall





[jira] [Updated] (ARROW-1796) [Python] RowGroup filtering on file level

2018-09-24 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-1796:
---
Labels: parquet  (was: )

> [Python] RowGroup filtering on file level
> -
>
> Key: ARROW-1796
> URL: https://issues.apache.org/jira/browse/ARROW-1796
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: parquet
> Fix For: 0.12.0
>
>
> We can build upon the API defined in {{fastparquet}} for defining RowGroup 
> filters: 
> https://github.com/dask/fastparquet/blob/master/fastparquet/api.py#L296-L300 
> and translate them into the C++ enums we will define in 
> https://issues.apache.org/jira/browse/PARQUET-1158 . This should enable us to 
> provide the user with a simple predicate pushdown API that we can extend in 
> the background from RowGroup to Page level later on.
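
As a rough illustration of what RowGroup-level predicate pushdown means, the 
sketch below decides which row groups need to be read at all. It assumes 
each row group exposes min/max statistics for the filtered column; the 
{{RowGroupStats}} type and both helpers are hypothetical and not part of 
pyarrow, fastparquet or parquet-cpp.

```python
from typing import NamedTuple


class RowGroupStats(NamedTuple):
    index: int
    min_value: int
    max_value: int


def may_contain(stats, op, value):
    """Return False only if no row in the group can satisfy `column op value`."""
    if op == "=":
        return stats.min_value <= value <= stats.max_value
    if op == "<":
        return stats.min_value < value
    if op == "<=":
        return stats.min_value <= value
    if op == ">":
        return stats.max_value > value
    if op == ">=":
        return stats.max_value >= value
    return True  # unknown operator: stay conservative and read the group


def select_row_groups(groups, op, value):
    """Return the indices of the row groups that still have to be read."""
    return [g.index for g in groups if may_contain(g, op, value)]


groups = [RowGroupStats(0, 0, 9), RowGroupStats(1, 10, 19), RowGroupStats(2, 20, 29)]
to_read = select_row_groups(groups, "=", 15)
```

The check is deliberately conservative: a selected row group may still 
contain non-matching rows, so the rows that are read still need to be 
filtered afterwards.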





[jira] [Commented] (ARROW-3141) [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14

2018-09-24 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625706#comment-16625706
 ] 

Uwe L. Korn commented on ARROW-3141:


Actually, we will then only have this feature as part of the wheel but conda 
packages will not provide it. I'm a bit hesitant to just fix this by increasing 
the minimal numpy version. The best way forward is definitely to make it a 
standalone Python package {{pyarrow.tensorflow}}.

> [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14
> --
>
> Key: ARROW-3141
> URL: https://issues.apache.org/jira/browse/ARROW-3141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging, Python
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.11.0
>
>
> This was introduced by https://github.com/apache/arrow/pull/2104/files
> Two options:
> * Don't build with tensorflow support by default
> * Increase our minimum supported NumPy version to 1.14 overall





[jira] [Commented] (ARROW-3141) [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14

2018-09-24 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625703#comment-16625703
 ] 

Uwe L. Korn commented on ARROW-3141:


A short-term fix would be to raise the minimal requirement to 1.14 and see what 
user feedback we get.

> [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14
> --
>
> Key: ARROW-3141
> URL: https://issues.apache.org/jira/browse/ARROW-3141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging, Python
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.11.0
>
>
> This was introduced by https://github.com/apache/arrow/pull/2104/files
> Two options:
> * Don't build with tensorflow support by default
> * Increase our minimum supported NumPy version to 1.14 overall





[jira] [Commented] (ARROW-3076) [Website] Add Google Analytics tags to generated API documentation

2018-09-22 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624777#comment-16624777
 ] 

Uwe L. Korn commented on ARROW-3076:


No, we probably have to read up on that a bit. The solution for my private blog 
was to remove any external trackers.

> [Website] Add Google Analytics tags to generated API documentation
> --
>
> Key: ARROW-3076
> URL: https://issues.apache.org/jira/browse/ARROW-3076
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Website
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.11.0
>
>
> It would be helpful to see which parts of the documentation are seeing traffic





[jira] [Commented] (ARROW-3086) [Glib] GISCAN fails due to conda-shipped openblas

2018-09-22 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624775#comment-16624775
 ] 

Uwe L. Korn commented on ARROW-3086:


No

> [Glib] GISCAN fails due to conda-shipped openblas
> -
>
> Key: ARROW-3086
> URL: https://issues.apache.org/jira/browse/ARROW-3086
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: GLib
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Assignee: Kouhei Sutou
>Priority: Major
> Fix For: 0.12.0
>
>
> With the changes in [https://github.com/apache/arrow/pull/2374], the 
> libraries provided by conda are now in the library path when running the 
> GISCAN step. This sadly leads to the poisoning of the search path with the 
> conda provided openblas which is incompatible with the system provided 
> libLAPACK.dylib
> {code:java}
> dyld: Library not loaded: 
> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
> Referenced from: 
> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib
> Reason: Incompatible library version: vecLib requires version 1.0.0 or later, 
> but libLAPACK.dylib provides version 0.0.0{code}
> Although the error message says that it explicitly loads 
> {{/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib}},
>  it seems that {{liblapack.so}} from the conda installation gets picked up 
> first. This only provides the library symbols with version 0.0.0 and thus is 
> incompatible.





[jira] [Updated] (ARROW-3086) [Glib] GISCAN fails due to conda-shipped openblas

2018-09-22 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-3086:
---
Fix Version/s: (was: 0.11.0)
   0.12.0

> [Glib] GISCAN fails due to conda-shipped openblas
> -
>
> Key: ARROW-3086
> URL: https://issues.apache.org/jira/browse/ARROW-3086
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: GLib
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Assignee: Kouhei Sutou
>Priority: Major
> Fix For: 0.12.0
>
>
> With the changes in [https://github.com/apache/arrow/pull/2374], the 
> libraries provided by conda are now in the library path when running the 
> GISCAN step. This sadly leads to the poisoning of the search path with the 
> conda provided openblas which is incompatible with the system provided 
> libLAPACK.dylib
> {code:java}
> dyld: Library not loaded: 
> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
> Referenced from: 
> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib
> Reason: Incompatible library version: vecLib requires version 1.0.0 or later, 
> but libLAPACK.dylib provides version 0.0.0{code}
> Although the error mentions that it explicitly loads 
> {{/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib}},
>  it seems that {{liblapack.so}} from the conda installation gets picked up 
> first. That library only provides symbols with version 0.0.0 and is thus 
> incompatible.





[jira] [Updated] (ARROW-3301) [Website] Update Jekyll and Bootstrap 4

2018-09-22 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn updated ARROW-3301:
---
Summary: [Website] Update Jekyll and Bootstrap 4  (was: [Website] Update 
Jekyll and Bootstrap)

> [Website] Update Jekyll and Bootstrap 4
> ---
>
> Key: ARROW-3301
> URL: https://issues.apache.org/jira/browse/ARROW-3301
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Website
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>
> Update to Bootstrap version 4





[jira] [Created] (ARROW-3301) [Website] Update Jekyll and Bootstrap

2018-09-22 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-3301:
--

 Summary: [Website] Update Jekyll and Bootstrap
 Key: ARROW-3301
 URL: https://issues.apache.org/jira/browse/ARROW-3301
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Website
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn


Update to Bootstrap version 4





[jira] [Resolved] (ARROW-3143) [C++] CopyBitmap into existing memory

2018-09-22 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3143.

Resolution: Fixed

Issue resolved by pull request 2526
[https://github.com/apache/arrow/pull/2526]

> [C++] CopyBitmap into existing memory
> -
>
> Key: ARROW-3143
> URL: https://issues.apache.org/jira/browse/ARROW-3143
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> {{CopyBitmap}} currently always allocates a new Buffer for its result. We 
> also want to support the case where we insert into existing memory.
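
The request above can be illustrated with a minimal Python sketch. It is not the Arrow C++ implementation; the function names are made up for illustration, and the only assumption carried over from Arrow is that validity bitmaps are packed least-significant-bit first. The point is that the caller supplies the destination buffer and a destination bit offset, so nothing is allocated.

```python
def get_bit(buf, i):
    """Return bit i of buf, counting from the least significant bit of byte 0."""
    return (buf[i // 8] >> (i % 8)) & 1


def set_bit_to(buf, i, value):
    """Set bit i of the mutable buffer buf to value (0 or 1)."""
    if value:
        buf[i // 8] |= 1 << (i % 8)
    else:
        buf[i // 8] &= ~(1 << (i % 8)) & 0xFF


def copy_bitmap_into(src, src_offset, length, dest, dest_offset):
    """Copy `length` bits from src (starting at bit src_offset) into the
    caller-provided buffer dest (starting at bit dest_offset).

    Hypothetical helper: dest is modified in place and no new buffer is
    allocated. Bits of dest outside the target range are left untouched.
    """
    if min(src_offset, dest_offset, length) < 0:
        raise ValueError("offsets and length must be non-negative")
    if src_offset + length > len(src) * 8:
        raise ValueError("source range exceeds source bitmap")
    if dest_offset + length > len(dest) * 8:
        raise ValueError("destination range exceeds destination bitmap")
    for i in range(length):
        set_bit_to(dest, dest_offset + i, get_bit(src, src_offset + i))


# Usage: copy 4 bits starting at bit 2 of src into dest starting at bit 6.
src = bytes([0b10110100])
dest = bytearray([0xFF, 0x00])
copy_bitmap_into(src, 2, 4, dest, 6)
```

A real implementation would copy whole bytes or words where the offsets allow it; the bit-by-bit loop is only meant to make the offset handling explicit.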





[jira] [Commented] (ARROW-3280) [Python] Difficulty running tests after conda install

2018-09-22 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624699#comment-16624699
 ] 

Uwe L. Korn commented on ARROW-3280:


[~mrocklin] Please delete your {{/home/mrocklin/workspace/arrow/python/.eggs}} 
folder; it contains an old {{setuptools_scm}} version that is probably causing 
the problem here. Then install {{setuptools_scm}} into the environment using 
{{pip}} or {{conda}}.

> [Python] Difficulty running tests after conda install
> -
>
> Key: ARROW-3280
> URL: https://issues.apache.org/jira/browse/ARROW-3280
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.10.0
> Environment: conda create -n test-arrow pytest ipython pandas nomkl 
> pyarrow -c conda-forge
> Ubuntu 16.04
>Reporter: Matthew Rocklin
>Priority: Minor
>  Labels: python
>
> I install PyArrow from conda-forge, and then try running tests (or import 
> generally)
> {code:java}
> conda create -n test-arrow pytest ipython pandas nomkl pyarrow -c conda-forge 
> {code}
> {code:java}
> mrocklin@carbon:~/workspace/arrow/python$ py.test pyarrow/tests/test_parquet.py
> Traceback (most recent call last):
>   File "/home/mrocklin/Software/anaconda/lib/python3.6/site-packages/_pytest/config.py", line 328, in _getconftestmodules
>     return self._path2confmods[path]
> KeyError: local('/home/mrocklin/workspace/arrow/python/pyarrow/tests/test_parquet.py')
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/mrocklin/Software/anaconda/lib/python3.6/site-packages/_pytest/config.py", line 328, in _getconftestmodules
>     return self._path2confmods[path]
> KeyError: local('/home/mrocklin/workspace/arrow/python/pyarrow/tests')
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/mrocklin/Software/anaconda/lib/python3.6/site-packages/_pytest/config.py", line 359, in _importconftest
>     return self._conftestpath2mod[conftestpath]
> KeyError: local('/home/mrocklin/workspace/arrow/python/pyarrow/tests/conftest.py')
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/home/mrocklin/Software/anaconda/lib/python3.6/site-packages/_pytest/config.py", line 365, in _importconftest
>     mod = conftestpath.pyimport()
>   File "/home/mrocklin/Software/anaconda/lib/python3.6/site-packages/py/_path/local.py", line 668, in pyimport
>     __import__(modname)
>   File "/home/mrocklin/workspace/arrow/python/pyarrow/__init__.py", line 54, in <module>
>     from pyarrow.lib import cpu_count, set_cpu_count
> ModuleNotFoundError: No module named 'pyarrow.lib'
> ERROR: could not load /home/mrocklin/workspace/arrow/python/pyarrow/tests/conftest.py
> {code}
> Probably this is something wrong with my environment, but I thought I'd 
> report it as a usability bug





[jira] [Commented] (ARROW-1993) [Python] Add function for determining implied Arrow schema from pandas.DataFrame

2018-09-22 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624698#comment-16624698
 ] 

Uwe L. Korn commented on ARROW-1993:


We need to delay this to 0.12 or 0.13. It needs a lot more work to avoid costly 
operations.

> [Python] Add function for determining implied Arrow schema from 
> pandas.DataFrame
> 
>
> Key: ARROW-1993
> URL: https://issues.apache.org/jira/browse/ARROW-1993
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: beginner, pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently the only option is to use {{Table/Array.from_pandas}} which does 
> significant unnecessary work and allocates memory. If only the schema is of 
> interest, then we could do less work and not allocate memory.
> We should provide the user a function {{pyarrow.Schema.from_pandas}} which 
> takes a DataFrame as an input and returns the respective Arrow schema. The 
> functionality for determining the schema is already available in the Python 
> code; at the moment it is just very tightly bound to the conversion 
> infrastructure.





[jira] [Resolved] (ARROW-3300) [Release] Update .deb package names in preparation

2018-09-22 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3300.

   Resolution: Fixed
Fix Version/s: 0.11.0

Issue resolved by pull request 2606
[https://github.com/apache/arrow/pull/2606]

> [Release] Update .deb package names in preparation
> --
>
> Key: ARROW-3300
> URL: https://issues.apache.org/jira/browse/ARROW-3300
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need to use the libarrow${SO_VERSION} package name for .deb packages.





[jira] [Commented] (ARROW-3271) [Python] Manylinux1 builds timing out in Travis CI

2018-09-20 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621974#comment-16621974
 ] 

Uwe L. Korn commented on ARROW-3271:


We could limit the manylinux1 builds to, e.g., a single Python version to 
improve the build times. These builds are quite important because they are the 
lower bound of the compiler versions that we support.

> [Python] Manylinux1 builds timing out in Travis CI
> --
>
> Key: ARROW-3271
> URL: https://issues.apache.org/jira/browse/ARROW-3271
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Continuous Integration
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.11.0
>
>
> Not sure why this is happening -- I think the docker pull has been a lot 
> slower of late





[jira] [Commented] (ARROW-3243) [C++] Upgrade jemalloc to version 5

2018-09-20 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621971#comment-16621971
 ] 

Uwe L. Korn commented on ARROW-3243:


The patch we have is only relevant for jemalloc 4; it is already included in 
the released jemalloc 5 branch. Sadly, jemalloc 5 had some changes that made it 
unusable in the {{manylinux1}} setting. These may have been resolved by now, in 
which case we could switch to a newer version. You can try this simply by 
changing the installation script. Otherwise we probably have to wait until our 
wheels are based on {{manylinux2010}}.

> [C++] Upgrade jemalloc to version 5
> ---
>
> Key: ARROW-3243
> URL: https://issues.apache.org/jira/browse/ARROW-3243
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Philipp Moritz
>Priority: Major
>
> Is it possible/feasible to upgrade jemalloc to version 5 and assume that 
> version? I'm asking because I've been working towards replacing dlmalloc in 
> plasma with jemalloc, which makes some of the code much nicer and removes 
> some of the issues we had with dlmalloc, but it requires jemalloc APIs that 
> are only available starting from jemalloc version 5, in particular, I'm using 
> the extent_hooks_t capability.
> For now I can submit a patch that uses a different version of jemalloc in 
> plasma and then we can figure out how to deal with it (maybe there is a way 
> to make it work with older versions). What are your thoughts?





[jira] [Commented] (ARROW-3141) [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14

2018-09-20 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621952#comment-16621952
 ] 

Uwe L. Korn commented on ARROW-3141:


I'm not really sure how long people stay on old NumPy versions. I guess we can 
increase the minimal version. Still, we should be very careful about the NumPy 
version in our builds and should not let it update automatically.

> [Python] Tensorflow support in pyarrow wheels pins numpy>=1.14
> --
>
> Key: ARROW-3141
> URL: https://issues.apache.org/jira/browse/ARROW-3141
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Packaging, Python
>Affects Versions: 0.10.0
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.11.0
>
>
> This was introduced by https://github.com/apache/arrow/pull/2104/files
> Two options:
> * Don't build with tensorflow support by default
> * Increase our minimum supported NumPy version to 1.14 overall





[jira] [Resolved] (ARROW-3069) [Release] Stop using SHA1 checksums per ASF policy

2018-09-20 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3069.

Resolution: Fixed

Issue resolved by pull request 2584
[https://github.com/apache/arrow/pull/2584]

> [Release] Stop using SHA1 checksums per ASF policy
> --
>
> Key: ARROW-3069
> URL: https://issues.apache.org/jira/browse/ARROW-3069
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://www.apache.org/dev/release-distribution#sigs-and-sums
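
As a rough illustration of what dropping SHA-1 means for a release script, here is a minimal Python sketch using only the standard library. It assumes, based on my reading of the linked policy page, that SHA-256 and SHA-512 are the acceptable algorithms; the function name and the allow-list are hypothetical and are not taken from the Arrow release tooling.

```python
import hashlib

# Assumption: only these algorithms are acceptable for new release checksums.
ALLOWED_ALGORITHMS = ("sha256", "sha512")


def release_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`.

    Hypothetical helper: refuses algorithms outside ALLOWED_ALGORITHMS
    (for example "sha1" or "md5") and reads the file in chunks so large
    release artifacts do not have to fit in memory.
    """
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError("unsupported checksum algorithm: %s" % algorithm)
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

A release script could call this once per algorithm and write each digest next to the artifact, for example into a `.sha256` and a `.sha512` file.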





[jira] [Resolved] (ARROW-3262) [Python] Implement __getitem__ with integers on pyarrow.Column

2018-09-20 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3262.

Resolution: Fixed

Issue resolved by pull request 2585
[https://github.com/apache/arrow/pull/2585]

> [Python] Implement __getitem__ with integers on pyarrow.Column
> --
>
> Key: ARROW-3262
> URL: https://issues.apache.org/jira/browse/ARROW-3262
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 0.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This would improve interactive usability
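
To make the idea concrete, here is a small plain-Python sketch of integer indexing over chunked data. It is a stand-in, not the pyarrow implementation: the class name and its layout (a list of chunks) are assumptions chosen to mirror a column whose values are stored in several chunks.

```python
class ChunkedColumn:
    """Hypothetical stand-in for a column whose values live in several chunks."""

    def __init__(self, name, chunks):
        self.name = name
        self.chunks = [list(chunk) for chunk in chunks]

    def __len__(self):
        return sum(len(chunk) for chunk in self.chunks)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("only integer indices are supported in this sketch")
        length = len(self)
        if index < 0:
            index += length  # support negative indices like Python sequences
        if not 0 <= index < length:
            raise IndexError("column index out of range")
        # Walk the chunks, subtracting each chunk's length until the index
        # falls inside the current chunk. Empty chunks are skipped naturally.
        for chunk in self.chunks:
            if index < len(chunk):
                return chunk[index]
            index -= len(chunk)
        raise AssertionError("unreachable: index was validated above")


# Usage: three chunks, one of them empty.
col = ChunkedColumn("x", [[1, 2], [], [3, 4, 5]])
first, third, last = col[0], col[2], col[-1]
```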



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-3267) [Python] Create empty table from schema

2018-09-20 Thread Uwe L. Korn (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621664#comment-16621664
 ] 

Uwe L. Korn commented on ARROW-3267:


[~Paul.Rogers] We already have the necessary builder infrastructure; this 
function is mainly there to have something to pass around when there is no 
data. Also, the {{Table}} instance is not meant to be modified, i.e. it will 
stay empty all along the pipeline.

> [Python] Create empty table from schema
> ---
>
> Key: ARROW-3267
> URL: https://issues.apache.org/jira/browse/ARROW-3267
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
> Fix For: 0.11.0
>
>
> When one knows the expected schema for the input data but has no input data 
> for a data pipeline, it is necessary to construct an empty table as a 
> sentinel value to pass through.
> This is a small but often useful convenience function.
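
The sentinel idea can be sketched in plain Python. This deliberately does not use the pyarrow API; the schema representation (a list of name/type pairs), the table representation (a dict of immutable tuples), and all function names are assumptions made for illustration.

```python
def empty_table(schema):
    """Build an empty table (column name -> empty tuple) from a schema.

    `schema` is a list of (name, type) pairs. Tuples are used for the
    columns because the sentinel is not meant to be modified.
    """
    return {name: () for name, _type in schema}


def num_rows(table):
    """Number of rows, taken from the first column (0 for a table without columns)."""
    return len(next(iter(table.values()), ()))


def keep_positive(table, column):
    """Example pipeline stage: keep the rows where `column` is positive.

    It works unchanged on the empty sentinel because every column the
    stage expects is present, just with zero rows.
    """
    keep = [i for i, value in enumerate(table[column]) if value > 0]
    return {name: tuple(values[i] for i in keep) for name, values in table.items()}


# Usage: the stage runs on the sentinel without any special-casing.
schema = [("id", int), ("score", float)]
sentinel = empty_table(schema)
result = keep_positive(sentinel, "score")
```

The benefit over passing `None` is that downstream stages can rely on the column names being present and do not need a separate "no data" branch.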



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

