[jira] [Commented] (ARROW-2251) [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes a crash

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385596#comment-16385596
 ] 

ASF GitHub Bot commented on ARROW-2251:
---

wesm commented on issue #1691: ARROW-2251: [GLib] Keep GArrowBuffer alive while 
GArrowTensor for the buffer is live
URL: https://github.com/apache/arrow/pull/1691#issuecomment-370307596
 
 
   I see, so there's a "weak" reference to memory that is held by another 
buffer object. We have to deal with some issues like this in Python to handle 
dependency chains where the `shared_ptr` is not aware of memory 
relationships expressed at the Python level


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes 
> a crash
> -
>
> Key: ARROW-2251
> URL: https://issues.apache.org/jira/browse/ARROW-2251
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: GLib
>Affects Versions: 0.8.0
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2251) [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes a crash

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385598#comment-16385598
 ] 

ASF GitHub Bot commented on ARROW-2251:
---

wesm closed pull request #1691: ARROW-2251: [GLib] Keep GArrowBuffer alive 
while GArrowTensor for the buffer is live
URL: https://github.com/apache/arrow/pull/1691
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/c_glib/arrow-glib/input-stream.cpp 
b/c_glib/arrow-glib/input-stream.cpp
index 94422241b..f602e5f7e 100644
--- a/c_glib/arrow-glib/input-stream.cpp
+++ b/c_glib/arrow-glib/input-stream.cpp
@@ -282,7 +282,7 @@ 
garrow_seekable_input_stream_read_tensor(GArrowSeekableInputStream *input_stream
arrow_random_access_file.get(),
&arrow_tensor);
   if (garrow_error_check(error, status, 
"[seekable-input-stream][read-tensor]")) {
-return garrow_tensor_new_raw(&arrow_tensor);
+return garrow_tensor_new_raw(&arrow_tensor, nullptr);
   } else {
 return NULL;
   }
diff --git a/c_glib/arrow-glib/tensor.cpp b/c_glib/arrow-glib/tensor.cpp
index 3325f8511..359831f67 100644
--- a/c_glib/arrow-glib/tensor.cpp
+++ b/c_glib/arrow-glib/tensor.cpp
@@ -40,11 +40,13 @@ G_BEGIN_DECLS
 
 typedef struct GArrowTensorPrivate_ {
   std::shared_ptr<arrow::Tensor> tensor;
+  GArrowBuffer *buffer;
 } GArrowTensorPrivate;
 
 enum {
   PROP_0,
-  PROP_TENSOR
+  PROP_TENSOR,
+  PROP_BUFFER
 };
 
 G_DEFINE_TYPE_WITH_PRIVATE(GArrowTensor, garrow_tensor, G_TYPE_OBJECT)
@@ -52,6 +54,19 @@ G_DEFINE_TYPE_WITH_PRIVATE(GArrowTensor, garrow_tensor, 
G_TYPE_OBJECT)
 #define GARROW_TENSOR_GET_PRIVATE(obj)   \
   (G_TYPE_INSTANCE_GET_PRIVATE((obj), GARROW_TYPE_TENSOR, GArrowTensorPrivate))
 
+static void
+garrow_tensor_dispose(GObject *object)
+{
+  auto priv = GARROW_TENSOR_GET_PRIVATE(object);
+
+  if (priv->buffer) {
+    g_object_unref(priv->buffer);
+    priv->buffer = nullptr;
+  }
+
+  G_OBJECT_CLASS(garrow_tensor_parent_class)->dispose(object);
+}
+
 static void
 garrow_tensor_finalize(GObject *object)
 {
@@ -64,9 +79,9 @@ garrow_tensor_finalize(GObject *object)
 
 static void
 garrow_tensor_set_property(GObject *object,
-  guint prop_id,
-  const GValue *value,
-  GParamSpec *pspec)
+   guint prop_id,
+   const GValue *value,
+   GParamSpec *pspec)
 {
   auto priv = GARROW_TENSOR_GET_PRIVATE(object);
 
@@ -75,6 +90,9 @@ garrow_tensor_set_property(GObject *object,
     priv->tensor =
       *static_cast<std::shared_ptr<arrow::Tensor> *>(g_value_get_pointer(value));
     break;
+  case PROP_BUFFER:
+    priv->buffer = GARROW_BUFFER(g_value_dup_object(value));
+    break;
   default:
     G_OBJECT_WARN_INVALID_PROPERTY_ID(object, prop_id, pspec);
     break;
@@ -83,11 +101,16 @@ garrow_tensor_set_property(GObject *object,
 
 static void
 garrow_tensor_get_property(GObject *object,
-  guint prop_id,
-  GValue *value,
-  GParamSpec *pspec)
+   guint prop_id,
+   GValue *value,
+   GParamSpec *pspec)
 {
+  auto priv = GARROW_TENSOR_GET_PRIVATE(object);
+
   switch (prop_id) {
+  case PROP_BUFFER:
+    g_value_set_object(value, priv->buffer);
+    break;
   default:
     G_OBJECT_WARN_INVALID_PROPERTY_ID(object, prop_id, pspec);
     break;
@@ -106,6 +129,7 @@ garrow_tensor_class_init(GArrowTensorClass *klass)
 
   auto gobject_class = G_OBJECT_CLASS(klass);
 
+  gobject_class->dispose  = garrow_tensor_dispose;
   gobject_class->finalize = garrow_tensor_finalize;
   gobject_class->set_property = garrow_tensor_set_property;
   gobject_class->get_property = garrow_tensor_get_property;
@@ -116,6 +140,14 @@ garrow_tensor_class_init(GArrowTensorClass *klass)
                               static_cast<GParamFlags>(G_PARAM_WRITABLE |
                                                        G_PARAM_CONSTRUCT_ONLY));
   g_object_class_install_property(gobject_class, PROP_TENSOR, spec);
+
+  spec = g_param_spec_object("buffer",
+                             "Buffer",
+                             "The data",
+                             GARROW_TYPE_BUFFER,
+                             static_cast<GParamFlags>(G_PARAM_READWRITE |
+                                                      G_PARAM_CONSTRUCT_ONLY));
+  g_object_class_install_property(gobject_class, PROP_BUFFER, spec);
 }
 
 /**
@@ -166,7 +198,7 @@ garrow_tensor_new(GArrowDataType *data_type,
 arrow_shape,
   

[jira] [Resolved] (ARROW-2251) [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes a crash

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-2251.
-
Resolution: Fixed

Issue resolved by pull request 1691
[https://github.com/apache/arrow/pull/1691]



[jira] [Commented] (ARROW-2251) [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes a crash

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385595#comment-16385595
 ] 

ASF GitHub Bot commented on ARROW-2251:
---

kou commented on issue #1691: ARROW-2251: [GLib] Keep GArrowBuffer alive while 
GArrowTensor for the buffer is live
URL: https://github.com/apache/arrow/pull/1691#issuecomment-370307240
 
 
   Partially right. `shared_ptr<arrow::Tensor>` keeps 
`shared_ptr<arrow::Buffer>` alive, but the memory behind the 
`shared_ptr<arrow::Buffer>` may be freed when the `shared_ptr<arrow::Buffer>` 
merely refers to external memory. This happens when the 
`shared_ptr<arrow::Buffer>` is created by `Arrow::Buffer.new("...data...")` in 
Ruby. (It creates a `GArrowBuffer` in C.) The `"...data..."` should stay alive 
while the `Arrow::Buffer` is alive. Without this change, only the 
`shared_ptr<arrow::Buffer>` is alive. With this change, both the 
`shared_ptr<arrow::Buffer>` and the `GArrowBuffer` are alive. The 
`GArrowBuffer` should keep the data alive.
   
   I'll send one more pull request to improve memory management in 
`GArrowBuffer`.
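
The lifetime relationship described in this comment can be illustrated with a minimal Python sketch. The names `BufferWrapper`, `TensorWrapper`, `keep_buffer` and `make_tensor` are hypothetical stand-ins invented for illustration (roughly `GArrowBuffer` and `GArrowTensor`), not Arrow APIs. The weak reference models a link that does not keep the buffer wrapper alive, and the strong reference models the idea behind this pull request.

```python
import gc
import weakref


class BufferWrapper:
    """Hypothetical stand-in for GArrowBuffer: it owns the data."""

    def __init__(self, data):
        self.data = data


class TensorWrapper:
    """Hypothetical stand-in for GArrowTensor."""

    def __init__(self, buffer, keep_buffer):
        # Non-owning link: on its own it does not keep the wrapper alive.
        self._buffer_ref = weakref.ref(buffer)
        # Owning link (the idea of the fix): also hold a strong reference.
        self._buffer = buffer if keep_buffer else None

    def buffer_is_alive(self):
        return self._buffer_ref() is not None


def make_tensor(keep_buffer):
    buffer = BufferWrapper(bytearray(b"...data..."))
    tensor = TensorWrapper(buffer, keep_buffer)
    del buffer  # the caller drops its own reference to the buffer wrapper
    gc.collect()
    return tensor


print(make_tensor(keep_buffer=False).buffer_is_alive())
print(make_tensor(keep_buffer=True).buffer_is_alive())
```

Under CPython's reference counting, the first tensor loses its buffer wrapper as soon as the caller drops it, while the second keeps it alive for as long as the tensor exists.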




[jira] [Updated] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2254:

Fix Version/s: (was: 0.9.0)
   0.10.0

> [Python] Local in-place dev versions picking up JS tags
> ---
>
> Key: ARROW-2254
> URL: https://issues.apache.org/jira/browse/ARROW-2254
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.10.0
>
>
> I thought we had fixed this bug, but it's back:
> {code}
> $ ipython
> Python 3.5.2 | packaged by conda-forge | (default, Jul 26 2016, 01:32:08) 
> Type 'copyright', 'credits' or 'license' for more information
> IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.
> In [1]: pa.__version__
> Out[1]: '0.3.1.dev52+g8b1c8118'
> {code}
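
A version string such as the one above can appear when a tag that belongs to another component is treated as the most recent release tag. The following is only a sketch of filtering candidate tags with a glob pattern. The tag names and the pattern are assumptions chosen for illustration, and this is not the fix that was eventually applied.

```python
import fnmatch


def select_release_tags(tags, pattern):
    """Return only the tags that match the given glob pattern."""
    return [tag for tag in tags if fnmatch.fnmatchcase(tag, pattern)]


# Hypothetical tag names, chosen for illustration only.
tags = ["apache-arrow-0.8.0", "apache-arrow-js-0.3.1"]

# Requiring a digit right after the prefix skips the JS-style tag.
print(select_release_tags(tags, "apache-arrow-[0-9]*"))
```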





[jira] [Commented] (ARROW-2252) [Python] Create buffer from address, size and base

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385592#comment-16385592
 ] 

ASF GitHub Bot commented on ARROW-2252:
---

wesm closed pull request #1693: ARROW-2252: [Python] Create buffer from 
address, size and base
URL: https://github.com/apache/arrow/pull/1693
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/python/pyarrow/__init__.py b/python/pyarrow/__init__.py
index 15a37ca10..8cb4b3b9b 100644
--- a/python/pyarrow/__init__.py
+++ b/python/pyarrow/__init__.py
@@ -72,8 +72,8 @@
 from pyarrow.lib import TimestampType
 
 # Buffers, allocation
-from pyarrow.lib import (Buffer, ResizableBuffer, compress, decompress,
-                         allocate_buffer, frombuffer)
+from pyarrow.lib import (Buffer, ForeignBuffer, ResizableBuffer, compress,
+                         decompress, allocate_buffer, frombuffer)
 
 from pyarrow.lib import (MemoryPool, total_allocated_bytes,
  set_memory_pool, default_memory_pool,
diff --git a/python/pyarrow/io.pxi b/python/pyarrow/io.pxi
index 325c5827f..5c8411be4 100644
--- a/python/pyarrow/io.pxi
+++ b/python/pyarrow/io.pxi
@@ -720,6 +720,18 @@ cdef class Buffer:
 return self.size
 
 
+cdef class ForeignBuffer(Buffer):
+
+    def __init__(self, addr, size, base):
+        cdef:
+            intptr_t c_addr = addr
+            int64_t c_size = size
+        self.base = base
+        cdef shared_ptr[CBuffer] buffer = make_shared[CBuffer](
+            <uint8_t*> c_addr, c_size)
+        self.init(<shared_ptr[CBuffer]> buffer)
+
+
 cdef class ResizableBuffer(Buffer):
 
 cdef void init_rz(self, const shared_ptr[CResizableBuffer]& buffer):
diff --git a/python/pyarrow/lib.pxd b/python/pyarrow/lib.pxd
index e4d574f18..c37bc2beb 100644
--- a/python/pyarrow/lib.pxd
+++ b/python/pyarrow/lib.pxd
@@ -324,6 +324,11 @@ cdef class Buffer:
 cdef int _check_nullptr(self) except -1
 
 
+cdef class ForeignBuffer(Buffer):
+    cdef:
+        object base
+
+
 cdef class ResizableBuffer(Buffer):
 
 cdef void init_rz(self, const shared_ptr[CResizableBuffer]& buffer)
diff --git a/python/pyarrow/tests/test_io.py b/python/pyarrow/tests/test_io.py
index d269ad0e7..17aca4333 100644
--- a/python/pyarrow/tests/test_io.py
+++ b/python/pyarrow/tests/test_io.py
@@ -24,6 +24,7 @@
 import weakref
 
 import numpy as np
+import numpy.testing as npt
 
 import pandas as pd
 
@@ -253,6 +254,14 @@ def test_buffer_equals():
 assert buf2.equals(buf5)
 
 
+def test_foreign_buffer():
+    n = np.array([1, 2])
+    addr = n.__array_interface__["data"][0]
+    size = n.nbytes
+    fb = pa.ForeignBuffer(addr, size, n)
+    npt.assert_array_equal(np.asarray(fb), n.view(dtype=np.int8))
+
+
 def test_allocate_buffer():
     buf = pa.allocate_buffer(100)
     assert buf.size == 100


 




> [Python] Create buffer from address, size and base
> --
>
> Key: ARROW-2252
> URL: https://issues.apache.org/jira/browse/ARROW-2252
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Given a memory address and a size, we should be able to construct an Arrow 
> buffer from this. The additional base object will be used to hold a reference 
> to the underlying, original buffer so that it does not go out of scope before 
> the Arrow buffer.
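
The role of the base object can be sketched with the standard library alone. This is an illustration under stated assumptions, not the pyarrow implementation: `Owner` and `AddressView` are hypothetical classes, and `ctypes` is used here only to obtain a raw address and to read from it.

```python
import ctypes
import gc
import weakref


class Owner:
    """Hypothetical object that owns a block of memory."""

    def __init__(self, payload):
        self._storage = ctypes.create_string_buffer(payload)
        self._size = len(payload)

    @property
    def address(self):
        return ctypes.addressof(self._storage)

    @property
    def size(self):
        return self._size


class AddressView:
    """Hypothetical view built from an address, a size and a base object."""

    def __init__(self, address, size, base):
        self.address = address
        self.size = size
        self.base = base  # keeps the owner, and so its memory, alive

    def to_bytes(self):
        return ctypes.string_at(self.address, self.size)


owner = Owner(b"\x01\x02\x03\x04")
owner_ref = weakref.ref(owner)
view = AddressView(owner.address, owner.size, base=owner)

del owner  # the original name goes away, but the view still holds the base
gc.collect()
print(owner_ref() is not None, view.to_bytes())
```

Without the `base` attribute, nothing would tie the lifetime of the memory to the view, and reading from the stored address after the owner is collected would be undefined behavior.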





[jira] [Resolved] (ARROW-2252) [Python] Create buffer from address, size and base

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-2252.
-
Resolution: Fixed

Issue resolved by pull request 1693
[https://github.com/apache/arrow/pull/1693]



[jira] [Created] (ARROW-2258) [C++] Appveyor builds failing on master

2018-03-04 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2258:
---

 Summary: [C++] Appveyor builds failing on master
 Key: ARROW-2258
 URL: https://issues.apache.org/jira/browse/ARROW-2258
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.9.0


See 
https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/build/1.0.5563





[jira] [Created] (ARROW-2257) [C++] Add high-level option to toggle CXX11 ABI

2018-03-04 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2257:
---

 Summary: [C++] Add high-level option to toggle CXX11 ABI
 Key: ARROW-2257
 URL: https://issues.apache.org/jira/browse/ARROW-2257
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.9.0


Using gcc-4.8-based toolchain libraries from conda-forge I ran into the 
following failure when building on Ubuntu 16.04 with clang-5.0

{code}
[48/48] Linking CXX executable debug/python-test
FAILED: debug/python-test 
: && /usr/bin/ccache /usr/bin/clang++-5.0  -ggdb -O0  -Weverything 
-Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-deprecated -Wno-weak-vtables 
-Wno-padded -Wno-comma -Wno-unused-parameter -Wno-unused-template -Wno-undef 
-Wno-shadow -Wno-switch-enum -Wno-exit-time-destructors 
-Wno-global-constructors -Wno-weak-template-vtables 
-Wno-undefined-reinterpret-cast -Wno-implicit-fallthrough 
-Wno-unreachable-code-return -Wno-float-equal -Wno-missing-prototypes 
-Wno-old-style-cast -Wno-covered-switch-default -Wno-cast-align 
-Wno-vla-extension -Wno-shift-sign-overflow -Wno-used-but-marked-unused 
-Wno-missing-variable-declarations -Wno-gnu-zero-variadic-macro-arguments 
-Wconversion -Wno-sign-conversion -Wno-disabled-macro-expansion 
-Wno-gnu-folding-constant -Wno-reserved-id-macro -Wno-range-loop-analysis 
-Wno-double-promotion -Wno-undefined-func-template 
-Wno-zero-as-null-pointer-constant -Wno-unknown-warning-option -Werror 
-std=c++11 -msse3 -maltivec -Werror -D_GLIBCXX_USE_CXX11_ABI=0 
-Qunused-arguments  -fsanitize=address -DADDRESS_SANITIZER 
-fsanitize-coverage=trace-pc-guard -g  -rdynamic 
src/arrow/python/CMakeFiles/python-test.dir/python-test.cc.o  -o 
debug/python-test  
-Wl,-rpath,/home/wesm/code/arrow/cpp/build/debug:/home/wesm/miniconda/envs/arrow-dev/lib:/home/wesm/cpp-toolchain/lib
 debug/libarrow_python_test_main.a debug/libarrow_python.a 
debug/libarrow.so.0.0.0 
/home/wesm/miniconda/envs/arrow-dev/lib/libpython3.6m.so 
/home/wesm/cpp-toolchain/lib/libgtest.a -lpthread -ldl 
orc_ep-install/lib/liborc.a /home/wesm/cpp-toolchain/lib/libprotobuf.a 
/home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libz.a 
/home/wesm/cpp-toolchain/lib/libsnappy.a /home/wesm/cpp-toolchain/lib/liblz4.a 
/home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
/home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
/home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a -lpthread 
-Wl,-rpath-link,/home/wesm/cpp-toolchain/lib && :
debug/libarrow.so.0.0.0: undefined reference to 
`orc::ParseError::ParseError(std::string const&)'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::io::CodedOutputStream::WriteStringWithSizeToArray(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned char*)'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::internal::WireFormatLite::WriteStringMaybeAliased(int, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > 
const&, google::protobuf::io::CodedOutputStream*)'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::internal::fixed_address_empty_string[abi:cxx11]'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::internal::WireFormatLite::ReadBytes(google::protobuf::io::CodedInputStream*,
 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> 
>*)'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::Message::GetTypeName[abi:cxx11]() const'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::Message::InitializationErrorString[abi:cxx11]() const'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::MessageLite::SerializeToString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*) const'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::internal::WireFormatLite::WriteString(int, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > 
const&, google::protobuf::io::CodedOutputStream*)'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::MessageFactory::InternalRegisterGeneratedFile(char const*, 
void (*)(std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const&))'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::internal::WireFormatLite::WriteBytesMaybeAliased(int, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > 
const&, google::protobuf::io::CodedOutputStream*)'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::internal::AssignDescriptors(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, 
google::protobuf::internal::MigrationSchema const*, google::protobuf::Message 
const* const*, unsigned int const*, google::protobuf::MessageFactory*, 
google::protobuf::Metadata*, google::protobuf::EnumDescriptor const**, 
google::protobuf::ServiceDescriptor const**)'
debug/libarrow.so.0.0.0: undefined reference to 
`google::protobuf::MessageLite::ParseFromString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
debug/libar

[jira] [Updated] (ARROW-2256) [C++] Fuzzer builds fail out of the box on Ubuntu 16.04 using LLVM apt repos

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2256:

Description: 
I did a clean upgrade to 16.04 on one of my machines and ran into the problem 
described here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=866087

I think this can be resolved temporarily by symlinking the static library, but 
we should document the problem so other devs know what to do when it happens

  was:
I did a clean upgrade to 16.04 on one of my machine and ran into the problem 
described here:



> [C++] Fuzzer builds fail out of the box on Ubuntu 16.04 using LLVM apt repos
> 
>
> Key: ARROW-2256
> URL: https://issues.apache.org/jira/browse/ARROW-2256
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.9.0
>
>
> I did a clean upgrade to 16.04 on one of my machines and ran into the problem 
> described here:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=866087
> I think this can be resolved temporarily by symlinking the static library, 
> but we should document the problem so other devs know what to do when it 
> happens





[jira] [Created] (ARROW-2256) [C++] Fuzzer builds fail out of the box on Ubuntu 16.04 using LLVM apt repos

2018-03-04 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2256:
---

 Summary: [C++] Fuzzer builds fail out of the box on Ubuntu 16.04 
using LLVM apt repos
 Key: ARROW-2256
 URL: https://issues.apache.org/jira/browse/ARROW-2256
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.9.0


I did a clean upgrade to 16.04 on one of my machines and ran into the problem 
described here:






[jira] [Updated] (ARROW-2167) [C++] Building Orc extensions fails with the default BUILD_WARNING_LEVEL=Production

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2167:

Issue Type: Bug  (was: Improvement)

> [C++] Building Orc extensions fails with the default 
> BUILD_WARNING_LEVEL=Production
> ---
>
> Key: ARROW-2167
> URL: https://issues.apache.org/jira/browse/ARROW-2167
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.8.0
>Reporter: Phillip Cloud
>Priority: Major
> Fix For: 0.9.0
>
>
> Building orc_ep fails because there are a bunch of upstream warnings like not 
> providing {{override}} on virtual destructor subclasses, and using {{0}} as 
> the {{nullptr}} constant and the default {{BUILD_WARNING_LEVEL}} is 
> {{Production}} which includes {{-Wall}} (all warnings as errors).
> I see that there are different possible options for {{BUILD_WARNING_LEVEL}} 
> so it's possible for developers to deal with this issue.
> It seems easier to let EPs build with whatever the default warning level is 
> for the project rather than force our defaults on those projects.
> Generally speaking, are we using our own CXX_FLAGS for EPs other than Orc?





[jira] [Updated] (ARROW-906) [C++] Serialize Field metadata to IPC metadata

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-906:
---
Fix Version/s: (was: 0.9.0)
   0.10.0

> [C++] Serialize Field metadata to IPC metadata
> --
>
> Key: ARROW-906
> URL: https://issues.apache.org/jira/browse/ARROW-906
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.10.0
>
>
> Follow up work to ARROW-898





[jira] [Updated] (ARROW-2150) [Python] array equality defaults to identity

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2150:

Issue Type: Bug  (was: Improvement)

> [Python] array equality defaults to identity
> 
>
> Key: ARROW-2150
> URL: https://issues.apache.org/jira/browse/ARROW-2150
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.8.0
>Reporter: Antoine Pitrou
>Priority: Minor
> Fix For: 0.9.0
>
>
> I'm not sure this is deliberate, but it doesn't look very desirable to me:
> {code}
> >>> pa.array([1,2,3], type=pa.int32()) == pa.array([1,2,3], type=pa.int32())
> False
> {code}
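
The behavior in the title follows from Python's default: a class that does not define `__eq__` compares by identity. The sketch below uses two hypothetical classes, `IdentityArray` and `ValueArray`, invented for illustration. It does not show how pyarrow implements or later changed array comparison.

```python
class IdentityArray:
    """Hypothetical container without __eq__: == falls back to identity."""

    def __init__(self, values):
        self.values = list(values)


class ValueArray(IdentityArray):
    """Hypothetical container that compares element values instead."""

    def __eq__(self, other):
        if not isinstance(other, ValueArray):
            return NotImplemented
        return self.values == other.values

    # Defining __eq__ without __hash__ makes instances unhashable,
    # which is the usual choice for mutable containers.


print(IdentityArray([1, 2, 3]) == IdentityArray([1, 2, 3]))
print(ValueArray([1, 2, 3]) == ValueArray([1, 2, 3]))
```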





[jira] [Updated] (ARROW-1954) [Python] Add metadata accessor to pyarrow.Field

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-1954:

Fix Version/s: (was: 0.9.0)
   0.10.0

> [Python] Add metadata accessor to pyarrow.Field
> ---
>
> Key: ARROW-1954
> URL: https://issues.apache.org/jira/browse/ARROW-1954
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.10.0
>
>
> Depends on ARROW-906 for this data to survive IPC roundtrip





[jira] [Created] (ARROW-2255) Serialize schema- and field-level custom metadata in integration test JSON format

2018-03-04 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2255:
---

 Summary: Serialize schema- and field-level custom metadata in 
integration test JSON format
 Key: ARROW-2255
 URL: https://issues.apache.org/jira/browse/ARROW-2255
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Java - Vectors
Reporter: Wes McKinney


I don't believe we are doing this at present. We should validate that each 
implementation properly handles the incoming metadata from other Arrow emitters





[jira] [Commented] (ARROW-1982) [Python] Return parquet statistics min/max as values instead of strings

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385404#comment-16385404
 ] 

ASF GitHub Bot commented on ARROW-1982:
---

wesm opened a new pull request #1698: ARROW-1982: [Python] Coerce Parquet 
statistics as bytes to more useful Python scalar types
URL: https://github.com/apache/arrow/pull/1698
 
 
   I also changed the BYTE_ARRAY and FIXED_LEN_BYTE_ARRAY statistics to return 
bytes, since decoding from binary to UTF-8 unicode didn't seem correct to me as 
the default behavior
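
A short sketch of why decoding by default is risky: arbitrary binary values are not guaranteed to be valid UTF-8. The helper `try_decode_utf8` and the sample values are hypothetical and only illustrate the point.

```python
def try_decode_utf8(raw):
    """Return the decoded text, or None when the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# Hypothetical statistics values: one happens to be text, one does not.
print(try_decode_utf8(b"abc"))
print(try_decode_utf8(b"\xff\x00\x10"))
```

Returning the raw bytes leaves the decision to decode, and with which encoding, to the caller.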




> [Python] Return parquet statistics min/max as values instead of strings
> ---
>
> Key: ARROW-1982
> URL: https://issues.apache.org/jira/browse/ARROW-1982
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Jim Crist
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Currently `min` and `max` column statistics are returned as formatted strings 
> of the _physical type_. This makes using them in python a bit tricky, as the 
> strings need to be parsed as the proper _logical type_. Observe:
> {code}
> In [20]: import pandas as pd
> In [21]: df = pd.DataFrame({'a': [1, 2, 3],
>     ...:                    'b': ['a', 'b', 'c'],
>     ...:                    'c': [pd.Timestamp('1991-01-01')]*3})
>     ...:
> In [22]: df.to_parquet('temp.parquet', engine='pyarrow')
> In [23]: from pyarrow import parquet as pq
> In [24]: f = pq.ParquetFile('temp.parquet')
> In [25]: rg = f.metadata.row_group(0)
> In [26]: rg.column(0).statistics.min  # string instead of integer
> Out[26]: '1'
> In [27]: rg.column(1).statistics.min  # weird space added after value due to formatter
> Out[27]: 'a '
> In [28]: rg.column(2).statistics.min  # formatted as physical type (int) instead of logical (datetime)
> Out[28]: '66268800'
> {code}
> Since the type information is known, it should be possible to convert these 
> to arrow values instead of strings.
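
The parsing burden described above can be sketched as follows. The function `coerce_statistic`, its type labels and the sample timestamp value are assumptions made for illustration, including the choice of a millisecond unit. They are not part of pyarrow.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def coerce_statistic(raw, logical_type):
    """Parse a formatted statistics string into a Python value.

    The logical_type labels used here are hypothetical.
    """
    if logical_type == "int":
        return int(raw)
    if logical_type == "string":
        # Lossy guess: a genuine trailing space cannot be told apart
        # from one added by the formatter.
        return raw.rstrip(" ")
    if logical_type == "timestamp[ms]":
        return EPOCH + timedelta(milliseconds=int(raw))
    raise ValueError("unsupported logical type: " + logical_type)


print(coerce_statistic("1", "int"))
print(repr(coerce_statistic("a ", "string")))
print(coerce_statistic("662688000000", "timestamp[ms]"))
```

Every caller would need a table like this and would still have to guess in the string case, which is why returning typed values directly is preferable.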





[jira] [Updated] (ARROW-1982) [Python] Return parquet statistics min/max as values instead of strings

2018-03-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-1982:
--
Labels: pull-request-available  (was: )



[jira] [Updated] (ARROW-2195) [Plasma] Segfault when retrieving RecordBatch from plasma store

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2195:

Issue Type: Bug  (was: Improvement)

> [Plasma] Segfault when retrieving RecordBatch from plasma store
> ---
>
> Key: ARROW-2195
> URL: https://issues.apache.org/jira/browse/ARROW-2195
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Philipp Moritz
>Priority: Major
> Fix For: 0.9.0
>
>
> It can be reproduced with the following script:
> {code:python}
> import pyarrow as pa
> import pyarrow.plasma as plasma
> def retrieve1():
>     client = plasma.connect('test', "", 0)
>     key = "keynumber1keynumber1"
>     pid = plasma.ObjectID(bytearray(key,'UTF-8'))
>     [buff] = client.get_buffers([pid])
>     batch = pa.RecordBatchStreamReader(buff).read_next_batch()
>     print(batch)
>     print(batch.schema)
>     print(batch[0])
>     return batch
> client = plasma.connect('test', "", 0)
> test1 = [1, 12, 23, 3, 21, 34]
> test1 = pa.array(test1, pa.int32())
> batch = pa.RecordBatch.from_arrays([test1], ['FIELD1'])
> key = "keynumber1keynumber1"
> pid = plasma.ObjectID(bytearray(key,'UTF-8'))
> sink = pa.MockOutputStream()
> stream_writer = pa.RecordBatchStreamWriter(sink, batch.schema)
> stream_writer.write_batch(batch)
> stream_writer.close()
> bff = client.create(pid, sink.size())
> stream = pa.FixedSizeBufferWriter(bff)
> writer = pa.RecordBatchStreamWriter(stream, batch.schema)
> writer.write_batch(batch)
> client.seal(pid)
> batch = retrieve1()
> print(batch)
> print(batch.schema)
> print(batch[0])
> {code}
>  
> Preliminary backtrace:
>  
> {code}
> CESS (code=1, address=0x38158)
>     frame #0: 0x00010e6457fc 
> lib.so`__pyx_pw_7pyarrow_3lib_10Int32Value_1as_py(_object*, _object*) + 28
> lib.so`__pyx_pw_7pyarrow_3lib_10Int32Value_1as_py:
> ->  0x10e6457fc <+28>: movslq (%rdx,%rcx,4), %rdi
>     0x10e645800 <+32>: callq  0x10e698170               ; symbol stub for: 
> PyInt_FromLong
>     0x10e645805 <+37>: testq  %rax, %rax
>     0x10e645808 <+40>: je     0x10e64580c               ; <+44>
> (lldb) bt
>  * thread #1: tid = 0xf1378e, 0x00010e6457fc 
> lib.so`__pyx_pw_7pyarrow_3lib_10Int32Value_1as_py(_object*, _object*) + 28, 
> queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, 
> address=0x38158)
>   * frame #0: 0x00010e6457fc 
> lib.so`__pyx_pw_7pyarrow_3lib_10Int32Value_1as_py(_object*, _object*) + 28
>     frame #1: 0x00010e5ccd35 lib.so`__Pyx_PyObject_CallNoArg(_object*) + 
> 133
>     frame #2: 0x00010e613b25 
> lib.so`__pyx_pw_7pyarrow_3lib_10ArrayValue_3__repr__(_object*) + 933
>     frame #3: 0x00010c2f83bc libpython2.7.dylib`PyObject_Repr + 60
>     frame #4: 0x00010c35f651 libpython2.7.dylib`PyEval_EvalFrameEx + 22305
> {code}





[jira] [Commented] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Uwe L. Korn (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385383#comment-16385383
 ] 

Uwe L. Korn commented on ARROW-2254:


I think I have found the missing option; I will make a PR later.

> [Python] Local in-place dev versions picking up JS tags
> ---
>
> Key: ARROW-2254
> URL: https://issues.apache.org/jira/browse/ARROW-2254
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.9.0
>
>
> I thought we had fixed this bug, but it's back:
> {code}
> $ ipython
> Python 3.5.2 | packaged by conda-forge | (default, Jul 26 2016, 01:32:08) 
> Type 'copyright', 'credits' or 'license' for more information
> IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.
> In [1]: pa.__version__
> Out[1]: '0.3.1.dev52+g8b1c8118'
> {code}





[jira] [Updated] (ARROW-1929) [C++] Move various Arrow testing utility code from Parquet to Arrow codebase

2018-03-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-1929:
--
Labels: pull-request-available  (was: )

> [C++] Move various Arrow testing utility code from Parquet to Arrow codebase
> 
>
> Key: ARROW-1929
> URL: https://issues.apache.org/jira/browse/ARROW-1929
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> see https://github.com/apache/parquet-cpp/pull/426 and comments within





[jira] [Commented] (ARROW-1929) [C++] Move various Arrow testing utility code from Parquet to Arrow codebase

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385382#comment-16385382
 ] 

ASF GitHub Bot commented on ARROW-1929:
---

wesm opened a new pull request #1697: ARROW-1929: [C++] Copy over testing 
utility code from PARQUET-1092
URL: https://github.com/apache/arrow/pull/1697
 
 
   This code was introduced in parquet-cpp in 
https://github.com/apache/parquet-cpp/pull/426


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-2238) [C++] Detect clcache in cmake configuration

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385378#comment-16385378
 ] 

ASF GitHub Bot commented on ARROW-2238:
---

MaxRis commented on issue #1684: ARROW-2238: [C++] Detect and use clcache in 
cmake configuration
URL: https://github.com/apache/arrow/pull/1684#issuecomment-370263126
 
 
   @pitrou I will check if that resolves Appveyor build failure and let you 
know, thanks




> [C++] Detect clcache in cmake configuration
> ---
>
> Key: ARROW-2238
> URL: https://issues.apache.org/jira/browse/ARROW-2238
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Antoine Pitrou
>Assignee: Antoine Pitrou
>Priority: Minor
>  Labels: pull-request-available
>
> By default Windows builds should use clcache if installed.





[jira] [Commented] (ARROW-2238) [C++] Detect clcache in cmake configuration

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385375#comment-16385375
 ] 

ASF GitHub Bot commented on ARROW-2238:
---

pitrou commented on issue #1684: ARROW-2238: [C++] Detect and use clcache in 
cmake configuration
URL: https://github.com/apache/arrow/pull/1684#issuecomment-370262825
 
 
   @MaxRis that sounds ok to me.




[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385368#comment-16385368
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on issue #1646: ARROW-2199: [JAVA] Control the memory 
allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#issuecomment-370262328
 
 
   Addressed review comments.




> [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is 
> never less than 1 and propagate density throughout the vector tree
> ---
>
> Key: ARROW-2199
> URL: https://issues.apache.org/jira/browse/ARROW-2199
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java - Vectors
>Reporter: Siddharth Teotia
>Assignee: Siddharth Teotia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Resolved] (ARROW-2246) [Python] Use namespaced boost in manylinux1 package

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-2246.
-
Resolution: Fixed

Done in 
https://github.com/apache/arrow/commit/8b1c8118b017a941f0102709d72df7e5a9783aa4

> [Python] Use namespaced boost in manylinux1 package
> ---
>
> Key: ARROW-2246
> URL: https://issues.apache.org/jira/browse/ARROW-2246
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Blocker
> Fix For: 0.9.0
>
>
> Boost provides the functionality to generate a namespaced copy of all its 
> implementations. This means that you can have a private copy of Boost in your 
> library that will not come into conflict with other Boost installations in 
> your setting. While for e.g. conda-forge a good ecosystem exists that 
> provides the unique Boost version, in the setting of the manylinux1 wheels we 
> have no control over which other Boost version exists.





[jira] [Commented] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385361#comment-16385361
 ] 

Wes McKinney commented on ARROW-2254:
-

We could also choose not to support development with {{build_ext --inplace}}, 
but if it is not too difficult, supporting it would be nice.



[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385363#comment-16385363
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172063380
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java
 ##
 @@ -174,14 +174,16 @@ public void setInitialCapacity(int valueCount) {
* @param valueCount desired number of elements in the vector
* @param density average number of bytes per variable width element
*/
+  @Override
   public void setInitialCapacity(int valueCount, double density) {
-final long size = (long) (valueCount * density);
-if (size < 1) {
-  throw new IllegalArgumentException("With the provided density and value 
count, potential capacity of the data buffer is 0");
-}
+long size = (long) (valueCount * density);
 if (size > MAX_ALLOCATION_SIZE) {
   throw new OversizedAllocationException("Requested amount of memory is 
more than max allowed");
 }
+
+if(size == 0) {
+  size = 1;
+}
 
 Review comment:
   Done.




[jira] [Commented] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385359#comment-16385359
 ] 

Wes McKinney commented on ARROW-2254:
-

It's the same for me -- this is for in-place builds rather than installs, so we 
need to put the setuptools_scm version resolution code someplace where it can 
be used in pyarrow/__init__.py



[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385356#comment-16385356
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172062968
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java
 ##
 @@ -174,14 +174,16 @@ public void setInitialCapacity(int valueCount) {
* @param valueCount desired number of elements in the vector
* @param density average number of bytes per variable width element
*/
+  @Override
   public void setInitialCapacity(int valueCount, double density) {
-final long size = (long) (valueCount * density);
-if (size < 1) {
-  throw new IllegalArgumentException("With the provided density and value 
count, potential capacity of the data buffer is 0");
-}
+long size = (long) (valueCount * density);
 if (size > MAX_ALLOCATION_SIZE) {
   throw new OversizedAllocationException("Requested amount of memory is 
more than max allowed");
 }
+
+if(size == 0) {
 
 Review comment:
   Yes, we cannot have an initial capacity of 0, because our safe* functions 
would then run into an infinite loop: they try to realloc with the target 
buffer size set to the next power of 2, but BaseAllocator.nextPowerOfTwo 
returns 0 for 0, so the safe functions keep calling realloc.
   
   This only happens if the initial capacity was 0.
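
The failure mode described above can be modelled in a few lines. This is
only a sketch, written in Python rather than Java so the arithmetic can
be run directly. `next_power_of_two` mimics the behaviour the comment
attributes to BaseAllocator.nextPowerOfTwo (0 maps to 0); the other
names, and the attempt limit used to stop the demonstration, are
hypothetical.

```python
def next_power_of_two(value):
    """Smallest power of two >= value, with 0 mapping to 0 (as described)."""
    if value <= 0:
        return 0
    return 1 << (value - 1).bit_length()


def initial_capacity(value_count, density):
    """Density-driven capacity, clamped so that it is never less than 1."""
    return max(int(value_count * density), 1)


def realloc_size(capacity, guard):
    """Double the capacity and round up; optionally clamp the result to 1."""
    new_size = next_power_of_two(capacity * 2)
    return max(new_size, 1) if guard else new_size


def grow_until(required, capacity, guard, max_attempts=64):
    """Count reallocs needed to reach `required`; None if it never grows.

    The real safe* setters have no attempt limit, which is why a capacity
    stuck at 0 turns into an infinite loop.
    """
    attempts = 0
    while capacity < required:
        if attempts == max_attempts:
            return None
        capacity = realloc_size(capacity, guard)
        attempts += 1
    return attempts


if __name__ == "__main__":
    print(grow_until(4, 0, guard=False))  # capacity stays at 0
    print(grow_until(4, 0, guard=True))   # 0 -> 1 -> 2 -> 4
    print(initial_capacity(5, 0.1))       # int(0.5) == 0, clamped to 1
```

Either guard on its own is enough to break the cycle in this model:
clamping the initial capacity keeps 0 out of the doubling step, and
clamping inside realloc recovers even if a 0 capacity gets through.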




[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385355#comment-16385355
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172063231
 
 

 ##
 File path: java/vector/src/main/codegen/templates/UnionVector.java
 ##
 @@ -282,6 +282,7 @@ private void reallocTypeBuffer() {
 
 long newAllocationSize = baseSize * 2L;
 newAllocationSize = BaseAllocator.nextPowerOfTwo(newAllocationSize);
+newAllocationSize = Math.max(newAllocationSize, 1);
 
 Review comment:
   Now that setInitialCapacity is safeguarded to not allow an initial capacity 
less than 1, we may not hit this case, but I think it is better to do the check 
in realloc as well; otherwise we would run into an infinite loop.




[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385352#comment-16385352
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172063146
 
 

 ##
 File path: 
java/vector/src/test/java/org/apache/arrow/vector/TestListVector.java
 ##
 @@ -810,15 +810,6 @@ public void testSetInitialCapacity() {
   vector.allocateNew();
   assertEquals(512, vector.getValueCapacity());
   assertEquals(8, vector.getDataVector().getValueCapacity());
-
-  boolean error = false;
-  try {
-vector.setInitialCapacity(5, 0.1);
 
 Review comment:
   No, earlier we were throwing IllegalStateException, but that shouldn't be 
done. Now we take a max and set it to 1.




[jira] [Commented] (ARROW-1491) [C++] Add casting implementations from strings to numbers or boolean

2018-03-04 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385353#comment-16385353
 ] 

Wes McKinney commented on ARROW-1491:
-

[~cpcloud] this would be nice to have, but relative to the bug backlog for 
0.9.0 we could also defer this to the next release

> [C++] Add casting implementations from strings to numbers or boolean
> 
>
> Key: ARROW-1491
> URL: https://issues.apache.org/jira/browse/ARROW-1491
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Licht Takeuchi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385354#comment-16385354
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172063162
 
 

 ##
 File path: 
java/vector/src/test/java/org/apache/arrow/vector/TestValueVector.java
 ##
 @@ -1933,15 +1933,6 @@ public void testSetInitialCapacity() {
   vector.allocateNew();
   assertEquals(4096, vector.getValueCapacity());
   assertEquals(64, vector.getDataBuffer().capacity());
-
-  boolean error = false;
-  try {
-vector.setInitialCapacity(5, 0.1);
 
 Review comment:
   We take a max and set it to 1 if needed. Exception handling is not needed.




[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385349#comment-16385349
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172063122
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/complex/BaseRepeatedValueVector.java
 ##
 @@ -166,13 +168,23 @@ public void setInitialCapacity(int numRecords) {
*This helps in tightly controlling the memory we provision
*for inner data vector.
*/
+  @Override
   public void setInitialCapacity(int numRecords, double density) {
+if ((numRecords * density) >= Integer.MAX_VALUE) {
+  throw new OversizedAllocationException("Requested amount of memory is 
more than max allowed");
+}
 offsetAllocationSizeInBytes = (numRecords + 1) * OFFSET_WIDTH;
-final int innerValueCapacity = (int)(numRecords * density);
-if (innerValueCapacity < 1) {
-  throw new IllegalArgumentException("With the provided density and value 
count, potential value capacity for the data vector is 0");
+int innerValueCapacity = (int)(numRecords * density);
+
+if(innerValueCapacity == 0) {
+  innerValueCapacity = 1;
+}
 
 Review comment:
   Done.




[jira] [Updated] (ARROW-2244) [C++] Slicing NullArray should not cause the null count on the internal data to be unknown

2018-03-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-2244:
--
Labels: pull-request-available  (was: )

> [C++] Slicing NullArray should not cause the null count on the internal data 
> to be unknown
> --
>
> Key: ARROW-2244
> URL: https://issues.apache.org/jira/browse/ARROW-2244
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> see https://github.com/apache/arrow/blob/master/cpp/src/arrow/array.cc#L101





[jira] [Commented] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Uwe L. Korn (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385350#comment-16385350
 ] 

Uwe L. Korn commented on ARROW-2254:


For reference, I get the following:
{code:python}
In [3]: import setuptools_scm.git
   ...: describe = setuptools_scm.git.DEFAULT_DESCRIBE + " --match 'apache-arrow-[0-9]*'"
   ...: command = describe.replace("--match *.*", "")
   ...:

In [4]: command
Out[4]: "git describe --dirty --tags --long --match 'apache-arrow-[0-9]*'"

In [5]: !git describe --dirty --tags --long --match 'apache-arrow-[0-9]*'
apache-arrow-0.8.0-214-g4ff04cf-dirty{code}
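
To illustrate why the `--match 'apache-arrow-[0-9]*'` glob matters, here
is a small sketch that parses the describe output shown above. The helper
`parse_describe` and the `Described` tuple are hypothetical and are not
part of setuptools_scm. The JS tag name used in the usage note is an
assumption about how those tags are spelled.

```python
import re
from typing import NamedTuple


class Described(NamedTuple):
    """Pieces of `git describe --dirty --tags --long` output."""
    version: str   # version taken from the matched tag
    distance: int  # commits since that tag
    node: str      # abbreviated commit hash (without the leading "g")
    dirty: bool    # True when the working tree has local changes


# Mirror the glob: the tag must be "apache-arrow-" followed by a digit.
_DESCRIBE_RE = re.compile(
    r"^apache-arrow-(?P<version>\d+(?:\.\d+)*)"
    r"-(?P<distance>\d+)-g(?P<node>[0-9a-f]+)(?P<dirty>-dirty)?$"
)


def parse_describe(output):
    """Parse one line of describe output; reject tags outside the glob."""
    match = _DESCRIBE_RE.match(output.strip())
    if match is None:
        raise ValueError("unexpected describe output: %r" % (output,))
    return Described(
        version=match.group("version"),
        distance=int(match.group("distance")),
        node=match.group("node"),
        dirty=match.group("dirty") is not None,
    )


if __name__ == "__main__":
    print(parse_describe("apache-arrow-0.8.0-214-g4ff04cf-dirty"))
```

A tag such as `apache-arrow-js-0.3.1` (assumed spelling) has a letter
right after the prefix, so it falls outside the glob and the sketch
rejects it with ValueError instead of reporting 0.3.1 as the version.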



[jira] [Commented] (ARROW-2244) [C++] Slicing NullArray should not cause the null count on the internal data to be unknown

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385351#comment-16385351
 ] 

ASF GitHub Bot commented on ARROW-2244:
---

wesm opened a new pull request #1696: ARROW-2244: [C++] Add unit test to 
explicitly check that NullArray internal data set correctly in Slice operations
URL: https://github.com/apache/arrow/pull/1696
 
 
   




[jira] [Commented] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Uwe L. Korn (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385348#comment-16385348
 ] 

Uwe L. Korn commented on ARROW-2254:


Please post the output from executing the value of {{command}}:
{code:python}
import setuptools_scm.git
describe = setuptools_scm.git.DEFAULT_DESCRIBE + " --match 'apache-arrow-[0-9]*'"
command = describe.replace("--match *.*", ""){code}



[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385347#comment-16385347
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172062999
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/DensityAwareVector.java
 ##
 @@ -0,0 +1,32 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+/**
+ * Vector that support density aware initial capacity settings.
+ */
+public interface DensityAwareVector {
+  /**
+   * Set value with density
+   * @param valueCount
+   * @param density
+   */
+  void setInitialCapacity(int valueCount, double density);
+
 
 Review comment:
   Done.




> [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is 
> never less than 1 and propagate density throughout the vector tree
> ---
>
> Key: ARROW-2199
> URL: https://issues.apache.org/jira/browse/ARROW-2199
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java - Vectors
>Reporter: Siddharth Teotia
>Assignee: Siddharth Teotia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385345#comment-16385345
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172062983
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/DensityAwareVector.java
 ##
 @@ -0,0 +1,32 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+/**
+ * Vector that support density aware initial capacity settings.
+ */
+public interface DensityAwareVector {
 
 Review comment:
   Done.




> [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is 
> never less than 1 and propagate density throughout the vector tree
> ---
>
> Key: ARROW-2199
> URL: https://issues.apache.org/jira/browse/ARROW-2199
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java - Vectors
>Reporter: Siddharth Teotia
>Assignee: Siddharth Teotia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385344#comment-16385344
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172062968
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java
 ##
 @@ -174,14 +174,16 @@ public void setInitialCapacity(int valueCount) {
* @param valueCount desired number of elements in the vector
* @param density average number of bytes per variable width element
*/
+  @Override
   public void setInitialCapacity(int valueCount, double density) {
-final long size = (long) (valueCount * density);
-if (size < 1) {
-  throw new IllegalArgumentException("With the provided density and value 
count, potential capacity of the data buffer is 0");
-}
+long size = (long) (valueCount * density);
 if (size > MAX_ALLOCATION_SIZE) {
   throw new OversizedAllocationException("Requested amount of memory is 
more than max allowed");
 }
+
+if(size == 0) {
 
 Review comment:
   Yes, we cannot have an initial capacity of 0, because then our safe* functions 
run into an infinite loop: they try to realloc with the target buffer size set 
to the next power of 2, but BaseAllocator.nextPowerOfTwo returns 0 for 0, so 
the safe functions keep calling realloc.
   
   This happens if the initial capacity was 0.
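
The loop described in this comment can be modeled in a few lines. This is only a Python sketch of the behaviour, not the Java implementation: next_power_of_two mirrors what the comment says about BaseAllocator.nextPowerOfTwo (0 maps to 0), while reallocs_needed, the doubling step, the max_attempts guard and density_capacity are names and assumptions introduced here for illustration.

```python
def next_power_of_two(n):
    """Smallest power of two >= n; as described in the comment, 0 maps to 0."""
    if n == 0:
        return 0
    return 1 << (n - 1).bit_length()


def reallocs_needed(initial_capacity, required, max_attempts=64):
    """Count the reallocations needed to reach `required` capacity.

    Returns None when the capacity never grows, which is the case that
    would spin forever without the (assumed) max_attempts guard.
    """
    capacity = initial_capacity
    for attempt in range(max_attempts + 1):
        if capacity >= required:
            return attempt
        # Assumed growth step: double, then round up to a power of two.
        capacity = next_power_of_two(capacity * 2)
    return None


def density_capacity(value_count, density):
    """Density-driven capacity, clamped so that it is never less than 1."""
    return max(1, int(value_count * density))


print(reallocs_needed(0, 10))      # None: 0 -> 0 -> 0 ...
print(reallocs_needed(1, 10))      # 4: 1 -> 2 -> 4 -> 8 -> 16
print(density_capacity(10, 0.01))  # 1
```

Clamping the computed size to at least 1, as the issue title asks, is enough to let the doubling step make progress.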




> [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is 
> never less than 1 and propagate density throughout the vector tree
> ---
>
> Key: ARROW-2199
> URL: https://issues.apache.org/jira/browse/ARROW-2199
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java - Vectors
>Reporter: Siddharth Teotia
>Assignee: Siddharth Teotia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Commented] (ARROW-2199) [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is never less than 1 and propagate density throughout the vector tree

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385343#comment-16385343
 ] 

ASF GitHub Bot commented on ARROW-2199:
---

siddharthteotia commented on a change in pull request #1646: ARROW-2199: [JAVA] 
Control the memory allocated for inner vectors in containers.
URL: https://github.com/apache/arrow/pull/1646#discussion_r172062884
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/DensityAwareVector.java
 ##
 @@ -0,0 +1,32 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+/**
+ * Vector that support density aware initial capacity settings.
 
 Review comment:
   Done.




> [JAVA] Follow up fixes for ARROW-2019. Ensure density driven capacity is 
> never less than 1 and propagate density throughout the vector tree
> ---
>
> Key: ARROW-2199
> URL: https://issues.apache.org/jira/browse/ARROW-2199
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Java - Vectors
>Reporter: Siddharth Teotia
>Assignee: Siddharth Teotia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Commented] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385337#comment-16385337
 ] 

Wes McKinney commented on ARROW-2254:
-

Upgraded git to 2.14.2 and the issue still seems to be present

> [Python] Local in-place dev versions picking up JS tags
> ---
>
> Key: ARROW-2254
> URL: https://issues.apache.org/jira/browse/ARROW-2254
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.9.0
>
>
> I thought we had fixed this bug, but it's back:
> {code}
> $ ipython
> Python 3.5.2 | packaged by conda-forge | (default, Jul 26 2016, 01:32:08) 
> Type 'copyright', 'credits' or 'license' for more information
> IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.
> In [1]: pa.__version__
> Out[1]: '0.3.1.dev52+g8b1c8118'
> {code}





[jira] [Commented] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385335#comment-16385335
 ] 

Wes McKinney commented on ARROW-2254:
-

Ubuntu 14.04, git 2.12.2. If upgrading git solves the issue, we can close this

> [Python] Local in-place dev versions picking up JS tags
> ---
>
> Key: ARROW-2254
> URL: https://issues.apache.org/jira/browse/ARROW-2254
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.9.0
>
>
> I thought we had fixed this bug, but it's back:
> {code}
> $ ipython
> Python 3.5.2 | packaged by conda-forge | (default, Jul 26 2016, 01:32:08) 
> Type 'copyright', 'credits' or 'license' for more information
> IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.
> In [1]: pa.__version__
> Out[1]: '0.3.1.dev52+g8b1c8118'
> {code}





[jira] [Commented] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Uwe L. Korn (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385334#comment-16385334
 ] 

Uwe L. Korn commented on ARROW-2254:


On which OS is this and which version of git are you using?

> [Python] Local in-place dev versions picking up JS tags
> ---
>
> Key: ARROW-2254
> URL: https://issues.apache.org/jira/browse/ARROW-2254
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.9.0
>
>
> I thought we had fixed this bug, but it's back:
> {code}
> $ ipython
> Python 3.5.2 | packaged by conda-forge | (default, Jul 26 2016, 01:32:08) 
> Type 'copyright', 'credits' or 'license' for more information
> IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.
> In [1]: pa.__version__
> Out[1]: '0.3.1.dev52+g8b1c8118'
> {code}





[jira] [Resolved] (ARROW-2228) [Python] Unsigned int type for arrow Table not supported

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-2228.
-
   Resolution: Duplicate
Fix Version/s: 0.9.0

Can confirm this is fixed in master; it will be part of the 0.9.0 release

> [Python] Unsigned int type for arrow Table not supported
> 
>
> Key: ARROW-2228
> URL: https://issues.apache.org/jira/browse/ARROW-2228
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
> Environment: Ubuntu 16.04
> python3.6.3
>Reporter: Marcello
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.9.0
>
>
> Running this python one-liner
>  
> {code:java}
> pa.Table.from_pandas(pd.DataFrame({'foo': [np.array([1000], dtype=np.uint64)]}))
> {code}
> I get
> {code:java}
> ---
> ArrowInvalid  Traceback (most recent call last)
>  in ()
> ----> 1 pa.Table.from_pandas(pd.DataFrame({'foo': [np.array([1000], dtype=np.uint64)]}))
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/table.pxi in 
> pyarrow.lib.Table.from_pandas 
> (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:44927)()
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/pandas_compat.py
>  in dataframe_to_arrays(df, schema, preserve_index, nthreads)
>     348 arrays = [convert_column(c, t)
>     349   for c, t in zip(columns_to_convert,
> --> 350   convert_types)]
>     351 else:
>     352 from concurrent import futures
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/pandas_compat.py
>  in (.0)
>     347 if nthreads == 1:
>     348 arrays = [convert_column(c, t)
> --> 349   for c, t in zip(columns_to_convert,
>     350   convert_types)]
>     351 else:
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/pandas_compat.py
>  in convert_column(col, ty)
>     343
>     344 def convert_column(col, ty):
> --> 345 return pa.array(col, from_pandas=True, type=ty)
>     346
>     347 if nthreads == 1:
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/array.pxi in 
> pyarrow.lib.array (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:29224)()
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/array.pxi in 
> pyarrow.lib._ndarray_to_array 
> (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:28465)()
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/error.pxi in 
> pyarrow.lib.check_status 
> (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:8270)()
> ArrowInvalid: trying to convert NumPy type int64 but got uint64
> {code}
>  
> The problem possibly lies in the fact that from_pandas doesn't handle the 
> conversion from Python objects to unsigned integers.
>  





[jira] [Created] (ARROW-2254) [Python] Local in-place dev versions picking up JS tags

2018-03-04 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2254:
---

 Summary: [Python] Local in-place dev versions picking up JS tags
 Key: ARROW-2254
 URL: https://issues.apache.org/jira/browse/ARROW-2254
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Reporter: Wes McKinney
 Fix For: 0.9.0


I thought we had fixed this bug, but it's back:

{code}
$ ipython
Python 3.5.2 | packaged by conda-forge | (default, Jul 26 2016, 01:32:08) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: pa.__version__
Out[1]: '0.3.1.dev52+g8b1c8118'
{code}
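
One way to picture this failure: if the development version is derived from the nearest git tag, a JS release tag can win over the main release tags. The following is a minimal sketch under stated assumptions, not the project's actual versioning code: the tag names (apache-arrow-X.Y.Z and apache-arrow-js-X.Y.Z) and the helper latest_release_tag are hypothetical and do not come from this thread.

```python
import re

# Assumed naming scheme: main releases match this pattern, JS releases do not.
RELEASE_TAG = re.compile(r"^apache-arrow-(\d+)\.(\d+)\.(\d+)$")


def latest_release_tag(tags):
    """Return the highest main-release tag, ignoring every other tag."""
    candidates = []
    for tag in tags:
        match = RELEASE_TAG.match(tag)
        if match:
            # Compare versions numerically, not as strings.
            version = tuple(int(part) for part in match.groups())
            candidates.append((version, tag))
    if not candidates:
        return None
    return max(candidates)[1]


tags = ["apache-arrow-0.8.0", "apache-arrow-js-0.3.1", "apache-arrow-0.7.1"]
print(latest_release_tag(tags))  # apache-arrow-0.8.0
```

Anchoring the pattern at both ends is what keeps the hypothetical JS tag from matching.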





[jira] [Assigned] (ARROW-2228) [Python] Unsigned int type for arrow Table not supported

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-2228:
---

Assignee: Wes McKinney

> [Python] Unsigned int type for arrow Table not supported
> 
>
> Key: ARROW-2228
> URL: https://issues.apache.org/jira/browse/ARROW-2228
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
> Environment: Ubuntu 16.04
> python3.6.3
>Reporter: Marcello
>Assignee: Wes McKinney
>Priority: Major
>
> Running this python one-liner
>  
> {code:java}
> pa.Table.from_pandas(pd.DataFrame({'foo': [np.array([1000], dtype=np.uint64)]}))
> {code}
> I get
> {code:java}
> ---
> ArrowInvalid  Traceback (most recent call last)
>  in ()
> ----> 1 pa.Table.from_pandas(pd.DataFrame({'foo': [np.array([1000], dtype=np.uint64)]}))
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/table.pxi in 
> pyarrow.lib.Table.from_pandas 
> (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:44927)()
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/pandas_compat.py
>  in dataframe_to_arrays(df, schema, preserve_index, nthreads)
>     348 arrays = [convert_column(c, t)
>     349   for c, t in zip(columns_to_convert,
> --> 350   convert_types)]
>     351 else:
>     352 from concurrent import futures
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/pandas_compat.py
>  in (.0)
>     347 if nthreads == 1:
>     348 arrays = [convert_column(c, t)
> --> 349   for c, t in zip(columns_to_convert,
>     350   convert_types)]
>     351 else:
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/pandas_compat.py
>  in convert_column(col, ty)
>     343
>     344 def convert_column(col, ty):
> --> 345 return pa.array(col, from_pandas=True, type=ty)
>     346
>     347 if nthreads == 1:
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/array.pxi in 
> pyarrow.lib.array (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:29224)()
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/array.pxi in 
> pyarrow.lib._ndarray_to_array 
> (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:28465)()
> ~/.virtualenvs/log-archive/lib/python3.6/site-packages/pyarrow/error.pxi in 
> pyarrow.lib.check_status 
> (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:8270)()
> ArrowInvalid: trying to convert NumPy type int64 but got uint64
> {code}
>  
> The problem possibly lies in the fact that from_pandas doesn't handle the 
> conversion from Python objects to unsigned integers.
>  





[jira] [Resolved] (ARROW-2245) [Python] Revert static linkage of parquet-cpp in manylinux1 wheel

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-2245.
-
Resolution: Fixed

Issue resolved by pull request 1692
[https://github.com/apache/arrow/pull/1692]

> [Python] Revert static linkage of parquet-cpp in manylinux1 wheel
> -
>
> Key: ARROW-2245
> URL: https://issues.apache.org/jira/browse/ARROW-2245
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Although the pyarrow manylinux1 wheel is not, in theory, the authoritative 
> source of parquet-cpp, in practice it is. Statically linking parquet-cpp can 
> introduce problems that dynamically linking it does not (e.g. duplicate 
> unloading of the library if you include it in a Python wheel and in the 
> process that creates the Python interpreter). 





[jira] [Commented] (ARROW-2245) [Python] Revert static linkage of parquet-cpp in manylinux1 wheel

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385315#comment-16385315
 ] 

ASF GitHub Bot commented on ARROW-2245:
---

wesm closed pull request #1692: ARROW-2245: ARROW-2246: [Python] Revert static 
linkage of parquet-cpp in manylinux1 wheel
URL: https://github.com/apache/arrow/pull/1692
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/cpp/cmake_modules/ThirdpartyToolchain.cmake 
b/cpp/cmake_modules/ThirdpartyToolchain.cmake
index 4103af41b..c330e2ae3 100644
--- a/cpp/cmake_modules/ThirdpartyToolchain.cmake
+++ b/cpp/cmake_modules/ThirdpartyToolchain.cmake
@@ -501,10 +501,11 @@ if (ARROW_JEMALLOC)
   set(JEMALLOC_SHARED_LIB 
"${JEMALLOC_PREFIX}/lib/libjemalloc${CMAKE_SHARED_LIBRARY_SUFFIX}")
   set(JEMALLOC_STATIC_LIB 
"${JEMALLOC_PREFIX}/lib/libjemalloc_pic${CMAKE_STATIC_LIBRARY_SUFFIX}")
   set(JEMALLOC_VENDORED 1)
+  # We need to disable TLS or otherwise C++ exceptions won't work anymore.
   ExternalProject_Add(jemalloc_ep
 URL 
${CMAKE_CURRENT_SOURCE_DIR}/thirdparty/jemalloc/${JEMALLOC_VERSION}.tar.gz
 PATCH_COMMAND touch doc/jemalloc.3 doc/jemalloc.html
-CONFIGURE_COMMAND ./autogen.sh "--prefix=${JEMALLOC_PREFIX}" 
"--with-jemalloc-prefix=je_arrow_" "--with-private-namespace=je_arrow_private_"
+CONFIGURE_COMMAND ./autogen.sh "--prefix=${JEMALLOC_PREFIX}" 
"--with-jemalloc-prefix=je_arrow_" "--with-private-namespace=je_arrow_private_" 
"--disable-tls"
 ${EP_LOG_OPTIONS}
 BUILD_IN_SOURCE 1
 BUILD_COMMAND ${MAKE}
diff --git a/cpp/src/plasma/CMakeLists.txt b/cpp/src/plasma/CMakeLists.txt
index 3448d009c..bc00f9806 100644
--- a/cpp/src/plasma/CMakeLists.txt
+++ b/cpp/src/plasma/CMakeLists.txt
@@ -124,6 +124,16 @@ endif()
 add_executable(plasma_store store.cc)
 target_link_libraries(plasma_store plasma_static ${PLASMA_LINK_LIBS})
 
+if (ARROW_RPATH_ORIGIN)
+  if (APPLE)
+set(_lib_install_rpath "@loader_path")
+  else()
+set(_lib_install_rpath "\$ORIGIN")
+  endif()
+  set_target_properties(plasma_store PROPERTIES
+  INSTALL_RPATH ${_lib_install_rpath})
+endif()
+
 # Headers: top level
 install(FILES
   common.h
diff --git a/python/CMakeLists.txt b/python/CMakeLists.txt
index e9de08ba1..72294d494 100644
--- a/python/CMakeLists.txt
+++ b/python/CMakeLists.txt
@@ -66,7 +66,7 @@ if("${CMAKE_SOURCE_DIR}" STREQUAL 
"${CMAKE_CURRENT_SOURCE_DIR}")
 ON)
   option(PYARROW_BOOST_USE_SHARED
 "Rely on boost shared libraries on linking static parquet"
-OFF)
+ON)
   option(PYARROW_BUILD_PLASMA
 "Build the PyArrow Plasma integration"
 OFF)
@@ -235,6 +235,24 @@ function(bundle_arrow_implib library_path)
   COPYONLY)
 endfunction(bundle_arrow_implib)
 
+function(bundle_boost_lib library_path)
+  get_filename_component(LIBRARY_NAME ${${library_path}} NAME)
+  get_filename_component(LIBRARY_NAME_WE ${${library_path}} NAME_WE)
+  configure_file(${${library_path}}
+  ${BUILD_OUTPUT_ROOT_DIRECTORY}/${LIBRARY_NAME}
+  COPYONLY)
+  set(Boost_SO_VERSION 
"${Boost_MAJOR_VERSION}.${Boost_MINOR_VERSION}.${Boost_SUBMINOR_VERSION}")
+  if (APPLE)
+configure_file(${${library_path}}
+
${BUILD_OUTPUT_ROOT_DIRECTORY}/${LIBRARY_NAME_WE}.${Boost_SO_VERSION}${CMAKE_SHARED_LIBRARY_SUFFIX}
+COPYONLY)
+  else()
+configure_file(${${library_path}}
+
${BUILD_OUTPUT_ROOT_DIRECTORY}/${LIBRARY_NAME_WE}${CMAKE_SHARED_LIBRARY_SUFFIX}.${Boost_SO_VERSION}
+COPYONLY)
+  endif()
+endfunction()
+
 # Always bundle includes
 file(COPY ${ARROW_INCLUDE_DIR}/arrow DESTINATION 
${BUILD_OUTPUT_ROOT_DIRECTORY}/include)
 
@@ -247,6 +265,15 @@ if (PYARROW_BUNDLE_ARROW_CPP)
 ABI_VERSION ${ARROW_ABI_VERSION}
 SO_VERSION ${ARROW_SO_VERSION})
 
+  # boost
+  if (PYARROW_BOOST_USE_SHARED)
+set(Boost_USE_STATIC_LIBS OFF)
+find_package(Boost COMPONENTS system filesystem regex REQUIRED)
+bundle_boost_lib(Boost_REGEX_LIBRARY)
+bundle_boost_lib(Boost_FILESYSTEM_LIBRARY)
+bundle_boost_lib(Boost_SYSTEM_LIBRARY)
+  endif()
+
   if (MSVC)
 bundle_arrow_implib(ARROW_SHARED_IMP_LIB)
 bundle_arrow_implib(ARROW_PYTHON_SHARED_IMP_LIB)
diff --git a/python/manylinux1/Dockerfile-x86_64 
b/python/manylinux1/Dockerfile-x86_64
index 62a089329..d48bd0d2c 100644
--- a/python/manylinux1/Dockerfile-x86_64
+++ b/python/manylinux1/Dockerfile-x86_64
@@ -14,13 +14,13 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-FROM quay.io/xhochy/arrow_manylinux1_x86_64_base:ARROW-2212
+FROM quay.io/xhochy/arrow_manylinux1_x86_64_base:ARROW-2245
 
 ADD arrow /arrow
 WORKDIR /arrow/cpp
 RUN mkdir bui

[jira] [Commented] (ARROW-2253) [Python] Support __eq__ on scalar values

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385294#comment-16385294
 ] 

ASF GitHub Bot commented on ARROW-2253:
---

xhochy opened a new pull request #1695: ARROW-2253: [Python] Support __eq__ on 
scalar values
URL: https://github.com/apache/arrow/pull/1695
 
 
   




> [Python] Support __eq__ on scalar values
> 
>
> Key: ARROW-2253
> URL: https://issues.apache.org/jira/browse/ARROW-2253
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Support a generic {{__eq__}} method on the {{ArrayValue}} class. We might want 
> to specialise it in the future in C++ to avoid some copies but as a first 
> attempt delegate the comparison to the Python types.





[jira] [Updated] (ARROW-2253) [Python] Support __eq__ on scalar values

2018-03-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-2253:
--
Labels: pull-request-available  (was: )

> [Python] Support __eq__ on scalar values
> 
>
> Key: ARROW-2253
> URL: https://issues.apache.org/jira/browse/ARROW-2253
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Support a generic {{__eq__}} method on the {{ArrayValue}} class. We might want 
> to specialise it in the future in C++ to avoid some copies but as a first 
> attempt delegate the comparison to the Python types.





[jira] [Created] (ARROW-2253) [Python] Support __eq__ on scalar values

2018-03-04 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2253:
--

 Summary: [Python] Support __eq__ on scalar values
 Key: ARROW-2253
 URL: https://issues.apache.org/jira/browse/ARROW-2253
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Python
Reporter: Uwe L. Korn
Assignee: Uwe L. Korn
 Fix For: 0.9.0


Support a generic {{__eq__}} method on the {{ArrayValue}} class. We might want to 
specialise it in the future in C++ to avoid some copies but as a first attempt 
delegate the comparison to the Python types.
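
The "delegate the comparison to the Python types" approach could look roughly like the sketch below. ArrayValueSketch is a hypothetical stand-in written for illustration, not the pyarrow class, and its as_py accessor is likewise an assumption about how the wrapped value would be exposed.

```python
class ArrayValueSketch:
    """Hypothetical scalar wrapper that compares via its Python value."""

    def __init__(self, value):
        self._value = value

    def as_py(self):
        # Convert the wrapped scalar to a plain Python object.
        return self._value

    def __eq__(self, other):
        # Unwrap another scalar first, then let the Python types compare.
        if isinstance(other, ArrayValueSketch):
            other = other.as_py()
        return self.as_py() == other

    def __hash__(self):
        # Keep hashing consistent with the delegated equality.
        return hash(self.as_py())


print(ArrayValueSketch(3) == 3)                    # True
print(ArrayValueSketch(3) == ArrayValueSketch(3))  # True
print(ArrayValueSketch("a") == "b")                # False
```

Each comparison materialises a Python object, which is the copy the description suggests a later C++ specialisation could avoid.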





[jira] [Updated] (ARROW-2241) [Python] Simple script for running all current ASV benchmarks at a commit or tag

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2241:

Fix Version/s: (was: 0.9.0)
   0.10.0

> [Python] Simple script for running all current ASV benchmarks at a commit or 
> tag
> 
>
> Key: ARROW-2241
> URL: https://issues.apache.org/jira/browse/ARROW-2241
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.10.0
>
>
> The objective of this is to be able to get a graph for performance at each 
> release tag for the currently-defined benchmarks (including benchmarks that 
> did not exist in older tags)





[jira] [Updated] (ARROW-2248) [Python] Nightly or on-demand HDFS test builds

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2248:

Fix Version/s: (was: 0.9.0)
   0.10.0

> [Python] Nightly or on-demand HDFS test builds
> --
>
> Key: ARROW-2248
> URL: https://issues.apache.org/jira/browse/ARROW-2248
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.10.0
>
>
> We continue to acquire more functionality related to HDFS and Parquet. 
> Testing this, including tests that involve interoperability with other 
> systems, like Spark, will require some work outside of our normal CI 
> infrastructure.
> I suggest we start with testing the C++/Python HDFS integration, which will 
> help with validating patches like ARROW-1643 
> https://github.com/apache/arrow/pull/1668





[jira] [Assigned] (ARROW-2227) [Python] Table.from_pandas does not create chunked_arrays.

2018-03-04 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-2227:
---

Assignee: Wes McKinney

> [Python] Table.from_pandas does not create chunked_arrays.
> --
>
> Key: ARROW-2227
> URL: https://issues.apache.org/jira/browse/ARROW-2227
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.8.0
>Reporter: Chris Ellison
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.9.0
>
>
> When creating a large enough array, pyarrow raises an exception:
> {code:java}
> import numpy as np
> import pandas as pd
> import pyarrow as pa
> x = list('1' * 2**31)
> y = pd.DataFrame({'x': x})
> t = pa.Table.from_pandas(y)
> # ArrowInvalid: BinaryArrow cannot contain more than 2147483646 bytes, have 2147483647
> {code}
> The array should be chunked for the user. As is, data frames with >2 GiB in 
> binary data will struggle to get into arrow.
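
The chunking asked for here can be sketched in plain Python, independent of pyarrow. The byte limit is the one quoted in the error message above; chunk_by_bytes and its max_bytes parameter are hypothetical names, and this only shows the splitting logic, not how Table.from_pandas would build a chunked array.

```python
MAX_BINARY_BYTES = 2147483646  # limit quoted in the error message above


def chunk_by_bytes(values, max_bytes=MAX_BINARY_BYTES):
    """Split strings into consecutive chunks whose UTF-8 size fits max_bytes."""
    chunks, current, current_bytes = [], [], 0
    for value in values:
        size = len(value.encode("utf-8"))
        if size > max_bytes:
            raise ValueError("a single value exceeds the per-chunk limit")
        # Start a new chunk when adding this value would overflow the limit.
        if current and current_bytes + size > max_bytes:
            chunks.append(current)
            current, current_bytes = [], 0
        current.append(value)
        current_bytes += size
    if current:
        chunks.append(current)
    return chunks


print(chunk_by_bytes(["aa", "bb", "c"], max_bytes=4))  # [['aa', 'bb'], ['c']]
```

Each resulting chunk could then be converted separately, so no single binary array crosses the limit.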





[jira] [Commented] (ARROW-1919) Plasma hanging if object id is not 20 bytes

2018-03-04 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385279#comment-16385279
 ] 

Wes McKinney commented on ARROW-1919:
-

Arrow 0.9.0 should be released sometime this month

> Plasma hanging if object id is not 20 bytes
> ---
>
> Key: ARROW-1919
> URL: https://issues.apache.org/jira/browse/ARROW-1919
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Philipp Moritz
>Assignee: Philipp Moritz
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> This happens when plasma's capability to put an object with a user-defined 
> object id is used and the object id is not 20 bytes long. Plasma will hang 
> upon get in that case; we should raise an error instead.
> See https://github.com/ray-project/ray/issues/1315
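
The "give an error instead" behaviour amounts to checking the id length before it reaches the store. A minimal sketch, assuming a hypothetical helper validate_object_id; only the 20-byte length comes from the issue, and this is not the Plasma client code.

```python
OBJECT_ID_SIZE = 20  # length stated in the issue


def validate_object_id(object_id):
    """Return the id as bytes, or raise if it is not exactly 20 bytes long."""
    if not isinstance(object_id, (bytes, bytearray)):
        raise TypeError("object id must be bytes")
    if len(object_id) != OBJECT_ID_SIZE:
        raise ValueError(
            "object id must be %d bytes, got %d" % (OBJECT_ID_SIZE, len(object_id))
        )
    return bytes(object_id)


print(len(validate_object_id(b"a" * 20)))  # 20
```

Failing fast on put means a malformed id can no longer lead to a get that never returns.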





[jira] [Assigned] (ARROW-2252) [Python] Create buffer from address, size and base

2018-03-04 Thread Uwe L. Korn (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-2252:
--

Assignee: Uwe L. Korn

> [Python] Create buffer from address, size and base
> --
>
> Key: ARROW-2252
> URL: https://issues.apache.org/jira/browse/ARROW-2252
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Given a memory address and a size, we should be able to construct an Arrow 
> buffer from this. The additional base object will be used to hold a reference 
> to the underlying, original buffer so that it does not go out of scope before 
> the Arrow buffer.
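
To illustrate why the base object matters, here is a minimal sketch using only `ctypes` from the standard library. `ForeignView` is a hypothetical stand-in, not the pyarrow class; it only shows the pattern of wrapping an address and size while holding a reference to the owner of the memory.

```python
import ctypes


class ForeignView:
    """Hypothetical view over `size` bytes starting at `address`."""

    def __init__(self, address: int, size: int, base: object) -> None:
        # Holding `base` keeps the owner of the memory alive for as long as
        # this view exists; without it the memory could be freed under us.
        self.base = base
        self._array = (ctypes.c_ubyte * size).from_address(address)

    def __len__(self) -> int:
        return len(self._array)

    def to_bytes(self) -> bytes:
        # Copy the current contents of the viewed memory.
        return bytes(self._array)


# The bytearray plays the role of the original, foreign buffer.
owner = bytearray(range(8))
address = ctypes.addressof((ctypes.c_ubyte * len(owner)).from_buffer(owner))
view = ForeignView(address, len(owner), owner)
```

Because the view shares memory with `owner`, in-place writes to `owner` are visible through `view.to_bytes()`.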





[jira] [Updated] (ARROW-2252) [Python] Create buffer from address, size and base

2018-03-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-2252:
--
Labels: pull-request-available  (was: )

> [Python] Create buffer from address, size and base
> --
>
> Key: ARROW-2252
> URL: https://issues.apache.org/jira/browse/ARROW-2252
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Given a memory address and a size, we should be able to construct an Arrow 
> buffer from this. The additional base object will be used to hold a reference 
> to the underlying, original buffer so that it does not go out of scope before 
> the Arrow buffer.





[jira] [Commented] (ARROW-2252) [Python] Create buffer from address, size and base

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385205#comment-16385205
 ] 

ASF GitHub Bot commented on ARROW-2252:
---

xhochy opened a new pull request #1693: ARROW-2252: [Python] Create buffer from 
address, size and base
URL: https://github.com/apache/arrow/pull/1693
 
 
   Usage with Arrow Java vectors:
   
   ```
   import sys

   import jpype
   import numpy as np
   import pyarrow as pa

   # Start the JVM with Arrow and all of its dependencies on the class path.
   jpype.startJVM(jpype.getDefaultJVMPath(),
                  "-Djava.class.path=arrow-tools-0.9.0-SNAPSHOT-jar-with-dependencies.jar")

   # Create a Java vector and fill it with the values 0..127.
   ra = jpype.JPackage("org").apache.arrow.memory.RootAllocator(sys.maxsize)
   uint1 = jpype.JPackage("org").apache.arrow.vector.UInt1Vector("int", ra)
   uint1.allocateNew(128)
   for i in range(128):
       uint1.setSafe(i, i)
   uint1.setValueCount(128)

   # Access it in Python. The third argument is the base object that keeps the
   # underlying memory alive. The original snippet passed an undefined name `n`;
   # passing the vector itself is an assumption.
   addr = uint1.getDataBuffer().unwrap().memoryAddress()
   size = uint1.getDataBuffer().unwrap().capacity()
   fb = pa.ForeignBuffer(addr, size, uint1)
   np.asarray(fb)
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Python] Create buffer from address, size and base
> --
>
> Key: ARROW-2252
> URL: https://issues.apache.org/jira/browse/ARROW-2252
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Given a memory address and a size, we should be able to construct an Arrow 
> buffer from this. The additional base object will be used to hold a reference 
> to the underlying, original buffer so that it does not go out of scope before 
> the Arrow buffer.





[jira] [Commented] (ARROW-2238) [C++] Detect clcache in cmake configuration

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385197#comment-16385197
 ] 

ASF GitHub Bot commented on ARROW-2238:
---

MaxRis commented on issue #1684: ARROW-2238: [C++] Detect and use clcache in 
cmake configuration
URL: https://github.com/apache/arrow/pull/1684#issuecomment-370238637
 
 
   @pitrou it should be fine to call `set(CMAKE_CXX_COMPILER ${CLCACHE_FOUND})` 
only if the generator is `Ninja` or `NMake Makefiles`. This may also resolve 
the current AppVeyor failure. What do you think?




> [C++] Detect clcache in cmake configuration
> ---
>
> Key: ARROW-2238
> URL: https://issues.apache.org/jira/browse/ARROW-2238
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Antoine Pitrou
>Assignee: Antoine Pitrou
>Priority: Minor
>  Labels: pull-request-available
>
> By default Windows builds should use clcache if installed.





[jira] [Created] (ARROW-2252) [Python] Create buffer from address, size and base

2018-03-04 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2252:
--

 Summary: [Python] Create buffer from address, size and base
 Key: ARROW-2252
 URL: https://issues.apache.org/jira/browse/ARROW-2252
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Python
Reporter: Uwe L. Korn
 Fix For: 0.9.0


Given a memory address and a size, we should be able to construct an Arrow 
buffer from this. The additional base object will be used to hold a reference 
to the underlying, original buffer so that it does not go out of scope before 
the Arrow buffer.





[jira] [Updated] (ARROW-2245) [Python] Revert static linkage of parquet-cpp in manylinux1 wheel

2018-03-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-2245:
--
Labels: pull-request-available  (was: )

> [Python] Revert static linkage of parquet-cpp in manylinux1 wheel
> -
>
> Key: ARROW-2245
> URL: https://issues.apache.org/jira/browse/ARROW-2245
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Although in theory we are not the authoritative source of parquet-cpp with 
> the pyarrow manylinux1 wheel, in practice we are, and statically linking 
> parquet-cpp can introduce problems that dynamically linking it does not 
> (e.g. duplicate unloading of the library if you include it in a Python wheel 
> and in the process that creates the Python interpreter).





[jira] [Commented] (ARROW-2245) [Python] Revert static linkage of parquet-cpp in manylinux1 wheel

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385170#comment-16385170
 ] 

ASF GitHub Bot commented on ARROW-2245:
---

xhochy opened a new pull request #1692: ARROW-2245: ARROW-2246: [Python] Revert 
static linkage of parquet-cpp in manylinux1 wheel
URL: https://github.com/apache/arrow/pull/1692
 
 
   




> [Python] Revert static linkage of parquet-cpp in manylinux1 wheel
> -
>
> Key: ARROW-2245
> URL: https://issues.apache.org/jira/browse/ARROW-2245
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Although in theory we are not the authoritative source of parquet-cpp with 
> the pyarrow manylinux1 wheel, in practice we are, and statically linking 
> parquet-cpp can introduce problems that dynamically linking it does not 
> (e.g. duplicate unloading of the library if you include it in a Python wheel 
> and in the process that creates the Python interpreter).





[jira] [Commented] (ARROW-2238) [C++] Detect clcache in cmake configuration

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385169#comment-16385169
 ] 

ASF GitHub Bot commented on ARROW-2238:
---

pitrou commented on issue #1684: ARROW-2238: [C++] Detect and use clcache in 
cmake configuration
URL: https://github.com/apache/arrow/pull/1684#issuecomment-370234208
 
 
   clcache works best with Ninja or NMake (*). My suggestion here would be to 
recommend Ninja + clcache for best build performance. The other concern, 
though, is to avoid breaking existing builds for those who prefer other 
generators (e.g. Visual Studio).
   
   (*) See the following links:
   - https://github.com/frerich/clcache/issues/273#issuecomment-354623452
   - 
https://github.com/frerich/clcache/wiki/Integration#integration-for-visual-studio
   




> [C++] Detect clcache in cmake configuration
> ---
>
> Key: ARROW-2238
> URL: https://issues.apache.org/jira/browse/ARROW-2238
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Antoine Pitrou
>Assignee: Antoine Pitrou
>Priority: Minor
>  Labels: pull-request-available
>
> By default Windows builds should use clcache if installed.





[jira] [Commented] (ARROW-2238) [C++] Detect clcache in cmake configuration

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385116#comment-16385116
 ] 

ASF GitHub Bot commented on ARROW-2238:
---

MaxRis commented on issue #1684: ARROW-2238: [C++] Detect and use clcache in 
cmake configuration
URL: https://github.com/apache/arrow/pull/1684#issuecomment-370228503
 
 
   Update: it seems that the current solution, `set(CMAKE_CXX_COMPILER 
${CLCACHE_FOUND})`, works only with the `NMake Makefiles` generator; clcache 
is not called if `Visual Studio 14 2015 Win64` or a similar generator is used.




> [C++] Detect clcache in cmake configuration
> ---
>
> Key: ARROW-2238
> URL: https://issues.apache.org/jira/browse/ARROW-2238
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Antoine Pitrou
>Assignee: Antoine Pitrou
>Priority: Minor
>  Labels: pull-request-available
>
> By default Windows builds should use clcache if installed.





[jira] [Updated] (ARROW-2251) [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes a crash

2018-03-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-2251:
--
Labels: pull-request-available  (was: )

> [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes 
> a crash
> -
>
> Key: ARROW-2251
> URL: https://issues.apache.org/jira/browse/ARROW-2251
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: GLib
>Affects Versions: 0.8.0
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Commented] (ARROW-2251) [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes a crash

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385107#comment-16385107
 ] 

ASF GitHub Bot commented on ARROW-2251:
---

kou opened a new pull request #1691: ARROW-2251: [GLib] Keep GArrowBuffer alive 
while GArrowTensor for the buffer is live
URL: https://github.com/apache/arrow/pull/1691
 
 
   It prevents a GC-related crash.




> [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes 
> a crash
> -
>
> Key: ARROW-2251
> URL: https://issues.apache.org/jira/browse/ARROW-2251
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: GLib
>Affects Versions: 0.8.0
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.9.0
>
>






[jira] [Created] (ARROW-2251) [GLib] Destroying GArrowBuffer while GArrowTensor that uses the buffer causes a crash

2018-03-04 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-2251:
---

 Summary: [GLib] Destroying GArrowBuffer while GArrowTensor that 
uses the buffer causes a crash
 Key: ARROW-2251
 URL: https://issues.apache.org/jira/browse/ARROW-2251
 Project: Apache Arrow
  Issue Type: Bug
  Components: GLib
Affects Versions: 0.8.0
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
 Fix For: 0.9.0





