[jira] [Created] (ARROW-6031) [Java] Support iterating a vector by ArrowBufPointer

2019-07-24 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6031:
---

 Summary: [Java] Support iterating a vector by ArrowBufPointer
 Key: ARROW-6031
 URL: https://issues.apache.org/jira/browse/ARROW-6031
 Project: Apache Arrow
  Issue Type: New Feature
Reporter: Liya Fan
Assignee: Liya Fan


Provide the functionality to traverse a vector (fixed-width or variable-width) with an iterator. This is convenient for scenarios where vector elements are accessed in sequence.
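The idea can be sketched in Python (illustrative only, not the actual Java API; the standard Arrow variable-width layout with an (n+1)-entry offsets buffer is assumed):

```python
# Illustrative sketch: sequential traversal of a variable-width vector,
# yielding each element as a slice of the data buffer -- analogous to
# advancing an ArrowBufPointer element by element.
def iter_varwidth(offsets, data):
    # offsets has n + 1 entries for n elements (standard Arrow layout)
    for i in range(len(offsets) - 1):
        yield data[offsets[i]:offsets[i + 1]]

# usage: three elements, the middle one empty
elements = list(iter_varwidth([0, 3, 3, 8], b"fooxyzzy"))
```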



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (ARROW-6030) [Java] Efficiently compute hash code for ArrowBufPointer

2019-07-24 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6030:
---

 Summary: [Java] Efficiently compute hash code for ArrowBufPointer
 Key: ARROW-6030
 URL: https://issues.apache.org/jira/browse/ARROW-6030
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Java
Reporter: Liya Fan
Assignee: Liya Fan


Now that ArrowBufHasher has been introduced, we can compute the hash code of a contiguous region within an ArrowBuf.

We should optimize the process so the hash code is computed efficiently and recomputation is avoided.
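A minimal sketch of the caching idea, in Python with illustrative names (the real change would be to the Java ArrowBufPointer/ArrowBufHasher classes): compute the hash lazily and invalidate it when the pointer is repositioned.

```python
# Illustrative only: lazy, cached hash code for a (buffer, offset, length)
# region; the cache is invalidated whenever the region changes.
class BufPointer:
    def __init__(self, buf, offset, length):
        self._set(buf, offset, length)

    def _set(self, buf, offset, length):
        self.buf, self.offset, self.length = buf, offset, length
        self._hash = None  # forget any previously computed hash

    def reposition(self, buf, offset, length):
        self._set(buf, offset, length)

    def hash_code(self):
        if self._hash is None:  # computed at most once per region
            region = bytes(self.buf[self.offset:self.offset + self.length])
            self._hash = hash(region)
        return self._hash
```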





[jira] [Created] (ARROW-6029) [R] could not build

2019-07-24 Thread kohleth (JIRA)
kohleth created ARROW-6029:
--

 Summary: [R] could not build
 Key: ARROW-6029
 URL: https://issues.apache.org/jira/browse/ARROW-6029
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Affects Versions: 0.14.0
 Environment: > sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS
Reporter: kohleth


hi there,

when trying to build the R wrapper using 
{code:java}
remotes::install_github("apache/arrow", subdir = "r"){code}
I hit the following error:
{code:text}
Found pkg-config cflags and libs!
PKG_CFLAGS=-DNDEBUG -DARROW_R_WITH_ARROW
PKG_LIBS=-larrow -lparquet
** libs
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -DNDEBUG -DARROW_R_WITH_ARROW -I"/usr/lib/R/site-library/Rcpp/include" -fvisibility=hidden -fpic -g -O2 -fdebug-prefix-map=/build/r-base-VjHo9C/r-base-3.6.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c array.cpp -o array.o
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -DNDEBUG -DARROW_R_WITH_ARROW -I"/usr/lib/R/site-library/Rcpp/include" -fvisibility=hidden -fpic -g -O2 -fdebug-prefix-map=/build/r-base-VjHo9C/r-base-3.6.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c array__to_vector.cpp -o array__to_vector.o
array__to_vector.cpp: In function 'Rcpp::List Table__to_dataframe(const std::shared_ptr&, bool)':
array__to_vector.cpp:819:65: error: 'using element_type = class arrow::Column {aka class arrow::Column}' has no member named 'chunks'
 converters[i] = arrow::r::Converter::Make(table->column(i)->chunks());
                                                                 ^~
array__to_vector.cpp:820:23: error: 'using element_type = class arrow::Table {aka class arrow::Table}' has no member named 'field'
 names[i] = table->field(i)->name();
                       ^
/usr/lib/R/etc/Makeconf:176: recipe for target 'array__to_vector.o' failed
make: *** [array__to_vector.o] Error 1
ERROR: compilation failed for package 'arrow'
* removing '/home/kchia/R/x86_64-pc-linux-gnu-library/3.6/arrow'
Error: Failed to install 'arrow' from GitHub:
  (converted from warning) installation of package '/tmp/RtmpfYJZFa/file33fc6aee0ae6/arrow_0.14.0.9000.tar.gz' had non-zero exit status
{code}





Re: [Discuss] Do a 0.15.0 release before 1.0.0?

2019-07-24 Thread Bryan Cutler
+1 on a 0.15.0 release. At the minimum, if we could detect the stream and
provide a clear error message for Python and Java I think that would help
the transition. If we are also able to implement readers/writers that can
fallback to 4-byte prefix, then that would be nice to have.

On Wed, Jul 24, 2019 at 1:27 PM Jacques Nadeau  wrote:

> I'm ok with the change and 0.15 release to better manage it.
>
>
> > I've always understood the metadata to be a few dozen/hundred KB, a
> > small percentage of the total message size. I could be underestimating
> > the ratios though -- is it common to have tables w/ 1000+ columns? I've
> > seen a few reports like that in cuDF, but I'm curious to hear
> > Jacques'/Dremio's experience too.
> >
>
> Metadata size has been an issue at different points for us. We do
> definitely see datasets with 1000+ columns. It is also compounded by the
> fact that as we add more columns, we typically decrease row count so that
> the individual batches are still easily pipelined--which further increases
> the relative ratio between data and metadata.
>


Re: [Discuss] Do a 0.15.0 release before 1.0.0?

2019-07-24 Thread Jacques Nadeau
I'm ok with the change and 0.15 release to better manage it.


> I've always understood the metadata to be a few dozen/hundred KB, a
> small percentage of the total message size. I could be underestimating
> the ratios though -- is it common to have tables w/ 1000+ columns? I've
> seen a few reports like that in cuDF, but I'm curious to hear
> Jacques'/Dremio's experience too.
>

Metadata size has been an issue at different points for us. We do
definitely see datasets with 1000+ columns. It is also compounded by the
fact that as we add more columns, we typically decrease row count so that
the individual batches are still easily pipelined--which further increases
the relative ratio between data and metadata.


Re: [Discuss] Do a 0.15.0 release before 1.0.0?

2019-07-24 Thread Paul Taylor

I'm not sure I understand this suggestion:
1.  Wouldn't this cause old readers to miss the last 4 bytes of the buffer
(and provide meaningless bytes at the beginning)?
2.  The current proposal on the other thread is to have the pattern be
<0x>


Sorry I didn't mean to say an int64_t length, just that now we'd be 
reserving 8 bytes in the "metadata length" position where today we 
reserve 4.


I'm not sure about every language, but at least in Python/JS an external 
forwards-compatible solution would involve slicing the message buffer up 
front like this:


import struct
import pyarrow as pa

def first_four_bytes_are_max_int32(b):  # hypothetical sentinel check
    return struct.unpack('<i', b[:4])[0] == 2**31 - 1

def adjust_message_buffer(message_bytes):
    buf = pa.py_buffer(message_bytes)
    if first_four_bytes_are_max_int32(message_bytes):
        return buf.slice(4)
    return buf



On 7/23/19 7:31 PM, Micah Kornfield wrote:

Could we detect the 4-byte length, incur a penalty copying the memory to
an aligned buffer, then continue consuming the stream?

I think that is the plan (or at least would be my plan) if we go ahead with the change.




(It's probably
fine if we only write the 8-byte length, since consumers on older
versions of Arrow could slice from the 4th byte before passing a buffer
to the reader).

I'm not sure I understand this suggestion:
1.  Wouldn't this cause old readers to miss the last 4 bytes of the buffer
(and provide meaningless bytes at the beginning)?
2.  The current proposal on the other thread is to have the pattern be
<0x>

Thanks,
Micah

On Tue, Jul 23, 2019 at 11:43 AM Paul Taylor 
wrote:


+1 for a 0.15.0 before 1.0 if we go ahead with this.

I'm curious to hear other's thoughts about compatibility. I think we
should avoid breaking backwards compatibility if possible. It's common
for apps/libs to be pinned on specific Arrow versions, and I worry it'd
cause a lot of work for downstream devs to audit their tool suite for
full Arrow binary compatibility (and/or require their customers to do
the same).

Could we detect the 4-byte length, incur a penalty copying the memory to
an aligned buffer, then continue consuming the stream? (It's probably
fine if we only write the 8-byte length, since consumers on older
versions of Arrow could slice from the 4th byte before passing a buffer
to the reader).

I've always understood the metadata to be a few dozen/hundred KB, a
small percentage of the total message size. I could be underestimating
the ratios though -- is it common to have tables w/ 1000+ columns? I've
seen a few reports like that in cuDF, but I'm curious to hear
Jacques'/Dremio's experience too.

If copying is feasible, it doesn't seem so bad a trade-off to maintain
backwards-compatibility. As libraries and consumers upgrade their Arrow
dependencies, the 4-byte length will be less and less common, and
they'll be less likely to pay the cost.



On 7/23/19 2:22 AM, Uwe L. Korn wrote:

It is also a good way to test the change in public. We don't want to
adjust something like this again in a 1.0.0 release. Already doing this
in 0.15.0, and then maybe making adjustments due to issues that appear "in
the wild", is psychologically the easier way. Users attach a lot of
significance to the magic 1.0, so I would plan to minimize what is
changed between 1.0 and pre-1.0. This should also save us maintainers some
time, as I would expect different behaviour in bug reports between 1.0 and
pre-1.0 issues.

Uwe

On Tue, Jul 23, 2019, at 7:52 AM, Micah Kornfield wrote:

I think the main reason to do a release before 1.0.0 is if we want to
make the change that would give a good error message for forward
incompatibility (I think this could be done as 0.14.2 since it would
just be clarifying an error message). Otherwise, I think including it in
1.0.0 would be fine (it's still not clear to me if there is consensus to
fix the issue).

Thanks,
Micah


On Monday, July 22, 2019, Wes McKinney  wrote:


I'd be satisfied with fixing the Flatbuffer alignment issue either in
a 0.15.0 or 1.0.0. In the interest of expediency, though, making a
0.15.0 with this change sooner rather than later might be prudent.

On Mon, Jul 22, 2019 at 12:35 PM Antoine Pitrou 
wrote:

Hello,

Recently we've discussed breaking the IPC format to fix a

long-standing

alignment issue.  See this discussion:


https://lists.apache.org/thread.html/8cea56f2069710ac128ff9129c744f0ef96a3e33a4d79d7e820019af@%3Cdev.arrow.apache.org%3E

Should we first do a 0.15.0 in order to get those format fixes right?
Once that is fine and settled we can move to the 1.0.0 release?

Regards

Antoine.







Building on Arrow CUDA

2019-07-24 Thread Paul Taylor
I'm looking at options to replace the custom Arrow logic in cuDF with 
Arrow library calls. What's the recommended way to declare a dependency 
on pyarrow / Arrow C++ with CUDA support?


I see in the docs it says to build from source, but that's only an 
option for an (advanced) end user. And building/vendoring 
libarrow_cuda.so isn't a great option for a non-Arrow library, because 
someone who does a source build of Arrow with CUDA will conflict with the 
version we ship.


Right now we're considering statically linking libarrow_cuda into 
libcudf.so and vendoring Arrow's cuda cython alongside ours, but this 
increases compile times/library size.


Is there a package management solution (like pip/conda install 
pyarrow[cuda]) that I'm missing? If not, should there be?


Best,

Paul



[jira] [Created] (ARROW-6028) Failed to compile on windows platform using arrow

2019-07-24 Thread Haowei Yu (JIRA)
Haowei Yu created ARROW-6028:


 Summary: Failed to compile on windows platform using arrow
 Key: ARROW-6028
 URL: https://issues.apache.org/jira/browse/ARROW-6028
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Python
Affects Versions: 0.14.0
Reporter: Haowei Yu


I am writing a Python extension, trying to compile C++ code and link it against 
the Arrow library on the Windows platform (using Visual Studio 2017), and 
compilation failed.

{code:text}
building 'snowflake.connector.arrow_iterator' extension
C:\Program Files (x86)\Microsoft Visual 
Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX86\x64\cl.exe /c 
/nologo /Ox /W3 /GL /DNDEBUG /MD -Icpp/ArrowIterator/ 
-Ic:\Users\Haowei\py36env\lib\site-packages\pyarrow\include 
-IC:\Users\Haowei\AppData\Local\Programs\Python\Python36\include 
-IC:\Users\Haowei\AppData\Local\Programs\Python\Python36\include "-IC:\Program 
Files (x86)\Microsoft Visual 
Studio\2017\Community\VC\Tools\MSVC\14.16.27023\ATLMFC\include" "-IC:\Program 
Files (x86)\Microsoft Visual 
Studio\2017\Community\VC\Tools\MSVC\14.16.27023\include" "-IC:\Program Files 
(x86)\Windows Kits\NETFXSDK\4.6.1\include\um" "-IC:\Program Files (x86)\Windows 
Kits\10\include\10.0.17763.0\ucrt" "-IC:\Program Files (x86)\Windows 
Kits\10\include\10.0.17763.0\shared" "-IC:\Program Files (x86)\Windows 
Kits\10\include\10.0.17763.0\um" "-IC:\Program Files (x86)\Windows 
Kits\10\include\10.0.17763.0\winrt" "-IC:\Program Files (x86)\Windows 
Kits\10\include\10.0.17763.0\cppwinrt" /EHsc /Tpbuild\cython\arrow_iterator.cpp 
/Fobuild\temp.win-amd64-3.6\Release\build\cython\arrow_iterator.obj -std=c++11
cl : Command line warning D9002 : ignoring unknown option '-std=c++11'
arrow_iterator.cpp
c:\Users\Haowei\py36env\lib\site-packages\pyarrow\include\arrow/type.h(852): 
error C2528: '__timezone': pointer to reference is illegal
c:\Users\Haowei\py36env\lib\site-packages\pyarrow\include\arrow/type.h(859): 
error C2269: cannot create a pointer or reference to a qualified function type 
(requires pointer-to-member)
c:\Users\Haowei\py36env\lib\site-packages\pyarrow\include\arrow/type.h(853): 
error C2664: 
'std::basic_string<char,std::char_traits<char>,std::allocator<char>>::basic_string(const
 std::basic_string<char,std::char_traits<char>,std::allocator<char>> &)': 
cannot convert argument 1 from 'const std::string *' to 
'std::initializer_list<_Elem>'
with
[
_Elem=char
]
c:\Users\Haowei\py36env\lib\site-packages\pyarrow\include\arrow/type.h(852): 
note: No constructor could take the source type, or constructor overload 
resolution was ambiguous
c:\Users\Haowei\py36env\lib\site-packages\pyarrow\include\arrow/type.h(859): 
error C2440: 'return': cannot convert from 'std::string' to 'const std::string 
*(__cdecl *)(void)'
c:\Users\Haowei\py36env\lib\site-packages\pyarrow\include\arrow/type.h(859): 
note: No user-defined-conversion operator available that can perform this 
conversion, or the operator cannot be called
c:\Users\Haowei\py36env\lib\site-packages\pyarrow\include\arrow/type.h(1126): 
error C2528: '__timezone': pointer to reference is illegal
error: command 'C:\\Program Files (x86)\\Microsoft Visual 
Studio\\2017\\Community\\VC\\Tools\\MSVC\\14.16.27023\\bin\\HostX86\\x64\\cl.exe'
 failed with exit status 2
{code}

I googled a little bit and found a similar issue in the feather repo: 
https://github.com/wesm/feather/issues/111

So I did something similar to their fix, adding the following code to the 
type.h header file (according to 
https://github.com/wesm/feather/pull/146/files):
{code:c++}
#if _MSC_VER >= 1900
  #undef timezone
#endif
{code}

Not sure if this is the right way to fix it. If yes, I can submit a PR.








[jira] [Created] (ARROW-6027) CMake Build w/boost_ep fails on Windows - "%1 is not a valid Win32 application"

2019-07-24 Thread Jonathan McDevitt (JIRA)
Jonathan McDevitt created ARROW-6027:


 Summary: CMake Build w/boost_ep fails on Windows - "%1 is not a 
valid Win32 application"
 Key: ARROW-6027
 URL: https://issues.apache.org/jira/browse/ARROW-6027
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Jonathan McDevitt
 Attachments: _release64CMakeBuildLogs.txt, _release64CMakeLogs.txt

Hi all,

I seem to be running into an issue when building Apache Arrow for Windows. It 
fails to build boost; in the CMake output it says

 
{code:java}
CMake Error at 
D:/Staging/arrow/cpp/release64/boost_ep-prefix/src/boost_ep-stamp/boost_ep-configure-Release.cmake:49
 (message):
Command failed: %1 is not a valid Win32 application

'./bootstrap.sh' 
'--prefix=D:/Staging/arrow/cpp/release64/boost_ep-prefix/src/boost_ep' 
'--with-libraries=filesystem,regex,system'
{code}
I've been trying to address this issue and am currently investigating using a 
pre-built Boost library as a workaround, but the expectation is that this 
should work out of the box. I have attached logs demonstrating this behaviour. 
The initial step of running CMake for Windows 64 is fine, but the actual build 
step is what fails, and the boost_ep-configure-*.log files are empty, so there 
is nothing there to give an idea of what's going on.

 
h2. Expected Behaviour

When building Apache Arrow 0.14.x, build should work out of the box when VS 
2015 build tools are present and the environment is configured with vcvarsall 
for the appropriate architecture.
h2. Observed Behaviour

Build fails with error: 
{code:java}
Command failed: %1 is not a valid Win32 application

'./bootstrap.sh' 
'--prefix=D:/Staging/arrow/cpp/release64/boost_ep-prefix/src/boost_ep' 
'--with-libraries=filesystem,regex,system'{code}
h2. Steps to Reproduce
 # Sync to Maintenance 0.14.x with 'git clone -b maint-0.14.x 
[https://github.com/apache/arrow.git]'
 # Following the instructions at 
[https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst]:
 ## Create a 'build' directory from which to run CMake and generate the 
appropriate build files.
 ## Run "%VS140COMNTOOLS%..\..\VC\vcvarsall.bat" amd64
 ## From within the build directory, run "cmake .. -G "Visual Studio 14 2015 
Win64" -DARROW_BUILD_TESTS=ON"
 ### Alternatively, if running Ninja, run "cmake .. -GNinja 
-DCMAKE_C_COMPILER="cl.exe" -DCMAKE_CXX_COMPILER="cl.exe" 
-DARROW_BUILD_TESTS=ON"
 ## Observe error.

Thanks,
~Jon





Re: Arrow sync call July 24 at 12:00 US/Eastern, 16:00 UTC

2019-07-24 Thread Micah Kornfield
Want to try https://meet.google.com/myj-ospb-dxw

On Wed, Jul 24, 2019 at 9:12 AM Antoine Pitrou  wrote:

>
> Apparently we're all having the same problem...
>
>
> Le 24/07/2019 à 18:06, Micah Kornfield a écrit :
> > Is this happening?  I can't seem to join?
> >
> > On Tue, Jul 23, 2019 at 7:26 PM Neal Richardson <
> neal.p.richard...@gmail.com>
> > wrote:
> >
> >> Hi everyone,
> >> Reminder that the biweekly Arrow call is tomorrow (well, already today
> for
> >> some of you) at https://meet.google.com/vtm-teks-phx. All are welcome
> to
> >> join. Notes will be sent out to the mailing list afterwards.
> >>
> >> Neal
> >>
> >
>


Re: Arrow sync call July 24 at 12:00 US/Eastern, 16:00 UTC

2019-07-24 Thread Antoine Pitrou


Apparently we're all having the same problem...


Le 24/07/2019 à 18:06, Micah Kornfield a écrit :
> Is this happening?  I can't seem to join?
> 
> On Tue, Jul 23, 2019 at 7:26 PM Neal Richardson 
> wrote:
> 
>> Hi everyone,
>> Reminder that the biweekly Arrow call is tomorrow (well, already today for
>> some of you) at https://meet.google.com/vtm-teks-phx. All are welcome to
>> join. Notes will be sent out to the mailing list afterwards.
>>
>> Neal
>>
> 


Re: Arrow sync call July 24 at 12:00 US/Eastern, 16:00 UTC

2019-07-24 Thread Micah Kornfield
Is this happening?  I can't seem to join?

On Tue, Jul 23, 2019 at 7:26 PM Neal Richardson 
wrote:

> Hi everyone,
> Reminder that the biweekly Arrow call is tomorrow (well, already today for
> some of you) at https://meet.google.com/vtm-teks-phx. All are welcome to
> join. Notes will be sent out to the mailing list afterwards.
>
> Neal
>


[jira] [Created] (ARROW-6026) [Doc] Add CONTRIBUTING.md

2019-07-24 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-6026:
-

 Summary: [Doc] Add CONTRIBUTING.md
 Key: ARROW-6026
 URL: https://issues.apache.org/jira/browse/ARROW-6026
 Project: Apache Arrow
  Issue Type: Task
  Components: Documentation
Reporter: Antoine Pitrou
 Fix For: 1.0.0


A CONTRIBUTING.md file at the top-level of a repository is automatically picked 
up by Github and displayed when people open an issue or PR for the first time.





[jira] [Created] (ARROW-6025) [Gandiva][Test] Error handling for missing timezone in castTIMESTAMP_utf8 tests

2019-07-24 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-6025:
--

 Summary: [Gandiva][Test] Error handling for missing timezone in 
castTIMESTAMP_utf8 tests
 Key: ARROW-6025
 URL: https://issues.apache.org/jira/browse/ARROW-6025
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++ - Gandiva
Reporter: Krisztian Szucs


I've recently enabled Gandiva in the conda C++ ursabot builders. The container 
doesn't contain the required timezone data, so the tests are failing:

{code}
../src/gandiva/precompiled/time_test.cc:103: Failure
Expected equality of these values:
  castTIMESTAMP_utf8(context_ptr, "2000-09-23 9:45:30.920 Canada/Pacific", 37)
Which is: 0
  969727530920
../src/gandiva/precompiled/time_test.cc:105: Failure
Expected equality of these values:
  castTIMESTAMP_utf8(context_ptr, "2012-02-28 23:30:59 Asia/Kolkata", 32)
Which is: 0
  1330452059000
../src/gandiva/precompiled/time_test.cc:107: Failure
Expected equality of these values:
  castTIMESTAMP_utf8(context_ptr, "1923-10-07 03:03:03 America/New_York", 36)
Which is: 0
  -1459094217000
{code}

See build: https://ci.ursalabs.org/#/builders/66/builds/3046/steps/8/logs/stdio





[jira] [Created] (ARROW-6024) [Java] Provide more hash algorithms

2019-07-24 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6024:
---

 Summary: [Java] Provide more hash algorithms 
 Key: ARROW-6024
 URL: https://issues.apache.org/jira/browse/ARROW-6024
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Java
Reporter: Liya Fan
Assignee: Liya Fan


Provide more hash algorithms to choose from for different scenarios.
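For illustration only (the issue does not name specific algorithms), one commonly offered choice is 32-bit FNV-1a over a byte region, sketched here in Python:

```python
# 32-bit FNV-1a: a simple, fast, non-cryptographic hash often offered
# alongside others as a selectable algorithm.
def fnv1a_32(data: bytes) -> int:
    h = 0x811C9DC5                         # FNV offset basis
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, wrap to 32 bits
    return h
```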





[jira] [Created] (ARROW-6023) [C++][Gandiva] Add functions in Gandiva

2019-07-24 Thread Prudhvi Porandla (JIRA)
Prudhvi Porandla created ARROW-6023:
---

 Summary: [C++][Gandiva] Add functions in Gandiva
 Key: ARROW-6023
 URL: https://issues.apache.org/jira/browse/ARROW-6023
 Project: Apache Arrow
  Issue Type: Task
  Components: C++ - Gandiva
Reporter: Prudhvi Porandla
Assignee: Prudhvi Porandla
 Fix For: 1.0.0


Support the following functions in Gandiva:
 # int32 castINT(int64) : cast int64 to int32
 # float4 castFLOAT4(float8) : cast float8 to float4
 # int64 truncate(int64, int32 scale) : if scale is negative, make the last 
-scale digits zero
 # timestamp add(date, int32 days) : add days to the date (in milliseconds) and 
return a timestamp
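The semantics described above can be sketched in Python; the real implementations are Gandiva C++ precompiled functions, and these names are illustrative:

```python
# truncate(int64, int32 scale): for negative scale, zero out the last
# -scale decimal digits, preserving the sign of the input.
def truncate(value: int, scale: int) -> int:
    if scale >= 0:
        return value
    factor = 10 ** (-scale)
    magnitude = (abs(value) // factor) * factor
    return -magnitude if value < 0 else magnitude

# timestamp add(date, int32 days): date is in milliseconds since the epoch.
def add_days(date_millis: int, days: int) -> int:
    return date_millis + days * 86_400_000  # 24 * 60 * 60 * 1000
```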





[jira] [Created] (ARROW-6022) [Java] Support equals API in ValueVector to compare two vectors equal

2019-07-24 Thread Ji Liu (JIRA)
Ji Liu created ARROW-6022:
-

 Summary: [Java] Support equals API in ValueVector to compare two 
vectors equal
 Key: ARROW-6022
 URL: https://issues.apache.org/jira/browse/ARROW-6022
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Java
Reporter: Ji Liu
Assignee: Ji Liu


In some cases, this feature is useful.

In ARROW-1184, {{Dictionary#equals}} does not work due to the lack of this API.

Moreover, we already implemented {{equals(int index, ValueVector target, int 
targetIndex)}}, so this newly added API could reuse it.
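The reuse could look roughly like this (a Python stand-in for the Java API; names are illustrative):

```python
# Whole-vector equality built on a per-element equals, mirroring the idea
# of reusing equals(int index, ValueVector target, int targetIndex).
def vectors_equal(left, right, element_equals):
    if len(left) != len(right):
        return False
    return all(element_equals(left, i, right, i) for i in range(len(left)))
```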





[jira] [Created] (ARROW-6021) [Java] Extract copyFrom and copyFromSafe to ValueVector

2019-07-24 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6021:
---

 Summary: [Java] Extract copyFrom and copyFromSafe to ValueVector
 Key: ARROW-6021
 URL: https://issues.apache.org/jira/browse/ARROW-6021
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Java
Reporter: Liya Fan
Assignee: Liya Fan


Currently we have copyFrom and copyFromSafe methods in fixed-width and 
variable-width vectors. Extracting them to the common super interface will make 
them much more convenient to use, and avoid unnecessary if-else statements.
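In Python terms the refactoring looks roughly like this (the real change is to the Java ValueVector interface; names are illustrative):

```python
# Once copy_from is declared on the common base type, callers can copy
# elements without per-subtype isinstance/if-else dispatch.
class ValueVector:
    def copy_from(self, from_index, this_index, source):
        raise NotImplementedError

class IntVector(ValueVector):
    def __init__(self, values):
        self.values = list(values)

    def copy_from(self, from_index, this_index, source):
        self.values[this_index] = source.values[from_index]

def copy_element(dst: ValueVector, src: ValueVector, i: int):
    dst.copy_from(i, i, src)  # no type checks needed
```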


