[jira] [Created] (ARROW-16704) tableFromIPC should handle AsyncRecordBatchReader inputs

2022-05-31 Thread Paul Taylor (Jira)
Paul Taylor created ARROW-16704:
---

 Summary: tableFromIPC should handle AsyncRecordBatchReader inputs
 Key: ARROW-16704
 URL: https://issues.apache.org/jira/browse/ARROW-16704
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: 8.0.0
Reporter: Paul Taylor


To match the prior `Table.from()` method, `tableFromIPC()` should handle the 
case where the input is an async RecordBatchReader.
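A minimal sketch of the desired dispatch, with a stub async generator standing in for a real RecordBatchReader (the promise-unwrapping branch and the stub are assumptions for illustration, not the shipped apache-arrow implementation):

{code:javascript}
// Hypothetical sketch: accept either a reader or a Promise of a reader,
// then drain it with for-await (works for both sync and async iterables).
async function tableFromIPC(input) {
  // unwrap a Promise<RecordBatchReader> if one was passed
  const reader = (input && typeof input.then === 'function') ? await input : input;
  const batches = [];
  for await (const batch of reader) batches.push(batch);
  return { batches }; // placeholder for `new Table(batches)`
}

// stub async reader yielding two fake record batches
const stubReader = (async function* () { yield 'batch0'; yield 'batch1'; })();
tableFromIPC(Promise.resolve(stubReader)).then((t) => console.log(t.batches.length)); // prints 2
{code}

Because `for await` also accepts sync iterables, the same body covers the existing synchronous inputs.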



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (ARROW-12570) [JS] Fix issues that blocked the v4.0.0 release

2021-04-27 Thread Paul Taylor (Jira)
Paul Taylor created ARROW-12570:
---

 Summary: [JS] Fix issues that blocked the v4.0.0 release
 Key: ARROW-12570
 URL: https://issues.apache.org/jira/browse/ARROW-12570
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor


A few issues had to be fixed manually for the v4.0.0 release:

* ts-jest threw a type error running the tests on the TS source
* lerna.json really does need those version numbers
* npm has introduced rate limits since v3.0.0
* npm 2FA one-time passwords had to be supported for publishing



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-12305) [JS] Benchmark test data generate.py assumes python 2

2021-04-08 Thread Paul Taylor (Jira)
Paul Taylor created ARROW-12305:
---

 Summary: [JS] Benchmark test data generate.py assumes python 2
 Key: ARROW-12305
 URL: https://issues.apache.org/jira/browse/ARROW-12305
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor








[jira] [Created] (ARROW-10255) [JS] Reorganize imports and exports to be more friendly to ESM tree-shaking

2020-10-09 Thread Paul Taylor (Jira)
Paul Taylor created ARROW-10255:
---

 Summary: [JS] Reorganize imports and exports to be more friendly 
to ESM tree-shaking
 Key: ARROW-10255
 URL: https://issues.apache.org/jira/browse/ARROW-10255
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: 0.17.1
Reporter: Paul Taylor
Assignee: Paul Taylor


Presently most of our public classes can't be easily 
[tree-shaken|https://webpack.js.org/guides/tree-shaking/] by library consumers. 
This is a problem for libraries that only need to use parts of Arrow.

For example, the vis.gl projects have an integration test that imports three of 
our simpler classes and tests the resulting bundle size:

{code:javascript}
import {Schema, Field, Float32} from 'apache-arrow';

// | Bundle Size            | Compressed
// | 202 KB (207112 bytes)  | 45 KB (46618 bytes)
{code}

We can help solve this with the following changes:
* Add "sideEffects": false to our ESM package.json
* Reorganize our imports to only include what's needed
* Eliminate or move some static/member methods to standalone exported functions
* Wrap the utf8 util's node Buffer detection in eval so Webpack doesn't compile 
in its own Buffer shim
* Remove flatbuffers namespaces from the generated TS, since these defeat 
Webpack's tree-shaking
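For reference, the first bullet is a one-line addition to the ESM target's package.json (a sketch; the other fields shown are placeholders):

{code:javascript}
{
  "name": "@apache-arrow/es2015-esm",
  "version": "x.y.z",
  "sideEffects": false
}
{code}

With "sideEffects": false, bundlers like Webpack are free to drop any re-exported module whose exports go unused.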

Candidate functions for removal/moving to standalone functions:
* Schema.new, Schema.from, Schema.prototype.compareTo
* Field.prototype.compareTo
* Type.prototype.compareTo
* Table.new, Table.from
* Column.new
* Vector.new, Vector.from
* RecordBatchReader.from

After applying a few of the above changes to the Schema and flatbuffers files, 
I was able to reduce the vis.gl import size by 90%:
{code:javascript}
// Bundle Size          | Compressed
// 24 KB (24942 bytes)  | 6 KB (6154 bytes)
{code}





[jira] [Created] (ARROW-9659) [C++] RecordBatchStreamReader throws on CUDA device buffers

2020-08-05 Thread Paul Taylor (Jira)
Paul Taylor created ARROW-9659:
--

 Summary: [C++] RecordBatchStreamReader throws on CUDA device 
buffers
 Key: ARROW-9659
 URL: https://issues.apache.org/jira/browse/ARROW-9659
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 1.0.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 1.0.1


Prior to 1.0.0, the RecordBatchStreamReader was capable of reading source 
CudaBuffers wrapped in a CudaBufferReader. In 1.0.0, the Array validation 
routines call into Buffer::data(), which throws an error if the source isn't in 
host memory.





[jira] [Commented] (ARROW-9041) [C++] overloaded virtual function "arrow::io::Writable::Write" is only partially overridden in class

2020-06-05 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126947#comment-17126947
 ] 

Paul Taylor commented on ARROW-9041:


These are resolved in [PR 5677|https://github.com/apache/arrow/pull/5677]. Now 
that the [new variant.hpp header|https://github.com/apache/arrow/pull/7053] is 
in 0.17.1, we should be able to upgrade.

> [C++] overloaded virtual function "arrow::io::Writable::Write" is only 
> partially overridden in class 
> -
>
> Key: ARROW-9041
> URL: https://issues.apache.org/jira/browse/ARROW-9041
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.15.0
>Reporter: Karthikeyan Natarajan
>Priority: Major
>  Labels: easyfix
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Following warnings appear 
> cpp/build/arrow/install/include/arrow/io/file.h(189): warning: overloaded 
> virtual function "arrow::io::Writable::Write" is only partially overridden in 
> class "arrow::io::MemoryMappedFile"
> cpp/build/arrow/install/include/arrow/io/memory.h(98): warning: overloaded 
> virtual function "arrow::io::Writable::Write" is only partially overridden in 
> class "arrow::io::MockOutputStream"
> cpp/build/arrow/install/include/arrow/io/memory.h(116): warning: overloaded 
> virtual function "arrow::io::Writable::Write" is only partially overridden in 
> class "arrow::io::FixedSizeBufferWriter"
> Suggestion solution is to use `using Writable::Write` in protected/private.
> [https://isocpp.org/wiki/faq/strange-inheritance#hiding-rule]





[jira] [Commented] (ARROW-8394) Typescript compiler errors for arrow d.ts files, when using es2015-esm package

2020-05-29 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119867#comment-17119867
 ] 

Paul Taylor commented on ARROW-8394:


Thanks [~pprice], I'll look into this. I had to do a bunch of weird things to 
trick the 3.5 compiler into propagating the types, so I'm hoping I can back 
some of those out to get it working in 3.9 and simplify the typedefs along the 
way.

> Typescript compiler errors for arrow d.ts files, when using es2015-esm package
> --
>
> Key: ARROW-8394
> URL: https://issues.apache.org/jira/browse/ARROW-8394
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.16.0
>Reporter: Shyamal Shukla
>Priority: Blocker
>
> Attempting to use apache-arrow within a web application, but typescript 
> compiler throws the following errors in some of arrow's .d.ts files
> import \{ Table } from "../node_modules/@apache-arrow/es2015-esm/Arrow";
> export class SomeClass {
> .
> .
> constructor() {
> const t = Table.from('');
> }
> *node_modules/@apache-arrow/es2015-esm/column.d.ts:14:22* - error TS2417: 
> Class static side 'typeof Column' incorrectly extends base class static side 
> 'typeof Chunked'. Types of property 'new' are incompatible.
> *node_modules/@apache-arrow/es2015-esm/ipc/reader.d.ts:238:5* - error TS2717: 
> Subsequent property declarations must have the same type. Property 'schema' 
> must be of type 'Schema', but here has type 'Schema'.
> 238 schema: Schema;
> *node_modules/@apache-arrow/es2015-esm/recordbatch.d.ts:17:18* - error 
> TS2430: Interface 'RecordBatch' incorrectly extends interface 'StructVector'. 
> The types of 'slice(...).clone' are incompatible between these types.
> the tsconfig.json file looks like
> {
>  "compilerOptions": {
>  "target":"ES6",
>  "outDir": "dist",
>  "baseUrl": "src/"
>  },
>  "exclude": ["dist"],
>  "include": ["src/*.ts"]
> }





[jira] [Commented] (ARROW-8053) [JS] Improve performance of filtering

2020-04-07 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077728#comment-17077728
 ] 

Paul Taylor commented on ARROW-8053:


[~hulettbh] wrote the predicates code. It could certainly be better optimized 
if it were JIT-compiled into a flat JS function.

An apples-to-apples comparison would be to filter the rows individually:

{code:javascript}
function filterStruct(struct, predicate) {
  let keys = [], i = -1, j = -1;
  // collect the indices of rows that pass the predicate
  for (let row of struct) if (predicate(row, ++i)) keys[++j] = i;
  return DictionaryVector.from(struct, new Int32(), keys);
}

function predicate(policy) {
  return policy.proto === 6
    && ((policy.startPort > 0 && policy.endPort < 200) || policy.startPort === 49152)
    && policy.isActive === true;
}

const count = filterStruct(policiesTable, predicate).length;
{code}

I generally agree with [~lmeyerov] though: don't do inline scans and reductions 
if you care about performance. Use WASM/web workers to distribute across CPU 
cores, or (better yet) WebGL TransformFeedback on the GPU (both work in node 
and browsers, and neither requires non-JS dependencies). 
Arrow excels at both of these strategies.
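The "JIT into a flat JS function" idea above can be sketched in plain JS (hypothetical; `compileFilter` is not an Arrow API -- it compiles a predicate expression over parallel column arrays with `new Function`, avoiding per-row object allocation):

{code:javascript}
// Compile a filter over parallel column arrays into one flat function.
// `expr` is a JS expression over the column names, indexed by `i`.
function compileFilter(columns, expr) {
  const names = Object.keys(columns);
  const body = `
    const keys = [];
    for (let i = 0; i < length; ++i) {
      if (${expr}) keys.push(i);
    }
    return keys;`;
  const fn = new Function(...names, 'length', body);
  const length = columns[names[0]].length;
  return fn(...names.map((n) => columns[n]), length);
}

const columns = {
  proto:     Int32Array.from([6, 17, 6]),
  startPort: Int32Array.from([80, 53, 49152]),
};
const keys = compileFilter(columns,
  'proto[i] === 6 && (startPort[i] < 200 || startPort[i] === 49152)');
// keys -> [0, 2]
{code}

The compiled loop touches only flat typed arrays, which is the same access pattern a hand-written scan over Arrow data buffers would use.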

> [JS] Improve performance of filtering
> -
>
> Key: ARROW-8053
> URL: https://issues.apache.org/jira/browse/ARROW-8053
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Reporter: Will Strimling
>Priority: Major
>
> A series of observable notebooks have shown quite convincingly that arrow 
> doesn't compete with other libraries or JavaScript when it comes to filtering 
> performance. Has there been any discussion or roadmaps established for 
> improving it?
> Most convincing Observables:
>  * 
> [https://observablehq.com/@duaneatat/apache-arrow-filtering-vs-array-filter]
>  * 
> [https://observablehq.com/@robertleeplummerjr/array-filtering-apache-arrow-vs-gpu-js-textures-vs-array-fil]





[jira] [Commented] (ARROW-7513) [JS] Arrow Tutorial: Common data types

2020-01-10 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013120#comment-17013120
 ] 

Paul Taylor commented on ARROW-7513:


[~lmeyerov] The Int64Vector.from and Uint64Vector.from methods require that you 
either pass the JS BigInt types or a second "is64bit" boolean argument: 
https://github.com/apache/arrow/blob/master/js/src/vector/int.ts#L63-L64. All 
the IntVectors share the same from implementation, IIRC because of a limitation 
in the TypeScript compiler that may no longer exist.
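The 64-bit constraint can be illustrated without Arrow at all (a sketch, not Arrow code): a JS number cannot represent every 64-bit integer, so the values are stored as two 32-bit limbs, and BigInt is what lets callers hand them over losslessly:

{code:javascript}
// Split a BigInt into little-endian 32-bit limbs, the layout a 64-bit
// vector's data buffer uses. Plain JS numbers above 2^53 would lose bits.
function toUint32Limbs(value) {
  const lo = Number(value & 0xFFFFFFFFn);
  const hi = Number((value >> 32n) & 0xFFFFFFFFn);
  return Uint32Array.of(lo, hi);
}

toUint32Limbs(4294967298n); // 2^32 + 2 -> Uint32Array [2, 1]
{code}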

> [JS] Arrow Tutorial: Common data types
> --
>
> Key: ARROW-7513
> URL: https://issues.apache.org/jira/browse/ARROW-7513
> Project: Apache Arrow
>  Issue Type: Task
>  Components: JavaScript
>Reporter: Leo Meyerovich
>Assignee: Leo Meyerovich
>Priority: Minor
>
> The JS client lacks basic introductory material around creating the common 
> basic data types such as turning JS arrays into ints, dicts, etc. There is no 
> equivalent of Python's [https://arrow.apache.org/docs/python/data.html] . 
> This has made use for myself difficult, and I bet for others.
>  
> As with prev tutorials, I started sketching on 
> [https://observablehq.com/@lmeyerov/rich-data-types-in-apache-arrow-js-efficient-data-tables-wit]
>   . When we're happy can make sense to export as an html or something to the 
> repo, or just link from the main readme.
> I believe the target topics worth covering are:
>  * Common user data types: Ints, Dicts, Struct, Time
>  * Common column types: Data, Vector, Column
>  * Going from individual & arrays & buffers of JS values to Arrow-wrapped 
> forms, and basic inspection of the result
> Not worth going into here is Tables vs. RecordBatches, which is the other 
> tutorial.
>  
> 1. Ideas of what to add/edit/remove?
> 2. And anyone up for helping with discussion of Data vs. Vector, and ingest 
> of Time & Struct?
> 3. ... Should we be encouraging Struct or Map? I saw some PRs changing stuff 
> here.
>  
> cc [~wesm] [~bhulette] [~paul.e.taylor]
>  
>  
>  





[jira] [Updated] (ARROW-6886) [C++] arrow::io header nvcc compiler warnings

2019-10-16 Thread Paul Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-6886:
---
Fix Version/s: 0.15.1

> [C++] arrow::io header nvcc compiler warnings
> -
>
> Key: ARROW-6886
> URL: https://issues.apache.org/jira/browse/ARROW-6886
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.15.0
>Reporter: Paul Taylor
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 0.15.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Seeing the following compiler warnings statically linking the arrow::io 
> headers with nvcc:
> {noformat}
> arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MemoryMappedFile"
> arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MockOutputStream"
> arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::FixedSizeBufferWriter"
> arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MemoryMappedFile"
> arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MockOutputStream"
> arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::FixedSizeBufferWriter"
> arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MemoryMappedFile"
> arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MockOutputStream"
> arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::FixedSizeBufferWriter"
> {noformat}





[jira] [Commented] (ARROW-6886) [C++] arrow::io header nvcc compiler warnings

2019-10-16 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953183#comment-16953183
 ] 

Paul Taylor commented on ARROW-6886:


[~apitrou] Yeah, this warning is benign, but the team is moving toward a 
zero-tolerance policy for compilation warnings. I'm looking into a fix now, and 
hopefully I can have a PR ready in time for 0.15.1.

These headers are included by some of our cuda files, so here's the full 
command as generated by cmake:

{noformat}
/usr/local/cuda-10.0/bin/nvcc 
 -DARROW_METADATA_V4
 -Dcudf_EXPORTS
 -Igoogletest/install/include
 -Iinclude
 -I../include
 -I../src
 -I../thirdparty/cub
 -I../thirdparty/jitify
 -I../thirdparty/libcudacxx/include
 -Iarrow/install/include
 -Iarrow/build/flatbuffers_ep-prefix/src/flatbuffers_ep-install/include
 -I/home/ptaylor/dev/rapids/compose/etc/conda/envs/rapids/include  
 -gencode arch=compute_75,code=sm_75
 -gencode arch=compute_75,code=compute_75
 --expt-extended-lambda
 --expt-relaxed-constexpr
 -Werror cross-execution-space-call
 -Xcompiler
 -Wall,-Werror
 --define-macro HT_LEGACY_ALLOCATOR
 -O3
 -DNDEBUG
 -Xcompiler=-fPIC  
 -DJITIFY_USE_CACHE
 -DCUDF_VERSION=0.11.0
 -std=c++14
 -x cu
 -x cuda
 -c /home/ptaylor/dev/rapids/cudf/cpp/src/io/avro/avro_reader_impl.cu
 -o CMakeFiles/cudf.dir/src/io/avro/avro_reader_impl.cu.o && 
/usr/local/cuda-10.0/bin/nvcc 
 -DARROW_METADATA_V4
 -Dcudf_EXPORTS
 -Igoogletest/install/include
 -Iinclude
 -I../include
 -I../src
 -I../thirdparty/cub
 -I../thirdparty/jitify
 -I../thirdparty/libcudacxx/include
 -Iarrow/install/include
 -Iarrow/build/flatbuffers_ep-prefix/src/flatbuffers_ep-install/include
 -I/home/ptaylor/dev/rapids/compose/etc/conda/envs/rapids/include  
 -gencode arch=compute_75,code=sm_75
 -gencode arch=compute_75,code=compute_75
 --expt-extended-lambda
 --expt-relaxed-constexpr
 -Werror cross-execution-space-call
 -Xcompiler
 -Wall,-Werror
 --define-macro HT_LEGACY_ALLOCATOR
 -O3
 -DNDEBUG
 -Xcompiler=-fPIC  
 -DJITIFY_USE_CACHE
 -DCUDF_VERSION=0.11.0
 -std=c++14
 -x cu
 -x cuda
 -M /home/ptaylor/dev/rapids/cudf/cpp/src/io/avro/avro_reader_impl.cu
 -MT CMakeFiles/cudf.dir/src/io/avro/avro_reader_impl.cu.o
 -o $DEP_FILE
{noformat}


> [C++] arrow::io header nvcc compiler warnings
> -
>
> Key: ARROW-6886
> URL: https://issues.apache.org/jira/browse/ARROW-6886
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.15.0
>Reporter: Paul Taylor
>Priority: Trivial
>
> Seeing the following compiler warnings statically linking the arrow::io 
> headers with nvcc:
> {noformat}
> arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MemoryMappedFile"
> arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MockOutputStream"
> arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::FixedSizeBufferWriter"
> arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MemoryMappedFile"
> arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MockOutputStream"
> arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::FixedSizeBufferWriter"
> arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MemoryMappedFile"
> arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::MockOutputStream"
> arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
> function "arrow::io::Writable::Write" is only partially overridden in class 
> "arrow::io::FixedSizeBufferWriter"
> {noformat}





[jira] [Created] (ARROW-6886) [C++] arrow::io header nvcc compiler warnings

2019-10-14 Thread Paul Taylor (Jira)
Paul Taylor created ARROW-6886:
--

 Summary: [C++] arrow::io header nvcc compiler warnings
 Key: ARROW-6886
 URL: https://issues.apache.org/jira/browse/ARROW-6886
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++
Affects Versions: 0.15.0
Reporter: Paul Taylor


Seeing the following compiler warnings statically linking the arrow::io headers 
with nvcc:

{noformat}
arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::MemoryMappedFile"

arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::MockOutputStream"

arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::FixedSizeBufferWriter"

arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::MemoryMappedFile"

arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::MockOutputStream"

arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::FixedSizeBufferWriter"

arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::MemoryMappedFile"

arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::MockOutputStream"

arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::FixedSizeBufferWriter"
{noformat}






[jira] [Commented] (ARROW-6759) [JS] Run less comprehensive every-commit build, relegate multi-target builds perhaps to nightlies

2019-10-02 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943014#comment-16943014
 ] 

Paul Taylor commented on ARROW-6759:


Yeah, no sweat. We can change the `ci/travis_script_js.sh` build and test 
commands to only test the UMD builds. Historically these have had the most 
issues since they're minified, so if they pass, everything should pass:

{code:bash}
npm run build -- -m umd -t es5 -t es2015 -t esnext
npm test -- -m umd -t es5 -t es2015 -t esnext
{code}


> [JS] Run less comprehensive every-commit build, relegate multi-target builds 
> perhaps to nightlies
> -
>
> Key: ARROW-6759
> URL: https://issues.apache.org/jira/browse/ARROW-6759
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 1.0.0
>
>
> The JavaScript CI build is taking 25-30 minutes nowadays. This could be 
> abbreviated by testing fewer deployment targets. We obviously still need to 
> test all the deployment targets but we could do that nightly instead of on 
> every commit





[jira] [Comment Edited] (ARROW-6575) [JS] decimal toString does not support negative values

2019-09-24 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934909#comment-16934909
 ] 

Paul Taylor edited comment on ARROW-6575 at 9/25/19 2:24 AM:
-

[~zad] Yeah I couldn't figure out how to propagate the sign bit through the 
decimal conversion. I'd be happy to review a PR if you know the right way to do 
it.


was (Author: paul.e.taylor):
Yeah, I couldn't figure out how to propagate the sign bit through the decimal 
conversion. I'd be happy to review a PR if you know the right way to do it.

> [JS] decimal toString does not support negative values
> --
>
> Key: ARROW-6575
> URL: https://issues.apache.org/jira/browse/ARROW-6575
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
>Reporter: Andong Zhan
>Priority: Critical
>
> The main description is here: [https://github.com/apache/arrow/issues/5397]
> Also, I have a simple test case (slightly changed generate-test-data.js and 
> generated-data-validators):
> {code:java}
> export const decimal = (length = 2, nullCount = length * 0.2 | 0, scale = 0, 
> precision = 38) => vectorGenerator.visit(new Decimal(scale, precision), 
> length, nullCount);
> function fillDecimal(length: number) {
> // const BPE = Uint32Array.BYTES_PER_ELEMENT; // 4
> const array = new Uint32Array(length);
> // const max = (2 ** (8 * BPE)) - 1;
> // for (let i = -1; ++i < length; array[i] = rand() * max * (rand() > 0.5 
> ? -1 : 1));
> array[0] = 0;
> array[1] = 1286889712;
> array[2] = 2218195178;
> array[3] = 4282345521;
> array[4] = 0;
> array[5] = 16004768;
> array[6] = 3587851993;
> array[7] = 126217744;
> return array;
> }
> {code}
> and the expected value should be
> {code:java}
> expect(vector.get(0).toString()).toBe('-1');
> expect(vector.get(1).toString()).toBe('1');
> {code}
> However, the actual first value is 339282366920938463463374607431768211456 
> which is wrong! The second value is correct by the way.
> I believe the bug is in the function called 
> function decimalToString>(a: T) because it cannot 
> return a negative value at all.
> [arrow/js/src/util/bn.ts|https://github.com/apache/arrow/blob/d54425de19b7dbb2764a40355d76d1c785cf64ec/js/src/util/bn.ts#L99]
> Line 99 





[jira] [Commented] (ARROW-6641) Remove Deprecated WriteableFile warning

2019-09-20 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934910#comment-16934910
 ] 

Paul Taylor commented on ARROW-6641:


I think this was addressed by 
[e41ad0d2|https://github.com/apache/arrow/commit/e41ad0d2ccaf96812d902b161d8a0b2b372f1b72]
 which should make it into the 0.15 release.

> Remove Deprecated WriteableFile warning
> ---
>
> Key: ARROW-6641
> URL: https://issues.apache.org/jira/browse/ARROW-6641
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.14.0, 0.14.1
>Reporter: Karthikeyan Natarajan
>Priority: Major
>  Labels: newbie
>
> Current version is 0.14.1. As per comment, deprecated `WriteableFile` should 
> be removed. 
>  
> {code:java}
> // TODO(kszucs): remove this after 0.13
> #ifndef _MSC_VER
> using WriteableFile ARROW_DEPRECATED("Use WritableFile") = WritableFile;
> using ReadableFileInterface ARROW_DEPRECATED("Use RandomAccessFile") = 
> RandomAccessFile;
> #else
> // MSVC does not like using ARROW_DEPRECATED with using declarations
> using WriteableFile = WritableFile;
> using ReadableFileInterface = RandomAccessFile;
> #endif
> {code}
>  
>  





[jira] [Closed] (ARROW-6370) [JS] Table.from adds 0 on int columns

2019-09-20 Thread Paul Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor closed ARROW-6370.
--
Resolution: Not A Bug

> [JS] Table.from adds 0 on int columns
> -
>
> Key: ARROW-6370
> URL: https://issues.apache.org/jira/browse/ARROW-6370
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
>Reporter: Sascha Hofmann
>Priority: Major
>
> I am generating an arrow table in pyarrow and send it via gRPC like this:
> {code:java}
> sink = pa.BufferOutputStream()
> writer = pa.RecordBatchStreamWriter(sink, batch.schema)
> writer.write_batch(batch)
> writer.close()
> yield ds.Response(
> status=200,
> loading=False,
> response=[sink.getvalue().to_pybytes()]   
> )
> {code}
> On the javascript end, I parse it like that:
> {code:java}
>  Table.from(response.getResponseList()[0])
> {code}
> That works but when I look at the actual table, int columns have a 0 for 
> every other row. String columns seem to be parsed just fine. 
> The Python byte array created from to_pybytes() has the same length as 
> received in javascript. I am also able to recreate the original table for the 
> byte array in Python. 





[jira] [Commented] (ARROW-6575) [JS] decimal toString does not support negative values

2019-09-20 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934909#comment-16934909
 ] 

Paul Taylor commented on ARROW-6575:


Yeah, I couldn't figure out how to propagate the sign bit through the decimal 
conversion. I'd be happy to review a PR if you know the right way to do it.
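For anyone picking this up, one way to propagate the sign (a sketch, assuming the decimal's four little-endian Uint32 limbs and BigInt availability; not the current bn.ts code) is to reassemble the limbs and subtract 2^128 when the high bit is set:

{code:javascript}
// Reassemble four little-endian Uint32 limbs into a signed 128-bit BigInt.
function decimalToSignedBigInt(limbs) {
  let value = 0n;
  for (let i = limbs.length - 1; i >= 0; --i) {
    value = (value << 32n) | BigInt(limbs[i]);
  }
  // High bit set means the value is negative in two's complement.
  return (value & (1n << 127n)) ? value - (1n << 128n) : value;
}

// All-ones limbs are -1 in 128-bit two's complement
decimalToSignedBigInt(Uint32Array.of(0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF)).toString(); // '-1'
{code}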

> [JS] decimal toString does not support negative values
> --
>
> Key: ARROW-6575
> URL: https://issues.apache.org/jira/browse/ARROW-6575
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
>Reporter: Andong Zhan
>Priority: Critical
>
> The main description is here: [https://github.com/apache/arrow/issues/5397]
> Also, I have a simple test case (slightly changed generate-test-data.js and 
> generated-data-validators):
> {code:java}
> export const decimal = (length = 2, nullCount = length * 0.2 | 0, scale = 0, 
> precision = 38) => vectorGenerator.visit(new Decimal(scale, precision), 
> length, nullCount);
> function fillDecimal(length: number) {
> // const BPE = Uint32Array.BYTES_PER_ELEMENT; // 4
> const array = new Uint32Array(length);
> // const max = (2 ** (8 * BPE)) - 1;
> // for (let i = -1; ++i < length; array[i] = rand() * max * (rand() > 0.5 
> ? -1 : 1));
> array[0] = 0;
> array[1] = 1286889712;
> array[2] = 2218195178;
> array[3] = 4282345521;
> array[4] = 0;
> array[5] = 16004768;
> array[6] = 3587851993;
> array[7] = 126217744;
> return array;
> }
> {code}
> and the expected value should be
> {code:java}
> expect(vector.get(0).toString()).toBe('-1');
> expect(vector.get(1).toString()).toBe('1');
> {code}
> However, the actual first value is 339282366920938463463374607431768211456 
> which is wrong! The second value is correct by the way.
> I believe the bug is in the function called 
> function decimalToString>(a: T) because it cannot 
> return a negative value at all.
> [arrow/js/src/util/bn.ts|https://github.com/apache/arrow/blob/d54425de19b7dbb2764a40355d76d1c785cf64ec/js/src/util/bn.ts#L99]
> Line 99 





[jira] [Commented] (ARROW-6574) [JS] TypeError with utf8 and JSONVectorLoader.readData

2019-09-20 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934903#comment-16934903
 ] 

Paul Taylor commented on ARROW-6574:


[~akre54] This is the JSON IPC format, which is only suitable for integration 
tests between the different Arrow implementations.

You can use the Vector 
[Builders|https://github.com/apache/arrow/blob/b2785d38a110c8fd8a3d7c957cd78d8911607a5e/js/src/builder.ts#L54]
 to encode arbitrary JS objects into Arrow Vectors and Tables.

The raw Builder APIs allow you to control every aspect of the chunking and 
flushing behavior, but as a consequence are relatively low-level. There are 
higher-level APIs for transforming values from iterables, async iterables, node 
streams, or DOM streams. You can see examples of usage [in the tests 
here|https://github.com/apache/arrow/blob/b2785d38a110c8fd8a3d7c957cd78d8911607a5e/js/test/unit/builders/builder-tests.ts#L261],
 or see [this example|https://github.com/trxcllnt/csv-to-arrow-js] converting a 
CSV row stream to Arrow.

Lastly if your values are already in memory, you can call `Vector.from()` with 
an Arrow type and an iterable (or async-iterable) of JS values, and it'll use 
the Builders to return a Vector of the specified type:


{code:javascript}

// create from a list of numbers or a Float32Array (zero-copy) -- all values 
will be valid
const f32 = Float32Vector.from([1.1, 2.5, 3.7]);

// or a different style, handy if inferring the types at runtime
// values in the `nullValues` array will be treated as NULL, and written in the 
validity bitmap
const f32 = Vector.from({
  nullValues: [-1, NaN],
  type: new Arrow.Float32(),
  values: [1.1, -1, 2.5, 3.7, NaN],
});
// ^ result: [1.1, null, 2.5, 3.7, null]

// or with values from an AsyncIterator
const f32 = await Vector.from({
  type: new Arrow.Float32(),
  values: (async function*() { yield* [1.1, 2.5, 3.7]; }())
});

{code}



> [JS] TypeError with utf8 and JSONVectorLoader.readData
> --
>
> Key: ARROW-6574
> URL: https://issues.apache.org/jira/browse/ARROW-6574
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
> Environment: node v10.16.0, OSX 10.14.5
>Reporter: Adam M Krebs
>Priority: Major
>
> Minimal repro:
>  
> {code:javascript}
> const fields = [
>   {
> name: 'first_name',
> type: {name: 'utf8'},
> nullable: false,
> children: [],
>   },
> ];
> Table.from({
>   schema: {fields},
>   batches: [{
> count: 1,
> columns: [{
>   name: 'first_name',
>   count: 1,
>   VALIDITY: [],
>   DATA: ['Fred']
> }]
>   }]
> });{code}
>  
> Output:
> {code:java}
> /[snip]/node_modules/apache-arrow/visitor/vectorloader.js:92
> readData(type, { offset } = this.nextBufferRange()) {
>  ^TypeError: Cannot destructure property `offset` of 
> 'undefined' or 'null'.
> at JSONVectorLoader.readData 
> (/[snip]/node_modules/apache-arrow/visitor/vectorloader.js:92:38)
> at JSONVectorLoader.visitUtf8 
> (/[snip]/node_modules/apache-arrow/visitor/vectorloader.js:46:188)
> at JSONVectorLoader.visit 
> (/[snip]/node_modules/apache-arrow/visitor.js:28:48)
> at JSONVectorLoader.visit 
> (/[snip]/node_modules/apache-arrow/visitor/vectorloader.js:40:22)
> at nodes.map (/[snip]/node_modules/apache-arrow/visitor.js:25:44)
> at Array.map ()
> at JSONVectorLoader.visitMany 
> (/[snip]/node_modules/apache-arrow/visitor.js:25:22)
> at RecordBatchJSONReaderImpl._loadVectors 
> (/[snip]/node_modules/apache-arrow/ipc/reader.js:523:107)
> at RecordBatchJSONReaderImpl._loadRecordBatch 
> (/[snip]/node_modules/apache-arrow/ipc/reader.js:209:79)
> at RecordBatchJSONReaderImpl.next 
> (/[snip]/node_modules/apache-arrow/ipc/reader.js:280:42){code}
>  
>  
> Looks like the `nextBufferRange` call is returning `undefined`, due to an 
> out-of-bounds `buffersIndex`.
>  
> Happy to provide more info if needed. Seems to only affect utf8 types and 
> nothing else.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (ARROW-6370) [JS] Table.from adds 0 on int columns

2019-09-12 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928851#comment-16928851
 ] 

Paul Taylor edited comment on ARROW-6370 at 9/12/19 7:53 PM:
-

[~saschahofmann]

bq. From my understanding of Arrow it should be a platform-independent format, 
meaning that if I am sending an arrow table from Python to JS it should turn 
out the same, right?

Yes, and that's what's happening here. But you're sending 8-byte integers to a 
platform which has historically only supported 4-byte integers, which is why 
you see each 8-byte integer as a pair of 4-byte integers.

I recommend reading [this post|https://v8.dev/features/bigint] on BigInts in 
the v8 blog.

BigInts (and their related typed arrays) are relatively new additions to JS, 
and aren't supported in all engines yet.

We have done our best to support getting and setting BigInt values when running 
in VMs that do support them, but for now we still have to support platforms 
without BigInt. That's why the values Array for Int64Vector is a stride-2 
Int32Array instead of a BigInt64Array.
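To illustrate the pair layout described above, here is a minimal sketch (`int64FromParts` is a hypothetical helper for illustration, not part of the Arrow API) that recombines a little-endian (lo, hi) pair of 32-bit halves into a single BigInt:

{code:javascript}
// Sketch: recombine a little-endian (lo, hi) pair of 32-bit halves into one
// BigInt, mirroring the stride-2 Int32Array layout described above.
function int64FromParts(lo, hi) {
  // `lo >>> 0` reinterprets the low half as an unsigned 32-bit lane before
  // OR-ing it in under the sign-carrying high half.
  return (BigInt(hi) << 32n) | BigInt(lo >>> 0);
}

console.log(int64FromParts(1286889712, 0)); // high half is 0: plain 1286889712n
console.log(int64FromParts(-1, -1));        // twos-complement pair for -1n
{code}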


was (Author: paul.e.taylor):
[~saschahofmann]

bq. From my understanding of Arrow it should be a platform-independent format, 
meaning that if I am sending an arrow table from Python to JS it should turn 
out the same, right?

Yes, and that's what's happening here. But you're sending 8-byte integers to a 
platform which has historically only supported 4-byte integers, which is why 
you see each 8-byte integer as a pair of 4-byte integers.

I recommend reading [this post|https://v8.dev/features/bigint] on BigInts in 
the v8 blog.

BigInts (and their related typed arrays) are relatively new additions to JS, 
and aren't supported in all engines yet.

We have done our best to support getting and setting BigInt values when running 
in VM that do support them, but for now we still have to support platforms 
without BigInt. That's why the values Array for Int64Vector is a stride-2 
Int32Array instead of a BigInt64Array.

> [JS] Table.from adds 0 on int columns
> -
>
> Key: ARROW-6370
> URL: https://issues.apache.org/jira/browse/ARROW-6370
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
>Reporter: Sascha Hofmann
>Priority: Major
>
> I am generating an arrow table in pyarrow and send it via gRPC like this:
> {code:java}
> sink = pa.BufferOutputStream()
> writer = pa.RecordBatchStreamWriter(sink, batch.schema)
> writer.write_batch(batch)
> writer.close()
> yield ds.Response(
> status=200,
> loading=False,
> response=[sink.getvalue().to_pybytes()]   
> )
> {code}
> On the javascript end, I parse it like that:
> {code:java}
>  Table.from(response.getResponseList()[0])
> {code}
> That works but when I look at the actual table, int columns have a 0 for 
> every other row. String columns seem to be parsed just fine. 
> The Python byte array created from to_pybytes() has the same length as 
> received in javascript. I am also able to recreate the original table for the 
> byte array in Python. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (ARROW-6370) [JS] Table.from adds 0 on int columns

2019-09-12 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928851#comment-16928851
 ] 

Paul Taylor edited comment on ARROW-6370 at 9/12/19 7:52 PM:
-

[~saschahofmann]

bq. From my understanding of Arrow it should be a platform-independent format, 
meaning that if I am sending an arrow table from Python to JS it should turn 
out the same, right?

Yes, and that's what's happening here. But you're sending 8-byte integers to a 
platform which has historically only supported 4-byte integers, which is why 
you see each 8-byte integer as a pair of 4-byte integers.

I recommend reading [this post|https://v8.dev/features/bigint] on BigInts in 
the v8 blog.

BigInts (and their related typed arrays) are relatively new additions to JS, 
and aren't supported in all engines yet.

We have done our best to support getting and setting BigInt values when running 
in VMs that do support them, but for now we still have to support platforms 
without BigInt. That's why the values Array for Int64Vector is a stride-2 
Int32Array instead of a BigInt64Array.


was (Author: paul.e.taylor):
[~saschahofmann]

bq. From my understanding of Arrow it should be a platform-independent format, 
meaning that if I am sending an arrow table from Python to JS it should turn 
out the same, right?

Yes, and that's what's happening here. But you're sending 8-byte integers to a 
platform which has historically only supported 4-byte integers, which is why 
you see each 8-byte integer as a pair of 4-byte integers.

I recommend reading [this post|https://v8.dev/features/bigint] on BigInts in 
the v8 blog.

BigInts (and their related typed arrays) are relatively new additions to JS, 
and aren't supported in all engines yet.

We have done our best to support geting and setting BigInt values when running 
in VM that supports them, but for now we still have to support platforms 
without BigInt. That's why the values Array for Int64Vector is a stride-2 
Int32Array instead of a BigInt64Array.

> [JS] Table.from adds 0 on int columns
> -
>
> Key: ARROW-6370
> URL: https://issues.apache.org/jira/browse/ARROW-6370
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
>Reporter: Sascha Hofmann
>Priority: Major
>
> I am generating an arrow table in pyarrow and send it via gRPC like this:
> {code:java}
> sink = pa.BufferOutputStream()
> writer = pa.RecordBatchStreamWriter(sink, batch.schema)
> writer.write_batch(batch)
> writer.close()
> yield ds.Response(
> status=200,
> loading=False,
> response=[sink.getvalue().to_pybytes()]   
> )
> {code}
> On the javascript end, I parse it like that:
> {code:java}
>  Table.from(response.getResponseList()[0])
> {code}
> That works but when I look at the actual table, int columns have a 0 for 
> every other row. String columns seem to be parsed just fine. 
> The Python byte array created from to_pybytes() has the same length as 
> received in javascript. I am also able to recreate the original table for the 
> byte array in Python. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (ARROW-6370) [JS] Table.from adds 0 on int columns

2019-09-12 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928851#comment-16928851
 ] 

Paul Taylor commented on ARROW-6370:


[~saschahofmann]

bq. From my understanding of Arrow it should be a platform-independent format, 
meaning that if I am sending an arrow table from Python to JS it should turn 
out the same, right?

Yes, and that's what's happening here. But you're sending 8-byte integers to a 
platform which has historically only supported 4-byte integers, which is why 
you see each 8-byte integer as a pair of 4-byte integers.

I recommend reading [this post|https://v8.dev/features/bigint] on BigInts in 
the v8 blog.

BigInts (and their related typed arrays) are relatively new additions to JS, 
and aren't supported in all engines yet.

We have done our best to support getting and setting BigInt values when running 
in VMs that support them, but for now we still have to support platforms 
without BigInt. That's why the values Array for Int64Vector is a stride-2 
Int32Array instead of a BigInt64Array.

> [JS] Table.from adds 0 on int columns
> -
>
> Key: ARROW-6370
> URL: https://issues.apache.org/jira/browse/ARROW-6370
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
>Reporter: Sascha Hofmann
>Priority: Major
>
> I am generating an arrow table in pyarrow and send it via gRPC like this:
> {code:java}
> sink = pa.BufferOutputStream()
> writer = pa.RecordBatchStreamWriter(sink, batch.schema)
> writer.write_batch(batch)
> writer.close()
> yield ds.Response(
> status=200,
> loading=False,
> response=[sink.getvalue().to_pybytes()]   
> )
> {code}
> On the javascript end, I parse it like that:
> {code:java}
>  Table.from(response.getResponseList()[0])
> {code}
> That works but when I look at the actual table, int columns have a 0 for 
> every other row. String columns seem to be parsed just fine. 
> The Python byte array created from to_pybytes() has the same length as 
> received in javascript. I am also able to recreate the original table for the 
> byte array in Python. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (ARROW-6370) [JS] Table.from adds 0 on int columns

2019-09-09 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925928#comment-16925928
 ] 

Paul Taylor commented on ARROW-6370:


[~saschahofmann] I closed this because this is working as intended.

64-bit little-endian numbers are represented as pairs of lo, hi twos-complement 
32-bit integers. If your values are less than 32-bits, the high bits will be 
zero. We're not inserting zeros, the zeros are part of the data Python is 
sending to JavaScript.

The Int64Vector and Uint64Vector support implicitly casting either to a normal 
JS 64-bit float as (with 53-bits of precision) if you can afford to lose 
precision, or to JS's new 
[BigInt|https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt]
 type if you need the full 64-bits of precision and are on a platform that 
supports BigInt (v8 and the newest versions of FF).

{code:javascript}
const { Int64, Vector } = require('apache-arrow');
let i64s = Vector.from({ type: new Int64(), values: [123n, 456n, 789n] });
for (let x of i64s) {
  console.log(x); // will be an Int32Array of two numbers: lo, hi
  console.log(0 + x); // casts to a 53-bit integer, i.e. regular JS float64
  console.log(0n + x); // casts to a BigInt, i.e. JS's new 64-bit integer
}
{code}


> [JS] Table.from adds 0 on int columns
> -
>
> Key: ARROW-6370
> URL: https://issues.apache.org/jira/browse/ARROW-6370
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
>Reporter: Sascha Hofmann
>Priority: Major
>
> I am generating an arrow table in pyarrow and send it via gRPC like this:
> {code:java}
> sink = pa.BufferOutputStream()
> writer = pa.RecordBatchStreamWriter(sink, batch.schema)
> writer.write_batch(batch)
> writer.close()
> yield ds.Response(
> status=200,
> loading=False,
> response=[sink.getvalue().to_pybytes()]   
> )
> {code}
> On the javascript end, I parse it like that:
> {code:java}
>  Table.from(response.getResponseList()[0])
> {code}
> That works but when I look at the actual table, int columns have a 0 for 
> every other row. String columns seem to be parsed just fine. 
> The Python byte array created from to_pybytes() has the same length as 
> received in javascript. I am also able to recreate the original table for the 
> byte array in Python. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (ARROW-2786) [JS] Read Parquet files in JavaScript

2019-09-04 Thread Paul Taylor (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922744#comment-16922744
 ] 

Paul Taylor commented on ARROW-2786:


There are a few JS Parquet implementations, with 
[parquetjs|https://www.npmjs.com/package/parquetjs] being the more mature one 
from what I recall.

A while back I put together [this 
demo|https://github.com/trxcllnt/arrow-to-parquet-js] converting Arrow -> 
Parquet in pure JS. The major drawback is the ParquetJS writer is row-oriented, 
so performance will be an issue.

I opened [this issue|https://github.com/ironSource/parquetjs/issues/84] to get 
some clarification, but haven't heard back yet.

> [JS] Read Parquet files in JavaScript
> -
>
> Key: ARROW-2786
> URL: https://issues.apache.org/jira/browse/ARROW-2786
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Reporter: Wes McKinney
>Priority: Major
>  Labels: parquet
>
> See question in https://github.com/apache/arrow/issues/2209



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Closed] (ARROW-6370) [JS] Table.from adds 0 on int columns

2019-09-04 Thread Paul Taylor (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor closed ARROW-6370.
--
Resolution: Not A Bug

> [JS] Table.from adds 0 on int columns
> -
>
> Key: ARROW-6370
> URL: https://issues.apache.org/jira/browse/ARROW-6370
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.14.1
>Reporter: Sascha Hofmann
>Priority: Major
>
> I am generating an arrow table in pyarrow and send it via gRPC like this:
> {code:java}
> sink = pa.BufferOutputStream()
> writer = pa.RecordBatchStreamWriter(sink, batch.schema)
> writer.write_batch(batch)
> writer.close()
> yield ds.Response(
> status=200,
> loading=False,
> response=[sink.getvalue().to_pybytes()]   
> )
> {code}
> On the javascript end, I parse it like that:
> {code:java}
>  Table.from(response.getResponseList()[0])
> {code}
> That works but when I look at the actual table, int columns have a 0 for 
> every other row. String columns seem to be parsed just fine. 
> The Python byte array created from to_pybytes() has the same length as 
> received in javascript. I am also able to recreate the original table for the 
> byte array in Python. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (ARROW-5741) [JS] Make numeric vector from functions consistent with TypedArray.from

2019-08-14 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-5741.

   Resolution: Fixed
Fix Version/s: (was: 1.0.0)
   0.15.0

Issue resolved by pull request 4746
[https://github.com/apache/arrow/pull/4746]

> [JS] Make numeric vector from functions consistent with TypedArray.from
> ---
>
> Key: ARROW-5741
> URL: https://issues.apache.org/jira/browse/ARROW-5741
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Brian Hulette
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Described in 
> https://lists.apache.org/thread.html/b648a781cba7f10d5a6072ff2e7dab6c03e2d1f12e359d9261891486@%3Cdev.arrow.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (ARROW-6053) [Python] RecordBatchStreamReader::Open2 cdef type signature doesn't match C++

2019-07-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-6053:
--

 Summary: [Python] RecordBatchStreamReader::Open2 cdef type 
signature doesn't match C++
 Key: ARROW-6053
 URL: https://issues.apache.org/jira/browse/ARROW-6053
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Python
Affects Versions: 0.14.1
Reporter: Paul Taylor
Assignee: Paul Taylor


The Cython method signature for RecordBatchStreamReader::Open2 doesn't match 
the C++ type signature and causes a compiler type error trying to call Open2 
from Cython.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (ARROW-5762) [Integration][JS] Integration Tests for Map Type

2019-07-17 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor reassigned ARROW-5762:
--

Assignee: Paul Taylor

> [Integration][JS] Integration Tests for Map Type
> 
>
> Key: ARROW-5762
> URL: https://issues.apache.org/jira/browse/ARROW-5762
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Integration, JavaScript
>Reporter: Bryan Cutler
>Assignee: Paul Taylor
>Priority: Major
> Fix For: 1.0.0
>
>
> ARROW-1279 enabled integration tests for MapType between Java and C++, but 
> JavaScript had to be disabled for the map case due to an error.  Once this is 
> fixed, {{generate_map_case}} could be moved under {{generate_nested_case}} 
> with the other nested types.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (ARROW-5762) [Integration][JS] Integration Tests for Map Type

2019-07-17 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-5762:
---
Fix Version/s: 1.0.0

> [Integration][JS] Integration Tests for Map Type
> 
>
> Key: ARROW-5762
> URL: https://issues.apache.org/jira/browse/ARROW-5762
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Integration, JavaScript
>Reporter: Bryan Cutler
>Priority: Major
> Fix For: 1.0.0
>
>
> ARROW-1279 enabled integration tests for MapType between Java and C++, but 
> JavaScript had to be disabled for the map case due to an error.  Once this is 
> fixed, {{generate_map_case}} could be moved under {{generate_nested_case}} 
> with the other nested types.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5762) [Integration][JS] Integration Tests for Map Type

2019-07-17 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887544#comment-16887544
 ] 

Paul Taylor commented on ARROW-5762:


After reviewing the C++, the JS version of the Map type is not the same (it's 
essentially a Struct, except that child fields are accessed by name instead of 
by field index). We should absolutely update the JS Map implementation before 
the 1.0 release.


> [Integration][JS] Integration Tests for Map Type
> 
>
> Key: ARROW-5762
> URL: https://issues.apache.org/jira/browse/ARROW-5762
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Integration, JavaScript
>Reporter: Bryan Cutler
>Priority: Major
>
> ARROW-1279 enabled integration tests for MapType between Java and C++, but 
> JavaScript had to be disabled for the map case due to an error.  Once this is 
> fixed, {{generate_map_case}} could be moved under {{generate_nested_case}} 
> with the other nested types.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (ARROW-5763) [JS] enable integration tests for MapVector

2019-07-17 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887479#comment-16887479
 ] 

Paul Taylor commented on ARROW-5763:


After reviewing the C++, the JS version of the Map type is not the same (it's 
essentially a Struct, except that child fields are accessed by name instead of 
by field index). We should absolutely update the JS Map implementation before 
the 1.0 release.

> [JS] enable integration tests for MapVector
> ---
>
> Key: ARROW-5763
> URL: https://issues.apache.org/jira/browse/ARROW-5763
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Reporter: Benjamin Kietzman
>Priority: Minor
>
> As of 0.14, C++ and Java support Map arrays and those implementations pass 
> integration tests. JS has a MapVector and some unit tests for it, but it 
> should be tested against other implementations as well



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (ARROW-5532) [JS] Field Metadata Not Read

2019-06-21 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-5532.

Resolution: Fixed

Issue resolved by pull request 4476
https://github.com/apache/arrow/pull/4476

> [JS] Field Metadata Not Read
> 
>
> Key: ARROW-5532
> URL: https://issues.apache.org/jira/browse/ARROW-5532
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.13.0
> Environment: Mac OSX 10.14, Chrome 74
>Reporter: Trey Hakanson
>Assignee: Paul Taylor
>Priority: Major
>  Labels: Javas
> Fix For: 0.14.0
>
>
> Field metadata is not read when using {{@apache-arrow/ts@0.13.0}}. Example 
> below also uses {{pyarrow==0.13.0}}
> Steps to reproduce:
> Adding metadata:
> {code:title=toarrow.py|borderStyle=solid}
> import pyarrow as pa
> import pandas as pd
> source = "sample.csv"
> output = "sample.arrow"
> df = pd.read_csv(source)
> table = pa.Table.from_pandas(df)
> schema = pa.schema([
>  column.field.add_metadata({"foo": "bar"})
>  for column
>  in table.columns
> ])
> writer = pa.RecordBatchFileWriter(output, schema)
> writer.write(table)
> writer.close()
> {code}
> Reading field metadata using {{pyarrow}}:
> {code:title=readarrow.py|borderStyle=solid}
> source = "sample.arrow"
> field = "foo"
> reader = pa.RecordBatchFileReader(source)
> reader.schema.field_by_name(field).metadata # Correctly shows `{"foo": "bar"}`
> {code}
> Reading field metadata using {{@apache-arrow/ts}}:
> {code:title=toarrow.ts|borderStyle=solid}
> import { Table, Field, Type } from "@apache-arrow/ts";
> const url = "https://example.com/sample.arrow";
> const buf = await fetch(url).then(res => res.arrayBuffer());
> const table = Table.from([new Uint8Array(buf)]);
> for (let field of table.schema.fields) {
>  field.metadata; // Incorrectly shows an empty map
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-5537) [JS] Support delta dictionaries in RecordBatchWriter and DictionaryBuilder

2019-06-09 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-5537:
--

 Summary: [JS] Support delta dictionaries in RecordBatchWriter and 
DictionaryBuilder
 Key: ARROW-5537
 URL: https://issues.apache.org/jira/browse/ARROW-5537
 Project: Apache Arrow
  Issue Type: New Feature
Affects Versions: 0.13.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.14.0


The new JS DictionaryBuilder and RecordBatchWriter should support building and 
writing delta dictionary batches, to enable creating DictionaryVectors while 
streaming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-5239) Add support for interval types in javascript

2019-05-22 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16846108#comment-16846108
 ] 

Paul Taylor commented on ARROW-5239:


We have the Interval year_month and day_time types in JS, but I'm not sure if 
this issue is about a new kind of Interval DataType. [~emkornfi...@gmail.com], 
any thoughts?

> Add support for interval types in javascript
> 
>
> Key: ARROW-5239
> URL: https://issues.apache.org/jira/browse/ARROW-5239
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Reporter: Micah Kornfield
>Priority: Major
>
> Update integration_test.py to include interval tests for JSTest once this is 
> done.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-5396) [JS] Ensure reader and writer support files and streams with no RecordBatches

2019-05-22 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-5396:
--

 Summary: [JS] Ensure reader and writer support files and streams 
with no RecordBatches
 Key: ARROW-5396
 URL: https://issues.apache.org/jira/browse/ARROW-5396
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.13.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.14.0


Re: https://issues.apache.org/jira/browse/ARROW-2119 and 
[https://github.com/apache/arrow/pull/3871], the JS reader and writer should 
support files and streams with a Schema but no RecordBatches.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ARROW-5100) [JS] Writer swaps byte order if buffers share the same underlying ArrayBuffer

2019-04-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-5100.

Resolution: Fixed

Issue resolved by pull request 4102
[https://github.com/apache/arrow/pull/4102]

> [JS] Writer swaps byte order if buffers share the same underlying ArrayBuffer
> -
>
> Key: ARROW-5100
> URL: https://issues.apache.org/jira/browse/ARROW-5100
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.13.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We collapse contiguous Uint8Arrays that share the same underlying ArrayBuffer 
> and have overlapping byte ranges. This was done to maintain true zero-copy 
> behavior when using certain node core streams that use a buffer pool 
> internally, and could write chunks of the same logical Arrow Message at 
> out-of-order byte offsets in the pool.
> Unfortunately this can also lead to a bug where, in rare cases, buffers are 
> swapped while writing Arrow Messages too. We could have a flag to indicate 
> whether we think collapsing out-of-order same-buffer chunks is safe, but I'm 
> not sure if we can always know that, so I'd prefer to take it out and incur 
> the copy cost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-5115) [JS] Implement the Vector Builders

2019-04-03 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-5115:
--

 Summary: [JS] Implement the Vector Builders
 Key: ARROW-5115
 URL: https://issues.apache.org/jira/browse/ARROW-5115
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.13.0
Reporter: Paul Taylor
Assignee: Paul Taylor


We should implement the streaming Vector Builders in JS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-5100) [JS] Writer swaps byte order if buffers share the same underlying ArrayBuffer

2019-04-02 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-5100:
--

 Summary: [JS] Writer swaps byte order if buffers share the same 
underlying ArrayBuffer
 Key: ARROW-5100
 URL: https://issues.apache.org/jira/browse/ARROW-5100
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: 0.13.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.14.0


We collapse contiguous Uint8Arrays that share the same underlying ArrayBuffer 
and have overlapping byte ranges. This was done to maintain true zero-copy 
behavior when using certain node core streams that use a buffer pool 
internally, and could write chunks of the same logical Arrow Message at 
out-of-order byte offsets in the pool.

Unfortunately this can also lead to a bug where, in rare cases, buffers are 
swapped while writing Arrow Messages too. We could have a flag to indicate 
whether we think collapsing out-of-order same-buffer chunks is safe, but I'm 
not sure if we can always know that, so I'd prefer to take it out and incur the 
copy cost.
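
The overlap condition described above can be sketched as follows 
(`sameBufferOverlap` is a hypothetical helper for illustration, not the actual 
writer code):

{code:javascript}
// Sketch (hypothetical helper, not the Arrow writer's implementation): check
// whether two Uint8Array chunks are views over the same ArrayBuffer with
// overlapping byte ranges -- the pooled-stream situation described above.
function sameBufferOverlap(a, b) {
  if (a.buffer !== b.buffer) return false;
  const aEnd = a.byteOffset + a.byteLength;
  const bEnd = b.byteOffset + b.byteLength;
  return a.byteOffset < bEnd && b.byteOffset < aEnd;
}

const pool = new ArrayBuffer(16);
const x = new Uint8Array(pool, 0, 8);
const y = new Uint8Array(pool, 4, 8);  // overlaps x in bytes 4..7
const z = new Uint8Array(pool, 8, 8);  // adjacent to x, no overlap
console.log(sameBufferOverlap(x, y)); // true
console.log(sameBufferOverlap(x, z)); // false
{code}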



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4976) [JS] RecordBatchReader should reset its Node/DOM streams

2019-03-20 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4976:
--

 Summary: [JS] RecordBatchReader should reset its Node/DOM streams
 Key: ARROW-4976
 URL: https://issues.apache.org/jira/browse/ARROW-4976
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


RecordBatchReaders should reset their internal platform streams when they are 
reset, so they can be piped to separate output streams afterwards.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4780) [JS] Package sourcemap files, update default package JS version

2019-03-05 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4780:
---
Affects Version/s: JS-0.4.0

> [JS] Package sourcemap files, update default package JS version
> ---
>
> Key: ARROW-4780
> URL: https://issues.apache.org/jira/browse/ARROW-4780
> Project: Apache Arrow
>  Issue Type: Improvement
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Minor
>
> The build should split out the sourcemaps to speed up client builds, include 
> a "module" entry in the package.json for @pika/web, and ship the latest 
> ESNext JS versions in the main package.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4780) [JS] Package sourcemap files, update default package JS version

2019-03-05 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4780:
--

 Summary: [JS] Package sourcemap files, update default package JS 
version
 Key: ARROW-4780
 URL: https://issues.apache.org/jira/browse/ARROW-4780
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Paul Taylor
Assignee: Paul Taylor


The build should split the sourcemaps out to speed up client builds and 
include a "module" entry in the package.json for @pika/web; the main package 
should ship the latest ESNext JS versions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4781) [JS] Ensure empty data initializes empty typed arrays

2019-03-05 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4781:
--

 Summary: [JS] Ensure empty data initializes empty typed arrays
 Key: ARROW-4781
 URL: https://issues.apache.org/jira/browse/ARROW-4781
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


Empty ArrayData instances should initialize with the appropriate 0-length 
buffers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4780) [JS] Package sourcemap files, update default package JS version

2019-03-05 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4780:
---
Component/s: JavaScript

> [JS] Package sourcemap files, update default package JS version
> ---
>
> Key: ARROW-4780
> URL: https://issues.apache.org/jira/browse/ARROW-4780
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Minor
> Fix For: JS-0.4.1
>
>
> The build should split the sourcemaps out to speed up client builds and 
> include a "module" entry in the package.json for @pika/web; the main 
> package should ship the latest ESNext JS versions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4780) [JS] Package sourcemap files, update default package JS version

2019-03-05 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4780:
---
Fix Version/s: JS-0.4.1

> [JS] Package sourcemap files, update default package JS version
> ---
>
> Key: ARROW-4780
> URL: https://issues.apache.org/jira/browse/ARROW-4780
> Project: Apache Arrow
>  Issue Type: Improvement
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Minor
> Fix For: JS-0.4.1
>
>
> The build should split the sourcemaps out to speed up client builds and 
> include a "module" entry in the package.json for @pika/web; the main 
> package should ship the latest ESNext JS versions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ARROW-3667) [JS] Incorrectly reads record batches with an all null column

2019-03-02 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-3667.

   Resolution: Fixed
 Assignee: Paul Taylor
Fix Version/s: JS-0.4.1

Issue resolved by pull request 3787
[https://github.com/apache/arrow/pull/3787]

> [JS] Incorrectly reads record batches with an all null column
> -
>
> Key: ARROW-3667
> URL: https://issues.apache.org/jira/browse/ARROW-3667
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: JS-0.3.1
>Reporter: Brian Hulette
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.5.0, JS-0.4.1
>
>
> The JS library seems to incorrectly read any columns that come after an 
> all-null column in IPC buffers produced by pyarrow.
> Here's a python script that generates two arrow buffers, one with an all-null 
> column followed by a utf-8 column, and a second with those two reversed
> {code:python}
> import pyarrow as pa
> import pandas as pd
> def serialize_to_arrow(df, fd, compress=True):
>   batch = pa.RecordBatch.from_pandas(df)
>   writer = pa.RecordBatchFileWriter(fd, batch.schema)
>   writer.write_batch(batch)
>   writer.close()
> if __name__ == "__main__":
>     df = pd.DataFrame(data={'nulls': [None, None, None],
>                             'not nulls': ['abc', 'def', 'ghi']},
>                       columns=['nulls', 'not nulls'])
>     with open('bad.arrow', 'wb') as fd:
>         serialize_to_arrow(df, fd)
>     df = pd.DataFrame(df, columns=['not nulls', 'nulls'])
>     with open('good.arrow', 'wb') as fd:
>         serialize_to_arrow(df, fd)
> {code}
> JS incorrectly interprets the [null, not null] case:
> {code:javascript}
> > var arrow = require('apache-arrow')
> undefined
> > var fs = require('fs')
> undefined
> arrow.Table.from(fs.readFileSync('good.arrow')).getColumn('not nulls').get(0)
> 'abc'
> > arrow.Table.from(fs.readFileSync('bad.arrow')).getColumn('not nulls').get(0)
> '\u0000\u0000\u0000\u0000\u0003\u0000\u0000\u0000\u0006\u0000\u0000\u0000\t\u0000\u0000\u0000'
> {code}
> Presumably this is because pyarrow is omitting some (or all) of the buffers 
> associated with the all-null column, but the JS IPC reader is still looking 
> for them, causing the buffer count to get out of sync.
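The symptom can be reproduced in isolation (a mock of the misread, not Arrow's actual IPC reader): once the buffer count slips, the utf-8 column's little-endian int32 offsets buffer (0, 3, 6, 9 for 'abc', 'def', 'ghi') gets decoded as if it were the character data, which yields exactly the runs of NUL, \u0003, \u0006 and \t shown above.

```javascript
// Mock of the misread, not Arrow's reader: decode a utf-8 column's
// little-endian int32 *offsets* buffer as if it were the data bytes.
const offsets = Int32Array.from([0, 3, 6, 9]); // offsets of 'abc','def','ghi'
const misread = String.fromCharCode(...new Uint8Array(offsets.buffer));
// misread is 16 chars: mostly NULs, with \u0003, \u0006 and \t mixed in.
```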



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ARROW-4728) [JS] Failing test Table#assign with a zero-length Null column round-trips through serialization

2019-03-02 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-4728.

   Resolution: Fixed
Fix Version/s: JS-0.4.1

Issue resolved by pull request 3789
[https://github.com/apache/arrow/pull/3789]

> [JS] Failing test Table#assign with a zero-length Null column round-trips 
> through serialization
> ---
>
> Key: ARROW-4728
> URL: https://issues.apache.org/jira/browse/ARROW-4728
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.12.1
>Reporter: Francois Saint-Jacques
>Assignee: Paul Taylor
>Priority: Major
>  Labels: ci-failure, pull-request-available, travis-ci
> Fix For: 0.13.0, JS-0.4.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> See https://travis-ci.org/apache/arrow/jobs/500383002#L1022
> {code:javascript}
>   ● Table#serialize() › Table#assign with an empty table round-trips through 
> serialization
> expect(received).toBe(expected) // Object.is equality
> Expected: 86
> Received: 41
>   91 | const source = table1.assign(Table.empty());
>   92 | expect(source.numCols).toBe(table1.numCols);
> > 93 | expect(source.length).toBe(table1.length);
>  |   ^
>   94 | const result = Table.from(source.serialize());
>   95 | expect(result).toEqualTable(source);
>   96 | expect(result.schema.metadata.get('foo')).toEqual('bar');
>   at Object.test (test/unit/table/serialize-tests.ts:93:35)
>   ● Table#serialize() › Table#assign with a zero-length Null column 
> round-trips through serialization
> expect(received).toBe(expected) // Object.is equality
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (ARROW-3667) [JS] Incorrectly reads record batches with an all null column

2019-03-02 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-3667:
---
Comment: was deleted

(was: Issue resolved by pull request 3787
[https://github.com/apache/arrow/pull/3787])

> [JS] Incorrectly reads record batches with an all null column
> -
>
> Key: ARROW-3667
> URL: https://issues.apache.org/jira/browse/ARROW-3667
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: JS-0.3.1
>Reporter: Brian Hulette
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.5.0, JS-0.4.1
>
>
> The JS library seems to incorrectly read any columns that come after an 
> all-null column in IPC buffers produced by pyarrow.
> Here's a python script that generates two arrow buffers, one with an all-null 
> column followed by a utf-8 column, and a second with those two reversed
> {code:python}
> import pyarrow as pa
> import pandas as pd
> def serialize_to_arrow(df, fd, compress=True):
>   batch = pa.RecordBatch.from_pandas(df)
>   writer = pa.RecordBatchFileWriter(fd, batch.schema)
>   writer.write_batch(batch)
>   writer.close()
> if __name__ == "__main__":
>     df = pd.DataFrame(data={'nulls': [None, None, None],
>                             'not nulls': ['abc', 'def', 'ghi']},
>                       columns=['nulls', 'not nulls'])
>     with open('bad.arrow', 'wb') as fd:
>         serialize_to_arrow(df, fd)
>     df = pd.DataFrame(df, columns=['not nulls', 'nulls'])
>     with open('good.arrow', 'wb') as fd:
>         serialize_to_arrow(df, fd)
> {code}
> JS incorrectly interprets the [null, not null] case:
> {code:javascript}
> > var arrow = require('apache-arrow')
> undefined
> > var fs = require('fs')
> undefined
> arrow.Table.from(fs.readFileSync('good.arrow')).getColumn('not nulls').get(0)
> 'abc'
> > arrow.Table.from(fs.readFileSync('bad.arrow')).getColumn('not nulls').get(0)
> '\u0000\u0000\u0000\u0000\u0003\u0000\u0000\u0000\u0006\u0000\u0000\u0000\t\u0000\u0000\u0000'
> {code}
> Presumably this is because pyarrow is omitting some (or all) of the buffers 
> associated with the all-null column, but the JS IPC reader is still looking 
> for them, causing the buffer count to get out of sync.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-3667) [JS] Incorrectly reads record batches with an all null column

2019-03-02 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782516#comment-16782516
 ] 

Paul Taylor commented on ARROW-3667:


Issue resolved by pull request 3787
[https://github.com/apache/arrow/pull/3787]

> [JS] Incorrectly reads record batches with an all null column
> -
>
> Key: ARROW-3667
> URL: https://issues.apache.org/jira/browse/ARROW-3667
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: JS-0.3.1
>Reporter: Brian Hulette
>Priority: Major
> Fix For: JS-0.5.0
>
>
> The JS library seems to incorrectly read any columns that come after an 
> all-null column in IPC buffers produced by pyarrow.
> Here's a python script that generates two arrow buffers, one with an all-null 
> column followed by a utf-8 column, and a second with those two reversed
> {code:python}
> import pyarrow as pa
> import pandas as pd
> def serialize_to_arrow(df, fd, compress=True):
>   batch = pa.RecordBatch.from_pandas(df)
>   writer = pa.RecordBatchFileWriter(fd, batch.schema)
>   writer.write_batch(batch)
>   writer.close()
> if __name__ == "__main__":
>     df = pd.DataFrame(data={'nulls': [None, None, None],
>                             'not nulls': ['abc', 'def', 'ghi']},
>                       columns=['nulls', 'not nulls'])
>     with open('bad.arrow', 'wb') as fd:
>         serialize_to_arrow(df, fd)
>     df = pd.DataFrame(df, columns=['not nulls', 'nulls'])
>     with open('good.arrow', 'wb') as fd:
>         serialize_to_arrow(df, fd)
> {code}
> JS incorrectly interprets the [null, not null] case:
> {code:javascript}
> > var arrow = require('apache-arrow')
> undefined
> > var fs = require('fs')
> undefined
> arrow.Table.from(fs.readFileSync('good.arrow')).getColumn('not nulls').get(0)
> 'abc'
> > arrow.Table.from(fs.readFileSync('bad.arrow')).getColumn('not nulls').get(0)
> '\u0000\u0000\u0000\u0000\u0003\u0000\u0000\u0000\u0006\u0000\u0000\u0000\t\u0000\u0000\u0000'
> {code}
> Presumably this is because pyarrow is omitting some (or all) of the buffers 
> associated with the all-null column, but the JS IPC reader is still looking 
> for them, causing the buffer count to get out of sync.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ARROW-4738) [JS] NullVector should include a null data buffer

2019-03-02 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-4738.

Resolution: Fixed

Issue resolved by pull request 3787
[https://github.com/apache/arrow/pull/3787]

> [JS] NullVector should include a null data buffer
> -
>
> Key: ARROW-4738
> URL: https://issues.apache.org/jira/browse/ARROW-4738
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
>  Labels: pull-request-available
> Fix For: JS-0.4.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Arrow C++ and pyarrow expect NullVectors to include a null data buffer, so 
> ArrowJS should write one into the buffer layout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-4738) [JS] NullVector should include a null data buffer

2019-03-01 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782221#comment-16782221
 ] 

Paul Taylor commented on ARROW-4738:


[~bhulette] a PR is up here: https://github.com/apache/arrow/pull/3787

> [JS] NullVector should include a null data buffer
> -
>
> Key: ARROW-4738
> URL: https://issues.apache.org/jira/browse/ARROW-4738
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.4.1
>
>
> Arrow C++ and pyarrow expect NullVectors to include a null data buffer, so 
> ArrowJS should write one into the buffer layout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-4738) [JS] NullVector should include a null data buffer

2019-03-01 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782218#comment-16782218
 ] 

Paul Taylor commented on ARROW-4738:


[~bhulette] yeah, I think so

> [JS] NullVector should include a null data buffer
> -
>
> Key: ARROW-4738
> URL: https://issues.apache.org/jira/browse/ARROW-4738
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.4.1
>
>
> Arrow C++ and pyarrow expect NullVectors to include a null data buffer, so 
> ArrowJS should write one into the buffer layout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-4728) [JS] Failing test Table#assign with a zero-length Null column round-trips through serialization

2019-03-01 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782203#comment-16782203
 ] 

Paul Taylor commented on ARROW-4728:


Thanks [~fsaintjacques], I submitted https://github.com/apache/arrow/pull/3789 
with a fix

> [JS] Failing test Table#assign with a zero-length Null column round-trips 
> through serialization
> ---
>
> Key: ARROW-4728
> URL: https://issues.apache.org/jira/browse/ARROW-4728
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.12.1
>Reporter: Francois Saint-Jacques
>Assignee: Paul Taylor
>Priority: Major
>  Labels: ci-failure, travis-ci
> Fix For: 0.13.0
>
>
> See https://travis-ci.org/apache/arrow/jobs/500414242#L1002
> {code:javascript}
>   ● Table#serialize() › Table#assign with an empty table round-trips through 
> serialization
> expect(received).toBe(expected) // Object.is equality
> Expected: 86
> Received: 41
>   91 | const source = table1.assign(Table.empty());
>   92 | expect(source.numCols).toBe(table1.numCols);
> > 93 | expect(source.length).toBe(table1.length);
>  |   ^
>   94 | const result = Table.from(source.serialize());
>   95 | expect(result).toEqualTable(source);
>   96 | expect(result.schema.metadata.get('foo')).toEqual('bar');
>   at Object.test (test/unit/table/serialize-tests.ts:93:35)
>   ● Table#serialize() › Table#assign with a zero-length Null column 
> round-trips through serialization
> expect(received).toBe(expected) // Object.is equality
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4738) [JS] NullVector should include a null data buffer

2019-03-01 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4738:
--

 Summary: [JS] NullVector should include a null data buffer
 Key: ARROW-4738
 URL: https://issues.apache.org/jira/browse/ARROW-4738
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


Arrow C++ and pyarrow expect NullVectors to include a null data buffer, so 
ArrowJS should write one into the buffer layout.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (ARROW-4728) [JS] Failing test Table#assign with a zero-length Null column round-trips through serialization

2019-03-01 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor reassigned ARROW-4728:
--

Assignee: Paul Taylor

> [JS] Failing test Table#assign with a zero-length Null column round-trips 
> through serialization
> ---
>
> Key: ARROW-4728
> URL: https://issues.apache.org/jira/browse/ARROW-4728
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Affects Versions: 0.12.1
>Reporter: Francois Saint-Jacques
>Assignee: Paul Taylor
>Priority: Major
>  Labels: ci-failure, travis-ci
> Fix For: 0.13.0
>
>
> See https://travis-ci.org/apache/arrow/jobs/500414242#L1002
> {code:javascript}
>   ● Table#serialize() › Table#assign with an empty table round-trips through 
> serialization
> expect(received).toBe(expected) // Object.is equality
> Expected: 86
> Received: 41
>   91 | const source = table1.assign(Table.empty());
>   92 | expect(source.numCols).toBe(table1.numCols);
> > 93 | expect(source.length).toBe(table1.length);
>  |   ^
>   94 | const result = Table.from(source.serialize());
>   95 | expect(result).toEqualTable(source);
>   96 | expect(result.schema.metadata.get('foo')).toEqual('bar');
>   at Object.test (test/unit/table/serialize-tests.ts:93:35)
>   ● Table#serialize() › Table#assign with a zero-length Null column 
> round-trips through serialization
> expect(received).toBe(expected) // Object.is equality
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4682) [JS] Writer should be able to write empty tables

2019-02-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4682:
--

 Summary: [JS] Writer should be able to write empty tables
 Key: ARROW-4682
 URL: https://issues.apache.org/jira/browse/ARROW-4682
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


The writer should be able to write empty tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4674) [JS] Update arrow2csv to new Row API

2019-02-25 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4674:
--

 Summary: [JS] Update arrow2csv to new Row API
 Key: ARROW-4674
 URL: https://issues.apache.org/jira/browse/ARROW-4674
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


The {{arrow2csv}} utility uses {{row.length}} to measure cells, but now that 
we've made Rows use Symbols for their internal properties, it should enumerate 
the values with the iterator.
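The change can be sketched with a stand-in Row (not the Arrow JS class): once the named properties live behind Symbols, iterating the row visits only the cells, so cell widths can be measured without {{row.length}}.

```javascript
// Stand-in for an Arrow JS 0.4.x Row: internal properties live behind
// Symbols, and cell values are only reachable through the iterator.
const kParent = Symbol('parent');
const row = {
  [kParent]: null, // internal bookkeeping, not a cell
  *[Symbol.iterator]() { yield 1; yield 'abc'; yield null; },
};
// arrow2csv-style width measurement: enumerate cells via the iterator,
// so the Symbol-keyed internals are never mistaken for data.
const widths = [...row].map((cell) => String(cell).length);
```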



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ARROW-4524) [JS] Improve Row proxy generation performance

2019-02-25 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-4524.

   Resolution: Fixed
Fix Version/s: JS-0.4.1

> [JS] Improve Row proxy generation performance
> -
>
> Key: ARROW-4524
> URL: https://issues.apache.org/jira/browse/ARROW-4524
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
> Fix For: JS-0.4.1
>
>
> See 
> https://github.com/vega/vega-loader-arrow/commit/19c88e130aaeeae9d0166360db467121e5724352#r32253784



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4652) [JS] RecordBatchReader throughNode should respect autoDestroy

2019-02-21 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4652:
--

 Summary: [JS] RecordBatchReader throughNode should respect 
autoDestroy
 Key: ARROW-4652
 URL: https://issues.apache.org/jira/browse/ARROW-4652
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


The Reader transform stream closes after reading one set of tables even when 
autoDestroy is false. Instead it should reset/reopen the reader, like 
{{RecordBatchReader.readAll()}} does.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4580) [JS] Accept Iterables in IntVector/FloatVector from() signatures

2019-02-14 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4580:
--

 Summary: [JS] Accept Iterables in IntVector/FloatVector from() 
signatures
 Key: ARROW-4580
 URL: https://issues.apache.org/jira/browse/ARROW-4580
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


Right now {{IntVector.from()}} and {{FloatVector.from()}} expect the data to 
already be in typed-array form. But if we know the desired Vector type 
beforehand (e.g. if {{Int32Vector.from()}} is called), we can accept any JS 
iterable of the values.

In order to do this, we should ensure {{Float16Vector.from()}} properly clamps 
incoming f32/f64 values to u16s, in case the source is a vanilla 64-bit JS 
float.
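The clamping step can be sketched as a standalone conversion. This is a deliberately simplified sketch, not the library's implementation: it truncates instead of rounding to nearest and ignores subnormals and NaN payloads.

```javascript
// Simplified sketch: clamp a 64-bit JS float to a u16 half-float bit
// pattern. Truncation instead of round-to-nearest; subnormals and NaN
// payloads are not handled -- enough to show the idea, not production code.
function toFloat16Bits(value) {
  const f32 = new Float32Array(1);
  const u32 = new Uint32Array(f32.buffer);
  f32[0] = value;                              // narrow f64 -> f32 first
  const x = u32[0];
  const sign = (x >>> 16) & 0x8000;
  const exp = ((x >>> 23) & 0xff) - 127 + 15;  // rebias f32 exponent to f16
  const frac = (x >>> 13) & 0x3ff;             // keep top 10 mantissa bits
  if (exp >= 31) return sign | 0x7c00;         // overflow/Infinity -> Inf
  if (exp <= 0) return sign;                   // underflow -> signed zero
  return sign | (exp << 10) | frac;
}
```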



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4579) [JS] Add more interop with BigInt/BigInt64Array/BigUint64Array

2019-02-14 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4579:
--

 Summary: [JS] Add more interop with 
BigInt/BigInt64Array/BigUint64Array
 Key: ARROW-4579
 URL: https://issues.apache.org/jira/browse/ARROW-4579
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


We should use or return the new native [BigInt 
types|https://developers.google.com/web/updates/2018/05/bigint] whenever it's 
available.

* Use the native {{BigInt}} to convert/stringify i64s/u64s
* Support the {{BigInt}} type in element comparator and {{indexOf()}}
* Add zero-copy {{toBigInt64Array()}} and {{toBigUint64Array()}} methods to 
{{Int64Vector}} and {{Uint64Vector}}, respectively
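The zero-copy part of the last bullet can be sketched without Arrow at all (hypothetical {{toBigInt64Array()}} semantics): construct a BigInt64Array view over the column's existing ArrayBuffer instead of copying element by element.

```javascript
// Sketch of the zero-copy idea behind a toBigInt64Array() method: view the
// ArrayBuffer that already backs the 64-bit column; no bytes are copied.
const backing = new ArrayBuffer(16);
const lo32 = new Uint32Array(backing);
lo32[0] = 42; // low word of the first i64 (little-endian platforms)
lo32[2] = 7;  // low word of the second i64
const asBigInts = new BigInt64Array(backing); // zero-copy view, length 2
```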




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4578) [JS] Float16Vector toArray should be zero-copy

2019-02-14 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4578:
--

 Summary: [JS] Float16Vector toArray should be zero-copy
 Key: ARROW-4578
 URL: https://issues.apache.org/jira/browse/ARROW-4578
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


The {{Float16Vector#toArray()}} implementation currently converts each half 
float into a single-precision float and returns a Float32Array. All the other 
{{toArray()}} implementations are zero-copy, and this deviation would break 
anyone expecting to pass two-byte half floats to native APIs like WebGL. We 
should instead add {{Float16Vector#toFloat32Array()}} and 
{{Float16Vector#toFloat64Array()}} convenience methods that do copy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4555) [JS] Add high-level Table and Column creation methods

2019-02-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4555:
---
Affects Version/s: (was: 0.4.0)
   JS-0.4.0

> [JS] Add high-level Table and Column creation methods
> -
>
> Key: ARROW-4555
> URL: https://issues.apache.org/jira/browse/ARROW-4555
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: 0.4.1
>
>
> It'd be great to have a few high-level functions that implicitly create the 
> Schema, RecordBatches, etc. from a Table and a list of Columns. For example:
> {code:javascript}
> const table = Table.new(
>   Column.new('foo', ...),
>   Column.new('bar', ...)
> );
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4554) [JS] Implement logic for combining Vectors with different lengths/chunksizes

2019-02-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4554:
---
Fix Version/s: (was: 0.4.1)
   JS-0.5.0

> [JS] Implement logic for combining Vectors with different lengths/chunksizes
> 
>
> Key: ARROW-4554
> URL: https://issues.apache.org/jira/browse/ARROW-4554
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.5.0
>
>
> We should add logic to combine and possibly slice/re-chunk and uniformly 
> partition chunks into separate RecordBatches. This will make it easier to 
> create Tables or RecordBatches from Vectors of different lengths. This is 
> also necessary for {{Table#assign()}}. PR incoming.
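The uniform-partitioning step can be sketched independently of Arrow (simplified: this only computes the merged chunk boundaries, the real logic would also slice each column's chunks at them):

```javascript
// Sketch: given each column's chunk lengths, compute the merged set of
// boundaries so every column can be sliced into aligned RecordBatches.
function uniformBoundaries(...chunkLengths) {
  const cuts = new Set();
  for (const lengths of chunkLengths) {
    let offset = 0;
    // Record the running end-offset of every chunk in this column.
    for (const len of lengths) cuts.add(offset += len);
  }
  return [...cuts].sort((a, b) => a - b);
}
```

Slicing each column at these boundaries yields chunks whose lengths agree across columns, which is what a RecordBatch requires.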



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4553) [JS] Implement Schema/Field/DataType comparators

2019-02-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4553:
---
Affects Version/s: (was: 0.4.0)
   JS-0.4.0

> [JS] Implement Schema/Field/DataType comparators
> 
>
> Key: ARROW-4553
> URL: https://issues.apache.org/jira/browse/ARROW-4553
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: 0.4.1
>
>
> Some basic type comparison logic is necessary for {{Table#assign()}}. PR 
> incoming.
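A minimal structural comparator of the kind {{Table#assign()}} needs might look like the following. The {{typeId}}/{{children}} object shape here is an assumption for illustration, not the library's exact fields.

```javascript
// Sketch of a structural DataType comparator: two types match when their
// type tag matches and their child fields match recursively by name and
// type. (typeId/children is an assumed shape, for illustration only.)
function typesEqual(a, b) {
  if (a.typeId !== b.typeId) return false;
  const ac = a.children || [];
  const bc = b.children || [];
  return ac.length === bc.length &&
    ac.every((f, i) => f.name === bc[i].name && typesEqual(f.type, bc[i].type));
}
```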



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4555) [JS] Add high-level Table and Column creation methods

2019-02-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4555:
---
Fix Version/s: (was: 0.4.1)
   JS-0.5.0

> [JS] Add high-level Table and Column creation methods
> -
>
> Key: ARROW-4555
> URL: https://issues.apache.org/jira/browse/ARROW-4555
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.5.0
>
>
> It'd be great to have a few high-level functions that implicitly create the 
> Schema, RecordBatches, etc. from a Table and a list of Columns. For example:
> {code:javascript}
> const table = Table.new(
>   Column.new('foo', ...),
>   Column.new('bar', ...)
> );
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4554) [JS] Implement logic for combining Vectors with different lengths/chunksizes

2019-02-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4554:
---
Affects Version/s: (was: 0.4.0)
   JS-0.4.0

> [JS] Implement logic for combining Vectors with different lengths/chunksizes
> 
>
> Key: ARROW-4554
> URL: https://issues.apache.org/jira/browse/ARROW-4554
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: 0.4.1
>
>
> We should add logic to combine and possibly slice/re-chunk and uniformly 
> partition chunks into separate RecordBatches. This will make it easier to 
> create Tables or RecordBatches from Vectors of different lengths. This is 
> also necessary for {{Table#assign()}}. PR incoming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4553) [JS] Implement Schema/Field/DataType comparators

2019-02-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4553:
---
Fix Version/s: (was: 0.4.1)
   JS-0.5.0

> [JS] Implement Schema/Field/DataType comparators
> 
>
> Key: ARROW-4553
> URL: https://issues.apache.org/jira/browse/ARROW-4553
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.5.0
>
>
> Some basic type comparison logic is necessary for {{Table#assign()}}. PR 
> incoming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4552) [JS] Table and Schema assign implementations

2019-02-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4552:
---
Fix Version/s: JS-0.5.0

> [JS] Table and Schema assign implementations
> 
>
> Key: ARROW-4552
> URL: https://issues.apache.org/jira/browse/ARROW-4552
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.5.0
>
>
> It'd be really handy to have basic {{assign}} methods on the Table and 
> Schema. I've extracted and cleaned up some internal helper methods that do 
> this. PR incoming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-4557) [JS] Add Table/Schema/RecordBatch `selectAt(...indices)` method

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4557:
--

 Summary: [JS] Add Table/Schema/RecordBatch `selectAt(...indices)` 
method
 Key: ARROW-4557
 URL: https://issues.apache.org/jira/browse/ARROW-4557
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.5.0


Presently Table, Schema, and RecordBatch have basic {{select(...colNames)}} 
implementations. Having an easy {{selectAt(...colIndices)}} impl would be a 
nice complement, especially when there are duplicate column names.
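The duplicate-name case is what makes selecting by position worthwhile; a minimal sketch over a plain array of fields (a hypothetical shape, not the Schema class):

```javascript
// Sketch of selectAt(...indices): pick fields by position, which stays
// unambiguous even when two columns share a name.
function selectAt(fields, ...indices) {
  return indices
    .map((i) => fields[i])
    .filter((f) => f !== undefined); // ignore out-of-range indices
}

const fields = [{ name: 'a' }, { name: 'b' }, { name: 'a' }]; // duplicate 'a'
```

With select('a') there is no way to ask for the second 'a' column; selectAt(2) names it directly.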



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-4552) [JS] Table and Schema assign implementations

2019-02-12 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4552:
---
Affects Version/s: (was: 0.4.0)
   JS-0.4.0

> [JS] Table and Schema assign implementations
> 
>
> Key: ARROW-4552
> URL: https://issues.apache.org/jira/browse/ARROW-4552
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Affects Versions: JS-0.4.0
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
>
> It'd be really handy to have basic {{assign}} methods on the Table and 
> Schema. I've extracted and cleaned up some internal helper methods I have 
> that do this. PR incoming.





[jira] [Created] (ARROW-4555) [JS] Add high-level Table and Column creation methods

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4555:
--

 Summary: [JS] Add high-level Table and Column creation methods
 Key: ARROW-4555
 URL: https://issues.apache.org/jira/browse/ARROW-4555
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.1


It'd be great to have a few high-level functions that implicitly create the 
Schema, RecordBatches, etc. when building a Table from a list of Columns. For example:
{code:javascript}
const table = Table.new(
  Column.new('foo', ...),
  Column.new('bar', ...)
);
{code}






[jira] [Created] (ARROW-4554) [JS] Implement logic for combining Vectors with different lengths/chunksizes

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4554:
--

 Summary: [JS] Implement logic for combining Vectors with different 
lengths/chunksizes
 Key: ARROW-4554
 URL: https://issues.apache.org/jira/browse/ARROW-4554
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.1


We should add logic to combine, and possibly slice/re-chunk, Vectors so their 
chunks partition uniformly into separate RecordBatches. This will make it 
easier to create Tables or RecordBatches from Vectors of different lengths, and 
is also necessary for {{Table#assign()}}. PR incoming.
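One way to picture the re-chunking step: take the union of every Vector's cumulative chunk offsets, and slice all Vectors at those boundaries so each resulting piece lines up as one RecordBatch. This is a hedged sketch of the idea, not Arrow's internal implementation:

```typescript
// Given the chunk lengths of several Vectors (all with the same total
// length), return the sorted union of their cumulative offsets. Slicing
// every Vector at these boundaries yields aligned RecordBatches.
function uniformBoundaries(...chunkLengths: number[][]): number[] {
  const cuts = new Set<number>();
  for (const lengths of chunkLengths) {
    let offset = 0;
    for (const len of lengths) { offset += len; cuts.add(offset); }
  }
  return Array.from(cuts).sort((a, b) => a - b);
}

// Lengths of each aligned batch are the gaps between boundaries.
function batchLengths(boundaries: number[]): number[] {
  return boundaries.map((end, i) => end - (i === 0 ? 0 : boundaries[i - 1]));
}

// Two columns of total length 10, chunked as [4, 6] and [5, 5]:
console.log(uniformBoundaries([4, 6], [5, 5]));  // [4, 5, 10]
console.log(batchLengths([4, 5, 10]));           // [4, 1, 5]
```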





[jira] [Created] (ARROW-4553) [JS] Implement Schema/Field/DataType comparators

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4553:
--

 Summary: [JS] Implement Schema/Field/DataType comparators
 Key: ARROW-4553
 URL: https://issues.apache.org/jira/browse/ARROW-4553
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.1


Some basic type comparison logic is necessary for {{Table#assign()}}. PR 
incoming.
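The kind of structural comparison {{assign}} needs can be sketched as a recursive equality check over field names and type tags. The shapes below are simplified stand-ins for Arrow's Schema/Field/DataType classes, not the actual API:

```typescript
// Illustrative shapes only -- real Arrow DataTypes carry more state
// (bit widths, units, dictionary ids, etc.).
interface DataType { typeId: string; children?: Field[]; }
interface Field { name: string; type: DataType; }

// Two types match when their tags match and their children match pairwise.
function typesEqual(a: DataType, b: DataType): boolean {
  if (a.typeId !== b.typeId) return false;
  const ac = a.children ?? [], bc = b.children ?? [];
  return ac.length === bc.length && ac.every((f, i) => fieldsEqual(f, bc[i]));
}

function fieldsEqual(a: Field, b: Field): boolean {
  return a.name === b.name && typesEqual(a.type, b.type);
}

const foo: Field = { name: 'foo', type: { typeId: 'Int32' } };
console.log(fieldsEqual(foo, { name: 'foo', type: { typeId: 'Int32' } })); // true
console.log(fieldsEqual(foo, { name: 'foo', type: { typeId: 'Utf8' } }));  // false
```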





[jira] [Created] (ARROW-4552) [JS] Table and Schema assign implementations

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4552:
--

 Summary: [JS] Table and Schema assign implementations
 Key: ARROW-4552
 URL: https://issues.apache.org/jira/browse/ARROW-4552
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor


It'd be really handy to have basic {{assign}} methods on the Table and 
Schema. I've extracted and cleaned up some internal helper methods I have that 
do this. PR incoming.
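The intended semantics are roughly those of {{Object.assign}} applied to columns keyed by name: same-named columns are replaced, new columns are appended. A hedged sketch with plain objects (not the actual Table implementation):

```typescript
// Hypothetical column shape standing in for an Arrow Column.
interface Column { name: string; values: unknown[]; }

// Columns in `other` replace same-named columns in `base`; columns that
// only exist in `other` are appended -- Object.assign for schemas.
function assign(base: Column[], other: Column[]): Column[] {
  const replaced = base.map((c) => other.find((o) => o.name === c.name) ?? c);
  const extras = other.filter((o) => !base.some((c) => c.name === o.name));
  return [...replaced, ...extras];
}

const merged = assign(
  [{ name: 'a', values: [1, 2] }, { name: 'b', values: [3, 4] }],
  [{ name: 'b', values: [5, 6] }, { name: 'c', values: [7, 8] }],
);
console.log(merged.map((c) => c.name));  // ['a', 'b', 'c']
console.log(merged[1].values);           // [5, 6] -- 'b' was replaced
```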





[jira] [Assigned] (ARROW-2116) [JS] Implement IPC writer

2019-02-07 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor reassigned ARROW-2116:
--

Assignee: Paul Taylor

> [JS] Implement IPC writer
> -
>
> Key: ARROW-2116
> URL: https://issues.apache.org/jira/browse/ARROW-2116
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Brian Hulette
>Assignee: Paul Taylor
>Priority: Major
>  Labels: pull-request-available
> Fix For: JS-0.4.0
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (ARROW-4477) [JS] Bn shouldn't override constructor of the resulting typed array

2019-02-04 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4477:
--

 Summary: [JS] Bn shouldn't override constructor of the resulting 
typed array
 Key: ARROW-4477
 URL: https://issues.apache.org/jira/browse/ARROW-4477
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.0


There's an undefined constructor property definition in the {{Object.assign()}} 
call for the BigNum mixins that's overriding the constructor of the returned 
TypedArrays. I think this was left over from the first iteration, where I used 
{{Object.create()}}. It should be removed.
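The mechanics of the bug can be demonstrated in isolation: {{Object.assign}} copies every enumerable own property, so a mixin carrying a {{constructor}} key shadows the TypedArray's inherited constructor with an own property. The mixin below is illustrative, not the actual BigNum mixin:

```typescript
// A mixin that (wrongly) carries a `constructor: undefined` entry.
const badMixin = { constructor: undefined, hi: () => 'hi' };
const bad = Object.assign(new Uint32Array(4), badMixin);
console.log(bad.constructor);  // undefined -- own property shadows Uint32Array

// Dropping the `constructor` key leaves the prototype's constructor intact.
const goodMixin = { hi: () => 'hi' };
const good = Object.assign(new Uint32Array(4), goodMixin);
console.log(good.constructor === Uint32Array);  // true
```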





[jira] [Created] (ARROW-4442) [JS] Overly broad type annotation for Chunked typeId leading to type mismatches in generated typing

2019-01-31 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4442:
--

 Summary: [JS] Overly broad type annotation for Chunked typeId 
leading to type mismatches in generated typing
 Key: ARROW-4442
 URL: https://issues.apache.org/jira/browse/ARROW-4442
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor


TypeScript is generating an overly broad type for the `typeId` property of the 
Chunked vector class, leading to a type mismatch and a failure to infer that a 
Chunked is assignable to a Vector of the same type:


{code:javascript}

let col: Vector<Utf8>;
col = new Chunked(new Utf8());
  ^
/*
Argument of type 'Chunked<Utf8>' is not assignable to parameter of type 
'Vector<Utf8>'.
  Type 'Chunked<Utf8>' is not assignable to type 'Vector<Utf8>'.
Types of property 'typeId' are incompatible.
  Type 'Type' is not assignable to type 'Type.Utf8'.
*/
{code}

The fix is to add an explicit return annotation to the Chunked typeId getter.
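A simplified illustration of that fix, with stand-ins for Arrow's {{Type}} enum and {{DataType}} interface (the class below is not the real Chunked implementation):

```typescript
// Simplified stand-ins for Arrow's Type enum and DataType interface.
enum Type { Utf8 = 1, Int32 = 2 }
interface DataType { TType: Type; }
interface Utf8 extends DataType { TType: Type.Utf8; }

class Chunked<T extends DataType> {
  constructor(private _type: T) {}
  // The explicit `T['TType']` return annotation is the fix: without it,
  // TypeScript widens the getter to the broad `Type` enum; with it, the
  // getter stays tied to the vector's generic parameter.
  public get typeId(): T['TType'] { return this._type.TType; }
}

const chunk = new Chunked<Utf8>({ TType: Type.Utf8 });
console.log(chunk.typeId === Type.Utf8);  // true
```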





[jira] [Created] (ARROW-4396) Update Typedoc to support TypeScript 3.2

2019-01-27 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4396:
--

 Summary: Update Typedoc to support TypeScript 3.2
 Key: ARROW-4396
 URL: https://issues.apache.org/jira/browse/ARROW-4396
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor


Update TypeDoc now that it supports TypeScript 3.2





[jira] [Created] (ARROW-4395) ts-node throws type error running `bin/arrow2csv.js`

2019-01-27 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4395:
--

 Summary: ts-node throws type error running `bin/arrow2csv.js`
 Key: ARROW-4395
 URL: https://issues.apache.org/jira/browse/ARROW-4395
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.0


ts-node is being too strict, throws this (inaccurate) error JIT'ing the TS 
source:

{code:none}
$ cat test/data/cpp/stream/simple.arrow | ./bin/arrow2csv.js 

/home/ptaylor/dev/arrow/js/node_modules/ts-node/src/index.ts:228
return new TSError(diagnosticText, diagnosticCodes)
   ^
TSError: ⨯ Unable to compile TypeScript:
src/vector/map.ts(25,57): error TS2345: Argument of type 'Field[]' is not assignable to parameter of type 'Field[]'.
  Type 'Field' is not assignable to type 
'Field'.
Type 'T[string] | T[number] | T[symbol]' is not assignable to type 'T[keyof 
T]'.
  Type 'T[symbol]' is not assignable to type 'T[keyof T]'.
Type 'DataType' is not assignable to type 'T[keyof T]'.
  Type 'symbol' is not assignable to type 'keyof T'.
Type 'symbol' is not assignable to type 'string | number'.
{code}






[jira] [Issue Comment Deleted] (ARROW-1496) [JS] Upload coverage data to codecov.io

2019-01-20 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-1496:
---
Comment: was deleted

(was: Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]
)

> [JS] Upload coverage data to codecov.io
> ---
>
> Key: ARROW-1496
> URL: https://issues.apache.org/jira/browse/ARROW-1496
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Wes McKinney
>Priority: Major
>






[jira] [Commented] (ARROW-1496) [JS] Upload coverage data to codecov.io

2019-01-20 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747576#comment-16747576
 ] 

Paul Taylor commented on ARROW-1496:


Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Upload coverage data to codecov.io
> ---
>
> Key: ARROW-1496
> URL: https://issues.apache.org/jira/browse/ARROW-1496
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Wes McKinney
>Priority: Major
>






[jira] [Resolved] (ARROW-1496) [JS] Upload coverage data to codecov.io

2019-01-20 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-1496.

Resolution: Fixed

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Upload coverage data to codecov.io
> ---
>
> Key: ARROW-1496
> URL: https://issues.apache.org/jira/browse/ARROW-1496
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Wes McKinney
>Priority: Major
>






[jira] [Commented] (ARROW-4283) Should RecordBatchStreamReader/Writer be AsyncIterable?

2019-01-19 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747321#comment-16747321
 ] 

Paul Taylor commented on ARROW-4283:


[~pitrou] Thanks for the feedback.

I want to clarify: my Python skills aren't sharp, I'm not familiar with the 
pyarrow API or Python's asyncio/async-iterable primitives, so filter my 
comments through the lens of a beginner.

The little experience I do have is using the RecordBatchStreamReader to read 
from stdin (via {{sys.stdin.buffer}}) and named file descriptors (via 
{{os.fdopen()}}). Since Python's so friendly (and I have no idea how the Python 
IO primitives work), I thought maybe I could pass aiohttp's {{Request.stream}} 
to the RecordBatchStreamReader constructor, and quickly learned that no, I 
can't ;).

In the JS implementation we have two main entry points for reading RecordBatch 
streams:
 # a static 
[{{RecordBatchReader.from(source)}}|https://github.com/apache/arrow/blob/cc1ce6194b905768b1a6d9f0e209270f62dc558a/js/src/ipc/reader.ts#L142],
 which accepts heterogeneous source types and returns a RecordBatchReader for 
the underlying Arrow type (file, stream, or JSON) and conforms to sync/async 
semantics of the source input type
 # methods that create [through/transform 
streams|https://github.com/apache/arrow/blob/cc1ce6194b905768b1a6d9f0e209270f62dc558a/js/bin/file-to-stream.js#L33]
 from the RecordBatchReader and RecordBatchWriter, for use with node's native 
stream primitives

Each link in the streaming pipeline is a sort of transform stream, and a 
significant amount of effort went into supporting all the different 
node/browser IO primitives, so I understand if that's too much to ask at this 
point.

As an alternative, would it be possible to add a method that accepts a Python 
byte stream, and returns a zero-copy AsyncIterable of RecordBatches? Or maybe 
add an example in the 
[python/ipc|https://arrow.apache.org/docs/python/ipc.html#writing-and-reading-streams]
 docs page of how to do that?
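The API shape being requested here can be sketched language-agnostically as an async generator that wraps a stream of byte chunks and yields one value per complete message as data arrives. The framing below (a 4-byte little-endian length prefix) is purely illustrative -- it is not the actual Arrow IPC format, and the function names are hypothetical:

```typescript
// Buffer incoming chunks and yield each complete length-prefixed message.
async function* batches(source: AsyncIterable<Uint8Array>) {
  let buf = new Uint8Array(0);
  for await (const chunk of source) {
    const next = new Uint8Array(buf.length + chunk.length);
    next.set(buf); next.set(chunk, buf.length);
    buf = next;
    // Yield every complete message currently buffered.
    while (buf.length >= 4) {
      const len = buf[0] | (buf[1] << 8) | (buf[2] << 16) | (buf[3] << 24);
      if (buf.length < 4 + len) break;
      yield buf.subarray(4, 4 + len);
      buf = buf.subarray(4 + len);
    }
  }
}

// Two messages (payload lengths 3 and 2) split across unaligned chunks:
async function demo() {
  const bytes = new Uint8Array([3, 0, 0, 0, 7, 8, 9, 2, 0, 0, 0, 5, 6]);
  async function* source() { yield bytes.subarray(0, 5); yield bytes.subarray(5); }
  const sizes: number[] = [];
  for await (const b of batches(source())) sizes.push(b.length);
  return sizes;  // [3, 2]
}
```

A reader consumes this with a plain `for await...of` loop, regardless of how the underlying transport delivers its chunks.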

> Should RecordBatchStreamReader/Writer be AsyncIterable?
> ---
>
> Key: ARROW-4283
> URL: https://issues.apache.org/jira/browse/ARROW-4283
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Paul Taylor
>Priority: Minor
> Fix For: 0.13.0
>
>
> Filing this issue after a discussion today with [~xhochy] about how to 
> implement streaming pyarrow http services. I had attempted to use both Flask 
> and [aiohttp|https://aiohttp.readthedocs.io/en/stable/streams.html]'s 
> streaming interfaces because they seemed familiar, but no dice. I have no 
> idea how hard this would be to add -- supporting all the asynciterable 
> primitives in JS was non-trivial.





[jira] [Updated] (ARROW-4283) Should RecordBatchStreamReader/Writer be AsyncIterable?

2019-01-17 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-4283:
---
Summary: Should RecordBatchStreamReader/Writer be AsyncIterable?  (was: 
Should RecordBatchStreamReader/Writer be AsyncIteraable?)

> Should RecordBatchStreamReader/Writer be AsyncIterable?
> ---
>
> Key: ARROW-4283
> URL: https://issues.apache.org/jira/browse/ARROW-4283
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Paul Taylor
>Priority: Minor
> Fix For: 0.13.0
>
>
> Filing this issue after a discussion today with [~xhochy] about how to 
> implement streaming pyarrow http services. I had attempted to use both Flask 
> and [aiohttp|https://aiohttp.readthedocs.io/en/stable/streams.html]'s 
> streaming interfaces because they seemed familiar, but no dice. I have no 
> idea how hard this would be to add -- supporting all the asynciterable 
> primitives in JS was non-trivial.





[jira] [Created] (ARROW-4283) Should RecordBatchStreamReader/Writer be AsyncIteraable?

2019-01-17 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4283:
--

 Summary: Should RecordBatchStreamReader/Writer be AsyncIteraable?
 Key: ARROW-4283
 URL: https://issues.apache.org/jira/browse/ARROW-4283
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Paul Taylor


Filing this issue after a discussion today with [~xhochy] about how to 
implement streaming pyarrow http services. I had attempted to use both Flask 
and [aiohttp|https://aiohttp.readthedocs.io/en/stable/streams.html]'s streaming 
interfaces because they seemed familiar, but no dice. I have no idea how hard 
this would be to add -- supporting all the asynciterable primitives in JS was 
non-trivial.





[jira] [Resolved] (ARROW-3689) [JS] Upgrade to TS 3.1

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-3689.

   Resolution: Fixed
 Assignee: Paul Taylor
Fix Version/s: (was: JS-0.5.0)
   JS-0.4.0

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Upgrade to TS 3.1
> --
>
> Key: ARROW-3689
> URL: https://issues.apache.org/jira/browse/ARROW-3689
> Project: Apache Arrow
>  Issue Type: Task
>  Components: JavaScript
>Reporter: Brian Hulette
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.4.0
>
>
> Attempted 
> [here|https://github.com/apache/arrow/pull/2611#issuecomment-431318129], but 
> ran into issues.
> Should upgrade typedoc to 0.13 at the same time.





[jira] [Commented] (ARROW-2839) [JS] Support whatwg/streams in IPC reader/writer

2019-01-13 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741633#comment-16741633
 ] 

Paul Taylor commented on ARROW-2839:


Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Support whatwg/streams in IPC reader/writer
> 
>
> Key: ARROW-2839
> URL: https://issues.apache.org/jira/browse/ARROW-2839
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Affects Versions: JS-0.3.1
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.5.0
>
>
> We should make it easy to stream Arrow in the browser via 
> [whatwg/streams|https://github.com/whatwg/streams]. I already have this 
> working at Graphistry, but I had to use some of the IPC internal methods. 
> Creating this issue to track back-porting that work and the few minor 
> refactors to the IPC internals that we'll need to do.





[jira] [Resolved] (ARROW-2235) [JS] Add tests for IPC messages split across multiple buffers

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-2235.

   Resolution: Fixed
Fix Version/s: (was: JS-0.5.0)
   JS-0.4.0

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Add tests for IPC messages split across multiple buffers
> -
>
> Key: ARROW-2235
> URL: https://issues.apache.org/jira/browse/ARROW-2235
> Project: Apache Arrow
>  Issue Type: Task
>  Components: JavaScript
>Reporter: Brian Hulette
>Priority: Major
> Fix For: JS-0.4.0
>
>
> See https://github.com/apache/arrow/pull/1670
> This is probably easiest to do after the JS IPC writer is finished 
> (ARROW-2116)





[jira] [Resolved] (ARROW-2766) [JS] Add ability to construct a Table from a list of Arrays/TypedArrays

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-2766.

   Resolution: Fixed
 Assignee: Paul Taylor
Fix Version/s: (was: JS-0.5.0)
   JS-0.4.0

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Add ability to construct a Table from a list of Arrays/TypedArrays
> ---
>
> Key: ARROW-2766
> URL: https://issues.apache.org/jira/browse/ARROW-2766
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: JavaScript
>Reporter: Brian Hulette
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.4.0
>
>
> Something like 
> {code:javascript}
> Table.from({'col1': [...], 'col2': [...], 'col3': [...]})
> {code}





[jira] [Resolved] (ARROW-3561) [JS] Update ts-jest

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-3561.

   Resolution: Fixed
 Assignee: Paul Taylor
Fix Version/s: JS-0.4.0

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Update ts-jest
> ---
>
> Key: ARROW-3561
> URL: https://issues.apache.org/jira/browse/ARROW-3561
> Project: Apache Arrow
>  Issue Type: Task
>  Components: JavaScript
>Reporter: Dominik Moritz
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.4.0
>
>






[jira] [Resolved] (ARROW-3337) [JS] IPC writer doesn't serialize the dictionary of nested Vectors

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-3337.

   Resolution: Fixed
Fix Version/s: (was: JS-0.5.0)
   JS-0.4.0

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] IPC writer doesn't serialize the dictionary of nested Vectors
> --
>
> Key: ARROW-3337
> URL: https://issues.apache.org/jira/browse/ARROW-3337
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Affects Versions: JS-0.3.1
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.4.0
>
>
> The JS writer only serializes dictionaries for [top-level 
> children|https://github.com/apache/arrow/blob/ee9b1ba426e2f1f117cde8d8f4ba6fbe3be5674c/js/src/ipc/writer/binary.ts#L40]
>  of a Table. This is wrong, and an oversight on my part. The fix here is to 
> put the actual Dictionary vectors in the `schema.dictionaries` map instead of 
> the dictionary fields, like I understand the C++ does.





[jira] [Resolved] (ARROW-2778) [JS] Add Utf8Vector.from

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-2778.

   Resolution: Fixed
Fix Version/s: JS-0.4.0

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Add Utf8Vector.from
> 
>
> Key: ARROW-2778
> URL: https://issues.apache.org/jira/browse/ARROW-2778
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Labels: pull-request-available
> Fix For: JS-0.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (ARROW-3560) [JS] Remove @std/esm

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-3560.

   Resolution: Fixed
 Assignee: Paul Taylor
Fix Version/s: JS-0.4.0

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Remove @std/esm
> 
>
> Key: ARROW-3560
> URL: https://issues.apache.org/jira/browse/ARROW-3560
> Project: Apache Arrow
>  Issue Type: Task
>  Components: JavaScript
>Reporter: Dominik Moritz
>Assignee: Paul Taylor
>Priority: Minor
> Fix For: JS-0.4.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When I run npm install, I get this warning:
> @std/esm@0.26.0: This package is discontinued. Use https://npmjs.com/esm





[jira] [Resolved] (ARROW-2839) [JS] Support whatwg/streams in IPC reader/writer

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-2839.

   Resolution: Fixed
Fix Version/s: (was: JS-0.5.0)
   JS-0.4.0

Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]


> [JS] Support whatwg/streams in IPC reader/writer
> 
>
> Key: ARROW-2839
> URL: https://issues.apache.org/jira/browse/ARROW-2839
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Affects Versions: JS-0.3.1
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.4.0
>
>
> We should make it easy to stream Arrow in the browser via 
> [whatwg/streams|https://github.com/whatwg/streams]. I already have this 
> working at Graphistry, but I had to use some of the IPC internal methods. 
> Creating this issue to track back-porting that work and the few minor 
> refactors to the IPC internals that we'll need to do.





[jira] [Issue Comment Deleted] (ARROW-2839) [JS] Support whatwg/streams in IPC reader/writer

2019-01-13 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor updated ARROW-2839:
---
Comment: was deleted

(was: Issue resolved by pull request 3290
[https://github.com/apache/arrow/pull/3290|https://github.com/apache/arrow/pull/3290]
)

> [JS] Support whatwg/streams in IPC reader/writer
> 
>
> Key: ARROW-2839
> URL: https://issues.apache.org/jira/browse/ARROW-2839
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Affects Versions: JS-0.3.1
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
> Fix For: JS-0.5.0
>
>
> We should make it easy to stream Arrow in the browser via 
> [whatwg/streams|https://github.com/whatwg/streams]. I already have this 
> working at Graphistry, but I had to use some of the IPC internal methods. 
> Creating this issue to track back-porting that work and the few minor 
> refactors to the IPC internals that we'll need to do.





[jira] [Resolved] (ARROW-3336) JS writer doesn't serialize sliced Vectors correctly

2018-09-27 Thread Paul Taylor (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor resolved ARROW-3336.

Resolution: Fixed

Issue resolved by pull request 2638
[https://github.com/apache/arrow/pull/2638]

> JS writer doesn't serialize sliced Vectors correctly
> 
>
> Key: ARROW-3336
> URL: https://issues.apache.org/jira/browse/ARROW-3336
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Affects Versions: JS-0.3.1
>Reporter: Paul Taylor
>Assignee: Paul Taylor
>Priority: Major
>  Labels: pull-request-available
> Fix For: JS-0.4.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The JS IPC writer is slicing the data and valueOffset buffers by starting 
> from the data's current logical offset. This is incorrect, since the slice 
> function already does this for the data, type, and valueOffset TypedArrays 
> internally. PR incoming.





[jira] [Created] (ARROW-3337) JS writer doesn't serialize the dictionary of nested Vectors

2018-09-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-3337:
--

 Summary: JS writer doesn't serialize the dictionary of nested 
Vectors
 Key: ARROW-3337
 URL: https://issues.apache.org/jira/browse/ARROW-3337
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: JS-0.3.1
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


The JS writer only serializes dictionaries for [top-level 
children|https://github.com/apache/arrow/blob/ee9b1ba426e2f1f117cde8d8f4ba6fbe3be5674c/js/src/ipc/writer/binary.ts#L40]
 of a Table. This is wrong, and an oversight on my part. The fix here is to put 
the actual Dictionary vectors in the `schema.dictionaries` map instead of the 
dictionary fields, like I understand the C++ does.
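The fix amounts to walking every field recursively, instead of only the table's top-level children, and collecting dictionary-encoded fields into one map keyed by dictionary id. A hedged sketch with illustrative field shapes (not the real writer code):

```typescript
// Hypothetical field shape: dictionary-encoded fields carry an id, and
// nested types (Struct, List, ...) carry children.
interface Field { name: string; dictionaryId?: number; children?: Field[]; }

// Recursively collect every dictionary-encoded field into one map,
// analogous to populating a single `schema.dictionaries` map.
function collectDictionaries(
  fields: Field[], out = new Map<number, Field>(),
): Map<number, Field> {
  for (const field of fields) {
    if (field.dictionaryId !== undefined) out.set(field.dictionaryId, field);
    if (field.children) collectDictionaries(field.children, out);  // recurse
  }
  return out;
}

const schema: Field[] = [
  { name: 'tags', dictionaryId: 0 },
  { name: 'nested', children: [{ name: 'inner', dictionaryId: 1 }] },
];
// A top-level-only walk would miss id 1; the recursive walk finds both.
console.log(Array.from(collectDictionaries(schema).keys()));  // [0, 1]
```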





[jira] [Created] (ARROW-3336) JS writer doesn't serialize sliced Vectors correctly

2018-09-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-3336:
--

 Summary: JS writer doesn't serialize sliced Vectors correctly
 Key: ARROW-3336
 URL: https://issues.apache.org/jira/browse/ARROW-3336
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: JS-0.3.1
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


The JS IPC writer is slicing the data and valueOffset buffers by starting from 
the data's current logical offset. This is incorrect, since the slice function 
already does this for the data, type, and valueOffset TypedArrays internally. 
PR incoming.
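The double-offset bug is easy to reproduce with plain TypedArrays standing in for Arrow data buffers: {{subarray}} is already relative to the view's own offset, so re-applying the logical offset skips real data.

```typescript
// A backing buffer and a logical view that starts at offset 2.
const backing = new Int32Array([10, 11, 12, 13, 14, 15]);
const offset = 2;
const data = backing.subarray(offset);  // view over [12, 13, 14, 15]

// Correct: slice relative to the view itself.
console.log(Array.from(data.subarray(0, 2)));  // [12, 13]

// Incorrect (the bug): adding the logical offset a second time.
console.log(Array.from(data.subarray(offset, offset + 2)));  // [14, 15]
```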





[jira] [Commented] (ARROW-3256) [JS] File footer and message metadata is inconsistent

2018-09-25 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628085#comment-16628085
 ] 

Paul Taylor commented on ARROW-3256:


Sorry for the confusion Wes, I got distracted while rewriting that sentence and 
forgot to remove the last half when I came back to it.

Does this change look like a fix? 
https://github.com/apache/arrow/pull/2616/commits/2095e4ebffeb9f51f04d1b9500c958dbbca9bedd#diff-64a9bfd33e2b9cdeaf61082d9fde8a0dR77

> [JS] File footer and message metadata is inconsistent
> -
>
> Key: ARROW-3256
> URL: https://issues.apache.org/jira/browse/ARROW-3256
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Reporter: Wes McKinney
>Priority: Major
> Fix For: JS-0.4.0
>
>
> I added some assertions to the C++ library and found that the body length in 
> the file footer and the IPC message were different
> {code}
> ##
> JS producing, C++ consuming
> ##
> ==
> Testing file 
> /home/travis/build/apache/arrow/integration/data/struct_example.json
> ==
> -- Creating binary inputs
> node --no-warnings /home/travis/build/apache/arrow/js/bin/json-to-arrow.js -a 
> /tmp/tmplbm3vbwz/3d2269c960f148b6b94e5f881c0bf9ca_struct_example.json_to_arrow
>  -j /home/travis/build/apache/arrow/integration/data/struct_example.json
> -- Validating file
> /home/travis/build/apache/arrow/cpp-build/debug/json-integration-test 
> --integration 
> --arrow=/tmp/tmplbm3vbwz/3d2269c960f148b6b94e5f881c0bf9ca_struct_example.json_to_arrow
>  --json=/home/travis/build/apache/arrow/integration/data/struct_example.json 
> --mode=VALIDATE
> Command failed: 
> /home/travis/build/apache/arrow/cpp-build/debug/json-integration-test 
> --integration 
> --arrow=/tmp/tmplbm3vbwz/3d2269c960f148b6b94e5f881c0bf9ca_struct_example.json_to_arrow
>  --json=/home/travis/build/apache/arrow/integration/data/struct_example.json 
> --mode=VALIDATE
> With output:
> --
> /home/travis/build/apache/arrow/cpp/src/arrow/ipc/reader.cc:581 Check failed: 
> (message->body_length()) == (block.body_length)
> {code}
> I'm not sure what's wrong. I'll remove the assertions for now





[jira] [Commented] (ARROW-3256) [JS] File footer and message metadata is inconsistent

2018-09-25 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627755#comment-16627755
 ] 

Paul Taylor commented on ARROW-3256:


Yeah, looking at it now it makes sense why they're different. The JS is setting 
the FileBlock's body_length to the size of the entire serialized IPC message, 
not just the size of the data buffers.

The body_length in the RecordBatch header currently is the total aligned sizes 
of the buffers in the batch, which I copied from here: 
https://github.com/apache/arrow/blob/516750216bfd48489b20988ad181e61823ecbb2f/cpp/src/arrow/ipc/writer.cc#L179

Also looking at where the body_length from a FileBlock is used, I see this: 
https://github.com/apache/arrow/blob/516750216bfd48489b20988ad181e61823ecbb2f/cpp/src/arrow/ipc/writer.cc#L866
 

It looks like the body_length field in the message header is the sum size of 
all the buffers. The size of the IPC message is then metadata_length + 
body_length + padding, and is written to the first 4 bytes of the IPC message. 
Have I misunderstood how the C++ writer is computing the body_length?
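The arithmetic under discussion can be written out explicitly: pad each buffer size up to an 8-byte boundary, sum those padded sizes to get body_length, and add metadata_length for the total message size. All the numbers below are made up for illustration, not taken from a real IPC message:

```typescript
// Round n up to the next multiple of `to` (a power of two).
const align = (n: number, to = 8): number => (n + to - 1) & ~(to - 1);

// body_length = sum of 8-byte-aligned buffer sizes.
const bufferSizes = [13, 4, 27];
const bodyLength = bufferSizes.reduce((sum, n) => sum + align(n), 0);
console.log(bodyLength);  // align(13) + align(4) + align(27) = 16 + 8 + 32 = 56

// Total message size = aligned metadata (incl. 4-byte length prefix) + body.
const metadataLength = align(4 + 180);     // hypothetical flatbuffer size
console.log(metadataLength + bodyLength);  // 184 + 56 = 240
```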

> [JS] File footer and message metadata is inconsistent
> -
>
> Key: ARROW-3256
> URL: https://issues.apache.org/jira/browse/ARROW-3256
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: JavaScript
>Reporter: Wes McKinney
>Priority: Major
> Fix For: JS-0.4.0
>
>
> I added some assertions to the C++ library and found that the body length in 
> the file footer and the IPC message were different
> {code}
> ##
> JS producing, C++ consuming
> ##
> ==
> Testing file 
> /home/travis/build/apache/arrow/integration/data/struct_example.json
> ==
> -- Creating binary inputs
> node --no-warnings /home/travis/build/apache/arrow/js/bin/json-to-arrow.js -a 
> /tmp/tmplbm3vbwz/3d2269c960f148b6b94e5f881c0bf9ca_struct_example.json_to_arrow
>  -j /home/travis/build/apache/arrow/integration/data/struct_example.json
> -- Validating file
> /home/travis/build/apache/arrow/cpp-build/debug/json-integration-test 
> --integration 
> --arrow=/tmp/tmplbm3vbwz/3d2269c960f148b6b94e5f881c0bf9ca_struct_example.json_to_arrow
>  --json=/home/travis/build/apache/arrow/integration/data/struct_example.json 
> --mode=VALIDATE
> Command failed: 
> /home/travis/build/apache/arrow/cpp-build/debug/json-integration-test 
> --integration 
> --arrow=/tmp/tmplbm3vbwz/3d2269c960f148b6b94e5f881c0bf9ca_struct_example.json_to_arrow
>  --json=/home/travis/build/apache/arrow/integration/data/struct_example.json 
> --mode=VALIDATE
> With output:
> --
> /home/travis/build/apache/arrow/cpp/src/arrow/ipc/reader.cc:581 Check failed: 
> (message->body_length()) == (block.body_length)
> {code}
> I'm not sure what's wrong. I'll remove the assertions for now





[jira] [Comment Edited] (ARROW-3256) [JS] File footer and message metadata is inconsistent

2018-09-24 Thread Paul Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626629#comment-16626629
 ] 

Paul Taylor edited comment on ARROW-3256 at 9/25/18 12:49 AM:
--

[~wesmckinn] -The current behavior is metadata length + body length, aligned 
to the next-highest multiple of 8. This includes the 4 bytes used to store the 
metadata length. Do you recall the difference between the expected total size 
and the total size JS is creating? If so I can work backwards from that to 
figure out what to add or subtract.-

Edit: I misunderstood the original bug -- I now understand you mean the 
body_length of the Message that the JS writer creates is different from the 
body_length of the FileBlock it lives in. I thought you meant there was a 
difference between JS and CPP. I can take a look soon.


was (Author: paul.e.taylor):
[~wesmckinn] ~The current behavior is metadata length + body length, aligned 
to the next-highest multiple of 8. This includes the 4 bytes used to store the 
metadata length. Do you recall the difference between the expected total size 
and the total size JS is creating? If so I can work backwards from that to 
figure out what to add or subtract.~

Edit: I misunderstood the original bug -- I now understand you mean the 
body_length of the Message that the JS writer creates is different from the 
body_length of the FileBlock it lives in. I thought you meant there was a 
difference between JS and CPP. I can take a look soon.
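
Restating that inconsistency as a sketch (the FileBlock field names follow the Arrow file-format footer; the validation helper is hypothetical):

```typescript
// Hypothetical sketch of the invariant the C++ reader checks (reader.cc:581):
// the FileBlock recorded in the file footer must carry the same body_length
// as the message it points at, not the full serialized message size.

interface FileBlock {
  offset: number;         // file offset of the message
  metaDataLength: number; // metadata size, incl. 4-byte prefix and padding
  bodyLength: number;     // must equal the Message header's body_length
}

function validateBlock(block: FileBlock, messageBodyLength: number): boolean {
  return block.bodyLength === messageBodyLength;
}

// A block written the way described above (whole IPC message size in
// bodyLength) fails validation against a message body of 40 bytes:
const buggy: FileBlock = { offset: 0, metaDataLength: 104, bodyLength: 104 + 40 };
console.log(validateBlock(buggy, 40)); // false
```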


