[jira] [Created] (ARROW-6886) [C++] arrow::io header nvcc compiler warnings

2019-10-14 Thread Paul Taylor (Jira)
Paul Taylor created ARROW-6886:
--

 Summary: [C++] arrow::io header nvcc compiler warnings
 Key: ARROW-6886
 URL: https://issues.apache.org/jira/browse/ARROW-6886
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++
Affects Versions: 0.15.0
Reporter: Paul Taylor


I'm seeing the following compiler warnings when including the arrow::io headers 
in a statically linked build with nvcc:

{noformat}
arrow/install/include/arrow/io/file.h(189): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::MemoryMappedFile"

arrow/install/include/arrow/io/memory.h(98): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::MockOutputStream"

arrow/install/include/arrow/io/memory.h(116): warning: overloaded virtual 
function "arrow::io::Writable::Write" is only partially overridden in class 
"arrow::io::FixedSizeBufferWriter"

{noformat}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-6053) [Python] RecordBatchStreamReader::Open2 cdef type signature doesn't match C++

2019-07-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-6053:
--

 Summary: [Python] RecordBatchStreamReader::Open2 cdef type 
signature doesn't match C++
 Key: ARROW-6053
 URL: https://issues.apache.org/jira/browse/ARROW-6053
 Project: Apache Arrow
  Issue Type: New Feature
  Components: Python
Affects Versions: 0.14.1
Reporter: Paul Taylor
Assignee: Paul Taylor


The Cython method signature for RecordBatchStreamReader::Open2 doesn't match 
the C++ type signature and causes a compiler type error trying to call Open2 
from Cython.





[jira] [Created] (ARROW-5537) [JS] Support delta dictionaries in RecordBatchWriter and DictionaryBuilder

2019-06-09 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-5537:
--

 Summary: [JS] Support delta dictionaries in RecordBatchWriter and 
DictionaryBuilder
 Key: ARROW-5537
 URL: https://issues.apache.org/jira/browse/ARROW-5537
 Project: Apache Arrow
  Issue Type: New Feature
Affects Versions: 0.13.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.14.0


The new JS DictionaryBuilder and RecordBatchWriter should support building 
and writing delta dictionary batches to enable creating DictionaryVectors while 
streaming.





[jira] [Created] (ARROW-5396) [JS] Ensure reader and writer support files and streams with no RecordBatches

2019-05-22 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-5396:
--

 Summary: [JS] Ensure reader and writer support files and streams 
with no RecordBatches
 Key: ARROW-5396
 URL: https://issues.apache.org/jira/browse/ARROW-5396
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.13.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.14.0


Re: https://issues.apache.org/jira/browse/ARROW-2119 and 
[https://github.com/apache/arrow/pull/3871], the JS reader and writer should 
support files and streams with a Schema but no RecordBatches.





[jira] [Created] (ARROW-5115) [JS] Implement the Vector Builders

2019-04-03 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-5115:
--

 Summary: [JS] Implement the Vector Builders
 Key: ARROW-5115
 URL: https://issues.apache.org/jira/browse/ARROW-5115
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.13.0
Reporter: Paul Taylor
Assignee: Paul Taylor


We should implement the streaming Vector Builders in JS.





[jira] [Created] (ARROW-5100) [JS] Writer swaps byte order if buffers share the same underlying ArrayBuffer

2019-04-02 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-5100:
--

 Summary: [JS] Writer swaps byte order if buffers share the same 
underlying ArrayBuffer
 Key: ARROW-5100
 URL: https://issues.apache.org/jira/browse/ARROW-5100
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: 0.13.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.14.0


We collapse contiguous Uint8Arrays that share the same underlying ArrayBuffer 
and have overlapping byte ranges. This was done to maintain true zero-copy 
behavior when using certain node core streams that use a buffer pool 
internally, and could write chunks of the same logical Arrow Message at 
out-of-order byte offsets in the pool.

Unfortunately this can also lead to a bug where, in rare cases, buffers are 
swapped while writing Arrow Messages too. We could have a flag to indicate 
whether we think collapsing out-of-order same-buffer chunks is safe, but I'm 
not sure if we can always know that, so I'd prefer to take it out and incur the 
copy cost.
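
A minimal sketch of the copy-based alternative, assuming a hypothetical `joinChunks` helper (not the actual ArrowJS function). It always copies each chunk into a fresh buffer in arrival order, so chunks that alias the same ArrayBuffer at out-of-order offsets can't end up swapped:

```typescript
// Hypothetical helper: concatenate chunks by copying, never by collapsing
// views over a shared ArrayBuffer.
function joinChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const out = new Uint8Array(total);
  let position = 0;
  for (const chunk of chunks) {
    out.set(chunk, position); // the copy preserves logical (arrival) order
    position += chunk.byteLength;
  }
  return out;
}

// Two chunks of one logical message, stored out of order in a shared pool.
const pool = new Uint8Array([3, 4, 1, 2]);
const first = pool.subarray(2, 4); // logical bytes [1, 2]
const second = pool.subarray(0, 2); // logical bytes [3, 4]
const joined = joinChunks([first, second]);
```

The trade-off is the one described above: one extra copy per message in exchange for not having to decide when collapsing is safe.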





[jira] [Created] (ARROW-4976) [JS] RecordBatchReader should reset its Node/DOM streams

2019-03-20 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4976:
--

 Summary: [JS] RecordBatchReader should reset its Node/DOM streams
 Key: ARROW-4976
 URL: https://issues.apache.org/jira/browse/ARROW-4976
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


RecordBatchReaders should reset their internal platform streams on reset so 
they can be piped to separate output streams when reset.





[jira] [Created] (ARROW-4781) [JS] Ensure empty data initializes empty typed arrays

2019-03-05 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4781:
--

 Summary: [JS] Ensure empty data initializes empty typed arrays
 Key: ARROW-4781
 URL: https://issues.apache.org/jira/browse/ARROW-4781
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


Empty ArrayData instances should initialize with the appropriate 0-length 
buffers.





[jira] [Created] (ARROW-4780) [JS] Package sourcemap files, update default package JS version

2019-03-05 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4780:
--

 Summary: [JS] Package sourcemap files, update default package JS 
version
 Key: ARROW-4780
 URL: https://issues.apache.org/jira/browse/ARROW-4780
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Paul Taylor
Assignee: Paul Taylor


The build should split the sourcemaps out to speed up client builds and 
include a "module" entry in the package.json for @pika/web. The main package 
should also ship the latest ESNext JS version.





[jira] [Created] (ARROW-4738) [JS] NullVector should include a null data buffer

2019-03-01 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4738:
--

 Summary: [JS] NullVector should include a null data buffer
 Key: ARROW-4738
 URL: https://issues.apache.org/jira/browse/ARROW-4738
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


Arrow C++ and pyarrow expect NullVectors to include a null data buffer, so 
ArrowJS should write one into the buffer layout.





[jira] [Created] (ARROW-4682) [JS] Writer should be able to write empty tables

2019-02-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4682:
--

 Summary: [JS] Writer should be able to write empty tables
 Key: ARROW-4682
 URL: https://issues.apache.org/jira/browse/ARROW-4682
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


The writer should be able to write empty tables.





[jira] [Created] (ARROW-4674) [JS] Update arrow2csv to new Row API

2019-02-25 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4674:
--

 Summary: [JS] Update arrow2csv to new Row API
 Key: ARROW-4674
 URL: https://issues.apache.org/jira/browse/ARROW-4674
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


The {{arrow2csv}} utility uses {{row.length}} to measure cells, but now that 
we've made Rows use Symbols for their internal properties, it should enumerate 
the values with the iterator.
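
A rough sketch of the idea, assuming a hypothetical `Row` stand-in whose internals are keyed by a Symbol (the real Row class differs). The point is only that cells are collected through the iterator instead of through `row.length`:

```typescript
// Hypothetical stand-in for an Arrow Row: internals live behind a Symbol,
// so there is no public `length` to rely on.
const kValues = Symbol("values");

class Row {
  [kValues]: unknown[];
  constructor(values: unknown[]) {
    this[kValues] = values;
  }
  *[Symbol.iterator]() {
    yield* this[kValues];
  }
}

// Enumerate the cells with the iterator rather than indexing up to `length`.
function cellStrings(row: Iterable<unknown>): string[] {
  const cells: string[] = [];
  for (const value of row) cells.push(String(value));
  return cells;
}

const row = new Row([1, "a", null]);
const cells = cellStrings(row);
```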





[jira] [Created] (ARROW-4652) [JS] RecordBatchReader throughNode should respect autoDestroy

2019-02-21 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4652:
--

 Summary: [JS] RecordBatchReader throughNode should respect 
autoDestroy
 Key: ARROW-4652
 URL: https://issues.apache.org/jira/browse/ARROW-4652
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


The Reader transform stream closes after reading one set of tables even when 
autoDestroy is false. Instead it should reset/reopen the reader, like 
{{RecordBatchReader.readAll()}} does.





[jira] [Created] (ARROW-4579) [JS] Add more interop with BigInt/BigInt64Array/BigUint64Array

2019-02-14 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4579:
--

 Summary: [JS] Add more interop with 
BigInt/BigInt64Array/BigUint64Array
 Key: ARROW-4579
 URL: https://issues.apache.org/jira/browse/ARROW-4579
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


We should use or return the new native [BigInt 
types|https://developers.google.com/web/updates/2018/05/bigint] whenever they're 
available.

* Use the native {{BigInt}} to convert/stringify i64s/u64s
* Support the {{BigInt}} type in element comparator and {{indexOf()}}
* Add zero-copy {{toBigInt64Array()}} and {{toBigUint64Array()}} methods to 
{{Int64Vector}} and {{Uint64Vector}}, respectively
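
A minimal sketch of the zero-copy conversion, written as a hypothetical free function rather than the proposed Vector method. It assumes the 64-bit values are stored as [lo, hi] pairs of 32-bit words, a little-endian host, and a runtime with `BigInt64Array`:

```typescript
// Hypothetical free function: reinterpret lo/hi 32-bit word pairs as 64-bit
// integers without copying. Assumes a little-endian host and that
// `words.byteOffset` is a multiple of 8 (BigInt64Array requires it).
function toBigInt64Array(words: Int32Array): BigInt64Array {
  return new BigInt64Array(words.buffer, words.byteOffset, words.length / 2);
}

const words = new Int32Array([1, 0, -1, -1]); // 1 and -1 as [lo, hi] pairs
const bigs = toBigInt64Array(words);

// Native BigInt handles the stringification.
const decimalStrings = Array.from(bigs, (value) => value.toString());
```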






[jira] [Created] (ARROW-4580) [JS] Accept Iterables in IntVector/FloatVector from() signatures

2019-02-14 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4580:
--

 Summary: [JS] Accept Iterables in IntVector/FloatVector from() 
signatures
 Key: ARROW-4580
 URL: https://issues.apache.org/jira/browse/ARROW-4580
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


Right now {{IntVector.from()}} and {{FloatVector.from()}} expect the data to 
already be in typed-array form. But if we know the desired Vector type beforehand 
(e.g. if {{Int32Vector.from()}} is called), we can accept any JS iterable of 
the values.

In order to do this, we should ensure {{Float16Vector.from()}} properly clamps 
incoming f32/f64 values to u16s, in case the source is a vanilla 64-bit JS 
float.
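
A sketch of what that could look like, with hypothetical names (`toHalfBits`, `float16From`) that are not the ArrowJS API. The conversion is simplified: it truncates the mantissa instead of rounding, and flushes values too small for a normal half float to zero:

```typescript
// Scratch views used to read the raw bits of a 32-bit float.
const f32 = new Float32Array(1);
const u32 = new Uint32Array(f32.buffer);

// Hypothetical helper: convert a JS number to IEEE 754 half-float bits.
// Simplifications: the mantissa is truncated (not rounded) and values too
// small for a normal half float flush to signed zero.
function toHalfBits(value: number): number {
  f32[0] = value;
  const bits = u32[0];
  const sign = (bits >>> 16) & 0x8000;
  const exponent = (bits >>> 23) & 0xff;
  const mantissa = bits & 0x7fffff;
  if (exponent === 0xff) return sign | 0x7c00 | (mantissa ? 0x0200 : 0); // Inf/NaN
  const halfExponent = exponent - 127 + 15;
  if (halfExponent >= 0x1f) return sign | 0x7c00; // too large: clamp to Infinity
  if (halfExponent <= 0) return sign; // too small: flush to zero
  return sign | (halfExponent << 10) | (mantissa >> 13);
}

// Hypothetical `from()`: accepts any iterable because the target type is known.
function float16From(values: Iterable<number>): Uint16Array {
  return Uint16Array.from(values, toHalfBits);
}

const halves = float16From(new Set([1, -2, 65504, 1e10]));
```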





[jira] [Created] (ARROW-4578) [JS] Float16Vector toArray should be zero-copy

2019-02-14 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4578:
--

 Summary: [JS] Float16Vector toArray should be zero-copy
 Key: ARROW-4578
 URL: https://issues.apache.org/jira/browse/ARROW-4578
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.1


The {{Float16Vector#toArray()}} implementation currently transforms each half 
float into a single float, and returns a Float32Array. All the other 
{{toArray()}} implementations are zero-copy, and this deviation would break 
anyone expecting to give two-byte half floats to native APIs like WebGL. We 
should instead include {{Float16Vector#toFloat32Array()}} and 
{{Float16Vector#toFloat64Array()}} convenience methods that do rely on copying.
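
To illustrate the split between the two kinds of methods, here is a small sketch using plain functions over a `Uint16Array` of half-float bits. The function names mirror the proposal, but the bodies are my assumption, not the eventual implementation:

```typescript
// Hypothetical helper: decode IEEE 754 half-float bits to a JS number.
function halfToNumber(h: number): number {
  const sign = h & 0x8000 ? -1 : 1;
  const exponent = (h >> 10) & 0x1f;
  const fraction = h & 0x03ff;
  if (exponent === 0) return sign * fraction * 2 ** -24; // zero or subnormal
  if (exponent === 0x1f) return fraction ? NaN : sign * Infinity;
  return sign * (1 + fraction / 1024) * 2 ** (exponent - 15);
}

// Zero-copy: hand back a view over the same two-byte elements.
function toArray(halves: Uint16Array): Uint16Array {
  return halves.subarray(0);
}

// Copying convenience: decode every element into a new Float32Array.
function toFloat32Array(halves: Uint16Array): Float32Array {
  return Float32Array.from(halves, halfToNumber);
}

const halfBits = new Uint16Array([0x3c00, 0xc000, 0x3800]);
const view = toArray(halfBits);
const floats = toFloat32Array(halfBits);
```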





[jira] [Created] (ARROW-4557) [JS] Add Table/Schema/RecordBatch `selectAt(...indices)` method

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4557:
--

 Summary: [JS] Add Table/Schema/RecordBatch `selectAt(...indices)` 
method
 Key: ARROW-4557
 URL: https://issues.apache.org/jira/browse/ARROW-4557
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: JS-0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.5.0


Presently Table, Schema, and RecordBatch have basic {{select(...colNames)}} 
implementations. Having an easy {{selectAt(...colIndices)}} impl would be a 
nice complement, especially when there are duplicate column names.
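
A minimal sketch of the difference, using hypothetical stand-ins for Field and Schema (the real classes carry much more). Skipping out-of-range indices is an assumption on my part:

```typescript
// Hypothetical minimal stand-ins for Arrow's Field and Schema.
interface Field {
  name: string;
  type: string;
}

class Schema {
  readonly fields: Field[];
  constructor(fields: Field[]) {
    this.fields = fields;
  }

  // Existing style: select by name (ambiguous when names repeat).
  select(...names: string[]): Schema {
    return new Schema(this.fields.filter((f) => names.indexOf(f.name) !== -1));
  }

  // Proposed style: select by position, skipping out-of-range indices.
  selectAt(...indices: number[]): Schema {
    const picked = indices
      .map((i) => this.fields[i])
      .filter((f): f is Field => f !== undefined);
    return new Schema(picked);
  }
}

const schema = new Schema([
  { name: "x", type: "int32" },
  { name: "x", type: "utf8" },
  { name: "y", type: "float64" },
]);
const byName = schema.select("x"); // both "x" fields
const byIndex = schema.selectAt(1); // only the utf8 "x"
```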





[jira] [Created] (ARROW-4555) [JS] Add high-level Table and Column creation methods

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4555:
--

 Summary: [JS] Add high-level Table and Column creation methods
 Key: ARROW-4555
 URL: https://issues.apache.org/jira/browse/ARROW-4555
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.1


It'd be great to have a few high-level functions that implicitly create the 
Schema, RecordBatches, etc. from a Table and a list of Columns. For example:
{code:actionscript}
const table = Table.new(
  Column.new('foo', ...),
  Column.new('bar', ...)
);
{code}






[jira] [Created] (ARROW-4554) [JS] Implement logic for combining Vectors with different lengths/chunksizes

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4554:
--

 Summary: [JS] Implement logic for combining Vectors with different 
lengths/chunksizes
 Key: ARROW-4554
 URL: https://issues.apache.org/jira/browse/ARROW-4554
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.1


We should add logic to combine and possibly slice/re-chunk and uniformly 
partition chunks into separate RecordBatches. This will make it easier to 
create Tables or RecordBatches from Vectors of different lengths. This is also 
necessary for {{Table#assign()}}. PR incoming.
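
One possible strategy, sketched with plain number arrays standing in for Vector chunks (all names here are hypothetical). Every column is sliced at the union of all chunk boundaries, so each batch holds equally long pieces. For brevity this version flattens the chunks first and drops rows past the shortest column; padding with nulls is another option:

```typescript
type Chunks = number[][];

// Hypothetical helper: cumulative end offsets of a column's chunks.
function chunkEnds(chunks: Chunks): number[] {
  let end = 0;
  return chunks.map((chunk) => (end += chunk.length));
}

// Slice every column at the union of all chunk boundaries so that each
// resulting batch contains one equally long piece per column.
function rechunk(columns: Chunks[]): number[][][] {
  const flat = columns.map((chunks) =>
    chunks.reduce<number[]>((all, chunk) => all.concat(chunk), []),
  );
  const shortest = Math.min(...flat.map((values) => values.length));
  const ends = new Set<number>([shortest]);
  for (const chunks of columns) {
    for (const end of chunkEnds(chunks)) if (end < shortest) ends.add(end);
  }
  const sorted = Array.from(ends).sort((a, b) => a - b);
  let start = 0;
  return sorted.map((end) => {
    const batch = flat.map((values) => values.slice(start, end));
    start = end;
    return batch;
  });
}

const batches = rechunk([
  [[1, 2, 3], [4, 5]], // column a: chunk lengths 3 and 2
  [[10, 20], [30, 40, 50, 60]], // column b: chunk lengths 2 and 4
]);
```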





[jira] [Created] (ARROW-4553) [JS] Implement Schema/Field/DataType comparators

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4553:
--

 Summary: [JS] Implement Schema/Field/DataType comparators
 Key: ARROW-4553
 URL: https://issues.apache.org/jira/browse/ARROW-4553
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.1


Some basic type comparison logic is necessary for {{Table#assign()}}. PR 
incoming.





[jira] [Created] (ARROW-4552) [JS] Table and Schema assign implementations

2019-02-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4552:
--

 Summary: [JS] Table and Schema assign implementations
 Key: ARROW-4552
 URL: https://issues.apache.org/jira/browse/ARROW-4552
 Project: Apache Arrow
  Issue Type: New Feature
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor


It'd be really handy to have basic {{assign}} methods on the Table and 
Schema. I've extracted and cleaned up some internal helper methods I have that 
do this. PR incoming.





[jira] [Created] (ARROW-4477) [JS] Bn shouldn't override constructor of the resulting typed array

2019-02-04 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4477:
--

 Summary: [JS] Bn shouldn't override constructor of the resulting 
typed array
 Key: ARROW-4477
 URL: https://issues.apache.org/jira/browse/ARROW-4477
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.0


There's an undefined constructor property definition in the {{Object.assign()}} 
call for the BigNum mixins that's overriding the constructor of the returned 
TypedArrays. I think this was left over from the first iteration where I used 
{{Object.create()}}. These should be removed.
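
The effect can be reproduced in isolation. This sketch uses made-up mixin objects rather than the actual BigNum mixins, and only shows how an own `constructor` key copied by `Object.assign()` shadows the inherited one:

```typescript
// A mixin object that (accidentally) carries an own `constructor` key,
// next to one that doesn't. Both are illustrative, not the real mixins.
const mixinWithConstructor = { constructor: undefined, isBigNum: true };
const mixinWithoutConstructor = { isBigNum: true };

const broken = Object.assign(new Uint32Array(2), mixinWithConstructor);
const fixed = Object.assign(new Uint32Array(2), mixinWithoutConstructor);

// Object.assign copies own enumerable properties, so the own `constructor`
// shadows the one inherited from Uint32Array.prototype.
const brokenCtor: unknown = (broken as any).constructor;
const fixedCtor: unknown = (fixed as any).constructor;
```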





[jira] [Created] (ARROW-4442) [JS] Overly broad type annotation for Chunked typeId leading to type mismatches in generated typing

2019-01-31 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4442:
--

 Summary: [JS] Overly broad type annotation for Chunked typeId 
leading to type mismatches in generated typing
 Key: ARROW-4442
 URL: https://issues.apache.org/jira/browse/ARROW-4442
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor


TypeScript is generating an overly broad type for the `typeId` property of the 
ChunkedVector class, leading to a type mismatch and a failure to infer that a 
Column is a Vector:


{code:actionscript}

let col: Vector;
col = new Chunked(new Utf8());
  ^
/*
Argument of type 'Chunked' is not assignable to parameter of type 
'Vector'.
  Type 'Chunked' is not assignable to type 'Vector'.
Types of property 'typeId' are incompatible.
  Type 'Type' is not assignable to type 'Type.Utf8'.
*/
{code}

The fix is to add an explicit return annotation to the Chunked typeId getter.
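
A sketch of the shape of that fix, using a heavily simplified, hypothetical model of the types (string literals instead of the real `Type` enum). This model is too small to reproduce the inference failure itself; it only shows where the annotation goes:

```typescript
// Hypothetical, heavily simplified model of the Arrow JS types.
type TypeId = "Utf8" | "Int";

interface DataType<T extends TypeId = TypeId> {
  readonly typeId: T;
}

interface Vector<T extends DataType = DataType> {
  readonly typeId: T["typeId"];
}

class Utf8 implements DataType<"Utf8"> {
  readonly typeId: "Utf8" = "Utf8";
}

class Chunked<T extends DataType> implements Vector<T> {
  private readonly type: T;
  constructor(type: T) {
    this.type = type;
  }
  // Explicit return annotation: the getter is declared to be exactly as
  // narrow as the type parameter, rather than left to inference.
  get typeId(): T["typeId"] {
    return this.type.typeId;
  }
}

// Assignable because Chunked<Utf8>["typeId"] is "Utf8", not the full union.
const col: Vector<Utf8> = new Chunked(new Utf8());
```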





[jira] [Created] (ARROW-4396) Update Typedoc to support TypeScript 3.2

2019-01-27 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4396:
--

 Summary: Update Typedoc to support TypeScript 3.2
 Key: ARROW-4396
 URL: https://issues.apache.org/jira/browse/ARROW-4396
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor


Update TypeDoc now that it supports TypeScript 3.2





[jira] [Created] (ARROW-4395) ts-node throws type error running `bin/arrow2csv.js`

2019-01-27 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4395:
--

 Summary: ts-node throws type error running `bin/arrow2csv.js`
 Key: ARROW-4395
 URL: https://issues.apache.org/jira/browse/ARROW-4395
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: 0.4.0
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.0


ts-node is being too strict and throws this (inaccurate) error when JIT-compiling 
the TS source:

{code:none}
$ cat test/data/cpp/stream/simple.arrow | ./bin/arrow2csv.js 

/home/ptaylor/dev/arrow/js/node_modules/ts-node/src/index.ts:228
return new TSError(diagnosticText, diagnosticCodes)
   ^
TSError: ⨯ Unable to compile TypeScript:
src/vector/map.ts(25,57): error TS2345: Argument of type 'Field[]' is not assignable to parameter of type 'Field[]'.
  Type 'Field' is not assignable to type 
'Field'.
Type 'T[string] | T[number] | T[symbol]' is not assignable to type 'T[keyof 
T]'.
  Type 'T[symbol]' is not assignable to type 'T[keyof T]'.
Type 'DataType' is not assignable to type 'T[keyof T]'.
  Type 'symbol' is not assignable to type 'keyof T'.
Type 'symbol' is not assignable to type 'string | number'.
{code}






[jira] [Created] (ARROW-4283) Should RecordBatchStreamReader/Writer be AsyncIterable?

2019-01-17 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-4283:
--

 Summary: Should RecordBatchStreamReader/Writer be AsyncIterable?
 Key: ARROW-4283
 URL: https://issues.apache.org/jira/browse/ARROW-4283
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Paul Taylor


Filing this issue after a discussion today with [~xhochy] about how to 
implement streaming pyarrow http services. I had attempted to use both Flask 
and [aiohttp|https://aiohttp.readthedocs.io/en/stable/streams.html]'s streaming 
interfaces because they seemed familiar, but no dice. I have no idea how hard 
this would be to add -- supporting all the asynciterable primitives in JS was 
non-trivial.





[jira] [Created] (ARROW-3337) JS writer doesn't serialize the dictionary of nested Vectors

2018-09-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-3337:
--

 Summary: JS writer doesn't serialize the dictionary of nested 
Vectors
 Key: ARROW-3337
 URL: https://issues.apache.org/jira/browse/ARROW-3337
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: JS-0.3.1
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


The JS writer only serializes dictionaries for [top-level 
children|https://github.com/apache/arrow/blob/ee9b1ba426e2f1f117cde8d8f4ba6fbe3be5674c/js/src/ipc/writer/binary.ts#L40]
 of a Table. This is wrong, and an oversight on my part. The fix here is to put 
the actual Dictionary vectors in the `schema.dictionaries` map instead of the 
dictionary fields, as I understand the C++ implementation does.
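
A rough sketch of the traversal this implies, assuming a hypothetical minimal `Field` shape (the real Field and dictionary types differ). It walks nested children as well as top-level fields and stores the dictionary values, not the field, under each dictionary id:

```typescript
// Hypothetical minimal Field: a dictionary-encoded field carries an id and
// the dictionary values; nested types list their children.
interface Field {
  name: string;
  children: Field[];
  dictionary?: { id: number; values: string[] };
}

// Walk the whole field tree, not just the top level, and register each
// dictionary's values (rather than the field) under its id.
function collectDictionaries(
  fields: Field[],
  dictionaries: Map<number, string[]> = new Map<number, string[]>(),
): Map<number, string[]> {
  for (const field of fields) {
    if (field.dictionary && !dictionaries.has(field.dictionary.id)) {
      dictionaries.set(field.dictionary.id, field.dictionary.values);
    }
    collectDictionaries(field.children, dictionaries);
  }
  return dictionaries;
}

const fields: Field[] = [
  { name: "a", children: [], dictionary: { id: 0, values: ["x", "y"] } },
  {
    name: "struct",
    children: [{ name: "b", children: [], dictionary: { id: 1, values: ["z"] } }],
  },
];
const found = collectDictionaries(fields);
```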





[jira] [Created] (ARROW-3336) JS writer doesn't serialize sliced Vectors correctly

2018-09-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-3336:
--

 Summary: JS writer doesn't serialize sliced Vectors correctly
 Key: ARROW-3336
 URL: https://issues.apache.org/jira/browse/ARROW-3336
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: JS-0.3.1
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


The JS IPC writer is slicing the data and valueOffset buffers by starting from 
the data's current logical offset. This is incorrect, since the slice function 
already does this for the data, type, and valueOffset TypedArrays internally. 
PR incoming.
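
The effect of applying the offset twice can be shown with a bare typed array. This is an illustration with made-up values, not the writer code itself:

```typescript
// A vector's values after slicing: the view already starts at the logical
// offset, but the offset is still recorded alongside it.
const values = new Int32Array([10, 20, 30, 40, 50]);
const logicalOffset = 2;
const logicalLength = 2;
const sliced = values.subarray(logicalOffset, logicalOffset + logicalLength); // [30, 40]

// Buggy write path: applies the logical offset a second time.
const doubleSliced = sliced.subarray(logicalOffset, logicalOffset + logicalLength); // empty

// Correct write path: the view is already positioned, so start from 0.
const written = sliced.subarray(0, logicalLength); // [30, 40]
```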





[jira] [Created] (ARROW-3304) JS stream reader should yield all messages

2018-09-23 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-3304:
--

 Summary: JS stream reader should yield all messages
 Key: ARROW-3304
 URL: https://issues.apache.org/jira/browse/ARROW-3304
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: JS-0.3.1
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


The JS stream reader should yield all parsed messages from the source stream so 
an external consumer of the iterator can read multiple tables from one combined 
source stream.





[jira] [Created] (ARROW-2839) [JS] Support whatwg/streams in IPC reader/writer

2018-07-12 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2839:
--

 Summary: [JS] Support whatwg/streams in IPC reader/writer
 Key: ARROW-2839
 URL: https://issues.apache.org/jira/browse/ARROW-2839
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Affects Versions: JS-0.3.1
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: 0.4.0


We should make it easy to stream Arrow in the browser via 
[whatwg/streams|https://github.com/whatwg/streams]. I already have this working 
at Graphistry, but I had to use some of the IPC internal methods. Creating this 
issue to track back-porting that work and the few minor refactors to the IPC 
internals that we'll need to do.





[jira] [Created] (ARROW-2828) [JS] Refactor Vector Data classes

2018-07-10 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2828:
--

 Summary: [JS] Refactor Vector Data classes
 Key: ARROW-2828
 URL: https://issues.apache.org/jira/browse/ARROW-2828
 Project: Apache Arrow
  Issue Type: Task
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


In order to make it easier to build some of the higher-level APIs, we need to 
slim the Vector Data classes down to just one base implementation.

Initial WIP commit here, and work will continue in this branch: 
https://github.com/trxcllnt/arrow/commit/dfad9023583bef4f8d2a50ea25f643e4bccbc805#diff-2512057432c4ebf55c6308cb06b43b08





[jira] [Created] (ARROW-2779) [JS] Fix node stream reader/writer compatibility

2018-07-01 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2779:
--

 Summary: [JS] Fix node stream reader/writer compatibility
 Key: ARROW-2779
 URL: https://issues.apache.org/jira/browse/ARROW-2779
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


Emit Buffers not Uint8Arrays, and guard against reading 0-length buffers





[jira] [Created] (ARROW-2650) [JS] Finish implementing Unions

2018-05-31 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2650:
--

 Summary: [JS] Finish implementing Unions
 Key: ARROW-2650
 URL: https://issues.apache.org/jira/browse/ARROW-2650
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


Finish implementing Unions in JS and add to integration tests





[jira] [Created] (ARROW-2640) JS Writer should serialize schema metadata

2018-05-27 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2640:
--

 Summary: JS Writer should serialize schema metadata
 Key: ARROW-2640
 URL: https://issues.apache.org/jira/browse/ARROW-2640
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor


JS writer should serialize schema metadata





[jira] [Created] (ARROW-2356) [JS] JSON reader fails on FixedSizeBinary data buffer

2018-03-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2356:
--

 Summary: [JS] JSON reader fails on FixedSizeBinary data buffer
 Key: ARROW-2356
 URL: https://issues.apache.org/jira/browse/ARROW-2356
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: JS-0.3.1
Reporter: Paul Taylor
Assignee: Paul Taylor
 Fix For: JS-0.4.0


The JSON reader doesn't ingest the FixedSizeBinary data buffer correctly, and 
we haven't known about it because the JS integration test runner is accidentally 
exiting with code 0 on failures.





[jira] [Created] (ARROW-2226) [JS] DictionaryData should use indices' offset in constructor

2018-02-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2226:
--

 Summary: [JS] DictionaryData should use indices' offset in 
constructor
 Key: ARROW-2226
 URL: https://issues.apache.org/jira/browse/ARROW-2226
 Project: Apache Arrow
  Issue Type: Bug
Affects Versions: JS-0.3.0
Reporter: Paul Taylor
Assignee: Paul Taylor








[jira] [Created] (ARROW-2225) [JS] Vector reader should support reading tables split across buffers

2018-02-26 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2225:
--

 Summary: [JS] Vector reader should support reading tables split 
across buffers
 Key: ARROW-2225
 URL: https://issues.apache.org/jira/browse/ARROW-2225
 Project: Apache Arrow
  Issue Type: Bug
Affects Versions: JS-0.3.0
Reporter: Paul Taylor
Assignee: Paul Taylor








[jira] [Created] (ARROW-2214) [JS] proxy DictionaryVector's nullBitmap to its indices' nullBitmap

2018-02-25 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2214:
--

 Summary: [JS] proxy DictionaryVector's nullBitmap to its indices' 
nullBitmap
 Key: ARROW-2214
 URL: https://issues.apache.org/jira/browse/ARROW-2214
 Project: Apache Arrow
  Issue Type: Bug
Affects Versions: JS-0.3.0
Reporter: Paul Taylor
Assignee: Paul Taylor


We need to add a {{nullBitmap}} getter to {{DictionaryData}} that proxies to 
the indices {{nullBitmap}}, like we do with the {{nullCount}}. This is blocking 
the PR that updates JPMC Perspective to v0.3.0: 
[https://github.com/jpmorganchase/perspective/pull/55#issuecomment-368164271].
[~wesmckinn] can we do a patch release v0.3.1 once this PR is merged, since 
it's blocking a 3rd party PR?
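
A minimal sketch of the proxying getter, using simplified, hypothetical stand-ins for the data classes (the real `DictionaryData` has more state and a different constructor):

```typescript
// Hypothetical, simplified stand-ins for the Arrow JS data classes.
class IndicesData {
  readonly nullBitmap: Uint8Array;
  readonly nullCount: number;
  constructor(nullBitmap: Uint8Array, nullCount: number) {
    this.nullBitmap = nullBitmap;
    this.nullCount = nullCount;
  }
}

class DictionaryData {
  readonly indices: IndicesData;
  constructor(indices: IndicesData) {
    this.indices = indices;
  }
  // Validity lives on the indices, so both getters simply forward to them.
  get nullCount(): number {
    return this.indices.nullCount;
  }
  get nullBitmap(): Uint8Array {
    return this.indices.nullBitmap;
  }
}

const bitmap = new Uint8Array([0b00000101]); // slots 0 and 2 valid, slot 1 null
const dictionaryData = new DictionaryData(new IndicesData(bitmap, 1));
```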





[jira] [Created] (ARROW-2213) [JS] Fix npm-release.sh

2018-02-25 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-2213:
--

 Summary: [JS] Fix npm-release.sh 
 Key: ARROW-2213
 URL: https://issues.apache.org/jira/browse/ARROW-2213
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor


Fix two publishing issues:
 # timeouts caused by npm 2FA settings: 
[https://github.com/lerna/lerna/issues/1137]
 # silent failure publishing the main apache-arrow module due to "dist" key in 
that module's generated package.json





[jira] [Created] (ARROW-1903) [JS] Fix typings consuming apache-arrow module when noImplicitAny is false

2017-12-07 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-1903:
--

 Summary: [JS] Fix typings consuming apache-arrow module when 
noImplicitAny is false
 Key: ARROW-1903
 URL: https://issues.apache.org/jira/browse/ARROW-1903
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Affects Versions: 0.8.0
Reporter: Paul Taylor
Assignee: Paul Taylor


The TypeScript compiler has a few bugs that raise compiler errors when valid 
strict-mode code is compiled with some of the strict-mode settings disabled. 
Since we ship the TS source code in the main `apache-arrow` npm module, 
consumers will encounter the following TypeScript compiler errors under these 
conditions:

{code}
# --strictNullChecks=true, --noImplicitAny=false
vector/numeric.ts(57,17): error TS2322: Type 'number' is not assignable to type 
'never'.
vector/numeric.ts(61,35): error TS2322: Type 'number' is not assignable to type 
'never'.
vector/numeric.ts(63,18): error TS2322: Type '0' is not assignable to type 
'never'.
vector/virtual.ts(98,38): error TS2345: Argument of type 'TypedArray' is not 
assignable to parameter of type 'never'.
{code}

The fixes are minor, and I'll add a step in the unit tests to validate the 
build targets compile with different compilation flags than ours.

Related:
https://github.com/ReactiveX/IxJS/pull/167
https://github.com/Microsoft/TypeScript/issues/20299





[jira] [Created] (ARROW-1590) Flow TS Table method generics

2017-09-21 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-1590:
--

 Summary: Flow TS Table method generics
 Key: ARROW-1590
 URL: https://issues.apache.org/jira/browse/ARROW-1590
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor


The Table method generics should thread the Vector and value types through from 
the call site.





[jira] [Created] (ARROW-1549) [JS] Integrate auto-generated Arrow test files

2017-09-17 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-1549:
--

 Summary: [JS] Integrate auto-generated Arrow test files
 Key: ARROW-1549
 URL: https://issues.apache.org/jira/browse/ARROW-1549
 Project: Apache Arrow
  Issue Type: Test
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor
Priority: Minor


[~wesmckinn] mentioned the C++ and Java versions have tests that validate the 
respective Reader implementations are correct against Arrow files generated by 
the other implementation. Adding these tests to the JS implementation is 
important, and should be trivially easy.





[jira] [Created] (ARROW-1544) [JS] Export Vector type definitions

2017-09-15 Thread Paul Taylor (JIRA)
Paul Taylor created ARROW-1544:
--

 Summary: [JS] Export Vector type definitions
 Key: ARROW-1544
 URL: https://issues.apache.org/jira/browse/ARROW-1544
 Project: Apache Arrow
  Issue Type: Bug
  Components: JavaScript
Reporter: Paul Taylor
Assignee: Paul Taylor
Priority: Minor
 Fix For: 0.7.0


We should export the Vector type definitions on the main Arrow export.


