Re: [js] publishing new npm package

2018-11-19 Thread Wes McKinney
hi Tommy,

The project has to release first. You can have a look at the 0.4.0 milestone here

https://issues.apache.org/jira/projects/ARROW/versions/12342901

I understand that Paul has been working on some refactoring of the Array internals, and so a release has not happened pending that.

- Wes

O

[js] publishing new npm package

2018-11-19 Thread Tommy Guy
Hey team,

I noticed the package at https://www.npmjs.com/package/apache-arrow is 8 months old and lists version 0.3.1. The code lists v0.3.0 in package.json. Can we get a new package published to npm? It would also be good to clarify the versioning story before we do it.

Thanks!
Tommy Guy

Re: [Go] High memory usage on CSV read into table

2018-11-19 Thread Wes McKinney
That seems buggy then. There are only 4.125 bytes of overhead per string value on average (a 32-bit offset plus a valid bit).

On Mon, Nov 19, 2018 at 5:02 PM Daniel Harper wrote:
>
> Uncompressed
>
> $ ls -la concurrent_streams.csv
> -rw-r--r-- 1 danielharper 112M Nov 16 19:21 concurrent_streams.cs
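The 4.125-byte figure follows from the layout described in this message: a 32-bit offset (4 bytes) plus one validity bit (1/8 byte) per value. A minimal sketch of that arithmetic is below. The row count is the `wc -l` figure reported in this thread; the number of string columns is not given anywhere in the thread, so `ASSUMED_STRING_COLUMNS` is a purely hypothetical placeholder, and the function names are illustrative rather than part of any Arrow API.

```python
# Per-value bookkeeping overhead of a string column as described in the thread:
# one 32-bit offset plus one validity bit per value.
OFFSET_BITS = 32
VALIDITY_BITS = 1


def overhead_bytes_per_value() -> float:
    """Bytes of overhead per string value (offset + validity bit)."""
    return (OFFSET_BITS + VALIDITY_BITS) / 8


def total_overhead_bytes(num_rows: int, num_string_columns: int) -> float:
    """Overhead for all string values, excluding the string bytes themselves."""
    return num_rows * num_string_columns * overhead_bytes_per_value()


ROWS = 1_007_481             # line count reported by `wc -l` in the thread
ASSUMED_STRING_COLUMNS = 10  # hypothetical: the thread does not state the column count

if __name__ == "__main__":
    total = total_overhead_bytes(ROWS, ASSUMED_STRING_COLUMNS)
    print(f"{overhead_bytes_per_value()} bytes per value")
    print(f"{total / 2**20:.1f} MiB of overhead for {ASSUMED_STRING_COLUMNS} string columns")
```

Under that assumed column count the overhead comes to tens of megabytes, far below the multi-gigabyte heap reported in the thread, which is consistent with the overhead alone not explaining the memory usage.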

Re: [Go] High memory usage on CSV read into table

2018-11-19 Thread Daniel Harper
Uncompressed

$ ls -la concurrent_streams.csv
-rw-r--r-- 1 danielharper 112M Nov 16 19:21 concurrent_streams.csv
$ wc -l concurrent_streams.csv
1007481 concurrent_streams.csv

Daniel Harper
http://djhworld.github.io

On Mon, 19 Nov 2018 at 21:55, Wes McKinney wrote:
> I'm curious how the file

Re: [Go] High memory usage on CSV read into table

2018-11-19 Thread Wes McKinney
I'm curious how the file is only 100MB if it's producing ~6GB of strings in memory. Is it compressed?

On Mon, Nov 19, 2018 at 4:48 PM Daniel Harper wrote:
>
> Thanks,
>
> I've tried the new code and that seems to have shaved about 1GB of memory
> off, so the heap is about 8.84GB now, here is the u

Re: [Go] High memory usage on CSV read into table

2018-11-19 Thread Daniel Harper
Thanks,

I've tried the new code and that seems to have shaved about 1GB of memory off, so the heap is about 8.84GB now, here is the updated pprof output https://i.imgur.com/itOHqBf.png

It looks like the majority of allocations are in the memory.GoAllocator

(pprof) top
Showing nodes accounting fo

Re: Rust IPC and Integration Testing

2018-11-19 Thread Wes McKinney
Could you explain why having a separate crate would be a good idea?

On Mon, Nov 19, 2018 at 2:40 PM Andy Grove wrote:
>
> A question has been raised on this PR as to whether we should publish a
> separate crate for the format/ipc generated code. I think there might be
> some merit in this and want

Re: Rust IPC and Integration Testing

2018-11-19 Thread Andy Grove
A question has been raised on this PR as to whether we should publish a separate crate for the format/ipc generated code. I think there might be some merit in this and wanted to raise it here to see what everyone thinks. This would mean having two directories under the rust directory ... one for a

[jira] [Created] (ARROW-3840) [C++] Run fuzzer tests with docker-compose

2018-11-19 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3840:

Summary: [C++] Run fuzzer tests with docker-compose
Key: ARROW-3840
URL: https://issues.apache.org/jira/browse/ARROW-3840
Project: Apache Arrow
Issue Type: Imp

[jira] [Created] (ARROW-3839) [Rust] Add ability to infer schema in CSV reader

2018-11-19 Thread Andy Grove (JIRA)
Andy Grove created ARROW-3839:

Summary: [Rust] Add ability to infer schema in CSV reader
Key: ARROW-3839
URL: https://issues.apache.org/jira/browse/ARROW-3839
Project: Apache Arrow
Issue Type: I

[jira] [Created] (ARROW-3838) Implement CSV Writer

2018-11-19 Thread Andy Grove (JIRA)
Andy Grove created ARROW-3838:

Summary: Implement CSV Writer
Key: ARROW-3838
URL: https://issues.apache.org/jira/browse/ARROW-3838
Project: Apache Arrow
Issue Type: Improvement
Compone

[jira] [Created] (ARROW-3837) [C++] gflags link errors on Windows

2018-11-19 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3837:

Summary: [C++] gflags link errors on Windows
Key: ARROW-3837
URL: https://issues.apache.org/jira/browse/ARROW-3837
Project: Apache Arrow
Issue Type: Bug

[jira] [Created] (ARROW-3836) [C++] Add PREFIX option to ADD_ARROW_BENCHMARK

2018-11-19 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3836:

Summary: [C++] Add PREFIX option to ADD_ARROW_BENCHMARK
Key: ARROW-3836
URL: https://issues.apache.org/jira/browse/ARROW-3836
Project: Apache Arrow
Issue Type:

Re: [Go] High memory usage on CSV read into table

2018-11-19 Thread Sebastien Binet
hi Daniel,

On Sun, Nov 18, 2018 at 10:17 PM Daniel Harper wrote:
> Sorry just realised SVG doesn't work.
>
> PNG of the pprof can be found here: https://i.imgur.com/BVXv1Jm.png
>
>
> Daniel Harper
> http://djhworld.github.io
>
>
> On Sun, 18 Nov 2018 at 21:07, Daniel Harper wrote:
>
> > Wasn't s