Re: Apache Arrow .NET implementation

2018-10-11 Thread Phillip Cloud
+1. I agree that .NET + Arrow is a good match. Generally speaking, I'm not sure there are many systems programming languages whose communities wouldn't benefit from an Arrow implementation. I do think it's worth discussing what to do about the growing numbers of implementations, but that

Re: parquet-column_scanner-test failure

2018-10-11 Thread Wes McKinney
You can use https://gist.github.com/, too. On Thu, Oct 11, 2018 at 11:15 AM Uwe L. Korn wrote: > > Hello Tanveer, > > I cannot see anything with that link besides the landing page of the upload > provider. Maybe open a JIRA with your problems and attach the logs there: >

[jira] [Created] (ARROW-3499) [R] Expose arrow::ipc::Message type

2018-10-11 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3499: --- Summary: [R] Expose arrow::ipc::Message type Key: ARROW-3499 URL: https://issues.apache.org/jira/browse/ARROW-3499 Project: Apache Arrow Issue Type: New

Re: [DRAFT] Apache Arrow board report October 2018

2018-10-11 Thread Wes McKinney
OK, I have updated. If others could comment on the .NET thread, we can start a vote soon there ## Description: Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data,

Re: [DRAFT] Apache Arrow board report October 2018

2018-10-11 Thread Uwe L. Korn
You could also mention that we are about to receive a C# donation. Otherwise this looks good. Uwe On Thu, Oct 11, 2018, at 6:05 PM, Wes McKinney wrote: > ## Description: > > Apache Arrow is a cross-language development platform for in-memory data. It > specifies a standardized

[DRAFT] Apache Arrow board report October 2018

2018-10-11 Thread Wes McKinney
## Description: Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational

Re: [JAVA] Arrow performance measurement

2018-10-11 Thread Li Jin
I have created these as the first step. Animesh, feel free to submit a PR for these. I will look into your micro benchmarks soon. 1. ARROW-3497 [Java] Add user documentation for achieving better performance 2.

[jira] [Created] (ARROW-3498) [R] Make IPC APIs consistent

2018-10-11 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3498: --- Summary: [R] Make IPC APIs consistent Key: ARROW-3498 URL: https://issues.apache.org/jira/browse/ARROW-3498 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-3497) [Java] Add user documentation for achieving better performance

2018-10-11 Thread Li Jin (JIRA)
Li Jin created ARROW-3497: - Summary: [Java] Add user documentation for achieving better performance Key: ARROW-3497 URL: https://issues.apache.org/jira/browse/ARROW-3497 Project: Apache Arrow Issue

[jira] [Created] (ARROW-3496) Add microbenchmark code to Java

2018-10-11 Thread Li Jin (JIRA)
Li Jin created ARROW-3496: - Summary: Add microbenchmark code to Java Key: ARROW-3496 URL: https://issues.apache.org/jira/browse/ARROW-3496 Project: Apache Arrow Issue Type: Task

[jira] [Created] (ARROW-3494) [C++] re2 conda-forge package not working in toolchain

2018-10-11 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3494: --- Summary: [C++] re2 conda-forge package not working in toolchain Key: ARROW-3494 URL: https://issues.apache.org/jira/browse/ARROW-3494 Project: Apache Arrow

[jira] [Created] (ARROW-3493) Document BOUNDS_CHECKING_ENABLED

2018-10-11 Thread Li Jin (JIRA)
Li Jin created ARROW-3493: - Summary: Document BOUNDS_CHECKING_ENABLED Key: ARROW-3493 URL: https://issues.apache.org/jira/browse/ARROW-3493 Project: Apache Arrow Issue Type: Task

Re: parquet-column_scanner-test failure

2018-10-11 Thread Uwe L. Korn
Hello Tanveer, I cannot see anything with that link besides the landing page of the upload provider. Maybe open a JIRA with your problems and attach the logs there: https://issues.apache.org/jira/projects/ARROW/issues Uwe On Thu, Oct 11, 2018, at 3:09 PM, Tanveer Ahmad - EWI wrote: > Hi Uwe,

[jira] [Created] (ARROW-3492) [C++] Build jemalloc in parallel

2018-10-11 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-3492: - Summary: [C++] Build jemalloc in parallel Key: ARROW-3492 URL: https://issues.apache.org/jira/browse/ARROW-3492 Project: Apache Arrow Issue Type:

Re: [JAVA] Arrow performance measurement

2018-10-11 Thread Li Jin
Hi Wes and Animesh, Thanks for the analysis and discussion. I am happy to look into this. I will create some Jiras soon. Li On Thu, Oct 11, 2018 at 5:49 AM Wes McKinney wrote: > hey Animesh, > > Thank you for doing this analysis. If you'd like to share some of the > analysis more broadly

[jira] [Created] (ARROW-3491) [C++] Experiment with split DWARF

2018-10-11 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-3491: - Summary: [C++] Experiment with split DWARF Key: ARROW-3491 URL: https://issues.apache.org/jira/browse/ARROW-3491 Project: Apache Arrow Issue Type: Wish

[jira] [Created] (ARROW-3490) [R] streaming arrow objects to output streams

2018-10-11 Thread Romain François (JIRA)
Romain François created ARROW-3490: -- Summary: [R] streaming arrow objects to output streams Key: ARROW-3490 URL: https://issues.apache.org/jira/browse/ARROW-3490 Project: Apache Arrow Issue

RE: parquet-column_scanner-test failure

2018-10-11 Thread Tanveer Ahmad - EWI
Hi Uwe, Here it is: https://unsee.cc/9f88adf1/ After commenting out all parquet tests, I was able to build it. Regards, -- Tanveer Ahmad PhD Student Computer Engineering Laboratory, Department of Quantum & Computer Engineering EEMCS, TU Delft, The Netherlands

Re: parquet-column_scanner-test failure

2018-10-11 Thread Uwe L. Korn
Hello Tanveer, your attachment did not come through as attachments are not allowed on the mailing list. Can you post it somewhere? Uwe On Thu, Oct 11, 2018, at 12:33 PM, Tanveer Ahmad - EWI wrote: > Hi, > > I enabled following flags and got error in the attachment (parquet- >

[jira] [Created] (ARROW-3489) [Gandiva] Support for in expressions

2018-10-11 Thread Praveen Kumar Desabandu (JIRA)
Praveen Kumar Desabandu created ARROW-3489: -- Summary: [Gandiva] Support for in expressions Key: ARROW-3489 URL: https://issues.apache.org/jira/browse/ARROW-3489 Project: Apache Arrow

[jira] [Created] (ARROW-3488) [Packaging] Separate crossbow task definition files for packaging and tests

2018-10-11 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-3488: -- Summary: [Packaging] Separate crossbow task definition files for packaging and tests Key: ARROW-3488 URL: https://issues.apache.org/jira/browse/ARROW-3488

[jira] [Created] (ARROW-3487) simplify NULL_IF_NULL functions that can return errors

2018-10-11 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-3487: - Summary: simplify NULL_IF_NULL functions that can return errors Key: ARROW-3487 URL: https://issues.apache.org/jira/browse/ARROW-3487 Project: Apache Arrow

Re: Gandiva snapshot releases

2018-10-11 Thread Praveen Kumar
Hi All, I spent some time today understanding crossbow and it looks great! To unblock ourselves immediately, we are going to do the Ubuntu deploy first, followed by the Mac deploy and the fat jar deployment. To confirm our understanding, we would be doing the following: 1. Create a queue repo

Re: Apache Arrow .NET implementation

2018-10-11 Thread Antoine Pitrou
Ah, sorry. I think it would be good for the project to have a foot in the .NET ecosystem. I'm not able to comment more specifically, but the general idea sounds fine to me. Regards Antoine. On 11/10/2018 at 12:46, Wes McKinney wrote: > This is a topic that bears discussion but I don't

Re: Apache Arrow .NET implementation

2018-10-11 Thread Wes McKinney
This is a topic that bears discussion but I don't think it's related to whether we accept this donation of IP into the Apache project. Let's discuss that separately. On Thu, Oct 11, 2018, 12:39 PM Antoine Pitrou wrote: > > On 11/10/2018 at 12:34, Wes McKinney wrote: > > hi Antoine -- this is

Re: Apache Arrow .NET implementation

2018-10-11 Thread Antoine Pitrou
On 11/10/2018 at 12:34, Wes McKinney wrote: > hi Antoine -- this is C# .NET, not C++. That is what I understood. My question was whether we want to continue gathering all independent implementations (I'm assuming the .NET implementation doesn't rely on the C++ libs?) in a single repo.

Re: Apache Arrow .NET implementation

2018-10-11 Thread Wes McKinney
hi Antoine -- this is C# .NET, not C++. So this is a new implementation to stand alongside the other languages in the repo at present. I would hope that we can eventually add C# integration tests to the integration test matrix. - Wes On Thu, Oct 11, 2018 at 6:12 AM Antoine Pitrou wrote: > > > If

parquet-column_scanner-test failure

2018-10-11 Thread Tanveer Ahmad - EWI
Hi, I enabled the following flags and got the error in the attachment (parquet-column_scanner-test failure) while making arrow build 11. cmake .. -DCMAKE_BUILD_TYPE=Release -DARROW_PARQUET=ON -DARROW_PLASMA=ON -DARROW_PLASMA_JAVA_CLIENT=ON Any help in this regard? Thanks. Regards, -- Tanveer Ahmad PhD

Re: Apache Arrow .NET implementation

2018-10-11 Thread Antoine Pitrou
If it's an independent implementation, does it make sense to integrate it in the current C++ Arrow repository? AFAIK, the main point is to run integration tests. On the other hand, it doesn't scale very well to integrate all implementations in a single repo. It could still be under the Apache

Getting a fix in for ARROW-3343

2018-10-11 Thread Wes McKinney
hi folks, The failing test in ARROW-3343 has been causing non-deterministic CI build failures for multiple weeks now, e.g. https://travis-ci.org/apache/arrow/jobs/439490404 I suggest one of two recourses: * Fix underlying cause (seems to be a shutdown race condition of some kind) * Disable the

Re: [JAVA] Arrow performance measurement

2018-10-11 Thread Wes McKinney
hey Animesh, Thank you for doing this analysis. If you'd like to share some of the analysis more broadly e.g. on the Apache Arrow blog or social media, let us know. Seems like there might be a few follow ups here for the Arrow Java community: * Documentation about achieving better performance *

Re: Nightly tests for Arrow

2018-10-11 Thread Krisztián Szűcs
A confluence page sounds good to me. I'll create it. On Wed, Oct 10, 2018 at 8:06 PM Wes McKinney wrote: > How would you all like to manage this project? Maybe we should create > a Confluence wiki page to enumerate the different facets of the effort > and make sure we create JIRA issues to plot

Re: Apache Arrow .NET implementation

2018-10-11 Thread Wes McKinney
Do others have some opinions about this? I think it would be great to bring in folks from the .NET community. Since there is also a Parquet .NET library available (https://github.com/elastacloud/parquet-dotnet) there may be some interesting collaborations. Thanks, Wes On Tue, Oct 9, 2018 at 12:58