Arrow for Redshift Spectrum

2017-09-28 Thread Colin Nichols
Hi all, I would love to get some feedback on a little project I put together. I paired my company's Parquet conversion routines (a wrapper around pyarrow) with SQLAlchemy's table reflection capabilities to make an "easy mode" Redshift --> Redshift Spectrum converter. You can find it here:

Re: ArrowFileReader failing to read bytes written to Java output stream

2017-09-28 Thread Andrew Pham (BLOOMBERG/ 731 LEX)
That did the trick, thanks! I guess, to close things off, given an ArrowRecordBatch, how can we dump the contents into instantiated Java classes (essentially deserializing the result into the inputs/list of objects that we fed into the ArrowFileWriter before)? The record batch consists of

Re: [VOTE] Release Apache Arrow 0.7.1 - RC1

2017-09-28 Thread Phillip Cloud
+1 (non-binding)
* Verified signatures with release verification script
* Ran C++, Python + parquet support unit tests
On Thu, Sep 28, 2017 at 1:58 PM Gang Wang wrote:
> +1 Looks good to me.
>
> Gary
>
> On 2017-09-27 07:01, Wes McKinney wrote:
> > Hello

Re: ArrowFileReader failing to read bytes written to Java output stream

2017-09-28 Thread Bryan Cutler
Oh, I think I see the problem. You need to put "writer.close();" in a finally block instead of a catch block in ArrowPayloadIterator. That is when the schema actually gets written to the stream. On Thu, Sep 28, 2017 at 9:27 AM, Andrew Pham (BLOOMBERG/ 731 LEX) < apha...@bloomberg.net> wrote:
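
The point about the finally block can be illustrated with a minimal sketch. It does not use the Arrow API: `FooterWriter` is a hypothetical stand-in for any writer that emits its trailing metadata only when `close()` is called, and the class and method names are assumptions made for illustration. With `close()` in a catch block the trailer is written only when an exception occurs; with `close()` in a finally block it is written on the normal path as well.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class CloseInFinally {

    /** Hypothetical stand-in for a file-format writer; this is not an Arrow class. */
    static final class FooterWriter {
        private final OutputStream out;

        FooterWriter(OutputStream out) {
            this.out = out;
        }

        void writeRow(String row) throws IOException {
            out.write((row + "\n").getBytes(StandardCharsets.UTF_8));
        }

        // The trailing metadata is emitted only here, so a skipped close()
        // leaves the stream without it.
        void close() throws IOException {
            out.write("FOOTER".getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }

    /** close() in a finally block: runs whether or not writing succeeded. */
    public static String writeAll(String... rows) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        FooterWriter writer = new FooterWriter(sink);
        try {
            for (String row : rows) {
                writer.writeRow(row);
            }
        } finally {
            writer.close();
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    /** close() in a catch block: never reached when writing succeeds. */
    public static String writeAllBuggy(String... rows) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        FooterWriter writer = new FooterWriter(sink);
        try {
            for (String row : rows) {
                writer.writeRow(row);
            }
        } catch (IOException e) {
            writer.close();
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAll("a", "b").endsWith("FOOTER"));      // true
        System.out.println(writeAllBuggy("a", "b").endsWith("FOOTER")); // false
    }
}
```

When the writer type implements `AutoCloseable`, a try-with-resources statement gives the same guarantee with less code.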

Re: [VOTE] Release Apache Arrow 0.7.1 - RC1

2017-09-28 Thread Arun K. Subramaniyan
+1
On Thu, Sep 28, 2017 at 1:58 PM Gang Wang wrote:
> +1 Looks good to me.
>
> Gary
>
> On 2017-09-27 07:01, Wes McKinney wrote:
> > Hello all,
> >
> > I'd like to propose the 2nd release candidate (rc1) of Apache
> > Arrow version 0.7.1. This is a bugfix

Re: [VOTE] Release Apache Arrow 0.7.1 - RC1

2017-09-28 Thread Gang Wang
+1 Looks good to me.
Gary
On 2017-09-27 07:01, Wes McKinney wrote:
> Hello all,
>
> I'd like to propose the 2nd release candidate (rc1) of Apache
> Arrow version 0.7.1. This is a bugfix release from 0.7.0. The only
> difference between rc1 and rc0 was fixing an issue

Re: ArrowFileReader failing to read bytes written to Java output stream

2017-09-28 Thread Andrew Pham (BLOOMBERG/ 731 LEX)
Ah, looks like it was stripped for some reason. Check out:
https://pastebin.com/4abb2txs
https://pastebin.com/53UnimQ6
https://pastebin.com/KwmP7Ens
My hunch is that I'm writing to the output stream with an incorrectly determined Arrow Schema, and so the reader can't pick it up. If that's the

Re: [VOTE] Release Apache Arrow 0.7.1 - RC1

2017-09-28 Thread Uwe L. Korn
+1 (binding)
Using dev/release/verify-release-candidate.sh I
* Verified signature, checksum on Linux
* Ran C++, Python (+ Parquet support), Plasma build
Could not verify c_glib due to "in `': uninitialized constant GI (NameError)" in the tests. But I guess this is due to an incorrect setup