Congratulations everyone!
On Mon, 28 May 2018 at 21:42 Li Jin wrote:
> Congrats everyone!
> On Mon, May 28, 2018 at 3:21 PM Jacques Nadeau wrote:
>
> > Woo!
> >
> > On Mon, May 28, 2018 at 4:50 PM, Wes McKinney wrote:
> >
> > > Congrats all! The journey continues
> > >
> > > On Mon, May 28,
Hi Everyone,
I've encountered a memory-mapping error when attempting to read a Parquet
file into a Pandas DataFrame. It seems to happen intermittently; so far
I've only encountered it once. In my case the pq.read_table code is being
invoked in a Linux docker container. I had a look at the
2018 at 15:37 simba nyatsanga <simnyatsa...@gmail.com> wrote:
> Thanks all for the great feedback!
>
> Thanks Daniel for the sample data sets. I loaded them up and they're quite
> comparable in size to some of the data I'm dealing with. In my case the
> shapes range from 150
rows which can penalize systems that
> > expect
> > > to amortize column metadata over more data.
> > >
> > > This test might match your situation, but I would be leery of drawing
> > > overly broad conclusions from this single data point.
> > >
me through. Try uploading them somewhere and link
> to them in the mails. Attachments are always stripped on Apache
> mailing lists.
> Uwe
>
>
> On Wed, Jan 24, 2018, at 1:48 PM, simba nyatsanga wrote:
> > Hi Everyone,
> >
> > I did some benchmarking to compare the disk
Hi Everyone,
I did some benchmarking to compare the disk size performance when writing
Pandas DataFrames to parquet files using Snappy and Brotli compression. I
then compared these numbers with those of my current file storage solution.
In my current (non-Arrow+Parquet) solution, every column in
ht Arrow memory layout.
>
> - Wes
>
> On Mon, Jan 22, 2018 at 4:50 PM, simba nyatsanga <simnyatsa...@gmail.com>
> wrote:
> > Hi Uwe,
> >
> > Thank you very much for the detailed explanation. I have a much better
> > understanding now.
> >
> > C
Hi Uwe,
Thank you very much for the detailed explanation. I have a much better
understanding now.
Cheers
On Mon, 22 Jan 2018 at 19:37 Uwe L. Korn <uw...@xhochy.com> wrote:
> Hello Simba,
>
> find the answers inline.
>
> On Mon, Jan 22, 2018, at 7:29 AM, simba nyatsanga w
Hi Everyone,
I've got two questions that I'd like help with:
1. Pandas and NumPy arrays can hold multiple types in a sequence, e.g. a
float and a string, by using dtype=object. From what I gather, Arrow
arrays enforce a uniform type depending on the type of the first
encountered element in a
- Wes
>
> On Thu, Jan 18, 2018 at 2:10 PM, simba nyatsanga <simnyatsa...@gmail.com>
> wrote:
> > Hi Wes,
> >
> > Great! Thanks for the pointer. From what I gather this is a fundamental
> and
> > deliberate design decision. Would I be correct in saying the
ter/cpp/src/arrow/python/arrow_to_pandas.cc#L541
>
> - Wes
>
> On Thu, Jan 18, 2018 at 1:26 PM, simba nyatsanga <simnyatsa...@gmail.com>
> wrote:
>
> > Good day everyone,
> >
> > I noticed what looks like type inference happening after persisting a
> >
Good day everyone,
I noticed what looks like type inference happening after persisting a
pandas DataFrame where one of the column values is a list. When I load up
the DataFrame again and do df.to_dict(), the value is no longer a list but
a numpy array. I dug through functions in the
epending on how development is progressing.
>
> - Wes
>
> On Sun, Jan 14, 2018 at 9:19 AM, simba nyatsanga <simnyatsa...@gmail.com>
> wrote:
> > Thanks a lot. I see that there's a PR that's been opened to resolve the
> > encoding issue - https://github.com/apache/arrow/p
Sun, Jan 14, 2018, at 2:42 PM, simba nyatsanga wrote:
> > Amazing, thanks Uwe!
> >
> > I was able to build pyarrow successfully for python 2.7 using your
> > workaround. I appreciate that you've got a possible solution for that too.
> >
> > Besides the PR getting
"/Users/simba/Projects/personal/oss/arrow/python/build/
> > temp.macosx-10.9-x86_64-2.7/CMakeFiles/CMakeOutput.log".
> > See also "/Users/simba/Projects/personal/oss/arrow/python/build/
> > temp.macosx-10.9-x86_64-2.7/CMakeFiles/CMakeError.log".
> > error:
> > c
-r--r--  1 simba  staff  3.0M Jan 11 18:45
libparquet.a
lrwxr-xr-x  1 simba  staff   18B Jan 11 18:45 libparquet.dylib ->
libparquet.1.dylib
Just to clarify also, I'm attempting to build the wheel from within
*arrow/python* folder where the *setup.py* file is.
Thanks again for the
> Are you following development instructions in
>
> http://arrow.apache.org/docs/python/development.html#developing-on-linux-and-macos
> or something else?
>
> - Wes
>
> On Wed, Jan 10, 2018 at 11:20 AM, simba nyatsanga
> <simnyatsa...@gmail.com> wrote:
> > Hi,
Hi,
I've created a python 2.7 virtualenv in my attempt to build the pyarrow
project. But I'm having trouble running one of the commands as specified in the
development docs on Github, specifically this command:
cd arrow/python
python setup.py build_ext --build-type=$ARROW_BUILD_TYPE \