Hi all,
I came across a behavior change from 0.17.1 when comparing array scalar
values with Python objects. This used to work in 0.17.1 and before, but in
1.0.0 equals always returns False. I saw there was a previous discussion on
Python equality semantics, but I am not sure if the conclusion is the
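The behavior described above can be illustrated with a toy sketch (this is not pyarrow's actual implementation; the `Scalar` class below is purely illustrative) of why strictly typed scalar equality returns False against a bare Python object, and why unwrapping the value first restores plain Python semantics:

```python
# Toy illustration (NOT pyarrow's implementation) of strict scalar
# equality: __eq__ only succeeds against another Scalar of the same
# type, so comparison with a plain Python object returns False.

class Scalar:
    def __init__(self, value, type_name):
        self.value = value
        self.type_name = type_name

    def __eq__(self, other):
        # Strict semantics: a bare Python int/str never compares equal.
        if not isinstance(other, Scalar):
            return False
        return self.type_name == other.type_name and self.value == other.value

    def as_py(self):
        # Unwrapping first restores plain Python equality semantics.
        return self.value

s = Scalar(1, "int64")
print(s == 1)                    # False: strict comparison against a Python int
print(s.as_py() == 1)            # True: unwrap, then compare
print(s == Scalar(1, "int64"))   # True: same wrapper type and value
```

In pyarrow itself, calling `.as_py()` on a scalar before comparing is the usual way to get plain Python equality regardless of version.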
>
> Sounds fine to me. I guess one question is what needs to be formalized
> in the Schema.fbs files or elsewhere in the columnar format
> documentation (and we will need to hold an associated vote for that I
> think)
Yes, I think we will need to hold a vote for it. Since this is essentially
a
I see there's a bunch of additional aggregation code in Dremio that
might serve as inspiration (some of which is related to distributed
aggregation, so may not be relevant)
https://github.com/dremio/dremio-oss/tree/master/sabot/kernel/src/main/java/com/dremio/sabot/op/aggregate
Maybe Andy or one
Sounds fine to me. I guess one question is what needs to be formalized
in the Schema.fbs files or elsewhere in the columnar format
documentation (and we will need to hold an associated vote for that I
think)
On Mon, Aug 3, 2020 at 11:30 PM Micah Kornfield wrote:
>
> Given no objections, we'll go
Hi Kenta,
Yes, I think it only makes sense to implement this in the context of
the query engine project. Here's a list of assorted thoughts about it:
* I have been mentally planning to follow the Vectorwise-type query
engine architecture that's discussed in [1] [2] and many other
academic
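The batch-at-a-time model that a Vectorwise-style engine is built on can be sketched roughly as follows (illustrative Python, not the proposed engine; operator names and batch size are made up for the example). Each operator consumes and produces small fixed-size batches rather than single rows, which amortizes interpretation overhead across many values:

```python
# Minimal sketch of batch-at-a-time ("vectorized") execution.
# Operators are chained as generators over batches of values.

BATCH_SIZE = 4  # real engines typically use on the order of ~1024 values

def scan(values):
    """Source operator: yield the input in fixed-size batches."""
    for i in range(0, len(values), BATCH_SIZE):
        yield values[i:i + BATCH_SIZE]

def filter_op(batches, predicate):
    """Apply a predicate to each batch, emitting only matching values."""
    for batch in batches:
        selected = [v for v in batch if predicate(v)]
        if selected:
            yield selected

def sum_op(batches):
    """Terminal aggregate: fold all batches into a single value."""
    total = 0
    for batch in batches:
        total += sum(batch)
    return total

data = list(range(10))
result = sum_op(filter_op(scan(data), lambda v: v % 2 == 0))
print(result)  # 0 + 2 + 4 + 6 + 8 = 20
```

The pipeline is pull-based here for brevity; a real engine would operate on columnar batches and use type-specialized kernels rather than Python lists.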
Attendees:
Projjal Chanda
Fred Gan
Andy Grove
Todd Hendricks
Jörn Horstmann
Ben Kietzman
Rok Mihevc
Neal Richardson
Paul Taylor
Andrew Wieteska
Discussion
* 1.0.1
* Andy: Rust packaging issue, need to test on published crate
* Timing: week of August 17
* Bug in dictionary batches in device
Hi folks,
Red Arrow, the Ruby binding of Arrow GLib, implements grouped aggregation
features for RecordBatch and Table. Because these features are written in
Ruby, they are too slow for large data sets; we need to make them much
faster.
To improve their calculation speed, they should be
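The grouped aggregation being discussed can be sketched as a hash-based group-by (a toy Python sketch, not the Red Arrow / Arrow GLib implementation): one pass buckets values by key while folding them into the running aggregate for that key.

```python
from collections import defaultdict

# Hash-based grouped aggregation over parallel key/value columns.
# Illustrative only; a native implementation would work on columnar
# buffers with type-specialized kernels instead of Python objects.

def group_sum(keys, values):
    """Return {key: sum of values at rows having that key}."""
    sums = defaultdict(int)
    for k, v in zip(keys, values):
        sums[k] += v
    return dict(sums)

keys = ["a", "b", "a", "c", "b"]
values = [1, 2, 3, 4, 5]
print(group_sum(keys, values))  # {'a': 4, 'b': 7, 'c': 4}
```

Moving exactly this loop from interpreted Ruby into native code is what yields the speedup: the per-row dispatch cost dominates for large inputs.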
> I will have a closer look and comment most likely next week.
Thank you!
>
> Unfortunately, having code developed in external repositories increases the
> complexity of importing that code back into the Apache project. Not sure if
> you’re interested in preemptively following the project’s
I will have a closer look and comment most likely next week.
Unfortunately, having code developed in external repositories increases the
complexity of importing that code back into the Apache project. Not sure if
you’re interested in preemptively following the project’s style guide (file
naming,
I also am not sure there is a good case for a new built-in type since it
introduces a good deal of complexity, particularly when there is the
extension type option. We’ve been living with 64-bit nanoseconds in pandas
for a decade, for example (and without the option for lower resolutions!!),
and
Wes & crew,
Congratulations and thank you for the successful 1.0 rollout; it is certainly
making a huge difference in my day job!
Is it a good time now to revive the conversation below? (and
https://github.com/apache/arrow/pull/7548 )
I have also gone ahead and released a prototype that covers
Arrow Build Report for Job nightly-2020-08-05-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-08-05-0
Failed Tasks:
- conda-linux-gcc-py36-cpu:
URL:
Hi Liya,
Thanks for your careful review. It is a typo; the order of getBuffers is
wrong.
Fan Liya wrote on Wed, Aug 5, 2020 at 2:14 PM:
> Hi Ji,
>
> IMO, for the correct order, the validity buffer should precede the offset
> buffer (e.g. this is the order used by BaseVariableWidthVector &
>
Hi Ji,
IMO, for the correct order, the validity buffer should precede the offset
buffer (e.g. this is the order used by BaseVariableWidthVector &
BaseLargeVariableWidthVector).
In ListVector#getBuffers, the offset buffer precedes the validity buffer,
so I am a little confused why you say the
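The buffer order under discussion (validity bitmap first, then offsets, then value data, as the columnar format specifies for variable-size layouts) can be illustrated with a small Python sketch. This is not the Java implementation; it just builds the three buffers for a list of optional strings to show the expected ordering:

```python
# Sketch of the buffer layout for a variable-width (e.g. string) array
# in the Arrow columnar format: validity, then offsets, then data.

def string_array_buffers(strings):
    """Build (validity_bits, offsets, data) for a list of optional strings."""
    validity = [0 if s is None else 1 for s in strings]
    offsets = [0]
    data = bytearray()
    for s in strings:
        if s is not None:
            data.extend(s.encode("utf-8"))
        offsets.append(len(data))  # null slots repeat the previous offset
    return validity, offsets, bytes(data)

validity, offsets, data = string_array_buffers(["ab", None, "cde"])
print(validity)  # [1, 0, 1]
print(offsets)   # [0, 2, 2, 5]
print(data)      # b'abcde'
```

Returning the buffers in this order is what getBuffers should do as well, matching BaseVariableWidthVector.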