It probably pays to have at least two issues: one for porting the diff, and
the existing one I made.
On Wed, Mar 11, 2020 at 1:45 AM Ji Liu wrote:
> Hi Micah,
> Thanks for your feedback. You have opened an issue for Google's Truth [1]
> and it was assigned to me; I'll try to use it.
>
> Thanks,
>
Hi Evan,
Seems like we are mostly on the same page. Some more notes below.
> For example, encoding nulls in dictionary values helps reduce the need
for both bitmap storage and lookup.
I'm not sure if this was provided as an example of something to
add, but I believe this is already s
Maarten, I don't expect regressions for flat cases (I'm going to try to run
a benchmark comparison tonight).
In terms of the flag, I'm more concerned about some corner case I didn't
think of in testing or a workload that for some reason is better with the
prior code. If either of these arise I woul
Kouhei Sutou created ARROW-8109:
---
Summary: [Packaging][APT] Drop support for Ubuntu Disco
Key: ARROW-8109
URL: https://issues.apache.org/jira/browse/ARROW-8109
Project: Apache Arrow
Issue Type:
Liya Fan created ARROW-8108:
---
Summary: [Java] Extract a common interface for dictionary encoders
Key: ARROW-8108
URL: https://issues.apache.org/jira/browse/ARROW-8108
Project: Apache Arrow
Issue Ty
Kouhei Sutou created ARROW-8107:
---
Summary: [Packaging][APT] Use HTTPS for LLVM APT repository for
Debian GNU/Linux stretch
Key: ARROW-8107
URL: https://issues.apache.org/jira/browse/ARROW-8107
Project:
Wes McKinney created ARROW-8106:
---
Summary: [Python] Builds on master broken by pandas 1.0.2 release
Key: ARROW-8106
URL: https://issues.apache.org/jira/browse/ARROW-8106
Project: Apache Arrow
I
Daniel Nugent created ARROW-8105:
Summary: [Python] pyarrow.array segfaults when passed masked array
with shrunken mask
Key: ARROW-8105
URL: https://issues.apache.org/jira/browse/ARROW-8105
Project: A
Kouhei Sutou created ARROW-8104:
---
Summary: [C++] Don't install bundled Thrift
Key: ARROW-8104
URL: https://issues.apache.org/jira/browse/ARROW-8104
Project: Apache Arrow
Issue Type: Improvement
* What kind of devops tooling would be appropriate to provision and
manage the instances, scaling up and down based on need?
* What CI/CD platform would be appropriate to dispatch work to the
cloud nodes (taking into consideration the high costs of sysadmin, and
seeking to minimize nodes sitting un
Hi Soojin,
> Why have the timezone info in the TimeStampMilliTZHolder if it's never
set?
My guess is this is an oversight (it should probably be removed).
> - What if each of the rows have different TZ info? Are we to shift to
the tz on the TimeStampMilliTZVector on the write?
Currently there is
Cc Anthony
On Thu, Mar 12, 2020 at 5:27 PM Soojin Jeong wrote:
> Hi team,
>
> I would like to know what the recommended way of using TimeStampMilliTZ
> is.
>
> I see that
> - TimeStampMilliTZVector contains timestamp arrow type with timezone
> - TimeStampMilliTZVector's set doesn't actually pass
Great work! GitHub Actions has been a huge boon to the project on so
many fronts. Let's hope that GitHub / Microsoft keep up the free open
source project resources.
On Thu, Mar 12, 2020 at 1:40 PM Neal Richardson
wrote:
>
> Thanks Krisztián!
>
> On Thu, Mar 12, 2020 at 11:07 AM Krisztián Szűcs
>
Neal Richardson created ARROW-8103:
--
Summary: [R] Make default Linux build more minimal
Key: ARROW-8103
URL: https://issues.apache.org/jira/browse/ARROW-8103
Project: Apache Arrow
Issue Type
Krisztian Szucs created ARROW-8102:
--
Summary: [Dev] Crossbow's version detection doesn't work in the
comment bot's scenario
Key: ARROW-8102
URL: https://issues.apache.org/jira/browse/ARROW-8102
Proje
David Li created ARROW-8101:
---
Summary: [FlightRPC][Java] Can't read/write only an empty null
array
Key: ARROW-8101
URL: https://issues.apache.org/jira/browse/ARROW-8101
Project: Apache Arrow
Issue
I've never used cast(). I've converted Python datetimes to pa.timestamp(s)
using:
pyarrow.array(obj, type=None, mask=None, size=None, from_pandas=None, bool
safe=True, MemoryPool memory_pool=None)
where type is pa.timestamp("ms")
-Original Message-
From: paul hess (Jira)
Sent: Thur
paul hess created ARROW-8100:
Summary: timestamp[ms] and date64 data types not working as
expected on write
Key: ARROW-8100
URL: https://issues.apache.org/jira/browse/ARROW-8100
Project: Apache Arrow
David Li created ARROW-8099:
---
Summary: [Integration] archery integration --with-LANG flags don't
work
Key: ARROW-8099
URL: https://issues.apache.org/jira/browse/ARROW-8099
Project: Apache Arrow
Is
Kevin Conaway created ARROW-8098:
Summary: [go] Checkptr Failures on Go 1.14
Key: ARROW-8098
URL: https://issues.apache.org/jira/browse/ARROW-8098
Project: Apache Arrow
Issue Type: Bug
Krisztian Szucs created ARROW-8097:
--
Summary: [Dev] Comment bot's crossbow command acts on the master
branch
Key: ARROW-8097
URL: https://issues.apache.org/jira/browse/ARROW-8097
Project: Apache Arro
Thanks Krisztián!
On Thu, Mar 12, 2020 at 11:07 AM Krisztián Szűcs
wrote:
> Hi,
>
> Since the Ursa-labs machines are down, the @ursabot comment
> bot was not operational. Luckily, GitHub Actions is able to
> listen on more kinds of GitHub events, like issue_comment,
> so I've ported [1] the comment
Hi,
Since the Ursa-labs machines are down, the @ursabot comment
bot was not operational. Luckily, GitHub Actions is able to
listen on more kinds of GitHub events, like issue_comment,
so I've ported [1] the comment bot to work without the buildbot
buildmaster.
So the comment bot is available again wit
Prudhvi Porandla created ARROW-8096:
---
Summary: [C++][Gandiva] Create null node of Interval type
Key: ARROW-8096
URL: https://issues.apache.org/jira/browse/ARROW-8096
Project: Apache Arrow
I made several tickets for the failures, ARROW-8091 - 8095, and have a
patch up for 8091. I did not ticket conda-win, centos-8, debian-stretch, or
gandiva-jar. Seems like they may be flaky or already resolved.
Neal
On Wed, Mar 11, 2020 at 5:35 PM Crossbow wrote:
>
> Arrow Build Report for Job
Neal Richardson created ARROW-8095:
--
Summary: [CI][Crossbow] Nightly turbodbc job fails
Key: ARROW-8095
URL: https://issues.apache.org/jira/browse/ARROW-8095
Project: Apache Arrow
Issue Type
Neal Richardson created ARROW-8094:
--
Summary: [CI][Crossbow] Nightly valgrind test fails
Key: ARROW-8094
URL: https://issues.apache.org/jira/browse/ARROW-8094
Project: Apache Arrow
Issue Typ
Neal Richardson created ARROW-8093:
--
Summary: [CI][Crossbow] Pandas integration test fails
Key: ARROW-8093
URL: https://issues.apache.org/jira/browse/ARROW-8093
Project: Apache Arrow
Issue T
Neal Richardson created ARROW-8092:
--
Summary: [CI][Crossbow] OSX wheels fail on bundled bzip2
Key: ARROW-8092
URL: https://issues.apache.org/jira/browse/ARROW-8092
Project: Apache Arrow
Issu
Neal Richardson created ARROW-8091:
--
Summary: [CI][Crossbow] Fix nightly homebrew and R failures
Key: ARROW-8091
URL: https://issues.apache.org/jira/browse/ARROW-8091
Project: Apache Arrow
I
Wes McKinney created ARROW-8090:
---
Summary: [C++][Compute] Implement stateful TopK operator node
Key: ARROW-8090
URL: https://issues.apache.org/jira/browse/ARROW-8090
Project: Apache Arrow
Issue
Krisztian Szucs created ARROW-8089:
--
Summary: [C++] Port the toolchain build from Appveyor to Github
Actions
Key: ARROW-8089
URL: https://issues.apache.org/jira/browse/ARROW-8089
Project: Apache Arro
Maarten -- AFAIK Micah's work only affects nested / non-flat column
paths, so flat data should not be impacted. Since we have a partial
implementation of writes for nested data (lists-of-lists and
structs-of-structs, but no mix of the two) that was the performance
difference I was referencing.
On
Hi Micah,
How does the performance change for “flat” schemas?
(particularly in the case of a large number of columns)
Thanks,
Maarten
> On Mar 11, 2020, at 11:53 PM, Micah Kornfield wrote:
>
> Another status update. I've integrated the level generation code with the
> parquet writing code [
hi Micah,
Great to hear about the progress, I'll help with code review.
FWIW, if the new code passes the existing unit tests I would be in
favor of deleting the old code so that we're fully invested in making
the new code suitably fast. Jump in with two feet, so to speak.
Thanks
Wes
On Wed, Mar
Joris Van den Bossche created ARROW-8088:
Summary: [C++][Dataset] Partition columns with specified
dictionary type result in all nulls
Key: ARROW-8088
URL: https://issues.apache.org/jira/browse/ARROW-8088
Joris Van den Bossche created ARROW-8087:
Summary: [C++][Dataset] Order of keys with HivePartitioning is
lost in resulting schema
Key: ARROW-8087
URL: https://issues.apache.org/jira/browse/ARROW-8087
Projjal Chanda created ARROW-8086:
-
Summary: [Java] Support writing decimal from big endian byte array
in UnionListWriter
Key: ARROW-8086
URL: https://issues.apache.org/jira/browse/ARROW-8086
Project:
Krisztian Szucs created ARROW-8085:
--
Summary: [Dev] Set JIRA ticket's status to in progress once a pull
request available
Key: ARROW-8085
URL: https://issues.apache.org/jira/browse/ARROW-8085
Project
Krisztian Szucs created ARROW-8084:
--
Summary: [Crossbow] Port crossbow to Archery and eliminate libgit2
dependency
Key: ARROW-8084
URL: https://issues.apache.org/jira/browse/ARROW-8084
Project: Apach
Kouhei Sutou created ARROW-8083:
---
Summary: [GLib] Add support for Peek() to GIOInputStream
Key: ARROW-8083
URL: https://issues.apache.org/jira/browse/ARROW-8083
Project: Apache Arrow
Issue Type