Re: [DISCUSS] Migration of parquet-cpp issues to GitHub

2024-06-22 Thread Rok Mihevc
ML > as the original discussion contains a lot of stuff and I didn't see > enough Parquet PMCs to reply to this topic. > > Best, > Gang > > On Thu, Jun 13, 2024 at 7:01 AM Rok Mihevc wrote: > > > > Perhaps we need a separate vote for non-parquet-cpp repos before th

Re: [DISCUSS] Migration of parquet-cpp issues to GitHub

2024-06-12 Thread Rok Mihevc
arate > vote for non-parquet-cpp repos before the action. > > Best, > Gang > > On Wed, Jun 12, 2024 at 10:40 AM Rok Mihevc wrote: > > > I have set up a script for the parquet-cpp migration (and also for > > migration of other parquet tickets in case we decide to go ah

Re: [DISCUSS] Migration of parquet-cpp issues to GitHub

2024-06-11 Thread Rok Mihevc
Rok On Fri, May 31, 2024 at 10:04 AM Rok Mihevc wrote: > Would we also want to add issue templates to encourage some structure? See > [1] for inspiration. > > [1] https://github.com/apache/arrow/blob/main/.github/ISSUE_TEMPLATE > > On Fri, May 31, 2024 at 3:50 AM Gang Wu wrote:

Re: [VOTE] Migration of parquet-cpp issues to Arrow's issue tracker

2024-06-04 Thread Rok Mihevc
Corrected results with input from Julien and Antoine: Parquet: 3x +1 binding (Gang Wu, Wes McKinney, Julien Le Dem) 10x +1 non-binding (Micah Kornfield, Felipe Oliveira Carvalho, Fokko Driesprong, Antoine Pitrou, Alenka Frim, Andy Grove, Raúl Cumplido, Sutou Kouhei, Jiashen Zhang, Rok Mihevc

Re: [VOTE] Migration of parquet-cpp issues to Arrow's issue tracker

2024-06-03 Thread Rok Mihevc
, Alenka Frim, Andy Grove, Raúl Cumplido, Sutou Kouhei, Jiashen Zhang, Rok Mihevc) Arrow: 6x +1 binding (Micah Kornfield, Antoine Pitrou, Andy Grove, Raúl Cumplido, Wes McKinney, Sutou Kouhei) 6x +1 non-binding (Felipe Oliveira Carvalho, Fokko Driesprong, Gang Wu, Alenka Frim, Jiashen Zhang, Rok

Re: [VOTE] Migration of parquet-cpp issues to Arrow's issue tracker

2024-06-03 Thread Rok Mihevc
tracker" on > > Wed, 29 May 2024 16:14:44 +0200, > > Rok Mihevc wrote: > > > > > # sending this to both dev@arrow and dev@parquet > > > > > > Hi all, > > > > > > Following the ML discussion [1] I would like to propose a vote for

Re: [DISCUSS] Migration of parquet-cpp issues to GitHub

2024-05-31 Thread Rok Mihevc
sues and INFRA tickets are required before > > > migration. > > > > > > Best, > > > Gang > > > > > > On Thu, May 30, 2024 at 1:55 AM Micah Kornfield > > > > wrote: > > > > > > > SGTM +1 > > > > > >

Re: [DISCUSS] Migration of parquet-cpp issues to GitHub

2024-05-29 Thread Rok Mihevc
On Wed, May 29, 2024 at 4:39 PM Fokko Driesprong wrote: > Hey Rok, > > Thanks for bringing this up. I'm also very much in favor of Github. Once > we've migrated cpp, I think migrating the other repositories is a great > idea. Let me know if I can help! Perfect! A question I think we want to

[VOTE] Migration of parquet-cpp issues to Arrow's issue tracker

2024-05-29 Thread Rok Mihevc
# sending this to both dev@arrow and dev@parquet Hi all, Following the ML discussion [1] I would like to propose a vote for parquet-cpp issues to be moved from Parquet Jira [2] to Arrow's issue tracker [3]. [1] https://lists.apache.org/thread/zklp0lwcbcsdzgxoxy6wqjwrvt6y4s9p [2]

Re: [DISCUSS] Migration of parquet-cpp issues to GitHub

2024-05-29 Thread Rok Mihevc
On Wed, May 29, 2024 at 3:22 AM Gang Wu wrote: > Perhaps we can directly proceed to a vote? > Since we seem to be in agreement regarding parquet-cpp I'll go ahead and call for a vote. I would meanwhile propose to discuss migration of other parquet issues (parquet-java, parquet-site,

[DISCUSS] Migration of parquet-cpp issues to GitHub

2024-05-28 Thread Rok Mihevc
Hi all, I'd like to re-raise the idea of migrating parquet-cpp issues from Parquet's Jira to Arrow's GitHub issue tracker. Arrow migrated in January 2023 [1]. The migration was relatively smooth and the experience since seems to be positive. The reasons we would want to migrate parquet-cpp issues

Re: Fwd: [C++] Parquet and Arrow overlap

2024-05-11 Thread Rok Mihevc
24, 2024, at 2:38 PM, Gang Wu wrote: > > > > > > > > +1 for moving parquet-cpp issues from Apache Jira to Arrow's > > > GitHub > > > > > > > issue. > > > > > > > > > > > > > > > > Besid

Re: [VOTE] Release Apache Arrow 16.1.0 - RC1

2024-05-09 Thread Rok Mihevc
+1 (non-binding) Ran: TEST_DEFAULT=0 TEST_SOURCE=1 ./verify-release-candidate.sh 16.1.0 1 On Ubuntu 22.04.1 x86_64 Thanks for the hard work Raul! Rok On Thu, May 9, 2024 at 6:51 PM Bryce Mecum wrote: > +1 (non-binding) > > I ran TEST_DEFAULT=0 TEST_CPP=1 >

Re: [VOTE][Format] JSON canonical extension type

2024-05-07 Thread Rok Mihevc
> > I spoke to the DuckDB maintainers about this. DuckDB has a JSON extension > which defines a JSON column type. They intend to have DuckDB's Arrow > integrations recognize this arrow.json extension name on input and set it > on output. > That's great to hear! Thanks for checking with DuckDB

Re: [ANNOUNCE] New Arrow committer: Dane Pitkin

2024-05-07 Thread Rok Mihevc
Congrats Dane! Rok On Tue, May 7, 2024 at 3:57 PM wish maple wrote: > Congrats! > > Best, > Xuwei Fu > > Joris Van den Bossche 于2024年5月7日周二 21:53写道: > > > On behalf of the Arrow PMC, I'm happy to announce that Dane Pitkin has > > accepted an invitation to become a committer on Apache Arrow.

Re: [VOTE][Format] JSON canonical extension type

2024-05-07 Thread Rok Mihevc
Hi all, With 9 +1 votes (4 binding, 5 non-binding) and 0 -1 votes the proposal is approved as shown below and in the PR [1]. Thank you everyone who voted and helped shape this proposal. Once the language is merged we'll proceed with work on the C++ implementation PR [2]. [1]

Re: [VOTE][Format] JSON canonical extension type

2024-05-07 Thread Rok Mihevc
> > > Regards > > > > Antoine. > > > > > > Le 30/04/2024 à 19:26, Rok Mihevc a écrit : > > > Hi all, thanks for the votes and comments so far. > > > I've amended [1] the proposed language with the RFC-8259 requirement as > > it > > >

Re: [VOTE][Format] UUID canonical extension type

2024-05-07 Thread Rok Mihevc
Hi all, With 8 +1 votes (4 binding, 4 non-binding) and 0 -1 votes the proposal is approved as shown below and in the PR [1]. Thank you everyone who voted and helped shape this proposal. [1] https://github.com/apache/arrow/pull/41299 --- UUID * Extension name: `arrow.uuid`. * The storage

Re: [VOTE][Format] UUID canonical extension type

2024-05-07 Thread Rok Mihevc
+1 (non-binding) On Mon, May 6, 2024 at 12:14 PM Wes McKinney wrote: > +1 > > On Tue, Apr 30, 2024 at 4:03 PM Antoine Pitrou wrote: > > > +1 (binding) > > > > > > Le 19/04/2024 à 22:22, Rok Mihevc a écrit : > > > Hi all, > > > > >

Re: [VOTE][Format] JSON canonical extension type

2024-04-30 Thread Rok Mihevc
Hi all, thanks for the votes and comments so far. I've amended [1] the proposed language with the RFC-8259 requirement as it seems to be almost unanimously requested. New language is below. To Micah's comment regarding rejecting Binary arrays [2] - please discuss in the PR. Let's leave the vote
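To make the RFC-8259 requirement mentioned above concrete, here is a minimal sketch of a strictness check using only the Python standard library. It is not part of the proposal or of any Arrow implementation; the function name `is_rfc8259_json` is hypothetical. The one assumption worth calling out is that Python's `json.loads` accepts the non-standard constants `NaN`, `Infinity` and `-Infinity` by default, so the sketch rejects them through the `parse_constant` hook. It only approximates strict validation.

```python
import json


def _reject_constant(name):
    # json.loads calls this hook for NaN, Infinity and -Infinity,
    # none of which are valid JSON values under RFC 8259.
    raise ValueError(f"{name} is not valid RFC 8259 JSON")


def is_rfc8259_json(text: str) -> bool:
    """Return True if `text` parses as JSON without non-standard constants.

    Hypothetical helper for illustration only.
    """
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


ok = is_rfc8259_json('{"a": [1, 2.5, null]}')
bad_constant = is_rfc8259_json('{"a": NaN}')
bad_quotes = is_rfc8259_json("{'a': 1}")
```

A check like this would run over each string value before it is stored; whether an implementation validates on write, on read, or not at all is left open here.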

Re: [VOTE][Format] UUID canonical extension type

2024-04-30 Thread Rok Mihevc
Thanks for all the reviews and comments! I've included the big-endian requirement so the proposed language is now as below. I'll leave the vote open until after the May holiday. Rok UUID * Extension name: `arrow.uuid`. * The storage type of the extension is ``FixedSizeBinary`` with a
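The big-endian storage mentioned above can be illustrated with the Python standard library alone. This is a sketch, not the Arrow implementation: the snippet above is cut off before the byte width, so the width of 16 used below is an assumption based on a UUID being a 128-bit value, and the helper names are hypothetical. `uuid.UUID.bytes` returns the value as a 16-byte big-endian string.

```python
import uuid

UUID_BYTE_WIDTH = 16  # assumption: a UUID is 128 bits wide


def uuid_to_storage(value: uuid.UUID) -> bytes:
    """Return the big-endian byte representation of a UUID (hypothetical helper)."""
    return value.bytes  # uuid.UUID.bytes is big-endian


def storage_to_uuid(raw: bytes) -> uuid.UUID:
    """Rebuild a UUID from its big-endian bytes (hypothetical helper)."""
    if len(raw) != UUID_BYTE_WIDTH:
        raise ValueError(f"expected {UUID_BYTE_WIDTH} bytes, got {len(raw)}")
    return uuid.UUID(bytes=raw)


example = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
stored = uuid_to_storage(example)
```

With big-endian storage the bytes appear in the same order as the hex digits of the textual form, which keeps byte-wise and textual ordering aligned.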

[VOTE][Format] JSON canonical extension type

2024-04-19 Thread Rok Mihevc
Hi all, Following discussions [1][2] and preliminary implementation work (by Pradeep Gollakota) [3] I would like to propose a vote to add language for JSON canonical extension type to CanonicalExtensions.rst as in PR [4] and written below. A draft C++ implementation PR can be seen here [3]. [1]

[VOTE][Format] UUID canonical extension type

2024-04-19 Thread Rok Mihevc
Hi all, Following initial requests [1][2] and recent tangential ML discussion [3] I would like to propose a vote to add language for UUID canonical extension type to CanonicalExtensions.rst as in PR [4] and written below. A draft C++ and Python implementation PR can be seen here [5]. [1]

Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Rok Mihevc
+1 I've successfully verified sources on Ubuntu 22.04: TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 16.0.0 0 Rok On Wed, Apr 17, 2024 at 8:36 PM Raúl Cumplido wrote: > Hi Dominik, > > I am sorry the announcement was missed. I did send an email one month > ago [1] and

Re: [ANNOUNCE] New Arrow committer: Sarah Gilmore

2024-04-12 Thread Rok Mihevc
Congrats Sarah! On Fri, Apr 12, 2024 at 5:48 PM Ian Joiner wrote: > Congrats! > > On Fri, Apr 12, 2024 at 10:18 AM Gang Wu wrote: > > > Congrats! > > > > On Fri, Apr 12, 2024 at 9:11 PM Patrick Horan > wrote: > > > > > Congratulations! > > > > > > On Thu, Apr 11, 2024, at 11:10 AM, Raúl

Re: Unsupported/Other Type

2024-04-10 Thread Rok Mihevc
There are JSON [1] and UUID [2] PRs open. I don't know about the former (seems to be stuck in review), but I plan to work on the UUID PR this week. [1] https://github.com/apache/arrow/pull/13901 [2] https://github.com/apache/arrow/pull/37298 On Thu, Apr 11, 2024 at 12:31 AM James Duong wrote:

Re: [ANNOUNCE] New Arrow committer: Bryce Mecum

2024-03-18 Thread Rok Mihevc
Congrats and welcome Bryce! On Mon, Mar 18, 2024 at 11:07 AM Andrew Lamb wrote: > Congratulations Bryce! > > On Mon, Mar 18, 2024 at 3:35 AM Alenka Frim .invalid> > wrote: > > > Congratulations Bryce and thank you for all your contributions!! > > > > On Mon, Mar 18, 2024 at 6:43 AM Raúl

Re: [ANNOUNCE] New Arrow committer: Felipe Oliveira Carvalho

2023-12-07 Thread Rok Mihevc
Congrats Felipe! On Fri, Dec 8, 2023 at 3:00 AM Gang Wu wrote: > Congrats! > > On Fri, Dec 8, 2023 at 8:37 AM Dewey Dunnington > wrote: > > > Congrats! > > > > On Thu, Dec 7, 2023 at 4:28 PM Andrew Lamb wrote: > > > > > > Congratulations! > > > > > > On Thu, Dec 7, 2023 at 3:09 PM Kevin

Re: [ANNOUNCE] New Arrow committer: James Duong

2023-11-16 Thread Rok Mihevc
Congrats James! On Thu, Nov 16, 2023 at 3:58 PM Kevin Gurney wrote: > Congratulations, James! > > From: David Li > Sent: Thursday, November 16, 2023 9:30 AM > To: dev@arrow.apache.org > Subject: Re: [ANNOUNCE] New Arrow committer: James Duong > > Congrats

Re: [ANNOUNCE] New Arrow PMC member: Raúl Cumplido

2023-11-13 Thread Rok Mihevc
Congrats Raúl!! Rok On Mon, Nov 13, 2023 at 9:48 PM David Li wrote: > Congrats & welcome, Raúl! > > On Mon, Nov 13, 2023, at 15:39, Ian Cook wrote: > > Congratulations Raúl! > > > > On Mon, Nov 13, 2023 at 2:28 PM Andrew Lamb > wrote: > >> > >> The Project Management Committee (PMC) for

Re: [ANNOUNCE] New Arrow committer: Xuwei Fu

2023-10-23 Thread Rok Mihevc
Congrats Xuwei! Well deserved! Rok On Mon, Oct 23, 2023 at 11:25 AM Yibo Cai wrote: > Congrats Xuwei! > > -Original Message- > From: Gang Wu > Sent: Monday, October 23, 2023 13:29 > To: dev@arrow.apache.org > Subject: Re: [ANNOUNCE] New Arrow committer: Xuwei Fu > > Congrats Xuwei! >

Re: [ANNOUNCE] New Arrow PMC member: Jonathan Keane

2023-10-14 Thread Rok Mihevc
Congrats Jon! On Sat, Oct 14, 2023 at 8:10 PM Joris Van den Bossche < jorisvandenboss...@gmail.com> wrote: > Congratulations! > > On Sat, 14 Oct 2023 at 20:02, Matt Topol wrote: > > > > Congrats Jon!!! > > > > On Sat, Oct 14, 2023, 1:42 PM David Li wrote: > > > > > Congrats Jon! > > > > > > On

Re: [VOTE][Format] Variable shape tensor canonical extension type

2023-10-06 Thread Rok Mihevc
implementation ( https://github.com/apache/arrow/pull/38008) Rok On Mon, Oct 2, 2023 at 4:25 PM Rok Mihevc wrote: > +1 > Thanks everyone for voting! > > I'd like to leave the vote open until Wednesday, > > Rok > > On Fri, Sep 29, 2023 at 8:58 PM Matt Topol wrote: >

Re: [VOTE][Format] Variable shape tensor canonical extension type

2023-10-02 Thread Rok Mihevc
ting on this with all of us! > > > > On Fri, Sep 29, 2023 at 11:28 AM Alenka Frim > > wrote: > > > > > > +1 > > > Thanks for pushing this through! > > > > > > On Wed, Sep 27, 2023 at 2:44 PM Rok Mihevc > wrote: > > >

[VOTE][Format] Variable shape tensor canonical extension type

2023-09-27 Thread Rok Mihevc
Hi all, Following the discussion [1][2] I would like to propose a vote to add variable shape tensor canonical extension type language to CanonicalExtensions.rst [3] as written below. A draft C++ implementation and a Python wrapper can be seen here [2]. The vote will be open for at least 72

Re: [DISCUSS] Proposal to add VariableShapeTensor Canonical Extension Type

2023-09-24 Thread Rok Mihevc
/9c827a0ba54280f4695202e17e32902986c4f12f#diff-b54425cb176b53e51925c13a4d4e85cf7d03d4e1226e6d5bf4d7ae09923db8b3 Best, Rok On Sat, Sep 16, 2023 at 3:11 PM Rok Mihevc wrote: > I agree, the increased complexity is probably not worth the savings > from keeping only shapes of ragged dimensions. > Howeve

Re: [DISCUSS] Proposal to add VariableShapeTensor Canonical Extension Type

2023-09-16 Thread Rok Mihevc
Sep 15, 2023 at 8:32 PM Rok Mihevc wrote: > > > > > How about also changing shape and adding uniform_shape like so: > > """ > > **shape** is a ``FixedSizeList[ndim_ragged]`` of ragged shape > > of each tensor contained in ``data`` where the size

Re: [DISCUSS] Proposal to add VariableShapeTensor Canonical Extension Type

2023-09-15 Thread Rok Mihevc
First, thanks for all the input! On Wed, Sep 13, 2023 at 6:27 AM Alenka Frim wrote: > In the PR you mention that "this [ragged dimensions] would be purely > metadata that would help converting arrow <-> jagged/ragged". Are there any > examples available to better understand this metadata and how

Re: [DISCUSS] Proposal to add VariableShapeTensor Canonical Extension Type

2023-09-12 Thread Rok Mihevc
After some discussion on the PR [https://github.com/apache/arrow/pull/37166] we've altered the proposed type by removing the ndim parameter and adding ragged_dimensions one. If there is no further feedback I'd like to call for a vote early next week. Proposed language now reads: Variable shape

Re: [MATLAB] Using GitHub Projects for Project Planning

2023-08-22 Thread Rok Mihevc
To Jin's point - namespacing like "Arrow MATLAB" would prevent confusion. We have prior art of "Arrow ADBC Initial Release" [1]. Rok [1] https://github.com/orgs/apache/projects/159 On Tue, Aug 22, 2023 at 1:31 PM Jin Shang wrote: > Hi, > > I notice that this project can be seen directly from

Re: [VOTE] Release Apache Arrow 13.0.0 - RC3

2023-08-18 Thread Rok Mihevc
Successfully tested sources (except Python due to an odd and probably harmless cmake issue) and binaries on Ubuntu 22.04.1/Conda/x86_64. USE_CONDA=1 TEST_DEFAULT=0 TEST_PYTHON=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 13.0.0 3 USE_CONDA=1 TEST_DEFAULT=0 TEST_BINARIES=1

[DISCUSS] Proposal to add VariableShapeTensor Canonical Extension Type

2023-08-17 Thread Rok Mihevc
Hey all! Besides the recently added FixedShapeTensor [1] canonical extension type there appears to be a need for an already proposed VariableShapeTensor [2]. VariableShapeTensor would store tensors of variable shapes but uniform number of dimensions, dimension names and dimension permutations.
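As a rough illustration of "variable shapes but uniform number of dimensions", the sketch below packs tensors into two parallel lists using plain Python. It is an assumption-laden model, not the proposed Arrow layout: the field names `data` and `shape` and the function `pack_variable_shape` are hypothetical, and each tensor is taken to be a flat, row-major list of values with an explicit shape.

```python
from math import prod


def pack_variable_shape(tensors):
    """Pack (flat_values, shape) pairs into parallel `data` and `shape` lists.

    Every tensor may have a different shape, but all must share the same
    number of dimensions. Hypothetical helper for illustration only.
    """
    if not tensors:
        return {"data": [], "shape": []}
    ndim = len(tensors[0][1])
    data, shapes = [], []
    for values, shape in tensors:
        if len(shape) != ndim:
            raise ValueError("all tensors must have the same number of dimensions")
        if prod(shape) != len(values):
            raise ValueError("shape does not match the number of values")
        data.append(list(values))
        shapes.append(list(shape))
    return {"data": data, "shape": shapes}


packed = pack_variable_shape([
    ([1, 2, 3, 4, 5, 6], (2, 3)),  # a 2x3 tensor
    ([7, 8], (1, 2)),              # a 1x2 tensor
])
```

Dimension names and permutations are left out of the sketch; under the description above they would be stored once for the whole column rather than per tensor.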

Re: [ANNOUNCE] New Arrow committer: Gang Wu

2023-05-15 Thread Rok Mihevc
Congrats Gang! Rok On Mon, May 15, 2023 at 3:33 PM Sutou Kouhei wrote: > On behalf of the Arrow PMC, I'm happy to announce that Gang > Wu has accepted an invitation to become a committer on > Apache Arrow. Welcome, and thank you for your contributions! > > Thanks, > -- > kou >

Re: [ANNOUNCE] New Arrow PMC member: Matt Topol

2023-05-03 Thread Rok Mihevc
Congrats Matt. Well deserved! Rok On Wed, May 3, 2023 at 11:03 PM David Li wrote: > Congrats Matt! > > On Wed, May 3, 2023, at 16:06, Neal Richardson wrote: > > Congratulations! > > > > On Wed, May 3, 2023 at 1:58 PM Jacob Wujciak > > > wrote: > > > >> Congratulations, well deserved! > >> >

Re: [ANNOUNCE] New Arrow committer: Mustafa Akur

2023-03-31 Thread Rok Mihevc
Congrats! Rok On Fri, Mar 31, 2023 at 9:42 PM Mehmet Ozan Kabak wrote: > Congrats Mustafa! You are a great team member at Synnada and I’m sure you > will be a valued member of the Apache Arrow community too. > > > On Mar 31, 2023, at 10:54 AM, Matthew Topol > wrote: > > > > Congrats Mustafa!

Re: Proposal: add a bot to close PRs that haven't been updated in 30 days

2023-03-31 Thread Rok Mihevc
I agree with Joris' and David's points here and would prefer some form of pinging. Also at 120 open PRs we could realistically close out stale ones manually. Meanwhile we have 3.2k open issues where we might want to get creative. Rok On Fri, Mar 31, 2023 at 9:17 PM Antoine Pitrou wrote: > > I

Re: Zero copy cast kernels

2023-03-24 Thread Rok Mihevc
For scalar casting tests we use CheckCastZeroCopy [1] which you could reuse. [1] https://github.com/apache/arrow/blob/e7d6c13d4ae3d8df0e9b668468b990f35c8a9556/cpp/src/arrow/compute/kernels/scalar_cast_test.cc#L128-L138 Rok

Re: [VOTE][Format] Fixed shape tensor Canonical Extension Type

2023-03-15 Thread Rok Mihevc
Looking at fixed-size-list memory layout [1] I think we better proceed with this proposal and rather optimize the parquet reader/writer, e.g.: [2]. Best, Rok [1] https://arrow.apache.org/docs/format/Columnar.html#fixed-size-list-layout [2]

Re: [ANNOUNCE] New Arrow PMC member: Will Jones

2023-03-13 Thread Rok Mihevc
Congratulations Will! Rok On Mon, Mar 13, 2023 at 8:37 PM Steph Hazlitt wrote: > Congrats Will! > > On Mon, 13 Mar 2023 at 10:57, Andrew Lamb wrote: > > > The Project Management Committee (PMC) for Apache Arrow has invited > > Will Jones to become a PMC member and we are pleased to announce >

Re: [VOTE][Format] Fixed shape tensor Canonical Extension Type

2023-03-06 Thread Rok Mihevc
+1 Thanks for the discussion everyone! Rok On Mon, Mar 6, 2023 at 8:29 PM Dewey Dunnington wrote: > +1 (non-binding)! > > On Mon, Mar 6, 2023 at 9:59 AM Nic Crane wrote: > > > +1 > > > > On Mon, 6 Mar 2023 at 12:41, Alenka Frim .invalid> > > wrote: > > > > > Hi all, > > > > > > I am

Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type

2023-02-23 Thread Rok Mihevc
, for example, one might > > store the dimension variable names. When determining type equality it may > > be useful that {..., permutation = [2, 0, 1], dim_names = ["C", "H", > "W"]} > > is not equal to {..., permutation = [
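To show how `permutation` and `dim_names` interact, here is a small sketch in plain Python. It rests on an assumption that this thread does not spell out: entry `i` of the logical view is taken from position `permutation[i]` of the physical (row-major) layout. The function name `logical_view` is hypothetical.

```python
def logical_view(physical, permutation):
    """Reorder per-dimension metadata (names or sizes) by `permutation`.

    Assumes logical[i] == physical[permutation[i]]. Hypothetical helper
    for illustration only.
    """
    if sorted(permutation) != list(range(len(physical))):
        raise ValueError("permutation must contain each dimension index exactly once")
    return [physical[i] for i in permutation]


names = logical_view(["C", "H", "W"], [2, 0, 1])
sizes = logical_view([100, 200, 500], [2, 0, 1])
```

Under that assumption the same physical buffer with a different permutation presents a different logical tensor, which is why the two parameter sets quoted above would not compare equal.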

Re: [ANNOUNCE] New Arrow committer: Wang Mingming

2023-02-23 Thread Rok Mihevc
Congrats! Rok On Thu, Feb 23, 2023 at 4:21 AM Ian Joiner wrote: > Congrats! > > On Wed, Feb 22, 2023 at 7:26 AM Andrew Lamb wrote: > > > Hi, > > > > On behalf of the Arrow PMC, I'm happy to announce that mingmwang > > has accepted an invitation to become a committer on Apache > > Arrow.

Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type

2023-02-22 Thread Rok Mihevc
> > > > > > > Should we rule that `dim_names` and `permutation` are mutually > exclusive? > > > > > > > Since `dim_names` have to "map to the physical layout (row-major)" that > > means permutation will always be trivial which indeed makes it > unnecessary > > to store both. > > I don't think it

Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type

2023-02-21 Thread Rok Mihevc
> > Should we rule that `dim_names` and `permutation` are mutually exclusive? > Since `dim_names` have to "map to the physical layout (row-major)" that means permutation will always be trivial which indeed makes it unnecessary to store both. (This makes me think about extension type

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Rok Mihevc
00:00:00.0]] > >> > >> In [18]: table > >> Out[18]: > >> pyarrow.Table > >> time: timestamp[ns] > >> > >> time: [[1970-01-01 00:00:00.000000000,1970-01-01 > >> 00:00:00.1,1970-01-01 00:00:00.2

Re: Question about memory usage and type casting using pyarrow Table

2023-02-15 Thread Rok Mihevc
I'm not sure about (1) but I'm pretty sure for (2) doing a cast of tz-aware timestamp to tz-naive should be a metadata-only change. On Wed, Feb 15, 2023 at 4:19 PM Li Jin wrote: > Asking (2) because IIUC this is a metadata operation that could be zero > copy but I am not sure if this is

Re: [DISCUSS] Fixed shape tensor Canonical Extension Type

2023-02-10 Thread Rok Mihevc
A short update on the state of this discussion: * There is an ongoing thread on "GH-33923: [Docs] Tensor canonical extension type specification" [1]. Discussion is now down mostly to how would logical layout (strides) information be encoded (if at all) and more input would be most welcome. * There

Re: [DISCUSS] PR automation workflow

2023-02-03 Thread Rok Mihevc
+1 to Nic's comment. On Fri, Feb 3, 2023 at 12:46 PM Nic Crane wrote: > I have no specific comments on the what/how, other than to say I'm strongly > in favour of some kind of system being implemented and tried out, as I > currently rely on manual processes that are inefficient and make it easy

Re: [VOTE] Release Apache Arrow 11.0.0 - RC0

2023-01-19 Thread Rok Mihevc
On a relatively fresh Ubuntu 22.04 without conda I had to apt install some libs (default-jdk maven libjemalloc-dev libgirepository1.0-dev libsqlite3-dev) and this passed fine: TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh +1 Rok On Thu, Jan 19, 2023 at 3:05 PM Raúl

Re: [Migration] Moving Jira Issues to GitHub

2023-01-13 Thread Rok Mihevc
Hi all, We ran the Jira -> GitHub issue migration on Tuesday. 18292 tickets (2803 open and 15489 closed) were migrated and can be seen here [1]. Arrow's Jira issue tracker is now in read-only mode and all issues received a comment linking them to their GitHub counterparts. We strived to keep

Re: [ANNOUNCE] New Arrow committer: Jie Wen

2023-01-08 Thread Rok Mihevc
Congrats Jie! Rok On Sun, Jan 8, 2023 at 7:00 PM Raúl Cumplido wrote: > Congratulations Jie! > > El dom, 8 ene 2023, 18:45, David Li escribió: > > > Congrats Jie & welcome! > > > > On Sun, Jan 8, 2023, at 06:24, Andrew Lamb wrote: > > > Hi, > > > > > > On behalf of the Arrow PMC, I'm happy to

[Migration] Moving Jira Issues to GitHub

2023-01-08 Thread Rok Mihevc
Hi, We have decided to move issue tracking to GitHub [1] and have since disabled opening new issues on Jira. Next step is copying old issues to GitHub and locking Jira tracker for comments and changes. Migrated issues are expected to look more or less as seen here [2]. Work related to this was

Re: [Monorepo] Add labels breaking-change and critical-fix

2023-01-07 Thread Rok Mihevc
> > I replied in the GitHub thread [1], but will say that I am +1 on Priority: > Blocker and Priority: Critical. Though I wonder if we could use "Critical > Fix" in place of "Priority: Critical"? Unless we have two different > definitions. As is, the names are similar enough that it could be >

Re: [Monorepo] Add labels breaking-change and critical-fix

2023-01-06 Thread Rok Mihevc
Hey, +1 for the proposal. Perhaps we can loop back and evaluate come 12.0.0 to see if these were useful / used? I'd like to pile on another new label proposal. For purpose of Jira -> GitHub Migration I'd like to propose the following labels be added, that are common on Jira but missing on

Re: [ANNOUNCE] New Arrow PMC chair: Andrew Lamb

2022-12-26 Thread Rok Mihevc
Congratulations Andrew! Rok On Mon, Dec 26, 2022 at 11:26 PM Neal Richardson < neal.p.richard...@gmail.com> wrote: > Congratulations! > > On Mon, Dec 26, 2022 at 4:38 PM Matt Topol wrote: > > > Congrats!!! > > > > On Mon, Dec 26, 2022, 12:47 PM Jacob Wujciak > > > > > wrote: > > > > >

[WEBSITE] Website merge script is outdated

2022-12-19 Thread Rok Mihevc
Current website PR merge script is outdated [1] and should either be updated or replaced with merging with the button process. I've come across this issue when merging website changes related to Jira -> GitHub migration [2] and had to use the merge button. As things stand now we'll eventually

Re: [VOTE] Disable ASF Jira issue reporting

2022-12-19 Thread Rok Mihevc
New issue reporting on Jira has just been disabled. Thank you all for participating and Todd for setting this up. Rok On Fri, Dec 16, 2022 at 5:02 PM Rok Mihevc wrote: > Raul opened these issues to track required changes to the release scripts: > * [Release][Archery] Update archery r

Re: [VOTE] Disable ASF Jira issue reporting

2022-12-16 Thread Rok Mihevc
/issues/14997 [2] https://github.com/apache/arrow/issues/14999 [3] https://github.com/apache/arrow/issues/15002 On Fri, Dec 16, 2022 at 3:31 PM Rok Mihevc wrote: > Thanks for bringing that point up Raul! > Would a good workaround be to open the required Jira issues now, before we > lock

Re: [VOTE] Disable ASF Jira issue reporting

2022-12-16 Thread Rok Mihevc
ing to GitHub but worth > mentioning as it will require some extra effort until this is fixed. > > > El vie, 16 dic 2022 a las 8:28, Alenka Frim ( .invalid>) > escribió: > > > Thank you for working on this Rok  > > > > On Fri, 16 Dec 2022 at 01:21, Rok Mihevc w

Re: [VOTE] Disable ASF Jira issue reporting

2022-12-15 Thread Rok Mihevc
The vote is now 8 +1 votes, 1 +1 "when the merge scripts are ready" and 1 -1 vote "until the labels are ready". Please correct me if I'm wrong, but I believe merge scripts and labels are now ready. If that is the case we can tally this vote as 10 +1 votes and proceed with disabling ASF Jira issue

Re: [ANNOUNCE] New Arrow committer: Jacob Wujciak

2022-12-15 Thread Rok Mihevc
Congrats Jacob!! Rok On Fri, Dec 16, 2022 at 12:52 AM Vibhatha Abeykoon wrote: > Congratulations Jacob!!! > > On Fri, Dec 16, 2022 at 5:09 AM Raúl Cumplido > wrote: > > > Congratulations Jacob! > > > > El vie, 16 dic 2022 a las 0:34, Weston Pace () > > escribió: > > > > > Congratulations

Re: [ANNOUNCE] New Arrow committer: Raúl Cumplido

2022-12-06 Thread Rok Mihevc
Congrats Raul!! On Tue, Dec 6, 2022 at 12:04 PM Andrew Lamb wrote: > Congratulations Raúl > > On Tue, Dec 6, 2022 at 2:17 AM Vibhatha Abeykoon > wrote: > > > Congratulations Raul!!! > > > > On Tue, Dec 6, 2022 at 11:38 AM Alenka Frim > .invalid> > > wrote: > > > > > Congratulations Raul!!  >

Re: Need access to ASF Jira

2022-11-24 Thread Rok Mihevc
Hi Prashanth, Due to recent disabling of self-service user creation on Apache Jira [1] the Arrow project has decided to transition from Jira to GitHub's issue tracker [2]. The documentation you are referring to is not relevant for new contributors to the Arrow project. For now it would be best if

Re: [VOTE] Disable ASF Jira issue reporting

2022-11-24 Thread Rok Mihevc
+1 I would propose to also add a note about using tags (e.g. [C++][Parquet] before the issue name) when opening a new issue. Rok On Thu, Nov 24, 2022 at 5:03 PM Nic wrote: > +1 > > On Thu, 24 Nov 2022 at 15:57, Joris Van den Bossche < > jorisvandenboss...@gmail.com> wrote: > > > +1 > > > > On

Re: Need access to ASF Jira

2022-11-22 Thread Rok Mihevc
Hi Iris, Could you try using the GitHub issue tracker? Rok On Tue, Nov 22, 2022 at 11:16 PM Iris Chang wrote: > Hi, > > I have the same request -- could you please add the user > irischang...@gmail.com to ASF jira? > > Thanks, > Iris > > On Tue, Nov 22, 2022 at 1:17 PM Sutou Kouhei wrote: >

Re: [VOTE] Release Apache Arrow 10.0.1 - RC0

2022-11-17 Thread Rok Mihevc
+1 Passed on M1 with: TEST_DEFAULT=0 TEST_SOURCE=1 TEST_PYTHON=0 TEST_GLIB=0 TEST_RUBY=0 dev/release/verify-release-candidate.sh 10.0.1 0 With TEST_PYTHON=1 it reported a CMake issue: -- ArrowPythonFlight version: 10.0.1 -- Found the ArrowPythonFlight shared library:

Re: Creating dictionary encoded string in C++

2022-11-03 Thread Rok Mihevc
Hi Li, If it's practical for you to create an index and a dictionary array from your source you could use those to create a DictionaryArray as seen here [1]. Another option that might fit your situation is to use a dictionary builder [2]. Best, Rok [1]
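For readers unfamiliar with the index-plus-dictionary representation referred to above, this is a minimal sketch in plain Python. It does not use the Arrow C++ API that the thread is about, and the function names are hypothetical; it only shows what the two arrays contain.

```python
def dictionary_encode(values):
    """Split `values` into (indices, dictionary), keeping first-seen order.

    Hypothetical helper for illustration only.
    """
    dictionary = []  # unique values, in order of first appearance
    positions = {}   # value -> its index in `dictionary`
    indices = []     # one dictionary index per input value
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


def dictionary_decode(indices, dictionary):
    """Rebuild the original values from the index and dictionary arrays."""
    return [dictionary[i] for i in indices]


indices, dictionary = dictionary_encode(["a", "b", "a", "c", "b"])
```

If the source data already arrives as separate index and dictionary arrays, the first option in the message applies; a builder is the analogue of `dictionary_encode` here, accumulating both arrays as values are appended.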

Re: [ANNOUNCE] New Arrow committer: Jarrett Revels

2022-11-03 Thread Rok Mihevc
Congratulations! On Thu, Nov 3, 2022 at 12:31 AM David Li wrote: > Welcome Jarrett! > > On Tue, Nov 1, 2022, at 17:15, Sutou Kouhei wrote: > > On behalf of the Arrow PMC, I'm happy to announce that Jarrett Revels > > has accepted an invitation to become a committer on Apache > > Arrow. Welcome,

Re: [ANNOUNCE] New Arrow committer: Curtis Vogt

2022-11-03 Thread Rok Mihevc
Congratulations! On Thu, Nov 3, 2022 at 12:31 AM David Li wrote: > Welcome, Curtis! > > On Tue, Nov 1, 2022, at 17:14, Sutou Kouhei wrote: > > On behalf of the Arrow PMC, I'm happy to announce that Curtis Vogt > > has accepted an invitation to become a committer on Apache > > Arrow. Welcome,

Re: [ANNOUNCE] New Arrow committer: Yang Jiang

2022-11-03 Thread Rok Mihevc
Congrats! On Thu, Nov 3, 2022 at 2:27 PM Weston Pace wrote: > Congratulations > > On Thu, Nov 3, 2022, 6:25 AM Patrick Horan wrote: > > > Congrats Jiang! > > > > On Thu, Nov 3, 2022, at 1:52 AM, Wang Xudong wrote: > > > Congratulations! > > > > > > Yijie Shen 于2022年11月3日周四 11:08写道: > > > > >

Re: measuring memory usage of Arrow structures

2022-10-28 Thread Rok Mihevc
Hey Yaron, If you're using jemalloc you can use jemalloc_get_stat [1] to monitor total memory allocation. Another option would be LoggingMemoryPool, see tests on possible usage [2]. Rok [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/memory_pool_jemalloc.cc#L157 [2]

Re: [ANNOUNCE] New Arrow committer: Will Jones

2022-10-28 Thread Rok Mihevc
Congratulations and welcome Will! :) Rok On Fri, Oct 28, 2022, 06:26 Anja wrote: > =) > > On Thu, 27 Oct 2022 at 16:56, Sutou Kouhei wrote: > > > On behalf of the Arrow PMC, I'm happy to announce that Will Jones > > has accepted an invitation to become a committer on Apache > > Arrow.

Re: [ANNOUNCE] New Arrow committer: Ben Baumgold

2022-10-27 Thread Rok Mihevc
Congrats Ben! Rok On Thu, Oct 27, 2022 at 5:53 PM Andrew Lamb wrote: > Congratulations! > > On Thu, Oct 27, 2022 at 11:21 AM Raúl Cumplido > wrote: > > > Congratulations Ben! > > > > El jue, 27 oct 2022 a las 5:11, Weston Pace () > > escribió: > > > > > Congratulations Ben! > > > > > > On

Re: [VOTE] Move issue tracking to GitHub Issues

2022-10-27 Thread Rok Mihevc
+1 (non-binding) Rok On Thu, Oct 27, 2022 at 9:19 AM Antoine Pitrou wrote: > > +1 (binding) but let's make sure we have a quality migration to keep as > much of the JIRA metadata as possible. > > Regards > > Antoine. > > > Le 27/10/2022 à 01:02, Neal Richardson a écrit : > > I propose that we

Re: [ANNOUNCE] New Arrow committer: Bogumił Kamiński

2022-10-25 Thread Rok Mihevc
Congrats Bogumił! Rok On Tue, Oct 25, 2022 at 11:15 PM David Li wrote: > Welcome Bogumił! > > On Tue, Oct 25, 2022, at 17:05, Sutou Kouhei wrote: > > Hi, > > > > On behalf of the Arrow PMC, I'm happy to announce that Bogumił Kamiński > > has accepted an invitation to become a committer on

Re: [ANNOUNCE] New Arrow PMC member: Jacob Quinn

2022-10-25 Thread Rok Mihevc
Congratulations Jacob! Rok On Tue, Oct 25, 2022 at 11:15 PM David Li wrote: > Congrats Jacob!! > > On Tue, Oct 25, 2022, at 17:06, Sutou Kouhei wrote: > > The Project Management Committee (PMC) for Apache Arrow has invited > > Jacob Quinn to become a PMC member and we are pleased to announce >

Re: [ANNOUNCE] New Arrow PMC member: Nicola Crane

2022-10-25 Thread Rok Mihevc
Congrats Nic! Rok On Tue, Oct 25, 2022 at 11:16 PM Will Jones wrote: > Congrats Nic! > > On Tue, Oct 25, 2022 at 2:14 PM David Li wrote: > > > Congrats & welcome Nic! > > > > On Tue, Oct 25, 2022, at 17:07, Matt Topol wrote: > > > Congrats!! > > > > > > On Tue, Oct 25, 2022 at 5:06 PM Sutou

Re: [ANNOUNCE] New Arrow committer: Remzi Yang

2022-09-12 Thread Rok Mihevc
Congrats! Rok On Sun, Sep 11, 2022 at 4:27 AM Ian Joiner wrote: > Congrats Remzi! > > On Sat, Sep 10, 2022 at 8:12 AM Andrew Lamb wrote: > > > On behalf of the Arrow PMC, I'm happy to announce that Remzi Yang > > has accepted an invitation to become a committer on Apache > > Arrow. Welcome,

Re: [ANNOUNCE] New Arrow committer: Yanghong Zhong

2022-09-12 Thread Rok Mihevc
Congrats! Rok On Fri, Sep 9, 2022 at 8:56 AM vin jake wrote: > Congratulations! > > On Fri, Sep 9, 2022 at 2:42 PM Yijie Shen > wrote: > > > Congratulations! > > > > On Fri, Sep 9, 2022 at 2:34 PM Kun Liu wrote: > > > > > Congrats!! > > > > > > Thanks, > > > Kun > > > > > > Matt Topol

Re: [ANNOUNCE] New Arrow PMC member: L. C. Hsieh

2022-09-05 Thread Rok Mihevc
Congrats! On Sun, Sep 4, 2022 at 8:14 PM Daniël Heres wrote: > Congratulations! > > On Sun, Sep 4, 2022, 19:37 L. C. Hsieh wrote: > > > Thanks all! > > > > On Sun, Sep 4, 2022 at 8:25 AM Chao Sun wrote: > > > > > > Congrats LiangChi! Well deserved! > > > > > > Chao > > > > > > On Sun, Sep 4,

Re: [ANNOUNCE] New Arrow PMC member: Weston Pace

2022-09-05 Thread Rok Mihevc
Congrats Weston! Rok

Re: [VOTE] Format: Rules and procedures for Canonical extension types

2022-08-29 Thread Rok Mihevc
+1 (non-binding) and preference for the "arrow." namespace. Rok

Re: [C++] Read Flight data source into Acero

2022-08-18 Thread Rok Mihevc
+1 for adding this as either a utility function or a cookbook recipe [1]. [1] https://github.com/apache/arrow-cookbook On Thu, Aug 18, 2022 at 2:34 PM Yaron Gvili wrote: > I have code in source_node.cc in a local branch adding factories for other > sources in SourceNode (e.g., streams of

Re: Arrow sync call August 17 at 12:00 US/Eastern, 16:00 UTC

2022-08-17 Thread Rok Mihevc
Attendees: Matt Topol, Will Jones, David Li, Joris Van den Bossche, Eduardo Ponce, Antoine Pitrou, Jacob Wujciak, Ivan Ogasawara, Ashish Paliwal, Niranda Perera Discussion: - FlightSQL PR reviews: David Li is inviting reviewers to two FlightSQL PRs. [1] "ARROW-7744: [Java][FlightRPC] JDBC Driver for Arrow

Re: Proposal: Allow any ASF Jira user to assign ARROW issues

2022-08-10 Thread Rok Mihevc
It would be great if this friction point is removed. +1 Rok On Wed, Aug 10, 2022 at 11:36 PM Sutou Kouhei wrote: > +1 > > In > "Proposal: Allow any ASF Jira user to assign ARROW issues" on Wed, 10 > Aug 2022 14:32:21 -0600, > Todd Farmer wrote: > > > Hello, > > > > Community members

Re: Arrow sync call July 20 at 12:00 US/Eastern, 16:00 UTC

2022-07-21 Thread Rok Mihevc
> One failing test is the R ubuntu test. Rok noted it is likely unrelated as he is seeing elsewhere [7]. > [7] https://github.com/apache/arrow/runs/7424773120?check_suite_focus=true Dragos is looking into this. It does seem to be an R timezone issue [1]. [1]

Re: Adding cpp memory profiling to Arrow

2022-07-06 Thread Rok Mihevc
I'm also working on exposing jemalloc statistics [1] if you'd want to directly access those. Rok [1] https://github.com/apache/arrow/pull/13516 On Wed, Jul 6, 2022 at 11:40 PM Rok Mihevc wrote: > I'm also working on exposing jemalloc statistics if you'd want to directly > access

Re: Adding cpp memory profiling to Arrow

2022-07-06 Thread Rok Mihevc
I'm also working on exposing jemalloc statistics if you'd want to directly access those. Rok On Wed, Jul 6, 2022 at 10:54 PM Ákos Hadnagy wrote: > Hi all, > > > As Will pointed it out, there’s an effort to integrate OTel and Acero, and > recently I did a few experiments to collect “big

Re: [ANNOUNCE] New Arrow committers: Dewey Dunnington, Alenka Frim, and Rok Mihevc

2022-06-27 Thread Rok Mihevc
itters! > > > > > > >> > > > > > > >> > > > > > > >> Le 22/06/2022 à 20:02, Andrew Lamb a écrit : > > > > > > >> > Congratulations! > > > > > > >> > > > > > > >

Re: Arrow sync call May 25 at 12:00 US/Eastern, 16:00 UTC

2022-05-25 Thread Rok Mihevc
Below are the minutes of the call. Best, Rok Present: Dewey Dunnington, Raul Cumplido, Will Jones, Jonathan Keane, Matt Topol, Rok Mihevc, Ian Joiner Agenda: 1. Minimal C++/C interface 2. Naming the C++ Compute Engine 3. Change to bucket creation behaviour in S3FileSystem Notes: 1. Dewey
