Re: [ANNOUNCE] New Arrow committer: Sarah Gilmore

2024-04-11 Thread Sarah Gilmore
Thank you everyone! It's been awesome working with everyone, and I look forward 
to continuing to do so! 

From: Ian Cook 
Sent: Thursday, April 11, 2024 2:43 PM
To: dev@arrow.apache.org 
Subject: Re: [ANNOUNCE] New Arrow committer: Sarah Gilmore

Congrats Sarah!

On Thu, Apr 11, 2024 at 12:31 Bryce Mecum  wrote:

> Congratulations!
>
> On Thu, Apr 11, 2024 at 3:13 AM Sutou Kouhei  wrote:
> >
> > Hi,
> >
> > On behalf of the Arrow PMC, I'm happy to announce that Sarah
> > Gilmore has accepted an invitation to become a committer on
> > Apache Arrow. Welcome, and thank you for your contributions!
> >
> > Thanks,
> > --
> > kou
>


Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling MATLAB MEX Build Artifacts in Official Arrow Release

2024-03-12 Thread Sarah Gilmore
Hi Everyone,



We just wanted to close the loop on this discussion.



After further discussion with our colleagues at MathWorks, we determined that 
we can license the MEX binaries and ALL other contents included within the 
MLTBX files distributed via the ASF release infrastructure under the standard 
Apache V2 license.



ASF Legal agreed [1] that this approach abides by the ASF 3rd Party License 
Policy [2].



Moving forward, Kevin and I will continue working on integrating with the Arrow 
project's release infrastructure [3] as we initially planned.



We sincerely appreciate everyone's patience as we navigated these challenges.



[1] 
https://issues.apache.org/jira/browse/LEGAL-665?focusedCommentId=17823330&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17823330
[2] https://www.apache.org/legal/resolved.html
[3] https://github.com/apache/arrow/pull/38660



Best,



Sarah and Kevin


From: Sarah Gilmore 
Sent: Friday, January 26, 2024 1:28 PM
To: dev@arrow.apache.org 
Cc: Kevin Gurney 
Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
MATLAB MEX Build Artifacts in Official Arrow Release

Hi Ian,

Thanks for the feedback! We will proceed with the ASF Legal process. Once we 
hear back from them, we'll follow up on this thread to close the loop.

Thanks again!

Sarah and Kevin


From: Ian Cook 
Sent: Friday, January 26, 2024 11:37 AM
To: dev@arrow.apache.org 
Cc: Kevin Gurney 
Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
MATLAB MEX Build Artifacts in Official Arrow Release

Hi Sarah and Kevin,

Thanks for your thoughtful follow-up.

Based on all of this, it seems that this question will need to be
submitted to ASF Legal for consideration. I think it is quite clear
that this is a good-faith effort to abide by the spirit of the ASF 3rd
Party License Policy, but the specific details will need to be
considered by ASF Legal.

> The binaries we plan to submit, and the accompanying license,
> are similar to the use cases listed under “Handling Licenses That
> Prevent Modification” [3] in the Category B description. While most
> of the contents of the distributed MLTBX file would be Apache-
> licensed, the compiled MEX functions would be dynamically linked
> against proprietary MathWorks shared libraries, which would cause
> inclusion of non-Apache licensed object code.

Yes, I think that is the right approach to pursue with ASF Legal:
asking them to add the license that governs the MEX functions to the
list of approved licenses under [3].

Thanks,
Ian


On Fri, Jan 26, 2024 at 10:56 AM Sarah Gilmore
 wrote:
>
> Hi all,
>
> After consulting with some of our colleagues at MathWorks, we wanted to 
> follow up on this thread.
>
> Before going through the official ASF legal process, we wanted to give the 
> community some insight into our thinking about why our proposed license may 
> be appropriate for Category B consideration.
>
> Our interpretation of the ASF 3rd Party License Policy [1] was that Category 
> B licenses are not limited to standard licenses, but, rather, must meet the 
> Appropriately Labelled Condition and the Binary-Only Inclusion Condition. The 
> proposed license [2] we shared is intended to meet these conditions. However, 
> we understand that our interpretation may not be accurate.
>
> The binaries we plan to submit, and the accompanying license, are similar to 
> the use cases listed under “Handling Licenses That Prevent Modification” [3] 
> in the Category B description. While most of the contents of the distributed 
> MLTBX file would be Apache-licensed, the compiled MEX functions would be 
> dynamically linked against proprietary MathWorks shared libraries, which 
> would cause inclusion of non-Apache licensed object code.
>
> The goal of the proposed license is to allow the MLTBX file to be used and 
> distributed freely as an official ASF release artifact. Ideally, MathWorks 
> would like to restrict reverse engineering and modification of the 
> proprietary components and the proposed license includes a clause for this 
> restriction. Since the MATLAB Interface to Arrow will likely only be useful 
> to users of MathWorks products, our hope is that this restriction would not 
> be an impediment to users.
>
> We understand this is an unusual situation and appreciate the community's 
> support in helping us identify a solution.
>
> [1] https://www.apache.org/legal/resolved.html
> [2] https://github.com/apache/arrow/files/13955180/license.txt
> [3] https://www.apache.org/legal/resolved.html#no-modification
>
> Best R

Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling MATLAB MEX Build Artifacts in Official Arrow Release

2024-01-26 Thread Sarah Gilmore
Hi Ian,

Thanks for the feedback! We will proceed with the ASF Legal process. Once we 
hear back from them, we'll follow up on this thread to close the loop. 

Thanks again!

Sarah and Kevin


From: Ian Cook 
Sent: Friday, January 26, 2024 11:37 AM
To: dev@arrow.apache.org 
Cc: Kevin Gurney 
Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
MATLAB MEX Build Artifacts in Official Arrow Release 
 
Hi Sarah and Kevin,

Thanks for your thoughtful follow-up.

Based on all of this, it seems that this question will need to be
submitted to ASF Legal for consideration. I think it is quite clear
that this is a good-faith effort to abide by the spirit of the ASF 3rd
Party License Policy, but the specific details will need to be
considered by ASF Legal.

> The binaries we plan to submit, and the accompanying license,
> are similar to the use cases listed under “Handling Licenses That
> Prevent Modification” [3] in the Category B description. While most
> of the contents of the distributed MLTBX file would be Apache-
> licensed, the compiled MEX functions would be dynamically linked
> against proprietary MathWorks shared libraries, which would cause
> inclusion of non-Apache licensed object code.

Yes, I think that is the right approach to pursue with ASF Legal:
asking them to add the license that governs the MEX functions to the
list of approved licenses under [3].

Thanks,
Ian


On Fri, Jan 26, 2024 at 10:56 AM Sarah Gilmore
 wrote:
>
> Hi all,
>
> After consulting with some of our colleagues at MathWorks, we wanted to 
> follow up on this thread.
>
> Before going through the official ASF legal process, we wanted to give the 
> community some insight into our thinking about why our proposed license may 
> be appropriate for Category B consideration.
>
> Our interpretation of the ASF 3rd Party License Policy [1] was that Category 
> B licenses are not limited to standard licenses, but, rather, must meet the 
> Appropriately Labelled Condition and the Binary-Only Inclusion Condition. The 
> proposed license [2] we shared is intended to meet these conditions. However, 
> we understand that our interpretation may not be accurate.
>
> The binaries we plan to submit, and the accompanying license, are similar to 
> the use cases listed under “Handling Licenses That Prevent Modification” [3] 
> in the Category B description. While most of the contents of the distributed 
> MLTBX file would be Apache-licensed, the compiled MEX functions would be 
> dynamically linked against proprietary MathWorks shared libraries, which 
> would cause inclusion of non-Apache licensed object code.
>
> The goal of the proposed license is to allow the MLTBX file to be used and 
> distributed freely as an official ASF release artifact. Ideally, MathWorks 
> would like to restrict reverse engineering and modification of the 
> proprietary components and the proposed license includes a clause for this 
> restriction. Since the MATLAB Interface to Arrow will likely only be useful 
> to users of MathWorks products, our hope is that this restriction would not 
> be an impediment to users.
>
> We understand this is an unusual situation and appreciate the community's 
> support in helping us identify a solution.
>
> [1] https://www.apache.org/legal/resolved.html
> [2] https://github.com/apache/arrow/files/13955180/license.txt
> [3] https://www.apache.org/legal/resolved.html#no-modification
>
> Best Regards,
>
> Sarah and Kevin
>
>
> From: Sarah Gilmore 
> Sent: Friday, January 19, 2024 1:58 PM
> To: dev@arrow.apache.org 
> Cc: Kevin Gurney 
> Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
> MATLAB MEX Build Artifacts in Official Arrow Release
>
> Hi Roman,
>
> > FWIW: while these are all excellent questions for the pre-work, if there
> > needs to be an ultimate statement on this -- you'll have to file a LEGAL
> > JIRA. E.g.: https://issues.apache.org/jira/browse/LEGAL-506
> >
> > (plz include all the relevant details when filing it -- whatever comes
> > out of this thread).
>
> Thank you for the guidance. We suspected this may be the case and will be 
> sure to include all the relevant information when we file the Jira issue.
>
> Best,
>
> Sarah and Kevin
>
> From: Roman Shaposhnik 
> Sent: Friday, January 19, 2024 12:15 PM
> To: dev@arrow.apache.org 
> Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
> MATLAB MEX Build Artifacts in Official Arrow Release
>
> On Thu, Jan 18, 2024 at 12:24 PM Ian Cook  wrote:
> >
> > Hi Sarah,
> >
> > Thanks for pursuing this.
> >
> > The ASF 3rd Party License Policy lists a number of standard,
> > off-the-shelf l

Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling MATLAB MEX Build Artifacts in Official Arrow Release

2024-01-26 Thread Sarah Gilmore
Hi all,

After consulting with some of our colleagues at MathWorks, we wanted to 
follow up on this thread.

Before going through the official ASF legal process, we wanted to give the 
community some insight into our thinking about why our proposed license may be 
appropriate for Category B consideration.

Our interpretation of the ASF 3rd Party License Policy [1] was that Category B 
licenses are not limited to standard licenses, but, rather, must meet the 
Appropriately Labelled Condition and the Binary-Only Inclusion Condition. The 
proposed license [2] we shared is intended to meet these conditions. However, 
we understand that our interpretation may not be accurate. 

The binaries we plan to submit, and the accompanying license, are similar to 
the use cases listed under “Handling Licenses That Prevent Modification” [3] in 
the Category B description. While most of the contents of the distributed MLTBX 
file would be Apache-licensed, the compiled MEX functions would be dynamically 
linked against proprietary MathWorks shared libraries, which would cause 
inclusion of non-Apache licensed object code. 

The goal of the proposed license is to allow the MLTBX file to be used and 
distributed freely as an official ASF release artifact. Ideally, MathWorks 
would like to restrict reverse engineering and modification of the proprietary 
components and the proposed license includes a clause for this restriction. 
Since the MATLAB Interface to Arrow will likely only be useful to users of 
MathWorks products, our hope is that this restriction would not be an 
impediment to users. 

We understand this is an unusual situation and appreciate the community's 
support in helping us identify a solution.

[1] https://www.apache.org/legal/resolved.html
[2] https://github.com/apache/arrow/files/13955180/license.txt
[3] https://www.apache.org/legal/resolved.html#no-modification

Best Regards,

Sarah and Kevin


From: Sarah Gilmore 
Sent: Friday, January 19, 2024 1:58 PM
To: dev@arrow.apache.org 
Cc: Kevin Gurney 
Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
MATLAB MEX Build Artifacts in Official Arrow Release 
 
Hi Roman,

> FWIW: while these are all excellent questions for the pre-work, if there
> needs to be an ultimate statement on this -- you'll have to file a LEGAL
> JIRA. E.g.: https://issues.apache.org/jira/browse/LEGAL-506
>
> (plz include all the relevant details when filing it -- whatever comes
> out of this thread).

Thank you for the guidance. We suspected this may be the case and will be sure 
to include all the relevant information when we file the Jira issue. 

Best,

Sarah and Kevin

From: Roman Shaposhnik 
Sent: Friday, January 19, 2024 12:15 PM
To: dev@arrow.apache.org 
Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
MATLAB MEX Build Artifacts in Official Arrow Release 
 
On Thu, Jan 18, 2024 at 12:24 PM Ian Cook  wrote:
>
> Hi Sarah,
>
> Thanks for pursuing this.
>
> The ASF 3rd Party License Policy lists a number of standard,
> off-the-shelf licenses that are compatible with Category B, but the
> policy does not include any provision for custom-written licenses.
> This appears to be a custom-written license. Is that correct?
>
> Is this custom-written license based on one of the listed Category B
> licenses? If so, can you tell us which one? If not, can you provide
> some explanation of why this license should be considered to meet the
> criteria for Category B?

FWIW: while these are all excellent questions for the pre-work, if there
needs to be an ultimate statement on this -- you'll have to file a LEGAL
JIRA. E.g.: https://issues.apache.org/jira/browse/LEGAL-506

(plz include all the relevant details when filing it -- whatever comes
out of this thread).

Thanks,
Roman.


>
> Thank you,
> Ian
>
> On Wed, Jan 17, 2024 at 12:08 PM Sarah Gilmore
>  wrote:
> >
> > Hi Everyone,
> >
> > Kevin Gurney and I have been working on integrating the MATLAB Arrow 
> > bindings with the project's release processes in this pull request [1]. 
> > While working on integrating with the release tooling, we realized that we 
> > need to ensure that the licenses of any MEX artifacts [2] bundled with the 
> > released MLTBX [3] file are compatible with the ASF 3rd Party License 
> > Policy [4].
> >
> > After several rounds of discussion with some colleagues at MathWorks, we 
> > came up with a license [5] that is intended to meet the requirements for 
> > inclusion as a "Category B" [6] license according to the ASF 3rd Party 
> > License Policy.
> >
> > Our goal is to make sure we are doing the right thing here, so, as per 
> > Kou's suggestion [7], we wanted to share the proposed license [5] with the 
> > broader Arrow development community. We

Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling MATLAB MEX Build Artifacts in Official Arrow Release

2024-01-19 Thread Sarah Gilmore
Hi Roman,

> FWIW: while these are all excellent questions for the pre-work, if there
> needs to be an ultimate statement on this -- you'll have to file a LEGAL
> JIRA. E.g.: https://issues.apache.org/jira/browse/LEGAL-506
>
> (plz include all the relevant details when filing it -- whatever comes
> out of this thread).

Thank you for the guidance. We suspected this may be the case and will be sure 
to include all the relevant information when we file the Jira issue. 

Best,

Sarah and Kevin

From: Roman Shaposhnik 
Sent: Friday, January 19, 2024 12:15 PM
To: dev@arrow.apache.org 
Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
MATLAB MEX Build Artifacts in Official Arrow Release 
 
On Thu, Jan 18, 2024 at 12:24 PM Ian Cook  wrote:
>
> Hi Sarah,
>
> Thanks for pursuing this.
>
> The ASF 3rd Party License Policy lists a number of standard,
> off-the-shelf licenses that are compatible with Category B, but the
> policy does not include any provision for custom-written licenses.
> This appears to be a custom-written license. Is that correct?
>
> Is this custom-written license based on one of the listed Category B
> licenses? If so, can you tell us which one? If not, can you provide
> some explanation of why this license should be considered to meet the
> criteria for Category B?

FWIW: while these are all excellent questions for the pre-work, if there
needs to be an ultimate statement on this -- you'll have to file a LEGAL
JIRA. E.g.: https://issues.apache.org/jira/browse/LEGAL-506

(plz include all the relevant details when filing it -- whatever comes
out of this thread).

Thanks,
Roman.


>
> Thank you,
> Ian
>
> On Wed, Jan 17, 2024 at 12:08 PM Sarah Gilmore
>  wrote:
> >
> > Hi Everyone,
> >
> > Kevin Gurney and I have been working on integrating the MATLAB Arrow 
> > bindings with the project's release processes in this pull request [1]. 
> > While working on integrating with the release tooling, we realized that we 
> > need to ensure that the licenses of any MEX artifacts [2] bundled with the 
> > released MLTBX [3] file are compatible with the ASF 3rd Party License 
> > Policy [4].
> >
> > After several rounds of discussion with some colleagues at MathWorks, we 
> > came up with a license [5] that is intended to meet the requirements for 
> > inclusion as a "Category B" [6] license according to the ASF 3rd Party 
> > License Policy.
> >
> > Our goal is to make sure we are doing the right thing here, so, as per 
> > Kou's suggestion [7], we wanted to share the proposed license [5] with the 
> > broader Arrow development community. We understand this may need further 
> > input from ASF Legal as well.
> >
> > Please let us know what we can do to help move this forward. We sincerely 
> > appreciate everyone's support as we navigate these licensing requirements.
> >
> > [1] https://github.com/apache/arrow/pull/38660
> > [2] https://www.mathworks.com/help/matlab/call-mex-functions.html
> > [3] 
> > https://www.mathworks.com/help/matlab/creating-help.html?s_tid=CRUX_lftnav
> > [4] https://www.apache.org/legal/resolved.html
> > [5] https://github.com/apache/arrow/files/13955180/license.txt
> > [6] https://www.apache.org/legal/resolved.html#category-b
> > [7] https://github.com/apache/arrow/pull/38660#discussion_r1454804607
> >
> > Best,
> >
> > Sarah Gilmore
> >

Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling MATLAB MEX Build Artifacts in Official Arrow Release

2024-01-18 Thread Sarah Gilmore
Hi Ian,

Thank you for sharing your input on this.

> This appears to be a custom-written license. Is that correct?

Yes, that is correct. This is a custom-written license.

> If not, can you provide some explanation of why this license should be 
> considered to meet the criteria for Category B?

We want to make sure we provide a thorough response to this question. So, we 
would like to first consult with a few colleagues at MathWorks. Our sincere 
apologies for the delay.

We hope to get back to the community quickly on this.

Best,

Sarah and Kevin


From: Ian Cook 
Sent: Thursday, January 18, 2024 3:22 PM
To: sgilm...@mathworks.com.invalid 
Cc: dev ; Kevin Gurney 
Subject: Re: [DISCUSS][MATLAB] Proposed "Category B" License for Bundling 
MATLAB MEX Build Artifacts in Official Arrow Release

Hi Sarah,

Thanks for pursuing this.

The ASF 3rd Party License Policy lists a number of standard,
off-the-shelf licenses that are compatible with Category B, but the
policy does not include any provision for custom-written licenses.
This appears to be a custom-written license. Is that correct?

Is this custom-written license based on one of the listed Category B
licenses? If so, can you tell us which one? If not, can you provide
some explanation of why this license should be considered to meet the
criteria for Category B?

Thank you,
Ian

On Wed, Jan 17, 2024 at 12:08 PM Sarah Gilmore
 wrote:
>
> Hi Everyone,
>
> Kevin Gurney and I have been working on integrating the MATLAB Arrow bindings 
> with the project's release processes in this pull request [1]. While working 
> on integrating with the release tooling, we realized that we need to ensure 
> that the licenses of any MEX artifacts [2] bundled with the released MLTBX 
> [3] file are compatible with the ASF 3rd Party License Policy [4].
>
> After several rounds of discussion with some colleagues at MathWorks, we came 
> up with a license [5] that is intended to meet the requirements for inclusion 
> as a "Category B" [6] license according to the ASF 3rd Party License Policy.
>
> Our goal is to make sure we are doing the right thing here, so, as per Kou's 
> suggestion [7], we wanted to share the proposed license [5] with the broader 
> Arrow development community. We understand this may need further input from 
> ASF Legal as well.
>
> Please let us know what we can do to help move this forward. We sincerely 
> appreciate everyone's support as we navigate these licensing requirements.
>
> [1] https://github.com/apache/arrow/pull/38660
> [2] https://www.mathworks.com/help/matlab/call-mex-functions.html
> [3] https://www.mathworks.com/help/matlab/creating-help.html?s_tid=CRUX_lftnav
> [4] https://www.apache.org/legal/resolved.html
> [5] https://github.com/apache/arrow/files/13955180/license.txt
> [6] https://www.apache.org/legal/resolved.html#category-b
> [7] https://github.com/apache/arrow/pull/38660#discussion_r1454804607
>
> Best,
>
> Sarah Gilmore
>


[DISCUSS][MATLAB] Proposed "Category B" License for Bundling MATLAB MEX Build Artifacts in Official Arrow Release

2024-01-17 Thread Sarah Gilmore
Hi Everyone,

Kevin Gurney and I have been working on integrating the MATLAB Arrow bindings 
with the project's release processes in this pull request [1]. While working on 
integrating with the release tooling, we realized that we need to ensure that 
the licenses of any MEX artifacts [2] bundled with the released MLTBX [3] file 
are compatible with the ASF 3rd Party License Policy [4].

After several rounds of discussion with some colleagues at MathWorks, we came 
up with a license [5] that is intended to meet the requirements for inclusion 
as a "Category B" [6] license according to the ASF 3rd Party License Policy.

Our goal is to make sure we are doing the right thing here, so, as per Kou's 
suggestion [7], we wanted to share the proposed license [5] with the broader 
Arrow development community. We understand this may need further input from ASF 
Legal as well.

Please let us know what we can do to help move this forward. We sincerely 
appreciate everyone's support as we navigate these licensing requirements.

[1] https://github.com/apache/arrow/pull/38660
[2] https://www.mathworks.com/help/matlab/call-mex-functions.html
[3] https://www.mathworks.com/help/matlab/creating-help.html?s_tid=CRUX_lftnav
[4] https://www.apache.org/legal/resolved.html
[5] https://github.com/apache/arrow/files/13955180/license.txt
[6] https://www.apache.org/legal/resolved.html#category-b
[7] https://github.com/apache/arrow/pull/38660#discussion_r1454804607

Best,

Sarah Gilmore



Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB interface

2023-11-13 Thread Sarah Gilmore
Hi Kou and Raúl,

> Yes. We should use apache/arrow's GitHub Releases.

We agree that using apache/arrow's GitHub Releases is the right thing to do.

> GitHub releases do have a prerelease/rc status that can be activated. Maybe
> that could be used as an indicator to not include these on the exchange
> site?

We just got confirmation from the File Exchange development team that the File 
Exchange-GitHub Integration ignores pre-releases. So marking release candidates 
as pre-releases should prevent the File Exchange entry from linking to them.
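
As a rough illustration of why marking release candidates as pre-releases helps, here is a minimal Python sketch of a consumer that skips pre-releases. It assumes release records shaped like the GitHub REST API's release objects (`tag_name`, `prerelease`, `draft`) and assumes they arrive newest first. The function name and the sample data are hypothetical, and this is not a description of how the File Exchange integration is actually implemented.

```python
from typing import Dict, List, Optional


def latest_stable_release(releases: List[Dict]) -> Optional[Dict]:
    """Return the first release that is neither a draft nor a pre-release.

    Assumption: `releases` is ordered newest first and every record
    carries boolean `prerelease` and `draft` fields.
    """
    for release in releases:
        if release.get("draft") or release.get("prerelease"):
            continue  # release candidates marked as pre-releases are skipped
        return release
    return None


# Hypothetical sample data; the version numbers are illustrative only.
sample = [
    {"tag_name": "apache-arrow-15.0.0-rc1", "prerelease": True, "draft": False},
    {"tag_name": "apache-arrow-14.0.1", "prerelease": False, "draft": False},
]

stable = latest_stable_release(sample)
print(stable["tag_name"] if stable else "no stable release")
```

With this kind of filter, a release candidate only becomes visible to the consumer once it is republished as a regular release.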

>  Does it just use "polling"? Or do we need to install any
> GitHub App, set secret variable or something on
> apache/arrow? If the latter, we need to ask INFRA to do it.

The current version of the File Exchange GitHub integration relies on polling. 
If this changes, we'll follow up.

Based on the discussion so far, it seems like we have a clear path forward for 
integrating the MATLAB Interface into the Apache Arrow release process via 
GitHub Releases.
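
The workflow discussed in this thread (upload the MLTBX to an `apache-arrow-X.Y.Z-rcN` release, then copy it to `apache-arrow-X.Y.Z` after the vote) could be sketched as follows. This is only a sketch under assumptions: it builds `gh` command lines without running them, the function names, the `matlab-arrow.mltbx` file name, the `dist` directory, and the release notes text are hypothetical, and the real release scripts may differ (for example, the final release may already exist by the time the MLTBX is uploaded).

```python
import shlex
from typing import List

REPO = "apache/arrow"


def rc_upload_commands(version: str, rc: int, mltbx_path: str) -> List[List[str]]:
    """Commands to publish the MLTBX on a release-candidate release.

    The RC release is created with --prerelease so that consumers which
    ignore pre-releases do not pick it up.
    """
    tag = f"apache-arrow-{version}-rc{rc}"
    return [
        ["gh", "release", "create", tag, "--repo", REPO, "--prerelease",
         "--title", tag, "--notes", f"Release candidate {rc} for {version}"],
        ["gh", "release", "upload", tag, mltbx_path, "--repo", REPO],
    ]


def promote_commands(version: str, rc: int, mltbx_name: str,
                     download_dir: str = "dist") -> List[List[str]]:
    """Commands to copy the approved RC artifact to the final release."""
    rc_tag = f"apache-arrow-{version}-rc{rc}"
    final_tag = f"apache-arrow-{version}"
    return [
        ["gh", "release", "download", rc_tag, "--repo", REPO,
         "--pattern", mltbx_name, "--dir", download_dir],
        ["gh", "release", "create", final_tag, "--repo", REPO,
         "--title", final_tag, "--notes", f"Promoted from {rc_tag}"],
        ["gh", "release", "upload", final_tag,
         f"{download_dir}/{mltbx_name}", "--repo", REPO],
    ]


# Print the plan instead of executing it; version and file name are placeholders.
for cmd in rc_upload_commands("1.2.3", 0, "matlab-arrow.mltbx"):
    print(shlex.join(cmd))
for cmd in promote_commands("1.2.3", 0, "matlab-arrow.mltbx"):
    print(shlex.join(cmd))
```

Keeping the plan as data makes it easy to review the exact commands before a release manager runs them.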

Thanks again for all the advice!

Best,
Sarah Gilmore


From: Sutou Kouhei 
Sent: Sunday, November 12, 2023 4:25 PM
To: dev@arrow.apache.org 
Cc: Lei Hou ; Sarah Gilmore 
Subject: Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the 
MATLAB interface

Yes. We should use apache/arrow's GitHub Releases.

In 
"Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB 
interface" on Fri, 10 Nov 2023 19:11:16 +0100,
Raúl Cumplido  wrote:

> In case it was not clear, even though the binary job is run on
> ursacomputing/crossbow when we upload the binaries and create the
> Release that should be, at least in my opinion, an apache/arrow
> release.
>
> Both for the steps:
> 1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
> and
> 2.2 Upload it to GitHub Releases for apache-arrow-X.Y.Z
>
> On Fri, Nov 10, 2023 at 19:02, Raúl Cumplido wrote:
>>
>> Hi Sarah,
>>
>> On Fri, Nov 10, 2023 at 18:48, Sarah Gilmore wrote:
>> >
>> > Hi Kou,
>> >
>> > > We can use apache/arrow's GitHub Releases. The release
>> > > distribution document says that we can use GitHub as a
>> > > release platform:
>> > > https://infra.apache.org/release-distribution.html#other-platforms
>> > >
>> > > apache/arrow doesn't use GitHub Releases yet but
>> > > apache/arrow-adbc and apache/arrow-flight-sql-postgresql
>> > > already use GitHub Releases. (We just use "gh release
>> > > upload" to upload our artifacts to GitHub Releases.)
>> >
>> > Thank you for clarifying that we can use apache/arrow's GitHub Releases 
>> > area for hosting the MLTBX file. We assumed we couldn't use the main 
>> > repository, but it's great to hear we can!
>> >
>> > > BTW, how does File Exchange "Connecting to GitHub Repositories"?
>> > > https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub
>> > >
>> > > Does it just use "polling"? Or do we need to install any
>> > > GitHub App, set secret variable or something on
>> > > apache/arrow? If the latter, we need to ask INFRA to do it.
>> >
>> > We are currently consulting with the development team responsible for the 
>> > GitHub <-> File Exchange integration. We'll send a followup email with a 
>> > concrete answer once we know more.
>> >
>> > > If we use GitHub Releases on apache/arrow, we can use the
>> > > following workflow. We don't need to use JFrog.
>> > >
>> > > 1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
>> > > 2. Release: Run a post release script that would:
>> > > 2.1 Download MLTBX from GitHub Releases for apache-arrow-X.Y.Z-rcN
>> > > 2.2 Upload it to GitHub Releases for apache-arrow-X.Y.Z
>> > > 2.3 Linked File Exchange entry will be automatically updated
>> >
>> > This seems like a much more streamlined approach. Not having to upload to 
>> > JFrog will make things easier. Thanks for the suggestion!
>> >
>> > To clarify, in step 1, would we upload the MLTBX to 
>> > ursacomputing/crossbow's GitHub Releases area [1]? Or, would we upload to 
>> > apache/arrow's GitHub Releases area? If we upload release candidates to 
>> > apache/arrow's GitHub Releases area, they would get automatically linked 
>> > to the File Exchange. Ideally, we wouldn't want users to download release 
>> > candidates.
>>

Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB interface

2023-11-10 Thread Sarah Gilmore
Hi Raúl,

> Currently all the binaries are generated on the third step of the
> Release process [1] when we run `03-binary-submit.sh`. The crossbow
> job could build the MLTBX artifact and then when we do download the
> other binaries (`04-binary-download.sh`) we should also download the
> MLTBX and when we submit the rest to jfrog (`05-binary-upload.sh`) we
> could Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN.

Thanks for clarifying how these scripts work together. This all makes sense. 
Our one concern is that the Arrow-MATLAB File Exchange entry would be 
automatically updated to show release candidates that have been uploaded to 
apache/arrow's GitHub Releases area. We're looking into how to prevent this 
from happening.
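
One way to tell release candidates apart in a release script is by tag name. The sketch below assumes the `apache-arrow-X.Y.Z-rcN` and `apache-arrow-X.Y.Z` tag patterns mentioned in this thread; the function names are hypothetical, and a real script might rely on other signals as well.

```python
import re
from typing import Optional, Tuple

# Matches apache-arrow-X.Y.Z with an optional -rcN suffix.
_TAG_RE = re.compile(r"^apache-arrow-(\d+\.\d+\.\d+)(?:-rc(\d+))?$")


def parse_arrow_tag(tag: str) -> Optional[Tuple[str, Optional[int]]]:
    """Split a tag into (version, rc_number); rc_number is None for final tags.

    Returns None when the tag does not follow the assumed naming pattern.
    """
    match = _TAG_RE.match(tag)
    if match is None:
        return None
    version, rc = match.group(1), match.group(2)
    return version, (int(rc) if rc is not None else None)


def is_release_candidate(tag: str) -> bool:
    """True only for tags that carry an -rcN suffix."""
    parsed = parse_arrow_tag(tag)
    return parsed is not None and parsed[1] is not None


for tag in ("apache-arrow-1.2.3-rc0", "apache-arrow-1.2.3", "nightly"):
    print(tag, is_release_candidate(tag))
```

A check like this could decide whether an upload step should treat a release as a candidate rather than a final release.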

Best,

Sarah Gilmore


From: Raúl Cumplido 
Sent: Friday, November 10, 2023 1:11 PM
To: Raúl Cumplido 
Cc: Sutou Kouhei ; dev@arrow.apache.org 
; Lei Hou ; Sarah Gilmore 

Subject: Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the 
MATLAB interface

In case it was not clear, even though the binary job is run on
ursacomputing/crossbow when we upload the binaries and create the
Release that should be, at least in my opinion, an apache/arrow
release.

Both for the steps:
1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
and
2.2 Upload it to GitHub Releases for apache-arrow-X.Y.Z

On Fri, Nov 10, 2023 at 19:02, Raúl Cumplido wrote:
>
> Hi Sarah,
>
> On Fri, Nov 10, 2023 at 18:48, Sarah Gilmore wrote:
> >
> > Hi Kou,
> >
> > > We can use apache/arrow's GitHub Releases. The release
> > > distribution document says that we can use GitHub as a
> > > release platform:
> > > https://infra.apache.org/release-distribution.html#other-platforms
> > >
> > > apache/arrow doesn't use GitHub Releases yet but
> > > apache/arrow-adbc and apache/arrow-flight-sql-postgresql
> > > already use GitHub Releases. (We just use "gh release
> > > upload" to upload our artifacts to GitHub Releases.)
> >
> > Thank you for clarifying that we can use apache/arrow's GitHub Releases 
> > area for hosting the MLTBX file. We assumed we couldn't use the main 
> > repository, but it's great to hear we can!
> >
> > > BTW, how does File Exchange "Connecting to GitHub Repositories"?
> > > https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub
> > >
> > > Does it just use "polling"? Or do we need to install any
> > > GitHub App, set secret variable or something on
> > > apache/arrow? If the latter, we need to ask INFRA to do it.
> >
> > We are currently consulting with the development team responsible for the 
> > GitHub <-> File Exchange integration. We'll send a followup email with a 
> > concrete answer once we know more.
> >
> > > If we use GitHub Releases on apache/arrow, we can use the
> > > following workflow. We don't need to use JFrog.
> > >
> > > 1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
> > > 2. Release: Run a post release script that would:
> > > 2.1 Download MLTBX from GitHub Releases for apache-arrow-X.Y.Z-rcN
> > > 2.2 Upload it to GitHub Releases for apache-arrow-X.Y.Z
> > > 2.3 Linked File Exchange entry will be automatically updated
> >
> > This seems like a much more streamlined approach. Not having to upload to 
> > JFrog will make things easier. Thanks for the suggestion!
> >
> > To clarify, in step 1, would we upload the MLTBX to 
> > ursacomputing/crossbow's GitHub Releases area [1]? Or, would we upload to 
> > apache/arrow's GitHub Releases area? If we upload release candidates to 
> > apache/arrow's GitHub Releases area, they would get automatically linked to 
> > the File Exchange. Ideally, we wouldn't want users to download release 
> > candidates.
> >
>
> Currently all the binaries are generated on the third step of the
> Release process [1] when we run `03-binary-submit.sh`. The crossbow
> job could build the MLTBX artifact, and then when we download the
> other binaries (`04-binary-download.sh`) we should also download the
> MLTBX, and when we submit the rest to JFrog (`05-binary-upload.sh`) we
> could upload the MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN.
>
> Once the release is approved and we do the post-release tasks to
> "officially" release, we would download the MLTBX and upload it to the
> new GitHub Releases for apache-arrow-X.Y.Z. This can be done as another
> step in our post-release tasks (post-xx-matlab.sh).
>

Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the MATLAB interface

2023-11-10 Thread Sarah Gilmore
Hi Kou,

> We can use apache/arrow's GitHub Releases. The release
> distribution document says that we can use GitHub as a
> release platform:
> https://infra.apache.org/release-distribution.html#other-platforms
>
> apache/arrow doesn't use GitHub Releases yet but
> apache/arrow-adbc and apache/arrow-flight-sql-postgresql
> already use GitHub Releases. (We just use "gh release
> upload" to upload our artifacts to GitHub Releases.)

Thank you for clarifying that we can use apache/arrow's GitHub Releases area 
for hosting the MLTBX file. We assumed we couldn't use the main repository, but 
it's great to hear we can!

> BTW, how does File Exchange's "Connecting to GitHub Repositories" work?
> https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub
>
> Does it just use "polling"? Or do we need to install any
> GitHub App, set secret variable or something on
> apache/arrow? If the latter, we need to ask INFRA to do it.

We are currently consulting with the development team responsible for the 
GitHub <-> File Exchange integration. We'll send a follow-up email with a 
concrete answer once we know more.

> If we use GitHub Releases on apache/arrow, we can use the
> following workflow. We don't need to use JFrog.
>
> 1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
> 2. Release: Run a post release script that would:
> 2.1 Download MLTBX from GitHub Releases for apache-arrow-X.Y.Z-rcN
> 2.2 Upload it to GitHub Releases for apache-arrow-X.Y.Z
> 2.3 Linked File Exchange entry will be automatically updated

This seems like a much more streamlined approach. Not having to upload to JFrog 
will make things easier. Thanks for the suggestion!
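
As a rough sketch of steps 2.1 and 2.2 in the quoted workflow, the post-release step could be a thin Python wrapper around the `gh` CLI. The tag names and the apache/arrow repository come from this thread; the `*.mltbx` file pattern, the `mltbx` working directory, and all function names are assumptions made for illustration. It also assumes the apache-arrow-X.Y.Z release already exists on GitHub before the upload runs.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List

REPO = "apache/arrow"  # repository discussed in the thread


def rc_tag(version: str, rc: int) -> str:
    """Tag of the release candidate, e.g. apache-arrow-X.Y.Z-rcN."""
    return f"apache-arrow-{version}-rc{rc}"


def release_tag(version: str) -> str:
    """Tag of the final release, e.g. apache-arrow-X.Y.Z."""
    return f"apache-arrow-{version}"


def download_command(version: str, rc: int, dest: str) -> List[str]:
    """Step 2.1: fetch the MLTBX attached to the RC release."""
    return ["gh", "release", "download", rc_tag(version, rc),
            "--repo", REPO, "--pattern", "*.mltbx", "--dir", dest]


def upload_command(version: str, files: List[str]) -> List[str]:
    """Step 2.2: attach the same files to the final release."""
    return ["gh", "release", "upload", release_tag(version), *files,
            "--repo", REPO]


def promote(version: str, rc: int, dest: str = "mltbx") -> None:
    """Run both steps; requires an authenticated `gh` on PATH."""
    subprocess.run(download_command(version, rc, dest), check=True)
    # Expand the glob in Python because no shell is involved here.
    files = sorted(str(p) for p in Path(dest).glob("*.mltbx"))
    if not files:
        raise FileNotFoundError(f"no MLTBX files found in {dest}")
    subprocess.run(upload_command(version, files), check=True)


# Printing the commands is a safe way to review them before running promote().
print(shlex.join(download_command("X.Y.Z", 0, "mltbx")))
```

Building the argument lists separately from running them keeps the step easy to dry-run and to review in a release script.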

To clarify, in step 1, would we upload the MLTBX to ursacomputing/crossbow's 
GitHub Releases area [1]? Or, would we upload to apache/arrow's GitHub Releases 
area? If we upload release candidates to apache/arrow's GitHub Releases area, 
they would get automatically linked to the File Exchange. Ideally, we wouldn't 
want users to download release candidates.

> We can use GitHub Releases as I said. But if we use GitHub
> Releases, the release notes on GitHub Releases may include
> not only the MATLAB interface but also all
> implementations. It may not be useful for this use case.
>
> FYI: The R bindings have their release notes under
> https://arrow.apache.org/docs/r/ . See
> https://arrow.apache.org/docs/r/news/ .

We think it would still be useful to link to the GitHub release notes from the 
File Exchange entry even if it includes notes for all language bindings. The 
File Exchange <-> GitHub integration just includes a link to the GitHub release 
notes under the Version History tab. If we find having a more focused version 
of the release notes would be useful, then we can create a markdown file 
analogous to the NEWS.md for the R bindings as you suggested (thanks for 
pointing this out).

[1] https://github.com/ursacomputing/crossbow/releases

Thanks for all your help!

Best,

Sarah Gilmore
____
From: Sutou Kouhei 
Sent: Thursday, November 9, 2023 7:50 PM
To: dev@arrow.apache.org 
Cc: Sarah Gilmore ; Lei Hou 
Subject: Re: [DISCUSS][MATLAB] Proposal for incremental point releases of the 
MATLAB interface

Hi,

> One open question about this approach: which GitHub
> repository should we use for hosting the MLTBX via GitHub
> Releases?
>
> We don't think using the main apache/arrow GitHub Releases
> area is the right approach. So, would it make sense to
> create a separate "bridge" repository just for hosting the
> latest MLTBX files? Should this be an ASF associated
> repository like apache/arrow-matlab or would a MathWorks
> associated repository like mathworks/arrow-matlab be OK?
> We aren't sure what makes the most sense here, but welcome
> any suggestions.

We can use apache/arrow's GitHub Releases. The release
distribution document says that we can use GitHub as a
release platform:
https://infra.apache.org/release-distribution.html#other-platforms

apache/arrow doesn't use GitHub Releases yet but
apache/arrow-adbc and apache/arrow-flight-sql-postgresql
already use GitHub Releases. (We just use "gh release
upload" to upload our artifacts to GitHub Releases.)

BTW, how does File Exchange's "Connecting to GitHub Repositories" work?
https://www.mathworks.com/matlabcentral/content/fx/about.html#Why_GitHub

Does it just use "polling"? Or do we need to install any
GitHub App, set secret variable or something on
apache/arrow? If the latter, we need to ask INFRA to do it.

If we use GitHub Releases on apache/arrow, we can use the
following workflow. We don't need to use JFrog.

1. RC: Upload MLTBX to GitHub Releases for apache-arrow-X.Y.Z-rcN
2. Relea

Re: [Parquet][C++][Python] Maximum Row Group Length Default

2021-11-16 Thread Sarah Gilmore
Hi Micah,

Thanks for clarifying! I just created a Jira issue 
(https://issues.apache.org/jira/browse/ARROW-14723) to track the issue with 
pyarrow.

Thanks again!

Sarah

From: Micah Kornfield 
Sent: Monday, November 15, 2021 3:34 PM
To: dev 
Subject: Re: [Parquet][C++][Python] Maximum Row Group Length Default

>
> I was wondering if anyone could elaborate on why the default maximum row
> group length is set to 67108864
> (https://github.com/apache/arrow/blob/5c936560c1da003baf714d67dc92f25670730c84/cpp/src/parquet/properties.h#L97).
> From Apache Parquet's documentation
> (https://parquet.apache.org/documentation/latest/), the recommended row
> group size is between 512 MB and 1 GB.
> For a Float64Array whose length is 67108864, I believe its size would be
> approximately 545 MB, which is on the low end of that interval.


I don't think we currently have any heuristic around row group size and the
row count (we probably should try adding one). Even the default seems
pretty high, since in general Parquet files are going to have more than one
column per row group.


> I experimented with setting the default maximum row group length to larger
> values and noticed pyarrow cannot import Parquet files containing row
> groups whose lengths exceed 2147483647 rows (int32 max). However, I was
> able to read these files in using the C++ Arrow bindings.

This is surprising, and without seeing the exact error it sounds like a
bug. Could you open a JIRA to discuss (or check if there is already one
tracking this)?


On Mon, Nov 15, 2021 at 12:23 PM Sarah Gilmore 
wrote:

> Hi all,
>
> I was wondering if anyone could elaborate on why the default maximum row
> group length is set to 67108864
> (https://github.com/apache/arrow/blob/5c936560c1da003baf714d67dc92f25670730c84/cpp/src/parquet/properties.h#L97).
> From Apache Parquet's documentation
> (https://parquet.apache.org/documentation/latest/), the recommended row
> group size is between 512 MB and 1 GB.
> For a Float64Array whose length is 67108864, I believe its size would be
> approximately 545 MB, which is on the low end of that interval.
>
> I was wondering if there was a particular reason why 67108864 was chosen
> as the maximum row group length. I experimented with setting the default
> maximum row group length to larger values and noticed pyarrow cannot import
> Parquet files containing row groups whose lengths exceed 2147483647 rows
> (int32 max). However, I was able to read these files in using the C++ Arrow
> bindings.
>
>
> Best,
> Sarah
>
>
>


[Parquet][C++][Python] Maximum Row Group Length Default

2021-11-15 Thread Sarah Gilmore
Hi all,

I was wondering if anyone could elaborate on why the default maximum row group 
length is set to 67108864. From Apache Parquet's documentation, the recommended 
row group size is between 512 MB and 1 GB. For a Float64Array whose length is 
67108864, I believe its size would be approximately 545 MB, which is on the low 
end of that interval.
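
One way to arrive at a figure of roughly 545 MB is sketched below. The message does not say how the size was computed, so this is an assumption: it counts 8 bytes per float64 value plus a validity bitmap of 1 bit per value, as in the Arrow in-memory layout. The function name is made up for this example.

```python
DEFAULT_MAX_ROW_GROUP_LENGTH = 67_108_864  # value quoted in the message (2**26)
INT32_MAX = 2_147_483_647                  # row-count limit mentioned for pyarrow


def float64_array_bytes(length: int, with_validity_bitmap: bool = True) -> int:
    """Approximate in-memory size of a Float64Array with `length` values."""
    data = length * 8  # 8 bytes per float64 value
    # Assumption: 1 validity bit per value, rounded up to whole bytes.
    validity = (length + 7) // 8 if with_validity_bitmap else 0
    return data + validity


size = float64_array_bytes(DEFAULT_MAX_ROW_GROUP_LENGTH)
print(f"{size} bytes = {size / 1e6:.1f} MB = {size / 2**20:.1f} MiB")
# prints "545259520 bytes = 545.3 MB = 520.0 MiB"
```

Without the bitmap the data buffer alone is 536870912 bytes (512 MiB), so either way a single float64 column at the default length sits near the lower bound of the recommended range.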

I was wondering if there was a particular reason why 67108864 was chosen as the 
maximum row group length. I experimented with setting the default maximum row 
group length to larger values and noticed pyarrow cannot import Parquet files 
containing row groups whose lengths exceed 2147483647 rows (int32 max). 
However, I was able to read these files in using the C++ Arrow bindings.


Best,
Sarah




Re: [Parquet, C++] Writing Compliant Nested Types to Parquet

2021-10-26 Thread Sarah Gilmore
Hi Micah,

Thanks for clearing this up for me!

Best,
Sarah

From: Micah Kornfield 
Sent: Monday, October 25, 2021 12:06 PM
To: dev 
Subject: Re: [Parquet, C++] Writing Compliant Nested Types to Parquet

Hi Sarah,
For new consumers of the library, setting it to true probably makes sense.
For Arrow itself, it is a breaking change (it breaks field naming and exact
round tripping), which is why we haven't flipped it yet.
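
To illustrate the field-naming change, here is a small Python model of the Parquet column path written for a list column. It does not call into Arrow; `list_leaf_path` is a hypothetical helper. It assumes the behavior described in the pyarrow documentation for this option: the compliant layout names the inner field `element`, while the non-compliant layout reuses the Arrow item field name, which defaults to `item`.

```python
def list_leaf_path(column: str, compliant: bool,
                   arrow_item_name: str = "item") -> str:
    """Dotted Parquet column path for the values of a list column.

    Hypothetical model, not an Arrow API: both layouts use a repeated
    middle group named `list`; only the name of the inner field differs.
    """
    element_name = "element" if compliant else arrow_item_name
    return f"{column}.list.{element_name}"


for compliant in (False, True):
    print(f"compliant={compliant}: {list_leaf_path('values', compliant)}")
```

A file written with the option enabled reads back with the inner field called `element` rather than the original Arrow name, which is the round-tripping difference mentioned above.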

More discussion is happening on
https://issues.apache.org/jira/browse/ARROW-14196

Cheers,
Micah

On Mon, Oct 25, 2021 at 7:49 AM Sarah Gilmore 
wrote:

> Hi all,
>
> I have a question about writing nested datatypes to Parquet files. I
> noticed that the use_compliant_nested_types option is set to false by
> default. I was wondering if there was a particular reason this is set to
> false by default. I noticed a comment in parquet/properties.h
> (https://github.com/apache/arrow/blob/be665ef948cb2c6706c60053c5db918e948713e8/cpp/src/parquet/properties.h#L650)
> suggesting the default should be flipped to true at some point. Is there a
> reason why you wouldn't want to set this option to true?
>
> Thanks,
> Sarah
>


[Parquet, C++] Writing Compliant Nested Types to Parquet

2021-10-25 Thread Sarah Gilmore
Hi all,

I have a question about writing nested datatypes to Parquet files. I noticed 
that the use_compliant_nested_types option is set to false by default. I was 
wondering if there was a particular reason this is set to false by default. I 
noticed a comment in parquet/properties.h suggesting the default should be 
flipped to true at some point. Is there a reason why you wouldn't want to set 
this option to true?

Thanks,
Sarah


[GitHub] Pull Request 10305

2021-06-23 Thread Sarah Gilmore
Hi all,

David Li suggested I email the mailing list to see if anyone would be 
interested in reviewing this pull request for the MATLAB interface to Arrow.

If anyone has time to look at it that would be great, and if there's anything 
we can do to help please let us know.

Best,
Sarah


Re: [MATLAB] Label for MATLAB Pull Requests

2021-05-25 Thread Sarah Gilmore
Hi Jorge,

Thanks for the info! I'll create a Jira task and pull request.

Best,
Sarah

From: Jorge Cardoso Leitão 
Sent: Friday, May 21, 2021 12:40 PM
To: dev@arrow.apache.org 
Subject: Re: [MATLAB] Label for MATLAB Pull Requests

Hi,

Could you create a JIRA and PR? The relevant place is here [1]

Best,
Jorge

[1]
https://github.com/apache/arrow/blob/master/.github/workflows/dev_pr/labeler.yml#L1



On Fri, May 21, 2021, 17:46 Sarah Gilmore  wrote:

> Hi all,
>
> I was looking through the list of open pull-requests and I noticed a few
> of them are tagged with language-specific labels, such as "lang-c++". I was
> wondering if it's possible to add a "lang-matlab" label that we could use
> to better organize our pull requests for the MATLAB interface to Arrow?
>
> Best,
> Sarah
>


[MATLAB] Label for MATLAB Pull Requests

2021-05-21 Thread Sarah Gilmore
Hi all,

I was looking through the list of open pull-requests and I noticed a few of 
them are tagged with language-specific labels, such as "lang-c++". I was 
wondering if it's possible to add a "lang-matlab" label that we could use to 
better organize our pull requests for the MATLAB interface to Arrow?

Best,
Sarah


[JIRA Permissions] Requesting change to "Contributor"

2021-04-14 Thread Sarah Gilmore
Hi all,

I just created a Jira issue and I would like to be added as a Jira
contributor so that I can assign the issue to myself. My Jira username is
sgilmore. If this is possible, that would be great!

Best,
Sarah