Re: [ANNOUNCE] New Parquet PMC Member: Gang Wu

2024-05-12 Thread Gang Wu
Thanks all! It is really a pleasure to be a part of the community. Best, Gang On Sun, May 12, 2024 at 2:06 PM Ed Seidl wrote: > +1 :-) Congrats, Gang! > > On 5/11/24 4:05 PM, Micah Kornfield wrote: > > Congrats Gang! > > > > On Sat, May 11, 2024 at 12:15 PM Vinoo Ganesh > > wrote: > > > >> C

Re: [DISCUSS] Propose changing the default branch of the parquet-site repo

2024-05-12 Thread Gang Wu
+1 This makes sense. I was also confused when I had access to parquet-site for the first time. Thanks Andrew! Best, Gang On Sun, May 12, 2024 at 3:15 AM Vinoo Ganesh wrote: > +1, this would be great. It's something Xinli and I discussed when we first > made the website updates, but it ended u

Re: [PR] Update README.md on asf-site branch with pointer to real readme [parquet-site]

2024-05-12 Thread via GitHub
wgtmac commented on PR #57: URL: https://github.com/apache/parquet-site/pull/57#issuecomment-2106151363 Thanks Andrew for doing this! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] Update README.md on asf-site branch with pointer to real readme [parquet-site]

2024-05-12 Thread via GitHub
wgtmac merged PR #57: URL: https://github.com/apache/parquet-site/pull/57

Re: [PR] Add Dockerfile + instructions on how to preview site using docker rather than installing `hugo` locally [parquet-site]

2024-05-12 Thread via GitHub
wgtmac commented on PR #56: URL: https://github.com/apache/parquet-site/pull/56#issuecomment-2106152954 For the staging site, I had a discussion with @gszadovszky here: https://github.com/apache/parquet-site/pull/31#issuecomment-1474023977. I think we can remove the staging site now and use

Re: [PR] Add Dockerfile + instructions on how to preview site using docker rather than installing `hugo` locally [parquet-site]

2024-05-12 Thread via GitHub
wgtmac merged PR #56: URL: https://github.com/apache/parquet-site/pull/56

Re: [PR] First draft of docs about parquet format vs mr [parquet-site]

2024-05-12 Thread via GitHub
wgtmac commented on PR #53: URL: https://github.com/apache/parquet-site/pull/53#issuecomment-2106156421 @xhochy @pitrou @tustvold Would you like to take a final pass? Will merge it if there is no further comment next week.

Re: [DISCUSS] Propose changing the default branch of the parquet-site repo

2024-05-12 Thread Uwe L. Korn
+1 On Sun, May 12, 2024, at 9:31 AM, Gang Wu wrote: > +1 > > This makes sense. I was also confused when I had access to > parquet-site for the first time. > > Thanks Andrew! > > Best, > Gang > > On Sun, May 12, 2024 at 3:15 AM Vinoo Ganesh wrote: > >> +1, this would be great. It's something Xinl

Re: [PR] First draft of docs about parquet format vs mr [parquet-site]

2024-05-12 Thread via GitHub
xhochy commented on code in PR #53: URL: https://github.com/apache/parquet-site/pull/53#discussion_r1597576454 ## content/en/docs/Overview/_index.md: ## @@ -7,3 +7,40 @@ description: > --- Apache Parquet is a columnar storage format available to any project in the Hadoop ec

Re: [PR] First draft of docs about parquet format vs mr [parquet-site]

2024-05-12 Thread via GitHub
wgtmac commented on code in PR #53: URL: https://github.com/apache/parquet-site/pull/53#discussion_r1597576971 ## content/en/docs/Overview/_index.md: ## @@ -7,3 +7,40 @@ description: > --- Apache Parquet is a columnar storage format available to any project in the Hadoop ec

[DISCUSS] Add geometry logical type

2024-05-12 Thread Gang Wu
Hi, Apache Iceberg community is proposing to add geospatial support [1]. It would be good if Apache Parquet can support native geometry type to implement more efficient encoding, statistics and filtering. Therefore, I'd like to propose a format change to add a new geometry logical type: [2]. It is

Re: Interest in Parquet V3

2024-05-12 Thread Gang Wu
Hi Micah, I have also noticed the emergence of these new file formats which are challenging the popularity of Apache Parquet. It would always be good to evolve Parquet to be competitive. Personally I'm +1 on this. I'm also proposing adding a new geometry type to the specs: [1]. This seems to align

Re: [PR] Add Dockerfile + instructions on how to preview site using docker rather than installing `hugo` locally [parquet-site]

2024-05-12 Thread via GitHub
alamb commented on PR #56: URL: https://github.com/apache/parquet-site/pull/56#issuecomment-2106180380 Thanks @wgtmac and @vinooganesh

Re: Interest in Parquet V3

2024-05-12 Thread Andrew Lamb
My opinion is that most (if not all) of the proposed benefits from these new formats can be achieved using the current parquet format and improved implementations (possibly with some minor extensions such as user defined encoding schemes) [1]. Another reason people propose replacing parquet I think

Re: Fwd: [C++] Parquet and Arrow overlap

2024-05-12 Thread Gang Wu
I have just finished a round of checking parquet-cpp open issues and resolved some of them which I believe are completed. I will start a vote next week. Once the migration is done, what should we do with the Parquet tickets? Now all the Arrow tickets are immutable. However, Parquet tickets should

[PR] Remove staging [parquet-site]

2024-05-12 Thread via GitHub
vinooganesh opened a new pull request, #58: URL: https://github.com/apache/parquet-site/pull/58 There still needs to be an infra ticket filed to actually delete the `staging` branch (unless a PMC member can delete the branch)

Re: [PR] Remove staging [parquet-site]

2024-05-12 Thread via GitHub
vinooganesh commented on PR #58: URL: https://github.com/apache/parquet-site/pull/58#issuecomment-2106372596 cc @wgtmac @gszadovszky @alamb after conversation on https://github.com/apache/parquet-site/pull/56

Re: [PR] First draft of docs about parquet format vs mr [parquet-site]

2024-05-12 Thread via GitHub
vinooganesh commented on code in PR #53: URL: https://github.com/apache/parquet-site/pull/53#discussion_r1597713221 ## content/en/docs/Overview/_index.md: ## @@ -7,3 +7,40 @@ description: > --- Apache Parquet is a columnar storage format available to any project in the Hado

Re: [PR] First draft of docs about parquet format vs mr [parquet-site]

2024-05-12 Thread via GitHub
vinooganesh commented on PR #53: URL: https://github.com/apache/parquet-site/pull/53#issuecomment-2106374201 Thanks - and just to make sure it's clear, my main goal was to start the process of actually documenting the institutional knowledge in the community and this PR is mostly intended a

Re: Interest in Parquet V3

2024-05-12 Thread Vinoo Ganesh
I don't have strong feelings about this one way or the other, but would gladly put my hand up to help collaborate on proposals/implementation as we figure this out. On Sun, May 12, 2024 at 5:31 AM Andrew Lamb wrote: > My opinion is that most (if not all) of the proposed benefits from these >

Re: [PR] Remove staging [parquet-site]

2024-05-12 Thread via GitHub
alamb commented on PR #58: URL: https://github.com/apache/parquet-site/pull/58#issuecomment-2106376324 We may also want to update the readme too: https://github.com/apache/parquet-site/blob/production/README.md

Re: [PR] First draft of docs about parquet format vs mr [parquet-site]

2024-05-12 Thread via GitHub
alamb commented on PR #53: URL: https://github.com/apache/parquet-site/pull/53#issuecomment-2106376534 > Thanks - and just to make sure it's clear, my main goal was to start the process of actually documenting the institutional knowledge in the community and this PR is mostly intended as a

Re: [PR] Remove staging [parquet-site]

2024-05-12 Thread via GitHub
vinooganesh commented on PR #58: URL: https://github.com/apache/parquet-site/pull/58#issuecomment-2106379277 Yep, there is actually a sequencing of things that need to happen here: 1. Deleting the `asf-staging` branch 2. Deleting the `staging` branch 3. Deleting the README from the p

Re: [PR] Remove staging [parquet-site]

2024-05-12 Thread via GitHub
alamb commented on PR #58: URL: https://github.com/apache/parquet-site/pull/58#issuecomment-2106397019 > The main thing I'm curious about is whether the PMC can delete branches easily from github. If so, it may be much more straightforward, otherwise will have to file INFRA tickets In

Re: [PR] Remove staging [parquet-site]

2024-05-12 Thread via GitHub
wgtmac commented on PR #58: URL: https://github.com/apache/parquet-site/pull/58#issuecomment-2106471367 I think I can delete the staging branch. Before that, should we send a notice to the dev ML in case there is any objection? Maybe we can set a deadline and proceed after that. I don't thi

Re: [PR] First draft of docs about parquet format vs mr [parquet-site]

2024-05-12 Thread via GitHub
jorisvandenbossche commented on code in PR #53: URL: https://github.com/apache/parquet-site/pull/53#discussion_r1597944460 ## content/en/docs/Overview/_index.md: ## @@ -7,3 +7,41 @@ description: > --- Apache Parquet is a columnar storage format available to any project in th