Hi
I have encountered a regression when writing nulls to a complex type. I
recently moved from parquet 1.8.x to 1.12.
Here is what I found out.
My dataset has 111k null values to be written to a complex type. Earlier,
with 1.8.x, it would create a single page, but with 1.12 it creates 20 pages
[
https://issues.apache.org/jira/browse/PARQUET-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829769#comment-16829769
]
Deepak Majeti commented on PARQUET-1405:
Filed https://issues.apache.org/jira/b
[
https://issues.apache.org/jira/browse/PARQUET-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829735#comment-16829735
]
Wes McKinney commented on PARQUET-1405:
---
We can add an option to not write statis
[
https://issues.apache.org/jira/browse/PARQUET-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829711#comment-16829711
]
Deepak Majeti edited comment on PARQUET-1405 at 4/29/19 8:59 PM:
[
https://issues.apache.org/jira/browse/PARQUET-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829711#comment-16829711
]
Deepak Majeti commented on PARQUET-1405:
PARQUET-979 omits large statistics ins
[
https://issues.apache.org/jira/browse/PARQUET-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Deepak Majeti reassigned PARQUET-1405:
--
Assignee: Deepak Majeti
> [C++] 'Couldn't deserialize thrift' error when reading lar
Hi,
A video call sounds more secure to me than a photo, which can be easily
manipulated. We could spend 5 minutes on it in the next Parquet sync.
Alternatively, is there someone already in the web of trust who
would volunteer to do a private video call with us before or after the
sync?
Thanks,
Z
On Mon, Apr 29, 2019 at 12:48 PM Zoltan Ivanfi
wrote:
>
> Hi,
>
> An excerpt from
> https://www.apache.org/dev/release-signing#verifying-signature : "A
> signature is valid, if gpg verifies the .asc as a good signature, and
> doesn't complain about expired or revoked keys." Another excerpt from
>
Hi,
An excerpt from
https://www.apache.org/dev/release-signing#verifying-signature : "A
signature is valid, if gpg verifies the .asc as a good signature, and
doesn't complain about expired or revoked keys." Another excerpt from
https://www.apache.org/dev/release-signing#check-integrity that
reinfo
Yeah, you are right. Looks like the right JIRA ticket.
On Mon, 29 Apr 2019 at 5:39 PM, Curt Hagenlocher
wrote:
> Would that be covered by PARQUET-458 (
> https://issues.apache.org/jira/browse/PARQUET-458)?
>
> On Mon, Apr 29, 2019 at 8:18 AM Wes McKinney wrote:
>
> > Is there a JIRA issue about
Not in V2. In V1 the whole page is compressed, but in V2 only the values
are, if I remember correctly. So we would have to extract the repetition and
definition level bytes first and then decode the values.
You can check out the code in the parquet Rust module!
I am not sure about parquet-cpp, we can use that implementati
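For readers following along, the V2 page handling described above can be sketched roughly as follows. This is only an illustration, not code from parquet-cpp or the Rust module: the function name `split_data_page_v2` and the `PageV2Parts` type are made up here, the length and flag parameters mirror the `repetition_levels_byte_length`, `definition_levels_byte_length` and `is_compressed` fields of the V2 page header, and zlib stands in for the real codec (Snappy is not in the Python standard library).

```python
import zlib
from typing import Callable, NamedTuple


class PageV2Parts(NamedTuple):
    """Hypothetical container for the three regions of a data page v2 body."""
    repetition_levels: bytes
    definition_levels: bytes
    values: bytes


def split_data_page_v2(
    body: bytes,
    rep_levels_byte_length: int,
    def_levels_byte_length: int,
    is_compressed: bool,
    decompress: Callable[[bytes], bytes],
) -> PageV2Parts:
    """Split a V2 page body; only the trailing values region is decompressed."""
    levels_len = rep_levels_byte_length + def_levels_byte_length
    if (
        rep_levels_byte_length < 0
        or def_levels_byte_length < 0
        or levels_len > len(body)
    ):
        raise ValueError("level byte lengths do not fit inside the page body")

    # The level bytes are stored uncompressed at the start of the page body.
    rep = body[:rep_levels_byte_length]
    dfn = body[rep_levels_byte_length:levels_len]

    # Everything after the levels is the values region.
    values = body[levels_len:]
    if is_compressed:
        values = decompress(values)
    return PageV2Parts(rep, dfn, values)


if __name__ == "__main__":
    # zlib is an assumption used only to make the demo self-contained.
    demo = b"\x01\x02" + b"\x03\x04\x05" + zlib.compress(b"values")
    print(split_data_page_v2(demo, 2, 3, True, zlib.decompress))
```

If this reading of the format is right, passing the whole V2 page body to the decompressor would fail, because the leading level bytes are not part of the compressed stream.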
hi Zoltan,
I'm looking for ASF guidelines around this, whether it is MUST or SHOULD
https://www.apache.org/dev/release-signing#web-of-trust
Because SVN access is only password protected, having access to the
KEYS file is a weak standard of security. Could other PMC members
comment on this?
Than
Hi Wes,
Gabor's key is in the KEYS file available at
https://dist.apache.org/repos/dist/dev/parquet/KEYS Others may correct me
if I'm mistaken, but as far as I know, this is all that is required. I
mentioned this in the verification steps as well ("4. Verify the signature
by running `gpg --verify
-1
Gabor's PGP key is unsigned.
$ gpg --verify apache-parquet-1.11.0.tar.gz.asc
gpg: assuming signed data in 'apache-parquet-1.11.0.tar.gz'
gpg: Signature made Tue 19 Mar 2019 08:55:48 AM CDT
gpg:                using RSA key 6FB82970311551C7CEF131F5021057DBF048F543
gpg: Good signature from "Gabo
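The distinction under discussion in this thread (a good signature versus a key that is actually trusted) can be made explicit by looking at gpg's machine-readable status lines, which `gpg --status-fd 1 --verify <file>.asc` prints alongside the human-readable output. The sketch below is only an illustration: the helper name `summarize_gpg_status` is made up, and the sample status text uses a placeholder key ID and signer name, not real output for the key above.

```python
# Status keywords that gpg emits when the key is considered trusted.
TRUSTED = {"TRUST_FULLY", "TRUST_ULTIMATE"}


def summarize_gpg_status(status_text: str) -> dict:
    """Summarize `gpg --status-fd` output into two independent facts."""
    keywords = set()
    for line in status_text.splitlines():
        # Status lines look like: "[GNUPG:] KEYWORD arg1 arg2 ..."
        if line.startswith("[GNUPG:] "):
            parts = line.split()
            if len(parts) >= 2:
                keywords.add(parts[1])

    # GOODSIG is only emitted for a valid signature from a key that is
    # neither expired nor revoked (those cases use other keywords).
    good = "GOODSIG" in keywords
    return {
        "good_signature": good,
        "key_trusted": good and bool(keywords & TRUSTED),
    }


if __name__ == "__main__":
    # Placeholder sample: key ID and name are hypothetical.
    sample = (
        "[GNUPG:] GOODSIG 0123456789ABCDEF Example Signer\n"
        "[GNUPG:] TRUST_UNDEFINED 0 pgp\n"
    )
    print(summarize_gpg_status(sample))
```

With a status stream like the sample, the signature checks out cryptographically but the key has no trust path, which matches the situation of an unsigned key.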
[
https://issues.apache.org/jira/browse/PARQUET-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829371#comment-16829371
]
John Adcock commented on PARQUET-1405:
--
I'm being hit by this issue, I'm happy to
Dear Apache communities,
Big thanks to those who participated in the survey (great participation from
the Apache communities). If you haven’t participated yet, please consider doing so.
What value is there in participating?
I will be sharing the results with the community in the form of a report
(slides
Would that be covered by PARQUET-458 (
https://issues.apache.org/jira/browse/PARQUET-458)?
On Mon, Apr 29, 2019 at 8:18 AM Wes McKinney wrote:
> Is there a JIRA issue about data page v2 issues in parquet-cpp?
>
> On Mon, Apr 29, 2019 at 9:57 AM Curt Hagenlocher
> wrote:
> >
> > But the data pag
Is there a JIRA issue about data page v2 issues in parquet-cpp?
On Mon, Apr 29, 2019 at 9:57 AM Curt Hagenlocher wrote:
>
> But the data page is decoded only after it is decompressed, so I wouldn’t
> expect an unsupported data page to cause a decompression failure.
>
> (I am playing with adding
But the data page is decoded only after it is decompressed, so I wouldn’t
expect an unsupported data page to cause a decompression failure.
(I am playing with adding V2 support to Parquet.Net.)
Sent from my iPhone
> On Apr 29, 2019, at 7:30 AM, Ivan Sadikov wrote:
>
> If you are referring to
If you are referring to the file in Apache/parquet-testing repository, it
is a valid Parquet file with data encoded into data page v2.
You can easily test it with “cargo install parquet” and “parquet-read
filepath”.
I am not sure what kind of code you have written, but the error you have
encounte
To the best of my ability to tell, there is invalid Snappy data in the file
parquet-testing/data/datapage_v2.snappy.parquet. I can neither read it with
my own code nor with pyarrow 0.13.0. Is this expected to work?
Thanks!
-Curt
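Before suspecting the Snappy data itself, one cheap check is whether the file is at least framed as a Parquet file: an unencrypted Parquet file starts and ends with the 4-byte magic `PAR1`, and the 4 bytes before the trailing magic hold the footer length as a little-endian integer. The sketch below only checks that framing. It says nothing about whether the page data decompresses, and the function name `parquet_footer_length` is made up for this example.

```python
import struct

MAGIC = b"PAR1"  # magic bytes of an unencrypted Parquet file


def parquet_footer_length(data: bytes) -> int:
    """Validate the Parquet framing and return the footer length in bytes."""
    # Minimum size: leading magic (4) + footer length (4) + trailing magic (4).
    if len(data) < 12:
        raise ValueError("too short to be a Parquet file")
    if data[:4] != MAGIC or data[-4:] != MAGIC:
        raise ValueError("missing PAR1 magic bytes")

    # Footer length is a 4-byte little-endian unsigned integer.
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    if footer_len > len(data) - 12:
        raise ValueError("footer length exceeds file size")
    return footer_len


if __name__ == "__main__":
    # Synthetic bytes for the demo; replace with the contents of a real file,
    # e.g. open(path, "rb").read().
    fake = MAGIC + b"x" * 10 + struct.pack("<I", 10) + MAGIC
    print(parquet_footer_length(fake))
```

If this check passes for the file in question, the problem is more likely in how the page bodies are being decompressed than in the file framing.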