[ 
https://issues.apache.org/jira/browse/PARQUET-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219039#comment-17219039
 ] 

ASF GitHub Bot commented on PARQUET-1917:
-----------------------------------------

dossett commented on pull request #820:
URL: https://github.com/apache/parquet-mr/pull/820#issuecomment-714526665


   @gszadovszky Tagging you on this PR per the discussion on the dev list. If
you approve the change, I will also clean up the new tests a bit per David's
comments. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [parquet-proto] default values are stored in oneOf fields that aren't set
> -------------------------------------------------------------------------
>
>                 Key: PARQUET-1917
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1917
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-protobuf
>    Affects Versions: 1.12.0
>            Reporter: Aaron Blake Niskode-Dossett
>            Priority: Major
>
> SCHEMA
> --------
> {noformat}
> message Person {
>   int32 foo = 1;
>   oneof optional_bar {
>     int32 bar_int = 200;
>     int32 bar_int2 = 201;
>     string bar_string = 300;
>   }
> }{noformat}
>  
> CODE
> --------
> I set values for foo and bar_string:
>  
> {noformat}
> for (int i = 0; i < 3; i += 1) {
>     com.etsy.grpcparquet.Person message = Person.newBuilder()
>             .setFoo(i)
>             .setBarString("hello world")
>             .build();
>     message.writeDelimitedTo(out);
> }{noformat}
> I then write the protobuf file out to Parquet.
>  
> RESULT
> -----------
> {noformat}
> $ parquet-tools show example.parquet                                          
>                                                                               
> +-------+-----------+------------+--------------+
> |   foo |   bar_int |   bar_int2 | bar_string   |
> |-------+-----------+------------+--------------|
> |     0 |         0 |          0 | hello world  |
> |     1 |         0 |          0 | hello world  |
> |     2 |         0 |          0 | hello world  |
> +-------+-----------+------------+--------------+{noformat}
>  
> bar_int and bar_int2 should be EMPTY for all three rows since only bar_string 
> is set in the oneof.  0 is the default value for int32, but the default of an 
> unset oneof member should not be stored.
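
The expected behavior described above can be illustrated with a minimal
sketch. It assumes a hand-written stand-in for the Person message, so the
class OneofRowSketch, the enum BarCase, and the method toRow are hypothetical
names, not protobuf-generated code and not the parquet-mr implementation or
the fix in the pull request. A null value stands for an empty Parquet cell.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the Person message from the report.
// This is not protobuf-generated code and not parquet-mr code.
final class OneofRowSketch {

    // Which member of the oneof optional_bar is set, if any.
    enum BarCase { BAR_INT, BAR_INT2, BAR_STRING, NOT_SET }

    // Builds one output row; a null value models an empty Parquet cell.
    static Map<String, Object> toRow(int foo, BarCase barCase, Object barValue) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("foo", foo);
        // Only the oneof member that is actually set carries a value.
        // Unset members stay empty instead of receiving the type default (0).
        row.put("bar_int", barCase == BarCase.BAR_INT ? barValue : null);
        row.put("bar_int2", barCase == BarCase.BAR_INT2 ? barValue : null);
        row.put("bar_string", barCase == BarCase.BAR_STRING ? barValue : null);
        return row;
    }

    public static void main(String[] args) {
        // Mirrors the loop in the report: foo and bar_string are set.
        for (int i = 0; i < 3; i += 1) {
            System.out.println(toRow(i, BarCase.BAR_STRING, "hello world"));
        }
    }
}
```

Because the decision is made from the oneof case rather than from the field
value, a member that is explicitly set to 0 is still written, while members
that are not set stay empty.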



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
