[jira] [Commented] (PARQUET-388) ProtoRecordConverter might wrongly cast a Message.Builder to Message

2016-05-19 Thread David Cardon (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292627#comment-15292627
 ] 

David Cardon commented on PARQUET-388:
--

A workaround I found was to replace all of the {{Message}} references in 
[ProtoReadSupport|https://github.com/apache/parquet-mr/blob/master/parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoReadSupport.java]
 with {{MessageOrBuilder}}.  You can then fairly easily resolve the {{Builder}} 
into a {{Message}} outside of the reading mechanism.

> ProtoRecordConverter might wrongly cast a Message.Builder to Message
> 
>
> Key: PARQUET-388
> URL: https://issues.apache.org/jira/browse/PARQUET-388
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Wu Xiang
>Assignee: Reuben Kuhnert
>
> ProtoRecordConverter returns current record as follows:
> {code}
>   public T getCurrentRecord() {
> if (buildBefore) {
>   return (T) this.reusedBuilder.build();
> } else {
>   return (T) this.reusedBuilder;
> }
>   }
> {code}
> However this might fail if T is subclass of Message and buildBefore == false, 
> since it's actually casting a Message.Builder instance to Message type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PARQUET-388) ProtoRecordConverter might wrongly cast a Message.Builder to Message

2016-02-01 Thread Matt Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127838#comment-15127838
 ] 

Matt Martin commented on PARQUET-388:
-

I agree with [~wxiang7] that there seems to be something a little bit off here. 
 I'm also getting an unexpected ClassCastException in the following Scala code:

{code}
val reader = ParquetReader.builder(new ProtoReadSupport[SomeMessageClass](), 
new Path(file.toURI)).build
...
reader.read
{code}

At reader.read I get the following exception:

{code}
ClassCastException: : SomeMessageClass$Builder cannot be cast to 
SomeMessageClass
{code}

I cannot change the declaration of reader to the following:

{code}
val reader = ParquetReader.builder(new 
ProtoReadSupport[SomeMessageClass$Builder](), new Path(file.toURI)).build
{code}

because then I get the following error:

{code}
type arguments [SomeMessageClass$Builder] do not conform to class 
ProtoReadSupport's type parameter bounds [T <: com.google.protobuf.Message]
{code}

> ProtoRecordConverter might wrongly cast a Message.Builder to Message
> 
>
> Key: PARQUET-388
> URL: https://issues.apache.org/jira/browse/PARQUET-388
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Wu Xiang
>Assignee: Reuben Kuhnert
>
> ProtoRecordConverter returns current record as follows:
> {code}
>   public T getCurrentRecord() {
> if (buildBefore) {
>   return (T) this.reusedBuilder.build();
> } else {
>   return (T) this.reusedBuilder;
> }
>   }
> {code}
> However this might fail if T is subclass of Message and buildBefore == false, 
> since it's actually casting a Message.Builder instance to Message type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PARQUET-388) ProtoRecordConverter might wrongly cast a Message.Builder to Message

2015-11-03 Thread Lukas Nalezenec (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987434#comment-14987434
 ] 

Lukas Nalezenec commented on PARQUET-388:
-

> To be sure, when you say 'The cast is correct', do you mean by that 
> ProtoRecordConverter::getCurrentRecord would work correctly ?

No, it won't work correctly but  
ProtoRecordConverter::getCurrentRecord will work 
correctly.

> The signature for ProtoReadSupport is as follows, which leads 
> ProtoReadSupport to compile time error.

It looks like there is bug in the signature. IMO it should be 

public class ProtoReadSupport extends 
ReadSupport {}

I will look at it later today.

Thank you very much.

> ProtoRecordConverter might wrongly cast a Message.Builder to Message
> 
>
> Key: PARQUET-388
> URL: https://issues.apache.org/jira/browse/PARQUET-388
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Wu Xiang
>Assignee: Reuben Kuhnert
>
> ProtoRecordConverter returns current record as follows:
> {code}
>   public T getCurrentRecord() {
> if (buildBefore) {
>   return (T) this.reusedBuilder.build();
> } else {
>   return (T) this.reusedBuilder;
> }
>   }
> {code}
> However this might fail if T is subclass of Message and buildBefore == false, 
> since it's actually casting a Message.Builder instance to Message type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PARQUET-388) ProtoRecordConverter might wrongly cast a Message.Builder to Message

2015-11-03 Thread Lukas Nalezenec (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987136#comment-14987136
 ] 

Lukas Nalezenec commented on PARQUET-388:
-

Hi, author here.
The cast is of correct and the buildBefore flag is extra feature, its not used 
much.
Do you really need data converted from Builder to Message ? Using Builder is 
more efficient.

> ProtoRecordConverter might wrongly cast a Message.Builder to Message
> 
>
> Key: PARQUET-388
> URL: https://issues.apache.org/jira/browse/PARQUET-388
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Wu Xiang
>Assignee: Reuben Kuhnert
>
> ProtoRecordConverter returns current record as follows:
> {code}
>   public T getCurrentRecord() {
> if (buildBefore) {
>   return (T) this.reusedBuilder.build();
> } else {
>   return (T) this.reusedBuilder;
> }
>   }
> {code}
> However this might fail if T is subclass of Message and buildBefore == false, 
> since it's actually casting a Message.Builder instance to Message type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PARQUET-388) ProtoRecordConverter might wrongly cast a Message.Builder to Message

2015-11-03 Thread Reuben Kuhnert (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987875#comment-14987875
 ] 

Reuben Kuhnert commented on PARQUET-388:


Just as a general comment, returning {{}} does feel 
very mysterious to me. Does it make sense to have two explicit methods like 
{{asBuilder}} and {{asMessage}} in addition to the current {{getCurrentRecord}}?

> ProtoRecordConverter might wrongly cast a Message.Builder to Message
> 
>
> Key: PARQUET-388
> URL: https://issues.apache.org/jira/browse/PARQUET-388
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Wu Xiang
>Assignee: Reuben Kuhnert
>
> ProtoRecordConverter returns current record as follows:
> {code}
>   public T getCurrentRecord() {
> if (buildBefore) {
>   return (T) this.reusedBuilder.build();
> } else {
>   return (T) this.reusedBuilder;
> }
>   }
> {code}
> However this might fail if T is subclass of Message and buildBefore == false, 
> since it's actually casting a Message.Builder instance to Message type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PARQUET-388) ProtoRecordConverter might wrongly cast a Message.Builder to Message

2015-11-03 Thread Wu Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988762#comment-14988762
 ] 

Wu Xiang commented on PARQUET-388:
--

One more suggestion, let's just be honest to align the type signature of 
{{ProtoRecordConverter}} class definition and {{getCurrentRecord}} method.

I mean, when we use {{ProtoRecordConverter}} (or 
ProtoReadSupport), only MyProtoMessage instances would be 
returned from getCurrentRecord.
On the other hand, when we use {{ProtoRecordConverter}} 
or ProtoReadSupport, only MyProtoMessage.Builder 
instances returned.

In this way, I would feel less confused about ProtoRecordConverter API.

> ProtoRecordConverter might wrongly cast a Message.Builder to Message
> 
>
> Key: PARQUET-388
> URL: https://issues.apache.org/jira/browse/PARQUET-388
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Wu Xiang
>Assignee: Reuben Kuhnert
>
> ProtoRecordConverter returns current record as follows:
> {code}
>   public T getCurrentRecord() {
> if (buildBefore) {
>   return (T) this.reusedBuilder.build();
> } else {
>   return (T) this.reusedBuilder;
> }
>   }
> {code}
> However this might fail if T is subclass of Message and buildBefore == false, 
> since it's actually casting a Message.Builder instance to Message type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PARQUET-388) ProtoRecordConverter might wrongly cast a Message.Builder to Message

2015-11-03 Thread Wu Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987295#comment-14987295
 ] 

Wu Xiang commented on PARQUET-388:
--


Hi Lukas,

Thanks for your reply.
To be sure, when you say 'The cast is correct', do you mean by that 
{{ProtoRecordConverter::getCurrentRecord}} would work correctly 
?

In my case, I created a instance of {{ProtoReadSupport}}, which 
then created {{RecordMaterializer}} and 
{{ProtoRecordConverter}} instances internally.  Since the 
{{buildBefore}} flag is not set, 
{{ProtoRecordConverter::getCurrentRecord}} would definitely 
cast a {{MyProtoMessage.Builder}} instance to {{MyProtoMessage}}, with runtime 
exceptions.

P.S. The signature for {{ProtoReadSupport}} is as follows, which leads 
{{ProtoReadSupport}} to compile time error.
{code}
public class ProtoReadSupport extends ReadSupport {}
{code}


wu

> ProtoRecordConverter might wrongly cast a Message.Builder to Message
> 
>
> Key: PARQUET-388
> URL: https://issues.apache.org/jira/browse/PARQUET-388
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Wu Xiang
>Assignee: Reuben Kuhnert
>
> ProtoRecordConverter returns current record as follows:
> {code}
>   public T getCurrentRecord() {
> if (buildBefore) {
>   return (T) this.reusedBuilder.build();
> } else {
>   return (T) this.reusedBuilder;
> }
>   }
> {code}
> However this might fail if T is subclass of Message and buildBefore == false, 
> since it's actually casting a Message.Builder instance to Message type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PARQUET-388) ProtoRecordConverter might wrongly cast a Message.Builder to Message

2015-11-02 Thread Wu Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/PARQUET-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986645#comment-14986645
 ] 

Wu Xiang commented on PARQUET-388:
--


hi Reuben,

Yes,  I agree that current record could be either Message or Message.Builder. 
But the problem is when I use it, the 'buildBefore' flag would be always false, 
which means the Builder instance is always returned.

More details,
I'm implementing a wrapper for cascading to read/write parquet|protobuf data. 
And the default ProtoReadSupport class is used to help with that.

The signature for ProtoReadSupport is as follows, where only Mesage type is 
declared as the type parameter.
{code:title=ProtoReadSupport.java|borderStyle=solid}
public class ProtoReadSupport extends ReadSupport { }
{code}
But the RecordMaterializer inside ProtoReadSupport is actually returning 
Message.Builder. This is where type inconsistency comes from.

To make it work, I need to add some hacks for RecordMaterializer 
(ProtoRecordConverter), by switch the 'buildBefore' flag.
{code:title=ProtoReadSupport.java|borderStyle=solid}
@Override
public RecordMaterializer prepareForRead(Configuration configuration, 
Map keyValueMetaData, MessageType fileSchema, ReadContext 
readContext) {
String headerProtoClass = keyValueMetaData.get(PB_CLASS);
String configuredProtoClass = configuration.get(PB_CLASS);

if (configuredProtoClass != null) {
LOG.debug("Replacing class " + headerProtoClass + " by " + 
configuredProtoClass);
headerProtoClass = configuredProtoClass;
}

if (headerProtoClass == null) {
throw new RuntimeException("I Need parameter " + PB_CLASS + " with 
Protocol Buffer class");
}

LOG.debug("Reading data with Protocol Buffer class " + 
headerProtoClass);

MessageType requestedSchema = readContext.getRequestedSchema();
Class protobufClass = 
Protobufs.getProtobufClass(headerProtoClass);
ProtoRecordMaterializer materializer = new 
ProtoRecordMaterializer(requestedSchema, protobufClass);

((ProtoRecordConverter)materializer.getRootConverter()).setBuildBefore(true);
return materializer;
}
{code}

> ProtoRecordConverter might wrongly cast a Message.Builder to Message
> 
>
> Key: PARQUET-388
> URL: https://issues.apache.org/jira/browse/PARQUET-388
> Project: Parquet
>  Issue Type: Bug
>  Components: parquet-mr
>Reporter: Wu Xiang
>Assignee: Reuben Kuhnert
>
> ProtoRecordConverter returns current record as follows:
> {code}
>   public T getCurrentRecord() {
> if (buildBefore) {
>   return (T) this.reusedBuilder.build();
> } else {
>   return (T) this.reusedBuilder;
> }
>   }
> {code}
> However this might fail if T is subclass of Message and buildBefore == false, 
> since it's actually casting a Message.Builder instance to Message type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)