[jira] [Updated] (PIG-5272) BagToTuple Output Schema

2017-09-22 Thread Joshua Juen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua Juen updated PIG-5272:
-----------------------------
          Labels: patch  (was: )
    Release Note: Removed Incorrect Schema Definition from BagToTuple
          Status: Patch Available  (was: Open)

> BagToTuple Output Schema
> 
>
> Key: PIG-5272
> URL: https://issues.apache.org/jira/browse/PIG-5272
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.17.0
>Reporter: Joshua Juen
>Priority: Minor
>  Labels: patch
> Fix For: 0.18.0
>
> Attachments: BagToTupleSchema.patch
>
>
> The output schema declared by BagToTuple is incorrect, which causes problems 
> when the tuple is used later in the same script. 
> For example, given a bag with schema { data:chararray }, BagToTuple declares 
> the output schema ( data:chararray ).
> This is wrong: if the bag contains the entries {data1, data2, data3}, the 
> tuple that BagToTuple actually returns is
> (data1:chararray, data2:chararray, data3:chararray), not the 
> (data:chararray) that the UDF declares.
> The number of fields in the tuple cannot be known during the initial 
> validation phase. I therefore believe the UDF should declare its output as a 
> plain tuple, without fixing the number of fields to the number of columns in 
> the input bag. 
> As it stands, the elements of the tuple cannot be accessed in the script 
> after calling BagToTuple without an incompatible type error. 
> We have modified the UDF in our internal UDF jars to work around the issue. 
> Let me know if this sounds reasonable and I can generate the patch.
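
The mismatch described above can be sketched in plain Java. This is only a
model of the behavior reported in the issue, not the BagToTuple source: it
uses no Pig classes, and the class and method names (BagToTupleSketch,
declaredFields, flatten) are hypothetical. It assumes, as the report states,
that the declared schema has one field per column of the input bag while the
returned tuple has one field per value in the bag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Plain-Java model of the arity mismatch reported in PIG-5272.
 * No Pig classes are used; every name here is hypothetical.
 */
public class BagToTupleSketch {

    /** Models the declared schema: one field per column of the bag's inner tuple. */
    public static List<String> declaredFields(List<String> bagColumns) {
        return new ArrayList<>(bagColumns);
    }

    /** Models the runtime result: every field of every tuple in the bag, in order. */
    public static List<Object> flatten(List<List<Object>> bag) {
        List<Object> out = new ArrayList<>();
        for (List<Object> tuple : bag) {
            out.addAll(tuple);
        }
        return out;
    }

    public static void main(String[] args) {
        // Bag schema { data:chararray } holding three single-field tuples.
        List<String> columns = Arrays.asList("data");
        List<List<Object>> bag = Arrays.asList(
                Arrays.<Object>asList("data1"),
                Arrays.<Object>asList("data2"),
                Arrays.<Object>asList("data3"));

        // The declared arity depends only on the schema; the actual arity depends on the data.
        System.out.println("declared fields: " + declaredFields(columns).size()); // 1
        System.out.println("runtime fields:  " + flatten(bag).size());            // 3
    }
}
```

Because the runtime arity depends on how many tuples the bag holds, no fixed
field list computed at validation time can match it in general, which is the
argument for declaring the output as a tuple with no fixed fields.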



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5272) BagToTuple Output Schema

2017-09-22 Thread Joshua Juen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua Juen updated PIG-5272:
-
Flags: Patch
   Patch Info: Patch Available
Affects Version/s: 0.17.0
Fix Version/s: 0.18.0
  Summary: BagToTuple Output Schema  (was: BagToString Output 
Schema)






[jira] [Updated] (PIG-5272) BagToTuple Output Schema

2017-09-22 Thread Joshua Juen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua Juen updated PIG-5272:
-
Attachment: BagToTupleSchema.patch






[jira] [Updated] (PIG-5272) BagToString Output Schema

2017-07-14 Thread Joshua Juen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua Juen updated PIG-5272:
-
Description: 
The output schema declared by BagToTuple is incorrect, which causes problems 
when the tuple is used later in the same script. 

For example, given a bag with schema { data:chararray }, BagToTuple declares 
the output schema ( data:chararray ).

This is wrong: if the bag contains the entries {data1, data2, data3}, the 
tuple that BagToTuple actually returns is
(data1:chararray, data2:chararray, data3:chararray), not the 
(data:chararray) that the UDF declares.

The number of fields in the tuple cannot be known during the initial 
validation phase. I therefore believe the UDF should declare its output as a 
plain tuple, without fixing the number of fields to the number of columns in 
the input bag. 

As it stands, the elements of the tuple cannot be accessed in the script 
after calling BagToTuple without an incompatible type error. We have modified 
the UDF in our internal UDF jars to work around the issue. Let me know if this 
sounds reasonable and I can generate the patch.

  was:
The output schema declared by BagToTuple is incorrect, which causes problems 
when the tuple is used later in the same script. 

For example, given a bag with schema { data:chararray }, BagToTuple declares 
the output schema ( data:chararray ).

This is wrong: if the bag contains the entries {data1, data2, data3}, the 
tuple that BagToTuple actually returns is
(data1:chararray, data2:chararray, data3:chararray), not the 
(data:chararray) that the UDF declares.

The number of fields in the tuple cannot be known during the initial 
validation phase. I therefore believe the UDF should declare its output as a 
plain tuple, without fixing the number of fields to the number of columns in 
the input bag. 

As it stands, the elements of the tuple cannot be accessed in the script 
after calling BagToTuple without an incompatible type error.







[jira] [Created] (PIG-5272) BagToString Output Schema

2017-07-14 Thread Joshua Juen (JIRA)
Joshua Juen created PIG-5272:


 Summary: BagToString Output Schema
 Key: PIG-5272
 URL: https://issues.apache.org/jira/browse/PIG-5272
 Project: Pig
  Issue Type: Improvement
Reporter: Joshua Juen
Priority: Minor


The output schema declared by BagToTuple is incorrect, which causes problems 
when the tuple is used later in the same script. 

For example, given a bag with schema { data:chararray }, BagToTuple declares 
the output schema ( data:chararray ).

This is wrong: if the bag contains the entries {data1, data2, data3}, the 
tuple that BagToTuple actually returns is
(data1:chararray, data2:chararray, data3:chararray), not the 
(data:chararray) that the UDF declares.

The number of fields in the tuple cannot be known during the initial 
validation phase. I therefore believe the UDF should declare its output as a 
plain tuple, without fixing the number of fields to the number of columns in 
the input bag. 

As it stands, the elements of the tuple cannot be accessed in the script 
after calling BagToTuple without an incompatible type error.


