[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837574#action_12837574 ]

Daniel Dai commented on PIG-1016:

Hi, busy, I checked your code; it seems your patch assumes PIG-1016.patch is checked in. If I understand correctly, there is an inconsistency in this approach. In your code, you allow a map value to be any type. However, internally Pig always assumes a map value is a bytearray, so Pig will choose to use PigBytesRawComparator. You then further modify PigBytesRawComparator to handle all data types. This logic is very confusing. Further, TextDataParser itself is bogus, since it guesses the data type based on the content.

In PIG-613, we reiterate that a map value is a bytearray. However, we fixed the code so that it can cast a bytearray to map/tuple/bag correctly. I verified the test case you gave, and it works.

{code}
A = load '9.txt' as (data:map[]);
B = foreach A generate (int)(data#'a'), (chararray)(data#'b'), (tuple(map[]))(data#'c');
C = order B by $0;
dump C;
{code}
Result:
(1,'a',(1,2,3))
(2,'d',(1,2,3))
(3,'c',(1,2,3))
{code}
D = order B by $1;
dump D;
{code}
Result:
(1,'a',(1,2,3))
(3,'c',(1,2,3))
(2,'d',(1,2,3))
{code}
describe B;
{code}
Result:
B: {int,chararray,(map[ ])}

Do you have other use cases which PIG-613 cannot address?

Reading in map data seems broken
Key: PIG-1016
URL: https://issues.apache.org/jira/browse/PIG-1016
Project: Pig
Issue Type: Improvement
Components: data
Affects Versions: 0.4.0
Reporter: hc busy
Attachments: PIG-1016.patch

Hi, I'm trying to load a map that has a tuple for a value. The read fails in 0.4.0 because of a misconfiguration in the parser, whereas almost all documentation states that the value of a map can be any type. I've attached a patch that allows us to read in complex objects as values, as documented. I've done simple verification of loading in maps with tuple/map values and writing them back out using LOAD and STORE. All seems to work fine.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837643#action_12837643 ]

Daniel Dai commented on PIG-1016:

Hi, busy, I think I finally understand what you mean. You want to write a loader, and in the loader you want to put whatever you like into the map value, right? Then I think it is a valid use case. What I was talking about is that if you use PigStorage to load data, the map value is always a bytearray.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837106#action_12837106 ]

Daniel Dai commented on PIG-1016:

This issue should be fixed as part of the effort in [PIG-613|https://issues.apache.org/jira/browse/PIG-613]. hc busy, can you check whether that patch addresses your issue?
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796893#action_12796893 ]

hc busy commented on PIG-1016:

Hi Thejas, Olga, and rest, it sounds about right. I think PIG-1082 is ready from my previous effort, and PIG-1083 still needs to be done. And perhaps it will make more sense to use Avro or some other binary format instead. I still have an ASCII nested data structure to read in, but it's not very HP. Not sure if anybody needs it any more.

Reading in map data seems broken
Key: PIG-1016
URL: https://issues.apache.org/jira/browse/PIG-1016
Project: Pig
Issue Type: Improvement
Components: data
Affects Versions: 0.4.0
Reporter: hc busy
Fix For: 0.7.0
Attachments: PIG-1016.patch
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771308#action_12771308 ]

hc busy commented on PIG-1016:

Well, I'd like to start by thanking everyone for the attention and support! As a first-time contributor, I feel my heart warmed by the encouraging comments and the serious time everyone is spending on my problem. I also greatly appreciate the patience everybody has, and of course I am perpetually grateful for everybody's work in making this all work.

Line by line,

{code}
+// find bug is complaining about nulls. This check sequence will prevent nulls from being dereferenced.
+if(o1!=null && o2!=null){
...
+}else{
+    if(o1==null && o2==null){rc=0;}
+    else if(o1==null) {rc=-1;}
+    else{ rc=1; }
{code}

This does what it says: it prevents a FindBugs warning. By convention, non-null is greater than null.

{code}
+// In case the objects are comparable
+if((o1 instanceof NullableBytesWritable && o2 instanceof NullableBytesWritable)||
+   !(o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable)
+){
+
+    NullableBytesWritable nbw1 = (NullableBytesWritable)o1;
+    NullableBytesWritable nbw2 = (NullableBytesWritable)o2;
+
+    // If either are null, handle differently.
+    if (!nbw1.isNull() && !nbw2.isNull()) {
+        rc = ((DataByteArray)nbw1.getValueAsPigType()).compareTo((DataByteArray)nbw2.getValueAsPigType());
+    } else {
+        // For sorting purposes two nulls are equal.
+        if (nbw1.isNull() && nbw2.isNull()) rc = 0;
+        else if (nbw1.isNull()) rc = -1;
+        else rc = 1;
+    }
+}
{code}

The if statement takes us outside of the original comparison code (enclosed in the outer if above) ONLY if both comparands are PigNullableWritable instances that are not NullableBytesWritable. This may seem confusing at first glance, but what it does is the identical thing as before the patch, except for the new case that I introduced by allowing other types. The code is awkward, as Santhosh noted.
But I am not too sure I understand the original implementation. Certainly, this way we preserve the original behavior, and the new cases that this patch introduces are handled in the remaining else:

{code}
else{
+    // enter here only if both o1 and o2 are non-NullableByteWritable PigNullableWritable's
+    PigNullableWritable nbw1 = (PigNullableWritable)o1;
+    PigNullableWritable nbw2 = (PigNullableWritable)o2;
+    // If either are null, handle differently.
+    if (!nbw1.isNull() && !nbw2.isNull()) {
+        rc = nbw1.compareTo(nbw2);
+    } else {
+        // For sorting purposes two nulls are equal.
+        if (nbw1.isNull() && nbw2.isNull()) rc = 0;
+        else if (nbw1.isNull()) rc = -1;
+        else rc = 1;
+    }
+}
{code}

This is the safest way I can think of to write this code, and I have been able to order by a value taken out of a map. Also, join and then sort keyed on values of maps both work. I guess something that flows better might be the following:

{code}
if(o1!=null && o2!=null){
    if(o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable){
        PigNullableWritable nbw1 = (PigNullableWritable)o1;
        PigNullableWritable nbw2 = (PigNullableWritable)o2;
        // If either are null, handle differently.
        if (!nbw1.isNull() && !nbw2.isNull()) {
            rc = nbw1.compareTo(nbw2);
        } else {
            // For sorting purposes two nulls are equal.
            if (nbw1.isNull() && nbw2.isNull()) rc = 0;
            else if (nbw1.isNull()) rc = -1;
            else rc = 1;
        }
    }else{
        throw new Exception("bad compare");
    }
}else{
    if(o1==null && o2==null){rc=0;}
    else if(o1==null) {rc=-1;}
    else{ rc=1; }
}
{code}

But I must admit that I don't know what the right thing to do is. I don't know the design well enough to know whether throwing an exception is the appropriate thing, or something else. And would the last code block perform the right comparison in place of the original function? Let me know your thoughts on improvements to the patch.
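
The null-ordering convention discussed in this comment (nulls sort first, two nulls compare equal) and the open question of what to do with mismatched types can be sketched in isolation. The sketch below is an illustration only: `Wrapped` and `NullSafeCompare` are hypothetical stand-ins, not Pig's PigNullableWritable or PigBytesRawComparator, and throwing IllegalArgumentException for mismatched payload types is just one possible answer to the question raised above, not a decision recorded on this ticket.

```java
// Hypothetical stand-in for a nullable wrapper; a null payload models isNull() == true.
final class Wrapped {
    final Object value;

    Wrapped(Object value) {
        this.value = value;
    }

    boolean isNull() {
        return value == null;
    }
}

final class NullSafeCompare {
    // Only called when at least one side is null: two nulls are equal, a lone null sorts first.
    private static int nullRank(boolean firstNull, boolean secondNull) {
        if (firstNull && secondNull) {
            return 0;
        }
        return firstNull ? -1 : 1;
    }

    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compare(Wrapped o1, Wrapped o2) {
        // Guard the references themselves before dereferencing them.
        if (o1 == null || o2 == null) {
            return nullRank(o1 == null, o2 == null);
        }
        // Then guard the wrapped payloads.
        if (o1.isNull() || o2.isNull()) {
            return nullRank(o1.isNull(), o2.isNull());
        }
        // Assumption: payloads of different classes are rejected rather than ordered.
        if (o1.value.getClass() != o2.value.getClass() || !(o1.value instanceof Comparable)) {
            throw new IllegalArgumentException("bad compare: "
                    + o1.value.getClass().getName() + " vs " + o2.value.getClass().getName());
        }
        return ((Comparable) o1.value).compareTo(o2.value);
    }

    public static void main(String[] args) {
        System.out.println(compare(new Wrapped(null), new Wrapped(1)));
        System.out.println(compare(new Wrapped(2), new Wrapped(1)));
    }
}
```

Pulling the two-level null handling into one helper removes the duplicated "two nulls are equal" ladder that appears three times in the quoted patch, which may make the remaining type dispatch easier to review.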
Reading in map data seems broken Key: PIG-1016 URL: https://issues.apache.org/jira/browse/PIG-1016 Project: Pig Issue Type: Improvement Components: data Affects Versions: 0.4.0 Reporter: hc busy Fix For: 0.5.0
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771543#action_12771543 ]

Thejas M Nair commented on PIG-1016:

A tuple can also be used instead of a typed map.

This issue is specific to the PigStorage load function, and it is present because PigStorage tries to auto-detect the map value type. I don't think we need to introduce a typed map in Pig Latin for this. You can always create a new load function that returns typed maps. BinStorage() is an example of a load/store function that stores the type information in the data and returns typed maps.

I think run-time identification of type is a bad idea; it results in surprising and unpredictable behavior. In the case of PigStorage(), I think it should always interpret the map value as a bytearray. In the Pig script, the user can cast the value to the expected type. The PigStorage.bytesTo... functions would get used for this purpose. (I assume Pig keeps track of the loader function that produced the data.) Map parsing will also be faster with this approach, compared to auto-detection of the value type.
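
The approach proposed in this comment (keep every map value as raw bytes at load time, convert only when the script casts) can be sketched as follows. This is an illustration under stated assumptions, not PigStorage's code: `RawMapSketch`, `parseFlatMap`, `asInt`, and `asChararray` are hypothetical names, and the parser assumes a flat map such as `[a#2,b#d]` with no nested or quoted delimiters.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class RawMapSketch {
    // Assumption: a flat map such as "[a#2,b#d]" with no nested or quoted delimiters.
    static Map<String, byte[]> parseFlatMap(String text) {
        String trimmed = text.trim();
        if (trimmed.length() < 2 || !trimmed.startsWith("[") || !trimmed.endsWith("]")) {
            throw new IllegalArgumentException("expected [key#value,...]: " + text);
        }
        String body = trimmed.substring(1, trimmed.length() - 1);
        Map<String, byte[]> result = new LinkedHashMap<>();
        if (body.isEmpty()) {
            return result;
        }
        for (String entry : body.split(",")) {
            int hash = entry.indexOf('#');
            if (hash < 0) {
                throw new IllegalArgumentException("missing '#' in entry: " + entry);
            }
            // No type guessing here: the value is kept as the bytes that were read.
            result.put(entry.substring(0, hash),
                    entry.substring(hash + 1).getBytes(StandardCharsets.UTF_8));
        }
        return result;
    }

    // Hypothetical cast helpers, applied only when the script asks for a type.
    static int asInt(byte[] raw) {
        return Integer.parseInt(new String(raw, StandardCharsets.UTF_8).trim());
    }

    static String asChararray(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, byte[]> row = parseFlatMap("[a#2,b#d]");
        System.out.println(asInt(row.get("a")) + 1);
        System.out.println(asChararray(row.get("b")));
    }
}
```

Because the loader never inspects the value, the same bytes can be read as an int by one script and as a chararray by another, which is the predictability argument made above.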
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771571#action_12771571 ]

hc busy commented on PIG-1016:

Thejas, great point! Run-time detection of type does use more time at run time and requires more discipline to use. But I'd like to point out that the original implementation seemed to have allowed for this in PigStorage. The change to reduce the types that can be stored in the value of a map seems to reduce the functionality of Pig.

I guess the one case where I want to use a map is when I have a sparse tuple and I don't want to type in a type for each of the many fields. If I went to that trouble, I'd just write Java code, or use something where the schema is statically defined and stored.

Say, for a simple example, a self join of one row {{\[data1#\[score#15l,unique_id#100\],data2#\[score#15,foreign#00100\]\]}}

{code}
B = join A by m#data1#unique_id, A by m#data2#foreign;
C = Filter B by $0#score == $1#score;
{code}

I'd think something like this should work without me typing in the entire type structure. Also, what happens when BinStorage returns a map with a value that isn't a bytearray? Does the comparison fail?
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771170#action_12771170 ]

hc busy commented on PIG-1016:

Okay, trying to get this into a release of Pig... I noticed 0.4 came out, but nothing has happened on this ticket.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771200#action_12771200 ]

Alan Gates commented on PIG-1016:

I am keeping an eye on this ticket. But at this point I'd like to get Santhosh's feedback on your changes before proceeding, as he had comments on your earlier patch and I want to make sure your new patch addresses them. Santhosh, can you provide feedback soon, or let one of the other committers know what to look for so we can move forward on this?
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771287#action_12771287 ]

Santhosh Srinivasan commented on PIG-1016:

I am summarizing my understanding of the patch that has been submitted by hc busy.

Root cause: PIG-880 changed the value type of maps in PigStorage from native Java types to DataByteArray. As a result of this change, parsing of complex types as map values was disabled.

Proposed fix: Revert the changes made as part of PIG-880 so that map values are interpreted as Java types. In addition, change the comparison method to check for the object type and call the appropriate compareTo method. The latter is required to work around the fact that the front end assigns the value type to be DataByteArray whereas the back end sees the actual type (Integer, Long, Tuple, DataBag, etc.)

Based on this understanding I have the following review comment(s).

Index: src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBytesRawComparator.java
===

Can you explain the checks in the if and the else? Specifically, NullableBytesWritable is a subclass of PigNullableWritable. As a result, in the if part, the check for both o1 and o2 not being PigNullableWritable is confusing, as nbw1 and nbw2 are cast to NullableBytesWritable if o1 and o2 are not PigNullableWritable.

{code}
+// find bug is complaining about nulls. This check sequence will prevent nulls from being dereferenced.
+if(o1!=null && o2!=null){
+
+// In case the objects are comparable
+if((o1 instanceof NullableBytesWritable && o2 instanceof NullableBytesWritable)||
+   !(o1 instanceof PigNullableWritable && o2 instanceof PigNullableWritable)
+){
+
+    NullableBytesWritable nbw1 = (NullableBytesWritable)o1;
+    NullableBytesWritable nbw2 = (NullableBytesWritable)o2;
+
+    // If either are null, handle differently.
+    if (!nbw1.isNull() && !nbw2.isNull()) {
+        rc = ((DataByteArray)nbw1.getValueAsPigType()).compareTo((DataByteArray)nbw2.getValueAsPigType());
+    } else {
+        // For sorting purposes two nulls are equal.
+        if (nbw1.isNull() && nbw2.isNull()) rc = 0;
+        else if (nbw1.isNull()) rc = -1;
+        else rc = 1;
+    }
+}else{
+    // enter here only if both o1 and o2 are non-NullableByteWritable PigNullableWritable's
+    PigNullableWritable nbw1 = (PigNullableWritable)o1;
+    PigNullableWritable nbw2 = (PigNullableWritable)o2;
+    // If either are null, handle differently.
+    if (!nbw1.isNull() && !nbw2.isNull()) {
+        rc = nbw1.compareTo(nbw2);
+    } else {
+        // For sorting purposes two nulls are equal.
+        if (nbw1.isNull() && nbw2.isNull()) rc = 0;
+        else if (nbw1.isNull()) rc = -1;
+        else rc = 1;
+    }
+}
+}else{
+    if(o1==null && o2==null){rc=0;}
+    else if(o1==null) {rc=-1;}
+    else{ rc=1; }
{code}
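
The concern raised in this review can be checked by enumerating the outer condition over a small class hierarchy. The sketch below is an illustration only: the `Stub...` classes are hypothetical stand-ins that mirror nothing but the subclass relationship described above, and whether operands that are not nullable wrappers can actually reach the comparator is not something this thread establishes.

```java
// Hypothetical stand-ins that mirror only the class hierarchy described in the review.
class StubNullable {}                            // plays the role of PigNullableWritable
class StubNullableBytes extends StubNullable {}  // plays the role of NullableBytesWritable
class StubNullableTuple extends StubNullable {}  // plays the role of some other subclass

final class BranchCheck {
    // Same boolean shape as the patch's outer condition.
    static boolean takesBytesBranch(Object o1, Object o2) {
        return (o1 instanceof StubNullableBytes && o2 instanceof StubNullableBytes)
                || !(o1 instanceof StubNullable && o2 instanceof StubNullable);
    }

    // The casts inside that branch only succeed when both operands are the bytes type.
    static boolean castsWouldSucceed(Object o1, Object o2) {
        return o1 instanceof StubNullableBytes && o2 instanceof StubNullableBytes;
    }

    public static void main(String[] args) {
        Object[] samples = {new StubNullableBytes(), new StubNullableTuple(), "not a nullable wrapper"};
        for (Object a : samples) {
            for (Object b : samples) {
                boolean branch = takesBytesBranch(a, b);
                String note = branch && !castsWouldSucceed(a, b) ? "  <-- cast would fail" : "";
                System.out.println(a.getClass().getSimpleName() + ", " + b.getClass().getSimpleName()
                        + " -> bytes branch: " + branch + note);
            }
        }
    }
}
```

In this model, a pair where either operand is outside the wrapper hierarchy is routed into the branch that casts to the bytes type, while a mixed pair of two different wrapper subclasses falls through to the generic branch.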
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767313#action_12767313 ]

Hadoop QA commented on PIG-1016:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12422436/PIG-1016.patch
against trunk revision 826110.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/100/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/100/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/100/console

This message is automatically generated.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767385#action_12767385 ]

Dmitriy V. Ryaboy commented on PIG-1016:

All tests started failing at the end of last week for all patches. Hopefully someone at Y! can sort out what's causing Hudson's nervous breakdown.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767387#action_12767387 ]

hc busy commented on PIG-1016:

%...@#$, had me sweating for a while..., as mentioned previously, this is functionality that I'd like to use... not just fun weekend project... hehe.. thnx.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766444#action_12766444 ]

Daniel Dai commented on PIG-1016:

I think the problem is that in the current TextDataParser a map is defined as String#String, and a string excludes special characters such as "(", ")", and ",", so busy has no way to generate a tuple in the value field of the map. The approach busy took looks valid to me.
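
The limitation described in this comment comes down to where a map entry may end: a value that excludes "(", ")", and "," cannot hold a tuple. One way to picture the alternative is to split entries only on commas that sit outside any bracket. The sketch below is an illustration under stated assumptions, not TextDataParser's grammar: `TopLevelSplit` and `splitEntries` are hypothetical names, the input is the map body without its outer brackets, and commas inside quoted strings are not handled.

```java
import java.util.ArrayList;
import java.util.List;

final class TopLevelSplit {
    // Splits on commas at nesting depth zero, so "c#(1,2,3)" stays one entry.
    // Assumption: brackets are balanced and commas never appear inside quotes.
    static List<String> splitEntries(String body) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '(' || c == '[' || c == '{') {
                depth++;
            } else if (c == ')' || c == ']' || c == '}') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(body.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(body.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        for (String entry : splitEntries("a#2,b#'d',c#(1,2,3)")) {
            System.out.println(entry);
        }
    }
}
```

Each returned entry can then be split at its first `#` into a key and a value that may itself be a tuple or a nested map.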
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766710#action_12766710 ]

hc busy commented on PIG-1016:

'kay, since my last comment, I've verified that in trunk, the patch in this ticket did not introduce an error. The skewed join (correct or not) returns the same number of rows when the data read in is from a nested data structure as when it is read in from a tuple.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766787#action_12766787 ]

Hadoop QA commented on PIG-1016:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12422303/PIG-1016.patch
against trunk revision 826047.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/90/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/90/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/90/console

This message is automatically generated.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766202#action_12766202 ]

hc busy commented on PIG-1016:

I skimmed PIG-880. Here is a simplified version of what I might need to do:

bash% cat map.dat
[a#2,b#'d',c#(1,2,3)]
[a#1,b#'a',c#(1,2,3)]
[a#3,b#'c',c#(1,2,3)]
bash% PIG
grunt> A = load 'map.dat' as (data:map[]);
grunt> B = foreach A generate (int)(data#'a'), (chararray)(data#'b'), (tuple())(data#'c');
grunt> C = order B by $0;
grunt> dump C;
(1,'a',(1,2,3))
(2,'d',(1,2,3))
(3,'c',(1,2,3))
grunt> D = order B by $1;
grunt> dump D;
(1,'a',(1,2,3))
(3,'c',(1,2,3))
(2,'d',(1,2,3))
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765415#action_12765415 ]

Hadoop QA commented on PIG-1016:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12422031/PIG-1016.patch
against trunk revision 824980.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/25/console

This message is automatically generated.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765183#action_12765183 ]

Hadoop QA commented on PIG-1016:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12421949/PIG-1016.patch
  against trunk revision 824446.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/76/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/76/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/76/console

This message is automatically generated.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765302#action_12765302 ]

Dmitriy V. Ryaboy commented on PIG-1016:

No worries, we are used to Jira sending us a never-ending stream of updates :-).

Looks good to me (assuming this passes Hudson).
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12764789#action_12764789 ]

Hadoop QA commented on PIG-1016:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12421892/map_to_any_value.patch
  against trunk revision 824446.

    +1 @author. The patch does not contain any @author tags.
    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no tests are needed for this patch.
    -1 patch. The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/72/console

This message is automatically generated.
[jira] Commented: (PIG-1016) Reading in map data seems broken
[ https://issues.apache.org/jira/browse/PIG-1016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12764922#action_12764922 ]

Hadoop QA commented on PIG-1016:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12421920/trunk_map_to_any_value.patch
  against trunk revision 824446.

    +1 @author. The patch does not contain any @author tags.
    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no tests are needed for this patch.
    -1 patch. The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/19/console

This message is automatically generated.