[ https://issues.apache.org/jira/browse/PIG-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663166#action_12663166 ]
araceli edited comment on PIG-616 at 1/12/09 5:05 PM:
-------------------------------------------------------------

Please also verify the following test case, where the source file "myFile.txt" contains the following data: Fint:int, Fdouble:double, Ftuple:(chararray, age, avg).
The load statement defines a schema that should result in a type conflict. (Note that "BADTYPE" is being loaded as an int, but myFile.txt contains a chararray.)

A = LOAD 'myFile.txt' USING PigStorage() AS (Fint:int, Fdouble:double, Ftuple:(BADTYPE:int, age, avg));
STORE A INTO 'resultMyFile.out' USING PigStorage();

The expected behavior is that an error is thrown indicating a type conflict, but currently no error is thrown.

      was (Author: araceli):
    Please also verify the following test case where the source file "MyFile.tx" contains the following data: Fint:int, Flong:longFdouble:double, Ftuple:( chararray, age, avg).
The load statement defines a schema in the load statement that should result in a type conflict. ( Note that "BADTYPE" is being loaded as an int, but myFile contains a chararray. )

A =LOAD 'myFile.txt' USING PigStorage () AS (Fint:int, Flong:longFdouble:double, Ftuple:( BADTYPE:int, age, avg) );
STORE A INTO 'resultMyFile.out' USING PigStorage();

The expected behavior is that an error be thrown indicating there is a type conflict - but currently no error is thrown.

> Casts to complex types do not work as expected
> ----------------------------------------------
>
>                 Key: PIG-616
>                 URL: https://issues.apache.org/jira/browse/PIG-616
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>             Fix For: types_branch
>
>
> When we specify a (complex) type as a column in Pig, the TypeCastInserter
> inserts the appropriate cast for the (complex) type. However, in the
> implementation of POCast.java, when databyte arrays are converted to the
> (complex) types, we invoke the bytesToXXX method.
> For complex types, especially tuples and bags, we do not enforce the typing
> information specified by the user in the AS clause or with the explicit cast
> statement. The implementation relies solely on bytesToXXX to figure out the
> right type.
> An example of a query that fails is given below. In this query, the data is a
> single column that is a bag of integers. The user specifies this bag to be a
> bag of chararray. This conversion is allowed in Pig, but the implementation
> does not perform the actual cast. Instead, bytesToBag is called on the
> stream. The resulting type is a bag of integers and not a bag of chararray.
> In the subsequent statement the user (correctly) assumes that the conversion
> has been performed, but in reality it has not been done. At run time, when a
> chararray-based operation is performed, we see a ClassCastException.
> The notion of a schema is absent in the physical operators. The
> schema/fieldSchema in the logical layer has to be passed on to the physical
> layer. The schema can then be used to perform additional operations like
> casting, etc.
> {code}
> grunt> cat bag.data
> {(1)}
> grunt> a = load 'bag.data' as (b:{t:(c:chararray)});
> grunt> b = foreach a generate flatten(b);
> grunt> c = foreach b generate CONCAT('Hello ', $0);
> grunt> dump c;
> 2009-01-12 10:44:44,417 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2009-01-12 10:45:09,439 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed
> 2009-01-12 10:45:09,440 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job failed!
> 2009-01-12 10:45:09,443 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200812151518_9681_m_000000java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
>         at org.apache.pig.builtin.StringConcat.exec(StringConcat.java:37)
>         at org.apache.pig.builtin.StringConcat.exec(StringConcat.java:31)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:185)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:259)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:271)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:197)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:187)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:175)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
> ...
> 2009-01-12 10:45:09,448 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias c
> 2009-01-12 10:45:09,448 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.logicalLayer.FrontendException: Unable to open iterator for alias c
>         at org.apache.pig.PigServer.openIterator(PigServer.java:426)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:271)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:72)
>         at org.apache.pig.Main.main(Main.java:302)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>         at org.apache.pig.PigServer.openIterator(PigServer.java:420)
>         ... 5 more
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
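
The issue description suggests passing the declared schema down to the physical layer so that the cast can actually be applied to the contents of tuples and bags. The following is only a minimal, self-contained sketch of that idea, not Pig's implementation: SchemaCastSketch, Kind, Field and cast are hypothetical names, tuples and bags are modelled as plain java.util.List, and raising an error on an impossible int conversion is an assumption that follows the expectation stated in the comment above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration; these are NOT Pig classes.
final class SchemaCastSketch {

    // The kinds of field this sketch understands.
    enum Kind { INT, CHARARRAY, TUPLE, BAG }

    // A declared field: its kind plus child fields
    // (the members of a tuple, or the single tuple field of a bag).
    static final class Field {
        final Kind kind;
        final List<Field> children;

        private Field(Kind kind, List<Field> children) {
            this.kind = kind;
            this.children = children;
        }

        static Field of(Kind kind, Field... children) {
            return new Field(kind, Arrays.asList(children));
        }
    }

    // Recursively converts an already-parsed value to the declared type,
    // instead of trusting whatever type the parser happened to produce.
    static Object cast(Object value, Field declared) {
        if (value == null) {
            return null;
        }
        switch (declared.kind) {
            case CHARARRAY:
                return String.valueOf(value);
            case INT:
                if (value instanceof Number) {
                    return ((Number) value).intValue();
                }
                try {
                    return Integer.valueOf(value.toString().trim());
                } catch (NumberFormatException e) {
                    // Assumption: a type conflict is reported as an error.
                    throw new IllegalArgumentException(
                            "Type conflict: cannot cast '" + value + "' to int", e);
                }
            case TUPLE: {
                List<?> fields = (List<?>) value;
                if (fields.size() != declared.children.size()) {
                    throw new IllegalArgumentException("Tuple arity does not match the schema");
                }
                List<Object> out = new ArrayList<>();
                for (int i = 0; i < fields.size(); i++) {
                    out.add(cast(fields.get(i), declared.children.get(i)));
                }
                return out;
            }
            case BAG: {
                Field tupleSchema = declared.children.get(0);
                List<Object> out = new ArrayList<>();
                for (Object tuple : (List<?>) value) {
                    out.add(cast(tuple, tupleSchema));
                }
                return out;
            }
            default:
                throw new IllegalStateException("Unhandled kind: " + declared.kind);
        }
    }

    public static void main(String[] args) {
        // Declared schema from the report: b:{t:(c:chararray)}
        Field schema = Field.of(Kind.BAG, Field.of(Kind.TUPLE, Field.of(Kind.CHARARRAY)));

        // What a schema-unaware parse of {(1)} yields: a bag holding a tuple holding an Integer.
        List<Object> tuple = Arrays.asList((Object) 1);
        List<Object> bag = Arrays.asList((Object) tuple);

        List<?> typedBag = (List<?>) cast(bag, schema);
        for (Object t : typedBag) {
            String c = (String) ((List<?>) t).get(0); // safe: the field is now a String
            System.out.println("Hello " + c);
        }
    }
}
```

Running main prints "Hello 1". Without the schema-driven cast, the (String) conversion in the loop would fail with the same kind of ClassCastException shown in the log above.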