[ https://issues.apache.org/jira/browse/SPARK-25256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen reassigned SPARK-25256:
---------------------------------

    Assignee: Darcy Shen

> Plan mismatch errors in Hive tests in 2.12
> ------------------------------------------
>
>                 Key: SPARK-25256
>                 URL: https://issues.apache.org/jira/browse/SPARK-25256
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Sean Owen
>            Assignee: Darcy Shen
>            Priority: Major
>             Fix For: 2.4.0
>
> In Hive tests, in the Scala 2.12 build, still seeing a few failures that seem to show mismatching schema inference. Not clear whether it's the same as SPARK-25044. Examples:
> {code:java}
> - SPARK-5775 read array from partitioned_parquet_with_key_and_complextypes *** FAILED ***
>   Results do not match for query:
>   Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
>   Timezone Env:
>
>   == Parsed Logical Plan ==
>   'Project ['arrayField, 'p]
>   +- 'Filter ('p = 1)
>      +- 'UnresolvedRelation `partitioned_parquet_with_key_and_complextypes`
>
>   == Analyzed Logical Plan ==
>   arrayField: array<int>, p: int
>   Project [arrayField#82569, p#82570]
>   +- Filter (p#82570 = 1)
>      +- SubqueryAlias `default`.`partitioned_parquet_with_key_and_complextypes`
>         +- Relation[intField#82566,stringField#82567,structField#82568,arrayField#82569,p#82570] parquet
>
>   == Optimized Logical Plan ==
>   Project [arrayField#82569, p#82570]
>   +- Filter (isnotnull(p#82570) && (p#82570 = 1))
>      +- Relation[intField#82566,stringField#82567,structField#82568,arrayField#82569,p#82570] parquet
>
>   == Physical Plan ==
>   *(1) Project [arrayField#82569, p#82570]
>   +- *(1) FileScan parquet default.partitioned_parquet_with_key_and_complextypes[arrayField#82569,p#82570] Batched: false, Format: Parquet, Location: PrunedInMemoryFileIndex[file:/home/srowen/spark-2.12/sql/hive/target/tmp/spark-d8d87d74-33e7-4f22..., PartitionCount: 1, PartitionFilters: [isnotnull(p#82570), (p#82570 = 1)], PushedFilters: [], ReadSchema: struct<arrayField:array<int>>
>
>   == Results ==
>   !== Correct Answer - 10 ==   == Spark Answer - 10 ==
>   !struct<>                    struct<arrayField:array<int>,p:int>
>   ![Range 1 to 1,1]            [WrappedArray(1),1]
>   ![Range 1 to 10,1]           [WrappedArray(1, 2),1]
>   ![Range 1 to 2,1]            [WrappedArray(1, 2, 3),1]
>   ![Range 1 to 3,1]            [WrappedArray(1, 2, 3, 4),1]
>   ![Range 1 to 4,1]            [WrappedArray(1, 2, 3, 4, 5),1]
>   ![Range 1 to 5,1]            [WrappedArray(1, 2, 3, 4, 5, 6),1]
>   ![Range 1 to 6,1]            [WrappedArray(1, 2, 3, 4, 5, 6, 7),1]
>   ![Range 1 to 7,1]            [WrappedArray(1, 2, 3, 4, 5, 6, 7, 8),1]
>   ![Range 1 to 8,1]            [WrappedArray(1, 2, 3, 4, 5, 6, 7, 8, 9),1]
>   ![Range 1 to 9,1]            [WrappedArray(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),1]
>   (QueryTest.scala:163){code}
> {code:java}
> - SPARK-2693 udaf aggregates test *** FAILED ***
>   Results do not match for query:
>   Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
>   Timezone Env:
>
>   == Parsed Logical Plan ==
>   'GlobalLimit 1
>   +- 'LocalLimit 1
>      +- 'Project [unresolvedalias('percentile('key, 'array(1, 1)), None)]
>         +- 'UnresolvedRelation `src`
>
>   == Analyzed Logical Plan ==
>   percentile(key, array(1, 1), 1): array<double>
>   GlobalLimit 1
>   +- LocalLimit 1
>      +- Aggregate [percentile(key#205098, cast(array(1, 1) as array<double>), 1, 0, 0) AS percentile(key, array(1, 1), 1)#205101]
>         +- SubqueryAlias `default`.`src`
>            +- HiveTableRelation `default`.`src`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#205098, value#205099]
>
>   == Optimized Logical Plan ==
>   GlobalLimit 1
>   +- LocalLimit 1
>      +- Aggregate [percentile(key#205098, [1.0,1.0], 1, 0, 0) AS percentile(key, array(1, 1), 1)#205101]
>         +- Project [key#205098]
>            +- HiveTableRelation `default`.`src`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#205098, value#205099]
>
>   == Physical Plan ==
>   CollectLimit 1
>   +- ObjectHashAggregate(keys=[], functions=[percentile(key#205098, [1.0,1.0], 1, 0, 0)], output=[percentile(key, array(1, 1), 1)#205101])
>      +- Exchange SinglePartition
>         +- ObjectHashAggregate(keys=[], functions=[partial_percentile(key#205098, [1.0,1.0], 1, 0, 0)], output=[buf#205104])
>            +- Scan hive default.src [key#205098], HiveTableRelation `default`.`src`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [key#205098, value#205099]
>
>   == Results ==
>   !== Correct Answer - 1 ==                      == Spark Answer - 1 ==
>   !struct<array(max(key), max(key)):array<int>>  struct<percentile(key, array(1, 1), 1):array<double>>
>   ![WrappedArray(498, 498)]                      [WrappedArray(498.0, 498.0)]
>   (QueryTest.scala:163){code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
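A note on reading the first diff: the "Correct Answer" column renders lazy Scala `Range` objects while Spark returns materialized `WrappedArray`s, so every row is flagged with `!` even where the element values coincide, because the comparison works on the rendered string form of each row. A minimal sketch of that distinction, written in Python as an analogy to the Scala collections in the log (the names `expected`/`actual` are illustrative, not from the Spark test code):

```python
# Analogy: Python's range() is lazy like Scala's Range, and a plain list
# stands in for WrappedArray. Element values can agree even though the
# string renderings (what the test diff prints) differ for every row.
expected = [range(1, n + 1) for n in range(1, 11)]        # like "Range 1 to n"
actual = [list(range(1, n + 1)) for n in range(1, 11)]    # like "WrappedArray(1, ..., n)"

# Values match once the lazy sequences are materialized...
values_match = all(list(e) == a for e, a in zip(expected, actual))

# ...but the rendered forms never do, so a string-based diff marks all rows.
renderings_match = any(str(e) == str(a) for e, a in zip(expected, actual))
```

This does not explain *why* the 2.12 build produces un-materialized expected rows in the first place, only why the diff reports all ten rows as mismatched at once.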