[jira] [Created] (CARBONDATA-4286) Select query with and filter is giving empty result
Nihal kumar ojha created CARBONDATA-4286:
Summary: Select query with and filter is giving empty result
Key: CARBONDATA-4286
URL: https://issues.apache.org/jira/browse/CARBONDATA-4286
Project: CarbonData
Issue Type: Bug
Reporter: Nihal kumar ojha

A select query with an AND filter condition returns an empty result even though valid matching data is present in the table.

Root cause: when the min-max index is built at block level, the same unsafe byte comparator is currently used for both dimension and measure columns, and byte-wise comparison returns incorrect results for measure columns. Dimension and measure columns should use different comparators, as is already done when writing the min-max index at blocklet level.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
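The root cause above can be illustrated with a minimal sketch (this is not CarbonData's actual comparator code; class and method names here are hypothetical). Measure values are stored as signed numbers, and in two's-complement encoding a negative long starts with 0xFF bytes, so a lexicographic unsigned byte comparison orders negatives above positives:

```java
import java.nio.ByteBuffer;

// Illustrative sketch only (not CarbonData's comparator classes): shows why
// comparing the raw big-endian byte encodings of signed longs, the way an
// unsafe byte comparator does, mis-orders negative measure values.
public class ByteCompareDemo {
    static byte[] encode(long v) {
        // Big-endian 8-byte encoding of a signed long.
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    // Lexicographic comparison treating each byte as unsigned,
    // as a generic byte-level comparator would.
    static int compareBytes(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int x = a[i] & 0xFF, y = b[i] & 0xFF;
            if (x != y) return Integer.compare(x, y);
        }
        return 0;
    }

    public static void main(String[] args) {
        long min = -5, value = 3;
        // Numeric comparison: -5 < 3, so value 3 lies above the block's min.
        System.out.println("numeric:  " + Long.compare(min, value));
        // Byte-wise comparison claims -5 > 3 (encode(-5) begins with 0xFF),
        // so a min-max filter on a measure column could wrongly prune the
        // block and return an empty result.
        System.out.println("bytewise: " + compareBytes(encode(min), encode(value)));
    }
}
```

Dimension columns, by contrast, are stored in a byte order where this lexicographic comparison is valid, which is why the two column kinds need separate comparators.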
[jira] [Resolved] (CARBONDATA-4277) Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)
[ https://issues.apache.org/jira/browse/CARBONDATA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Indhumathi resolved CARBONDATA-4277.
Fix Version/s: 2.3.0
Resolution: Fixed

> Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)
>
> Key: CARBONDATA-4277
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4277
> Project: CarbonData
> Issue Type: Bug
> Affects Versions: 2.2.0
> Environment: Spark 2.4.5, Spark 3.1.1
> Reporter: PURUJIT CHAUGULE
> Priority: Major
> Fix For: 2.3.0
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> *Issue 1: Load on a geospatial table carried over from 2.1.0 fails in 2.2.0 (Spark 2.4.5 and 3.1.1)*
> *STEPS:*
> # Create the table in CarbonData 2.1.0:
> create table source_index_2_1_0(TIMEVALUE BIGINT, LONGITUDE long, LATITUDE long) STORED AS carbondata TBLPROPERTIES ('SPATIAL_INDEX.mygeohash.type'='geohash', 'SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude', 'SPATIAL_INDEX.mygeohash.originLatitude'='39.930753', 'SPATIAL_INDEX.mygeohash.gridSize'='50', 'SPATIAL_INDEX.mygeohash.minLongitude'='116.176090', 'SPATIAL_INDEX.mygeohash.maxLongitude'='116.736367', 'SPATIAL_INDEX.mygeohash.minLatitude'='39.930753', 'SPATIAL_INDEX.mygeohash.maxLatitude'='40.179415', 'SPATIAL_INDEX'='mygeohash', 'SPATIAL_INDEX.mygeohash.conversionRatio'='100');
> # Load data:
> LOAD DATA INPATH 'hdfs://hacluster/chetan/f_lcov_50basic_data.csv' INTO TABLE source_index_2_1_0 OPTIONS('HEADER'='true', 'DELIMITER'='|', 'QUOTECHAR'='|');
> # Copy the table's store files and place them in the HDFS of the CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1) clusters.
> # refresh table source_index_2_1_0;
> # Load data again on the 2.2.0 cluster:
> 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/f_lcov_50basic_data.csv' INTO TABLE source_index_2_1_0 OPTIONS('HEADER'='true', 'DELIMITER'='|', 'QUOTECHAR'='|');
> The load fails with:
> Error: org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.Exception: DataLoad failure: Data Loading failed for table source_index_2_1_0
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
> at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
> at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.Exception: DataLoad failure: Data Loading failed for table source_index_2_1_0
> at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:460)
> at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:226)
> at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:163)
> at org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162)
> at org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118)
> at org.apache.spark.sql.execution.command.Auditable.runWithAudit$(package.scala:114)
> at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:155)
[jira] [Created] (CARBONDATA-4285) complex columns with global sort compaction is failed
Mahesh Raju Somalaraju created CARBONDATA-4285:
Summary: complex columns with global sort compaction is failed
Key: CARBONDATA-4285
URL: https://issues.apache.org/jira/browse/CARBONDATA-4285
Project: CarbonData
Issue Type: Bug
Reporter: Mahesh Raju Somalaraju

Compaction fails on a global-sort table after complex columns have been added.

Steps to reproduce:
1) create a table with global sort
2) load the data multiple times
3) alter the table to add complex columns
4) insert data
5) repeat steps 3 and 4 four times
6) run compaction

Failing test cases (complex type parameters reconstructed from the insert statements):

test("test the complex columns with global sort compaction") {
  sql("DROP TABLE IF EXISTS alter_global1")
  sql("CREATE TABLE alter_global1(intField INT) STORED AS carbondata " +
    "TBLPROPERTIES('sort_columns'='intField','sort_scope'='global_sort')")
  sql("insert into alter_global1 values(1)")
  sql("insert into alter_global1 values(2)")
  sql("insert into alter_global1 values(3)")
  sql("ALTER TABLE alter_global1 ADD COLUMNS(str1 array<int>)")
  sql("insert into alter_global1 values(4, array(1))")
  checkAnswer(sql("select * from alter_global1"),
    Seq(Row(1, null), Row(2, null), Row(3, null), Row(4, make(Array(1)))))
  val addedColumns = addedColumnsInSchemaEvolutionEntry("alter_global1")
  assert(addedColumns.size == 1)
  sql("alter table alter_global1 compact 'minor'")
  checkAnswer(sql("select * from alter_global1"),
    Seq(Row(1, null), Row(2, null), Row(3, null), Row(4, make(Array(1)))))
  sql("DROP TABLE IF EXISTS alter_global1")
}

test("test the multi-level complex columns with global sort compaction") {
  sql("DROP TABLE IF EXISTS alter_global2")
  sql("CREATE TABLE alter_global2(intField INT) STORED AS carbondata " +
    "TBLPROPERTIES('sort_columns'='intField','sort_scope'='global_sort')")
  sql("insert into alter_global2 values(1)")
  // multi-level nested array
  sql("ALTER TABLE alter_global2 ADD COLUMNS(arr1 array<array<int>>, " +
    "arr2 array<struct<a1:string, map1:map<string,string>>>)")
  sql("insert into alter_global2 values(1, array(array(1,2)), array(named_struct('a1','st'," +
    "'map1', map('a','b'))))")
  // multi-level nested struct
  sql("ALTER TABLE alter_global2 ADD COLUMNS(struct1 struct<s1:string, arr:array<int>>," +
    " struct2 struct<num:double, contact:map<string,array<int>>>)")
  sql("insert into alter_global2 values(1, " +
    "array(array(1,2)), array(named_struct('a1','st','map1', map('a','b'))), " +
    "named_struct('s1','hi','arr',array(1,2)), named_struct('num',2.3,'contact',map('ph'," +
    "array(1,2))))")
  // multi-level nested map
  sql("ALTER TABLE alter_global2 ADD COLUMNS(map1 map<string, array<string>>, " +
    "map2 map<string, struct<d:int, s:struct<im:string>>>)")
  sql("insert into alter_global2 values(1, " +
    "array(array(1,2)), array(named_struct('a1','st','map1', map('a','b'))), " +
    "named_struct('s1','hi','arr',array(1,2)), named_struct('num',2.3,'contact',map('ph'," +
    "array(1,2))), map('a',array('hi')), map('a',named_struct('d',23,'s',named_struct('im'," +
    "'sh'))))")
  val addedColumns = addedColumnsInSchemaEvolutionEntry("alter_global2")
  assert(addedColumns.size == 6)
  sql("alter table alter_global2 compact 'minor'")
  sql("DROP TABLE IF EXISTS alter_global2")
}