[jira] [Created] (CARBONDATA-4286) Select query with and filter is giving empty result

2021-09-15 Thread Nihal kumar ojha (Jira)
Nihal kumar ojha created CARBONDATA-4286:


 Summary: Select query with and filter is giving empty result
 Key: CARBONDATA-4286
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4286
 Project: CarbonData
  Issue Type: Bug
Reporter: Nihal kumar ojha


A SELECT query with an AND filter condition on a table returns an empty result
even though matching data is present in the table.

Root cause: Currently, while building the min-max index at block level, we use
the unsafe byte comparator for both dimension and measure columns, which
produces incorrect results for measure columns.

We should use separate comparators for dimension and measure columns, as is
already done when writing the min-max index at blocklet level.
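The mismatch can be illustrated with a small standalone sketch (illustrative only, not CarbonData code): comparing big-endian two's-complement encodings byte-wise, as an unsigned byte comparator does, orders a negative measure value after a positive one, while a typed comparison gives the correct numeric order.

```java
import java.nio.ByteBuffer;

public class MinMaxCompareSketch {
    // Unsigned lexicographic byte comparison, like an unsafe byte comparator.
    static int compareBytes(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int x = a[i] & 0xFF, y = b[i] & 0xFF;
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    // Big-endian two's-complement encoding of a 32-bit signed int.
    static byte[] encode(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    public static void main(String[] args) {
        // -5 encodes as 0xFFFFFFFB, so byte-wise it compares GREATER than 100:
        // a min-max range built this way excludes valid rows, so an AND filter
        // can prune a block that actually contains matching data.
        System.out.println(compareBytes(encode(-5), encode(100)) > 0); // wrong order
        System.out.println(Integer.compare(-5, 100) < 0);              // correct order
    }
}
```

This is why byte comparison is fine for dimensions (stored in a sortable byte form) but a typed comparator is needed for measures.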



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4277) Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 2.2.0 (Spark 2.4.5 and Spark 3.1.1)

2021-09-15 Thread Indhumathi (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi resolved CARBONDATA-4277.

Fix Version/s: 2.3.0
   Resolution: Fixed

> Compatibility Issue of GeoSpatial table of CarbonData 2.1.0 in CarbonData 
> 2.2.0 (Spark 2.4.5 and Spark 3.1.1)
> -
>
> Key: CARBONDATA-4277
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4277
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 2.2.0
> Environment: Spark 2.4.5
> Spark 3.1.1
>Reporter: PURUJIT CHAUGULE
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
>  
>  
> *Issue 1: Loading into a geospatial table created in CarbonData 2.1.0 fails in 
> 2.2.0 (Spark 2.4.5 and 3.1.1)*
> *STEPS:-*
>  # create table in CarbonData 2.1.0 : create table 
> source_index_2_1_0(TIMEVALUE BIGINT,LONGITUDE long,LATITUDE long) STORED AS 
> carbondata TBLPROPERTIES 
> ('SPATIAL_INDEX.mygeohash.type'='geohash','SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude,
>  
> latitude','SPATIAL_INDEX.mygeohash.originLatitude'='39.930753','SPATIAL_INDEX.mygeohash.gridSize'='50','SPATIAL_INDEX.mygeohash.minLongitude'='116.176090','SPATIAL_INDEX.mygeohash.maxLongitude'='116.736367','SPATIAL_INDEX.mygeohash.minLatitude'='39.930753','SPATIAL_INDEX.mygeohash.maxLatitude'='40.179415','SPATIAL_INDEX'='mygeohash','SPATIAL_INDEX.mygeohash.conversionRatio'='100');
>  # LOAD DATA INPATH 'hdfs://hacluster/chetan/f_lcov_50basic_data.csv' INTO 
> TABLE source_index_2_1_0 OPTIONS('HEADER'='true','DELIMITER'='|', 
> 'QUOTECHAR'='|');
>  # Copy the table's store files into HDFS of the CarbonData 2.2.0 (Spark 2.4.5 
> and Spark 3.1.1) clusters
>  # refresh table source_index_2_1_0;
>  # 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 
> 'hdfs://hacluster/chetan/f_lcov_50basic_data.csv' INTO TABLE 
> source_index_2_1_0 OPTIONS('HEADER'='true','DELIMITER'='|', 'QUOTECHAR'='|');
> Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
> java.lang.Exception: DataLoad failure: Data Loading failed for table 
> source_index_2_1_0
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  Caused by: java.lang.Exception: DataLoad failure: Data Loading failed for 
> table source_index_2_1_0
>  at 
> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:460)
>  at 
> org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:226)
>  at 
> org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:163)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162)
>  at 
> org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118)
>  at 
> org.apache.spark.sql.execution.command.Auditable.runWithAudit$(package.scala:114)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:155)
> 

[jira] [Created] (CARBONDATA-4285) complex columns with global sort compaction is failed

2021-09-15 Thread Mahesh Raju Somalaraju (Jira)
Mahesh Raju Somalaraju created CARBONDATA-4285:
--

 Summary: complex columns with global sort compaction is failed
 Key: CARBONDATA-4285
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4285
 Project: CarbonData
  Issue Type: Bug
Reporter: Mahesh Raju Somalaraju


Compaction with global sort fails on a table containing complex columns.

 

Steps to reproduce

--

1) Create a table with global sort.

2) Load data multiple times.

3) Alter the table to add columns.

4) Insert data.

5) Repeat steps 3 and 4 four times.

6) Execute compaction.


test("test the complex columns with global sort compaction") {
  sql("DROP TABLE IF EXISTS alter_global1")
  sql("CREATE TABLE alter_global1(intField INT) STORED AS carbondata " +
    "TBLPROPERTIES('sort_columns'='intField','sort_scope'='global_sort')")
  sql("insert into alter_global1 values(1)")
  sql("insert into alter_global1 values(2)")
  sql("insert into alter_global1 values(3)")
  sql("ALTER TABLE alter_global1 ADD COLUMNS(str1 array<int>)")
  sql("insert into alter_global1 values(4, array(1))")
  checkAnswer(sql("select * from alter_global1"),
    Seq(Row(1, null), Row(2, null), Row(3, null),
      Row(4, mutable.WrappedArray.make(Array(1)))))
  val addedColumns = addedColumnsInSchemaEvolutionEntry("alter_global1")
  assert(addedColumns.size == 1)
  sql("alter table alter_global1 compact 'minor'")
  checkAnswer(sql("select * from alter_global1"),
    Seq(Row(1, null), Row(2, null), Row(3, null),
      Row(4, mutable.WrappedArray.make(Array(1)))))
  sql("DROP TABLE IF EXISTS alter_global1")
}


test("test the multi-level complex columns with global sort compaction") {
  sql("DROP TABLE IF EXISTS alter_global2")
  sql("CREATE TABLE alter_global2(intField INT) STORED AS carbondata " +
    "TBLPROPERTIES('sort_columns'='intField','sort_scope'='global_sort')")
  sql("insert into alter_global2 values(1)")
  // multi-level nested array
  sql("ALTER TABLE alter_global2 ADD COLUMNS(arr1 array<array<int>>, " +
    "arr2 array<struct<a1:string, map1:map<string,string>>>)")
  sql("insert into alter_global2 values(1, array(array(1,2)), " +
    "array(named_struct('a1','st','map1', map('a','b'))))")
  // multi-level nested struct
  sql("ALTER TABLE alter_global2 ADD COLUMNS(struct1 struct<s1:string, arr:array<int>>," +
    " struct2 struct<num:double, contact:map<string,array<int>>>)")
  sql("insert into alter_global2 values(1, " +
    "array(array(1,2)), array(named_struct('a1','st','map1', map('a','b'))), " +
    "named_struct('s1','hi','arr',array(1,2)), " +
    "named_struct('num',2.3,'contact',map('ph',array(1,2))))")
  // multi-level nested map
  sql("ALTER TABLE alter_global2 ADD COLUMNS(map1 map<string,array<string>>, " +
    "map2 map<string,struct<d:int, s:struct<im:string>>>)")
  sql("insert into alter_global2 values(1, " +
    "array(array(1,2)), array(named_struct('a1','st','map1', map('a','b'))), " +
    "named_struct('s1','hi','arr',array(1,2)), " +
    "named_struct('num',2.3,'contact',map('ph',array(1,2))), " +
    "map('a',array('hi')), map('a',named_struct('d',23,'s',named_struct('im','sh'))))")

  val addedColumns = addedColumnsInSchemaEvolutionEntry("alter_global2")
  assert(addedColumns.size == 6)
  sql("alter table alter_global2 compact 'minor'")

  sql("DROP TABLE IF EXISTS alter_global2")
}

