[jira] [Resolved] (CARBONDATA-4342) Desc Column Shows new Column added, even though alter add column operation failed
[ https://issues.apache.org/jira/browse/CARBONDATA-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4342.
-----------------------------------------
Fix Version/s: 2.3.1
     Assignee: Indhumathi
   Resolution: Fixed

> Desc Column Shows new Column added, even though alter add column operation failed
> ---------------------------------------------------------------------------------
>
>         Key: CARBONDATA-4342
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-4342
>     Project: CarbonData
>  Issue Type: Bug
>    Reporter: Indhumathi
>    Assignee: Indhumathi
>    Priority: Minor
>     Fix For: 2.3.1
>
>  Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> # Create a table and add a new column.
> # If the alter add column operation fails in the final step, the revert operation is unsuccessful.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CARBONDATA-4339) Nullpointer exception during load overwrite operation
Akash R Nilugal created CARBONDATA-4339:
----------------------------------------
Summary: Nullpointer exception during load overwrite operation
    Key: CARBONDATA-4339
    URL: https://issues.apache.org/jira/browse/CARBONDATA-4339
Project: CarbonData
Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal

A NullPointerException occurs when load overwrite is performed after delete segment and clean files with the force option set to true.
[jira] [Resolved] (CARBONDATA-4320) Fix clean files removing wrong delta files
[ https://issues.apache.org/jira/browse/CARBONDATA-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4320.
-----------------------------------------
Fix Version/s: 2.3.0
   Resolution: Fixed

> Fix clean files removing wrong delta files
> ------------------------------------------
>
>         Key: CARBONDATA-4320
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-4320
>     Project: CarbonData
>  Issue Type: Bug
>    Reporter: Vikram Ahuja
>    Priority: Major
>     Fix For: 2.3.0
>
>  Time Spent: 3h
> Remaining Estimate: 0h
>
> h1. Fix clean files removing wrong delta files
[jira] [Resolved] (CARBONDATA-4308) Update docs for new streamer properties
[ https://issues.apache.org/jira/browse/CARBONDATA-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4308.
-----------------------------------------
Fix Version/s: 2.3.0
   Resolution: Fixed

> Update docs for new streamer properties
> ---------------------------------------
>
>         Key: CARBONDATA-4308
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-4308
>     Project: CarbonData
>  Issue Type: Sub-task
>  Components: docs
>    Reporter: Pratyaksh Sharma
>    Priority: Major
>     Fix For: 2.3.0
>
>  Time Spent: 14h 20m
> Remaining Estimate: 0h
>
> A lot of new properties have been introduced as part of the streamer tool work. The docs need to be updated accordingly.
[jira] [Resolved] (CARBONDATA-4295) Introduce Streamer tool for Carbondata
[ https://issues.apache.org/jira/browse/CARBONDATA-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4295.
-----------------------------------------
Fix Version/s: 2.3.0
   Resolution: Fixed

> Introduce Streamer tool for Carbondata
> --------------------------------------
>
>         Key: CARBONDATA-4295
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-4295
>     Project: CarbonData
>  Issue Type: New Feature
>  Components: data-load
>    Reporter: Pratyaksh Sharma
>    Priority: Major
>     Fix For: 2.3.0
>
> Introduce a streamer tool that can help ingest data incrementally from various commonly used sources like Kafka, DFS, etc.
[jira] [Created] (CARBONDATA-4318) Partition overwrite performance degrades as number of loads increase
Akash R Nilugal created CARBONDATA-4318:
----------------------------------------
Summary: Partition overwrite performance degrades as number of loads increase
    Key: CARBONDATA-4318
    URL: https://issues.apache.org/jira/browse/CARBONDATA-4318
Project: CarbonData
Issue Type: Improvement
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal

Partition overwrite performance degrades as the number of loads increases.
[jira] [Created] (CARBONDATA-4316) Horizontal compaction fails for partition table
Akash R Nilugal created CARBONDATA-4316:
----------------------------------------
Summary: Horizontal compaction fails for partition table
    Key: CARBONDATA-4316
    URL: https://issues.apache.org/jira/browse/CARBONDATA-4316
Project: CarbonData
Issue Type: Bug
Reporter: Akash R Nilugal
Fix For: 2.3.0

When a delete operation is performed on a partition table, horizontal compaction fails, leading to a lot of small delete delta files and impacting query performance.
[jira] [Resolved] (CARBONDATA-4194) read from presto session throws error after delete operation from complex table from spark session
[ https://issues.apache.org/jira/browse/CARBONDATA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4194.
-----------------------------------------
Fix Version/s: 2.3.0
   Resolution: Fixed

> read from presto session throws error after delete operation from complex table from spark session
> ---------------------------------------------------------------------------------------------------
>
>              Key: CARBONDATA-4194
>              URL: https://issues.apache.org/jira/browse/CARBONDATA-4194
>          Project: CarbonData
>       Issue Type: Bug
>       Components: presto-integration
> Affects Versions: 2.1.1
>      Environment: Spark 2.4.5, Presto SQL 316
>         Reporter: Chetan Bhat
>         Priority: Minor
>          Fix For: 2.3.0
>
>       Time Spent: 7h 20m
> Remaining Estimate: 0h
>
> *Queries executed -*
> From the Spark session, create a table with complex types, load data into the table, and delete data from the table:
>
> create table Struct_com19_PR4031_009 (CUST_ID string, YEAR int, MONTH int, AGE int, GENDER string, EDUCATED string, IS_MARRIED string, STRUCT_INT_DOUBLE_STRING_DATE struct, CARD_COUNT int, DEBIT_COUNT int, CREDIT_COUNT int, DEPOSIT double, HQ_DEPOSIT decimal(20,3)) stored as carbondata;
> LOAD DATA INPATH 'hdfs://hacluster/chetan/Struct.csv' INTO table Struct_com19_PR4031_009 options ('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='CUST_ID,YEAR,MONTH,AGE,GENDER,EDUCATED,IS_MARRIED,STRUCT_INT_DOUBLE_STRING_DATE,CARD_COUNT,DEBIT_COUNT,CREDIT_COUNT,DEPOSIT,HQ_DEPOSIT','COMPLEX_DELIMITER_LEVEL_1'='$');
> delete from Struct_com19_PR4031_009 where EDUCATED='MS';
>
> From the Presto CLI, execute the select queries:
>
> select * from Struct_com19_PR4031_009 limit 1;
> select count(*) from Struct_com19_PR4031_009;
>
> *Issue :-* read from the Presto session throws an error after a delete operation on a complex table from the Spark session.
>
> presto:ranjan> select * from Struct_com19_PR4031_009 limit 1;
> Query 20210528_075917_1_swzys, FAILED, 1 node
> Splits: 18 total, 0 done (0.00%)
> 0:00 [0 rows, 0B] [0 rows/s, 0B/s]
> Query 20210528_075917_1_swzys failed: Error in Reading Data from Carbondata
>
> *Log -*
> org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: Error in Reading Data from Carbondata
>     at org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:491)
>     at org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:467)
>     at io.prestosql.spi.block.LazyBlock.assureLoaded(LazyBlock.java:276)
>     at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:267)
>     at io.prestosql.spi.Page.getLoadedPage(Page.java:261)
>     at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:283)
>     at io.prestosql.operator.Driver.processInternal(Driver.java:379)
>     at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
>     at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
>     at io.prestosql.operator.Driver.processFor(Driver.java:276)
>     at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
>     at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
>     at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
>     at io.prestosql.$gen.Presto_31620210526_073226_1.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassCastException
>     at org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkAndFillVector(DimensionRawColumnChunk.java:140)
>     at org.apache.carbondata.core.scan.scanner.LazyPageLoader.loadPage(LazyPageLoader.java:75)
>     at org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl.loadPage(CarbonColumnVectorImpl.java:531)
>     at org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:483)
>     ... 16 more
> Caused by: java.lang.RuntimeException: java.lang.ClassCastException
>     at org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkAndFillVector(DimensionRawColumnChunk.java:140)
>     at org.apache.carbondata.core.scan.scanner.LazyPageLoader.loadPage(LazyPageLoader.java:75)
>     at org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl.loadPage(CarbonColumnVectorImpl.java:531)
>     at org.apache.carbondata.core.datastore.page.encoding.compress.DirectCompressCodec$3.decodeAndFillVector(DirectCompressCodec.java
[jira] [Resolved] (CARBONDATA-4240) Properties present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java which are
[ https://issues.apache.org/jira/browse/CARBONDATA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4240.
-----------------------------------------
Fix Version/s: 2.3.0
   Resolution: Fixed

> Properties present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java which are not present in open source doc
> ---------------------------------------------------------------------------------------------------
>
>              Key: CARBONDATA-4240
>              URL: https://issues.apache.org/jira/browse/CARBONDATA-4240
>          Project: CarbonData
>       Issue Type: Bug
>       Components: docs
> Affects Versions: 2.2.0
>      Environment: Open source docs
>         Reporter: Chetan Bhat
>         Priority: Minor
>          Fix For: 2.3.0
>
>       Time Spent: 9.5h
> Remaining Estimate: 0h
>
> The properties listed below are present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java but not in the open source doc. These properties need to be added to the open source doc:
>
> carbon.storelocation
> carbon.blocklet.size
> carbon.properties.filepath
> carbon.date.format
> carbon.complex.delimiter.level.1
> carbon.complex.delimiter.level.2
> carbon.complex.delimiter.level.3
> carbon.complex.delimiter.level.4
> carbon.lock.class
> carbon.local.dictionary.enable
> carbon.local.dictionary.decoder.fallback
> spark.deploy.zookeeper.url
> carbon.data.file.version
> spark.carbon.hive.schema.store
> spark.carbon.datamanagement.driver
> spark.carbon.sessionstate.classname
> spark.carbon.sqlastbuilder.classname
> carbon.lease.recovery.retry.count
> carbon.lease.recovery.retry.interval
> carbon.index.schema.storage
> carbon.merge.index.in.segment
> carbon.number.of.cores.while.altPartition
> carbon.minor.compaction.size
> enable.unsafe.columnpage
> carbon.lucene.compression.mode
> sort.inmemory.size.inmb
> is.driver.instance
> carbon.input.metrics.update.interval
> carbon.use.bitset.pipe.line
> is.internal.load.call
> carbon.lucene.index.stop.words
> carbon.load.dateformat.setlenient.enable
> carbon.infilter.subquery.pushdown.enable
> broadcast.record.size
> carbon.indexserver.tempfolder.deletetime
[jira] [Created] (CARBONDATA-4305) Support Carbondata Streamer tool to fetch data incrementally and merge
Akash R Nilugal created CARBONDATA-4305:
----------------------------------------
Summary: Support Carbondata Streamer tool to fetch data incrementally and merge
    Key: CARBONDATA-4305
    URL: https://issues.apache.org/jira/browse/CARBONDATA-4305
Project: CarbonData
Issue Type: Sub-task
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal

Support a Spark streaming application that fetches new incremental data from sources like Kafka and DFS, deduplicates it, and merges the changes onto the target CarbonData table.
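The flow described above — deduplicate the incoming batch on a key, then merge (upsert) the survivors onto the target table — can be sketched in plain Python. This is an illustrative sketch only, not the streamer tool's actual API; the record shape and the `id`/`ts` field names are assumptions made for the example.

```python
def deduplicate(batch, key="id", order="ts"):
    # Keep only the latest record per key within the incoming micro-batch,
    # "latest" meaning the highest value of the ordering field (e.g. a timestamp).
    latest = {}
    for rec in batch:
        k = rec[key]
        if k not in latest or rec[order] > latest[k][order]:
            latest[k] = rec
    return list(latest.values())

def merge(target, batch, key="id"):
    # Upsert: records whose key exists in the target replace the old row,
    # records with a new key are inserted.
    by_key = {row[key]: row for row in target}
    for rec in deduplicate(batch, key):
        by_key[rec[key]] = rec
    return list(by_key.values())

target = [{"id": 1, "ts": 1, "v": "a"}]
batch = [{"id": 1, "ts": 2, "v": "b"},
         {"id": 2, "ts": 1, "v": "c"},
         {"id": 1, "ts": 3, "v": "d"}]
merged = sorted(merge(target, batch), key=lambda r: r["id"])
# id=1 keeps only the latest change (ts=3) from the batch; id=2 is inserted.
```

In the real tool the same two stages would run on Spark DataFrames per micro-batch rather than on Python dicts.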
[jira] [Resolved] (CARBONDATA-4286) Select query with and filter is giving empty result
[ https://issues.apache.org/jira/browse/CARBONDATA-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4286.
-----------------------------------------
Fix Version/s: 2.3.0
   Resolution: Fixed

> Select query with and filter is giving empty result
> ---------------------------------------------------
>
>         Key: CARBONDATA-4286
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-4286
>     Project: CarbonData
>  Issue Type: Bug
>    Reporter: Nihal kumar ojha
>    Priority: Major
>     Fix For: 2.3.0
>
>  Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> A select query with an AND filter condition on a table returns an empty result while valid data is present in the table.
> Root cause: Currently, when building the min-max index at the block level, we use the unsafe byte comparator for both dimension and measure columns, which returns incorrect results for measure columns.
> We should use different comparators for dimension and measure columns, which is already done when writing the min-max index at the blocklet level.
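The root cause above can be illustrated outside CarbonData: a raw (unsigned, lexicographic) byte comparison only agrees with numeric order for non-negative values, so using it for measure (numeric) columns can produce wrong min/max bounds and prune blocks that actually contain matches. A minimal Python sketch of the mismatch (not CarbonData code):

```python
import struct

def encode_int(v: int) -> bytes:
    # Big-endian two's-complement encoding of a signed 64-bit value.
    return struct.pack(">q", v)

def bytewise_less(a: bytes, b: bytes) -> bool:
    # Unsigned lexicographic byte comparison, as an "unsafe" comparator does.
    return a < b

# For non-negative values, byte order matches numeric order:
assert bytewise_less(encode_int(3), encode_int(10))

# A negative value encodes with leading 0xFF bytes, so under unsigned byte
# comparison it looks *larger* than any non-negative value:
assert -1 < 5
assert not bytewise_less(encode_int(-1), encode_int(5))  # byte order disagrees
```

This is why a dedicated numeric comparator is needed for measure columns, while the byte comparator remains fine for dimension columns stored in sorted byte form.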
[jira] [Resolved] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4273.
-----------------------------------------
Fix Version/s: 2.3.0
     Assignee: Indhumathi Muthumurugesh
   Resolution: Fixed

> Cannot create table with partitions in Spark in EMR
> ---------------------------------------------------
>
>              Key: CARBONDATA-4273
>              URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
>          Project: CarbonData
>       Issue Type: Bug
>       Components: spark-integration
> Affects Versions: 2.2.0
>      Environment: Release label: emr-5.24.1
>                   Hadoop distribution: Amazon 2.8.5
>                   Applications: Hive 2.3.4, Pig 0.17.0, Hue 4.4.0, Flink 1.8.0, Spark 2.4.2, Presto 0.219, JupyterHub 0.9.6
>                   Jar compiled with: apache-carbondata:2.2.0, spark:2.4.5, hadoop:2.8.3
>         Reporter: Bigicecream
>         Assignee: Indhumathi Muthumurugesh
>         Priority: Critical
>           Labels: EMR, spark
>          Fix For: 2.3.0
>
>       Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> When trying to create a table like this:
> {code:sql}
> CREATE TABLE IF NOT EXISTS will_not_work(
>   timestamp string,
>   name string
> )
> PARTITIONED BY (dt string, hr string)
> STORED AS carbondata
> LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work'
> {code}
> where the folder 's3a://my-bucket/CarbonDataTests/will_not_work' does not exist, I get the following error:
> {noformat}
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Partition is not supported for external table
>   at org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
>   at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
>   at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
>   at org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
>   at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
>   at org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
>   at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
>   at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
>   ... 64 elided
> {noformat}
[jira] [Resolved] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4241.
-----------------------------------------
Fix Version/s: 2.2.0
     Assignee: Indhumathi Muthumurugesh
   Resolution: Fixed

> if the sort scope is changed to global sort and data loaded, major compaction fails
> -----------------------------------------------------------------------------------
>
>              Key: CARBONDATA-4241
>              URL: https://issues.apache.org/jira/browse/CARBONDATA-4241
>          Project: CarbonData
>       Issue Type: Bug
>       Components: data-load
> Affects Versions: 2.2.0
>      Environment: Spark 2.3.2 Carbon 1.6.1, Spark 3.1.1 Carbon 2.2.0
>         Reporter: Chetan Bhat
>         Assignee: Indhumathi Muthumurugesh
>         Priority: Major
>          Fix For: 2.2.0
>
> *Scenario 1: create a table with 'table_page_size_inmb'='1', load data, set the sort scope to global_sort, load data again, and do a major compaction.*
>
> 0: jdbc:hive2://10.21.19.14:23040/default> CREATE TABLE uniqdata_pagesize (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata TBLPROPERTIES('table_page_size_inmb'='1');
> +---------+
> | Result  |
> +---------+
> +---------+
> No rows selected (0.229 seconds)
> 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> +-------------+
> | Segment ID  |
> +-------------+
> | 0           |
> +-------------+
> 1 row selected (1.016 seconds)
> 0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize set tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
> +---------+
> | Result  |
> +---------+
> +---------+
> No rows selected (0.446 seconds)
> 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> +-------------+
> | Segment ID  |
> +-------------+
> | 1           |
> +-------------+
> 1 row selected (0.767 seconds)
> 0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize compact 'major';
> Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.
>   at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
>   at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
>   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>   at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
>   at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
>   at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
>   at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
>   at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>   at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExec
[jira] [Resolved] (CARBONDATA-4247) Add fix for timestamp issue induced due to Spark3.0 changes
[ https://issues.apache.org/jira/browse/CARBONDATA-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4247.
-----------------------------------------
Fix Version/s: 2.2.0
     Assignee: Indhumathi Muthumurugesh
   Resolution: Fixed

> Add fix for timestamp issue induced due to Spark3.0 changes
> -----------------------------------------------------------
>
>         Key: CARBONDATA-4247
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-4247
>     Project: CarbonData
>  Issue Type: Sub-task
>    Reporter: Vikram Ahuja
>    Assignee: Indhumathi Muthumurugesh
>    Priority: Major
>     Fix For: 2.2.0
>
>  Time Spent: 8.5h
> Remaining Estimate: 0h
>
> Add fix for the timestamp issue induced by the Spark 3.0 changes.
> With Spark 3.1, timestamp values loaded for years before 1900 give wrong results.
> Refer https://github.com/apache/carbondata/blob/master/integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/directdictionary/TimestampNoDictionaryColumnTestCase.scala
[jira] [Created] (CARBONDATA-4252) Support SQL support for the new upsert APIs added
Akash R Nilugal created CARBONDATA-4252:
----------------------------------------
Summary: Support SQL support for the new upsert APIs added
    Key: CARBONDATA-4252
    URL: https://issues.apache.org/jira/browse/CARBONDATA-4252
Project: CarbonData
Issue Type: Sub-task
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal
[jira] [Created] (CARBONDATA-4242) Improve the carbondata CDC merge performance Phase2
Akash R Nilugal created CARBONDATA-4242:
----------------------------------------
Summary: Improve the carbondata CDC merge performance Phase2
    Key: CARBONDATA-4242
    URL: https://issues.apache.org/jira/browse/CARBONDATA-4242
Project: CarbonData
Issue Type: Improvement
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal

Identify the bottlenecks and improve the performance.
[jira] [Updated] (CARBONDATA-3895) Filenotfound exception after global sort compaction
[ https://issues.apache.org/jira/browse/CARBONDATA-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3895:
----------------------------------------
Issue Type: Bug (was: New Feature)

> Filenotfound exception after global sort compaction
> ---------------------------------------------------
>
>         Key: CARBONDATA-3895
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-3895
>     Project: CarbonData
>  Issue Type: Bug
>    Reporter: Akash R Nilugal
>    Assignee: Akash R Nilugal
>    Priority: Major
>
>  Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Filenotfound exception after global sort compaction.
> Execute this test present in the PR:
>
> test("test global sort compaction, clean files, update delete") {
>   sql("DROP TABLE IF EXISTS carbon_global_sort_update")
>   sql(
>     """
>       | CREATE TABLE carbon_global_sort_update(id INT, name STRING, city STRING, age INT)
>       | STORED AS carbondata TBLPROPERTIES('SORT_SCOPE'='GLOBAL_SORT', 'sort_columns' = 'name, city')
>     """.stripMargin)
>   sql(s"LOAD DATA LOCAL INPATH '$filePath' INTO TABLE carbon_global_sort_update")
>   sql(s"LOAD DATA LOCAL INPATH '$filePath' INTO TABLE carbon_global_sort_update")
>   sql("alter table carbon_global_sort_update compact 'major'")
>   sql("clean files for table carbon_global_sort_update")
>   assert(sql("select * from carbon_global_sort_update").count() == 24)
>   val updatedRows = sql("update carbon_global_sort_update d set (id) = (id + 3) where d.name = 'd'").collect()
>   assert(updatedRows.head.get(0) == 2)
>   val deletedRows = sql("delete from carbon_global_sort_update d where d.id = 12").collect()
>   assert(deletedRows.head.get(0) == 2)
>   assert(sql("select * from carbon_global_sort_update").count() == 22)
> }
[jira] [Resolved] (CARBONDATA-3929) Improve the CDC merge feature time
[ https://issues.apache.org/jira/browse/CARBONDATA-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-3929.
-----------------------------------------
Fix Version/s: 2.1.0
               2.1.1
   Resolution: Fixed

> Improve the CDC merge feature time
> ----------------------------------
>
>         Key: CARBONDATA-3929
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-3929
>     Project: CarbonData
>  Issue Type: Improvement
>    Reporter: Akash R Nilugal
>    Assignee: Akash R Nilugal
>    Priority: Major
>     Fix For: 2.1.1, 2.1.0
>
>  Time Spent: 4h
> Remaining Estimate: 0h
>
> Improve the CDC merge feature time
[jira] [Updated] (CARBONDATA-3856) Support the LIMIT operator for show segments command
[ https://issues.apache.org/jira/browse/CARBONDATA-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3856:
----------------------------------------
Fix Version/s: (was: 2.2.0)

> Support the LIMIT operator for show segments command
> ----------------------------------------------------
>
>              Key: CARBONDATA-3856
>              URL: https://issues.apache.org/jira/browse/CARBONDATA-3856
>          Project: CarbonData
>       Issue Type: New Feature
>       Components: spark-integration
> Affects Versions: 2.0.0
>         Reporter: Xingjun Hao
>         Priority: Minor
>
>       Time Spent: 3.5h
> Remaining Estimate: 0h
>
> As of the 2.0.0 release, CarbonData doesn't support the LIMIT operator in the SHOW SEGMENTS command. The time cost is high when there are too many segments.
[jira] [Updated] (CARBONDATA-3603) Feature Change in CarbonData 2.0
[ https://issues.apache.org/jira/browse/CARBONDATA-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3603:
----------------------------------------
Fix Version/s: (was: 2.2.0)

> Feature Change in CarbonData 2.0
> --------------------------------
>
>         Key: CARBONDATA-3603
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-3603
>     Project: CarbonData
>  Issue Type: Improvement
>    Reporter: Jacky Li
>    Priority: Major
[jira] [Updated] (CARBONDATA-3746) Support column chunk cache creation and basic read/write
[ https://issues.apache.org/jira/browse/CARBONDATA-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3746:
----------------------------------------
Fix Version/s: (was: 2.2.0)

> Support column chunk cache creation and basic read/write
> --------------------------------------------------------
>
>         Key: CARBONDATA-3746
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-3746
>     Project: CarbonData
>  Issue Type: Sub-task
>    Reporter: Jacky Li
>    Assignee: Jacky Li
>    Priority: Major
[jira] [Updated] (CARBONDATA-3608) Drop 'STORED BY' syntax in create table
[ https://issues.apache.org/jira/browse/CARBONDATA-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3608:
----------------------------------------
Fix Version/s: (was: 2.2.0)

> Drop 'STORED BY' syntax in create table
> ---------------------------------------
>
>         Key: CARBONDATA-3608
>         URL: https://issues.apache.org/jira/browse/CARBONDATA-3608
>     Project: CarbonData
>  Issue Type: Sub-task
>    Reporter: Jacky Li
>    Priority: Major
[jira] [Updated] (CARBONDATA-3615) Show metacache shows the index server index-dictionary files when data loaded after index server disabled using set command
[ https://issues.apache.org/jira/browse/CARBONDATA-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3615:
----------------------------------------
Fix Version/s: (was: 2.2.0)

> Show metacache shows the index server index-dictionary files when data loaded after index server disabled using set command
> ----------------------------------------------------------------------------------------------------------------------------
>
>              Key: CARBONDATA-3615
>              URL: https://issues.apache.org/jira/browse/CARBONDATA-3615
>          Project: CarbonData
>       Issue Type: Bug
>       Components: core
> Affects Versions: 2.0.0
>         Reporter: Vikram Ahuja
>         Priority: Minor
>
>       Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Show metacache shows the index server index-dictionary files when data is loaded after the index server is disabled using the set command:
>
> +-------------+---------+-------------------------+-----------------+
> | Field       | Size    | Comment                 | Cache Location  |
> +-------------+---------+-------------------------+-----------------+
> | Index       | 0 B     | 0/2 index files cached  | DRIVER          |
> | Dictionary  | 0 B     |                         | DRIVER          |
> | Index       | 1.5 KB  | 2/2 index files cached  | INDEX SERVER    |
> | Dictionary  | 0 B     |                         | INDEX SERVER    |
> +-------------+---------+-------------------------+-----------------+
[jira] [Updated] (CARBONDATA-3643) Insert array('')/array() into Struct column will result in array(null), which is inconsistent with Parquet
[ https://issues.apache.org/jira/browse/CARBONDATA-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3643:
----------------------------------------
Fix Version/s: (was: 2.2.0)

> Insert array('')/array() into Struct column will result in array(null), which is inconsistent with Parquet
> -----------------------------------------------------------------------------------------------------------
>
>              Key: CARBONDATA-3643
>              URL: https://issues.apache.org/jira/browse/CARBONDATA-3643
>          Project: CarbonData
>       Issue Type: Bug
> Affects Versions: 1.6.1, 2.0.0
>         Reporter: Xingjun Hao
>         Priority: Minor
>
> {code:java}
> sql("create table datatype_struct_parquet(price struct<b:array<string>>) stored as parquet")
> sql("insert into table datatype_struct_parquet values(named_struct('b', array('')))")
> sql("create table datatype_struct_carbondata(price struct<b:array<string>>) stored as carbondata")
> sql("insert into datatype_struct_carbondata select * from datatype_struct_parquet")
> checkAnswer(sql("SELECT * FROM datatype_struct_carbondata"), sql("SELECT * FROM datatype_struct_parquet"))
>
> !== Correct Answer - 1 ==   == Spark Answer - 1 ==
> ![[WrappedArray()]]         [[WrappedArray(null)]]
> {code}
[jira] [Updated] (CARBONDATA-3816) Support Float and Decimal in the Merge Flow
[ https://issues.apache.org/jira/browse/CARBONDATA-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3816:
----------------------------------------
Fix Version/s: (was: 2.2.0)

> Support Float and Decimal in the Merge Flow
> -------------------------------------------
>
>              Key: CARBONDATA-3816
>              URL: https://issues.apache.org/jira/browse/CARBONDATA-3816
>          Project: CarbonData
>       Issue Type: New Feature
>       Components: data-load
> Affects Versions: 2.0.0
>         Reporter: Xingjun Hao
>         Priority: Major
>
> We don't support the FLOAT and DECIMAL datatypes in the CDC flow.
[jira] [Updated] (CARBONDATA-4003) Improve IUD Concurrency
[ https://issues.apache.org/jira/browse/CARBONDATA-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-4003: Fix Version/s: (was: 2.2.0) > Improve IUD Concurrency > --- > > Key: CARBONDATA-4003 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4003 > Project: CarbonData > Issue Type: Improvement > Components: spark-integration >Affects Versions: 2.0.1 >Reporter: Kejian Li >Priority: Major > Time Spent: 20h > Remaining Estimate: 0h > > When the state of some of the table's segments is INSERT IN PROGRESS, update > operations on the table fail. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3370) fix missing version of maven-duplicate-finder-plugin
[ https://issues.apache.org/jira/browse/CARBONDATA-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-3370: Fix Version/s: (was: 2.2.0) > fix missing version of maven-duplicate-finder-plugin > > > Key: CARBONDATA-3370 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3370 > Project: CarbonData > Issue Type: Improvement > Components: build >Affects Versions: 1.5.3 >Reporter: lamber-ken >Priority: Critical > Time Spent: 2h 50m > Remaining Estimate: 0h > > fix missing version of maven-duplicate-finder-plugin in pom file -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3559) Support adding carbon file into CarbonData table
[ https://issues.apache.org/jira/browse/CARBONDATA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-3559: Fix Version/s: (was: 2.2.0) > Support adding carbon file into CarbonData table > > > Key: CARBONDATA-3559 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3559 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Assignee: Jacky Li >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Since adding parquet/orc files into a CarbonData table is supported now, > adding carbon files should be supported as well -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4229) Fix carbondata compilation and UTs for Spark 3.1.1
[ https://issues.apache.org/jira/browse/CARBONDATA-4229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4229. - Fix Version/s: 2.2.0 Resolution: Fixed > Fix carbondata compilation and UTs for Spark 3.1.1 > --- > > Key: CARBONDATA-4229 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4229 > Project: CarbonData > Issue Type: Sub-task >Reporter: Vikram Ahuja >Priority: Major > Fix For: 2.2.0 > > > Fix carbondata compilation and UTs for Spark 3.1.1 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4225) Update is Slow and throws exception when auto compaction is enabled
[ https://issues.apache.org/jira/browse/CARBONDATA-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4225. - Fix Version/s: 2.2.0 Assignee: Indhumathi Muthu Murugesh Resolution: Fixed > Update is Slow and throws exception when auto compaction is enabled > --- > > Key: CARBONDATA-4225 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4225 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthu Murugesh >Assignee: Indhumathi Muthu Murugesh >Priority: Major > Fix For: 2.2.0 > > Time Spent: 2h > Remaining Estimate: 0h > >
> sql("""drop table if exists iud.autoMergeUpdate""").collect()
> sql("""create table iud.autoMergeUpdate (c1 string,c2 int,c3 string,c5 string) STORED AS
>   |carbondata tblproperties('auto_load_merge'='true')""".stripMargin)
> sql(s"""LOAD DATA LOCAL INPATH '$resourcesPath/IUD/dest.csv' INTO table iud.autoMergeUpdate""")
> sql("update iud.autoMergeUpdate up_TAble set(up_table.C1)=('abc')").show()
> sql(s"""LOAD DATA LOCAL INPATH '$resourcesPath/IUD/dest.csv' INTO table iud.autoMergeUpdate""")
> sql("update iud.autoMergeUpdate up_TAble set(up_table.C1)=('abcd')").show()
>
> 2021-06-21 19:00:38 ERROR CarbonDataRDDFactory$:674 - Exception in start compaction thread.
> org.apache.carbondata.core.exception.ConcurrentOperationException: update is > in progress for table iud.zerorows, compaction operation is not allowed > at > org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.handleSegmentMerging(CarbonDataRDDFactory.scala:670) > at > org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.triggerEventsAfterLoading(CarbonDataRDDFactory.scala:581) > at > org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:559) > at > org.apache.spark.sql.execution.command.management.CarbonInsertIntoWithDf.insertData(CarbonInsertIntoWithDf.scala:193) > at > org.apache.spark.sql.execution.command.management.CarbonInsertIntoWithDf.process(CarbonInsertIntoWithDf.scala:150) > at > org.apache.spark.sql.execution.command.mutation.CarbonProjectForUpdateCommand.performUpdate(CarbonProjectForUpdateCommand.scala:361) > at > org.apache.spark.sql.execution.command.mutation.CarbonProjectForUpdateCommand.processData(CarbonProjectForUpdateCommand.scala:183) > at > org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:146) > at > org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:146) > at > org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118) > at > org.apache.spark.sql.execution.command.DataCommand.runWithAudit(package.scala:144) > at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:146) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) > at 
org.apache.spark.sql.Dataset$$anonfun$51.apply(Dataset.scala:3265) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3264) > at org.apache.spark.sql.Dataset.(Dataset.scala:190) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) > at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) > at > org.apache.spark.sql.test.SparkTestQueryExecutor.sql(SparkTestQueryExecutor.scala:37) > at org.apache.spark.sql.test.util.QueryTest.sql(QueryTest.scala:121) > at > org.apache.carbondata.spark.testsuite.iud.UpdateCarbonTableTestCase$$anonfun$62.apply$mcV$sp(UpdateCarbonTableTestCase.scala:1197) > at > org.apache.carbondata.spark.testsuite.iud.UpdateCarbonTableTestCase$$anonfun$62.apply(UpdateCarbonTableTestCase.scala:1190) > at > org.apache.carbondata.spark.testsuite.iud.UpdateCarbonTableTestCase$$anonfun$62.apply(UpdateCarbonTableTestCase.scala:1190) > at > org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) > at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.sc
[jira] [Resolved] (CARBONDATA-4211) from xx Insert into select fails if an SQL statement contains multiple inserts
[ https://issues.apache.org/jira/browse/CARBONDATA-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4211. - Fix Version/s: 2.2.0 Resolution: Fixed > from xx Insert into select fails if an SQL statement contains multiple inserts > -- > > Key: CARBONDATA-4211 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4211 > Project: CarbonData > Issue Type: Bug >Reporter: SHREELEKHYA GAMPA >Priority: Major > Fix For: 2.2.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > When multiple inserts are used in a single query, it fails in SparkPlan > with: {{java.lang.ClassCastException: GenericInternalRow cannot be cast to UnsafeRow}}.
> [Steps] :-
> From Spark SQL execute the following queries
> 1. create tables:
> create table catalog_returns_5(cr_returned_date_sk int,cr_returned_time_sk int,cr_item_sk int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' ;
> create table catalog_returns_6(cr_returned_time_sk int,cr_item_sk int) partitioned by (cr_returned_date_sk int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ( 'table_blocksize'='64');
> 2. insert into table:
> from catalog_returns_5 insert overwrite table catalog_returns_6 partition (cr_returned_date_sk) select cr_returned_time_sk, cr_item_sk, cr_returned_date_sk where cr_returned_date_sk is not null distribute by cr_returned_date_sk insert overwrite table catalog_returns_6 partition (cr_returned_date_sk) select cr_returned_time_sk, cr_item_sk, cr_returned_date_sk where cr_returned_date_sk is null distribute by cr_returned_date_sk;
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4212) Update Fails with Unsupported Complex types exception, even if table doesn't have complex column
[ https://issues.apache.org/jira/browse/CARBONDATA-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4212. - Fix Version/s: 2.2.0 Resolution: Fixed > Update Fails with Unsupported Complex types exception, even if table doesn't > have complex column > --- > > Key: CARBONDATA-4212 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4212 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthu Murugesh >Priority: Minor > Fix For: 2.2.0 > > Time Spent: 10m > Remaining Estimate: 0h > >
> drop table if exists iud.zerorows;
> create table iud.zerorows (c1 string,c2 int,c3 string,c5 string) STORED AS carbondata;
> insert into iud.zerorows select 'a',1,'aa','b';
> update iud.zerorows up_TAble set(up_table.c1)=('abc') where up_TABLE.c2=1;
>
> Exception:
> ANTLR Tool version 4.7 used for code generation does not match the current runtime version 4.8. ANTLR Runtime version 4.7 used for parser compilation does not match the current runtime version 4.8. ANTLR Tool version 4.7 used for code generation does not match the current runtime version 4.8. ANTLR Runtime version 4.7 used for parser compilation does not match the current runtime version 4.8.
> org.apache.spark.sql.catalyst.parser.ParseException:
> mismatched input 'update' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 0)
> == SQL ==
> update iud.zerorows up_TAble set(up_table.c1)=('abc') where up_TABLE.c2=1
> ^^^
> at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:239)
> at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:115)
> at
org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48) > at > org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69) > at > org.apache.spark.sql.parser.CarbonExtensionSqlParser.parsePlan(CarbonExtensionSqlParser.scala:60) > at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) > at > org.apache.spark.sql.test.SparkTestQueryExecutor.sql(SparkTestQueryExecutor.scala:37) > at org.apache.spark.sql.test.util.QueryTest.sql(QueryTest.scala:121) > at > org.apache.carbondata.spark.testsuite.iud.UpdateCarbonTableTestCase$$anonfun$61.apply$mcV$sp(UpdateCarbonTableTestCase.scala:1185) > at > org.apache.carbondata.spark.testsuite.iud.UpdateCarbonTableTestCase$$anonfun$61.apply(UpdateCarbonTableTestCase.scala:1181) > at > org.apache.carbondata.spark.testsuite.iud.UpdateCarbonTableTestCase$$anonfun$61.apply(UpdateCarbonTableTestCase.scala:1181) > at > org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) > at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) > at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) > at org.scalatest.Transformer.apply(Transformer.scala:22) > at org.scalatest.Transformer.apply(Transformer.scala:20) > at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) > at > org.apache.spark.sql.test.util.CarbonFunSuite.withFixture(CarbonFunSuite.scala:41) > at > org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) > at > org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) > at > org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) > at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) > at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) > at org.scalatest.FunSuite.runTest(FunSuite.scala:1555) > at > org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) > at > 
org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) > at > org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) > at > org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) > at scala.collection.immutable.List.foreach(List.scala:381) > at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) > at > org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) > at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) > at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) > at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) > at org.scalatest.Suite$class.run(Suite.scala:1424) > at > org.scalatest.FunSuite.org$sc
[jira] [Resolved] (CARBONDATA-4208) Wrong Exception received for complex child long string columns
[ https://issues.apache.org/jira/browse/CARBONDATA-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4208. - Fix Version/s: 2.2.0 Resolution: Fixed > Wrong Exception received for complex child long string columns > -- > > Key: CARBONDATA-4208 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4208 > Project: CarbonData > Issue Type: Bug >Reporter: Mahesh Raju Somalaraju >Priority: Minor > Fix For: 2.2.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Wrong Exception received for complex child long string columns > > reproduce steps: > sql("create table complex2 (a int, arr1 array<string>) " + > "stored as carbondata TBLPROPERTIES('LONG_STRING_COLUMNS'='arr1.val')") > > In this case we should receive an exception saying that string columns that are > children of complex columns do not support long strings, but instead a "column > not found in table" error is received. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4193) Fix compaction failure after alter add complex column.
[ https://issues.apache.org/jira/browse/CARBONDATA-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4193. - Fix Version/s: 2.2.0 Resolution: Fixed > Fix compaction failure after alter add complex column. > --- > > Key: CARBONDATA-4193 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4193 > Project: CarbonData > Issue Type: Bug >Reporter: SHREELEKHYA GAMPA >Priority: Major > Fix For: 2.2.0 > > Time Spent: 6.5h > Remaining Estimate: 0h > > [Steps] :-
> From spark beeline/SQL/Shell/Submit the following queries are executed
> drop table if exists alter_complex;
> create table alter_complex (a int, b string) stored as carbondata;
> insert into alter_complex select 1,'a'; insert into alter_complex select 1,'a'; insert into alter_complex select 1,'a'; insert into alter_complex select 1,'a'; insert into alter_complex select 1,'a';
> select * from alter_complex;
> ALTER TABLE alter_complex ADD COLUMNS(struct1 STRUCT<s1:int,s2:string>);
> insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d'); insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d'); insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d'); insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d'); insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d');
> select * from alter_complex;
> alter table alter_complex compact 'minor'; OR alter table alter_complex compact 'major'; OR alter table alter_complex compact 'custom' where segment.id In (3,4,5,6);
> [Expected Result] :- Compaction should succeed after alter add complex column.
> [Actual Issue] : - Compaction fails after alter add complex column.
(The `STRUCT<s1:int,s2:string>` type above is reconstructed from the `named_struct('s1',4,'s2','d')` inserts; the original generics were stripped by the archive.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (CARBONDATA-4055) Empty segment created and unnecessary entry to table status in update
[ https://issues.apache.org/jira/browse/CARBONDATA-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-4055: Comment: was deleted (was: df.write.format("hudi"). option(COMBINE_BEFORE_UPSERT_PROP, "false") option(PRECOMBINE_FIELD_OPT_KEY, "customerId"). option(RECORDKEY_FIELD_OPT_KEY, "str_uuid"). option(PARTITIONPATH_FIELD_OPT_KEY, ""). option(DataSourceWriteOptions.OPERATION_OPT_KEY, "insert"). option(DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY, "true"). option(DataSourceWriteOptions.HIVE_PARTITION_FIELDS_OPT_KEY, ""). option(DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY, "org.apache.hudi.hive.NonPartitionedExtractor"). option(DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY, "org.apache.hudi.keygen.NonpartitionedKeyGenerator"). option(DataSourceWriteOptions.HIVE_DATABASE_OPT_KEY, db). option(DataSourceWriteOptions.HIVE_TABLE_OPT_KEY, tableName). option(TABLE_NAME, tableName).mode(Append).save(s"/hudicow6/${tableName}")) > Empty segment created and unnecessary entry to table status in update > - > > Key: CARBONDATA-4055 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4055 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Fix For: 2.1.1 > > Time Spent: 5.5h > Remaining Estimate: 0h > > When the update command is executed and no data is updated, empty segment > directories are created and an in progress stale entry added to table status, > and even segment dirs are not cleaned during clean files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4055) Empty segment created and unnecessary entry to table status in update
[ https://issues.apache.org/jira/browse/CARBONDATA-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356414#comment-17356414 ] Akash R Nilugal commented on CARBONDATA-4055: - df.write.format("hudi"). option(COMBINE_BEFORE_UPSERT_PROP, "false") option(PRECOMBINE_FIELD_OPT_KEY, "customerId"). option(RECORDKEY_FIELD_OPT_KEY, "str_uuid"). option(PARTITIONPATH_FIELD_OPT_KEY, ""). option(DataSourceWriteOptions.OPERATION_OPT_KEY, "insert"). option(DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY, "true"). option(DataSourceWriteOptions.HIVE_PARTITION_FIELDS_OPT_KEY, ""). option(DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY, "org.apache.hudi.hive.NonPartitionedExtractor"). option(DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY, "org.apache.hudi.keygen.NonpartitionedKeyGenerator"). option(DataSourceWriteOptions.HIVE_DATABASE_OPT_KEY, db). option(DataSourceWriteOptions.HIVE_TABLE_OPT_KEY, tableName). option(TABLE_NAME, tableName).mode(Append).save(s"/hudicow6/${tableName}") > Empty segment created and unnecessary entry to table status in update > - > > Key: CARBONDATA-4055 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4055 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Fix For: 2.1.1 > > Time Spent: 5.5h > Remaining Estimate: 0h > > When the update command is executed and no data is updated, empty segment > directories are created and an in progress stale entry added to table status, > and even segment dirs are not cleaned during clean files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (CARBONDATA-4055) Empty segment created and unnecessary entry to table status in update
[ https://issues.apache.org/jira/browse/CARBONDATA-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-4055: Comment: was deleted (was: Akash I am Apache carbondata PMC and Committer and Working as Senior Technical Lead at Cloud and AI/data platform team of Banglore Reasearch center, Huawei. I have been working on Bigdata and mainly Apache carbondata for 5 years now and have worked and interested in areas like index support on bigdata, Materialized Views, CDC on bigdata, Spark SQL query optimizations, Spark structured streaming, data lake and data warehouse functionality, trino.Currently I am working on Carbondata CDC. kunal I am Apache carbondata PMC and Committer and Working as System Architect at Cloud and AI/data platform team of Banglore Reasearch center, Huawei working on Bigdata technologies like Apache carbondata, Apache spark, Apache hive for 5 years now. Some of the major features include distributed index cache server, Hive + Carbondata integration, Pre-aggregation support, S3 support for carbondata, Secondary index on carbondata, Spark SQL query optimization in carbondata. ) > Empty segment created and unnecessary entry to table status in update > - > > Key: CARBONDATA-4055 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4055 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Fix For: 2.1.1 > > Time Spent: 5.5h > Remaining Estimate: 0h > > When the update command is executed and no data is updated, empty segment > directories are created and an in progress stale entry added to table status, > and even segment dirs are not cleaned during clean files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4055) Empty segment created and unnecessary entry to table status in update
[ https://issues.apache.org/jira/browse/CARBONDATA-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337242#comment-17337242 ] Akash R Nilugal commented on CARBONDATA-4055: - Akash: I am an Apache CarbonData PMC member and committer, working as Senior Technical Lead in the Cloud and AI/data platform team at the Bangalore Research Center, Huawei. I have been working on big data, mainly Apache CarbonData, for 5 years now, in areas such as index support on big data, materialized views, CDC on big data, Spark SQL query optimizations, Spark structured streaming, data lake and data warehouse functionality, and Trino. Currently I am working on CarbonData CDC. Kunal: I am an Apache CarbonData PMC member and committer, working as System Architect in the Cloud and AI/data platform team at the Bangalore Research Center, Huawei, working on big data technologies like Apache CarbonData, Apache Spark, and Apache Hive for 5 years now. Some of the major features include the distributed index cache server, Hive + CarbonData integration, pre-aggregation support, S3 support for CarbonData, secondary index on CarbonData, and Spark SQL query optimization in CarbonData. > Empty segment created and unnecessary entry to table status in update > - > > Key: CARBONDATA-4055 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4055 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Fix For: 2.1.1 > > Time Spent: 5.5h > Remaining Estimate: 0h > > When the update command is executed and no data is updated, empty segment > directories are created and an in progress stale entry added to table status, > and even segment dirs are not cleaned during clean files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4037) Improve the table status and segment file writing
[ https://issues.apache.org/jira/browse/CARBONDATA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4037. - Fix Version/s: 2.2.0 Resolution: Fixed > Improve the table status and segment file writing > - > > Key: CARBONDATA-4037 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4037 > Project: CarbonData > Issue Type: Improvement >Reporter: SHREELEKHYA GAMPA >Priority: Minor > Fix For: 2.2.0 > > Attachments: Improve table status and segment file writing_1.docx > > Time Spent: 27.5h > Remaining Estimate: 0h > > Currently, we update the table status and segment files multiple times for a > single IUD/merge/compact operation and delete the index files immediately > after merge. When concurrent queries are run, a user query may try to access > segment index files that are no longer present, which is an availability issue. > * To solve the above issue, we can make merge-index file generation mandatory > and fail the load/compaction if merge index fails. Then, if merge index succeeds, > update the table status file and delete the index files immediately. However, for > legacy stores, when alter segment merge is called, do not delete the index files > immediately after merge index succeeds, as that may cause issues for parallel > queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4146) Query fails and the error message "unable to get file status" is displayed. Query is normal after the "drop metacache on table" command is executed.
[ https://issues.apache.org/jira/browse/CARBONDATA-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4146. - Fix Version/s: 2.1.1 Resolution: Fixed > Query fails and the error message "unable to get file status" is displayed. > query is normal after the "drop metacache on table" command is executed. > - > > Key: CARBONDATA-4146 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4146 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > Fix For: 2.1.1 > > Time Spent: 8h 40m > Remaining Estimate: 0h > > During compact execution, the status of the new segment is set to success > before index files are merged. After index files are merged, the carbonindex > files are deleted. As a result, the query task cannot find the cached > carbonindex files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
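The workaround noted in the issue title can be sketched as follows (the table name `db.tbl` is hypothetical):

{code:sql}
-- After compaction deletes the merged carbonindex files, queries can fail
-- with "unable to get file status" because the driver cache still points at
-- the old index files. Dropping the metacache clears the stale entries so
-- the next query re-reads the current index files.
drop metacache on table db.tbl;
select count(*) from db.tbl;
{code}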
[jira] [Resolved] (CARBONDATA-4147) Carbondata 2.1.0 MV ERROR inserting data into table with MV
[ https://issues.apache.org/jira/browse/CARBONDATA-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4147. - Fix Version/s: (was: 2.1.0) 2.1.1 Assignee: Indhumathi Muthumurugesh Resolution: Fixed > Carbondata 2.1.0 MV ERROR inserting data into table with MV > > > Key: CARBONDATA-4147 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4147 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 2.1.0 > Environment: Apache carbondata 2.1.0 >Reporter: Sushant Sammanwar >Assignee: Indhumathi Muthumurugesh >Priority: Major > Labels: datatype,double, materializedviews > Fix For: 2.1.1 > > Attachments: carbondata_210_insert_error_stack-trace > > Time Spent: 3h 10m > Remaining Estimate: 0h > > Hi Team, > > We are working on a POC where we are using carbon 2.1.0. > We have created the below table and MV:
> create table if not exists fact_365_1_eutrancell_21 (ts timestamp, metric STRING, tags_id STRING, value DOUBLE) partitioned by (ts2 timestamp) stored as carbondata TBLPROPERTIES ('SORT_COLUMNS'='metric')
> create materialized view if not exists fact_365_1_eutrancell_21_30_minute as select tags_id ,metric ,ts2, timeseries(ts,'thirty_minute') as ts,sum(value),avg(value),min(value),max(value) from fact_365_1_eutrancell_21 group by metric, tags_id, timeseries(ts,'thirty_minute') ,ts2
>
> When I try to insert data into the above table, the below error is thrown:
> scala> carbon.sql("insert into fact_365_1_eutrancell_21 values ('2020-09-25 05:30:00','eUtranCell.HHO.X2.InterFreq.PrepAttOut','ff6cb0f7-fba0-4134-81ee-55e820574627',392.2345,'2020-09-25 05:30:00')").show()
> 21/03/10 22:32:20 AUDIT audit: {"time":"March 10, 2021 10:32:20 PM IST","username":"root","opName":"INSERT INTO","opId":"33474031950342736","opStatus":"START"}
> [Stage 0:> (0 + 1) / 1]21/03/10 22:32:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch one more time.
> 21/03/10 22:32:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch > one more time. > 21/03/10 22:32:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch > one more time. > 21/03/10 22:32:36 WARN log: Updating partition stats fast for: > fact_365_1_eutrancell_21 > 21/03/10 22:32:36 WARN log: Updated size to 2699 > 21/03/10 22:32:38 AUDIT audit: \{"time":"March 10, 2021 10:32:38 PM > IST","username":"root","opName":"INSERT > OVERWRITE","opId":"33474049863830951","opStatus":"START"} > [Stage 3:==>(199 + 1) / > 200]21/03/10 22:33:07 WARN CarbonOutputIteratorWrapper: try to poll a row > batch one more time. > 21/03/10 22:33:07 WARN CarbonOutputIteratorWrapper: try to poll a row batch > one more time. > 21/03/10 22:33:07 WARN CarbonOutputIteratorWrapper: try to poll a row batch > one more time. > 21/03/10 22:33:07 ERROR CarbonFactDataHandlerColumnar: Error in producer > java.lang.ClassCastException: java.lang.Double cannot be cast to > java.lang.Long > at > org.apache.carbondata.core.datastore.page.ColumnPage.putData(ColumnPage.java:402) > at > org.apache.carbondata.processing.store.TablePage.convertToColumnarAndAddToPages(TablePage.java:239) > at > org.apache.carbondata.processing.store.TablePage.addRow(TablePage.java:201) > at > org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.processDataRows(CarbonFactDataHandlerColumnar.java:397) > at > org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.access$500(CarbonFactDataHandlerColumnar.java:60) > at > org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call(CarbonFactDataHandlerColumnar.java:637) > at > org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call(CarbonFactDataHandlerColumnar.java:614) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > > > It seems the method is converting the "double" data type of the table to a "long" > data type for the MV, and the error is thrown during value conversion. > Could you please check whether this is a defect/bug, or let me know if I have > missed something? > Note: This was working in carbon 2.0.1 > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4144) After the alter table xxx compact command is executed, the index size of the segment is 0, and an error is reported while querying
[ https://issues.apache.org/jira/browse/CARBONDATA-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-4144: Fix Version/s: (was: 2.2.0) 2.1.1 > After the alter table xxx compact command is executed, the index size of the > segment is 0, and an error is reported while querying > - > > Key: CARBONDATA-4144 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4144 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > Fix For: 2.1.1 > > Time Spent: 4h 20m > Remaining Estimate: 0h > > When the 'alter table xxx compact ...' command is executed, the value of > segmentFile is 13010.1_null.segment, and the values of indexSize and dataSize > are 0 in the tablestatus file of the secondary index table. The query fails and > the log displays java.lang.IndexOutOfBoundsException: Index:0, Size:0. -- This message was sent by Atlassian Jira (v8.3.4#803005)
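A hedged sketch of the scenario described above (all table, column, and index names here are hypothetical, not from the issue): a main table with a secondary index, followed by the compaction that leaves the SI segment entry with indexSize/dataSize 0:

{code:sql}
create table maintable (id int, name string, city string) stored as carbondata;
-- Secondary index on the city column (CarbonData 2.x CREATE INDEX syntax).
create index idx_city on table maintable (city) as 'carbondata';
insert into maintable select 1, 'a', 'shenzhen';
insert into maintable select 2, 'b', 'bangalore';
-- Per the report, after this compaction the SI table's tablestatus records
-- a <id>_null.segment file with indexSize and dataSize 0, and queries then
-- fail with java.lang.IndexOutOfBoundsException.
alter table maintable compact 'minor';
{code}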
[jira] [Updated] (CARBONDATA-4145) Query fails and the message "File does not exist: xxxx.carbondata" is displayed
[ https://issues.apache.org/jira/browse/CARBONDATA-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-4145: Fix Version/s: (was: 2.2.0) 2.1.1 > Query fails and the message "File does not exist: .carbondata" is > displayed > --- > > Key: CARBONDATA-4145 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4145 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > Fix For: 2.1.1 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > An exception occurs when the rebuild/refresh index command is executed. After > that, the query command fails to be executed, and the message "File does not > exist: > /user/hive/warehouse/carbon.store/sys/idx_tbl_data_event_carbon_user_num/Fact/Part0/Segment_27670/part-1-28_batchno0-0-x.carbondata" > is displayed and the idx_tbl_data_event_carbon_user_num table is secondary > index table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4145) Query fails and the message "File does not exist: xxxx.carbondata" is displayed
[ https://issues.apache.org/jira/browse/CARBONDATA-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4145. - Fix Version/s: 2.2.0 Resolution: Fixed > Query fails and the message "File does not exist: .carbondata" is > displayed > --- > > Key: CARBONDATA-4145 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4145 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > Fix For: 2.2.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > An exception occurs when the rebuild/refresh index command is executed. After > that, the query command fails to be executed, and the message "File does not > exist: > /user/hive/warehouse/carbon.store/sys/idx_tbl_data_event_carbon_user_num/Fact/Part0/Segment_27670/part-1-28_batchno0-0-x.carbondata" > is displayed and the idx_tbl_data_event_carbon_user_num table is secondary > index table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4144) After the alter table xxx compact command is executed, the index size of the segment is 0, and an error is reported while querying
[ https://issues.apache.org/jira/browse/CARBONDATA-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4144. - Fix Version/s: 2.2.0 Resolution: Fixed > After the alter table xxx compact command is executed, the index size of the > segment is 0, and an error is reported while querying > - > > Key: CARBONDATA-4144 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4144 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > Fix For: 2.2.0 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > When 'alter table xxx compact ...' command is executed, the value of > segmentFile is 13010.1_null.segment, and the values of indexSize and dataSize > are 0 in the tablestatus file of the secondary index table. Query failed and > the log displays java.lang.IndexOutOfBoundsException: Index:0, Size:0. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4145) Query fails and the message "File does not exist: xxxx.carbondata" is displayed
[ https://issues.apache.org/jira/browse/CARBONDATA-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302309#comment-17302309 ] Akash R Nilugal commented on CARBONDATA-4145: - This is a duplicate jira, the issue is already being handled in[ https://github.com/apache/carbondata/pull/3988] refer CARBONDATA-4037 > Query fails and the message "File does not exist: .carbondata" is > displayed > --- > > Key: CARBONDATA-4145 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4145 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > An exception occurs when the rebuild/refresh index command is executed. After > that, the query command fails to be executed, and the message "File does not > exist: > /user/hive/warehouse/carbon.store/sys/idx_tbl_data_event_carbon_user_num/Fact/Part0/Segment_27670/part-1-28_batchno0-0-x.carbondata" > is displayed and the idx_tbl_data_event_carbon_user_num table is secondary > index table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4146) Query fails and the error message "unable to get file status" is displayed. Query is normal after the "drop metacache on table" command is executed.
[ https://issues.apache.org/jira/browse/CARBONDATA-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302291#comment-17302291 ] Akash R Nilugal commented on CARBONDATA-4146: - This is a duplicate jira, the issue is already being handled in[ https://github.com/apache/carbondata/pull/3988|http://example.com] refer CARBONDATA-4037 > Query fails and the error message "unable to get file status" is displayed. > query is normal after the "drop metacache on table" command is executed. > - > > Key: CARBONDATA-4146 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4146 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > Time Spent: 4h 40m > Remaining Estimate: 0h > > During compact execution, the status of the new segment is set to success > before index files are merged. After index files are merged, the carbonindex > files are deleted. As a result, the query task cannot find the cached > carbonindex files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CARBONDATA-4146) Query fails and the error message "unable to get file status" is displayed. Query is normal after the "drop metacache on table" command is executed.
[ https://issues.apache.org/jira/browse/CARBONDATA-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302291#comment-17302291 ] Akash R Nilugal edited comment on CARBONDATA-4146 at 3/16/21, 7:43 AM: --- This is a duplicate jira, the issue is already being handled in[ https://github.com/apache/carbondata/pull/3988] refer CARBONDATA-4037 was (Author: akashrn5): This is a duplicate jira, the issue is already being handled in[ https://github.com/apache/carbondata/pull/3988|http://example.com] refer CARBONDATA-4037 > Query fails and the error message "unable to get file status" is displayed. > query is normal after the "drop metacache on table" command is executed. > - > > Key: CARBONDATA-4146 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4146 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.1, 2.0.0, 2.1.0 >Reporter: liuhe0702 >Priority: Major > Time Spent: 4h 40m > Remaining Estimate: 0h > > During compact execution, the status of the new segment is set to success > before index files are merged. After index files are merged, the carbonindex > files are deleted. As a result, the query task cannot find the cached > carbonindex files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4110) Support clean files dry run and show statistics after clean files operation
[ https://issues.apache.org/jira/browse/CARBONDATA-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301587#comment-17301587 ] Akash R Nilugal commented on CARBONDATA-4110: - https://github.com/apache/carbondata/pull/4072 > Support clean files dry run and show statistics after clean files operation > --- > > Key: CARBONDATA-4110 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4110 > Project: CarbonData > Issue Type: New Feature >Reporter: Vikram Ahuja >Priority: Minor > Time Spent: 26h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4110) Support clean files dry run and show statistics after clean files operation
[ https://issues.apache.org/jira/browse/CARBONDATA-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4110. - Fix Version/s: 2.2.0 Resolution: Fixed > Support clean files dry run and show statistics after clean files operation > --- > > Key: CARBONDATA-4110 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4110 > Project: CarbonData > Issue Type: New Feature >Reporter: Vikram Ahuja >Priority: Minor > Fix For: 2.2.0 > > Time Spent: 26h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4110) Support clean files dry run and show statistics after clean files operation
[ https://issues.apache.org/jira/browse/CARBONDATA-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301586#comment-17301586 ] Akash R Nilugal commented on CARBONDATA-4110: -
Why is this PR needed? Currently, in the clean files operation the user does not know how much space will be freed. The idea is to add support for a dry run in clean files, which can tell the user how much space will be freed by the clean files operation without cleaning the actual data.
What changes were proposed in this PR? This PR has the following changes:
Support dry run in clean files: it will show the user how much space will be freed by the clean files operation and how much space is left (which can be released after the expiration time) after the clean files operation.
Clean files output: total size released during the clean files operation.
Disable clean files statistics option: in case the user does not want clean files statistics.
Clean files log: enhance the clean files log to print the name of every file that is being deleted in the info log.
> Support clean files dry run and show statistics after clean files operation > --- > > Key: CARBONDATA-4110 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4110 > Project: CarbonData > Issue Type: New Feature >Reporter: Vikram Ahuja >Priority: Minor > Time Spent: 26h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
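The dry-run flow described in the comment above would be invoked roughly as below; this is a sketch, and the 'dryrun' option name and table name are assumptions to be checked against the released syntax:

```sql
-- Sketch of the dry-run usage described above; the 'dryrun' option name and
-- table name are assumptions, not confirmed syntax.
CLEAN FILES FOR TABLE mydb.mytable OPTIONS('dryrun'='true');
-- Reports the space that would be freed, without deleting any data.
CLEAN FILES FOR TABLE mydb.mytable;
-- Performs the actual cleanup and shows the total size released.
```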
[jira] [Resolved] (CARBONDATA-4124) Refresh MV which does not exist is not throwing proper message
[ https://issues.apache.org/jira/browse/CARBONDATA-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4124. - Fix Version/s: 2.2.0 Assignee: Indhumathi Muthu Murugesh Resolution: Fixed > Refresh MV which does not exist is not throwing proper message > -- > > Key: CARBONDATA-4124 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4124 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthu Murugesh >Assignee: Indhumathi Muthu Murugesh >Priority: Minor > Fix For: 2.2.0 > > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4125) SI compatibility issue fix
[ https://issues.apache.org/jira/browse/CARBONDATA-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4125. - Fix Version/s: 2.2.0 Assignee: Indhumathi Muthu Murugesh Resolution: Fixed > SI compatibility issue fix > -- > > Key: CARBONDATA-4125 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4125 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthu Murugesh >Assignee: Indhumathi Muthu Murugesh >Priority: Major > Fix For: 2.2.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Refer > [http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Bug-SI-Compatibility-Issue-td105485.html] > for this issue -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4107) MV Performance and Lock issues
[ https://issues.apache.org/jira/browse/CARBONDATA-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4107. - Fix Version/s: 2.2.0 Resolution: Fixed > MV Performance and Lock issues > -- > > Key: CARBONDATA-4107 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4107 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthu Murugesh >Priority: Major > Fix For: 2.2.0 > > Time Spent: 11.5h > Remaining Estimate: 0h > > # After MV support multi-tenancy PR, mv system folder is moved to database > level. Hence, during each operation, insert/Load/IUD/show mv/query, we are > listing all the databases in the system and collecting mv schemas and > checking if there is any mv mapped to the table or not. This will degrade > performance of the query, to collect mv schemas from all databases, even > though the table has mv or not. > # When different jvm process call touchMDTFile method, file creation and > deletion can happen same time. This may fail the operation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4033) Error when using merge API with hive table
[ https://issues.apache.org/jira/browse/CARBONDATA-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258008#comment-17258008 ] Akash R Nilugal commented on CARBONDATA-4033: - can you give more details of queries, because i cannot see table A here, so cannot run to check the error. > Error when using merge API with hive table > -- > > Key: CARBONDATA-4033 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4033 > Project: CarbonData > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1 >Reporter: Nguyen Dinh Huynh >Priority: Major > Labels: easyfix, features, newbie > > I always get this error when trying to upsert hive table. I'm using CDH 6.3.1 > with spark 2.4.3. Is this a bug ? > {code:java} > 2020-10-14 14:59:25 WARN BlockManager:66 - Putting block rdd_21_1 failed due > to exception java.lang.RuntimeException: Store location not set for the key > __temptable-7bdfc88b-e5b7-46d5-8492-dfbb98b9a1b0_1602662359786_null_389ec940-ed27-41d1-9038-72ed1cd162e90x0. 
> 2020-10-14 14:59:25 WARN BlockManager:66 - Block rdd_21_1 could not be removed as it was not found on disk or in memory
> 2020-10-14 14:59:25 ERROR Executor:91 - Exception in task 1.0 in stage 0.0 (TID 1)
> java.lang.RuntimeException: Store location not set for the key __temptable-7bdfc88b-e5b7-46d5-8492-dfbb98b9a1b0_1602662359786_null_389ec940-ed27-41d1-9038-72ed1cd162e90x0
> {code}
> My code is:
> {code:java}
> val map = Map(
>   col("_external_op") -> col("A._external_op"),
>   col("_external_ts_sec") -> col("A._external_ts_sec"),
>   col("_external_row") -> col("A._external_row"),
>   col("_external_pos") -> col("A._external_pos"),
>   col("id") -> col("A.id"),
>   col("order") -> col("A.order"),
>   col("shop_code") -> col("A.shop_code"),
>   col("customer_tel") -> col("A.customer_tel"),
>   col("channel") -> col("A.channel"),
>   col("batch_session_id") -> col("A.batch_session_id"),
>   col("deleted_at") -> col("A.deleted_at"),
>   col("created") -> col("A.created"))
>   .asInstanceOf[Map[Any, Any]]
> val testDf = spark.sqlContext.read.format("carbondata")
>   .option("tableName", "package_drafts")
>   .option("schemaName", "db")
>   .option("dbName", "db")
>   .option("databaseName", "db")
>   .load()
>   .as("B")
> testDf.printSchema()
> testDf.merge(package_draft_view, col("A.id").equalTo(col("B.id")))
>   .whenMatched(col("A._external_op") === "u")
>   .updateExpr(map)
>   .whenMatched(col("A._external_op") === "c")
>   .insertExpr(map)
>   .whenMatched(col("A._external_op") === "d")
>   .delete()
>   .execute()
> {code}
>
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4047) Datediff datatype is not working with spark-2.4.5; even in spark-sql it shows as null. Either the query might be wrong or the versions don't support datatypes.
[ https://issues.apache.org/jira/browse/CARBONDATA-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258005#comment-17258005 ] Akash R Nilugal commented on CARBONDATA-4047: - Also join our slack channel, you can directly ask questions there instead of raising issues directly as it will delay for responses. https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg > Datediff datatype is not working with spark-2.4.5. even in spark-sql its > showing as null. .Either the query might be wrong or the versions don't > support datatypes. > --- > > Key: CARBONDATA-4047 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4047 > Project: CarbonData > Issue Type: Task > Components: spark-integration >Affects Versions: 2.0.0 > Environment: Hadoop - 3.2.1 > Hive - 3.1.2 > Spark - 2.4.5 > carbon data-2.0 > mysql connector jar - mysql-connector-java-8.0.19.jar >Reporter: sravya >Priority: Major > Labels: CarbonData, hadoop, hive, spark2.4 > Attachments: carbon error.PNG, sparksql.PNG > > > 1.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , > intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM > vestedlogs) STORED AS carbondata").show() > :1: error: ')' expected but string literal found. > carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , intck > ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) > STORED AS carbondata").show() > > 2.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT > *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS > "Hours" FROM vestedlogs) STORED AS carbondata").show() > :1: error: ')' expected but string literal found. 
> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT > *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS > "Hours" FROM vestedlogs) STORED AS carbondata").show() > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CARBONDATA-4047) Datediff datatype is not working with spark-2.4.5; even in spark-sql it shows as null. Either the query might be wrong or the versions don't support datat
[ https://issues.apache.org/jira/browse/CARBONDATA-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258004#comment-17258004 ] Akash R Nilugal edited comment on CARBONDATA-4047 at 1/4/21, 6:22 AM: -- The query you have given is wrong, stored as should come first. correct query would be 1. CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (select * , intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) This will also fail, as spark doesnt have intck function, it fails for parquet also, you can check. 2. rewrite the query CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (SELECT *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE()) AS "Hours" FROM vestedlogs) after this again fails for parsing, please check your query again and can refer https://stackoverflow.com/questions/52527571/datediff-in-spark-sql spark doesnt support all, you can check for alternative was (Author: akashrn5): The query you have given is wrong, stored as should come first. correct query would be 1. CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (select * , intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) This will also fail, as spark doesnt have intck function, it fails for parquet also, you can check. 2. rewrite the query CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (SELECT *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE()) AS "Hours" FROM vestedlogs) after this again fails for parsing, please check your query again and can refer [this post|https://stackoverflow.com/questions/52527571/datediff-in-spark-sql] spark doesnt support all, you can check for alternative > Datediff datatype is not working with spark-2.4.5. even in spark-sql its > showing as null. .Either the query might be wrong or the versions don't > support datatypes. 
> --- > > Key: CARBONDATA-4047 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4047 > Project: CarbonData > Issue Type: Task > Components: spark-integration >Affects Versions: 2.0.0 > Environment: Hadoop - 3.2.1 > Hive - 3.1.2 > Spark - 2.4.5 > carbon data-2.0 > mysql connector jar - mysql-connector-java-8.0.19.jar >Reporter: sravya >Priority: Major > Labels: CarbonData, hadoop, hive, spark2.4 > Attachments: carbon error.PNG, sparksql.PNG > > > 1.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , > intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM > vestedlogs) STORED AS carbondata").show() > :1: error: ')' expected but string literal found. > carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , intck > ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) > STORED AS carbondata").show() > > 2.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT > *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS > "Hours" FROM vestedlogs) STORED AS carbondata").show() > :1: error: ')' expected but string literal found. > carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT > *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS > "Hours" FROM vestedlogs) STORED AS carbondata").show() > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4047) Datediff datatype is not working with spark-2.4.5; even in spark-sql it shows as null. Either the query might be wrong or the versions don't support datatypes.
[ https://issues.apache.org/jira/browse/CARBONDATA-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258004#comment-17258004 ] Akash R Nilugal commented on CARBONDATA-4047: - The query you have given is wrong, stored as should come first. correct query would be 1. CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (select * , intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) This will also fail, as spark doesnt have intck function, it fails for parquet also, you can check. 2. rewrite the query CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (SELECT *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE()) AS "Hours" FROM vestedlogs) after this again fails for parsing, please check your query again and can refer [this post|https://stackoverflow.com/questions/52527571/datediff-in-spark-sql] spark doesnt support all, you can check for alternative > Datediff datatype is not working with spark-2.4.5. even in spark-sql its > showing as null. .Either the query might be wrong or the versions don't > support datatypes. > --- > > Key: CARBONDATA-4047 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4047 > Project: CarbonData > Issue Type: Task > Components: spark-integration >Affects Versions: 2.0.0 > Environment: Hadoop - 3.2.1 > Hive - 3.1.2 > Spark - 2.4.5 > carbon data-2.0 > mysql connector jar - mysql-connector-java-8.0.19.jar >Reporter: sravya >Priority: Major > Labels: CarbonData, hadoop, hive, spark2.4 > Attachments: carbon error.PNG, sparksql.PNG > > > 1.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , > intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM > vestedlogs) STORED AS carbondata").show() > :1: error: ')' expected but string literal found. 
> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , intck > ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) > STORED AS carbondata").show() > > 2.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT > *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS > "Hours" FROM vestedlogs) STORED AS carbondata").show() > :1: error: ')' expected but string literal found. > carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT > *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS > "Hours" FROM vestedlogs) STORED AS carbondata").show() > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
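As the comment above notes, Spark SQL has no intck function and its datediff works in days, so an hour difference has to be derived another way. A commonly used alternative, sketched here under the assumption that StartTimestamp and CompleteTimestamp are real timestamp columns, is to subtract unix timestamps:

```sql
-- Sketch only: Spark SQL has no DATEDIFF(HOUR, ...) or intck; the hour gap
-- can instead be derived from unix_timestamp values (column names from the
-- report, assumed to be timestamp-typed).
CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata AS
SELECT *,
       (unix_timestamp(CompleteTimestamp) - unix_timestamp(StartTimestamp)) / 3600 AS Hours
FROM vestedlogs;
```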
[jira] [Resolved] (CARBONDATA-4099) Fix Concurrent issues with clean files post event listener
[ https://issues.apache.org/jira/browse/CARBONDATA-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4099. - Fix Version/s: 2.2.0 Resolution: Fixed > Fix Concurrent issues with clean files post event listener > -- > > Key: CARBONDATA-4099 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4099 > Project: CarbonData > Issue Type: Bug >Reporter: Vikram Ahuja >Priority: Major > Fix For: 2.2.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4088) Drop metacache didn't clear some cache information which leads to memory leak
[ https://issues.apache.org/jira/browse/CARBONDATA-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17254093#comment-17254093 ] Akash R Nilugal commented on CARBONDATA-4088: - please handle [https://issues.apache.org/jira/browse/CARBONDATA-4098|https://issues.apache.org/jira/browse/CARBONDATA-4098] > Drop metacache didn't clear some cache information which leads to memory leak > - > > Key: CARBONDATA-4088 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4088 > Project: CarbonData > Issue Type: Improvement > Components: core >Affects Versions: 2.1.0 >Reporter: Yahui Liu >Priority: Minor > Time Spent: 6h > Remaining Estimate: 0h > > When there are two spark applications, one drop a table, some cache > information of this table stay in another application and cannot be removed > with any method like "Drop metacache" command. This leads to memory leak. > With the passage of time, memory leak will also accumulate which finally > leads to driver OOM. Following are the leak points: 1) tableModifiedTimeStore > in CarbonFileMetastore; 2) segmentLockMap in BlockletDataMapIndexStore; 3) > absoluteTableIdentifierByteMap in SegmentPropertiesAndSchemaHolder; 4) > tableInfoMap in CarbonMetadata. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4095) Select Query with SI filter fails, when columnDrift is enabled
[ https://issues.apache.org/jira/browse/CARBONDATA-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4095. - Fix Version/s: 2.2.0 Assignee: Indhumathi Muthu Murugesh Resolution: Fixed > Select Query with SI filter fails, when columnDrift is enabled > -- > > Key: CARBONDATA-4095 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4095 > Project: CarbonData > Issue Type: Improvement >Reporter: Indhumathi Muthu Murugesh >Assignee: Indhumathi Muthu Murugesh >Priority: Major > Fix For: 2.2.0 > > Time Spent: 1.5h > Remaining Estimate: 0h >
> sql("drop table if exists maintable")
> sql("create table maintable (a string,b string,c int,d int) STORED AS carbondata ")
> sql("insert into maintable values('k','d',2,3)")
> sql("alter table maintable set tblproperties('sort_columns'='c,d','sort_scope'='local_sort')")
> sql("create index indextable on table maintable(b) AS 'carbondata'")
> sql("insert into maintable values('k','x',2,4)")
> sql("select * from maintable where b='x'").show(false)
>
> 2020-12-22 18:58:37 ERROR Executor:91 - Exception in task 0.0 in stage 40.0 (TID 422)
> java.lang.RuntimeException: Error while resolving filter expression
>     at org.apache.carbondata.core.index.IndexFilter.resolveFilter(IndexFilter.java:283)
>     at org.apache.carbondata.core.index.IndexFilter.getResolver(IndexFilter.java:203)
>     at org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.initQuery(AbstractQueryExecutor.java:152)
>     at org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.getBlockExecutionInfos(AbstractQueryExecutor.java:382)
>     at org.apache.carbondata.core.scan.executor.impl.VectorDetailQueryExecutor.execute(VectorDetailQueryExecutor.java:43)
>     at org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.initialize(VectorizedCarbonRecordReader.java:141)
>     at org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:540)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.scan_nextBatch_0$(Unknown Source)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$12$$anon$1.hasNext(WholeStageCodegenExec.scala:631)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:836)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:836)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>     at org.apache.spark.scheduler.Task.run(Task.scala:109)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>     at org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.getFilterResolverBasedOnExpressionType(FilterExpressionProcessor.java:190)
>     at org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.createFilterResolverTree(FilterExpressionProcessor.java:128)
>     at org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.createFilterResolverTree(FilterExpressionProcessor.java:121)
>     at org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.getFilterResolverTree(FilterExpressionProcessor.java:77)
>     at org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.getFilterResolver(FilterExpressionProcessor.java:61)
>     at org.apache.carbondata.core.index.IndexFilter.resolveFilter(IndexFilter.java:281)
> ... 26 more
> 2020-12-22 18:58:37 ERROR TaskSetMan
[jira] [Resolved] (CARBONDATA-4093) Add logs for MV and method to verify if mv is in Sync during query
[ https://issues.apache.org/jira/browse/CARBONDATA-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4093. - Fix Version/s: 2.2.0 Assignee: Indhumathi Muthu Murugesh Resolution: Fixed > Add logs for MV and method to verify if mv is in Sync during query > -- > > Key: CARBONDATA-4093 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4093 > Project: CarbonData > Issue Type: Improvement >Reporter: Indhumathi Muthu Murugesh >Assignee: Indhumathi Muthu Murugesh >Priority: Minor > Fix For: 2.2.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4076) Query having Subquery alias used in query projection does not hit MV after creation
[ https://issues.apache.org/jira/browse/CARBONDATA-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4076. - Resolution: Fixed > Query having Subquery alias used in query projection does not hit MV after > creation > -- > > Key: CARBONDATA-4076 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4076 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthu Murugesh >Priority: Minor > Fix For: 2.2.0 > > Time Spent: 5h 20m > Remaining Estimate: 0h > >
> CREATE TABLE fact_table1 (empname String, designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp, attendance int, utilization int, salary int) STORED AS carbondata;
> create materialized view mv_sub as select empname, sum(result) sum_ut from (select empname, utilization result from fact_table1) fact_table1 group by empname;
> select empname, sum(result) sum_ut from (select empname, utilization result from fact_table1) fact_table1 group by empname;
> explain select empname, sum(result) sum_ut from (select empname, utilization result from fact_table1) fact_table1 group by empname;
> Expected: Query should hit MV
> Actual: Query is not hitting MV
> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4052) Select query on SI table after insert overwrite is giving wrong result.
[ https://issues.apache.org/jira/browse/CARBONDATA-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4052. - Fix Version/s: 2.2.0 Resolution: Fixed > Select query on SI table after insert overwrite is giving wrong result. > --- > > Key: CARBONDATA-4052 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4052 > Project: CarbonData > Issue Type: Bug >Reporter: Nihal kumar ojha >Priority: Major > Fix For: 2.2.0 > > Time Spent: 4h 20m > Remaining Estimate: 0h > > # Create carbon table. > # Create SI table on the same carbon table. > # Do load or insert operation. > # Run insert overwrite query on the maintable. > # Now the select query on the SI table shows old as well as new data, when it > should show only the new data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
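The five repro steps above can be sketched in SQL. This is a minimal illustration, not taken from the report: the table, column, and index names are hypothetical, and the syntax follows the CarbonData 2.x secondary-index support.

```sql
-- Hypothetical names; syntax per CarbonData 2.x SI support.
CREATE TABLE maintable (id INT, name STRING, city STRING) STORED AS carbondata;
CREATE INDEX si_city ON TABLE maintable (city) AS 'carbondata';
INSERT INTO maintable VALUES (1, 'aaa', 'pune');
INSERT OVERWRITE TABLE maintable SELECT 2, 'bbb', 'delhi';
-- Before the fix, a query served via the SI could still surface the
-- pre-overwrite rows; only the overwritten data should be visible.
SELECT * FROM maintable WHERE city = 'delhi';
```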
[jira] [Created] (CARBONDATA-4055) Empty segment created and unnecessary entry to table status in update
Akash R Nilugal created CARBONDATA-4055: --- Summary: Empty segment created and unnecessary entry to table status in update Key: CARBONDATA-4055 URL: https://issues.apache.org/jira/browse/CARBONDATA-4055 Project: CarbonData Issue Type: Bug Reporter: Akash R Nilugal Assignee: Akash R Nilugal When the update command is executed and no data is updated, empty segment directories are created and a stale in-progress entry is added to the table status; moreover, these segment dirs are not cleaned during clean files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
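A minimal sketch of the scenario described above, using CarbonData's update syntax. The table name and predicate are hypothetical; the point is that the update matches no rows.

```sql
CREATE TABLE t (id INT, name STRING) STORED AS carbondata;
INSERT INTO t VALUES (1, 'aaa');
-- No row matches, so nothing should change; per the report this still
-- created an empty segment directory and a stale in-progress entry in
-- the table status file.
UPDATE t SET (name) = ('bbb') WHERE id = 999;
-- The stale segment directories were also not removed by clean files.
CLEAN FILES FOR TABLE t;
```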
[jira] [Resolved] (CARBONDATA-4042) Insert into select and CTAS launches fewer tasks (task count limited to number of nodes in cluster) even when target table is of no_sort
[ https://issues.apache.org/jira/browse/CARBONDATA-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4042. - Fix Version/s: 2.1.0 Resolution: Fixed > Insert into select and CTAS launches fewer tasks (task count limited to number > of nodes in cluster) even when target table is of no_sort > --- > > Key: CARBONDATA-4042 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4042 > Project: CarbonData > Issue Type: Improvement > Components: data-load, spark-integration >Reporter: Venugopal Reddy K >Priority: Major > Fix For: 2.1.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > *Issue:* > At present, when we do insert into table select from or create table as > select from, we launch a single task per node. Whereas when we do a simple > select * from table query, the tasks launched are equal to the number of carbondata > files (CARBON_TASK_DISTRIBUTION default is CARBON_TASK_DISTRIBUTION_BLOCK). > This slows down the load performance of the insert into select and CTAS cases. > Refer [Community discussion regd. task > launch|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Query-Regarding-Task-launch-mechanism-for-data-load-operations-tt98711.html] > > *Suggestion:* > Launch the same number of tasks as in the select query for the insert into select and > CTAS cases when the target table is no-sort. -- This message was sent by Atlassian Jira (v8.3.4#803005)
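For context, a no-sort CTAS of the kind the issue describes can be sketched as below. The table names are illustrative, and the TBLPROPERTIES/CTAS form is a sketch of the CarbonData DDL rather than a quote from the report.

```sql
CREATE TABLE source (id INT, name STRING) STORED AS carbondata;
-- Target is NO_SORT, so per-node row ordering is not required; the
-- improvement lets insert-select/CTAS launch as many tasks as a plain
-- SELECT over the source, instead of one task per node.
CREATE TABLE target
STORED AS carbondata
TBLPROPERTIES ('SORT_SCOPE' = 'NO_SORT')
AS SELECT * FROM source;
```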
[jira] [Comment Edited] (CARBONDATA-3354) how to use filters in datamaps
[ https://issues.apache.org/jira/browse/CARBONDATA-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218178#comment-17218178 ] Akash R Nilugal edited comment on CARBONDATA-3354 at 10/21/20, 8:06 AM: Hi [~imsuyash] Sorry for the late reply. CarbonData has many improved features in its latest version, so you can check our documentation and try your scenarios; any doubt you can ask directly in the Slack channel, as all dev members are present there. https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg Also, the above error is a valid one, as we support timeseries only on timestamp columns. In our latest version, we support all the granularities from year to second. You can refer here https://github.com/apache/carbondata/blob/master/docs/mv-guide.md#time-series-support Thanks was (Author: akashrn5): Hi [~imsuyash] Sorry for the late reply. CarbonData has many improved features in its latest version, so you can check our documentation and try your scenarios; any doubt you can ask directly in the Slack channel, as all dev members are present there. https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg Also, the above error is a valid one, as we support timeseries only on timestamp columns. Thanks > how to use filters in datamaps > --- > > Key: CARBONDATA-3354 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3354 > Project: CarbonData > Issue Type: Task > Components: core >Affects Versions: 1.5.2 > Environment: apache carbon data 1.5.x >Reporter: suyash yadav >Priority: Major > > Hi Team, > > We are doing a POC on Apache CarbonData so that we can verify whether this > database is capable of handling the amount of data we are collecting from network > devices. > > We are stuck on a few of our datamap related activities and have the below queries: > > # How to use time-based filters while creating a datamap. We tried a time-based > condition while creating a datamap but it didn't work. 
> # How to create a timeseries datamap on a column which holds an epoch time value. Our query is like below: *carbon.sql("CREATE DATAMAP test ON > TABLE carbon_RT_test USING 'timeseries' DMPROPERTIES > ('event_time'='endMs','minute_granularity'='1',) AS SELECT sum(inOctets) FROM > carbon_RT_test GROUP BY inIfId")* > # *In the above query endMs holds an epoch time value.* > # We got an error like below: "Timeseries event time is only supported on > Timestamp column" > # Also we need to know if we can have a time granularity other than 1 as in the above query, e.g. can we have minute_granularity='5'. -- This message was sent by Atlassian Jira (v8.3.4#803005)
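Per the comment above and the linked mv-guide, the timeseries rollup works on a TIMESTAMP column, not an epoch value such as endMs, and the granularity is chosen from fixed names rather than arbitrary multiples. A hedged sketch of the 2.x materialized-view syntax follows; the table and column names are illustrative, and the exact granularity list should be verified against the mv-guide for your version.

```sql
CREATE TABLE sensor_rt (
  event_time TIMESTAMP,   -- must be TIMESTAMP, not epoch millis
  in_if_id   STRING,
  in_octets  BIGINT
) STORED AS carbondata;

-- Granularity is a fixed name ('minute', 'five_minute', 'hour', ...),
-- not a numeric multiplier like minute_granularity='5'.
CREATE MATERIALIZED VIEW mv_minute AS
  SELECT timeseries(event_time, 'minute'), in_if_id, SUM(in_octets)
  FROM sensor_rt
  GROUP BY timeseries(event_time, 'minute'), in_if_id;
```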
[jira] [Commented] (CARBONDATA-3354) how to use filters in datamaps
[ https://issues.apache.org/jira/browse/CARBONDATA-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218178#comment-17218178 ] Akash R Nilugal commented on CARBONDATA-3354: - Hi [~imsuyash] Sorry for the late reply. CarbonData has many improved features in its latest version, so you can check our documentation and try your scenarios; any doubt you can ask directly in the Slack channel, as all dev members are present there. https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg Also, the above error is a valid one, as we support timeseries only on timestamp columns. Thanks > how to use filters in datamaps > --- > > Key: CARBONDATA-3354 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3354 > Project: CarbonData > Issue Type: Task > Components: core >Affects Versions: 1.5.2 > Environment: apache carbon data 1.5.x >Reporter: suyash yadav >Priority: Major > > Hi Team, > > We are doing a POC on Apache CarbonData so that we can verify whether this > database is capable of handling the amount of data we are collecting from network > devices. > > We are stuck on a few of our datamap related activities and have the below queries: > > # How to use time-based filters while creating a datamap. We tried a time-based > condition while creating a datamap but it didn't work. > # How to create a timeseries datamap on a column which holds an epoch time value. Our query is like below: *carbon.sql("CREATE DATAMAP test ON > TABLE carbon_RT_test USING 'timeseries' DMPROPERTIES > ('event_time'='endMs','minute_granularity'='1',) AS SELECT sum(inOctets) FROM > carbon_RT_test GROUP BY inIfId")* > # *In the above query endMs holds an epoch time value.* > # We got an error like below: "Timeseries event time is only supported on > Timestamp column" > # Also we need to know if we can have a time granularity other than 1 as in the above query, e.g. can we have minute_granularity='5'. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-3970) Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying tableProperties operation failed
[ https://issues.apache.org/jira/browse/CARBONDATA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218172#comment-17218172 ] Akash R Nilugal commented on CARBONDATA-3970: - [~sushantsam] please join our slack channel. It will be easy to discuss all the issues , as JIRA does not notify all the users. https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg > Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying > tableProperties operation failed > -- > > Key: CARBONDATA-3970 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3970 > Project: CarbonData > Issue Type: Bug > Components: data-query, hive-integration >Affects Versions: 2.0.1 > Environment: CarbonData 2.0.1 with Spark 2.4.5 >Reporter: Sushant Sammanwar >Priority: Major > > Hi , > > I am facing issues with materialized views - the query is not hitting the > view in the explain plan .I would really appreciate if you could help me on > this. > Below are the details : > I am using Spark shell to connect to Carbon 2.0.1 using spark 2.4.5 > Underlying table has data loaded. > I think problem is while create materialized view as i am getting a error > related to metastore. 
> > > scala> carbon.sql("create MATERIALIZED VIEW agg_sales_mv as select country, > sex,sum(quantity),avg(price) from sales group by country,sex").show() > 20/08/26 01:04:41 AUDIT audit: \{"time":"August 26, 2020 1:04:41 AM > IST","username":"root","opName":"CREATE MATERIALIZED > VIEW","opId":"16462372696035311","opStatus":"START"} > 20/08/26 01:04:45 AUDIT audit: \{"time":"August 26, 2020 1:04:45 AM > IST","username":"root","opName":"CREATE > TABLE","opId":"16462377160819798","opStatus":"START"} > 20/08/26 01:04:46 AUDIT audit: \{"time":"August 26, 2020 1:04:46 AM > IST","username":"root","opName":"CREATE > TABLE","opId":"16462377696791275","opStatus":"START"} > 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM > IST","username":"root","opName":"CREATE > TABLE","opId":"16462377696791275","opStatus":"SUCCESS","opTime":"2326 > ms","table":"NA","extraInfo":{}} > 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM > IST","username":"root","opName":"CREATE > TABLE","opId":"16462377160819798","opStatus":"SUCCESS","opTime":"2955 > ms","table":"default.agg_sales_mv","extraInfo":{"local_dictionary_threshold":"1","bad_record_path":"","table_blocksize":"1024","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","carbon.column.compressor":"snappy","mv_related_tables":"sales"}} > 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying > tableProperties operation failed: > org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to > org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener > 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying > tableProperties operation failed: > org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to > org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener > 20/08/26 01:04:51 AUDIT audit: \{"time":"August 26, 2020 1:04:51 AM > IST","username":"root","opName":"CREATE MATERIALIZED > 
VIEW","opId":"16462372696035311","opStatus":"SUCCESS","opTime":"10551 > ms","table":"NA","extraInfo":{"mvName":"agg_sales_mv"}} > ++ > || > ++ > ++ > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-3970) Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying tableProperties operation failed
[ https://issues.apache.org/jira/browse/CARBONDATA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218169#comment-17218169 ] Akash R Nilugal commented on CARBONDATA-3970: - I think the issue with the carbon configurations. [~sushantsam] have you configured carbonExtensions and using sparksession itself for the queries ? Because we support and integrated SparkExtensions. Please can you tell us what is your configurations for the integration. > Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying > tableProperties operation failed > -- > > Key: CARBONDATA-3970 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3970 > Project: CarbonData > Issue Type: Bug > Components: data-query, hive-integration >Affects Versions: 2.0.1 > Environment: CarbonData 2.0.1 with Spark 2.4.5 >Reporter: Sushant Sammanwar >Priority: Major > > Hi , > > I am facing issues with materialized views - the query is not hitting the > view in the explain plan .I would really appreciate if you could help me on > this. > Below are the details : > I am using Spark shell to connect to Carbon 2.0.1 using spark 2.4.5 > Underlying table has data loaded. > I think problem is while create materialized view as i am getting a error > related to metastore. 
> > > scala> carbon.sql("create MATERIALIZED VIEW agg_sales_mv as select country, > sex,sum(quantity),avg(price) from sales group by country,sex").show() > 20/08/26 01:04:41 AUDIT audit: \{"time":"August 26, 2020 1:04:41 AM > IST","username":"root","opName":"CREATE MATERIALIZED > VIEW","opId":"16462372696035311","opStatus":"START"} > 20/08/26 01:04:45 AUDIT audit: \{"time":"August 26, 2020 1:04:45 AM > IST","username":"root","opName":"CREATE > TABLE","opId":"16462377160819798","opStatus":"START"} > 20/08/26 01:04:46 AUDIT audit: \{"time":"August 26, 2020 1:04:46 AM > IST","username":"root","opName":"CREATE > TABLE","opId":"16462377696791275","opStatus":"START"} > 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM > IST","username":"root","opName":"CREATE > TABLE","opId":"16462377696791275","opStatus":"SUCCESS","opTime":"2326 > ms","table":"NA","extraInfo":{}} > 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM > IST","username":"root","opName":"CREATE > TABLE","opId":"16462377160819798","opStatus":"SUCCESS","opTime":"2955 > ms","table":"default.agg_sales_mv","extraInfo":{"local_dictionary_threshold":"1","bad_record_path":"","table_blocksize":"1024","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","carbon.column.compressor":"snappy","mv_related_tables":"sales"}} > 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying > tableProperties operation failed: > org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to > org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener > 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying > tableProperties operation failed: > org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to > org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener > 20/08/26 01:04:51 AUDIT audit: \{"time":"August 26, 2020 1:04:51 AM > IST","username":"root","opName":"CREATE MATERIALIZED > 
VIEW","opId":"16462372696035311","opStatus":"SUCCESS","opTime":"10551 > ms","table":"NA","extraInfo":{"mvName":"agg_sales_mv"}} > ++ > || > ++ > ++ > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4025) storage space for MV is double to that of a table on which MV has been created.
[ https://issues.apache.org/jira/browse/CARBONDATA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218159#comment-17218159 ] Akash R Nilugal commented on CARBONDATA-4025: - Hi, the MV stores aggregated data, so how is the number of rows the same in the MV as well? Can you give further details such as the test queries and which granularity you tried? It would help to find the problem, if any, or to suggest the proper way. Also, please join and discuss in the Slack channel, as JIRA won't notify everyone. https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg Thanks > storage space for MV is double to that of a table on which MV has been > created. > --- > > Key: CARBONDATA-4025 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4025 > Project: CarbonData > Issue Type: Improvement > Components: core >Affects Versions: 2.0.1 > Environment: Apache carbondata 2.0.1 > Apache spark 2.4.5 > Hadoop 2.7.2 >Reporter: suyash yadav >Priority: Major > > We are doing a POC based on carbondata, but we have observed that when we > create an MV on a table with a timeseries function of the same granularity, the MV > takes double the space of the table. > > In my scenario, my table has 1.3 million records and the MV also has the same number > of records, but the size of the table is 3.6 MB while the size of the MV is > around 6.5 MB. > This is really important for us as critical business decisions are getting > affected due to this behaviour. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3934) Support insert into command for transactional support
[ https://issues.apache.org/jira/browse/CARBONDATA-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-3934: Attachment: Presto_write_flow.pdf > Support insert into command for transactional support > - > > Key: CARBONDATA-3934 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3934 > Project: CarbonData > Issue Type: Sub-task >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Attachments: Presto_write_flow.pdf > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Support insert into command for transactional support. > Should support writing table status file, segment files, all the folder > structure similar to transactional carbon table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3831) Support write carbon files with presto.
[ https://issues.apache.org/jira/browse/CARBONDATA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-3831: Attachment: Presto_write_flow.pdf > Support write carbon files with presto. > --- > > Key: CARBONDATA-3831 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3831 > Project: CarbonData > Issue Type: New Feature >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Attachments: Presto_write_flow.pdf, carbon_presto_write_transactional > SUpport.pdf > > As we know, CarbonData is an indexed columnar data format for fast analytics > on big data platforms, and it is already integrated with query engines > like spark and presto. Currently with presto we only support the querying of carbondata files, but we don't yet support the writing of carbondata files > through the presto engine. > Currently presto is integrated with carbondata for reading the > carbondata files via presto. For this, the store should already exist (for example, written by carbon in spark) and the table > should be in the hive metastore. Using the carbondata connector we are able to read > carbondata files, but we cannot create a table or load data into a table in > presto. So it is a somewhat hectic job to read carbon files by writing them > first with another engine. > So here I will be trying to support transactional load in the presto > integration for carbon. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-4038) Support metrics during presto write
Akash R Nilugal created CARBONDATA-4038: --- Summary: Support metrics during presto write Key: CARBONDATA-4038 URL: https://issues.apache.org/jira/browse/CARBONDATA-4038 Project: CarbonData Issue Type: Sub-task Reporter: Akash R Nilugal Support metrics during presto write, such as getSystemMemoryUsage() and getValidationCpuNanos() -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3824) Error when Secondary index tried to be created on table that does not exist is not correct.
[ https://issues.apache.org/jira/browse/CARBONDATA-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3824. - Fix Version/s: 2.1.0 Resolution: Fixed > Error when Secondary index tried to be created on table that does not exist > is not correct. > --- > > Key: CARBONDATA-3824 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3824 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.3.2, Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > > *Issue :-* > Table uniqdata_double does not exist. > Secondary index tried to be created on table. Error message is incorrect. > CREATE INDEX indextable2 ON TABLE uniqdata_double (DOB) AS 'carbondata' > PROPERTIES('carbon.column.compressor'='zstd'); > *Error: java.lang.RuntimeException: Operation not allowed on non-carbon table > (state=,code=0)* > > *Expected :-* > *Error: java.lang.RuntimeException: Table does not exist* *(state=,code=0)*** -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3903) Documentation Issue in Github Docs Link https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3903. - Fix Version/s: 2.1.0 Resolution: Fixed > Documentation Issue in Github Docs Link > https://github.com/apache/carbondata/tree/master/docs > -- > > Key: CARBONDATA-3903 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3903 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: PURUJIT CHAUGULE >Priority: Minor > Fix For: 2.1.0 > > > dml-of-carbondata.md > LOAD DATA: > * Mention Each Load is considered as a Segment. > * Give all possible options for SORT_SCOPE like > GLOBAL_SORT/LOCAL_SORT/NO_SORT (with explanation of difference between each > type). > * Add Example Of complete Load query with/without use of OPTIONS. > INSERT DATA: > * Mention each insert is a Segment. > LOAD Using Static/Dynamic Partitioning: > * Can give a hyperlink to Static/Dynamic partitioning. > UPDATE/DELETE: > * Mention about delta files concept in update and delete. > DELETE: > * Add example for deletion of all records from a table (delete from > tablename). > COMPACTION: > * Can mention Minor compaction of two types Auto and Manual( > carbon.auto.load.merge =true/false), and that if > carbon.auto.load.merge=false, trigger should be done manually. > * Hyperlink to Configurable properties of Compaction. > * Mention that compacted segments do not get cleaned automatically and > should be triggered manually using clean files. > > flink-integration-guide.md > * Mention what are stages, how is it used. > * Process of insertion, deletion of stages in carbontable. (How is it stored > in carbontable). > > language-manual.md > * Mention Compaction Hyperlink in DML section. > > spatial-index-guide.md > * Mention the TBLPROPERTIES supported / not supported for Geo table. > * Mention Spatial Index does not make a new column. 
> * CTAS from one geo table to another does not create another Geo table can > be mentioned. > * Mention that a certain combination of Spatial Index table properties need > to be added in create table, without which a geo table does not get created. > * Mention that we cannot alter columns (change datatype, change name, drop) > mentioned in spatial_index. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3901. - Fix Version/s: 2.1.0 Resolution: Fixed > Documentation issues in https://github.com/apache/carbondata/tree/master/docs > - > > Key: CARBONDATA-3901 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3901 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > *Issue 1 :* > [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] > getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be removed: > Testing use alluxio by CarbonSession
> import org.apache.spark.sql.CarbonSession._
> import org.apache.spark.sql.SparkSession
> val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata")
> carbon.sql("CREATE TABLE carbon_alluxio(id String, name String, city String, age Int) STORED as carbondata")
> carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio")
> carbon.sql("select * from carbon_alluxio").show
> *Issue 2 -* > [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md] > SORT_SCOPE: Sort scope of the load. Options include no sort, local sort, batch sort and > global sort --> Batch sort to be removed as it is not supported. > *Issue 3 -* > [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] > CLOSE STREAM link is not working. 
> *Issue 4 -* > [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md] > Explain query does not hit the bloom. Hence the line "User can verify > whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} > command, which will show the transformed logical plan, and thus user can > check whether the BloomFilter Index can skip blocklets during the scan." > needs to be removed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4036) When the ` character is present in column name, the table creation fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-4036: Description: When the ` character is present in column name, the table creation fails sql("create table special_char(`i#d` string, `nam(e` string,`ci)@!ty` string,`a\be` int, `ag!e` float, `na^me1` Decimal(8,4), ```a``bc``!!d``` int) stored as carbondata" + " tblproperties('INVERTED_INDEX'='`a`bc`!!d`', 'SORT_COLUMNS'='`a`bc`!!d`')") was:When the ` character is present in column name, the table creation fails > When the ` character is present in column name, the table creation fails > > > Key: CARBONDATA-4036 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4036 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Minor > > When the ` character is present in column name, the table creation fails > sql("create table special_char(`i#d` string, `nam(e` string,`ci)@!ty` > string,`a\be` int, `ag!e` float, `na^me1` Decimal(8,4), ```a``bc``!!d``` int) > stored as carbondata" + > " tblproperties('INVERTED_INDEX'='`a`bc`!!d`', > 'SORT_COLUMNS'='`a`bc`!!d`')") -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-4036) When the ` character is present in column name, the table creation fails
Akash R Nilugal created CARBONDATA-4036: --- Summary: When the ` character is present in column name, the table creation fails Key: CARBONDATA-4036 URL: https://issues.apache.org/jira/browse/CARBONDATA-4036 Project: CarbonData Issue Type: Bug Reporter: Akash R Nilugal Assignee: Akash R Nilugal When the ` character is present in column name, the table creation fails -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-4035) MV table is not hit when sum() is applied on decimal column.
[ https://issues.apache.org/jira/browse/CARBONDATA-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17215334#comment-17215334 ] Akash R Nilugal commented on CARBONDATA-4035: - sql("drop table if exists special_char") sql("create table special_char(`i#d` string, `nam(e` string,`ci)@!ty` string,`a\be` int, `ag!e` float, `na^me1` Decimal(8,4), ```a``bc``!!d``` int) stored as carbondata" + " tblproperties('INVERTED_INDEX'='`a`bc`!!d`', 'SORT_COLUMNS'='`a`bc`!!d`')") > MV table is not hit when sum() is applied on decimal column. > > > Key: CARBONDATA-4035 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4035 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Minor > > MV table is not hit when sum() is applied on decimal column. > sql("drop table if exists sum_agg_decimal") > sql("create table sum_agg_decimal(salary1 decimal(7,2),salary2 > decimal(7,2),salary3 decimal(7,2),salary4 decimal(7,2),empname string) stored > as carbondata") > sql("drop materialized view if exists decimal_mv") > sql("create materialized view decimal_mv as select empname, sum(salary1 - > salary2) from sum_agg_decimal group by empname") > sql("explain select empname, sum( salary1 - salary2) from sum_agg_decimal > group by empname").show(false) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (CARBONDATA-4035) MV table is not hit when sum() is applied on decimal column.
[ https://issues.apache.org/jira/browse/CARBONDATA-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal updated CARBONDATA-4035: Comment: was deleted (was: sql("drop table if exists special_char") sql("create table special_char(`i#d` string, `nam(e` string,`ci)@!ty` string,`a\be` int, `ag!e` float, `na^me1` Decimal(8,4), ```a``bc``!!d``` int) stored as carbondata" + " tblproperties('INVERTED_INDEX'='`a`bc`!!d`', 'SORT_COLUMNS'='`a`bc`!!d`')")) > MV table is not hit when sum() is applied on decimal column. > > > Key: CARBONDATA-4035 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4035 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Minor > > MV table is not hit when sum() is applied on decimal column. > sql("drop table if exists sum_agg_decimal") > sql("create table sum_agg_decimal(salary1 decimal(7,2),salary2 > decimal(7,2),salary3 decimal(7,2),salary4 decimal(7,2),empname string) stored > as carbondata") > sql("drop materialized view if exists decimal_mv") > sql("create materialized view decimal_mv as select empname, sum(salary1 - > salary2) from sum_agg_decimal group by empname") > sql("explain select empname, sum( salary1 - salary2) from sum_agg_decimal > group by empname").show(false) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-4035) MV table is not hit when sum() is applied on decimal column.
Akash R Nilugal created CARBONDATA-4035: --- Summary: MV table is not hit when sum() is applied on decimal column. Key: CARBONDATA-4035 URL: https://issues.apache.org/jira/browse/CARBONDATA-4035 Project: CarbonData Issue Type: Bug Reporter: Akash R Nilugal Assignee: Akash R Nilugal MV table is not hit when sum() is applied on decimal column. sql("drop table if exists sum_agg_decimal") sql("create table sum_agg_decimal(salary1 decimal(7,2),salary2 decimal(7,2),salary3 decimal(7,2),salary4 decimal(7,2),empname string) stored as carbondata") sql("drop materialized view if exists decimal_mv") sql("create materialized view decimal_mv as select empname, sum(salary1 - salary2) from sum_agg_decimal group by empname") sql("explain select empname, sum( salary1 - salary2) from sum_agg_decimal group by empname").show(false) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4017) insert fails when column name has backslash and SI creation fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4017. - Resolution: Fixed > insert fails when column name has backslash and SI creation fails > -- > > Key: CARBONDATA-4017 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4017 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > 1. when the column name contains the backslash character and the table is > created with carbon session, insert fails the second time. > 2. when the column name has special characters, SI creation fails in parsing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4018) CSV header validation is not considering the dimension columns
[ https://issues.apache.org/jira/browse/CARBONDATA-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4018. - Resolution: Fixed > CSV header validation is not considering the dimension columns > --- > > Key: CARBONDATA-4018 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4018 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > CSV header validation not considering the dimension columns in schema -- This message was sent by Atlassian Jira (v8.3.4#803005)
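For context, the fix concerns CSV header validation during data load, where the supplied file header must cover dimension (string) columns as well as measures. A hedged sketch of the load path it applies to — table name, path, and columns below are illustrative, not taken from the report:

```sql
CREATE TABLE emp (id INT, name STRING, city STRING) STORED AS carbondata;

-- 'name' and 'city' are dimension columns; the bug was that the validation
-- did not consider them when checking the supplied FILEHEADER
LOAD DATA INPATH 'hdfs://path/emp.csv' INTO TABLE emp
OPTIONS('HEADER'='false', 'FILEHEADER'='id,name,city');
```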
[jira] [Closed] (CARBONDATA-3769) Upgrade hadoop version to 3.1.1 and add maven profile for 2.7.2
[ https://issues.apache.org/jira/browse/CARBONDATA-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal closed CARBONDATA-3769. --- Resolution: Not A Problem > Upgrade hadoop version to 3.1.1 and add maven profile for 2.7.2 > --- > > Key: CARBONDATA-3769 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3769 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > Upgrade hadoop version to 3.1.1 and add maven profile for 2.7.2 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3911) NullPointerException is thrown when clean files is executed after two updates
[ https://issues.apache.org/jira/browse/CARBONDATA-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3911. - Fix Version/s: 2.1.0 Resolution: Fixed > NullPointerException is thrown when clean files is executed after two updates > - > > Key: CARBONDATA-3911 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3911 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Fix For: 2.1.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > * create table > * load data > * load one more data > * update1 > * update2 > * clean files > fails with a NullPointerException -- This message was sent by Atlassian Jira (v8.3.4#803005)
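The reproduction steps listed above can be written out as a sequence of SQL statements. This is a hedged sketch with illustrative table and column names, not the exact statements from the report:

```sql
CREATE TABLE t_clean (id INT, name STRING) STORED AS carbondata;
INSERT INTO t_clean SELECT 1, 'a';                 -- first load
INSERT INTO t_clean SELECT 2, 'b';                 -- second load
UPDATE t_clean SET (name)=('x') WHERE id = 1;      -- update1
UPDATE t_clean SET (name)=('y') WHERE id = 2;      -- update2
CLEAN FILES FOR TABLE t_clean;                     -- threw NullPointerException before the fix
```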
[jira] [Created] (CARBONDATA-4019) CDC fails when the join expression contains AND or any other logical expression
Akash R Nilugal created CARBONDATA-4019: --- Summary: CDC fails when the join expression contains AND or any other logical expression Key: CARBONDATA-4019 URL: https://issues.apache.org/jira/browse/CARBONDATA-4019 Project: CarbonData Issue Type: Bug Reporter: Akash R Nilugal Assignee: Akash R Nilugal CDC fails when the join expression contains AND or any other logical expression; it fails with a cast expression error. -- This message was sent by Atlassian Jira (v8.3.4#803005)
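The failing shape can be sketched with CarbonData's CDC merge, shown here in MERGE INTO SQL form; the tables and the compound ON condition below are illustrative assumptions, not taken from the report:

```sql
-- a compound join expression (AND) in the ON clause is what triggered the failure
MERGE INTO target t USING source s
ON t.id = s.id AND t.region = s.region
WHEN MATCHED THEN UPDATE SET t.value = s.value
WHEN NOT MATCHED THEN INSERT (t.id, t.region, t.value) VALUES (s.id, s.region, s.value);
```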
[jira] [Created] (CARBONDATA-4018) CSV header validation is not considering the dimension columns
Akash R Nilugal created CARBONDATA-4018: --- Summary: CSV header validation is not considering the dimension columns Key: CARBONDATA-4018 URL: https://issues.apache.org/jira/browse/CARBONDATA-4018 Project: CarbonData Issue Type: Bug Reporter: Akash R Nilugal Assignee: Akash R Nilugal Fix For: 2.1.0 CSV header validation not considering the dimension columns in schema -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-4017) insert fails when column name has backslash and SI creation fails
Akash R Nilugal created CARBONDATA-4017: --- Summary: insert fails when column name has backslash and SI creation fails Key: CARBONDATA-4017 URL: https://issues.apache.org/jira/browse/CARBONDATA-4017 Project: CarbonData Issue Type: Bug Reporter: Akash R Nilugal Assignee: Akash R Nilugal Fix For: 2.1.0 1. when the column name contains the backslash character and the table is created with carbon session, insert fails the second time. 2. when the column name has special characters, SI creation fails in parsing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4009) PartialQuery not hitting mv
[ https://issues.apache.org/jira/browse/CARBONDATA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4009. - Fix Version/s: 2.1.0 Resolution: Fixed > PartialQuery not hitting mv > --- > > Key: CARBONDATA-4009 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4009 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthumurugesh >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4005) SI with cache level blocklet issue
[ https://issues.apache.org/jira/browse/CARBONDATA-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4005. - Fix Version/s: 2.1.0 Resolution: Fixed > SI with cache level blocklet issue > -- > > Key: CARBONDATA-4005 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4005 > Project: CarbonData > Issue Type: Bug >Reporter: SHREELEKHYA GAMPA >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Select query on SI column returns blank resultset after changing the cache > level to blocklet > PR: https://github.com/apache/carbondata/pull/3951 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-4002) Altering the value of sort columns and unsetting the longStringColumns results in deletion of columns from table schema.
[ https://issues.apache.org/jira/browse/CARBONDATA-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-4002. - Fix Version/s: (was: 2.0.0) 2.1.0 Resolution: Fixed > Altering the value of sort columns and unsetting the longStringColumns > results in deletion of columns from table schema. > - > > Key: CARBONDATA-4002 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4002 > Project: CarbonData > Issue Type: Bug > Components: core >Reporter: Karan >Priority: Major > Fix For: 2.1.0 > > Time Spent: 5h > Remaining Estimate: 0h > > When we change the value of sortColumns with an alter table query and then run > unset for longStringColumns, it removes some columns from the table schema. > CREATE TABLE if not exists $longStringTable(id INT, name STRING, description > STRING, address STRING, note STRING) STORED AS > carbondata TBLPROPERTIES('sort_columns'='id,name'); > alter table long_string_table set > tblproperties('sort_columns'='ID','sort_scope'='no_sort'); > alter table long_string_table unset tblproperties('long_string_columns'); > These queries will remove the Name column from the schema because initially it > was a sortColumn and afterwards the value of sortColumns was changed. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3996) Show table extended like command throws java.lang.ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CARBONDATA-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3996. - Resolution: Fixed > Show table extended like command throws > java.lang.ArrayIndexOutOfBoundsException > > > Key: CARBONDATA-3996 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3996 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 2.0.0 >Reporter: Venugopal Reddy K >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 5h > Remaining Estimate: 0h > > *Issue:* > Show table extended like command throws > java.lang.ArrayIndexOutOfBoundsException > *Steps to reproduce:* > spark.sql("create table employee(id string, name string) stored as > carbondata") > spark.sql("show table extended like 'emp*'").show(100, false) > *Exception stack:* > > {code:java} > Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: > 3Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3 at > org.apache.spark.sql.catalyst.expressions.GenericInternalRow.genericGet(rows.scala:201) > at > org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getAs(rows.scala:35) > at > org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:46) > at > org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195) > at > org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136) > at > org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136) > at > org.apache.spark.sql.catalyst.expressions.BoundReference.eval(BoundAttribute.scala:44) > at > org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:389) > at > org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:152) > at > org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:92) > at > 
org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24$$anonfun$applyOrElse$23.apply(Optimizer.scala:1364) > at > org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24$$anonfun$applyOrElse$23.apply(Optimizer.scala:1364) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) > at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35) at > scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at > scala.collection.AbstractTraversable.map(Traversable.scala:104) at > org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24.applyOrElse(Optimizer.scala:1364) > at > org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24.applyOrElse(Optimizer.scala:1359) > at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:258) > at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:258) > at > org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69) > at > org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:257) > at > org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.transformDown(AnalysisHelper.scala:149) > at > org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29) > at > org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29) > at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263) > at > 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263) > at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:328) > at > org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:186) > at > org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:326) > at > org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:263) > at > org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.transformDown(AnalysisHelper.scala:149) > at > org.apache.s
[jira] [Resolved] (CARBONDATA-3998) FileNotFoundException being thrown in hive during insert.
[ https://issues.apache.org/jira/browse/CARBONDATA-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3998. - Fix Version/s: 2.1.0 Resolution: Fixed > FileNotFoundException being thrown in hive during insert. > -- > > Key: CARBONDATA-3998 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3998 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Major > Fix For: 2.1.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3990) Fix DropCache log error when indexmap is null
[ https://issues.apache.org/jira/browse/CARBONDATA-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3990. - Fix Version/s: 2.1.0 Resolution: Fixed > Fix DropCache log error when indexmap is null > -- > > Key: CARBONDATA-3990 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3990 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthumurugesh >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3984) compaction on table having range column after altering data type from string to long string fails.
[ https://issues.apache.org/jira/browse/CARBONDATA-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3984. - Fix Version/s: 2.1.0 Resolution: Fixed > compaction on table having range column after altering data type from string > to long string fails. > -- > > Key: CARBONDATA-3984 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3984 > Project: CarbonData > Issue Type: Bug > Components: core, spark-integration >Affects Versions: 2.0.0 >Reporter: Karan >Priority: Major > Fix For: 2.1.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > When the dataType of a String column, which is also provided as the range column in > table properties, is altered to longStringColumn, compaction on the table fails with > the following error: > > VARCHAR not supported for the filter expression; at > org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23) at > org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.(CarbonMergerRDD.scala:227) at > org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:104) at > org.apache.carbondata.spark.rdd.CarbonRDD.compute -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3986) multiple issues during compaction and concurrent scenarios
[ https://issues.apache.org/jira/browse/CARBONDATA-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3986. - Fix Version/s: 2.1.0 Resolution: Fixed > multiple issues during compaction and concurrent scenarios > -- > > Key: CARBONDATA-3986 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3986 > Project: CarbonData > Issue Type: Bug >Reporter: Ajantha Bhat >Assignee: Ajantha Bhat >Priority: Major > Fix For: 2.1.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > multiple issues during compaction and concurrent scenarios > a) When auto compaction or minor compaction was called multiple times, it was > considering already-compacted segments, compacting them again, and overwriting the > files and segments > b) Minor/auto compaction should skip >=2 level segments; it was only skipping > =2 level segments > c) when compaction fails, there is no need to call merge index > d) At the executor, when the segment file or table status file fails to write during > the merge index event, the stale files need to be removed. > e) during partial load cleanup, segment folders are removed but segment > metadata files were not removed > f) Some table status retry issues -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3983) SI compatibility issue
[ https://issues.apache.org/jira/browse/CARBONDATA-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3983. - Fix Version/s: 2.1.0 Resolution: Fixed > SI compatibility issue > -- > > Key: CARBONDATA-3983 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3983 > Project: CarbonData > Issue Type: Bug >Reporter: SHREELEKHYA GAMPA >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Read from a main table having SI returns an empty result set when the SI is stored with > the old tuple id storage format. > Bug id: BUG2020090205414 > PR link: https://github.com/apache/carbondata/pull/3922 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CARBONDATA-3793) Data load with partition columns fail with InvalidLoadOptionException when load option 'header' is set to 'true'
[ https://issues.apache.org/jira/browse/CARBONDATA-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195631#comment-17195631 ] Akash R Nilugal commented on CARBONDATA-3793: - PR https://github.com/apache/carbondata/pull/3911 is wrongly linked to this Jira. The actual Jira is CARBONDATA-3973 > Data load with partition columns fail with InvalidLoadOptionException when > load option 'header' is set to 'true' > > > Key: CARBONDATA-3793 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3793 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 2.0.0 >Reporter: Venugopal Reddy K >Priority: Minor > Fix For: 2.0.0 > > Attachments: Selection_001.png > > Time Spent: 4h 40m > Remaining Estimate: 0h > > *Issue:* > Data load with partition fail with `InvalidLoadOptionException` when load > option `header` is set to `true` > > *CallStack:* > 2020-05-05 21:49:35 AUDIT audit:97 - {"time":"5 May, 2020 9:49:35 PM > IST","username":"root1","opName":"LOAD > DATA","opId":"199081091980878","opStatus":"FAILED","opTime":"1734 > ms","table":"default.source","extraInfo":{color:#ff}{"Exception":"org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException","Message":"When > 'header' option is true, 'fileheader' option is not required."}}{color} > Exception in thread "main" > org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException: When > 'header' option is true, 'fileheader' option is not required. 
> at > org.apache.carbondata.processing.loading.model.CarbonLoadModelBuilder.build(CarbonLoadModelBuilder.java:203) > at > org.apache.carbondata.processing.loading.model.CarbonLoadModelBuilder.build(CarbonLoadModelBuilder.java:126) > at > org.apache.spark.sql.execution.datasources.SparkCarbonTableFormat.prepareWrite(SparkCarbonTableFormat.scala:132) > at > org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:103) > at > org.apache.spark.sql.execution.command.management.CarbonInsertIntoHadoopFsRelationCommand.run(CarbonInsertIntoHadoopFsRelationCommand.scala:160) > at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) > at > org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3989) Unnecessary segment files are created even when the segments are neither updated nor deleted
Akash R Nilugal created CARBONDATA-3989: --- Summary: Unnecessary segment files are created even when the segments are neither updated nor deleted Key: CARBONDATA-3989 URL: https://issues.apache.org/jira/browse/CARBONDATA-3989 Project: CarbonData Issue Type: Bug Reporter: Akash R Nilugal Assignee: Akash R Nilugal Unnecessary segment files are created even when the segments are neither updated nor deleted -- This message was sent by Atlassian Jira (v8.3.4#803005)