query on string type return error

2018-04-07 Thread 喜之郎
Hi all, when I use carbondata to run the query "select count(*) from 
action_carbondata where starttimestr = 20180301;", an error occurs. This 
is the error info:
###
0: jdbc:hive2://localhost:1> select count(*) from action_carbondata where 
starttimestr = 20180301;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 
12 in stage 7.0 failed 4 times, most recent failure: Lost task 12.3 in stage 
7.0 (TID 173, sz-pg-entanalytics-research-001.tendcloud.com, executor 1): 
org.apache.spark.util.TaskCompletionListenerException: 
org.apache.carbondata.core.scan.executor.exception.QueryExecutionException:


Previous exception in task: java.util.concurrent.ExecutionException: 
java.util.concurrent.ExecutionException: java.io.IOException: 
org.apache.thrift.protocol.TProtocolException: Required field 'data_chunk_list' 
was not present! Struct: DataChunk3(data_chunk_list:null)

org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.updateScanner(AbstractDataBlockIterator.java:136)

org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.processNextBatch(DataBlockIteratorImpl.java:64)

org.apache.carbondata.core.scan.result.iterator.VectorDetailQueryResultIterator.processNextBatch(VectorDetailQueryResultIterator.java:46)

org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextBatch(VectorizedCarbonRecordReader.java:283)

org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextKeyValue(VectorizedCarbonRecordReader.java:171)

org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:391)

org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.scan_nextBatch$(Unknown
 Source)

org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithoutKey$(Unknown
 Source)

org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
 Source)

org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)

org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)

org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)

org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)

org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
org.apache.spark.scheduler.Task.run(Task.scala:108)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
at 
org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)
at 
org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
at org.apache.spark.scheduler.Task.run(Task.scala:118)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


Driver stacktrace: (state=,code=0)

###


create table statement:
CREATE TABLE action_carbondata(
cur_appversioncode  integer,
cur_appversionname  integer,
cur_browserid  integer,
cur_carrierid  integer,
cur_channelid  integer,
cur_cityid  integer,
cur_countryid  integer,
cur_ip  string,
cur_networkid  integer,
cur_osid  integer,
cur_provinceid  integer,
deviceproductoffset  long,
duration  integer,
eventcount  integer,
eventlabelid  integer,
eventtypeid  integer,
organizationid  integer,
platformid  integer,
productid  integer,
relatedaccountproductoffset  long,
sessionduration  integer,
sessionid  string,
sessionstarttime  long,
sessionstatus  integer,
sourceid  integer,
starttime  long,
starttimestr  string )
partitioned by (eventid int)
STORED BY 'carbondata'
TBLPROPERTIES ('partition_type'='Hash','NUM_PARTITIONS'='39',
'SORT_COLUMNS'='productid,sourceid,starttimestr,platformid,organizationid,eventtypeid,eventlabelid,cur_channelid,cur_provinceid,cur_countryid,cur_cityid,cur_osid,cur_appversioncode,cur_appversionname,cur_carrierid,cur_networkid,cur_browserid,sessionstatus,cur_ip');



Example values of the "starttimestr" field:
20180303
20180304
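Note that "starttimestr" is declared as string while the query compares it with the bare 
numeric literal 20180301, so Spark applies an implicit cast before comparing. Quoting the 
literal keeps the comparison string-to-string; a minimal sketch of the same query (this alone 
does not explain the DataChunk3 error, but it avoids the implicit cast):

-- compare the string column against a string literal instead of a number
select count(*) from action_carbondata where starttimestr = '20180301';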




Any advice is appreciated!





The carbondata version is:
apache-carbondata-1.3.1-bin-spark2.2.1-hadoop2.7.2.jar

The Spark version is:
spark-2.2.1-bin-hadoop2.7

Re: query on string type return error

2018-04-17 Thread 喜之郎
Hi, Liang Chen.
I started the Thrift server, then used Beeline to execute this SQL. I used "insert into 
XXX select * from a_parquet_table" to load the data.
I deployed a YARN cluster.

Because I could not find the cause of the problem, I used "insert overwrite" to load the 
data again, and the problem disappeared.
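A sketch of the two load paths described above, keeping the placeholder target name XXX and 
the source table a_parquet_table from the report (assuming plain Spark SQL INSERT syntax; 
the names are illustrative, not verified against the actual schema):

-- the load that was in place when the query failed
insert into XXX select * from a_parquet_table;

-- rewriting the same data with overwrite; afterwards the query error disappeared
insert overwrite table XXX select * from a_parquet_table;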




------ Original Message ------
From: "Liang Chen";
Date: 2018-04-16 (Mon) 3:51 PM
To: "dev";

Subject: Re: query on string type return error



Hi

From the log message, it seems the data files cannot be found.
Can you provide more detailed info:
1. How you created the CarbonSession and how you loaded the data.
2. Have you deployed a cluster or only a single machine?

Regards
Liang


喜之郎 wrote
> [original error report, stack trace and CREATE TABLE statement quoted in full; see the
> first message in this thread]

Re: query on string type return error

2018-04-17 Thread 喜之郎
I use apache-carbondata-1.3.1-bin-spark2.2.1-hadoop2.7.2.jar, which I downloaded from the 
website; I did not build it myself, so I don't know the Thrift version. I have not updated 
the carbon version.




------ Original Message ------
From: "xuchuanyin";
Date: 2018-04-16 (Mon) 7:04 PM
To: "carbondata";

Subject: Re: query on string type return error



I think the problem may be metadata related. What's your Thrift version? Have you updated 
the carbon version recently after the data was loaded?

FROM MOBILE EMAIL CLIENT

On 04/16/2018 15:51, Liang Chen wrote:
> [Liang Chen's reply and the original report quoted in full; see the messages above]

Re: query on string type return error

2018-04-16 Thread Liang Chen
Hi

From the log message, it seems the data files cannot be found.
Can you provide more detailed info:
1. How you created the CarbonSession and how you loaded the data.
2. Have you deployed a cluster or only a single machine?

Regards
Liang
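
For the data-file question above, CarbonData's segment listing shows whether each load is 
still marked as Success; a minimal sketch, assuming the SHOW SEGMENTS DDL available in 
CarbonData 1.3:

-- lists segment id, status and load start/end times for the table
SHOW SEGMENTS FOR TABLE action_carbondata;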


喜之郎 wrote
> [original error report, stack trace and CREATE TABLE statement quoted in full; see the
> first message in this thread]

Re: query on string type return error

2018-04-16 Thread xuchuanyin
I think the problem may be metadata related. What's your Thrift version? Have you updated 
the carbon version recently after the data was loaded?

FROM MOBILE EMAIL CLIENT

On 04/16/2018 15:51, Liang Chen wrote:
> [Liang Chen's reply and the original report quoted in full; see the messages above]

Re: query on string type return error

2021-04-19 Thread Yahui Liu
I encounter the same error in later carbon version. Have you find the root
cause and solution for this error?


