Apache CarbonData CI is now auto-checking all PRs

2016-12-05 Thread Liang Chen
Hi dev

Apache CarbonData CI is now up and running, automatically checking all PRs.

The checking is done by a Jenkins CI job named ApacheCarbonPRBuilder, which
runs on a cloud machine at http://136.243.101.176:8080/ ; anybody can access
this machine to check build status and results.


   - When a new pull request is opened in the project and the author is on
   the whitelist, the PR is checked automatically.

   - When a new pull request is opened in the project and the author is not
   on the whitelist, CarbonDataQA will ask "Can one of the admins verify
   this patch and trigger the CI checking?".
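
In other words, the gating amounts to the following minimal sketch (for
illustration only; the actual logic lives in the Jenkins PR-builder job, and
the whitelist is maintained by the CI admins):

// Sketch of the whitelist gating; names are illustrative, not a real plugin API.
def onPullRequestOpened(author: String, whitelist: Set[String]): String =
  if (whitelist.contains(author)) {
    "trigger ApacheCarbonPRBuilder"  // whitelisted author: build starts automatically
  } else {
    // non-whitelisted author: CarbonDataQA posts this question and waits for an admin
    "Can one of the admins verify this patch and trigger the CI checking?"
  }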

Once the build finishes, its status is posted as a comment on the same pull
request, as follows:

●  Build Success

Build Success, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/33/



●  Build Failed

Build Failed, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/33/


Regards
Liang


[jira] [Created] (CARBONDATA-496) Implement unit test cases for core.carbon.datastore package

2016-12-05 Thread Anurag Srivastava (JIRA)
Anurag Srivastava created CARBONDATA-496:
-----------------------------------------

 Summary: Implement unit test cases for core.carbon.datastore package
 Key: CARBONDATA-496
 URL: https://issues.apache.org/jira/browse/CARBONDATA-496
 Project: CarbonData
 Issue Type: Test
 Reporter: Anurag Srivastava
 Priority: Trivial








Re: About measure in carbon

2016-12-05 Thread manishgupta88
Hi Ravi,

Currently we use the Hive parser to parse the CarbonData CREATE TABLE DDL.
Hive does not offer "short" and "byte" data types under those names; instead
it supports them as smallint (16-bit) and tinyint (8-bit). But CarbonData
does not support these data types today.

Do we need to handle these data types in CarbonData? If yes, I can create a
JIRA issue to track this feature.
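
For example, a DDL of the following shape gets through the Hive parser but
is not supported by Carbon today (table and column names are hypothetical,
and cc is assumed to be a CarbonContext):

cc.sql("""
  CREATE TABLE IF NOT EXISTS sample_types (
    id STRING,
    small_col SMALLINT,   -- 16-bit in Hive
    tiny_col TINYINT      -- 8-bit in Hive
  )
  STORED BY 'carbondata'
""")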

Regards
Manish Gupta





Re: select return error when filter string column in where clause

2016-12-05 Thread Lu Cao
Attaching the executor log:

# query succeeded
cc.sql("select count(*) from default.carbontest_003 where id = 'LSJW26761FS090787'").show

# queries with id 'LSJW26760GS018559' and 'LSJW26761ES064611' also succeeded

# query returns an error
cc.sql("select count(*) from default.carbontest_003 where id = 'LSJW26763FS088491'").show


===

INFO  06-12 12:20:54,821 - Got assigned task 651
INFO  06-12 12:20:54,822 - Running task 0.0 in stage 51.0 (TID 651)
INFO  06-12 12:20:54,830 - Updating epoch to 12 and clearing cache
INFO  06-12 12:20:54,830 - Started reading broadcast variable 43
INFO  06-12 12:20:54,837 - Block broadcast_43_piece0 stored as bytes in memory (estimated size 8.6 KB, free 8.6 KB)
INFO  06-12 12:20:54,840 - Reading broadcast variable 43 took 10 ms
INFO  06-12 12:20:54,853 - Block broadcast_43 stored as values in memory (estimated size 17.7 KB, free 26.3 KB)
INFO  06-12 12:20:54,868 - *carbon.properties
INFO  06-12 12:20:54,869 - [Executor task launch worker-2][partitionID:003;queryID:4305874916432320_0] Query will be executed on table: carbontest_003
ERROR 06-12 12:20:54,869 - [Executor task launch worker-2][partitionID:003;queryID:4305874916432320_0]
java.lang.NullPointerException
at org.apache.carbondata.scan.result.iterator.AbstractDetailQueryResultIterator.intialiseInfos(AbstractDetailQueryResultIterator.java:117)
at org.apache.carbondata.scan.result.iterator.AbstractDetailQueryResultIterator.<init>(AbstractDetailQueryResultIterator.java:107)
at org.apache.carbondata.scan.result.iterator.DetailQueryResultIterator.<init>(DetailQueryResultIterator.java:43)
at org.apache.carbondata.scan.executor.impl.DetailQueryExecutor.execute(DetailQueryExecutor.java:39)
at org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.<init>(CarbonScanRDD.scala:216)
at org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:192)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

ERROR 06-12 12:20:54,871 - Exception in task 0.0 in stage 51.0 (TID 651)
java.lang.RuntimeException: Exception occurred in query execution.Please check logs.
at scala.sys.package$.error(package.scala:27)
at org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.<init>(CarbonScanRDD.scala:226)
at org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:192)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at

Re: select return error when filter string column in where clause

2016-12-05 Thread Ravindra Pesala
Hi,

Please provide the table schema, load command, and sample data to reproduce
this issue; you may also create a JIRA for it.
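
For reference, a minimal self-contained reproduction would look something
like the sketch below (the schema, CSV path, and data are hypothetical
placeholders to be replaced with the failing table's actual details; cc is
assumed to be a CarbonContext):

// Hypothetical reproduction script; adjust schema, path, and data to match.
cc.sql("CREATE TABLE IF NOT EXISTS default.carbontest_003 (id STRING, data_date STRING) STORED BY 'carbondata'")
cc.sql("LOAD DATA INPATH 'hdfs://path/to/sample.csv' INTO TABLE default.carbontest_003")
cc.sql("select count(*) from default.carbontest_003 where id = 'LSJW26763FS088491'").show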

Regards,
Ravi

On 6 December 2016 at 07:05, Lu Cao wrote:

> Hi Dev team,
> I have loaded some data into a carbondata table. But when I put the id
> column (String type) in the where clause it always returns an error as below:
>
> cc.sql("select to_date(data_date),count(*) from default.carbontest_001
> where id='LSJW26762FS044062' group by to_date(data_date)").show
>
> [stack trace snipped; identical to the trace in the original message below]

select return error when filter string column in where clause

2016-12-05 Thread Lu Cao
Hi Dev team,
I have loaded some data into a carbondata table. But when I put the id
column (String type) in the where clause it always returns an error as below:

cc.sql("select to_date(data_date),count(*) from default.carbontest_001 where id='LSJW26762FS044062' group by to_date(data_date)").show



===
WARN  06-12 09:02:13,763 - Lost task 5.0 in stage 44.0 (TID 687, .com): java.lang.RuntimeException: Exception occurred in query execution.Please check logs.
at scala.sys.package$.error(package.scala:27)
at org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.<init>(CarbonScanRDD.scala:226)
at org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:192)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

ERROR 06-12 09:02:14,091 - Task 1 in stage 44.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 44.0 failed 4 times, most recent failure: Lost task 1.3 in stage 44.0 (TID 694, scsp00258.saicdt.com): java.lang.RuntimeException: Exception occurred in query execution.Please check logs.
at scala.sys.package$.error(package.scala:27)
at org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.<init>(CarbonScanRDD.scala:226)
at org.apache.carbondata.spark.rdd.CarbonScanRDD.compute(CarbonScanRDD.scala:192)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at

[jira] [Created] (CARBONDATA-495) Unify compressor interface

2016-12-05 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-495:
--------------------------------

 Summary: Unify compressor interface
 Key: CARBONDATA-495
 URL: https://issues.apache.org/jira/browse/CARBONDATA-495
 Project: CarbonData
 Issue Type: Bug
 Affects Versions: 0.2.0-incubating
 Reporter: Jacky Li
 Assignee: Jacky Li
 Fix For: 1.0.0-incubating


Use a factory for compressors to unify the interface and reduce the number of small objects created.
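
For illustration, the factory approach could look like the minimal sketch
below (hypothetical names, not the actual CarbonData API): a single
Compressor trait with one cached instance per codec, so callers share an
object instead of allocating a new compressor for every use.

import org.xerial.snappy.Snappy

// Hypothetical sketch: unified compressor interface plus a caching factory.
trait Compressor {
  def compress(input: Array[Byte]): Array[Byte]
  def decompress(input: Array[Byte]): Array[Byte]
}

class SnappyCompressor extends Compressor {
  override def compress(input: Array[Byte]): Array[Byte] = Snappy.compress(input)
  override def decompress(input: Array[Byte]): Array[Byte] = Snappy.uncompress(input)
}

object CompressorFactory {
  // One shared instance per codec; callers no longer create per-call objects.
  private lazy val snappy = new SnappyCompressor
  def getCompressor(name: String): Compressor = name.toLowerCase match {
    case "snappy" => snappy
    case other    => throw new IllegalArgumentException("Unsupported compressor: " + other)
  }
}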





[jira] [Created] (CARBONDATA-494) Implement unit test cases for filter.executer package

2016-12-05 Thread Anurag Srivastava (JIRA)
Anurag Srivastava created CARBONDATA-494:
-----------------------------------------

 Summary: Implement unit test cases for filter.executer package
 Key: CARBONDATA-494
 URL: https://issues.apache.org/jira/browse/CARBONDATA-494
 Project: CarbonData
 Issue Type: Test
 Reporter: Anurag Srivastava
 Priority: Trivial





