[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...

2017-11-19 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/1525
  
Erlu Chen and Yadong Qi have reviewed this PR, please review it again 
@jackylk 


---


[jira] [Updated] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables creation in Spark-shell sessions are not used in the beeline session

2017-11-19 Thread Ramakrishna S (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramakrishna S updated CARBONDATA-1777:
--
Description: 
Steps:
Beeline:
1. Create a table and load data into it
Spark-shell:
1. create a pre-aggregate table
Beeline:
1. Run aggregate query

*+Expected:+* Pre-aggregate table should be used in the aggregate query 
*+Actual:+* Pre-aggregate table is not used


1.
create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
'org.apache.carbondata.format' TBLPROPERTIES 
('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table 
lineitem1 
options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');

2. 

 carbon.sql("create datamap agr1_lineitem1 ON TABLE lineitem1 USING 
'org.apache.carbondata.datamap.AggregateDataMapHandler' as select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem1 group by l_returnflag, l_linestatus").show();

3. 
select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus;

Actual:
0: jdbc:hive2://10.18.98.136:23040> show tables;
+-----------+---------------------------+--------------+--+
| database  | tableName                 | isTemporary  |
+-----------+---------------------------+--------------+--+
| test_db2  | lineitem1                 | false        |
| test_db2  | lineitem1_agr1_lineitem1  | false        |
+-----------+---------------------------+--------------+--+
2 rows selected (0.047 seconds)
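
One way to check whether the pre-aggregate table is picked up is to inspect 
the query plan from the same spark-shell session. A minimal diagnostic sketch, 
assuming the `carbon` session from step 2 (not part of the original report): 
if the datamap is applied, the optimized plan should reference 
lineitem1_agr1_lineitem1 rather than scanning lineitem1 directly.

// diagnostic sketch only, run in the same spark-shell session as step 2
carbon.sql("explain select l_returnflag,l_linestatus,sum(l_quantity)," +
  "avg(l_quantity),count(l_quantity) from lineitem1 " +
  "where l_returnflag = 'R' group by l_returnflag, l_linestatus").show(100, false)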

Logs:
2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Running query 'select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus' 
with 7f3091a8-4d7b-40ac-840f-9db6f564c9cf | 
org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Parsing command: select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus | 
org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | 55: get_table : 
db=test_db2 tbl=lineitem1 | 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | ugi=anonymous   
ip=unknown-ip-addr  cmd=get_table : db=test_db2 tbl=lineitem1| 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logAuditEvent(HiveMetaStore.java:371)
2017-11-20 15:46:48,354 | INFO  | [pool-23-thread-53] | 55: Opening raw store 
with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore | 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:589)
2017-11-20 15:46:48,355 | INFO  | [pool-23-thread-53] | ObjectStore, initialize 
called | 
org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:289)
2017-11-20 15:46:48,360 | INFO  | [pool-23-thread-53] | Reading in results for 
query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used 
is closing | org.datanucleus.util.Log4JLogger.info(Log4JLogger.java:77)
2017-11-20 15:46:48,362 | INFO  | [pool-23-thread-53] | Using direct SQL, 
underlying DB is MYSQL | 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.(MetaStoreDirectSql.java:139)
2017-11-20 15:46:48,362 | INFO  | [pool-23-thread-53] | Initialized ObjectStore 
| org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:272)
2017-11-20 15:46:48,376 | INFO  | [pool-23-thread-53] | Parsing command: 
array | 
org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2017-11-20 15:46:48,399 | INFO  | [pool-23-thread-53] | Schema changes have 
been detected for table: `lineitem1` | 
org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2017-11-20 15:46:48,399 | INFO  | [pool-23-thread-53] | 55: get_table : 
db=test_db2 tbl=lineitem1 | 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
2017-11-20 15:46:48,400 | INFO  | [pool-23-thread-53] | ugi=anonymous   
ip=unknown-ip-addr  cmd=get_table : db=test_db2 tbl=lineitem1| 
org

[jira] [Updated] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell sessions are not used in the beeline session

2017-11-19 Thread Ramakrishna S (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramakrishna S updated CARBONDATA-1777:
--
Summary: Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in 
Spark-shell sessions are not used in the beeline session  (was: 
Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables creation in Spark-shell 
sessions are not used in the beeline session)

> Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell 
> sessions are not used in the beeline session
> -
>
> Key: CARBONDATA-1777
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1777
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: Kunal Kapoor
>  Labels: DFX
> Fix For: 1.3.0
>
>
> Steps:
> Beeline:
> 1. Create a table and load data into it
> Spark-shell:
> 1. create a pre-aggregate table
> Beeline:
> 1. Run aggregate query
> *+Expected:+* Pre-aggregate table should be used in the aggregate query 
> *+Actual:+* Pre-aggregate table is not used
> 1.
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table 
> lineitem1 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 2. 
>  carbon.sql("create datamap agr1_lineitem1 ON TABLE lineitem1 USING 
> 'org.apache.carbondata.datamap.AggregateDataMapHandler' as select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 group by l_returnflag, l_linestatus").show();
> 3. 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus;
> Actual:
> 0: jdbc:hive2://10.18.98.136:23040> show tables;
> +-----------+---------------------------+--------------+--+
> | database  | tableName                 | isTemporary  |
> +-----------+---------------------------+--------------+--+
> | test_db2  | lineitem1                 | false        |
> | test_db2  | lineitem1_agr1_lineitem1  | false        |
> +-----------+---------------------------+--------------+--+
> 2 rows selected (0.047 seconds)
> Logs:
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Running query 'select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus' 
> with 7f3091a8-4d7b-40ac-840f-9db6f564c9cf | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Parsing command: 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | 55: get_table : 
> db=test_db2 tbl=lineitem1 | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | ugi=anonymous 
> ip=unknown-ip-addr  cmd=get_table : db=test_db2 tbl=lineitem1| 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logAuditEvent(HiveMetaStore.java:371)
> 2017-11-20 15:46:48,354 | INFO  | [pool-23-thread-53] | 55: Opening raw store 
> with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:589)
> 2017-11-20 15:46:48,355 | INFO  | [pool-23-thread-53] | ObjectStore, 
> initialize called | 
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:289)
> 2017-11-20 15:46:48,360 | INFO  | [pool-23-thread-53] | Reading in results 
> for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection 
> used is closing | org.datanucleus.util.Log4JLogge

[jira] [Commented] (CARBONDATA-1516) Support pre-aggregate tables and timeseries in carbondata

2017-11-19 Thread zhouguangcheng (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258902#comment-16258902
 ] 

zhouguangcheng commented on CARBONDATA-1516:


[~kumarvishal]
can we add one more command to list all the pre-agg tables? It would be a 
useful command, because after creating many pre-agg tables, users want to 
know all the cubes created, and it would make the pre-agg tables easier to 
maintain.
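
A sketch of what such a listing command could look like from a spark-shell 
session; both the command name and its output format are hypothetical here, 
nothing in this thread defines them:

// hypothetical command, shown only to illustrate the request
carbon.sql("show datamap on table lineitem1").show(false)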

> Support pre-aggregate tables and timeseries in carbondata
> -
>
> Key: CARBONDATA-1516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
> Attachments: CarbonData Pre-aggregation Table.pdf, CarbonData 
> Pre-aggregation Table_v1.1.pdf
>
>
> Currently Carbondata has standard SQL capability on distributed data 
> sets. Carbondata should support pre-aggregate tables for timeseries and 
> improve query performance.





[jira] [Created] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables creation in Spark-shell sessions are not used in the beeline session

2017-11-19 Thread Ramakrishna S (JIRA)
Ramakrishna S created CARBONDATA-1777:
-

 Summary: Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables 
creation in Spark-shell sessions are not used in the beeline session
 Key: CARBONDATA-1777
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1777
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.3.0
 Environment: Test - 3 node ant cluster
Reporter: Ramakrishna S
Assignee: Kunal Kapoor
 Fix For: 1.3.0


Steps:
1. Create a table and load data into it
2. Run an update query on the table - this takes the table metalock
3. In parallel, run the pre-aggregate table create step - this is not 
allowed due to the table lock
4. Rerun the pre-aggregate table create step (a retry sketch is given 
after the transcript below)

*+Expected:+* Pre-aggregate table should be created 
*+Actual:+* Pre-aggregate table creation fails

+Create, Load & Update+:
0: jdbc:hive2://10.18.98.136:23040> create table if not exists 
lineitem4(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT 
string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY 
string,L_SUPPKEY   string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE 
double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE 
string,L_COMMENT  string) STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES 
('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.266 seconds)
0: jdbc:hive2://10.18.98.136:23040> load data inpath 
"hdfs://hacluster/user/test/lineitem.tbl.5" into table lineitem4 
options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (6.331 seconds)
0: jdbc:hive2://10.18.98.136:23040> update lineitem4 set (l_linestatus) = 
('xx');

+Create Datamap:+
0: jdbc:hive2://10.18.98.136:23040> create datamap agr_lineitem4 ON TABLE 
lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as 
select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem4  group by l_returnflag, l_linestatus;
Error: java.lang.RuntimeException: Acquire table lock failed after retry, 
please try after some time (state=,code=0)
0: jdbc:hive2://10.18.98.136:23040> select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem4 group by l_returnflag, l_linestatus;
+---------------+---------------+------------------+---------------------+--------------------+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  |   avg(l_quantity)   | count(l_quantity)  |
+---------------+---------------+------------------+---------------------+--------------------+--+
| N             | xx            | 1.2863213E7      | 25.48745561614304   | 504688             |
| A             | xx            | 6318125.0        | 25.506342144783375  | 247708             |
| R             | xx            | 6321939.0        | 25.532459087898417  | 247604             |
+---------------+---------------+------------------+---------------------+--------------------+--+
3 rows selected (1.033 seconds)
0: jdbc:hive2://10.18.98.136:23040> create datamap agr_lineitem4 ON TABLE 
lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as 
select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem4  group by l_returnflag, l_linestatus;
Error: java.lang.RuntimeException: Table [lineitem4_agr_lineitem4] already 
exists under database [test_db1] (state=,code=0)
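
Since the lock failure above is transient (the metalock is released once the 
update finishes), the rerun in step 4 can be automated. A minimal retry 
sketch, assuming a `carbon` CarbonSession as in the other reports; the helper 
itself is illustrative and not part of the original report:

// retries the datamap creation while the table metalock is held elsewhere
def createDatamapWithRetry(maxAttempts: Int = 5): Unit = {
  var attempt = 0
  var created = false
  while (!created && attempt < maxAttempts) {
    attempt += 1
    try {
      carbon.sql("create datamap agr_lineitem4 ON TABLE lineitem4 USING " +
        "\"org.apache.carbondata.datamap.AggregateDataMapHandler\" as " +
        "select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity)," +
        "count(l_quantity) from lineitem4 group by l_returnflag, l_linestatus")
      created = true
    } catch {
      case e: RuntimeException if e.getMessage.contains("Acquire table lock failed") =>
        // metalock still held by the concurrent update; wait and retry
        Thread.sleep(5000)
    }
  }
}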






[jira] [Commented] (CARBONDATA-1516) Support pre-aggregate tables and timeseries in carbondata

2017-11-19 Thread zhouguangcheng (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258900#comment-16258900
 ] 

zhouguangcheng commented on CARBONDATA-1516:


[~kumarvishal]
hi Vishal, if delete segment is not supported on the main table, how can 
table data retention be supported?
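
For context, the retention mechanism in question is CarbonData's 
segment-delete DML on the main table. A short sketch in spark-shell style; 
the table name is borrowed from the other threads purely for illustration:

// delete specific segments of the main table by id
carbon.sql("DELETE FROM TABLE lineitem1 WHERE SEGMENT.ID IN (0, 1)")
// or delete all segments loaded before a cut-off date
carbon.sql("DELETE FROM TABLE lineitem1 WHERE SEGMENT.STARTTIME BEFORE '2017-06-01 12:05:06'")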

> Support pre-aggregate tables and timeseries in carbondata
> -
>
> Key: CARBONDATA-1516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
> Attachments: CarbonData Pre-aggregation Table.pdf, CarbonData 
> Pre-aggregation Table_v1.1.pdf
>
>
> Currently Carbondata has standard SQL capability on distributed data 
> sets. Carbondata should support pre-aggregate tables for timeseries and 
> improve query performance.





[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1525
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1306/



---


[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1516
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1767/



---


[GitHub] carbondata pull request #1536: [CARBONDATA-1776] Fix some possible test erro...

2017-11-19 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/1536

[CARBONDATA-1776] Fix some possible test errors that are related to 
compaction

 Fix some possible test errors that are related to compaction
This PR only changes test classes.

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 No
 - [ ] Document update required?
No
 - [ ] Testing done
Only test classes are changed; no new tests added.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
MR120


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata fixUT

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1536.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1536


commit ba60c10dca675c964bab2bbd80f236f254308ea4
Author: xubo245 <601450...@qq.com>
Date:   2017-11-20T07:19:27Z

[CARBONDATA-1776] Fix some possible test error that are related to 
compaction




---


[jira] [Updated] (CARBONDATA-1775) (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when data streaming is in progress

2017-11-19 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1775:

Description: 
Steps :
User starts the thrift server using the command - bin/spark-submit --master 
yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G 
--num-executors 3 --class 
org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
/srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
 "hdfs://hacluster/user/hive/warehouse/carbon.store"
User connects to spark shell using the command - bin/spark-shell --master 
yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G 
--num-executors 3 --jars 
/srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar

In spark-shell the user creates a table and does a streaming load into the 
table as per the below socket streaming script.
import java.io.{File, PrintWriter}
import java.net.ServerSocket

import org.apache.spark.sql.{CarbonEnv, SparkSession}
import org.apache.spark.sql.hive.CarbonRelation
import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}

import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
 "/MM/dd")

import org.apache.spark.sql.CarbonSession._

val carbonSession = SparkSession.
  builder().
  appName("StreamExample").
  getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/david")
   
carbonSession.sparkContext.setLogLevel("INFO")

def sql(sql: String) = carbonSession.sql(sql)

def writeSocket(serverSocket: ServerSocket): Thread = {
  val thread = new Thread() {
override def run(): Unit = {
  // wait for a client connection request and accept it
  val clientSocket = serverSocket.accept()
  val socketWriter = new PrintWriter(clientSocket.getOutputStream())
  var index = 0
  for (_ <- 1 to 1000) {
// write 101 records per iteration (0 to 100 inclusive)
for (_ <- 0 to 100) {
  index = index + 1
  socketWriter.println(index.toString + ",name_" + index
   + ",city_" + index + "," + (index * 
1.00).toString +
   ",school_" + index + ":school_" + index + index 
+ "$" + index)
}
socketWriter.flush()
Thread.sleep(2000)
  }
  socketWriter.close()
  System.out.println("Socket closed")
}
  }
  thread.start()
  thread
}
  
def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, tableName: 
String, port: Int): Thread = {
val thread = new Thread() {
  override def run(): Unit = {
var qry: StreamingQuery = null
try {
  val readSocketDF = spark.readStream
.format("socket")
.option("host", "10.18.98.34")
.option("port", port)
.load()

  qry = readSocketDF.writeStream
.format("carbondata")
.trigger(ProcessingTime("5 seconds"))
.option("checkpointLocation", tablePath.getStreamingCheckpointDir)
.option("tablePath", tablePath.getPath).option("tableName", 
tableName)
.start()

  qry.awaitTermination()
} catch {
  case ex: Throwable =>
ex.printStackTrace()
println("Done reading and writing streaming data")
} finally {
  if (qry != null) qry.stop()
}
  }
}
thread.start()
thread
}

val streamTableName = "stream_table"

sql(s"CREATE TABLE $streamTableName (id INT,name STRING,city STRING,salary 
FLOAT) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 
'sort_columns'='name')")

sql(s"LOAD DATA LOCAL INPATH 'hdfs://hacluster/tmp/streamSample.csv' INTO TABLE 
$streamTableName OPTIONS('HEADER'='true')")

sql(s"select * from $streamTableName").show

val carbonTable = CarbonEnv.getInstance(carbonSession).carbonMetastore.
  lookupRelation(Some("default"), 
streamTableName)(carbonSession).asInstanceOf[CarbonRelation].carbonTable

val tablePath = 
CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)

val port = 7995
val serverSocket = new ServerSocket(port)
val socketThread = writeSocket(serverSocket)
val streamingThread = startStreaming(carbonSession, tablePath, streamTableName, 
port)
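
Not part of the original script: if the shell session should stay alive until 
the run completes and then free the port, the two threads can be joined 
afterwards, for example:

// optional cleanup, assuming the vals defined above
socketThread.join()
streamingThread.join()
serverSocket.close()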

While load is in progress user executes select query on the streaming table 
from beeline.
0: jdbc:hive2://10.18.98.34:23040> select * from stream_table;

*Issue: The Select query fails with java.io.EOFException when socket 
streaming is in progress.*
0: jdbc:hive2://10.18.98.34:23040> select * from stream_table;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 
3 in stage 1.0 failed

[jira] [Created] (CARBONDATA-1776) Fix some possible test error that are related to compaction

2017-11-19 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-1776:
---

 Summary: Fix some possible test error that are related to 
compaction
 Key: CARBONDATA-1776
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1776
 Project: CarbonData
  Issue Type: Bug
  Components: test
Reporter: xubo245
Assignee: xubo245
Priority: Minor
 Fix For: 1.3.0


Fix some possible test errors that are related to compaction





[jira] [Updated] (CARBONDATA-1775) (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when socket streaming is in progress

2017-11-19 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1775:

Description: 
Steps :
User starts the thrift server using the command - bin/spark-submit --master 
yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G 
--num-executors 3 --class 
org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
/srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
 "hdfs://hacluster/user/hive/warehouse/carbon.store"
User connects to spark shell using the command - bin/spark-shell --master 
yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G 
--num-executors 3 --jars 
/srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar

In spark-shell the user creates a table and does a streaming load into the 
table as per the below socket streaming script.
import java.io.{File, PrintWriter}
import java.net.ServerSocket

import org.apache.spark.sql.{CarbonEnv, SparkSession}
import org.apache.spark.sql.hive.CarbonRelation
import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}

import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
 "/MM/dd")

import org.apache.spark.sql.CarbonSession._

val carbonSession = SparkSession.
  builder().
  appName("StreamExample").
  getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/david")
   
carbonSession.sparkContext.setLogLevel("INFO")

def sql(sql: String) = carbonSession.sql(sql)

def writeSocket(serverSocket: ServerSocket): Thread = {
  val thread = new Thread() {
override def run(): Unit = {
  // wait for a client connection request and accept it
  val clientSocket = serverSocket.accept()
  val socketWriter = new PrintWriter(clientSocket.getOutputStream())
  var index = 0
  for (_ <- 1 to 1000) {
// write 101 records per iteration (0 to 100 inclusive)
for (_ <- 0 to 100) {
  index = index + 1
  socketWriter.println(index.toString + ",name_" + index
   + ",city_" + index + "," + (index * 
1.00).toString +
   ",school_" + index + ":school_" + index + index 
+ "$" + index)
}
socketWriter.flush()
Thread.sleep(2000)
  }
  socketWriter.close()
  System.out.println("Socket closed")
}
  }
  thread.start()
  thread
}
  
def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, tableName: 
String, port: Int): Thread = {
val thread = new Thread() {
  override def run(): Unit = {
var qry: StreamingQuery = null
try {
  val readSocketDF = spark.readStream
.format("socket")
.option("host", "10.18.98.34")
.option("port", port)
.load()

  qry = readSocketDF.writeStream
.format("carbondata")
.trigger(ProcessingTime("5 seconds"))
.option("checkpointLocation", tablePath.getStreamingCheckpointDir)
.option("tablePath", tablePath.getPath).option("tableName", 
tableName)
.start()

  qry.awaitTermination()
} catch {
  case ex: Throwable =>
ex.printStackTrace()
println("Done reading and writing streaming data")
} finally {
  if (qry != null) qry.stop()
}
  }
}
thread.start()
thread
}

val streamTableName = "stream_table"

sql(s"CREATE TABLE $streamTableName (id INT,name STRING,city STRING,salary 
FLOAT) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 
'sort_columns'='name')")

sql(s"LOAD DATA LOCAL INPATH 'hdfs://hacluster/tmp/streamSample.csv' INTO TABLE 
$streamTableName OPTIONS('HEADER'='true')")

sql(s"select * from $streamTableName").show

val carbonTable = CarbonEnv.getInstance(carbonSession).carbonMetastore.
  lookupRelation(Some("default"), 
streamTableName)(carbonSession).asInstanceOf[CarbonRelation].carbonTable

val tablePath = 
CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)

val port = 7995
val serverSocket = new ServerSocket(port)
val socketThread = writeSocket(serverSocket)
val streamingThread = startStreaming(carbonSession, tablePath, streamTableName, 
port)

While load is in progress user executes select query on the streaming table 
from beeline.
0: jdbc:hive2://10.18.98.34:23040> select * from stream_table;

Issue: The Select query fails with java.io.EOFException when socket streaming 
is in progress.
0: jdbc:hive2://10.18.98.34:23040> select * from stream_table;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 
3 in stage 1.0 failed 4

[jira] [Updated] (CARBONDATA-1775) (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when data streaming is in progress

2017-11-19 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1775:

Summary: (Carbon1.3.0 - Streaming) Select query fails with  
java.io.EOFException when data streaming is in progress  (was: (Carbon1.3.0 - 
Streaming) Select query fails with  java.io.EOFException when socket streaming 
is in progress)

> (Carbon1.3.0 - Streaming) Select query fails with  java.io.EOFException when 
> data streaming is in progress
> --
>
> Key: CARBONDATA-1775
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1775
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: 3 node ant cluster
>Reporter: Chetan Bhat
>  Labels: DFX
>
> Steps :
> User starts the thrift server using the command - bin/spark-submit --master 
> yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G 
> --num-executors 3 --class 
> org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
> /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
>  "hdfs://hacluster/user/hive/warehouse/carbon.store"
> User connects to spark shell using the command - bin/spark-shell --master 
> yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G 
> --num-executors 3 --jars 
> /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
> In spark-shell the user creates a table and does a streaming load into the 
> table as per the below socket streaming script.
> import java.io.{File, PrintWriter}
> import java.net.ServerSocket
> import org.apache.spark.sql.{CarbonEnv, SparkSession}
> import org.apache.spark.sql.hive.CarbonRelation
> import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
> import org.apache.carbondata.core.constants.CarbonCommonConstants
> import org.apache.carbondata.core.util.CarbonProperties
> import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}
> CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
>  "/MM/dd")
> import org.apache.spark.sql.CarbonSession._
> val carbonSession = SparkSession.
>   builder().
>   appName("StreamExample").
>   getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/david")
>
> carbonSession.sparkContext.setLogLevel("INFO")
> def sql(sql: String) = carbonSession.sql(sql)
> def writeSocket(serverSocket: ServerSocket): Thread = {
>   val thread = new Thread() {
> override def run(): Unit = {
>   // wait for a client connection request and accept it
>   val clientSocket = serverSocket.accept()
>   val socketWriter = new PrintWriter(clientSocket.getOutputStream())
>   var index = 0
>   for (_ <- 1 to 1000) {
> // write 101 records per iteration (0 to 100 inclusive)
> for (_ <- 0 to 100) {
>   index = index + 1
>   socketWriter.println(index.toString + ",name_" + index
>+ ",city_" + index + "," + (index * 
> 1.00).toString +
>",school_" + index + ":school_" + index + 
> index + "$" + index)
> }
> socketWriter.flush()
> Thread.sleep(2000)
>   }
>   socketWriter.close()
>   System.out.println("Socket closed")
> }
>   }
>   thread.start()
>   thread
> }
>   
> def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, 
> tableName: String, port: Int): Thread = {
> val thread = new Thread() {
>   override def run(): Unit = {
> var qry: StreamingQuery = null
> try {
>   val readSocketDF = spark.readStream
> .format("socket")
> .option("host", "10.18.98.34")
> .option("port", port)
> .load()
>   qry = readSocketDF.writeStream
> .format("carbondata")
> .trigger(ProcessingTime("5 seconds"))
> .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
> .option("tablePath", tablePath.getPath).option("tableName", 
> tableName)
> .start()
>   qry.awaitTermination()
> } catch {
>   case ex: Throwable =>
> ex.printStackTrace()
> println("Done reading and writing streaming data")
> } finally {
>   if (qry != null) qry.stop()
> }
>   }
> }
> thread.start()
> thread
> }
> val streamTableName = "stream_table"
> sql(s"CREATE TABLE $streamTableName (id INT,name STRING,city STRING,salary 
> FLOAT) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 
> 'sort_columns'='name')")
> sql(s"LOAD DATA LOCAL INPATH 'hdfs://hacluster/tmp/streamSample.csv' INTO 
> TABLE $s

[jira] [Created] (CARBONDATA-1775) (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when socket streaming is in progress

2017-11-19 Thread Chetan Bhat (JIRA)
Chetan Bhat created CARBONDATA-1775:
---

 Summary: (Carbon1.3.0 - Streaming) Select query fails with  
java.io.EOFException when socket streaming is in progress
 Key: CARBONDATA-1775
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1775
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 1.3.0
 Environment: 3 node ant cluster
Reporter: Chetan Bhat


Steps :
User starts the thrift server using the command - bin/spark-submit --master 
yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G 
--num-executors 3 --class 
org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
/srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
 "hdfs://hacluster/user/hive/warehouse/carbon.store"
User connects to spark shell using the command - bin/spark-shell --master 
yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G 
--num-executors 3 --jars 
/srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar

In spark-shell the user creates a table and does a streaming load into the 
table as per the following script.
import java.io.{File, PrintWriter}
import java.net.ServerSocket

import org.apache.spark.sql.{CarbonEnv, SparkSession}
import org.apache.spark.sql.hive.CarbonRelation
import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}

import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
 "/MM/dd")

import org.apache.spark.sql.CarbonSession._

val carbonSession = SparkSession.
  builder().
  appName("StreamExample").
  getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/david")
   
carbonSession.sparkContext.setLogLevel("INFO")

def sql(sql: String) = carbonSession.sql(sql)

def writeSocket(serverSocket: ServerSocket): Thread = {
  val thread = new Thread() {
override def run(): Unit = {
  // wait for a client connection request and accept it
  val clientSocket = serverSocket.accept()
  val socketWriter = new PrintWriter(clientSocket.getOutputStream())
  var index = 0
  for (_ <- 1 to 1000) {
// write 101 records per iteration (0 to 100 inclusive)
for (_ <- 0 to 100) {
  index = index + 1
  socketWriter.println(index.toString + ",name_" + index
   + ",city_" + index + "," + (index * 
1.00).toString +
   ",school_" + index + ":school_" + index + index 
+ "$" + index)
}
socketWriter.flush()
Thread.sleep(2000)
  }
  socketWriter.close()
  System.out.println("Socket closed")
}
  }
  thread.start()
  thread
}
  
def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, tableName: 
String, port: Int): Thread = {
val thread = new Thread() {
  override def run(): Unit = {
var qry: StreamingQuery = null
try {
  val readSocketDF = spark.readStream
.format("socket")
.option("host", "10.18.98.34")
.option("port", port)
.load()

  qry = readSocketDF.writeStream
.format("carbondata")
.trigger(ProcessingTime("5 seconds"))
.option("checkpointLocation", tablePath.getStreamingCheckpointDir)
.option("tablePath", tablePath.getPath).option("tableName", 
tableName)
.start()

  qry.awaitTermination()
} catch {
  case ex: Throwable =>
ex.printStackTrace()
println("Done reading and writing streaming data")
} finally {
  if (qry != null) qry.stop()
}
  }
}
thread.start()
thread
}

val streamTableName = "stream_table"

sql(s"CREATE TABLE $streamTableName (id INT,name STRING,city STRING,salary 
FLOAT) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 
'sort_columns'='name')")

sql(s"LOAD DATA LOCAL INPATH 'hdfs://hacluster/tmp/streamSample.csv' INTO TABLE 
$streamTableName OPTIONS('HEADER'='true')")

sql(s"select * from $streamTableName").show

val carbonTable = CarbonEnv.getInstance(carbonSession).carbonMetastore.
  lookupRelation(Some("default"), 
streamTableName)(carbonSession).asInstanceOf[CarbonRelation].carbonTable

val tablePath = 
CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)

val port = 7995
val serverSocket = new ServerSocket(port)
val socketThread = writeSocket(serverSocket)
val streamingThread = startStreaming(carbonSession, tablePath, streamTableName, 
port)

While load is in progress user executes select query on the streaming table 
from beeline.
0: jdbc:hive2://10.18.98.34:23040> select * from strea

[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1516
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1305/



---


[GitHub] carbondata issue #1499: [WIP][CARBONDATA-1235]Add Lucene Datamap

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1499
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1766/



---


[GitHub] carbondata issue #1514: [CARBONDATA-1746] Count star optimization

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1514
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1304/



---


[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...

2017-11-19 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1525#discussion_r151912577
  
--- Diff: 
integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala
 ---
@@ -18,12 +18,10 @@
 package org.apache.spark.carbondata
 
 import scala.collection.mutable
-
--- End diff --

OK, Done


---


[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...

2017-11-19 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1525#discussion_r151912584
  
--- Diff: 
integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala
 ---
@@ -18,12 +18,10 @@
 package org.apache.spark.carbondata
 
 import scala.collection.mutable
-
 import org.apache.spark.sql.common.util.Spark2QueryTest
 import org.apache.spark.sql.types._
-import org.apache.spark.sql.{Row, SaveMode}
+import org.apache.spark.sql.{AnalysisException, Row, SaveMode}
 import org.scalatest.BeforeAndAfterAll
-
--- End diff --

OK, Done


---


[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...

2017-11-19 Thread chenerlu
Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1525#discussion_r151911286
  
--- Diff: 
integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala
 ---
@@ -18,12 +18,10 @@
 package org.apache.spark.carbondata
 
 import scala.collection.mutable
-
--- End diff --

Suggest keeping this blank line


---


[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...

2017-11-19 Thread chenerlu
Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1525#discussion_r151911377
  
--- Diff: 
integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala
 ---
@@ -18,12 +18,10 @@
 package org.apache.spark.carbondata
 
 import scala.collection.mutable
-
 import org.apache.spark.sql.common.util.Spark2QueryTest
 import org.apache.spark.sql.types._
-import org.apache.spark.sql.{Row, SaveMode}
+import org.apache.spark.sql.{AnalysisException, Row, SaveMode}
 import org.scalatest.BeforeAndAfterAll
-
--- End diff --

Suggest keeping this blank line


---


[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1516
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1303/



---


[jira] [Assigned] (CARBONDATA-1774) Not able to fetch data from a table with Boolean data type in presto

2017-11-19 Thread anubhav tarar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anubhav tarar reassigned CARBONDATA-1774:
-

Assignee: anubhav tarar

> Not able to fetch data from a table with Boolean data type in presto
> 
>
> Key: CARBONDATA-1774
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1774
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 1.3.0
> Environment: spark 2.1
>Reporter: Vandana Yadav
>Assignee: anubhav tarar
>Priority: Minor
>
> Not able to fetch data from a table with Boolean data type in presto
> Steps to Reproduce:
> On Beeline
> 1)Create table:
> create table boolean ( id int, employee boolean) stored by 'carbondata';
> 2) Insert values in table:
> insert into boolean values (1,true);
> insert into boolean values (2,false);
> insert into boolean values (5,true);
> insert into boolean values (3,true);
> insert into boolean values (4,false);
> 3) Execute select queries with and without the boolean datatype:
> a)select id from boolean;
> output:
> +-----+--+
> | id  |
> +-----+--+
> | 2   |
> | 3   |
> | 4   |
> | 5   |
> | 1   |
> +-----+--+
> b)select employee from boolean;
> output:
> +-----------+--+
> | employee  |
> +-----------+--+
> | false     |
> | true      |
> | true      |
> | false     |
> | true      |
> +-----------+--+
> c) select * from boolean;
> output:
> +-----+-----------+--+
> | id  | employee  |
> +-----+-----------+--+
> | 1   | true      |
> | 3   | true      |
> | 4   | false     |
> | 5   | true      |
> | 2   | false     |
> +-----+-----------+--+
> On Presto CLI:
> Execute queries with and without boolean data type:
> a)select id from boolean;
> output:
>  id
> ----
>   2
>   5
>   1
>   3
>   4
> (5 rows)
> b)select employee from boolean;
> output:
> Expected output: it should display the boolean data type values of the 
> employee column, as on beeline.
> Actual output:
> Query 20171120_054640_00011_2ppsk, FAILED, 1 node
> Splits: 21 total, 0 done (0.00%)
> 0:01 [0 rows, 0B] [0 rows/s, 0B/s]
> Query 20171120_054640_00011_2ppsk failed: 
> com.facebook.presto.spi.type.BooleanType
> c)select * from boolean;
> output:
> Expected output: it should display the boolean data type values of the 
> employee column, as on beeline.
> Actual output:
> Query 20171120_054858_00012_2ppsk, FAILED, 1 node
> Splits: 21 total, 0 done (0.00%)
> 0:00 [0 rows, 0B] [0 rows/s, 0B/s]
> Query 20171120_054858_00012_2ppsk failed: 
> com.facebook.presto.spi.type.BooleanType





[jira] [Created] (CARBONDATA-1774) Not able to fetch data from a table with Boolean data type in presto

2017-11-19 Thread Vandana Yadav (JIRA)
Vandana Yadav created CARBONDATA-1774:
-

 Summary: Not able to fetch data from a table with Boolean data 
type in presto
 Key: CARBONDATA-1774
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1774
 Project: CarbonData
  Issue Type: Bug
  Components: presto-integration
Affects Versions: 1.3.0
 Environment: spark 2.1
Reporter: Vandana Yadav
Priority: Minor


Not able to fetch data from a table with Boolean data type in presto

Steps to Reproduce:
On Beeline

1)Create table:
create table boolean ( id int, employee boolean) stored by 'carbondata';

2) Insert values in table:
insert into boolean values (1,true);
insert into boolean values (2,false);
insert into boolean values (5,true);
insert into boolean values (3,true);
insert into boolean values (4,false);

3) Execute select queries with and without the boolean datatype:
a)select id from boolean;
output:
+-----+--+
| id  |
+-----+--+
| 2   |
| 3   |
| 4   |
| 5   |
| 1   |
+-----+--+

b)select employee from boolean;
output:
+-----------+--+
| employee  |
+-----------+--+
| false     |
| true      |
| true      |
| false     |
| true      |
+-----------+--+

c) select * from boolean;
output:
+-----+-----------+--+
| id  | employee  |
+-----+-----------+--+
| 1   | true      |
| 3   | true      |
| 4   | false     |
| 5   | true      |
| 2   | false     |
+-----+-----------+--+


On Presto CLI:

Execute queries with and without boolean data type:

a)select id from boolean;
output:
 id
----
  2
  5
  1
  3
  4
(5 rows)

b)select employee from boolean;
output:

Expected output: it should display the boolean data type values of the 
employee column, as on beeline.

Actual output:

Query 20171120_054640_00011_2ppsk, FAILED, 1 node
Splits: 21 total, 0 done (0.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20171120_054640_00011_2ppsk failed: 
com.facebook.presto.spi.type.BooleanType

c)select * from boolean;
output:

Expected output: it should display the boolean data type values of the 
employee column, as on beeline.

Actual output:

Query 20171120_054858_00012_2ppsk, FAILED, 1 node
Splits: 21 total, 0 done (0.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20171120_054858_00012_2ppsk failed: 
com.facebook.presto.spi.type.BooleanType







[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1516
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1765/



---


[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1525
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1302/



---


[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...

2017-11-19 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/1535
  
retest sdv please


---


[GitHub] carbondata issue #1438: [CARBONDATA-1649]insert overwrite fix during job int...

2017-11-19 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1438
  
Please rebase @akashrn5 


---


[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1516
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1300/



---


[GitHub] carbondata issue #1496: [CARBONDATA-1709][DataFrame] Support sort_columns op...

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1496
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1764/



---


[GitHub] carbondata issue #1460: [Docs] Fix partition-guide.md docs NUM_PARTITIONS wr...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1460
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1299/



---


[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1516
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1298/



---


[GitHub] carbondata issue #1499: [WIP][CARBONDATA-1235]Add Lucene Datamap

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1499
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1297/



---


[jira] [Resolved] (CARBONDATA-1767) Remove dependency of Java 1.8

2017-11-19 Thread Liang Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-1767.

Resolution: Fixed

> Remove dependency of Java 1.8
> -
>
> Key: CARBONDATA-1767
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1767
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.3.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> carbon should be able to compile with Java 1.7





[GitHub] carbondata pull request #1531: [CARBONDATA-1767] Remove dependency of Java 1...

2017-11-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1531


---


[GitHub] carbondata issue #1460: [Docs] Fix partition-guide.md docs NUM_PARTITIONS wr...

2017-11-19 Thread LiShuMing
Github user LiShuMing commented on the issue:

https://github.com/apache/carbondata/pull/1460
  
@chenliang613 already done.


---


[GitHub] carbondata issue #1531: [CARBONDATA-1767] Remove dependency of Java 1.8

2017-11-19 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1531
  
LGTM


---


[jira] [Resolved] (CARBONDATA-1768) Upgrade univocity parser to 2.2.1

2017-11-19 Thread Liang Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-1768.

   Resolution: Fixed
Fix Version/s: 1.3.0

> Upgrade univocity parser to 2.2.1
> -
>
> Key: CARBONDATA-1768
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1768
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.3.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Univocity CSV parser has improved performance in 2.2.1; upgrade the 
> dependency to use it





[GitHub] carbondata pull request #1532: [CARBONDATA-1768] Upgrade univocity parser to...

2017-11-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1532


---


[GitHub] carbondata issue #1532: [CARBONDATA-1768] Upgrade univocity parser to 2.2.1

2017-11-19 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1532
  
LGTM


---


[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1516
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1296/



---


[jira] [Resolved] (CARBONDATA-1769) Change alterTableCompaction to support transfer tableInfo

2017-11-19 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-1769.
--
Resolution: Fixed

> Change alterTableCompaction to support transfer  tableInfo
> --
>
> Key: CARBONDATA-1769
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1769
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: xubo245
>Assignee: xubo245
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Change alterTableCompaction to support transferring tableInfo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1533: [CARBONDATA-1769] Change alterTableCompaction...

2017-11-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1533


---


[GitHub] carbondata issue #1533: [CARBONDATA-1769] Change alterTableCompaction to sup...

2017-11-19 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1533
  
LGTM


---


[jira] [Resolved] (CARBONDATA-1766) fix serialization issue for CarbonAppendableStreamSink

2017-11-19 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-1766.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> fix serialization issue for CarbonAppendableStreamSink
> --
>
> Key: CARBONDATA-1766
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1766
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Reporter: QiangCai
> Fix For: 1.3.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> fix serialization issue for CarbonAppendableStreamSink



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1530: [CARBONDATA-1766] fix serialization issue for...

2017-11-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1530


---


[GitHub] carbondata issue #1530: [CARBONDATA-1766] fix serialization issue for Carbon...

2017-11-19 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1530
  
LGTM


---


[GitHub] carbondata issue #1496: [CARBONDATA-1709][DataFrame] Support sort_columns op...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1496
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1295/



---


[GitHub] carbondata issue #1499: [WIP][CARBONDATA-1235]Add Lucene Datamap

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1499
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1294/



---


[GitHub] carbondata issue #1469: [WIP] Spark-2.2 Carbon Integration - Phase 1

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1469
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1293/



---


[GitHub] carbondata issue #1469: [WIP] Spark-2.2 Carbon Integration - Phase 1

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1469
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1762/



---


[GitHub] carbondata issue #1469: [WIP] Spark-2.2 Carbon Integration - Phase 1

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1469
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1292/



---


[GitHub] carbondata issue #1469: [WIP] Spark-2.2 Carbon Integration - Phase 1

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1469
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1761/



---


[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1535
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1760/



---


[jira] [Created] (CARBONDATA-1773) [Streaming] carbon StreamWriter task failed with ClassCastException

2017-11-19 Thread Babulal (JIRA)
Babulal created CARBONDATA-1773:
---

 Summary: [Streaming] carbon StreamWriter task failed with 
ClassCastException
 Key: CARBONDATA-1773
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1773
 Project: CarbonData
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Babulal
 Attachments: streamingLog.log

Run the below sequence of commands in spark-shell (bin/spark-shell --jars 
/opt/carbon/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar --master 
yarn-client --executor-memory 1G --executor-cores 2 --driver-memory 1G):

// carbon is a SparkSession with CarbonStateBuilder

carbon.sql("create table stable (value String, count String) STORED BY " +
  "'carbondata' TBLPROPERTIES ('streaming' = 'true')")

// The socket port was elided in the original report; 9999 is only a placeholder.
val port = 9999

// read a socket stream and compute word counts
val lines = carbon.readStream.format("socket")
  .option("host", "localhost")
  .option("port", port)
  .load()
val words = lines.as[String].flatMap(_.split(" "))
val wordCounts = words.groupBy("value").count()

val carbonTable = CarbonEnv.getCarbonTable(Some("default"), "stable")(carbon)
val tablePath = CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)

// write the streaming word counts into the carbon table
val qry = wordCounts.writeStream.format("carbondata")
  .outputMode("complete")
  .trigger(ProcessingTime("1 seconds"))
  .option("tablePath", tablePath.getPath)
  .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
  .option("tableName", "stable")
  .start()

qry.awaitTermination()

Now in another window run the below command (same placeholder port):

root@master ~ # nc -lk 9999
babu


Check the spark-shell output:

17/11/19 17:59:57 WARN TaskSetManager: Lost task 2.0 in stage 1.0 (TID 3, slave1, executor 2): org.apache.carbondata.streaming.CarbonStreamException: Task failed while writing rows
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:286)
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:192)
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:191)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field scala.collection.convert.Wrappers$SeqWrapper.underlying of type scala.collection.Seq in instance of scala.collection.convert.Wrappers$SeqWrapper
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2251)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at org.apache.carbondata.hadoop.util.ObjectSerializationUtil.convertStringToObject(ObjectSerializationUtil.java:99)







--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1772) [Streaming] carbon StreamWriter task failed with ClassCastException

2017-11-19 Thread Babulal (JIRA)
Babulal created CARBONDATA-1772:
---

 Summary: [Streaming] carbon StreamWriter task failed with 
ClassCastException
 Key: CARBONDATA-1772
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1772
 Project: CarbonData
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Babulal
 Attachments: streamingLog.log


(Description identical to CARBONDATA-1773 above.)

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1535
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1291/



---


[GitHub] carbondata issue #1531: [CARBONDATA-1767] Remove dependency of Java 1.8

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1531
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1759/



---


[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...

2017-11-19 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/1535
  
Retest this please


---


[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1535
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1290/



---


[GitHub] carbondata issue #1532: [CARBONDATA-1768] Upgrade univocity parser to 2.2.1

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1532
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1758/



---


[GitHub] carbondata pull request #1535: [CARBONDATA-1771] While segment_index compact...

2017-11-19 Thread dhatchayani
GitHub user dhatchayani opened a pull request:

https://github.com/apache/carbondata/pull/1535

[CARBONDATA-1771] While segment_index compaction, .carbonindex files of 
invalid segments are also getting merged

**Scenario:**
With the feature disabled, perform loads and run a MINOR compaction,
then run a SEGMENT_INDEX compaction.
The SEGMENT_INDEX compaction also merges the .carbonindex files of the
compacted (now invalid) segments.

**Solution:**
Merge the index files of valid segments only, as sketched below.
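
A minimal, self-contained Scala sketch of that filtering, with illustrative
names (not CarbonData's actual API):

// Merge .carbonindex files only for segments still in a valid state;
// "Compacted" segments have been superseded by the earlier MINOR compaction.
case class Segment(id: String, status: String)

def mergeableSegments(segments: Seq[Segment]): Seq[Segment] =
  segments.filter(s => s.status == "Success" || s.status == "Load Partial Success")

val segments = Seq(
  Segment("0", "Compacted"),
  Segment("1", "Compacted"),
  Segment("0.1", "Success"))
assert(mergeableSegments(segments).map(_.id) == Seq("0.1"))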

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [X] Testing done
UT Added
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhatchayani/incubator-carbondata 
index_compaction

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1535.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1535


commit 9462791bb5ef5eb9dbc9f7c3a8c97ef3e6ad7dfb
Author: dhatchayani 
Date:   2017-11-19T15:23:50Z

[CARBONDATA-1771] While segment_index compaction, .carbonindex files of 
invalid segments are also getting merged




---


[jira] [Updated] (CARBONDATA-1771) While segment_index compaction, .carbonindex files of invalid segments are also getting merged

2017-11-19 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1771:

Summary: While segment_index compaction, .carbonindex files of invalid 
segments are also getting merged  (was: While segment_index compaction, invalid 
segments .carbonindex files are also getting merged)

> While segment_index compaction, .carbonindex files of invalid segments are 
> also getting merged
> --
>
> Key: CARBONDATA-1771
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1771
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1533: [WIP][CARBONDATA-1769] Change alterTableCompaction t...

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1533
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1757/



---


[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1508
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1289/



---


[jira] [Created] (CARBONDATA-1771) While segment_index compaction, invalid segments .carbonindex files are also getting merged

2017-11-19 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-1771:
---

 Summary: While segment_index compaction, invalid segments 
.carbonindex files are also getting merged
 Key: CARBONDATA-1771
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1771
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...

2017-11-19 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/1508
  
retest this please


---


[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1508
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1288/



---


[GitHub] carbondata issue #1534: [CARBONDATA-1770] Update error docs and consolidate ...

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1534
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1756/



---


[GitHub] carbondata issue #1532: [CARBONDATA-1768] Upgrade univocity parser to 2.2.1

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1532
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1287/



---


[GitHub] carbondata issue #1167: [CARBONDATA-1304] [IUD BuggFix] Iud with single pass

2017-11-19 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1167
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/1755/



---


[GitHub] carbondata issue #1534: [CARBONDATA-1770] Update error docs and consolidate ...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1534
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1286/



---


[GitHub] carbondata issue #1532: [CARBONDATA-1768] Upgrade univocity parser to 2.2.1

2017-11-19 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1532
  
retest this please


---


[GitHub] carbondata pull request #1534: [CARBONDATA-1770] Update documents and consol...

2017-11-19 Thread chenliang613
GitHub user chenliang613 opened a pull request:

https://github.com/apache/carbondata/pull/1534

[CARBONDATA-1770] Update documents and consolidate DDL,DML,Partition docs

1. Update documents: fix some erroneous descriptions.
2. Consolidate the data management, DDL, DML and partition docs, so that each 
feature is described in only one place.

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [X] Any interfaces changed?
 NA
 - [X] Any backward compatibility impacted?
 NA
 - [X] Document update required?
YES
 - [X] Testing done
NA
 - [X] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
YES



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenliang613/carbondata update_docs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1534.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1534


commit a0333be14051166072fb9865dc0623ee1473c92e
Author: chenliang613 
Date:   2017-11-19T13:12:11Z

[CARBONDATA-1770] Update documents and consolidate DDL,DML,Partition docs




---


[jira] [Created] (CARBONDATA-1770) Update documents and consolidate DDL,DML,Partition docs

2017-11-19 Thread Liang Chen (JIRA)
Liang Chen created CARBONDATA-1770:
--

 Summary: Update documents and consolidate DDL,DML,Partition docs
 Key: CARBONDATA-1770
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1770
 Project: CarbonData
  Issue Type: Improvement
  Components: docs
Reporter: Liang Chen
Assignee: Liang Chen


1. Update documents: fix some erroneous descriptions.
2. Consolidate the data management, DDL, DML and partition docs, so that each feature 
is described in only one place.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1533: [WIP][CARBONDATA-1769] Change alterTableCompaction t...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1533
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1285/



---


[GitHub] carbondata pull request #1533: [WIP][CARBONDATA-1769] Change alterTableCompa...

2017-11-19 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/1533

[WIP][CARBONDATA-1769] Change alterTableCompaction to support transferring 
tableInfo

Change alterTableCompaction to support transferring tableInfo

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 No
 - [ ] Document update required?
No
 - [ ] Testing done
No
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
MR119


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata CompactionGithub

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1533.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1533


commit 1605179cd3b8c8db18027537123b57fa3976ecc3
Author: xubo245 <601450...@qq.com>
Date:   2017-11-19T12:06:00Z

[CARBONDATA-1769] Change alterTableCompaction to support transfer tableInfo




---


[jira] [Created] (CARBONDATA-1769) Change alterTableCompaction to support transfer tableInfo

2017-11-19 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-1769:
---

 Summary: Change alterTableCompaction to support transferring tableInfo
 Key: CARBONDATA-1769
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1769
 Project: CarbonData
  Issue Type: Improvement
  Components: spark-integration
Reporter: xubo245
Assignee: xubo245
Priority: Minor
 Fix For: 1.3.0


Change alterTableCompaction to support transferring tableInfo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...

2017-11-19 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/1525
  
I have fixed the conflicts, please review it @jackylk 


---


[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1525
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1284/



---


[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...

2017-11-19 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1525
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1283/



---