[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1525 Erlu Chen and Yadong Qi have reviewed this PR; please review it again @jackylk ---
[jira] [Updated] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables creation in Spark-shell sessions are not used in the beeline session
[ https://issues.apache.org/jira/browse/CARBONDATA-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramakrishna S updated CARBONDATA-1777:
--------------------------------------
Description:

Steps:
Beeline:
1. Create table and load with data
Spark-shell:
1. Create a pre-aggregate table
Beeline:
1. Run aggregate query

*+Expected:+* Pre-aggregate table should be used in the aggregate query
*+Actual:+* Pre-aggregate table is not used

1.
create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table lineitem1 options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');

2.
carbon.sql("create datamap agr1_lineitem1 ON TABLE lineitem1 USING 'org.apache.carbondata.datamap.AggregateDataMapHandler' as select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem1 group by l_returnflag, l_linestatus").show();

3.
select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus;

Actual:
0: jdbc:hive2://10.18.98.136:23040> show tables;
+-----------+---------------------------+--------------+--+
| database  | tableName                 | isTemporary  |
+-----------+---------------------------+--------------+--+
| test_db2  | lineitem1                 | false        |
| test_db2  | lineitem1_agr1_lineitem1  | false        |
+-----------+---------------------------+--------------+--+
2 rows selected (0.047 seconds)

Logs:
2017-11-20 15:46:48,314 | INFO | [pool-23-thread-53] | Running query 'select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus' with 7f3091a8-4d7b-40ac-840f-9db6f564c9cf | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2017-11-20 15:46:48,314 | INFO | [pool-23-thread-53] | Parsing command: select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2017-11-20 15:46:48,353 | INFO | [pool-23-thread-53] | 55: get_table : db=test_db2 tbl=lineitem1 | org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
2017-11-20 15:46:48,353 | INFO | [pool-23-thread-53] | ugi=anonymous ip=unknown-ip-addr cmd=get_table : db=test_db2 tbl=lineitem1 | org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logAuditEvent(HiveMetaStore.java:371)
2017-11-20 15:46:48,354 | INFO | [pool-23-thread-53] | 55: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore | org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:589)
2017-11-20 15:46:48,355 | INFO | [pool-23-thread-53] | ObjectStore, initialize called | org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:289)
2017-11-20 15:46:48,360 | INFO | [pool-23-thread-53] | Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing | org.datanucleus.util.Log4JLogger.info(Log4JLogger.java:77)
2017-11-20 15:46:48,362 | INFO | [pool-23-thread-53] | Using direct SQL, underlying DB is MYSQL | org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:139)
2017-11-20 15:46:48,362 | INFO | [pool-23-thread-53] | Initialized ObjectStore | org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:272)
2017-11-20 15:46:48,376 | INFO | [pool-23-thread-53] | Parsing command: array | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2017-11-20 15:46:48,399 | INFO | [pool-23-thread-53] | Schema changes have been detected for table: `lineitem1` | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
2017-11-20 15:46:48,399 | INFO | [pool-23-thread-53] | 55: get_table : db=test_db2 tbl=lineitem1 | org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
2017-11-20 15:46:48,400 | INFO | [pool-23-thread-53] | ugi=anonymous ip=unknown-ip-addr cmd=get_table : db=test_db2 tbl=lineitem1 | org
[jira] [Updated] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell sessions are not used in the beeline session
[ https://issues.apache.org/jira/browse/CARBONDATA-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramakrishna S updated CARBONDATA-1777:
--------------------------------------
    Summary: Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell sessions are not used in the beeline session (was: Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables creation in Spark-shell sessions are not used in the beeline session)

> Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell sessions are not used in the beeline session
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-1777
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1777
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.3.0
>         Environment: Test - 3 node ant cluster
>            Reporter: Ramakrishna S
>            Assignee: Kunal Kapoor
>              Labels: DFX
>             Fix For: 1.3.0
>
> Steps:
> Beeline:
> 1. Create table and load with data
> Spark-shell:
> 1. Create a pre-aggregate table
> Beeline:
> 1. Run aggregate query
>
> *+Expected:+* Pre-aggregate table should be used in the aggregate query
> *+Actual:+* Pre-aggregate table is not used
>
> 1.
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table lineitem1 options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 2.
> carbon.sql("create datamap agr1_lineitem1 ON TABLE lineitem1 USING 'org.apache.carbondata.datamap.AggregateDataMapHandler' as select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem1 group by l_returnflag, l_linestatus").show();
> 3.
> select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus;
>
> Actual:
> 0: jdbc:hive2://10.18.98.136:23040> show tables;
> +-----------+---------------------------+--------------+--+
> | database  | tableName                 | isTemporary  |
> +-----------+---------------------------+--------------+--+
> | test_db2  | lineitem1                 | false        |
> | test_db2  | lineitem1_agr1_lineitem1  | false        |
> +-----------+---------------------------+--------------+--+
> 2 rows selected (0.047 seconds)
>
> Logs:
> 2017-11-20 15:46:48,314 | INFO | [pool-23-thread-53] | Running query 'select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus' with 7f3091a8-4d7b-40ac-840f-9db6f564c9cf | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,314 | INFO | [pool-23-thread-53] | Parsing command: select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,353 | INFO | [pool-23-thread-53] | 55: get_table : db=test_db2 tbl=lineitem1 | org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
> 2017-11-20 15:46:48,353 | INFO | [pool-23-thread-53] | ugi=anonymous ip=unknown-ip-addr cmd=get_table : db=test_db2 tbl=lineitem1 | org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logAuditEvent(HiveMetaStore.java:371)
> 2017-11-20 15:46:48,354 | INFO | [pool-23-thread-53] | 55: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore | org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:589)
> 2017-11-20 15:46:48,355 | INFO | [pool-23-thread-53] | ObjectStore, initialize called | org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:289)
> 2017-11-20 15:46:48,360 | INFO | [pool-23-thread-53] | Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing | org.datanucleus.util.Log4JLogge
[jira] [Commented] (CARBONDATA-1516) Support pre-aggregate tables and timeseries in carbondata
[ https://issues.apache.org/jira/browse/CARBONDATA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258902#comment-16258902 ]

zhouguangcheng commented on CARBONDATA-1516:
--------------------------------------------

[~kumarvishal] Can we add one more command to list all the pre-aggregate tables? It would be a useful command: after creating many pre-aggregate tables, the user wants to know all the cubes that were created, and such a command makes the pre-aggregate tables easier to maintain.

> Support pre-aggregate tables and timeseries in carbondata
> ----------------------------------------------------------
>
>                 Key: CARBONDATA-1516
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1516
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Ravindra Pesala
>         Attachments: CarbonData Pre-aggregation Table.pdf, CarbonData Pre-aggregation Table_v1.1.pdf
>
> Currently Carbondata has standard SQL capability on distributed data sets. Carbondata should support pre-aggregating tables for timeseries and improve query performance.
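Until such a command exists, a possible workaround sketch in Scala (an assumption that relies only on the "parentTable_datamapName" naming visible in CARBONDATA-1777 and on the "carbon" spark-shell session used elsewhere in this digest, not on any official API):

    // Hedged workaround: list child tables of a parent by the naming pattern
    // "<parentTable>_<datamapName>" seen in this digest's SHOW TABLES output.
    val parent = "lineitem1" // illustrative parent table name
    carbon.sql("show tables")
      .collect() // rows are (database, tableName, isTemporary)
      .filter(row => row.getString(1).startsWith(parent + "_"))
      .foreach(row => println(row.getString(1)))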
[jira] [Created] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables creation in Spark-shell sessions are not used in the beeline session
Ramakrishna S created CARBONDATA-1777:
------------------------------------------

             Summary: Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables creation in Spark-shell sessions are not used in the beeline session
                 Key: CARBONDATA-1777
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1777
             Project: CarbonData
          Issue Type: Bug
          Components: data-load
    Affects Versions: 1.3.0
         Environment: Test - 3 node ant cluster
            Reporter: Ramakrishna S
            Assignee: Kunal Kapoor
             Fix For: 1.3.0

Steps:
1. Create table and load with data
2. Run an update query on the table - this takes the table meta lock
3. In parallel, run the pre-aggregate table create step - this is not allowed due to the table lock
4. Rerun the pre-aggregate table create step

*+Expected:+* Pre-aggregate table should be created
*+Actual:+* Pre-aggregate table creation fails

+Create, Load & Update:+
0: jdbc:hive2://10.18.98.136:23040> create table if not exists lineitem4(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.266 seconds)
0: jdbc:hive2://10.18.98.136:23040> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table lineitem4 options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (6.331 seconds)
0: jdbc:hive2://10.18.98.136:23040> update lineitem4 set (l_linestatus) = ('xx');

+Create Datamap:+
0: jdbc:hive2://10.18.98.136:23040> create datamap agr_lineitem4 ON TABLE lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem4 group by l_returnflag, l_linestatus;
Error: java.lang.RuntimeException: Acquire table lock failed after retry, please try after some time (state=,code=0)
0: jdbc:hive2://10.18.98.136:23040> select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem4 group by l_returnflag, l_linestatus;
+---------------+---------------+------------------+---------------------+--------------------+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | avg(l_quantity)     | count(l_quantity)  |
+---------------+---------------+------------------+---------------------+--------------------+--+
| N             | xx            | 1.2863213E7      | 25.48745561614304   | 504688             |
| A             | xx            | 6318125.0        | 25.506342144783375  | 247708             |
| R             | xx            | 6321939.0        | 25.532459087898417  | 247604             |
+---------------+---------------+------------------+---------------------+--------------------+--+
3 rows selected (1.033 seconds)
0: jdbc:hive2://10.18.98.136:23040> create datamap agr_lineitem4 ON TABLE lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem4 group by l_returnflag, l_linestatus;
Error: java.lang.RuntimeException: Table [lineitem4_agr_lineitem4] already exists under database [test_db1] (state=,code=0)
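One hedged workaround sketch for the second failure (an assumption, not a verified fix): the failed create left the child table behind, so dropping the orphan child table named in the error message lets the create be retried once the update releases the lock. Assumes the "carbon" spark-shell session used elsewhere in this digest:

    // Hedged workaround sketch; it does not address the lock contention itself.
    // The failed "create datamap" left lineitem4_agr_lineitem4 behind, which is
    // why the retry fails with "already exists".
    carbon.sql("drop table if exists test_db1.lineitem4_agr_lineitem4")
    carbon.sql("create datamap agr_lineitem4 ON TABLE lineitem4 USING " +
      "\"org.apache.carbondata.datamap.AggregateDataMapHandler\" as select " +
      "l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) " +
      "from lineitem4 group by l_returnflag, l_linestatus")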
[jira] [Commented] (CARBONDATA-1516) Support pre-aggregate tables and timeseries in carbondata
[ https://issues.apache.org/jira/browse/CARBONDATA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258900#comment-16258900 ]

zhouguangcheng commented on CARBONDATA-1516:
--------------------------------------------

[~kumarvishal] Hi Vishal, if deleting segments on the main table is not supported, how do we support data retention on the table?

> Support pre-aggregate tables and timeseries in carbondata
> ----------------------------------------------------------
>
>                 Key: CARBONDATA-1516
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1516
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Ravindra Pesala
>         Attachments: CarbonData Pre-aggregation Table.pdf, CarbonData Pre-aggregation Table_v1.1.pdf
>
> Currently Carbondata has standard SQL capability on distributed data sets. Carbondata should support pre-aggregating tables for timeseries and improve query performance.
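For reference, the retention mechanism the question refers to is segment deletion on the main table. A short sketch of those commands, assuming the "carbon" spark-shell session used elsewhere in this digest; the table name is illustrative, and how these commands should behave once pre-aggregate child tables exist is exactly the open point:

    // Hedged sketch of the existing segment-retention DML on a plain table.
    // "lineitem1" is an illustrative table name.
    carbon.sql("DELETE FROM TABLE lineitem1 WHERE SEGMENT.ID IN (0, 1)")
    carbon.sql("DELETE FROM TABLE lineitem1 WHERE SEGMENT.STARTTIME BEFORE '2017-06-01 12:05:06'")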
[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1525 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1306/ ---
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1516 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1767/ ---
[GitHub] carbondata pull request #1536: [CARBONDATA-1776] Fix some possible test erro...
GitHub user xubo245 opened a pull request:

    https://github.com/apache/carbondata/pull/1536

    [CARBONDATA-1776] Fix some possible test errors that are related to compaction

    Fix some possible test errors that are related to compaction. This PR only changes test classes.

    Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

    - [ ] Any interfaces changed? No
    - [ ] Any backward compatibility impacted? No
    - [ ] Document update required? No
    - [ ] Testing done? Only test classes changed, no new tests
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. MR120

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata fixUT

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1536.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1536

commit ba60c10dca675c964bab2bbd80f236f254308ea4
Author: xubo245 <601450...@qq.com>
Date: 2017-11-20T07:19:27Z

    [CARBONDATA-1776] Fix some possible test errors that are related to compaction

---
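The diff itself is not shown here. As a hedged illustration of why compaction-related tests can be flaky, a common pattern is to poll segment status instead of asserting immediately after triggering a compaction (this helper and the column position are assumptions, not code from the PR):

    import org.apache.spark.sql.SparkSession

    // Hedged sketch, not code from this PR: compaction runs asynchronously, so
    // a test asserting on segments right after triggering it can race the merge
    // thread. Polling SHOW SEGMENTS until a "Compacted" status appears makes
    // the assertion deterministic. Assumes Status is the second output column.
    def waitForCompaction(spark: SparkSession, table: String, retries: Int = 30): Boolean = {
      (1 to retries).exists { _ =>
        val compacted = spark.sql(s"SHOW SEGMENTS FOR TABLE $table")
          .collect().exists(row => row.getString(1) == "Compacted")
        if (!compacted) Thread.sleep(1000) // give the merge time to finish
        compacted
      }
    }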
[jira] [Updated] (CARBONDATA-1775) (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when data streaming is in progress
[ https://issues.apache.org/jira/browse/CARBONDATA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-1775:
------------------------------------
Description:

Steps:
User starts the thrift server using the command - bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar "hdfs://hacluster/user/hive/warehouse/carbon.store"

User connects to the spark shell using the command - bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --jars /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar

In the spark shell the user creates a table and does a streaming load into the table as per the below socket streaming script.

import java.io.{File, PrintWriter}
import java.net.ServerSocket
import org.apache.spark.sql.{CarbonEnv, SparkSession}
import org.apache.spark.sql.hive.CarbonRelation
import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd")

import org.apache.spark.sql.CarbonSession._

val carbonSession = SparkSession.
  builder().
  appName("StreamExample").
  getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/david")

carbonSession.sparkContext.setLogLevel("INFO")

def sql(sql: String) = carbonSession.sql(sql)

def writeSocket(serverSocket: ServerSocket): Thread = {
  val thread = new Thread() {
    override def run(): Unit = {
      // wait for a client connection request and accept it
      val clientSocket = serverSocket.accept()
      val socketWriter = new PrintWriter(clientSocket.getOutputStream())
      var index = 0
      for (_ <- 1 to 1000) {
        // write 5 records per iteration
        for (_ <- 0 to 100) {
          index = index + 1
          socketWriter.println(index.toString + ",name_" + index
            + ",city_" + index + "," + (index * 1.00).toString
            + ",school_" + index + ":school_" + index + index + "$" + index)
        }
        socketWriter.flush()
        Thread.sleep(2000)
      }
      socketWriter.close()
      System.out.println("Socket closed")
    }
  }
  thread.start()
  thread
}

def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, tableName: String, port: Int): Thread = {
  val thread = new Thread() {
    override def run(): Unit = {
      var qry: StreamingQuery = null
      try {
        val readSocketDF = spark.readStream
          .format("socket")
          .option("host", "10.18.98.34")
          .option("port", port)
          .load()
        qry = readSocketDF.writeStream
          .format("carbondata")
          .trigger(ProcessingTime("5 seconds"))
          .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
          .option("tablePath", tablePath.getPath).option("tableName", tableName)
          .start()
        qry.awaitTermination()
      } catch {
        case ex: Throwable =>
          ex.printStackTrace()
          println("Done reading and writing streaming data")
      } finally {
        qry.stop()
      }
    }
  }
  thread.start()
  thread
}

val streamTableName = "stream_table"

sql(s"CREATE TABLE $streamTableName (id INT,name STRING,city STRING,salary FLOAT) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 'sort_columns'='name')")
sql(s"LOAD DATA LOCAL INPATH 'hdfs://hacluster/tmp/streamSample.csv' INTO TABLE $streamTableName OPTIONS('HEADER'='true')")
sql(s"select * from $streamTableName").show

val carbonTable = CarbonEnv.getInstance(carbonSession).carbonMetastore.
  lookupRelation(Some("default"), streamTableName)(carbonSession).asInstanceOf[CarbonRelation].carbonTable
val tablePath = CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)
val port = 7995
val serverSocket = new ServerSocket(port)
val socketThread = writeSocket(serverSocket)
val streamingThread = startStreaming(carbonSession, tablePath, streamTableName, port)

While the load is in progress, the user executes a select query on the streaming table from beeline.
0: jdbc:hive2://10.18.98.34:23040> select * from stream_table;

*Issue : The Select query fails with java.io.EOFException when socket streaming is in progress.*

0: jdbc:hive2://10.18.98.34:23040> select * from stream_table;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 1.0 failed
[jira] [Created] (CARBONDATA-1776) Fix some possible test errors that are related to compaction
xubo245 created CARBONDATA-1776:
-----------------------------------

             Summary: Fix some possible test errors that are related to compaction
                 Key: CARBONDATA-1776
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1776
             Project: CarbonData
          Issue Type: Bug
          Components: test
            Reporter: xubo245
            Assignee: xubo245
            Priority: Minor
             Fix For: 1.3.0

Fix some possible test errors that are related to compaction.
[jira] [Updated] (CARBONDATA-1775) (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when socket streaming is in progress
[ https://issues.apache.org/jira/browse/CARBONDATA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-1775:
------------------------------------
Description:

Steps:
User starts the thrift server using the command - bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar "hdfs://hacluster/user/hive/warehouse/carbon.store"

User connects to the spark shell using the command - bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --jars /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar

In the spark shell the user creates a table and does a streaming load into the table as per the below socket streaming script.

import java.io.{File, PrintWriter}
import java.net.ServerSocket
import org.apache.spark.sql.{CarbonEnv, SparkSession}
import org.apache.spark.sql.hive.CarbonRelation
import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd")

import org.apache.spark.sql.CarbonSession._

val carbonSession = SparkSession.
  builder().
  appName("StreamExample").
  getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/david")

carbonSession.sparkContext.setLogLevel("INFO")

def sql(sql: String) = carbonSession.sql(sql)

def writeSocket(serverSocket: ServerSocket): Thread = {
  val thread = new Thread() {
    override def run(): Unit = {
      // wait for a client connection request and accept it
      val clientSocket = serverSocket.accept()
      val socketWriter = new PrintWriter(clientSocket.getOutputStream())
      var index = 0
      for (_ <- 1 to 1000) {
        // write 5 records per iteration
        for (_ <- 0 to 100) {
          index = index + 1
          socketWriter.println(index.toString + ",name_" + index
            + ",city_" + index + "," + (index * 1.00).toString
            + ",school_" + index + ":school_" + index + index + "$" + index)
        }
        socketWriter.flush()
        Thread.sleep(2000)
      }
      socketWriter.close()
      System.out.println("Socket closed")
    }
  }
  thread.start()
  thread
}

def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, tableName: String, port: Int): Thread = {
  val thread = new Thread() {
    override def run(): Unit = {
      var qry: StreamingQuery = null
      try {
        val readSocketDF = spark.readStream
          .format("socket")
          .option("host", "10.18.98.34")
          .option("port", port)
          .load()
        qry = readSocketDF.writeStream
          .format("carbondata")
          .trigger(ProcessingTime("5 seconds"))
          .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
          .option("tablePath", tablePath.getPath).option("tableName", tableName)
          .start()
        qry.awaitTermination()
      } catch {
        case ex: Throwable =>
          ex.printStackTrace()
          println("Done reading and writing streaming data")
      } finally {
        qry.stop()
      }
    }
  }
  thread.start()
  thread
}

val streamTableName = "stream_table"

sql(s"CREATE TABLE $streamTableName (id INT,name STRING,city STRING,salary FLOAT) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 'sort_columns'='name')")
sql(s"LOAD DATA LOCAL INPATH 'hdfs://hacluster/tmp/streamSample.csv' INTO TABLE $streamTableName OPTIONS('HEADER'='true')")
sql(s"select * from $streamTableName").show

val carbonTable = CarbonEnv.getInstance(carbonSession).carbonMetastore.
  lookupRelation(Some("default"), streamTableName)(carbonSession).asInstanceOf[CarbonRelation].carbonTable
val tablePath = CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)
val port = 7995
val serverSocket = new ServerSocket(port)
val socketThread = writeSocket(serverSocket)
val streamingThread = startStreaming(carbonSession, tablePath, streamTableName, port)

While the load is in progress, the user executes a select query on the streaming table from beeline.
0: jdbc:hive2://10.18.98.34:23040> select * from stream_table;

Issue : The Select query fails with java.io.EOFException when socket streaming is in progress.

0: jdbc:hive2://10.18.98.34:23040> select * from stream_table;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 1.0 failed 4
[jira] [Updated] (CARBONDATA-1775) (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when data streaming is in progress
[ https://issues.apache.org/jira/browse/CARBONDATA-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-1775:
------------------------------------
    Summary: (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when data streaming is in progress (was: (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when socket streaming is in progress)

> (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when data streaming is in progress
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-1775
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1775
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-query
>    Affects Versions: 1.3.0
>         Environment: 3 node ant cluster
>            Reporter: Chetan Bhat
>              Labels: DFX
>
> Steps:
> User starts the thrift server using the command - bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar "hdfs://hacluster/user/hive/warehouse/carbon.store"
>
> User connects to the spark shell using the command - bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --jars /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
>
> In the spark shell the user creates a table and does a streaming load into the table as per the below socket streaming script.
>
> import java.io.{File, PrintWriter}
> import java.net.ServerSocket
> import org.apache.spark.sql.{CarbonEnv, SparkSession}
> import org.apache.spark.sql.hive.CarbonRelation
> import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
> import org.apache.carbondata.core.constants.CarbonCommonConstants
> import org.apache.carbondata.core.util.CarbonProperties
> import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}
>
> CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd")
>
> import org.apache.spark.sql.CarbonSession._
>
> val carbonSession = SparkSession.
>   builder().
>   appName("StreamExample").
>   getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/david")
>
> carbonSession.sparkContext.setLogLevel("INFO")
>
> def sql(sql: String) = carbonSession.sql(sql)
>
> def writeSocket(serverSocket: ServerSocket): Thread = {
>   val thread = new Thread() {
>     override def run(): Unit = {
>       // wait for a client connection request and accept it
>       val clientSocket = serverSocket.accept()
>       val socketWriter = new PrintWriter(clientSocket.getOutputStream())
>       var index = 0
>       for (_ <- 1 to 1000) {
>         // write 5 records per iteration
>         for (_ <- 0 to 100) {
>           index = index + 1
>           socketWriter.println(index.toString + ",name_" + index
>             + ",city_" + index + "," + (index * 1.00).toString
>             + ",school_" + index + ":school_" + index + index + "$" + index)
>         }
>         socketWriter.flush()
>         Thread.sleep(2000)
>       }
>       socketWriter.close()
>       System.out.println("Socket closed")
>     }
>   }
>   thread.start()
>   thread
> }
>
> def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, tableName: String, port: Int): Thread = {
>   val thread = new Thread() {
>     override def run(): Unit = {
>       var qry: StreamingQuery = null
>       try {
>         val readSocketDF = spark.readStream
>           .format("socket")
>           .option("host", "10.18.98.34")
>           .option("port", port)
>           .load()
>         qry = readSocketDF.writeStream
>           .format("carbondata")
>           .trigger(ProcessingTime("5 seconds"))
>           .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
>           .option("tablePath", tablePath.getPath).option("tableName", tableName)
>           .start()
>         qry.awaitTermination()
>       } catch {
>         case ex: Throwable =>
>           ex.printStackTrace()
>           println("Done reading and writing streaming data")
>       } finally {
>         qry.stop()
>       }
>     }
>   }
>   thread.start()
>   thread
> }
>
> val streamTableName = "stream_table"
>
> sql(s"CREATE TABLE $streamTableName (id INT,name STRING,city STRING,salary FLOAT) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 'sort_columns'='name')")
> sql(s"LOAD DATA LOCAL INPATH 'hdfs://hacluster/tmp/streamSample.csv' INTO TABLE $s
[jira] [Created] (CARBONDATA-1775) (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when socket streaming is in progress
Chetan Bhat created CARBONDATA-1775:
---------------------------------------

             Summary: (Carbon1.3.0 - Streaming) Select query fails with java.io.EOFException when socket streaming is in progress
                 Key: CARBONDATA-1775
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1775
             Project: CarbonData
          Issue Type: Bug
          Components: data-query
    Affects Versions: 1.3.0
         Environment: 3 node ant cluster
            Reporter: Chetan Bhat

Steps:
User starts the thrift server using the command - bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar "hdfs://hacluster/user/hive/warehouse/carbon.store"

User connects to the spark shell using the command - bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --jars /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar

In the spark shell the user creates a table and does a streaming load into the table as per the script.

import java.io.{File, PrintWriter}
import java.net.ServerSocket
import org.apache.spark.sql.{CarbonEnv, SparkSession}
import org.apache.spark.sql.hive.CarbonRelation
import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd")

import org.apache.spark.sql.CarbonSession._

val carbonSession = SparkSession.
  builder().
  appName("StreamExample").
  getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/david")

carbonSession.sparkContext.setLogLevel("INFO")

def sql(sql: String) = carbonSession.sql(sql)

def writeSocket(serverSocket: ServerSocket): Thread = {
  val thread = new Thread() {
    override def run(): Unit = {
      // wait for a client connection request and accept it
      val clientSocket = serverSocket.accept()
      val socketWriter = new PrintWriter(clientSocket.getOutputStream())
      var index = 0
      for (_ <- 1 to 1000) {
        // write 5 records per iteration
        for (_ <- 0 to 100) {
          index = index + 1
          socketWriter.println(index.toString + ",name_" + index
            + ",city_" + index + "," + (index * 1.00).toString
            + ",school_" + index + ":school_" + index + index + "$" + index)
        }
        socketWriter.flush()
        Thread.sleep(2000)
      }
      socketWriter.close()
      System.out.println("Socket closed")
    }
  }
  thread.start()
  thread
}

def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, tableName: String, port: Int): Thread = {
  val thread = new Thread() {
    override def run(): Unit = {
      var qry: StreamingQuery = null
      try {
        val readSocketDF = spark.readStream
          .format("socket")
          .option("host", "10.18.98.34")
          .option("port", port)
          .load()
        qry = readSocketDF.writeStream
          .format("carbondata")
          .trigger(ProcessingTime("5 seconds"))
          .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
          .option("tablePath", tablePath.getPath).option("tableName", tableName)
          .start()
        qry.awaitTermination()
      } catch {
        case ex: Throwable =>
          ex.printStackTrace()
          println("Done reading and writing streaming data")
      } finally {
        qry.stop()
      }
    }
  }
  thread.start()
  thread
}

val streamTableName = "stream_table"

sql(s"CREATE TABLE $streamTableName (id INT,name STRING,city STRING,salary FLOAT) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 'sort_columns'='name')")
sql(s"LOAD DATA LOCAL INPATH 'hdfs://hacluster/tmp/streamSample.csv' INTO TABLE $streamTableName OPTIONS('HEADER'='true')")
sql(s"select * from $streamTableName").show

val carbonTable = CarbonEnv.getInstance(carbonSession).carbonMetastore.
  lookupRelation(Some("default"), streamTableName)(carbonSession).asInstanceOf[CarbonRelation].carbonTable
val tablePath = CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)
val port = 7995
val serverSocket = new ServerSocket(port)
val socketThread = writeSocket(serverSocket)
val streamingThread = startStreaming(carbonSession, tablePath, streamTableName, port)

While the load is in progress, the user executes a select query on the streaming table from beeline.
0: jdbc:hive2://10.18.98.34:23040> select * from strea
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1516 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1305/ ---
[GitHub] carbondata issue #1499: [WIP][CARBONDATA-1235]Add Lucene Datamap
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1499 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1766/ ---
[GitHub] carbondata issue #1514: [CARBONDATA-1746] Count star optimization
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1514 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1304/ ---
[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1525#discussion_r151912577

    --- Diff: integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala ---
    @@ -18,12 +18,10 @@ package org.apache.spark.carbondata
     
     import scala.collection.mutable
    -
    --- End diff --

    OK, Done

---
[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...
Github user xubo245 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1525#discussion_r151912584

    --- Diff: integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala ---
    @@ -18,12 +18,10 @@ package org.apache.spark.carbondata
     
     import scala.collection.mutable
    -
     import org.apache.spark.sql.common.util.Spark2QueryTest
     import org.apache.spark.sql.types._
    -import org.apache.spark.sql.{Row, SaveMode}
    +import org.apache.spark.sql.{AnalysisException, Row, SaveMode}
     import org.scalatest.BeforeAndAfterAll
    -
    --- End diff --

    OK, Done

---
[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...
Github user chenerlu commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1525#discussion_r151911286

    --- Diff: integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala ---
    @@ -18,12 +18,10 @@ package org.apache.spark.carbondata
     
     import scala.collection.mutable
    -
    --- End diff --

    Suggest keeping this blank line

---
[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...
Github user chenerlu commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1525#discussion_r151911377

    --- Diff: integration/spark2/src/test/scala/org/apache/spark/carbondata/CarbonDataSourceSuite.scala ---
    @@ -18,12 +18,10 @@ package org.apache.spark.carbondata
     
     import scala.collection.mutable
    -
     import org.apache.spark.sql.common.util.Spark2QueryTest
     import org.apache.spark.sql.types._
    -import org.apache.spark.sql.{Row, SaveMode}
    +import org.apache.spark.sql.{AnalysisException, Row, SaveMode}
     import org.scalatest.BeforeAndAfterAll
    -
    --- End diff --

    Suggest keeping this blank line

---
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1516 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1303/ ---
[jira] [Assigned] (CARBONDATA-1774) Not able to fetch data from a table with Boolean data type in presto
[ https://issues.apache.org/jira/browse/CARBONDATA-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

anubhav tarar reassigned CARBONDATA-1774:
-----------------------------------------

    Assignee: anubhav tarar

> Not able to fetch data from a table with Boolean data type in presto
> ---------------------------------------------------------------------
>
>                 Key: CARBONDATA-1774
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1774
>             Project: CarbonData
>          Issue Type: Bug
>          Components: presto-integration
>    Affects Versions: 1.3.0
>         Environment: spark 2.1
>            Reporter: Vandana Yadav
>            Assignee: anubhav tarar
>            Priority: Minor
>
> Not able to fetch data from a table with Boolean data type in presto.
> Steps to Reproduce:
> On Beeline:
> 1) Create the table:
> create table boolean ( id int, employee boolean) stored by 'carbondata';
> 2) Insert values into the table:
> insert into boolean values (1,true);
> insert into boolean values (2,false);
> insert into boolean values (5,true);
> insert into boolean values (3,true);
> insert into boolean values (4,false);
> 3) Execute select queries with and without the boolean data type:
> a) select id from boolean;
> output:
> +-----+--+
> | id  |
> +-----+--+
> | 2   |
> | 3   |
> | 4   |
> | 5   |
> | 1   |
> +-----+--+
> b) select employee from boolean;
> output:
> +-----------+--+
> | employee  |
> +-----------+--+
> | false     |
> | true      |
> | true      |
> | false     |
> | true      |
> +-----------+--+
> c) select * from boolean;
> output:
> +-----+-----------+--+
> | id  | employee  |
> +-----+-----------+--+
> | 1   | true      |
> | 3   | true      |
> | 4   | false     |
> | 5   | true      |
> | 2   | false     |
> +-----+-----------+--+
> On Presto CLI:
> Execute queries with and without the boolean data type:
> a) select id from boolean;
> output:
>  id
> ----
>   2
>   5
>   1
>   3
>   4
> (5 rows)
> b) select employee from boolean;
> Expected output: it should display the boolean values of the employee column as on beeline.
> Actual output:
> Query 20171120_054640_00011_2ppsk, FAILED, 1 node
> Splits: 21 total, 0 done (0.00%)
> 0:01 [0 rows, 0B] [0 rows/s, 0B/s]
> Query 20171120_054640_00011_2ppsk failed: com.facebook.presto.spi.type.BooleanType
> c) select * from boolean;
> Expected output: it should display the boolean values of the employee column as on beeline.
> Actual output:
> Query 20171120_054858_00012_2ppsk, FAILED, 1 node
> Splits: 21 total, 0 done (0.00%)
> 0:00 [0 rows, 0B] [0 rows/s, 0B/s]
> Query 20171120_054858_00012_2ppsk failed: com.facebook.presto.spi.type.BooleanType
[jira] [Created] (CARBONDATA-1774) Not able to fetch data from a table with Boolean data type in presto
Vandana Yadav created CARBONDATA-1774:
------------------------------------------

             Summary: Not able to fetch data from a table with Boolean data type in presto
                 Key: CARBONDATA-1774
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1774
             Project: CarbonData
          Issue Type: Bug
          Components: presto-integration
    Affects Versions: 1.3.0
         Environment: spark 2.1
            Reporter: Vandana Yadav
            Priority: Minor

Not able to fetch data from a table with Boolean data type in presto.

Steps to Reproduce:

On Beeline:
1) Create the table:
create table boolean ( id int, employee boolean) stored by 'carbondata';
2) Insert values into the table:
insert into boolean values (1,true);
insert into boolean values (2,false);
insert into boolean values (5,true);
insert into boolean values (3,true);
insert into boolean values (4,false);
3) Execute select queries with and without the boolean data type:
a) select id from boolean;
output:
+-----+--+
| id  |
+-----+--+
| 2   |
| 3   |
| 4   |
| 5   |
| 1   |
+-----+--+
b) select employee from boolean;
output:
+-----------+--+
| employee  |
+-----------+--+
| false     |
| true      |
| true      |
| false     |
| true      |
+-----------+--+
c) select * from boolean;
output:
+-----+-----------+--+
| id  | employee  |
+-----+-----------+--+
| 1   | true      |
| 3   | true      |
| 4   | false     |
| 5   | true      |
| 2   | false     |
+-----+-----------+--+

On Presto CLI:
Execute queries with and without the boolean data type:
a) select id from boolean;
output:
 id
----
  2
  5
  1
  3
  4
(5 rows)
b) select employee from boolean;
Expected output: it should display the boolean values of the employee column as on beeline.
Actual output:
Query 20171120_054640_00011_2ppsk, FAILED, 1 node
Splits: 21 total, 0 done (0.00%)
0:01 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20171120_054640_00011_2ppsk failed: com.facebook.presto.spi.type.BooleanType
c) select * from boolean;
Expected output: it should display the boolean values of the employee column as on beeline.
Actual output:
Query 20171120_054858_00012_2ppsk, FAILED, 1 node
Splits: 21 total, 0 done (0.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20171120_054858_00012_2ppsk failed: com.facebook.presto.spi.type.BooleanType
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1516 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1765/ ---
[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1525 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1302/ ---
[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...
Github user dhatchayani commented on the issue: https://github.com/apache/carbondata/pull/1535 retest sdv please ---
[GitHub] carbondata issue #1438: [CARBONDATA-1649]insert overwrite fix during job int...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1438 Please rebase @akashrn5 ---
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1516 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1300/ ---
[GitHub] carbondata issue #1496: [CARBONDATA-1709][DataFrame] Support sort_columns op...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1496 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1764/ ---
[GitHub] carbondata issue #1460: [Docs] Fix partition-guide.md docs NUM_PARTITIONS wr...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1460 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1299/ ---
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1516 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1298/ ---
[GitHub] carbondata issue #1499: [WIP][CARBONDATA-1235]Add Lucene Datamap
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1499 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1297/ ---
[jira] [Resolved] (CARBONDATA-1767) Remove dependency of Java 1.8
[ https://issues.apache.org/jira/browse/CARBONDATA-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang Chen resolved CARBONDATA-1767.
------------------------------------
    Resolution: Fixed

> Remove dependency of Java 1.8
> -----------------------------
>
>                 Key: CARBONDATA-1767
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1767
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Jacky Li
>            Assignee: Jacky Li
>             Fix For: 1.3.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> carbon should be able to compile with Java 1.7
[GitHub] carbondata pull request #1531: [CARBONDATA-1767] Remove dependency of Java 1...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1531 ---
[GitHub] carbondata issue #1460: [Docs] Fix partition-guide.md docs NUM_PARTITIONS wr...
Github user LiShuMing commented on the issue: https://github.com/apache/carbondata/pull/1460 @chenliang613 already done. ---
[GitHub] carbondata issue #1531: [CARBONDATA-1767] Remove dependency of Java 1.8
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/1531 LGTM ---
[jira] [Resolved] (CARBONDATA-1768) Upgrade univocity parser to 2.2.1
[ https://issues.apache.org/jira/browse/CARBONDATA-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang Chen resolved CARBONDATA-1768.
------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0

> Upgrade univocity parser to 2.2.1
> ---------------------------------
>
>                 Key: CARBONDATA-1768
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1768
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Jacky Li
>            Assignee: Jacky Li
>             Fix For: 1.3.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Univocity CSV parser has improved performance in 2.2.1, upgrade the dependency to use it
[GitHub] carbondata pull request #1532: [CARBONDATA-1768] Upgrade univocity parser to...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1532 ---
[GitHub] carbondata issue #1532: [CARBONDATA-1768] Upgrade univocity parser to 2.2.1
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/1532 LGTM ---
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1516 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1296/ ---
[jira] [Resolved] (CARBONDATA-1769) Change alterTableCompaction to support transfer tableInfo
[ https://issues.apache.org/jira/browse/CARBONDATA-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1769.
----------------------------------
    Resolution: Fixed

> Change alterTableCompaction to support transfer tableInfo
> ----------------------------------------------------------
>
>                 Key: CARBONDATA-1769
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1769
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: spark-integration
>            Reporter: xubo245
>            Assignee: xubo245
>            Priority: Minor
>             Fix For: 1.3.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Change alterTableCompaction to support transfer tableInfo
[GitHub] carbondata pull request #1533: [CARBONDATA-1769] Change alterTableCompaction...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1533 ---
[GitHub] carbondata issue #1533: [CARBONDATA-1769] Change alterTableCompaction to sup...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1533 LGTM ---
[jira] [Resolved] (CARBONDATA-1766) fix serialization issue for CarbonAppendableStreamSink
[ https://issues.apache.org/jira/browse/CARBONDATA-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1766.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0

> fix serialization issue for CarbonAppendableStreamSink
> -------------------------------------------------------
>
>                 Key: CARBONDATA-1766
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1766
>             Project: CarbonData
>          Issue Type: Bug
>          Components: spark-integration
>            Reporter: QiangCai
>             Fix For: 1.3.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> fix serialization issue for CarbonAppendableStreamSink
[GitHub] carbondata pull request #1530: [CARBONDATA-1766] fix serialization issue for...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1530 ---
[GitHub] carbondata issue #1530: [CARBONDATA-1766] fix serialization issue for Carbon...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1530 LGTM ---
[GitHub] carbondata issue #1496: [CARBONDATA-1709][DataFrame] Support sort_columns op...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1496 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1295/ ---
[GitHub] carbondata issue #1499: [WIP][CARBONDATA-1235]Add Lucene Datamap
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1499 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1294/ ---
[GitHub] carbondata issue #1469: [WIP] Spark-2.2 Carbon Integration - Phase 1
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1469 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1293/ ---
[GitHub] carbondata issue #1469: [WIP] Spark-2.2 Carbon Integration - Phase 1
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1469 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1762/ ---
[GitHub] carbondata issue #1469: [WIP] Spark-2.2 Carbon Integration - Phase 1
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1469 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1292/ ---
[GitHub] carbondata issue #1469: [WIP] Spark-2.2 Carbon Integration - Phase 1
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1469 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1761/ ---
[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1535 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1760/ ---
[jira] [Created] (CARBONDATA-1773) [Streaming] carbon StreamWriter task fails with ClassCastException
Babulal created CARBONDATA-1773:
-----------------------------------

             Summary: [Streaming] carbon StreamWriter task fails with ClassCastException
                 Key: CARBONDATA-1773
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1773
             Project: CarbonData
          Issue Type: Bug
    Affects Versions: 1.3.0
            Reporter: Babulal
         Attachments: streamingLog.log

Run the below sequence of commands in spark-shell (bin/spark-shell --jars /opt/carbon/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar --master yarn-client --executor-memory 1G --executor-cores 2 --driver-memory 1G):

// carbon is a SparkSession built with CarbonStateBuilder
carbon.sql("create table stable (value String,count String) STORED BY 'carbondata' TBLPROPERTIES ('streaming' = 'true')")

val lines = carbon.readStream.format("socket")
  .option("host", "localhost")
  .option("port", )   // port number missing in the original report
  .load()
val words = lines.as[String].flatMap(_.split(" "))
val wordCounts = words.groupBy("value").count()

val carbonTable = CarbonEnv.getCarbonTable(Some("default"), "stable")(carbon)
val tablePath = CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)

val qry = wordCounts.writeStream
  .format("carbondata")
  .outputMode("complete")
  .trigger(ProcessingTime("1 seconds"))
  .option("tablePath", tablePath.getPath)
  .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
  .option("tableName", "stable")
  .start()

scala> qry.awaitTermination()

Now in another window run the below command:

root@master ~ # nc -lk babu

Check the spark-shell:

[Stage 1:> (0 + 6) / 200]
17/11/19 17:59:57 WARN TaskSetManager: Lost task 2.0 in stage 1.0 (TID 3, slave1, executor 2): org.apache.carbondata.streaming.CarbonStreamException: Task failed while writing rows
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:286)
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:192)
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:191)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field scala.collection.convert.Wrappers$SeqWrapper.underlying of type scala.collection.Seq in instance of scala.collection.convert.Wrappers$SeqWrapper
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2251)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at org.apache.carbondata.hadoop.util.ObjectSerializationUtil.convertStringToObject(ObjectSerializationUtil.java:99)

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Created] (CARBONDATA-1772) [Streaming] carbon StreamWriter task fails with ClassCastException
Babulal created CARBONDATA-1772:
-----------------------------------

             Summary: [Streaming] carbon StreamWriter task fails with ClassCastException
                 Key: CARBONDATA-1772
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1772
             Project: CarbonData
          Issue Type: Bug
    Affects Versions: 1.3.0
            Reporter: Babulal
         Attachments: streamingLog.log

Run the below sequence of commands in spark-shell (bin/spark-shell --jars /opt/carbon/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar --master yarn-client --executor-memory 1G --executor-cores 2 --driver-memory 1G):

// carbon is a SparkSession built with CarbonStateBuilder
carbon.sql("create table stable (value String,count String) STORED BY 'carbondata' TBLPROPERTIES ('streaming' = 'true')")

val lines = carbon.readStream.format("socket")
  .option("host", "localhost")
  .option("port", )   // port number missing in the original report
  .load()
val words = lines.as[String].flatMap(_.split(" "))
val wordCounts = words.groupBy("value").count()

val carbonTable = CarbonEnv.getCarbonTable(Some("default"), "stable")(carbon)
val tablePath = CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier)

val qry = wordCounts.writeStream
  .format("carbondata")
  .outputMode("complete")
  .trigger(ProcessingTime("1 seconds"))
  .option("tablePath", tablePath.getPath)
  .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
  .option("tableName", "stable")
  .start()

scala> qry.awaitTermination()

Now in another window run the below command:

root@master ~ # nc -lk babu

Check the spark-shell:

[Stage 1:> (0 + 6) / 200]
17/11/19 17:59:57 WARN TaskSetManager: Lost task 2.0 in stage 1.0 (TID 3, slave1, executor 2): org.apache.carbondata.streaming.CarbonStreamException: Task failed while writing rows
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:286)
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:192)
    at org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:191)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field scala.collection.convert.Wrappers$SeqWrapper.underlying of type scala.collection.Seq in instance of scala.collection.convert.Wrappers$SeqWrapper
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2251)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2245)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2169)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2027)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
    at org.apache.carbondata.hadoop.util.ObjectSerializationUtil.convertStringToObject(ObjectSerializationUtil.java:99)

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
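The ClassCastException in the two reports above ("cannot assign instance of scala.collection.immutable.List$SerializationProxy to field scala.collection.convert.Wrappers$SeqWrapper.underlying") is the usual symptom of Java-serializing the lazy Java view that .asJava builds over a Scala Seq and deserializing it in a JVM with a different classloader, as happens when Spark ships task state to executors. A common workaround, shown here as a minimal self-contained Scala sketch (the helper name and example data are illustrative, not CarbonData code), is to copy the view into a plain java.util.ArrayList before serialization so that only JDK classes cross the wire:

import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util
import scala.collection.JavaConverters._

object SeqWrapperWorkaround {
  // Copy the lazy Java view (a scala.collection.convert.Wrappers$SeqWrapper)
  // into a plain java.util.ArrayList, so only JDK classes are Java-serialized.
  def toSerializableList[T](scalaSeq: Seq[T]): util.ArrayList[T] =
    new util.ArrayList[T](scalaSeq.asJava)

  def main(args: Array[String]): Unit = {
    val cols: Seq[String] = List("value", "count")
    val safe = toSerializableList(cols)

    // Round-trip through Java serialization, as Spark does when shipping task state.
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    out.writeObject(safe)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    println(in.readObject().asInstanceOf[util.ArrayList[String]]) // [value, count]
    in.close()
  }
}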
[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1535 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1291/ ---
[GitHub] carbondata issue #1531: [CARBONDATA-1767] Remove dependency of Java 1.8
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1531 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1759/ ---
[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...
Github user dhatchayani commented on the issue: https://github.com/apache/carbondata/pull/1535 Retest this please ---
[GitHub] carbondata issue #1535: [CARBONDATA-1771] While segment_index compaction, .c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1535 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1290/ ---
[GitHub] carbondata issue #1532: [CARBONDATA-1768] Upgrade univocity parser to 2.2.1
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1532 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1758/ ---
[GitHub] carbondata pull request #1535: [CARBONDATA-1771] While segment_index compact...
GitHub user dhatchayani opened a pull request:

    https://github.com/apache/carbondata/pull/1535

    [CARBONDATA-1771] While segment_index compaction, .carbonindex files of invalid segments are also getting merged

**Scenario:**
1. Disable the feature, do loads, and execute MINOR compaction.
2. Execute SEGMENT_INDEX compaction.
SEGMENT_INDEX compaction then merges the .carbonindex files of the compacted (invalid) segments as well.

**Solution:**
Merge the index files of valid segments only (see the sketch after this message).

- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [X] Testing done
      UT added
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dhatchayani/incubator-carbondata index_compaction

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1535.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1535

----
commit 9462791bb5ef5eb9dbc9f7c3a8c97ef3e6ad7dfb
Author: dhatchayani
Date:   2017-11-19T15:23:50Z

    [CARBONDATA-1771] While segment_index compaction, .carbonindex files of invalid segments are also getting merged

---
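A minimal Scala sketch of the solution described in the PR above: filter on segment validity before the index-merge loop, so segments invalidated by an earlier MINOR compaction never reach the merge step. All names here (SegmentStatus, LoadMetadataDetails, mergeValidSegmentIndexes) are illustrative stand-ins, not CarbonData's actual API:

sealed trait SegmentStatus
case object Success extends SegmentStatus
case object Compacted extends SegmentStatus       // segment invalidated by compaction
case object MarkedForDelete extends SegmentStatus // segment invalidated by delete

// Hypothetical stand-in for a segment's load metadata entry.
case class LoadMetadataDetails(segmentId: String, status: SegmentStatus)

object SegmentIndexCompaction {
  // Merge .carbonindex files only for segments that are still valid;
  // compacted or deleted segments are filtered out up front.
  def mergeValidSegmentIndexes(allSegments: Seq[LoadMetadataDetails])
                              (mergeIndexFilesOfSegment: String => Unit): Unit =
    allSegments.filter(_.status == Success)
               .foreach(s => mergeIndexFilesOfSegment(s.segmentId))

  def main(args: Array[String]): Unit = {
    val segments = Seq(
      LoadMetadataDetails("0", Compacted),
      LoadMetadataDetails("1", Compacted),
      LoadMetadataDetails("0.1", Success),
      LoadMetadataDetails("2", Success))
    // Only segments "0.1" and "2" are merged; "0" and "1" are skipped.
    mergeValidSegmentIndexes(segments)(id => println(s"merging index files of segment $id"))
  }
}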
[jira] [Updated] (CARBONDATA-1771) While segment_index compaction, .carbonindex files of invalid segments are also getting merged
[ https://issues.apache.org/jira/browse/CARBONDATA-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhatchayani updated CARBONDATA-1771:
------------------------------------
    Summary: While segment_index compaction, .carbonindex files of invalid segments are also getting merged  (was: While segment_index compaction, invalid segments .carbonindex files are also getting merged)

> While segment_index compaction, .carbonindex files of invalid segments are also getting merged
> ----------------------------------------------------------------------------------------------
>
>           Key: CARBONDATA-1771
>           URL: https://issues.apache.org/jira/browse/CARBONDATA-1771
>       Project: CarbonData
>    Issue Type: Improvement
>      Reporter: dhatchayani
>      Assignee: dhatchayani
>      Priority: Minor
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata issue #1533: [WIP][CARBONDATA-1769] Change alterTableCompaction t...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1533 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1757/ ---
[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1508 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1289/ ---
[jira] [Created] (CARBONDATA-1771) While segment_index compaction, invalid segments .carbonindex files are also getting merged
dhatchayani created CARBONDATA-1771:
---------------------------------------

       Summary: While segment_index compaction, invalid segments .carbonindex files are also getting merged
           Key: CARBONDATA-1771
           URL: https://issues.apache.org/jira/browse/CARBONDATA-1771
       Project: CarbonData
    Issue Type: Improvement
      Reporter: dhatchayani
      Assignee: dhatchayani
      Priority: Minor

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/1508 retest this please ---
[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1508 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1288/ ---
[GitHub] carbondata issue #1534: [CARBONDATA-1770] Update error docs and consolidate ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1534 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1756/ ---
[GitHub] carbondata issue #1532: [CARBONDATA-1768] Upgrade univocity parser to 2.2.1
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1532 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1287/ ---
[GitHub] carbondata issue #1167: [CARBONDATA-1304] [IUD BuggFix] Iud with single pass
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1167 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1755/ ---
[GitHub] carbondata issue #1534: [CARBONDATA-1770] Update error docs and consolidate ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1534 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1286/ ---
[GitHub] carbondata issue #1532: [CARBONDATA-1768] Upgrade univocity parser to 2.2.1
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/1532 retest this please ---
[GitHub] carbondata pull request #1534: [CARBONDATA-1770] Update documents and consol...
GitHub user chenliang613 opened a pull request:

    https://github.com/apache/carbondata/pull/1534

    [CARBONDATA-1770] Update documents and consolidate DDL, DML, Partition docs

1. Update documents: some of the existing descriptions are erroneous.
2. Consolidate the Data Management, DDL, DML, and Partition docs, so that each feature is described in only one place.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [X] Any interfaces changed? NA
- [X] Any backward compatibility impacted? NA
- [X] Document update required? YES
- [X] Testing done NA
- [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. YES

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chenliang613/carbondata update_docs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1534.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1534

----
commit a0333be14051166072fb9865dc0623ee1473c92e
Author: chenliang613
Date:   2017-11-19T13:12:11Z

    [CARBONDATA-1770] Update documents and consolidate DDL,DML,Partition docs

---
[jira] [Created] (CARBONDATA-1770) Update documents and consolidate DDL,DML,Partition docs
Liang Chen created CARBONDATA-1770:
--------------------------------------

       Summary: Update documents and consolidate DDL,DML,Partition docs
           Key: CARBONDATA-1770
           URL: https://issues.apache.org/jira/browse/CARBONDATA-1770
       Project: CarbonData
    Issue Type: Improvement
    Components: docs
      Reporter: Liang Chen
      Assignee: Liang Chen

1. Update documents: some of the existing descriptions are erroneous.
2. Consolidate the Data Management, DDL, DML, and Partition docs, so that each feature is described in only one place.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata issue #1533: [WIP][CARBONDATA-1769] Change alterTableCompaction t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1533 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1285/ ---
[GitHub] carbondata pull request #1533: [WIP][CARBONDATA-1769] Change alterTableCompa...
GitHub user xubo245 opened a pull request:

    https://github.com/apache/carbondata/pull/1533

    [WIP][CARBONDATA-1769] Change alterTableCompaction to support transfer tableInfo

Change alterTableCompaction to support transfer tableInfo.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

- [ ] Any interfaces changed? No
- [ ] Any backward compatibility impacted? No
- [ ] Document update required? No
- [ ] Testing done No
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. MR119

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata CompactionGithub

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1533.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1533

----
commit 1605179cd3b8c8db18027537123b57fa3976ecc3
Author: xubo245 <601450...@qq.com>
Date:   2017-11-19T12:06:00Z

    [CARBONDATA-1769] Change alterTableCompaction to support transfer tableInfo

---
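A rough, hedged sketch of what "support transfer tableInfo" could look like in practice: the compaction entry point accepts an optional, already-loaded TableInfo so that callers holding the table metadata avoid a second metastore lookup, while other callers fall back to a lookup. The types and signature below are assumptions for illustration, not the PR's actual code:

// Hypothetical, simplified stand-ins for the real CarbonData types.
case class TableInfo(databaseName: String, tableName: String)
case class AlterTableModel(dbName: Option[String], tableName: String, compactionType: String)

object AlterTableCompactionSketch {
  // Reuse the caller's TableInfo when provided; otherwise resolve it via lookup.
  def alterTableCompaction(model: AlterTableModel,
                           tableInfo: Option[TableInfo],
                           lookup: (Option[String], String) => TableInfo): Unit = {
    val info = tableInfo.getOrElse(lookup(model.dbName, model.tableName))
    println(s"compacting ${info.databaseName}.${info.tableName} [${model.compactionType}]")
  }

  def main(args: Array[String]): Unit = {
    val lookup = (db: Option[String], tbl: String) => TableInfo(db.getOrElse("default"), tbl)
    val model = AlterTableModel(Some("test_db2"), "lineitem1", "MINOR")
    // With a pre-fetched TableInfo: no extra lookup needed.
    alterTableCompaction(model, Some(TableInfo("test_db2", "lineitem1")), lookup)
    // Without one: fall back to the lookup.
    alterTableCompaction(model, None, lookup)
  }
}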
[jira] [Created] (CARBONDATA-1769) Change alterTableCompaction to support transfer tableInfo
xubo245 created CARBONDATA-1769:
-----------------------------------

       Summary: Change alterTableCompaction to support transfer tableInfo
           Key: CARBONDATA-1769
           URL: https://issues.apache.org/jira/browse/CARBONDATA-1769
       Project: CarbonData
    Issue Type: Improvement
    Components: spark-integration
      Reporter: xubo245
      Assignee: xubo245
      Priority: Minor
       Fix For: 1.3.0

Change alterTableCompaction to support transfer tableInfo

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1525 I have fixed the conflicts; please review it again @jackylk ---
[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1525 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1284/ ---
[GitHub] carbondata issue #1525: [CARBONDATA-1751] Make the type of exception and mes...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1525 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1283/ ---