[GitHub] carbondata pull request #1559: [CARBONDATA-1805][Dictionary] Optimize prunin...

2017-12-13 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1559#discussion_r156873416
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala
 ---
@@ -348,36 +347,53 @@ object GlobalDictionaryUtil {
   }
 
   /**
-   * load CSV files to DataFrame by using datasource "com.databricks.spark.csv"
+   * load and prune dictionary Rdd from csv file or input dataframe
*
-   * @param sqlContext  SQLContext
-   * @param carbonLoadModel carbon data load model
+   * @param sqlContext sqlContext
+   * @param carbonLoadModel carbonLoadModel
+   * @param inputDF input dataframe
+   * @param requiredCols names of dictionary column
+   * @param hadoopConf hadoop configuration
+   * @return rdd that contains only dictionary columns
*/
-  def loadDataFrame(sqlContext: SQLContext,
-  carbonLoadModel: CarbonLoadModel,
-  hadoopConf: Configuration): DataFrame = {
-CommonUtil.configureCSVInputFormat(hadoopConf, carbonLoadModel)
-hadoopConf.set(FileInputFormat.INPUT_DIR, carbonLoadModel.getFactFilePath)
-val columnNames = carbonLoadModel.getCsvHeaderColumns
-val schema = StructType(columnNames.map[StructField, Array[StructField]] { column =>
-  StructField(column, StringType)
-})
-val values = new Array[String](columnNames.length)
-val row = new StringArrayRow(values)
-val jobConf = new JobConf(hadoopConf)
-SparkHadoopUtil.get.addCredentials(jobConf)
-TokenCache.obtainTokensForNamenodes(jobConf.getCredentials,
-  Array[Path](new Path(carbonLoadModel.getFactFilePath)),
-  jobConf)
-val rdd = new NewHadoopRDD[NullWritable, StringArrayWritable](
-  sqlContext.sparkContext,
-  classOf[CSVInputFormat],
-  classOf[NullWritable],
-  classOf[StringArrayWritable],
-  jobConf).setName("global dictionary").map[Row] { currentRow =>
-  row.setValues(currentRow._2.get())
+  private def loadInputDataAsDictRdd(sqlContext: SQLContext, carbonLoadModel: CarbonLoadModel,
+  inputDF: Option[DataFrame], requiredCols: Array[String],
+  hadoopConf: Configuration): RDD[Row] = {
+if (inputDF.isDefined) {
+  inputDF.get.select(requiredCols.head, requiredCols.tail : _*).rdd
+} else {
+  CommonUtil.configureCSVInputFormat(hadoopConf, carbonLoadModel)
+  hadoopConf.set(FileInputFormat.INPUT_DIR, carbonLoadModel.getFactFilePath)
+  val headerCols = carbonLoadModel.getCsvHeaderColumns.map(_.toLowerCase)
+  val header2Idx = headerCols.zipWithIndex.toMap
+  // index of dictionary columns in header
+  val dictColIdx = requiredCols.map(c => header2Idx(c.toLowerCase))
+
+  val jobConf = new JobConf(hadoopConf)
+  SparkHadoopUtil.get.addCredentials(jobConf)
+  TokenCache.obtainTokensForNamenodes(jobConf.getCredentials,
+Array[Path](new Path(carbonLoadModel.getFactFilePath)),
+jobConf)
+  val dictRdd = new NewHadoopRDD[NullWritable, StringArrayWritable](
+sqlContext.sparkContext,
+classOf[CSVInputFormat],
+classOf[NullWritable],
+classOf[StringArrayWritable],
+jobConf).setName("global dictionary").map[Row] { currentRow =>
--- End diff --

Move `setName` and `map` to separate lines.
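
A minimal sketch of the column-pruning step in the diff above, written against plain Scala collections so that it runs without Spark or CarbonData. The object name `DictColumnPruningSketch` and its helper names are hypothetical, and the short-row filter is an assumption added for illustration, not something the pull request is known to do. The chained calls sit on separate lines, in the style the review comment asks for.

```scala
object DictColumnPruningSketch {
  // Resolve each required column to its position in the CSV header, ignoring case.
  // Throws NoSuchElementException if a required column is missing from the header.
  def dictColumnIndexes(header: Array[String], requiredCols: Array[String]): Array[Int] = {
    val header2Idx = header.map(_.toLowerCase).zipWithIndex.toMap
    requiredCols.map(c => header2Idx(c.toLowerCase))
  }

  // Keep only the dictionary columns of every parsed CSV row.
  // Assumption for this sketch: rows too short to hold every dictionary column are skipped.
  def pruneRows(rows: Seq[Array[String]], dictColIdx: Array[Int]): Seq[Seq[String]] =
    rows
      .filter(row => dictColIdx.forall(_ < row.length))
      .map(row => dictColIdx.toSeq.map(i => row(i)))
}
```

With a header of `ID, Name, City` and required columns `city, id`, the resolved indexes are 2 and 0, so each row is reduced to its third and first fields, in that order.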


---


[GitHub] carbondata issue #1658: [CARBONDATA-1680] Fixed Bug to show partition Ids fo...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1658
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/730/



---


[GitHub] carbondata issue #1658: [CARBONDATA-1680] Fixed Bug to show partition Ids fo...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1658
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1958/



---


[GitHub] carbondata pull request #1559: [CARBONDATA-1805][Dictionary] Optimize prunin...

2017-12-13 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1559#discussion_r156871399
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala
 ---
@@ -348,36 +347,53 @@ object GlobalDictionaryUtil {
   }
 
   /**
-   * load CSV files to DataFrame by using datasource "com.databricks.spark.csv"
+   * load and prune dictionary Rdd from csv file or input dataframe
*
-   * @param sqlContext  SQLContext
-   * @param carbonLoadModel carbon data load model
+   * @param sqlContext sqlContext
+   * @param carbonLoadModel carbonLoadModel
+   * @param inputDF input dataframe
+   * @param requiredCols names of dictionary column
+   * @param hadoopConf hadoop configuration
+   * @return rdd that contains only dictionary columns
*/
-  def loadDataFrame(sqlContext: SQLContext,
-  carbonLoadModel: CarbonLoadModel,
-  hadoopConf: Configuration): DataFrame = {
-CommonUtil.configureCSVInputFormat(hadoopConf, carbonLoadModel)
-hadoopConf.set(FileInputFormat.INPUT_DIR, carbonLoadModel.getFactFilePath)
-val columnNames = carbonLoadModel.getCsvHeaderColumns
-val schema = StructType(columnNames.map[StructField, Array[StructField]] { column =>
-  StructField(column, StringType)
-})
-val values = new Array[String](columnNames.length)
-val row = new StringArrayRow(values)
-val jobConf = new JobConf(hadoopConf)
-SparkHadoopUtil.get.addCredentials(jobConf)
-TokenCache.obtainTokensForNamenodes(jobConf.getCredentials,
-  Array[Path](new Path(carbonLoadModel.getFactFilePath)),
-  jobConf)
-val rdd = new NewHadoopRDD[NullWritable, StringArrayWritable](
-  sqlContext.sparkContext,
-  classOf[CSVInputFormat],
-  classOf[NullWritable],
-  classOf[StringArrayWritable],
-  jobConf).setName("global dictionary").map[Row] { currentRow =>
-  row.setValues(currentRow._2.get())
+  private def loadInputDataAsDictRdd(sqlContext: SQLContext, carbonLoadModel: CarbonLoadModel,
--- End diff --

Please move the parameters to separate lines, one parameter per line.
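
A sketch of the requested layout, assuming the parameter list from the diff above. The case classes are hypothetical stand-ins for the Spark, Hadoop, and CarbonData types so that the snippet compiles on its own. The body and the `String` return type are placeholders, since the real method returns `RDD[Row]`.

```scala
object SignatureLayoutSketch {
  // Hypothetical stand-ins; the real types come from Spark, Hadoop, and CarbonData.
  final case class SQLContext(appName: String)
  final case class CarbonLoadModel(factFilePath: String)
  final case class DataFrame(columns: Seq[String])
  final case class Configuration(settings: Map[String, String])

  // One parameter per line.
  def loadInputDataAsDictRdd(sqlContext: SQLContext,
      carbonLoadModel: CarbonLoadModel,
      inputDF: Option[DataFrame],
      requiredCols: Array[String],
      hadoopConf: Configuration): String = {
    // Placeholder body: report only which input branch would be taken.
    if (inputDF.isDefined) "dataframe" else "csv"
  }
}
```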


---


[GitHub] carbondata issue #1647: [CARBONDATA-1887] block pruning not happening is car...

2017-12-13 Thread mohammadshahidkhan
Github user mohammadshahidkhan commented on the issue:

https://github.com/apache/carbondata/pull/1647
  
retest sdv please


---


[GitHub] carbondata issue #1658: [CARBONDATA-1680] Fixed Bug to show partition Ids fo...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1658
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1957/



---


[GitHub] carbondata issue #1658: [CARBONDATA-1680] Fixed Bug to show partition Ids fo...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1658
  
Build Failed with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/729/



---


[GitHub] carbondata pull request #1601: [CARBONDATA-1787] Validation for table proper...

2017-12-13 Thread geetikagupta16
Github user geetikagupta16 closed the pull request at:

https://github.com/apache/carbondata/pull/1601


---


[GitHub] carbondata issue #1648: [CARBONDATA-1888][PreAggregate][Bug]Fixed compaction...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1648
  
@kumarvishal09 can you check the failed test case?


---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1956/



---


[jira] [Resolved] (CARBONDATA-1878) JVM crash after off-heap-sort disabled

2017-12-13 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-1878.
-
   Resolution: Fixed
Fix Version/s: 1.3.0

> JVM crash after off-heap-sort disabled
> --
>
> Key: CARBONDATA-1878
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1878
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.3.0
>Reporter: xuchuanyin
>Assignee: xuchuanyin
> Fix For: 1.3.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> # SCENARIO
> Recently I fixed some issues in CarbonData. To fully test all the code that I 
> modified, I ran the whole test suite for several iterations. Each iteration 
> started with a different key configuration that affects the code flow.
> After I set `enable.offheap.sort=false` (the default value is true) in the 
> configuration, running the tests always ends with a JVM crash. The error 
> message is shown below:
> ```
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7f346b207ff1, pid=144619, tid=0x7f346c2fc700
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_111-b14) (build 
> 1.8.0_111-b14)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.111-b14 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0xa90ff1]  Unsafe_SetNativeShort+0x51
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/xu/ws/carbondata/hs_err_pid144619.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
> Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
> ```
> # STEPS TO REPRODUCE
> The error can be easily reproduced in different ways. Here I will provide a 
> simple way to reproduce it:
> 1. Find the test case `DateDataTypeDirectDictionaryTest`.
> 2. Add the following code in the method `beforeAll`.
> ```
> CarbonProperties.getInstance()
> .addProperty(CarbonCommonConstants.ENABLE_OFFHEAP_SORT, "false")
> ```
> 3. Run this test case.
> 4. You will find the test failed with the above error.
> 5. Replace the code in Step2 with the following code:
> ```
> CarbonProperties.getInstance()
> .addProperty(CarbonCommonConstants.ENABLE_OFFHEAP_SORT, "true")
> ```
> 6. Run this test case.
> 7. The test succeeds without error.
> # ANALYSIS & RESOLUTION
> I have reproduced this error and analyzed the core dump file. The final stack 
> trace in the core dump looks like this:
> ```
> Thread 73303: (state = IN_VM)
>  - sun.misc.Unsafe.putShort(long, short) @bci=0 (Interpreted frame)
>  - 
> org.apache.carbondata.core.indexstore.UnsafeMemoryDMStore.addToUnsafe(org.apache.carbondata.core.indexstore.schema.CarbonRowSchema,
>  org.apache.carbondata.core.indexstore.row.DataMapRow, int) @bci=781, 
> line=150 (Interpreted frame)
>  - 
> org.apache.carbondata.core.indexstore.UnsafeMemoryDMStore.addIndexRowToUnsafe(org.apache.carbondata.core.indexstore.row.DataMapRow)
>  @bci=59, line=99 (Interpreted frame)
>  ...
> ```
> After inspecting the code, I found a bug at `UnsafeMemoryDMStore` line 150: 
> while writing the length to unsafe memory, it writes with the wrong base 
> object.
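
To illustrate what "base object" means here, the following is a minimal sketch and not the actual `UnsafeMemoryDMStore` code. It assumes an on-heap block backed by a byte array; `HeapBlock`, `writeShort`, and `readShort` are hypothetical names. The stack trace above ends in the two-argument `putShort(long, short)`, which interprets its first argument as an absolute native address. For on-heap memory, the three-argument form has to be used instead, with the backing array as the base object and an offset relative to that array. Treating a heap-relative offset as an absolute address is a plausible route to the SIGSEGV above once off-heap memory is disabled.

```scala
import sun.misc.Unsafe

object BaseObjectSketch {
  // Obtain the Unsafe singleton through reflection (it has no public constructor).
  private val unsafe: Unsafe = {
    val field = classOf[Unsafe].getDeclaredField("theUnsafe")
    field.setAccessible(true)
    field.get(null).asInstanceOf[Unsafe]
  }

  // Hypothetical on-heap memory block: the byte array is the base object,
  // and baseOffset is where the array's first element starts inside it.
  final case class HeapBlock(base: Array[Byte]) {
    val baseOffset: Long = unsafe.arrayBaseOffset(classOf[Array[Byte]]).toLong
  }

  // Correct on-heap write: base object plus an offset relative to that object.
  def writeShort(block: HeapBlock, position: Int, value: Short): Unit =
    unsafe.putShort(block.base, block.baseOffset + position, value)

  // Matching read, using the same base object and offset.
  def readShort(block: HeapBlock, position: Int): Short =
    unsafe.getShort(block.base, block.baseOffset + position)
}
```

For an off-heap block the base object would be null and the offset would be the absolute address returned by the allocator, so code that serves both modes has to carry the base object along with every address it writes to.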



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1633: [CARBONDATA-1878] [DataMap] Fix bugs in unsaf...

2017-12-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1633


---


[GitHub] carbondata issue #1633: [CARBONDATA-1878] [DataMap] Fix bugs in unsafe datam...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1633
  
LGTM, Thank you for fixing the issue.


---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/728/



---


[jira] [Commented] (CARBONDATA-1758) Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException

2017-12-13 Thread Sangeeta Gulia (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290392#comment-16290392
 ] 

Sangeeta Gulia commented on CARBONDATA-1758:


Please provide more details for this bug, as I am not able to reproduce this 
issue on either my local system or a 3-node cluster.

> Carbon1.3.0- No Inverted Index : Select column with is null for 
> no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
> 
>
> Key: CARBONDATA-1758
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1758
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: 3 node cluster
>Reporter: Chetan Bhat
>  Labels: Functional
>
> Steps :
> In Beeline user executes the queries in sequence.
> CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table 
> uniqdata_DI_int OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> Select count(CUST_ID) from uniqdata_DI_int;
> Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
> Select avg(CUST_ID) as average from uniqdata_DI_int;
> Select floor(CUST_ID) as average from uniqdata_DI_int;
> Select ceil(CUST_ID) as average from uniqdata_DI_int;
> Select ceiling(CUST_ID) as average from uniqdata_DI_int;
> Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> *Issue : Select column with is null for no_inverted_index column throws 
> java.lang.ArrayIndexOutOfBoundsException*
> 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where 
> CUST_ID is null;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 79.0 (TID 123, BLR114278, executor 18): 
> org.apache.spark.util.TaskCompletionListenerException: 
> java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
> at org.apache.spark.scheduler.Task.run(Task.scala:112)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)
> Expected: Selecting a column with IS NULL on a no_inverted_index column 
> should succeed and display the correct result set.



--


[GitHub] carbondata pull request #1658: [CARBONDATA-1680] Fixed Bug to show partition...

2017-12-13 Thread SangeetaGulia
GitHub user SangeetaGulia opened a pull request:

https://github.com/apache/carbondata/pull/1658

[CARBONDATA-1680] Fixed Bug to show partition Ids for Hash Partition

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [x] Any interfaces changed? No
 
 - [x] Any backward compatibility impacted? No
 
 - [x] Document update required? No

 - [x] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [x] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. (N/A) 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SangeetaGulia/incubator-carbondata 
CARBONDATA-1680

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1658.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1658


commit 807474ea179efcc2528867221e039b66a9948ad7
Author: SangeetaGulia 
Date:   2017-12-14T06:04:02Z

Fixed Bug to show partition Ids for Hash Partition




---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
Build Failed with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/727/



---


[GitHub] carbondata issue #1281: [CARBONDATA-1326] Fixed findbug issue and univocity-...

2017-12-13 Thread pawanmalwal
Github user pawanmalwal commented on the issue:

https://github.com/apache/carbondata/pull/1281
  
Duplicate PR - univocity jar changes done in PR 
https://github.com/apache/carbondata/pull/1532


---


[GitHub] carbondata pull request #1281: [CARBONDATA-1326] Fixed findbug issue and uni...

2017-12-13 Thread pawanmalwal
Github user pawanmalwal closed the pull request at:

https://github.com/apache/carbondata/pull/1281


---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1955/



---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread chenerlu
Github user chenerlu commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
retest this please


---


[GitHub] carbondata pull request #1638: [CARBONDATA-1879][Streaming] Support alter ta...

2017-12-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1638


---


[GitHub] carbondata issue #1638: [CARBONDATA-1879][Streaming] Support alter table to ...

2017-12-13 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1638
  
LGTM


---


[jira] [Assigned] (CARBONDATA-1680) Carbon 1.3.0-Partitioning:Show Partition for Hash Partition doesn't display the partition id

2017-12-13 Thread Jatin (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jatin reassigned CARBONDATA-1680:
-

Assignee: Jatin

> Carbon 1.3.0-Partitioning:Show Partition for Hash Partition doesn't display 
> the partition id
> 
>
> Key: CARBONDATA-1680
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1680
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
>Assignee: Jatin
>Priority: Minor
> Attachments: Show_part_1_doc.PNG, show_part_1.PNG
>
>
> CREATE TABLE IF NOT EXISTS t9(
>  id Int,
>  logdate Timestamp,
>  phonenumber Int,
>  country String,
>  area String
>  )
>  PARTITIONED BY (vin String)
>  STORED BY 'carbondata'
>  TBLPROPERTIES('PARTITION_TYPE'='HASH','NUM_PARTITIONS'='5');
> show partitions t9;



--


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1954/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/726/



---


[jira] [Assigned] (CARBONDATA-1541) There are some errors when bad_records_action is IGNORE

2017-12-13 Thread xubo245 (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 reassigned CARBONDATA-1541:
---

Assignee: (was: xubo245)

> There are some errors when bad_records_action is IGNORE
> ---
>
> Key: CARBONDATA-1541
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1541
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.1.1
>Reporter: xubo245
>Priority: Minor
>   Original Estimate: 240h
>  Time Spent: 5h 40m
>  Remaining Estimate: 234h 20m
>
> There are some errors when bad_records_action is IGNORE
> {code:java}
> 17/10/09 01:20:31 ERROR CarbonRowDataWriterProcessorStepImpl: [Executor task 
> launch 
> worker-0][partitionID:default_int_table_2ade496b-a9e8-4e7c-82bd-fb21c2e590eb] 
> Failed for table: int_table in DataWriterProcessorStepImpl
> org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
>  unable to generate the mdkey
>   at 
> org.apache.carbondata.processing.newflow.steps.CarbonRowDataWriterProcessorStepImpl.processBatch(CarbonRowDataWriterProcessorStepImpl.java:276)
>   at 
> org.apache.carbondata.processing.newflow.steps.CarbonRowDataWriterProcessorStepImpl.doExecute(CarbonRowDataWriterProcessorStepImpl.java:162)
>   at 
> org.apache.carbondata.processing.newflow.steps.CarbonRowDataWriterProcessorStepImpl.execute(CarbonRowDataWriterProcessorStepImpl.java:123)
>   at 
> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:51)
>   at 
> org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD$$anon$1.<init>(NewCarbonDataLoadRDD.scala:254)
>   at 
> org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD.internalCompute(NewCarbonDataLoadRDD.scala:229)
>   at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:62)
> {code}
>   
> 1. When the table has only one column and its data type is INT, there is an 
> error. Test code:
> {code:java}
> test("Loading table: int, bad_records_action is IGNORE") {
> val fileLocation = 
> s"$rootPath/integration/spark-common-test/src/test/resources/badrecords/intTest.csv"
> sql("drop table if exists int_table")
> sql("CREATE TABLE if not exists int_table(intField INT) STORED BY 
> 'carbondata'")
> sql(
>   s"""
>  | LOAD DATA LOCAL INPATH '$fileLocation'
>  | INTO TABLE int_table
>  | OPTIONS('FILEHEADER' = 
> 'intField','bad_records_logger_enable'='true','bad_records_action'='IGNORE')
>""".stripMargin)
> sql("select * from int_table").show()
> checkAnswer(sql("select * from int_table where intField = 1"),
>   Seq(Row(1), Row(1)))
> sql("drop table if exists int_table")
>   }
> {code}
> 2. When sort_columns is null, there is an error:
> {code:java}
>   test("sort_columns is null, error") {
> sql("drop table if exists sales")
> sql(
>   """CREATE TABLE IF NOT EXISTS sales(ID BigInt, date Timestamp, country 
> String,
>   actual_price Double, Quantity int, sold_price Decimal(19,2))
>   STORED BY 'carbondata'
>   TBLPROPERTIES('sort_columns'='')""")
> CarbonProperties.getInstance()
>   .addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC,
> new File("./target/test/badRecords")
>   .getCanonicalPath)
> CarbonProperties.getInstance()
>   .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, 
> "/MM/dd")
> var csvFilePath = s"$resourcesPath/badrecords/datasample.csv"
> sql("LOAD DATA local inpath '" + csvFilePath + "' INTO TABLE sales 
> OPTIONS"
>   +
>   "('bad_records_logger_enable'='true','bad_records_action'='redirect', 
> 'DELIMITER'=" +
>   " ',', 'QUOTECHAR'= '\"')");
> checkAnswer(
>   sql("select count(*) from sales"),
>   Seq(Row(2)
>   )
> )
>   }
> {code}
> The test code has been pushed into 
> https://github.com/xubo245/carbondata/tree/badRecordAction
> {code:java}
> org.apache.carbondata.integration.spark.testsuite.dataload.LoadDataWithBadRecords
> {code}
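
As a rough illustration of what the `bad_records_action` options are documented to do for an INT column, the sketch below models only IGNORE and FORCE on plain Scala collections. It is not CarbonData's loader: `BadRecordsSketch`, `Action`, and `loadIntColumn` are hypothetical names, and a "bad record" is assumed here to be any value that does not parse as an integer.

```scala
import scala.util.Try

object BadRecordsSketch {
  sealed trait Action
  case object Ignore extends Action // bad records are dropped
  case object Force extends Action  // bad values are stored as null (None here)

  // Parse raw CSV values for a single INT column under the given action.
  def loadIntColumn(raw: Seq[String], action: Action): Seq[Option[Int]] = {
    val parsed = raw.map(value => Try(value.trim.toInt).toOption)
    action match {
      case Ignore => parsed.filter(_.isDefined)
      case Force => parsed
    }
  }
}
```

Under these assumptions, loading the values `1`, `abc`, `1` with IGNORE keeps two rows, while FORCE keeps three rows with a null in the middle.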



--


[jira] [Resolved] (CARBONDATA-1779) GeneriVectorizedReader for Presto

2017-12-13 Thread Liang Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-1779.

   Resolution: Fixed
Fix Version/s: 1.3.0

> GeneriVectorizedReader for Presto
> -
>
> Key: CARBONDATA-1779
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1779
> Project: CarbonData
>  Issue Type: Improvement
>  Components: presto-integration
>Affects Versions: 1.3.0
>Reporter: Bhavya Aggarwal
>Assignee: Bhavya Aggarwal
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Write a Generic Vectorized Reader for Presto to remove the dependencies on 
> spark



--


[jira] [Closed] (CARBONDATA-1751) Modify sys.err to AnalysisException when uses run related operation except IUD,compaction and alter

2017-12-13 Thread xubo245 (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 closed CARBONDATA-1751.
---
Resolution: Fixed

> Modify sys.err to AnalysisException when  uses run related operation except 
> IUD,compaction and alter
> 
>
> Key: CARBONDATA-1751
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1751
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.2.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> CarbonData prints an improper error message in some cases. For example, it 
> prints a system error when a user runs CREATE TABLE with duplicate column 
> names, when it should print the relevant exception information.
> So we change the sys.error calls to throw AnalysisException when users run 
> related operations, except for IUD, compaction and alter.
> Make the exception type and message correct, including in the Spark2 and 
> spark-common modules.
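
A minimal sketch of the idea, assuming a duplicate-column check as the example from the description. `AnalysisError` is a hypothetical stand-in for Spark's `AnalysisException` so that the snippet runs without Spark, and `checkNoDuplicateColumns` is an illustrative name. The point is that `sys.error` only throws a generic `RuntimeException`, whereas a dedicated exception type lets callers and tests distinguish a user error from an internal failure.

```scala
object ValidationErrorSketch {
  // Stand-in for org.apache.spark.sql.AnalysisException (assumption for this sketch).
  final class AnalysisError(message: String) extends Exception(message)

  // Reject column lists that contain the same name twice, ignoring case.
  def checkNoDuplicateColumns(columns: Seq[String]): Unit = {
    val duplicates = columns
      .map(_.toLowerCase)
      .groupBy(identity)
      .collect { case (name, occurrences) if occurrences.size > 1 => name }
      .toSeq
      .sorted
    if (duplicates.nonEmpty) {
      // Previously this kind of check would have called sys.error(...).
      throw new AnalysisError(s"Duplicate column name(s): ${duplicates.mkString(", ")}")
    }
  }
}
```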



--


[jira] [Closed] (CARBONDATA-1742) Fix NullPointerException in SegmentStatusManager

2017-12-13 Thread xubo245 (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 closed CARBONDATA-1742.
---
Resolution: Fixed

> Fix NullPointerException in SegmentStatusManager
> 
>
> Key: CARBONDATA-1742
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1742
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.2.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When loadFolderDetailsArray is null, there is a NullPointerException. We 
> should fix it.
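
One way to guard against this kind of null, sketched in Scala. The real `SegmentStatusManager` is Java code and its actual fix is not shown in this message; `LoadDetails`, `successfulSegments`, and the `"Success"` status string are hypothetical.

```scala
object NullSafeSegmentsSketch {
  // Hypothetical stand-in for one entry of the load metadata.
  final case class LoadDetails(segmentId: String, status: String)

  // Treat a null array as "no loads" instead of dereferencing it.
  def successfulSegments(loadFolderDetailsArray: Array[LoadDetails]): Seq[String] =
    Option(loadFolderDetailsArray)
      .map(_.toSeq)
      .getOrElse(Seq.empty[LoadDetails])
      .filter(_.status == "Success")
      .map(_.segmentId)
}
```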



--


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
Build Failed with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/725/



---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
Build Failed with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1953/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
Retest this please


---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread chenerlu
Github user chenerlu commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
retest this please


---


[GitHub] carbondata pull request #1581: [CARBONDATA-1779] GenericVectorizedReader

2017-12-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1581


---


[GitHub] carbondata issue #1642: [CARBONDATA-1855][PARTITION] Added outputformat to c...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1642
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1952/



---


[GitHub] carbondata issue #1642: [CARBONDATA-1855][PARTITION] Added outputformat to c...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1642
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/724/



---


[GitHub] carbondata pull request #1656: [CARBONDATA-1247] Block pruning not working f...

2017-12-13 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1656#discussion_r156842298
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala
 ---
@@ -122,6 +164,13 @@ object CastExpressionOptimization {
 } else {
   Some(CastExpr(c))
 }
+  case d: DateType if t.sameType(StringType) =>
--- End diff --

Merge the case blocks as follows:
```
case TimestampType | DateType
```
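
A small sketch of the merged pattern, using stand-in case objects instead of Spark's `DataType` classes so that it compiles on its own. `MergedCaseSketch` and `castsViaLong` are hypothetical names. In Scala, a guard written after a pattern alternative applies to every alternative in it.

```scala
object MergedCaseSketch {
  // Stand-ins for Spark's data types (assumption for this sketch).
  sealed trait DataType
  case object TimestampType extends DataType
  case object DateType extends DataType
  case object StringType extends DataType
  case object IntegerType extends DataType

  // True when a column of type `source` is compared against a literal of type `target`
  // in a way the optimization would handle; both date-like types share one case.
  def castsViaLong(source: DataType, target: DataType): Boolean = source match {
    case TimestampType | DateType if target == StringType => true
    case _ => false
  }
}
```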


---


[GitHub] carbondata pull request #1656: [CARBONDATA-1247] Block pruning not working f...

2017-12-13 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1656#discussion_r156841880
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala
 ---
@@ -82,6 +109,21 @@ object CastExpressionOptimization {
 }
   }
 
+  def typeCastStringToLongListForDateType(list: Seq[Expression]): Seq[Expression] = {
--- End diff --

This code is duplicated; please extract the common code.


---


[GitHub] carbondata pull request #1656: [CARBONDATA-1247] Block pruning not working f...

2017-12-13 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1656#discussion_r156841784
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/CastExpressionOptimization.scala
 ---
@@ -67,6 +68,32 @@ object CastExpressionOptimization {
 }
   }
 
+  def typeCastStringToLongForDateType(v: Any): Any = {
--- End diff --

This code is duplicated; please extract the common code.
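
A sketch of how the shared logic might be factored out, under stated assumptions: the bodies of the real methods are not shown in the diff, so the parsing logic, the format patterns, the UTC time zone, and the name `typeCastStringToLong` for the timestamp variant are all assumptions made for illustration. Only `typeCastStringToLongForDateType` is a name taken from the diff. The idea is one private helper parameterized by the format, with thin wrappers per type.

```scala
import java.text.SimpleDateFormat
import java.util.TimeZone
import scala.util.Try

object CastHelperSketch {
  // Shared helper: parse a string with the given pattern into epoch milliseconds.
  // Non-string values and unparseable strings are returned unchanged.
  private def stringToEpochMillis(v: Any, pattern: String): Any = v match {
    case s: String =>
      val parser = new SimpleDateFormat(pattern)
      parser.setLenient(false)
      // UTC is an assumption so the sketch gives the same result on every machine.
      parser.setTimeZone(TimeZone.getTimeZone("UTC"))
      Try(parser.parse(s).getTime).getOrElse(v)
    case other => other
  }

  // Thin wrappers: the only difference between the two is the pattern.
  def typeCastStringToLong(v: Any): Any =
    stringToEpochMillis(v, "yyyy-MM-dd HH:mm:ss")

  def typeCastStringToLongForDateType(v: Any): Any =
    stringToEpochMillis(v, "yyyy-MM-dd")
}
```

The list variants could then map the matching single-value wrapper over their elements instead of repeating the parsing code.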


---


[GitHub] carbondata issue #1638: [CARBONDATA-1879][Streaming] Support alter table to ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1638
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1951/



---


[GitHub] carbondata issue #1638: [CARBONDATA-1879][Streaming] Support alter table to ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1638
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/723/



---


[GitHub] carbondata issue #1559: [CARBONDATA-1805][Dictionary] Optimize pruning for d...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1559
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1950/



---


[GitHub] carbondata pull request #1638: [CARBONDATA-1879][Streaming] Support alter ta...

2017-12-13 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1638#discussion_r156832490
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonAlterTableFinishStreaming.scala
 ---
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.management
+
+import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
+import org.apache.spark.sql.execution.command.DataCommand
+
+import org.apache.carbondata.streaming.segment.StreamSegment
+
+/**
+ * This command will try to change the status of the segment from 
"streaming" to "streaming finish"
+ */
+case class CarbonAlterTableFinishStreaming(
+dbName: Option[String],
+tableName: String)
+  extends DataCommand {
--- End diff --

fixed


---


[GitHub] carbondata pull request #1638: [CARBONDATA-1879][Streaming] Support alter ta...

2017-12-13 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1638#discussion_r156834294
  
--- Diff: 
streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java
 ---
@@ -180,6 +182,70 @@ public static String close(CarbonTable table, String 
segmentId)
 }
   }
 
+  /**
+   * change the status of the segment from "streaming" to "streaming 
finish"
+   */
+  public static void finishStreaming(CarbonTable carbonTable) throws 
Exception {
+ICarbonLock lock = CarbonLockFactory.getCarbonLockObj(
+carbonTable.getTableInfo().getOrCreateAbsoluteTableIdentifier(),
+LockUsage.TABLE_STATUS_LOCK);
+try {
+  if (lock.lockWithRetries()) {
+ICarbonLock streamingLock = CarbonLockFactory.getCarbonLockObj(
+
carbonTable.getTableInfo().getOrCreateAbsoluteTableIdentifier(),
+LockUsage.STREAMING_LOCK);
+try {
+  if (streamingLock.lockWithRetries()) {
+LoadMetadataDetails[] details =
+
SegmentStatusManager.readLoadMetadata(carbonTable.getMetaDataFilepath());
+boolean updated = false;
+for (LoadMetadataDetails detail : details) {
+  if (SegmentStatus.STREAMING == detail.getSegmentStatus()) {
+detail.setLoadEndTime(System.currentTimeMillis());
+detail.setSegmentStatus(SegmentStatus.STREAMING_FINISH);
+updated = true;
+  }
+}
+if (updated) {
+  CarbonTablePath tablePath =
+  
CarbonStorePath.getCarbonTablePath(carbonTable.getAbsoluteTableIdentifier());
+  SegmentStatusManager.writeLoadDetailsIntoFile(
+  tablePath.getTableStatusFilePath(), details);
+}
+  } else {
+String msg = "Failed to finish streaming, because streaming is 
locked for table " +
+carbonTable.getDatabaseName() + "." + 
carbonTable.getTableName();
+LOGGER.error(msg);
+throw new Exception(msg);
+  }
+} finally {
+  if (streamingLock.unlock()) {
+LOGGER.info("Table unlocked successfully after streaming 
finished" + carbonTable
+.getDatabaseName() + "." + carbonTable.getTableName());
+  } else {
+LOGGER.error("Unable to unlock Table lock for table " +
+carbonTable.getDatabaseName() + "." + 
carbonTable.getTableName() +
+" during streaming finished");
+  }
+}
+  } else {
+String msg = "Failed to acquire table status lock of " +
+carbonTable.getDatabaseName() + "." + 
carbonTable.getTableName();
+LOGGER.error(msg);
+throw new Exception(msg);
--- End diff --

fixed
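
For readers following the quoted diff, the nested locking it performs (table status lock first, then streaming lock) can be sketched as below. This is an illustration under stated assumptions, not the code that was merged: `SimpleLock` is a hypothetical stand-in for the two `ICarbonLock` methods used in the diff, and `RecordingLock` exists only so the ordering can be observed.

```java
import java.util.ArrayList;
import java.util.List;

final class NestedLockSketch {
  // Hypothetical stand-in for the two lock methods used in the diff.
  interface SimpleLock {
    boolean lockWithRetries();
    boolean unlock();
  }

  // Test double that records every successful lock and every unlock call.
  static final class RecordingLock implements SimpleLock {
    private final String name;
    private final boolean available;
    private final List<String> log;

    RecordingLock(String name, boolean available, List<String> log) {
      this.name = name;
      this.available = available;
      this.log = log;
    }

    @Override
    public boolean lockWithRetries() {
      if (available) {
        log.add("lock " + name);
      }
      return available;
    }

    @Override
    public boolean unlock() {
      log.add("unlock " + name);
      return true;
    }
  }

  // Acquire outer then inner, run the action, release in reverse order.
  // A lock is released only if this method actually acquired it.
  static void withBothLocks(SimpleLock outer, SimpleLock inner, Runnable action) {
    if (!outer.lockWithRetries()) {
      throw new IllegalStateException("Failed to acquire outer lock");
    }
    try {
      if (!inner.lockWithRetries()) {
        throw new IllegalStateException("Failed to acquire inner lock");
      }
      try {
        action.run();
      } finally {
        inner.unlock();
      }
    } finally {
      outer.unlock();
    }
  }
}
```

The design choice in this sketch is that each `finally` sits inside the branch where the corresponding lock was obtained, so a failed acquisition never leads to an unlock call on a lock that is not held.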


---


[GitHub] carbondata pull request #1638: [CARBONDATA-1879][Streaming] Support alter ta...

2017-12-13 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1638#discussion_r156834271
  
--- Diff: 
streaming/src/main/java/org/apache/carbondata/streaming/segment/StreamSegment.java
 ---
@@ -180,6 +182,70 @@ public static String close(CarbonTable table, String 
segmentId)
 }
   }
 
+  /**
+   * change the status of the segment from "streaming" to "streaming 
finish"
+   */
+  public static void finishStreaming(CarbonTable carbonTable) throws 
Exception {
+ICarbonLock lock = CarbonLockFactory.getCarbonLockObj(
+carbonTable.getTableInfo().getOrCreateAbsoluteTableIdentifier(),
+LockUsage.TABLE_STATUS_LOCK);
+try {
+  if (lock.lockWithRetries()) {
+ICarbonLock streamingLock = CarbonLockFactory.getCarbonLockObj(
+
carbonTable.getTableInfo().getOrCreateAbsoluteTableIdentifier(),
+LockUsage.STREAMING_LOCK);
--- End diff --

fixed


---


[GitHub] carbondata pull request #1638: [CARBONDATA-1879][Streaming] Support alter ta...

2017-12-13 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1638#discussion_r156832564
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala
 ---
@@ -129,6 +129,12 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser 
{
 CarbonAlterTableCompactionCommand(altertablemodel)
 }
 
+  protected lazy val alterTableFinishStreaming: Parser[LogicalPlan] =
--- End diff --

fixed


---


[GitHub] carbondata issue #1632: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1632
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1949/



---


[GitHub] carbondata issue #1559: [CARBONDATA-1805][Dictionary] Optimize pruning for d...

2017-12-13 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1559
  
retest this please


---


[GitHub] carbondata issue #1632: [CARBONDATA-1839] [DataLoad]Fix bugs in compressing ...

2017-12-13 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1632
  
retest this please


---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2274/



---


[GitHub] carbondata issue #1636: [CARBONDATA-1801] Remove unnecessary mdk computation...

2017-12-13 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1636
  
@kevinjmh this PR looks like it pulled in many unrelated commit records. 
It may need rework :) 


---


[GitHub] carbondata issue #1641: [CARBONDATA-1882] select with group by and insertove...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1641
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2272/



---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2271/



---


[GitHub] carbondata issue #1636: [CARBONDATA-1801] Remove unnecessary mdk computation...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1636
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2270/



---


[GitHub] carbondata issue #1647: [CARBONDATA-1887] block pruning not happening is car...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1647
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1948/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2269/



---


[GitHub] carbondata issue #1654: [CARBONDATA-1856] Support insert/load data for parti...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1654
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2268/



---


[GitHub] carbondata issue #1636: [CARBONDATA-1801] Remove unnecessary mdk computation...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1636
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1947/



---


[GitHub] carbondata issue #1647: [CARBONDATA-1887] block pruning not happening is car...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1647
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/720/



---


[GitHub] carbondata issue #1641: [CARBONDATA-1882] select with group by and insertove...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1641
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1946/



---


[GitHub] carbondata issue #1656: [CARBONDATA-1247] Block pruning not working for date...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1656
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2267/



---


[GitHub] carbondata issue #1636: [CARBONDATA-1801] Remove unnecessary mdk computation...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1636
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/719/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1945/



---


[GitHub] carbondata issue #1647: [CARBONDATA-1887] block pruning not happening is car...

2017-12-13 Thread mohammadshahidkhan
Github user mohammadshahidkhan commented on the issue:

https://github.com/apache/carbondata/pull/1647
  
retest this please


---


[GitHub] carbondata issue #1647: [CARBONDATA-1887] block pruning not happening is car...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1647
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2266/



---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/717/



---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1943/



---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2265/



---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
Build Failed with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/716/



---


[GitHub] carbondata issue #1636: [CARBONDATA-1801] Remove unnecessary mdk computation...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1636
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/715/



---


[GitHub] carbondata issue #1648: [CARBONDATA-1888][PreAggregate][Bug]Fixed compaction...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1648
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1941/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/714/



---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2264/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1940/



---


[GitHub] carbondata issue #1654: [CARBONDATA-1856] Support insert/load data for parti...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1654
  
Build Failed with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/713/



---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/712/



---


[GitHub] carbondata issue #1650: [CARBONDATA-1703] Refactored code for creation of fi...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1650
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2263/



---


[GitHub] carbondata issue #1656: [CARBONDATA-1247] Block pruning not working for date...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1656
  
Build Failed with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/711/



---


[GitHub] carbondata issue #1657: [CARBONDATA-1895] Fix issue of create table if not e...

2017-12-13 Thread chenerlu
Github user chenerlu commented on the issue:

https://github.com/apache/carbondata/pull/1657
  
retest this please


---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1938/



---


[GitHub] carbondata issue #1546: [CARBONDATA-1736][Pre-Aggregate] Query from segment ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1546
  
Build Failed with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/709/



---


[GitHub] carbondata issue #1654: [CARBONDATA-1856] Support insert/load data for parti...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1654
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2262/



---


[GitHub] carbondata issue #1656: [CARBONDATA-1247] Block pruning not working for date...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1656
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1937/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
retest this please


---


[GitHub] carbondata issue #1636: [CARBONDATA-1801] Remove unnecessary mdk computation...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1636
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2261/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1936/



---


[GitHub] carbondata pull request #1641: [CARBONDATA-1882] select with group by and in...

2017-12-13 Thread gvramana
Github user gvramana commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1641#discussion_r156700781
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
 ---
@@ -483,20 +491,21 @@ object CarbonDataRDDFactory {
  s"${ carbonLoadModel.getDatabaseName }.${ 
carbonLoadModel.getTableName }")
 throw new Exception(status(0)._2._2.errorMsg)
   }
-  // if segment is empty then fail the data load
+
+  var newEntryLoadStatus =
   if 
(!carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable.isChildDataMap &&
   !CarbonLoaderUtil.isValidSegment(carbonLoadModel, 
carbonLoadModel.getSegmentId.toInt)) {
-// update the load entry in table status file for changing the 
status to marked for delete
-CommonUtil.updateTableStatusForFailure(carbonLoadModel)
-LOGGER.info("starting clean up**")
-CarbonLoaderUtil.deleteSegment(carbonLoadModel, 
carbonLoadModel.getSegmentId.toInt)
-LOGGER.info("clean up done**")
+
 LOGGER.audit(s"Data load is failed for " +
  s"${ carbonLoadModel.getDatabaseName }.${ 
carbonLoadModel.getTableName }" +
  " as there is no data to load")
 LOGGER.warn("Cannot write load metadata file as data load failed")
-throw new Exception("No Data to load")
+
--- End diff --

write comment 'as no records loaded in new segment, new segment should be 
deleted'


---


[GitHub] carbondata pull request #1641: [CARBONDATA-1882] select with group by and in...

2017-12-13 Thread gvramana
Github user gvramana commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1641#discussion_r156700497
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
 ---
@@ -375,7 +375,15 @@ object CarbonDataRDDFactory {
   }
   }
 } else {
-  loadStatus = SegmentStatus.LOAD_FAILURE
+  if (dataFrame.isDefined && updateModel.isEmpty) {
--- End diff --

Write comment explaining this


---


[GitHub] carbondata pull request #1641: [CARBONDATA-1882] select with group by and in...

2017-12-13 Thread gvramana
Github user gvramana commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1641#discussion_r156698805
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala
 ---
@@ -276,8 +281,178 @@ class InsertIntoCarbonTableTestCase extends QueryTest 
with BeforeAndAfterAll {
 }
 sql("LOAD DATA INPATH '" + resourcesPath + "/100_olap.csv' overwrite 
INTO table TCarbonSourceOverwrite options ('DELIMITER'=',', 'QUOTECHAR'='\', 
'FILEHEADER'='imei,deviceInformationId,MAC,deviceColor,device_backColor,modelId,marketName,AMSize,ROMSize,CUPAudit,CPIClocked,series,productionDate,bomCode,internalModels,deliveryTime,channelsId,channelsName,deliveryAreaId,deliveryCountry,deliveryProvince,deliveryCity,deliveryDistrict,deliveryStreet,oxSingleNumber,ActiveCheckTime,ActiveAreaId,ActiveCountry,ActiveProvince,Activecity,ActiveDistrict,ActiveStreet,ActiveOperatorId,Active_releaseId,Active_EMUIVersion,Active_operaSysVersion,Active_BacVerNumber,Active_BacFlashVer,Active_webUIVersion,Active_webUITypeCarrVer,Active_webTypeDataVerNumber,Active_operatorsVersion,Active_phonePADPartitionedVersions,Latest_YEAR,Latest_MONTH,Latest_DAY,Latest_HOUR,Latest_areaId,Latest_country,Latest_province,Latest_city,Latest_district,Latest_street,Latest_releaseId,Latest_EMUIVersion,Latest_operaS
 
ysVersion,Latest_BacVerNumber,Latest_BacFlashVer,Latest_webUIVersion,Latest_webUITypeCarrVer,Latest_webTypeDataVerNumber,Latest_operatorsVersion,Latest_phonePADPartitionedVersions,Latest_operatorId,gamePointDescription,gamePointId,contractNumber')")
 assert(rowCount == sql("select imei from 
TCarbonSourceOverwrite").count())
+
+  }
+
+  test("insert overwrite in group by scenario with t1 no record and t2 
some record") {
--- End diff --

Move common code to a function


---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
Build Failed with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/707/



---


[GitHub] carbondata pull request #1657: [CARBONDATA-1895] Fix issue of create table i...

2017-12-13 Thread chenerlu
GitHub user chenerlu opened a pull request:

https://github.com/apache/carbondata/pull/1657

[CARBONDATA-1895] Fix issue of create table if not exists

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed? No
 
 - [ ] Any backward compatibility impacted? No
 
 - [ ] Document update required? No

 - [ ] Testing done
   Test case already added to the project.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenerlu/incubator-carbondata pr-1212

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1657.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1657






---


[GitHub] carbondata issue #1655: [CARBONDATA-1894] Add compactionType Parameter to co...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1655
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1935/



---


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/2260/



---


[jira] [Created] (CARBONDATA-1895) Fix issue of create table if not exists

2017-12-13 Thread chenerlu (JIRA)
chenerlu created CARBONDATA-1895:


 Summary: Fix issue of create table if not exists 
 Key: CARBONDATA-1895
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1895
 Project: CarbonData
  Issue Type: Bug
Reporter: chenerlu
Assignee: chenerlu






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (CARBONDATA-1247) Block pruning not working for date type data type column

2017-12-13 Thread Pawan Malwal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289364#comment-16289364
 ] 

Pawan Malwal commented on CARBONDATA-1247:
--

[https://github.com/apache/carbondata/pull/1656]

> Block pruning not working for date type data type column
> 
>
> Key: CARBONDATA-1247
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1247
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
> Environment: Standalone
>Reporter: krishna reddy
>Assignee: Pawan Malwal
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> 1. create table if not exists test_date (id int,first_name String,last_name 
> string,email string,gender string,dob date) stored by 'carbondata'
> 2. LOAD DATA LOCAL INPATH 'D:/data/MOCK_DATA_24.csv' into table test_date
> 3. LOAD DATA LOCAL INPATH 'D:/data/MOCK_DATA_25.csv' into table test_date
> 4. select dob from test_date where dob = '2016-06-24'
> Actual Result: In the logs it is going to 2 blocks
>  Identified no.of.blocks: 2,
>  no.of.tasks: 2,
>  no.of.nodes: 0,
>  parallelism: 1
> Expected: It should select only one block
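
To illustrate the expected behaviour, here is a minimal sketch of min/max based block pruning for a date column. It assumes each block keeps the minimum and maximum value of the column (stored as days since 1970-01-01), and that the two loaded files each contain a single date; the class and method names are hypothetical and not CarbonData APIs.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class MinMaxPruningSketch {
  // Hypothetical block summary: name plus min/max of the date column in days.
  static final class Block {
    final String name;
    final long minDay;
    final long maxDay;

    Block(String name, LocalDate min, LocalDate max) {
      this.name = name;
      this.minDay = min.toEpochDay();
      this.maxDay = max.toEpochDay();
    }
  }

  // Keep only blocks whose [min, max] range can contain the target date.
  static List<Block> prune(List<Block> blocks, LocalDate target) {
    long day = target.toEpochDay();
    List<Block> hits = new ArrayList<>();
    for (Block b : blocks) {
      if (b.minDay <= day && day <= b.maxDay) {
        hits.add(b);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    List<Block> blocks = new ArrayList<>();
    // Assumed contents: one date per loaded file.
    blocks.add(new Block("block-0", LocalDate.of(2016, 6, 24), LocalDate.of(2016, 6, 24)));
    blocks.add(new Block("block-1", LocalDate.of(2016, 6, 25), LocalDate.of(2016, 6, 25)));
    for (Block b : prune(blocks, LocalDate.parse("2016-06-24"))) {
      System.out.println(b.name);
    }
  }
}
```

The comparison only works when the filter value is in the same representation as the stored min/max; a string literal cannot be compared against day counts, which is why the filter has to be converted before pruning can apply.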



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1653: [CARBONDATA-1893] Data load with multiple QUOTECHAR ...

2017-12-13 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1653
  
Build Success with Spark 2.2.0, Please check CI 
http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/706/



---


[GitHub] carbondata pull request #1656: [CARBONDATA-1247] Block pruning not working f...

2017-12-13 Thread pawanmalwal
GitHub user pawanmalwal reopened a pull request:

https://github.com/apache/carbondata/pull/1656

[CARBONDATA-1247] Block pruning not working for date type column

Block pruning not working for date type column.
Root Cause : Type casting of String for DateType is not handled

Solution: CastExpressionOptimization should handle the casting of String 
for DateType
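
As a rough illustration of the idea (in Java, not the Scala of the actual change), the sketch below rewrites a filter of the form `cast(column as string) = 'yyyy-MM-dd'` into a comparison of the column against the literal converted to days since 1970-01-01. The `EqualTo` class and `rewrite` method are hypothetical simplifications and do not mirror the Spark or CarbonData expression classes; leaving unparseable literals untouched is an assumption.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

final class CastRewriteSketch {
  // Simplified filter "column = value". castColumnToString models the case
  // where the date column is wrapped in a cast to string.
  static final class EqualTo {
    final String column;
    final Object value;
    final boolean castColumnToString;

    EqualTo(String column, Object value, boolean castColumnToString) {
      this.column = column;
      this.value = value;
      this.castColumnToString = castColumnToString;
    }
  }

  // Convert the string literal once instead of casting every column value,
  // so the filter compares in the column's own representation.
  static EqualTo rewrite(EqualTo filter) {
    if (!filter.castColumnToString || !(filter.value instanceof String)) {
      return filter;
    }
    try {
      long days = LocalDate.parse((String) filter.value).toEpochDay();
      return new EqualTo(filter.column, days, false);
    } catch (DateTimeParseException e) {
      // Assumed fallback: keep the original filter when the literal is not a date.
      return filter;
    }
  }
}
```

After such a rewrite the filter no longer hides the column behind a cast, so it can be evaluated against per-block statistics.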

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [X] Any interfaces changed?
 None
 - [X] Any backward compatibility impacted?
 None
 - [X] Document update required?
NA
 - [X] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   Done manual testing
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
NA



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pawanmalwal/carbondata 
date_datatype_block_pruning_issue_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1656.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1656


commit de83c25a03606a42eac049433798b2cfdfc7b5af
Author: Pawan Malwal 
Date:   2017-12-13T07:31:20Z

[CARBONDATA-1887] Block pruning not working for date type column




---


[GitHub] carbondata pull request #1656: [CARBONDATA-1247] Block pruning not working f...

2017-12-13 Thread pawanmalwal
Github user pawanmalwal closed the pull request at:

https://github.com/apache/carbondata/pull/1656


---


[jira] [Closed] (CARBONDATA-1889) Block pruning not working for date type column

2017-12-13 Thread Pawan Malwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pawan Malwal closed CARBONDATA-1889.

Resolution: Duplicate

> Block pruning not working for date type column
> --
>
> Key: CARBONDATA-1889
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1889
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Pawan Malwal
>Assignee: Pawan Malwal
>
> spark.sql(s"""create table test_dateType(c1 int, c2 string, c3 smallint, 
> c4 bigint, c5 short, c6 date) stored by 'carbondata'""".stripMargin).show()
> spark.sql(s"""insert into test_dateType select 
> 1,'111',111,11,,'2017-12-12'""".stripMargin)
> spark.sql(s"""insert into test_dateType select 
> 2,'222',222,22,,'2018-11-11'""".stripMargin)
> spark.sql(s"""insert into test_dateType select 
> 3,'333',333,33,,'2019-10-10'""".stripMargin)
> spark.sql(s"""select * from test_dateType where c6 = '2018-11-11' 
> """.stripMargin).show(200,false)
> Check for "Identified no.of.blocks" in logs
> It shows :  Identified no.of.blocks: 3
> Only 1 block should be selected but all the 3 blocks are getting selected



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

