[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1516#discussion_r151621845 --- Diff: pom.xml --- @@ -453,9 +453,9 @@ - hadoop-2.7.2 + hadoop-2.2.0 --- End diff -- You can add a profile for hadoop-2.2.0; there is no need to overwrite hadoop-2.7.2. By default it should use hadoop-2.7.2. ---
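A profile along the lines the reviewer suggests might look like the following in pom.xml. This is only a sketch: the `hadoop.version` property name follows common parent-pom conventions and is an assumption here, not taken from the project's actual pom.

```xml
<!-- Sketch: hadoop-2.7.2 stays the default; build against 2.2.0 explicitly
     with `mvn -Phadoop-2.2.0 ...`. The property name is an assumption. -->
<profile>
  <id>hadoop-2.2.0</id>
  <properties>
    <hadoop.version>2.2.0</hadoop.version>
  </properties>
</profile>
```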
[GitHub] carbondata issue #1519: [CARBONDATA-1753][Streaming]Fix missing 'org.scalate...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/1519 Done ---
[jira] [Resolved] (CARBONDATA-1750) SegmentStatusManager.readLoadMetadata showing NPE if tablestatus file is empty
[ https://issues.apache.org/jira/browse/CARBONDATA-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1750. -- Resolution: Fixed Assignee: QiangCai Fix Version/s: 1.3.0 > SegmentStatusManager.readLoadMetadata showing NPE if tablestatus file is empty > -- > > Key: CARBONDATA-1750 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1750 > Project: CarbonData > Issue Type: Bug >Reporter: QiangCai >Assignee: QiangCai >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 40m > Remaining Estimate: 0h > > SegmentStatusManager.readLoadMetadata showing NPE if tablestatus file is empty -- This message was sent by Atlassian JIRA (v6.4.14#64029)
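The guard that avoids the NPE can be sketched as below. All names here are hypothetical simplifications: the real `SegmentStatusManager.readLoadMetadata` deserializes a `LoadMetadataDetails[]` from the tablestatus JSON; this stand-in just splits a string to show the empty-file handling.

```java
/**
 * Minimal sketch of the NPE guard (hypothetical API): an empty or missing
 * tablestatus file means "no segments yet", not an error, so return an
 * empty array instead of letting a null result propagate.
 */
public class TableStatusSketch {
    static String[] readLoadMetadata(String tableStatusContent) {
        if (tableStatusContent == null || tableStatusContent.trim().isEmpty()) {
            return new String[0]; // empty file: no segments yet
        }
        // The real code would deserialize JSON here; a null result from the
        // parser should also map to an empty array.
        return tableStatusContent.split(",");
    }
}
```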
[GitHub] carbondata issue #1519: [CARBONDATA-1753]Fix missing 'org.scalatest.tools.Ru...
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/1519 LGTM ---
[GitHub] carbondata issue #1519: [CARBONDATA-1753]Fix missing 'org.scalatest.tools.Ru...
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/1519 @zzcclp can you change the title format, like: [CARBONDATA-1753][Streaming] Fix missing 'org.scalatest.tools.Runner' issue when run test with streaming module ---
[GitHub] carbondata pull request #1517: [CARBONDATA-1750] Fix NPE when tablestatus fi...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1517 ---
[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1508 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1215/ ---
[GitHub] carbondata pull request #1522: [HOTFIX] change to use store path in property...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1522 ---
[GitHub] carbondata issue #1522: [HOTFIX] change to use store path in property in tes...
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/1522 LGTM ---
[GitHub] carbondata pull request #1522: [HOTFIX] change to use store path in property...
GitHub user jackylk opened a pull request: https://github.com/apache/carbondata/pull/1522 [HOTFIX] change to use store path in property in testcase Change to use store path in property in testcase - [X] Any interfaces changed? No - [X] Any backward compatibility impacted? No - [X] Document update required? No - [X] Testing done No test case is added - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/jackylk/incubator-carbondata patch-3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1522.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1522 commit 7668cab9530d54bfc2ed46c3d3011d65d21a Author: Jacky Li Date: 2017-11-17T07:45:26Z [HOTFIX] change to use store path in property in testcase change to use store path in property in testcase ---
[GitHub] carbondata pull request #1521: [WIP] [CARBONDATA-1743] fix concurrent pre-agg...
GitHub user kunal642 opened a pull request: https://github.com/apache/carbondata/pull/1521 [WIP] [CARBONDATA-1743] fix concurrent pre-agg creation and query Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kunal642/carbondata concurrent_query Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1521.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1521 commit 8952423e18a0c36afe08c5ead3730ff48fc3661e Author: kunal642 Date: 2017-11-17T06:43:25Z fix concurrent pre-agg creation and query ---
[GitHub] carbondata issue #1519: [CARBONDATA-1753]Fix missing 'org.scalatest.tools.Ru...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1519 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1214/ ---
[GitHub] carbondata pull request #1520: [CARBONDATA-1734] Ignore empty line while rea...
GitHub user dhatchayani opened a pull request: https://github.com/apache/carbondata/pull/1520 [CARBONDATA-1734] Ignore empty line while reading CSV Ignore/skip empty lines while loading. Load-level and system-level properties are added to control it. - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [X] Document update required? A new system-level property and a load-level property are added; the documentation should be updated accordingly. - [X] Testing done UT added - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhatchayani/incubator-carbondata empty_line Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1520.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1520 commit 41b0a259ba94509245383a23029b6a6ce2366760 Author: dhatchayani Date: 2017-11-17T07:11:49Z [CARBONDATA-1734] Ignore empty line while reading CSV ---
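The behaviour the PR describes can be sketched as follows. The `skipEmptyLine` flag here is a hypothetical stand-in for the load/system-level property the PR adds; the names are not taken from the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the empty-line handling during CSV load: when skipEmptyLine is
 * true, blank rows are dropped instead of producing null records.
 */
public class CsvLineFilterSketch {
    static List<String> filterLines(List<String> lines, boolean skipEmptyLine) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (skipEmptyLine && (line == null || line.trim().isEmpty())) {
                continue; // drop blank rows
            }
            out.add(line);
        }
        return out;
    }
}
```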
[jira] [Updated] (CARBONDATA-1726) Carbon1.3.0-Streaming - Select query from spark-shell does not execute successfully for streaming table load
[ https://issues.apache.org/jira/browse/CARBONDATA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1726: Description: Steps : // prepare csv file for batch loading cd /srv/spark2.2Bigdata/install/hadoop/datanode/bin // generate streamSample.csv 10001,batch_1,city_1,0.1,school_1:school_11$20 10002,batch_2,city_2,0.2,school_2:school_22$30 10003,batch_3,city_3,0.3,school_3:school_33$40 10004,batch_4,city_4,0.4,school_4:school_44$50 10005,batch_5,city_5,0.5,school_5:school_55$60 // put to hdfs /tmp/streamSample.csv ./hadoop fs -put streamSample.csv /tmp // spark-beeline cd /srv/spark2.2Bigdata/install/spark/sparkJdbc bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar "hdfs://hacluster/user/sparkhive/warehouse" bin/beeline -u jdbc:hive2://10.18.98.34:23040 CREATE TABLE stream_table( id INT, name STRING, city STRING, salary FLOAT ) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 'sort_columns'='name'); LOAD DATA LOCAL INPATH 'hdfs://hacluster/chetan/streamSample.csv' INTO TABLE stream_table OPTIONS('HEADER'='false'); // spark-shell cd /srv/spark2.2Bigdata/install/spark/sparkJdbc bin/spark-shell --master yarn-client import java.io.{File, PrintWriter} import java.net.ServerSocket import org.apache.spark.sql.{CarbonEnv, SparkSession} import org.apache.spark.sql.hive.CarbonRelation import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.util.CarbonProperties import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") import 
org.apache.spark.sql.CarbonSession._ val carbonSession = SparkSession. builder(). appName("StreamExample"). config("spark.sql.warehouse.dir", "hdfs://hacluster/user/sparkhive/warehouse"). config("javax.jdo.option.ConnectionURL", "jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8"). config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver"). config("javax.jdo.option.ConnectionPassword", "huawei"). config("javax.jdo.option.ConnectionUserName", "sparksql"). getOrCreateCarbonSession() carbonSession.sparkContext.setLogLevel("ERROR") carbonSession.sql("select * from stream_table").show *Issue : Select query from spark-shell does not execute successfully for streaming table load.* When the executor and driver cores and memory is increased while launching the spark shell the issue still occurs. bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 scala> import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.constants.CarbonCommonConstants scala> import org.apache.carbondata.core.util.CarbonProperties import org.apache.carbondata.core.util.CarbonProperties scala> import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} scala> scala> CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") res29: org.apache.carbondata.core.util.CarbonProperties = org.apache.carbondata.core.util.CarbonProperties@67b056e7 scala> scala> import org.apache.spark.sql.CarbonSession._ import org.apache.spark.sql.CarbonSession._ scala> scala> val carbonSession = SparkSession. | builder(). | appName("StreamExample"). | config("spark.sql.warehouse.dir", "hdfs://hacluster/user/sparkhive/warehouse"). | config("javax.jdo.option.ConnectionURL", "jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8"). 
| config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver"). | config("javax.jdo.option.ConnectionPassword", "huawei"). | config("javax.jdo.option.ConnectionUserName", "sparksql"). | getOrCreateCarbonSession() carbonSession: org.apache.spark.sql.SparkSession = org.apache.spark.sql.CarbonSession@1d0590bc scala> | carbonSession.sparkContext.setLogLevel("ERROR") scala> carbonSession.sql("select * from stream_table").show org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 25.0 failed 4 times, most recent failure: Lost task 0.3 in stage 25.0 (TID 65, BLR114269, executor 8): java.lang.IllegalStateException: unread block data at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2424) at java.io.ObjectInputStream.readObject0(Obje
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/1516 I have compiled successfully with the commands below: 1. mvn -Pspark-2.1 -Pbuild-with-format -Dspark.version=2.1.2 clean package; 2. mvn -Pspark-2.1 -Phadoop-2.2.0 -Pbuild-with-format -Dspark.version=2.1.2 -Dhadoop.version=2.6.0-cdh5.7.1; @QiangCai @jackylk @chenliang613 please review, thanks. ---
[jira] [Assigned] (CARBONDATA-1734) Ignore empty line while reading CSV
[ https://issues.apache.org/jira/browse/CARBONDATA-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhatchayani reassigned CARBONDATA-1734: --- Assignee: dhatchayani (was: Akash R Nilugal) > Ignore empty line while reading CSV > --- > > Key: CARBONDATA-1734 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1734 > Project: CarbonData > Issue Type: Improvement >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > Ignore empty line while reading CSV file in LOAD -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151615805 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java --- @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.datamap.dev; + +import java.io.IOException; + +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.datastore.page.ColumnPage; +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.path.CarbonTablePath; + +/** + * Data Map writer + */ +public abstract class AbstractDataMapWriter { + + protected AbsoluteTableIdentifier identifier; + + protected String segmentId; + + protected String writeDirectoryPath; + + public AbstractDataMapWriter(AbsoluteTableIdentifier identifier, String segmentId, + String writeDirectoryPath) { +this.identifier = identifier; +this.segmentId = segmentId; +this.writeDirectoryPath = writeDirectoryPath; + } + + /** + * Start of new block notification. 
+ * + * @param blockId file name of the carbondata file + */ + public abstract void onBlockStart(String blockId); + + /** + * End of block notification + */ + public abstract void onBlockEnd(String blockId); + + /** + * Start of new blocklet notification. + * + * @param blockletId sequence number of blocklet in the block + */ + public abstract void onBlockletStart(int blockletId); + + /** + * End of blocklet notification + * + * @param blockletId sequence number of blocklet in the block + */ + public abstract void onBlockletEnd(int blockletId); + + /** + * Add the column pages row to the datamap, order of pages is same as `indexColumns` in + * DataMapMeta returned in DataMapFactory. + * Implementation should copy the content of `pages` as needed, because `pages` memory + * may be freed after this method returns, if using unsafe column page. + */ + public abstract void onPageAdded(int blockletId, int pageId, ColumnPage[] pages); + + /** + * This is called during closing of writer.So after this call no more data will be sent to this + * class. + */ + public abstract void finish(); + + /** + * It copies the file from temp folder to actual folder + * + * @param dataMapFile + * @throws IOException + */ + protected void commitFile(String dataMapFile) throws IOException { --- End diff -- What if anything failed inside this function, who will catch IOException and handle it? ---
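The reviewer's concern about error handling in `commitFile` can be illustrated with a hedged sketch (hypothetical names; `java.nio.file` used for brevity rather than the project's `FileFactory`/`CarbonUtil` helpers): if the copy fails, the partial target file should be cleaned up and the `IOException` propagated so the caller, e.g. `finish()`, can abort the write rather than silently continuing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/**
 * Hypothetical version of commitFile: copy a temp datamap file into the
 * store directory, cleaning up on failure and rethrowing so the caller
 * decides how to abort.
 */
public class CommitSketch {
    static Path commitFile(Path tempFile, Path storeDir) throws IOException {
        Path target = storeDir.resolve(tempFile.getFileName());
        try {
            Files.copy(tempFile, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            Files.deleteIfExists(target); // don't leave a partial file behind
            throw e;                      // surface the failure to the caller
        }
        Files.delete(tempFile); // copy succeeded; temp file no longer needed
        return target;
    }
}
```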
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151615573 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -755,7 +758,8 @@ private CarbonInputSplit convertToCarbonInputSplit(ExtendedBlocklet blocklet) org.apache.carbondata.hadoop.CarbonInputSplit.from(blocklet.getSegmentId(), new FileSplit(new Path(blocklet.getPath()), 0, blocklet.getLength(), blocklet.getLocations()), -ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber())); +ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()), +blocklet.getDataMapWriterPath()); --- End diff -- indentation not correct ---
[jira] [Updated] (CARBONDATA-1726) Carbon1.3.0-Streaming - Select query from spark-shell does not execute successfully for streaming table load
[ https://issues.apache.org/jira/browse/CARBONDATA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1726: Description: Steps : // prepare csv file for batch loading cd /srv/spark2.2Bigdata/install/hadoop/datanode/bin // generate streamSample.csv 10001,batch_1,city_1,0.1,school_1:school_11$20 10002,batch_2,city_2,0.2,school_2:school_22$30 10003,batch_3,city_3,0.3,school_3:school_33$40 10004,batch_4,city_4,0.4,school_4:school_44$50 10005,batch_5,city_5,0.5,school_5:school_55$60 // put to hdfs /tmp/streamSample.csv ./hadoop fs -put streamSample.csv /tmp // spark-beeline cd /srv/spark2.2Bigdata/install/spark/sparkJdbc bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar "hdfs://hacluster/user/sparkhive/warehouse" bin/beeline -u jdbc:hive2://10.18.98.34:23040 CREATE TABLE stream_table( id INT, name STRING, city STRING, salary FLOAT ) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 'sort_columns'='name'); LOAD DATA LOCAL INPATH 'hdfs://hacluster/chetan/streamSample.csv' INTO TABLE stream_table OPTIONS('HEADER'='false'); // spark-shell cd /srv/spark2.2Bigdata/install/spark/sparkJdbc bin/spark-shell --master yarn-client import java.io.{File, PrintWriter} import java.net.ServerSocket import org.apache.spark.sql.{CarbonEnv, SparkSession} import org.apache.spark.sql.hive.CarbonRelation import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.util.CarbonProperties import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") import 
org.apache.spark.sql.CarbonSession._ val carbonSession = SparkSession. builder(). appName("StreamExample"). config("spark.sql.warehouse.dir", "hdfs://hacluster/user/sparkhive/warehouse"). config("javax.jdo.option.ConnectionURL", "jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8"). config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver"). config("javax.jdo.option.ConnectionPassword", "huawei"). config("javax.jdo.option.ConnectionUserName", "sparksql"). getOrCreateCarbonSession() carbonSession.sparkContext.setLogLevel("ERROR") carbonSession.sql("select * from stream_table").show Issue : Select query from spark-shell does not execute successfully for streaming table load. When the executor and driver cores and memory is increased while launching the spark shell the issue still occurs. bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 scala> carbonSession.sql("select * from stream_table").show org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 25.0 failed 4 times, most recent failure: Lost task 0.3 in stage 25.0 (TID 65, BLR114269, executor 8): java.lang.IllegalStateException: unread block data at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2424) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1383) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75) at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114) at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:258) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAG
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151615263 --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java --- @@ -574,7 +482,9 @@ private CopyThread(String fileName) { * @throws Exception if unable to compute a result */ @Override public Void call() throws Exception { - copyCarbonDataFileToCarbonStorePath(fileName); + CarbonUtil.copyCarbonDataFileToCarbonStorePath(fileName, --- End diff -- move parameter to next line ---
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1516 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1213/ ---
[GitHub] carbondata issue #1515: [CARBONDATA-1751] Modify sys.err to AnalysisExceptio...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1515 Please review it @jackylk @QiangCai ---
[GitHub] carbondata issue #1517: [CARBONDATA-1750] Fix NPE when tablestatus file is e...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1517 LGTM ---
[jira] [Resolved] (CARBONDATA-1326) Fixed high priority findbug issues
[ https://issues.apache.org/jira/browse/CARBONDATA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1326. -- Resolution: Fixed Fix Version/s: 1.3.0 > Fixed high priority findbug issues > -- > > Key: CARBONDATA-1326 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1326 > Project: CarbonData > Issue Type: Bug >Reporter: Manish Gupta >Assignee: Manish Gupta >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 23h 50m > Remaining Estimate: 0h > > Currently there are a lot of findbug issues in the carbondata code. These need > to be prioritized and fixed. So through this JIRA all high-priority findbug > issues are addressed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata pull request #1507: [CARBONDATA-1326] Fixed high priority findbug...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1507 ---
[GitHub] carbondata issue #1507: [CARBONDATA-1326] Fixed high priority findbug issue
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1507 LGTM ---
[GitHub] carbondata pull request #1509: [CARBONDATA-1739] Clean up store path interfa...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1509 ---
[GitHub] carbondata issue #1509: [CARBONDATA-1739] Clean up store path interface
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/1509 LGTM ---
[GitHub] carbondata issue #1491: [CARBONDATA-1651] [Supported Boolean Type When Savin...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1491 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1212/ ---
[GitHub] carbondata pull request #1500: [CARBONDATA-1717]Remove spark broadcast for g...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1500 ---
[GitHub] carbondata pull request #1519: [CARBONDATA-1753]Fix missing 'org.scalatest.t...
GitHub user zzcclp opened a pull request: https://github.com/apache/carbondata/pull/1519 [CARBONDATA-1753]Fix missing 'org.scalatest.tools.Runner' issue when run test with streaming module Fix missing 'org.scalatest.tools.Runner' issue when run test with streaming module Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zzcclp/carbondata CARBONDATA-1753 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1519.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1519 commit 452d442e2a2779b5bb870a492b3b6ec7994b4161 Author: Zhang Zhichao <441586...@qq.com> Date: 2017-11-17T06:27:23Z [CARBONDATA-1753]Fix missing 'org.scalatest.tools.Runner' issue when run test with streaming module Fix missing 'org.scalatest.tools.Runner' issue when run test with streaming module ---
[GitHub] carbondata issue #1500: [CARBONDATA-1717]Remove spark broadcast for gettting...
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/1500 LGTM ---
[jira] [Created] (CARBONDATA-1753) Missing 'org.scalatest.tools.Runner' when run test with streaming module
Zhichao Zhang created CARBONDATA-1753:
--------------------------------------

Summary: Missing 'org.scalatest.tools.Runner' when run test with streaming module
Key: CARBONDATA-1753
URL: https://issues.apache.org/jira/browse/CARBONDATA-1753
Project: CarbonData
Issue Type: Bug
Components: build
Affects Versions: 1.3.0
Reporter: Zhichao Zhang
Assignee: Zhichao Zhang
Priority: Minor
Fix For: 1.3.0

'org.scalatest.tools.Runner' is missing when running tests with the streaming module. A scalatest dependency needs to be added to the streaming module's pom.xml.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
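The fix comes down to declaring scalatest in the streaming module's pom.xml so that `org.scalatest.tools.Runner` is on the test classpath. A minimal sketch of such a dependency block (the `${scala.binary.version}` property and `test` scope are illustrative assumptions, not copied from the actual patch):

```xml
<!-- Illustrative sketch: make org.scalatest.tools.Runner available
     to the streaming module's test run. -->
<dependency>
  <groupId>org.scalatest</groupId>
  <artifactId>scalatest_${scala.binary.version}</artifactId>
  <scope>test</scope>
</dependency>
```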
[GitHub] carbondata issue #1507: [CARBONDATA-1326] Fixed high priority findbug issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1507 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1211/ ---
[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1508 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1210/ ---
[GitHub] carbondata issue #1515: [CARBONDATA-1751] Modify sys.err to AnalysisExceptio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1515 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1209/ ---
[jira] [Updated] (CARBONDATA-1713) Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after creating pre-aggregate table when upper case used for column name
[ https://issues.apache.org/jira/browse/CARBONDATA-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna S updated CARBONDATA-1713: -- Summary: Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after creating pre-aggregate table when upper case used for column name (was: Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after creating pre-aggregate table) > Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after > creating pre-aggregate table when upper case used for column name > > > Key: CARBONDATA-1713 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1713 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.3.0 > Environment: ANT Test cluster - 3 node >Reporter: Ramakrishna S >Assignee: kumar vishal >Priority: Minor > Labels: Functional, sanity > Fix For: 1.3.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > 0: jdbc:hive2://10.18.98.34:23040> load data inpath > "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem > options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT'); > Error: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or > view 'lineitem' not found in database 'default'; (state=,code=0) > 0: jdbc:hive2://10.18.98.34:23040> create table if not exists lineitem( > 0: jdbc:hive2://10.18.98.34:23040> L_SHIPDATE string, > 0: jdbc:hive2://10.18.98.34:23040> L_SHIPMODE string, > 0: jdbc:hive2://10.18.98.34:23040> L_SHIPINSTRUCT string, > 0: jdbc:hive2://10.18.98.34:23040> L_RETURNFLAG string, > 0: jdbc:hive2://10.18.98.34:23040> L_RECEIPTDATE string, > 0: jdbc:hive2://10.18.98.34:23040> L_ORDERKEY string, > 0: jdbc:hive2://10.18.98.34:23040> L_PARTKEY string, > 0: jdbc:hive2://10.18.98.34:23040> L_SUPPKEY string, > 0: 
jdbc:hive2://10.18.98.34:23040> L_LINENUMBER int, > 0: jdbc:hive2://10.18.98.34:23040> L_QUANTITY double, > 0: jdbc:hive2://10.18.98.34:23040> L_EXTENDEDPRICE double, > 0: jdbc:hive2://10.18.98.34:23040> L_DISCOUNT double, > 0: jdbc:hive2://10.18.98.34:23040> L_TAX double, > 0: jdbc:hive2://10.18.98.34:23040> L_LINESTATUS string, > 0: jdbc:hive2://10.18.98.34:23040> L_COMMITDATE string, > 0: jdbc:hive2://10.18.98.34:23040> L_COMMENT string > 0: jdbc:hive2://10.18.98.34:23040> ) STORED BY 'org.apache.carbondata.format' > 0: jdbc:hive2://10.18.98.34:23040> TBLPROPERTIES > ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'=''); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.338 seconds) > 0: jdbc:hive2://10.18.98.34:23040> load data inpath > "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem > options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (48.634 seconds) > 0: jdbc:hive2://10.18.98.34:23040> create datamap agr_lineitem ON TABLE > lineitem USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as > select L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from > lineitem group by L_RETURNFLAG, L_LINESTATUS; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (16.552 seconds) > 0: jdbc:hive2://10.18.98.34:23040> select > L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem > group by L_RETURNFLAG, L_LINESTATUS; > Error: org.apache.spark.sql.AnalysisException: Column doesnot exists in Pre > Aggregate table; (state=,code=0) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (CARBONDATA-1713) Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after creating pre-aggregate table
[ https://issues.apache.org/jira/browse/CARBONDATA-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253213#comment-16253213 ] Ramakrishna S edited comment on CARBONDATA-1713 at 11/17/17 5:05 AM: - Changing severity based on the clarification provided, will use lower case for query till this issue is fixed. was (Author: ram@huawei): Changing severity based on the clarification given. > Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after > creating pre-aggregate table > --- > > Key: CARBONDATA-1713 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1713 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.3.0 > Environment: ANT Test cluster - 3 node >Reporter: Ramakrishna S >Assignee: kumar vishal >Priority: Minor > Labels: Functional, sanity > Fix For: 1.3.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > 0: jdbc:hive2://10.18.98.34:23040> load data inpath > "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem > options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT'); > Error: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or > view 'lineitem' not found in database 'default'; (state=,code=0) > 0: jdbc:hive2://10.18.98.34:23040> create table if not exists lineitem( > 0: jdbc:hive2://10.18.98.34:23040> L_SHIPDATE string, > 0: jdbc:hive2://10.18.98.34:23040> L_SHIPMODE string, > 0: jdbc:hive2://10.18.98.34:23040> L_SHIPINSTRUCT string, > 0: jdbc:hive2://10.18.98.34:23040> L_RETURNFLAG string, > 0: jdbc:hive2://10.18.98.34:23040> L_RECEIPTDATE string, > 0: jdbc:hive2://10.18.98.34:23040> L_ORDERKEY string, > 0: jdbc:hive2://10.18.98.34:23040> L_PARTKEY string, > 0: jdbc:hive2://10.18.98.34:23040> L_SUPPKEY string, > 0: jdbc:hive2://10.18.98.34:23040> L_LINENUMBER 
int, > 0: jdbc:hive2://10.18.98.34:23040> L_QUANTITY double, > 0: jdbc:hive2://10.18.98.34:23040> L_EXTENDEDPRICE double, > 0: jdbc:hive2://10.18.98.34:23040> L_DISCOUNT double, > 0: jdbc:hive2://10.18.98.34:23040> L_TAX double, > 0: jdbc:hive2://10.18.98.34:23040> L_LINESTATUS string, > 0: jdbc:hive2://10.18.98.34:23040> L_COMMITDATE string, > 0: jdbc:hive2://10.18.98.34:23040> L_COMMENT string > 0: jdbc:hive2://10.18.98.34:23040> ) STORED BY 'org.apache.carbondata.format' > 0: jdbc:hive2://10.18.98.34:23040> TBLPROPERTIES > ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'=''); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.338 seconds) > 0: jdbc:hive2://10.18.98.34:23040> load data inpath > "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem > options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (48.634 seconds) > 0: jdbc:hive2://10.18.98.34:23040> create datamap agr_lineitem ON TABLE > lineitem USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as > select L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from > lineitem group by L_RETURNFLAG, L_LINESTATUS; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (16.552 seconds) > 0: jdbc:hive2://10.18.98.34:23040> select > L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem > group by L_RETURNFLAG, L_LINESTATUS; > Error: org.apache.spark.sql.AnalysisException: Column doesnot exists in Pre > Aggregate table; (state=,code=0) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1726) Carbon1.3.0-Streaming - Select query from spark-shell does not execute successfully for streaming table load
[ https://issues.apache.org/jira/browse/CARBONDATA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1726: Description: Steps : // prepare csv file for batch loading cd /srv/spark2.2Bigdata/install/hadoop/datanode/bin // generate streamSample.csv 10001,batch_1,city_1,0.1,school_1:school_11$20 10002,batch_2,city_2,0.2,school_2:school_22$30 10003,batch_3,city_3,0.3,school_3:school_33$40 10004,batch_4,city_4,0.4,school_4:school_44$50 10005,batch_5,city_5,0.5,school_5:school_55$60 // put to hdfs /tmp/streamSample.csv ./hadoop fs -put streamSample.csv /tmp // spark-beeline cd /srv/spark2.2Bigdata/install/spark/sparkJdbc bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar "hdfs://hacluster/user/sparkhive/warehouse" bin/beeline -u jdbc:hive2://10.18.98.34:23040 CREATE TABLE stream_table( id INT, name STRING, city STRING, salary FLOAT ) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 'sort_columns'='name'); LOAD DATA LOCAL INPATH 'hdfs://hacluster/chetan/streamSample.csv' INTO TABLE stream_table OPTIONS('HEADER'='false'); // spark-shell cd /srv/spark2.2Bigdata/install/spark/sparkJdbc bin/spark-shell --master yarn-client import java.io.{File, PrintWriter} import java.net.ServerSocket import org.apache.spark.sql.{CarbonEnv, SparkSession} import org.apache.spark.sql.hive.CarbonRelation import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.util.CarbonProperties import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") import 
org.apache.spark.sql.CarbonSession._ val carbonSession = SparkSession. builder(). appName("StreamExample"). config("spark.sql.warehouse.dir", "hdfs://hacluster/user/sparkhive/warehouse"). config("javax.jdo.option.ConnectionURL", "jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8"). config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver"). config("javax.jdo.option.ConnectionPassword", "huawei"). config("javax.jdo.option.ConnectionUserName", "sparksql"). getOrCreateCarbonSession() carbonSession.sparkContext.setLogLevel("ERROR") carbonSession.sql("select * from stream_table").show Issue : Select query from spark-shell does not execute successfully for streaming table load. In AM logs for the failed attempt the below error is displayed. AM Container for appattempt_1510838225027_0014_01 exited with exitCode: 11 For more detailed output, check the application tracking page:http://BLR114278:45020/cluster/app/application_1510838225027_0014 Then click on links to logs of each attempt. Diagnostics: Exception from container-launch. 
Container id: container_e06_1510838225027_0014_01_01 Exit code: 11 Stack trace: ExitCodeException exitCode=11: at org.apache.hadoop.util.Shell.runCommand(Shell.java:636) at org.apache.hadoop.util.Shell.run(Shell.java:533) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:829) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:224) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:313) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Container exited with a non-zero exit code 11 and the last 4096 bytes from the error logs are : op.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:313) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Container exited with a non-zero exit code 1 and the last 4096 bytes from the error logs are : Java HotSpot(TM) 64-Bit Server VM warning: Setting CompressedClassSpaceSize has no effect when compressed class pointers are not used | org.apache.spark.internal.Logging$class.logWarning(Logging.scala:66) 2017-11
[GitHub] carbondata issue #1509: [CARBONDATA-1739] Clean up store path interface
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1509 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1208/ ---
[GitHub] carbondata issue #1491: [CARBONDATA-1651] [Supported Boolean Type When Savin...
Github user anubhav100 commented on the issue: https://github.com/apache/carbondata/pull/1491 retest this please ---
[GitHub] carbondata issue #1518: [CARBONDATA-1752] There are some scalastyle error sh...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1518 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1207/ ---
[jira] [Updated] (CARBONDATA-1726) Carbon1.3.0-Streaming - Select query from spark-shell does not execute successfully for streaming table load
[ https://issues.apache.org/jira/browse/CARBONDATA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1726: Priority: Blocker (was: Major) > Carbon1.3.0-Streaming - Select query from spark-shell does not execute > successfully for streaming table load > > > Key: CARBONDATA-1726 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1726 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.3.0 > Environment: 3 node ant cluster SUSE 11 SP4 >Reporter: Chetan Bhat >Priority: Blocker > Labels: Functional > > Steps : > // prepare csv file for batch loading > cd /srv/spark2.2Bigdata/install/hadoop/datanode/bin > // generate streamSample.csv > 10001,batch_1,city_1,0.1,school_1:school_11$20 > 10002,batch_2,city_2,0.2,school_2:school_22$30 > 10003,batch_3,city_3,0.3,school_3:school_33$40 > 10004,batch_4,city_4,0.4,school_4:school_44$50 > 10005,batch_5,city_5,0.5,school_5:school_55$60 > // put to hdfs /tmp/streamSample.csv > ./hadoop fs -put streamSample.csv /tmp > // spark-beeline > cd /srv/spark2.2Bigdata/install/spark/sparkJdbc > bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores > 5 --driver-memory 5G --num-executors 3 --class > org.apache.carbondata.spark.thriftserver.CarbonThriftServer > /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar > "hdfs://hacluster/user/sparkhive/warehouse" > bin/beeline -u jdbc:hive2://10.18.98.34:23040 > CREATE TABLE stream_table( > id INT, > name STRING, > city STRING, > salary FLOAT > ) > STORED BY 'carbondata' > TBLPROPERTIES('streaming'='true', 'sort_columns'='name'); > LOAD DATA LOCAL INPATH 'hdfs://hacluster/chetan/streamSample.csv' INTO TABLE > stream_table OPTIONS('HEADER'='false'); > // spark-shell > cd /srv/spark2.2Bigdata/install/spark/sparkJdbc > bin/spark-shell --master yarn-client > import java.io.{File, PrintWriter} > import java.net.ServerSocket > import 
org.apache.spark.sql.{CarbonEnv, SparkSession} > import org.apache.spark.sql.hive.CarbonRelation > import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} > import org.apache.carbondata.core.constants.CarbonCommonConstants > import org.apache.carbondata.core.util.CarbonProperties > import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} > CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, > "/MM/dd") > import org.apache.spark.sql.CarbonSession._ > val carbonSession = SparkSession. > builder(). > appName("StreamExample"). > config("spark.sql.warehouse.dir", > "hdfs://hacluster/user/sparkhive/warehouse"). > config("javax.jdo.option.ConnectionURL", > "jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8"). > config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver"). > config("javax.jdo.option.ConnectionPassword", "huawei"). > config("javax.jdo.option.ConnectionUserName", "sparksql"). > getOrCreateCarbonSession() > > carbonSession.sparkContext.setLogLevel("ERROR") > carbonSession.sql("select * from stream_table").show > Issue : Select query from spark-shell does not execute successfully for > streaming table load. > Expected : Select query from spark-shell should execute successfully for > streaming table load. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1516 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1206/ ---
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151598459

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java ---
@@ -0,0 +1,110 @@
+/*
--- End diff --

Can you explain why "DataMapWriter.java" was changed to "AbstractDataMapWriter.java"? Is it to make it easier for users to customize other types of datamap writer?

---
[GitHub] carbondata issue #1517: [CARBONDATA-1750] Fix NPE when tablestatus file is e...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1517 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1204/ ---
[GitHub] carbondata pull request #1518: [CARBONDATA-1752] There are some scalastyle e...
GitHub user xubo245 opened a pull request: https://github.com/apache/carbondata/pull/1518

[CARBONDATA-1752] There are some scalastyle error should be optimized in CarbonData

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed? No
- [ ] Any backward compatibility impacted? No
- [ ] Document update required? No
- [ ] Testing done No
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. No

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata fixStyle

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1518.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1518

commit 8a754e3527f5c2de47035d248b7ddb2d8181ff67
Author: xubo245 <601450...@qq.com>
Date: 2017-11-17T03:25:14Z
[CARBONDATA-1752] There are some scalastyle error should be optimized in CarbonData

---
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151595971

--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapMeta.java ---
@@ -19,15 +19,15 @@
 import java.util.List;
-import org.apache.carbondata.core.indexstore.schema.FilterType;
+import org.apache.carbondata.core.scan.filter.intf.ExpressionType;
 public class DataMapMeta {
   private List<String> indexedColumns;
-  private FilterType optimizedOperation;
+  private List<ExpressionType> optimizedOperation;
--- End diff --

In ExpressionType there is no "like" expression.

---
[jira] [Created] (CARBONDATA-1752) There are some scalastyle error should be optimized in CarbonData
xubo245 created CARBONDATA-1752:
--------------------------------

Summary: There are some scalastyle error should be optimized in CarbonData
Key: CARBONDATA-1752
URL: https://issues.apache.org/jira/browse/CARBONDATA-1752
Project: CarbonData
Issue Type: Bug
Components: file-format
Affects Versions: 1.2.0
Reporter: xubo245
Assignee: xubo245
Priority: Minor
Fix For: 1.3.0

There are some scalastyle errors in CarbonData that should be fixed, including removing unused imports, tidying up method definitions, and so on.
[GitHub] carbondata issue #1515: [CARBONDATA-1751] Modify sys.err to AnalysisExceptio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1515 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1203/ ---
[GitHub] carbondata issue #1509: [CARBONDATA-1739] Clean up store path interface
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1509 @zzcclp because I implement those PR one by one, so it is on top of #1504 ---
[GitHub] carbondata pull request #1517: [CARBONDATA-1750] Fix NPE when tablestatus fi...
GitHub user QiangCai opened a pull request: https://github.com/apache/carbondata/pull/1517

[CARBONDATA-1750] Fix NPE when tablestatus file is empty

- [x] Any interfaces changed? no
- [x] Any backward compatibility impacted? no
- [x] Document update required? no
- [x] Testing done no
- [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/QiangCai/carbondata segmentstatus

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1517.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1517

commit 0333be2f614da59e0622a4b82d7c2fe9dccbf1b1
Author: QiangCai
Date: 2017-11-17T02:45:13Z
fix npe when tablestatus is empty

---
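The guard this PR describes can be sketched as follows. `SegmentStatusGuard.readLoadMetadata` is a simplified, hypothetical stand-in for `SegmentStatusManager.readLoadMetadata` (the real method reads the tablestatus file through atomic file operations and parses it with Gson; the comma split below is only a placeholder for parsing):

```java
// Simplified stand-in for SegmentStatusManager.readLoadMetadata.
class SegmentStatusGuard {
    public static String[] readLoadMetadata(String tableStatusContent) {
        // Previously an empty tablestatus file led to a NullPointerException;
        // return an empty array for null or blank content instead.
        if (tableStatusContent == null || tableStatusContent.trim().isEmpty()) {
            return new String[0];
        }
        // Placeholder for real Gson parsing of LoadMetadataDetails entries.
        return tableStatusContent.split(",");
    }
}
```

The essential point is the early return of an empty result for null or blank content, replacing the NullPointerException.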
[GitHub] carbondata issue #1509: [CARBONDATA-1739] Clean up store path interface
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1509 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1201/ ---
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1516 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1202/ ---
[GitHub] carbondata issue #1516: [CARBONDATA-1729]Fix the compatibility issue with ha...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/1516 @QiangCai @jackylk please review, thanks. ---
[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...
GitHub user zzcclp opened a pull request: https://github.com/apache/carbondata/pull/1516

[CARBONDATA-1729] Fix the compatibility issue with hadoop <= 2.6 and 2.7

1. Recover profile of 'hadoop-2.2.0' to pom.xml
2. Use reflection mechanism to implement 'truncate' method

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done. Please provide details on:
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zzcclp/carbondata CARBONDATA-1729

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1516.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1516

commit 66e349b277251ebfb46adc48a833569de32e1799
Author: Zhang Zhichao <441586...@qq.com>
Date: 2017-11-17T02:29:12Z
[CARBONDATA-1729] Fix the compatibility issue with hadoop <= 2.6 and 2.7
1. Recover profile of 'hadoop-2.2.0' to pom.xml
2. Use reflection mechanism to implement 'truncate' method

---
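Point 2 (the reflection mechanism for 'truncate') can be sketched as follows. `OldFs` and `NewFs` are hypothetical stand-ins for `org.apache.hadoop.fs.FileSystem`; the real Hadoop method is `truncate(Path, long)` (added in Hadoop 2.7), while `String` is used here only to keep the sketch self-contained. The point is the capability probe: look the method up via reflection, and fall back when it is absent (Hadoop <= 2.6):

```java
import java.lang.reflect.Method;

// Probe for a truncate method via reflection so the same code runs on both
// Hadoop <= 2.6 (no FileSystem.truncate) and Hadoop >= 2.7.
class TruncateCompat {
    public static class OldFs { }          // models a Hadoop <= 2.6 FileSystem
    public static class NewFs {            // models a Hadoop >= 2.7 FileSystem
        public boolean truncate(String path, long newLength) { return true; }
    }

    public static Method findTruncate(Class<?> fsClass) {
        try {
            return fsClass.getMethod("truncate", String.class, long.class);
        } catch (NoSuchMethodException e) {
            return null;  // method absent: caller must use a fallback
        }
    }

    public static boolean truncateOrFallback(Object fs, String path, long len) {
        Method m = findTruncate(fs.getClass());
        if (m == null) {
            // Fallback sketch for old Hadoop: e.g. copy the first len bytes
            // to a new file and rename it over the original.
            return false;
        }
        try {
            return (Boolean) m.invoke(fs, path, len);
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }
}
```

Resolving the `Method` through reflection keeps the compile-time dependency on the Hadoop 2.7 API out of the class, which is why the hadoop-2.2.0 profile can build again.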
[GitHub] carbondata pull request #1515: [CARBONDATA-1751] Modify sys.err to AnalysisE...
GitHub user xubo245 opened a pull request: https://github.com/apache/carbondata/pull/1515

[CARBONDATA-1751] Modify sys.err to AnalysisException when users run related operations except IUD, compaction and alter

Carbon prints improper error messages; for example, it prints a system error when users run create table with the same column name, but it should print the related exception information. So we change the sys.error method to AnalysisException for related operations, except IUD, compaction and alter.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done. Please provide details on:
  - Whether new unit test cases have been added or why no new tests are required?
  - How it is tested? Please attach test report.
  - Is it a performance related change? Please attach the performance test report.
  - Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata fixSysError

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1515.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1515

commit 696f02d4ece90308e729d3d7fed222aa58b0e9c9
Author: xubo245 <601450...@qq.com>
Date: 2017-11-17T02:48:31Z
[CARBONDATA-1751] Modify sys.err to AnalysisException when users run related operations except IUD, compaction and alter

---
[jira] [Created] (CARBONDATA-1751) Modify sys.err to AnalysisException when users run related operations except IUD, compaction and alter
xubo245 created CARBONDATA-1751:
--------------------------------

Summary: Modify sys.err to AnalysisException when users run related operations except IUD, compaction and alter
Key: CARBONDATA-1751
URL: https://issues.apache.org/jira/browse/CARBONDATA-1751
Project: CarbonData
Issue Type: Bug
Components: spark-integration
Affects Versions: 1.2.0
Reporter: xubo245
Assignee: xubo245
Priority: Minor
Fix For: 1.3.0

Carbon prints improper error messages; for example, it prints a system error when users run create table with the same column name, but it should print the related exception information. So we change the sys.error method to AnalysisException for related operations, except IUD, compaction and alter.
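The intent of the change can be sketched with a duplicate-column check: raise a descriptive, analysis-style exception instead of calling `sys.error`, which surfaces as a bare `RuntimeException`. `CarbonAnalysisException` below is a hypothetical stand-in for Spark's `org.apache.spark.sql.AnalysisException`, and the validation logic is illustrative, not actual CarbonData parser code:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Sketch of the behaviour change: a descriptive analysis-style exception
// replaces the generic system error for a duplicate column name.
class DuplicateColumnCheck {
    // Hypothetical stand-in for org.apache.spark.sql.AnalysisException.
    public static class CarbonAnalysisException extends RuntimeException {
        public CarbonAnalysisException(String message) { super(message); }
    }

    public static void validateColumnNames(List<String> columns) {
        Set<String> seen = new HashSet<>();
        for (String c : columns) {
            // SQL identifiers are matched case-insensitively.
            if (!seen.add(c.toLowerCase(Locale.ROOT))) {
                throw new CarbonAnalysisException(
                    "Duplicate column name in table definition: " + c);
            }
        }
    }
}
```

With this pattern the user sees a message explaining what was wrong with the statement, rather than an opaque internal error.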
[jira] [Commented] (CARBONDATA-1742) Fix NullPointerException in SegmentStatusManager
[ https://issues.apache.org/jira/browse/CARBONDATA-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256384#comment-16256384 ] xubo245 commented on CARBONDATA-1742: - It has been added into https://github.com/apache/carbondata/pull/1507/files > Fix NullPointerException in SegmentStatusManager > > > Key: CARBONDATA-1742 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1742 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 1.2.0 >Reporter: xubo245 >Assignee: xubo245 >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > when loadFolderDetailsArray is null ,there is NullPointerException. We > should fix it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1501: [CARBONDATA-1713] Fixed Aggregate query on main tabl...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1501 Table name and column name are case sensitive, right? ---
[GitHub] carbondata issue #1509: [CARBONDATA-1739] Clean up store path interface
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/1509 I found the commit 'add s3 in filefactory' appears in many prs. ---
[jira] [Updated] (CARBONDATA-1729) The compatibility issue with hadoop <= 2.6 and 2.7
[ https://issues.apache.org/jira/browse/CARBONDATA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-1729: --- Summary: The compatibility issue with hadoop <= 2.6 and 2.7 (was: Fix the compatibility issue with hadoop <= 2.6 and 2.7) > The compatibility issue with hadoop <= 2.6 and 2.7 > -- > > Key: CARBONDATA-1729 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1729 > Project: CarbonData > Issue Type: Bug > Components: hadoop-integration >Affects Versions: 1.3.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang > Fix For: 1.3.0 > > > On branch master, when compiled with hadoop <= 2.6, it failed, the root cause > is using new API FileSystem.truncate which is added in hadoop 2.7. It needs > to implement a method called 'truncate' in file 'FileFactory.java' to support > hadoop <= 2.6. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1750) SegmentStatusManager.readLoadMetadata showing NPE if tablestatus file is empty
QiangCai created CARBONDATA-1750:
---------------------------------

Summary: SegmentStatusManager.readLoadMetadata showing NPE if tablestatus file is empty
Key: CARBONDATA-1750
URL: https://issues.apache.org/jira/browse/CARBONDATA-1750
Project: CarbonData
Issue Type: Bug
Reporter: QiangCai
Priority: Minor

SegmentStatusManager.readLoadMetadata throws an NPE if the tablestatus file is empty.
[jira] [Updated] (CARBONDATA-1729) Fix the compatibility issue with hadoop <= 2.6 and 2.7
[ https://issues.apache.org/jira/browse/CARBONDATA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-1729: --- Summary: Fix the compatibility issue with hadoop <= 2.6 and 2.7 (was: Recover to supporting Hadoop <= 2.6) > Fix the compatibility issue with hadoop <= 2.6 and 2.7 > -- > > Key: CARBONDATA-1729 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1729 > Project: CarbonData > Issue Type: Bug > Components: hadoop-integration >Affects Versions: 1.3.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang > Fix For: 1.3.0 > > > On branch master, when compiled with hadoop <= 2.6, it failed, the root cause > is using new API FileSystem.truncate which is added in hadoop 2.7. It needs > to implement a method called 'truncate' in file 'FileFactory.java' to support > hadoop <= 2.6. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
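For context on the compatibility gap described above: `FileSystem.truncate` only exists from Hadoop 2.7 onwards, so a `truncate` helper in `FileFactory` must fall back to another mechanism on older versions. For local files, truncation can be done with `RandomAccessFile.setLength`; the following is a hedged, self-contained sketch of that fallback, not the actual CarbonData implementation (which also has to handle HDFS and other file types):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class TruncateCompat {
    // Truncate a local file to newLength without using the Hadoop 2.7+
    // FileSystem.truncate API; returns the resulting file length.
    public static long truncate(String path, long newLength) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.setLength(newLength);
        }
        return new File(path).length();
    }

    // Small demo: write 10 bytes, truncate to 4, return final length
    // (or -1 if any I/O error occurred).
    public static long demo() {
        try {
            File f = File.createTempFile("truncate", ".tmp");
            f.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                raf.write(new byte[10]);
            }
            return truncate(f.getAbsolutePath(), 4);
        } catch (IOException e) {
            return -1L;
        }
    }
}
```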
[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1471 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1200/ ---
[GitHub] carbondata issue #1509: [CARBONDATA-1739] Clean up store path interface
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1509 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1199/ ---
[GitHub] carbondata pull request #1504: [CARBONDATA-1732] Add S3 support in FileFacto...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1504 ---
[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/1471 retest this please ---
[GitHub] carbondata issue #1509: [CARBONDATA-1739] Clean up store path interface
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1509 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1198/ ---
[GitHub] carbondata issue #1504: [CARBONDATA-1732] Add S3 support in FileFactory
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1504 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1197/ ---
[GitHub] carbondata issue #1504: [CARBONDATA-1732] Add S3 support in FileFactory
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1504 retest this please ---
[jira] [Assigned] (CARBONDATA-1740) Carbon1.3.0-Pre-AggregateTable - Query plan exception for aggregate query with order by when main table is having pre-aggregate table
[ https://issues.apache.org/jira/browse/CARBONDATA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kumar vishal reassigned CARBONDATA-1740: Assignee: kumar vishal > Carbon1.3.0-Pre-AggregateTable - Query plan exception for aggregate query > with order by when main table is having pre-aggregate table > - > > Key: CARBONDATA-1740 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1740 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.0 > Environment: Test - 3 node ant cluster >Reporter: Ramakrishna S >Assignee: kumar vishal > Labels: DFX > Fix For: 1.3.0 > > > lineitem3: has a pre-aggregate table > select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from > lineitem3 group by l_returnflag, l_linestatus order by l_returnflag, > l_linestatus; > Error: org.apache.spark.sql.AnalysisException: expression > '`lineitem3_l_returnflag`' is neither present in the group by, nor is it an > aggregate function. Add to group by or wrap in first() (or first_value) if > you don't care which value you get.;; > Project [l_returnflag#2356, l_linestatus#2366, sum(l_quantity)#2791, > sum(l_extendedprice)#2792] > +- Sort [aggOrder#2795 ASC NULLS FIRST, aggOrder#2796 ASC NULLS FIRST], true >+- !Aggregate [l_returnflag#2356, l_linestatus#2366], [l_returnflag#2356, > l_linestatus#2366, sum(l_quantity#2362) AS sum(l_quantity)#2791, > sum(l_extendedprice#2363) AS sum(l_extendedprice)#2792, > lineitem3_l_returnflag#2341 AS aggOrder#2795, lineitem3_l_linestatus#2342 AS > aggOrder#2796] > +- SubqueryAlias lineitem3 > +- > Relation[L_SHIPDATE#2353,L_SHIPMODE#2354,L_SHIPINSTRUCT#2355,L_RETURNFLAG#2356,L_RECEIPTDATE#2357,L_ORDERKEY#2358,L_PARTKEY#2359,L_SUPPKEY#2360,L_LINENUMBER#2361,L_QUANTITY#2362,L_EXTENDEDPRICE#2363,L_DISCOUNT#2364,L_TAX#2365,L_LINESTATUS#2366,L_COMMITDATE#2367,L_COMMENT#2368] > CarbonDatasourceHadoopRelation [ Database name :test_db1, Table name > :lineitem3, Schema 
:Some(StructType(StructField(L_SHIPDATE,StringType,true), > StructField(L_SHIPMODE,StringType,true), > StructField(L_SHIPINSTRUCT,StringType,true), > StructField(L_RETURNFLAG,StringType,true), > StructField(L_RECEIPTDATE,StringType,true), > StructField(L_ORDERKEY,StringType,true), > StructField(L_PARTKEY,StringType,true), > StructField(L_SUPPKEY,StringType,true), > StructField(L_LINENUMBER,IntegerType,true), > StructField(L_QUANTITY,DoubleType,true), > StructField(L_EXTENDEDPRICE,DoubleType,true), > StructField(L_DISCOUNT,DoubleType,true), StructField(L_TAX,DoubleType,true), > StructField(L_LINESTATUS,StringType,true), > StructField(L_COMMITDATE,StringType,true), > StructField(L_COMMENT,StringType,true))) ] (state=,code=0) > lineitem4: no pre-aggregate table created > select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from > lineitem4 group by l_returnflag, l_linestatus order by l_returnflag, > l_linestatus; > +---+---+--++--+ > | l_returnflag | l_linestatus | sum(l_quantity) | sum(l_extendedprice) | > +---+---+--++--+ > | A | F | 1.263625E7 | 1.8938515425239815E10 | > | N | F | 327800.0 | 4.91387677622E8| > | N | O | 2.5398626E7 | 3.810981608977963E10 | > | R | F | 1.2643878E7 | 1.8948524305619884E10 | > +---+---+--++--+ > *+Expected:+*: aggregate query with order by should run fine > *+Actual:+* aggregate query with order failed -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (CARBONDATA-1740) Carbon1.3.0-Pre-AggregateTable - Query plan exception for aggregate query with order by when main table is having pre-aggregate table
[ https://issues.apache.org/jira/browse/CARBONDATA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255622#comment-16255622 ] kumar vishal commented on CARBONDATA-1740: -- This is failing because of order by in query. In PreAggregate rules order by scenario is not handled > Carbon1.3.0-Pre-AggregateTable - Query plan exception for aggregate query > with order by when main table is having pre-aggregate table > - > > Key: CARBONDATA-1740 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1740 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.0 > Environment: Test - 3 node ant cluster >Reporter: Ramakrishna S > Labels: DFX > Fix For: 1.3.0 > > > lineitem3: has a pre-aggregate table > select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from > lineitem3 group by l_returnflag, l_linestatus order by l_returnflag, > l_linestatus; > Error: org.apache.spark.sql.AnalysisException: expression > '`lineitem3_l_returnflag`' is neither present in the group by, nor is it an > aggregate function. 
Add to group by or wrap in first() (or first_value) if > you don't care which value you get.;; > Project [l_returnflag#2356, l_linestatus#2366, sum(l_quantity)#2791, > sum(l_extendedprice)#2792] > +- Sort [aggOrder#2795 ASC NULLS FIRST, aggOrder#2796 ASC NULLS FIRST], true >+- !Aggregate [l_returnflag#2356, l_linestatus#2366], [l_returnflag#2356, > l_linestatus#2366, sum(l_quantity#2362) AS sum(l_quantity)#2791, > sum(l_extendedprice#2363) AS sum(l_extendedprice)#2792, > lineitem3_l_returnflag#2341 AS aggOrder#2795, lineitem3_l_linestatus#2342 AS > aggOrder#2796] > +- SubqueryAlias lineitem3 > +- > Relation[L_SHIPDATE#2353,L_SHIPMODE#2354,L_SHIPINSTRUCT#2355,L_RETURNFLAG#2356,L_RECEIPTDATE#2357,L_ORDERKEY#2358,L_PARTKEY#2359,L_SUPPKEY#2360,L_LINENUMBER#2361,L_QUANTITY#2362,L_EXTENDEDPRICE#2363,L_DISCOUNT#2364,L_TAX#2365,L_LINESTATUS#2366,L_COMMITDATE#2367,L_COMMENT#2368] > CarbonDatasourceHadoopRelation [ Database name :test_db1, Table name > :lineitem3, Schema :Some(StructType(StructField(L_SHIPDATE,StringType,true), > StructField(L_SHIPMODE,StringType,true), > StructField(L_SHIPINSTRUCT,StringType,true), > StructField(L_RETURNFLAG,StringType,true), > StructField(L_RECEIPTDATE,StringType,true), > StructField(L_ORDERKEY,StringType,true), > StructField(L_PARTKEY,StringType,true), > StructField(L_SUPPKEY,StringType,true), > StructField(L_LINENUMBER,IntegerType,true), > StructField(L_QUANTITY,DoubleType,true), > StructField(L_EXTENDEDPRICE,DoubleType,true), > StructField(L_DISCOUNT,DoubleType,true), StructField(L_TAX,DoubleType,true), > StructField(L_LINESTATUS,StringType,true), > StructField(L_COMMITDATE,StringType,true), > StructField(L_COMMENT,StringType,true))) ] (state=,code=0) > lineitem4: no pre-aggregate table created > select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from > lineitem4 group by l_returnflag, l_linestatus order by l_returnflag, > l_linestatus; > +---+---+--++--+ > | l_returnflag | l_linestatus | sum(l_quantity) | 
sum(l_extendedprice) | > +---+---+--++--+ > | A | F | 1.263625E7 | 1.8938515425239815E10 | > | N | F | 327800.0 | 4.91387677622E8| > | N | O | 2.5398626E7 | 3.810981608977963E10 | > | R | F | 1.2643878E7 | 1.8948524305619884E10 | > +---+---+--++--+ > *+Expected:+*: aggregate query with order by should run fine > *+Actual:+* aggregate query with order failed -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1435: [CARBONDATA-1626]add data size and index size in tab...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1435 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1196/ ---
[GitHub] carbondata issue #1504: [CARBONDATA-1732] Add S3 support in FileFactory
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1504 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1195/ ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151458293 --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala --- @@ -292,6 +290,35 @@ object CarbonDataRDDFactory { var executorMessage: String = "" val isSortTable = carbonTable.getNumberOfSortColumns > 0 val sortScope = CarbonDataProcessorUtil.getSortScope(carbonLoadModel.getSortScope) + +def updateStatus(status: Array[(String, (LoadMetadataDetails, ExecutionErrors))], --- End diff -- do not update table status file separately in separate method for size, add size while adding loadmetadata details to table status ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151446545 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -2119,5 +2127,146 @@ public static String getNewTablePath(Path carbonTablePath, return parentPath.toString() + CarbonCommonConstants.FILE_SEPARATOR + carbonTableIdentifier .getTableName(); } + + /* + * This method will add data size and index size into tablestatus for each segment + */ + public static void addDataIndexSizeIntoMetaEntry(LoadMetadataDetails loadMetadataDetails, + String segmentId, CarbonTable carbonTable) throws IOException { +CarbonTablePath carbonTablePath = + CarbonStorePath.getCarbonTablePath((carbonTable.getAbsoluteTableIdentifier())); +HashMap dataIndexSize = +FileFactory.getDataSizeAndIndexSize(carbonTablePath, segmentId); +loadMetadataDetails + .setDataSize(dataIndexSize.get(CarbonCommonConstants.CARBON_TOTAL_DATA_SIZE).toString()); +loadMetadataDetails + .setIndexSize(dataIndexSize.get(CarbonCommonConstants.CARBON_TOTAL_INDEX_SIZE).toString()); + } + + /** + * This method will calculate the data size and index size for carbon table + */ + public static HashMap calculateSize(CarbonTable carbonTable) --- End diff -- Update the method signature Map ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151446750 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -2119,5 +2127,146 @@ public static String getNewTablePath(Path carbonTablePath, return parentPath.toString() + CarbonCommonConstants.FILE_SEPARATOR + carbonTableIdentifier .getTableName(); } + + /* + * This method will add data size and index size into tablestatus for each segment + */ + public static void addDataIndexSizeIntoMetaEntry(LoadMetadataDetails loadMetadataDetails, + String segmentId, CarbonTable carbonTable) throws IOException { +CarbonTablePath carbonTablePath = + CarbonStorePath.getCarbonTablePath((carbonTable.getAbsoluteTableIdentifier())); +HashMap dataIndexSize = --- End diff -- Change it to Map ---
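Both of these review comments ask for the usual "program to the interface" convention: declare locals and return types as `Map`, keeping `HashMap` only at the construction site. A tiny illustration (the key names are shortened here; the real code keys on `CarbonCommonConstants` fields):

```java
import java.util.HashMap;
import java.util.Map;

public class DataIndexSize {
    // Return type and local variable use the Map interface; the concrete
    // HashMap appears only where the instance is created, as requested.
    public static Map<String, Long> sizes(long dataSize, long indexSize) {
        Map<String, Long> dataIndexSize = new HashMap<>();
        dataIndexSize.put("CARBON_TOTAL_DATA_SIZE", dataSize);
        dataIndexSize.put("CARBON_TOTAL_INDEX_SIZE", indexSize);
        return dataIndexSize;
    }
}
```

Callers that only read or iterate the result then stay decoupled from the concrete collection type.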
[jira] [Updated] (CARBONDATA-1749) (Carbon1.3.0- DB creation external path) - mdt file is not created in directory as per configuration in carbon.properties
[ https://issues.apache.org/jira/browse/CARBONDATA-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1749: Description: Steps : In carbon.properties the mdt file directory path is configured as Carbon.update.sync.folder=hdfs://hacluster/user/test1 or /tmp/test1/ In beeline user creates a database by specifying the carbon store path and creates a carbon table in the db. drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); User checks in HDFS UI if the mdt file is created in directory specified (hdfs://hacluster/user/test1) as per configuration in carbon.properties. Issue : mdt file is not created in directory specified (hdfs://hacluster/user/test1) as per configuration in carbon.properties. Also the folder is not created if the user configures the folder path as Carbon.update.sync.folder=/tmp/test1/ Expected : mdt file should be created in directory specified (hdfs://hacluster/user/test1) or /tmp/test1/ as per configuration in carbon.properties. was: Steps : In carbon.properties the mdt file directory path is configured as Carbon.update.sync.folder=hdfs://hacluster/user/test1 In beeline user creates a database by specifying the carbon store path and creates a carbon table in the db. 
drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); User checks in HDFS UI if the mdt file is created in directory specified (hdfs://hacluster/user/test1) as per configuration in carbon.properties. Issue : mdt file is not created in directory specified (hdfs://hacluster/user/test1) as per configuration in carbon.properties. Expected : mdt file should be created in directory specified (hdfs://hacluster/user/test1) as per configuration in carbon.properties. > (Carbon1.3.0- DB creation external path) - mdt file is not created in > directory as per configuration in carbon.properties > - > > Key: CARBONDATA-1749 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1749 > Project: CarbonData > Issue Type: Bug > Components: other >Affects Versions: 1.3.0 > Environment: 3 node cluster >Reporter: Chetan Bhat > Labels: Functional > > Steps : > In carbon.properties the mdt file directory path is configured as > Carbon.update.sync.folder=hdfs://hacluster/user/test1 or /tmp/test1/ > In beeline user creates a database by specifying the carbon store path and > creates a carbon table in the db. 
> drop database if exists test_db1 cascade; > create database test_db1 location 'hdfs://hacluster/user/test1'; > use test_db1; > create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY > string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE > double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY > 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); > User checks in HDFS UI if the mdt file is created in directory specified > (hdfs://hacluster/user/test1) as per configuration in carbon.properties. > Issue : mdt file is not created in directory specified > (hdfs://hacluster/user/test1) as per configuration in carbon.properties. Also > the folder is not created if the user configures the folder path as > Carbon.update.sync.folder=/tmp/test1/ > Expected : mdt file should be created in directory specified > (hdfs://hacluster/user/test1) or /tmp/test1/ as per configuration in > carbon.properties. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1507: [CARBONDATA-1326] Fixed high priority findbug issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1507 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1194/ ---
[jira] [Updated] (CARBONDATA-1748) (Carbon1.3.0- DB creation external path) - Permission of created table and database folder in carbon store not correct
[ https://issues.apache.org/jira/browse/CARBONDATA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1748: Description: Steps : In spark Beeline user executes the following queries. drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); User checks the permission of the created database and table in carbon store using the bin/hadoop fs -getfacl command. Issue : The Permission of created table and database folder in carbon store not correct. i.e # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::r-x other::r-x Expected : Correct permissions for the created table and database folder in carbon store should be # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::--- other::--- was: Steps : drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); User checks the permission of the created database and table in carbon store using the bin/hadoop fs -getfacl command. Issue : The Permission of created table and database folder in carbon store not correct. 
i.e # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::r-x other::r-x Expected : Correct permissions for the created table and database folder in carbon store should be # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::--- other::--- > (Carbon1.3.0- DB creation external path) - Permission of created table and > database folder in carbon store not correct > -- > > Key: CARBONDATA-1748 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1748 > Project: CarbonData > Issue Type: Bug > Components: other >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels: security > > Steps : > In spark Beeline user executes the following queries. > drop database if exists test_db1 cascade; > create database test_db1 location 'hdfs://hacluster/user/test1'; > use test_db1; > create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY > string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE > double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY > 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); > User checks the permission of the created database and table in carbon store > using the bin/hadoop fs -getfacl command. > Issue : The Permission of created table and database folder in carbon store > not correct. i.e > # file: /user/test1/orders > # owner: anonymous > # group: users > user::rwx > group::r-x > other::r-x > Expected : Correct permissions for the created table and database folder in > carbon store should be > # file: /user/test1/orders > # owner: anonymous > # group: users > user::rwx > group::--- > other::--- -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1749) (Carbon1.3.0- DB creation external path) - mdt file is not created in directory as per configuration in carbon.properties
Chetan Bhat created CARBONDATA-1749: --- Summary: (Carbon1.3.0- DB creation external path) - mdt file is not created in directory as per configuration in carbon.properties Key: CARBONDATA-1749 URL: https://issues.apache.org/jira/browse/CARBONDATA-1749 Project: CarbonData Issue Type: Bug Components: other Affects Versions: 1.3.0 Environment: 3 node cluster Reporter: Chetan Bhat Steps : In carbon.properties the mdt file directory path is configured as Carbon.update.sync.folder=hdfs://hacluster/user/test1 In beeline user creates a database by specifying the carbon store path and creates a carbon table in the db. drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); User checks in HDFS UI if the mdt file is created in directory specified (hdfs://hacluster/user/test1) as per configuration in carbon.properties. Issue : mdt file is not created in directory specified (hdfs://hacluster/user/test1) as per configuration in carbon.properties. Expected : mdt file should be created in directory specified (hdfs://hacluster/user/test1) as per configuration in carbon.properties. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1747) (Carbon1.3.0- DB creation external path) - Owner name of compacted segment and segment after update is not correct
[ https://issues.apache.org/jira/browse/CARBONDATA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1747: Description: Steps : In spark Beeline user executes the following queries drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); alter table ORDERS compact 'major'; update orders set (O_ORDERKEY)=(1) where O_CUSTKEY=6259021; After compaction and update user checks the Owner name of compacted segment and segment name after update in HDFS UI. Issue : In HDFS UI before compaction and update the owner name of the existing segment folders was "anonymous". 
After compaction and update the owner name of the compacted segment folder and segment which is impacted by update is displayed as "root". Expected : After compaction and update the owner name of the compacted segment folder and segment which is impacted by update should be "anonymous". was: Steps : User executes the following queries drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); alter table ORDERS compact 'major'; update orders set (O_ORDERKEY)=(1) where O_CUSTKEY=6259021; After compaction and update user checks the Owner name of compacted segment and segment name after update in HDFS UI. 
Issue : In HDFS UI before compaction and update the owner name of the existing segment folders was "anonymous". After compaction and update the owner name of the compacted segment folder and segment which is impacted by update is displayed as "root". Expected : After compaction and update the owner name of the compacted segment folder and segment which is impacted by update should be "anonymous". > (Carbon1.3.0- DB creation external path) - Owner name of compacted segment > and segment after update is not correct > -- > > Key: CARBONDATA-1747 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1747 > Project: CarbonData > Issue Type: Bug > Components: other >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels:
[GitHub] carbondata issue #1513: [CARBONDATA-1745] Use default metastore path from Hi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1513 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1193/ ---
[GitHub] carbondata issue #1514: [CARBONDATA-1746] Count star optimization
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1514 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1192/ ---
[GitHub] carbondata issue #1435: [CARBONDATA-1626]add data size and index size in tab...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1435 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1191/ ---
[GitHub] carbondata issue #1512: [CARBONDATA-1742] Fix NullPointerException in Segmen...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1512 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1190/ ---
[GitHub] carbondata issue #1513: [CARBONDATA-1745] Use default metastore path from Hi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1513 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1189/ ---
[GitHub] carbondata pull request #1505: [CARBONDATA-1733] While load is in progress, ...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1505 ---
[GitHub] carbondata issue #1494: [CARBONDATA-1706] Making index merge DDL insensitive...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1494 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1188/ ---
[GitHub] carbondata issue #1505: [CARBONDATA-1733] While load is in progress, Show se...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/1505 LGTM ---
[jira] [Updated] (CARBONDATA-1748) (Carbon1.3.0- DB creation external path) - Permission of created table and database folder in carbon store not correct
[ https://issues.apache.org/jira/browse/CARBONDATA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1748: Description: Steps : drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); User checks the permission of the created database and table in carbon store using the bin/hadoop fs -getfacl command. Issue : The Permission of created table and database folder in carbon store not correct. i.e # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::r-x other::r-x Expected : Correct permissions for the created table and database folder in carbon store should be # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::--- other::--- was: Steps : drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); User checks the permission of the created database and table in carbon store using the bin/hadoop fs -getfacl command. Issue : The Permission of created table and database folder in carbon store not correct. 
i.e # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::r-x other::r-x Expected : Correct permissions should be # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::--- other::--- > (Carbon1.3.0- DB creation external path) - Permission of created table and > database folder in carbon store not correct > -- > > Key: CARBONDATA-1748 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1748 > Project: CarbonData > Issue Type: Bug > Components: other >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels: security > > Steps : > drop database if exists test_db1 cascade; > create database test_db1 location 'hdfs://hacluster/user/test1'; > use test_db1; > create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY > string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE > double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY > 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); > User checks the permission of the created database and table in carbon store > using the bin/hadoop fs -getfacl command. > Issue : The Permission of created table and database folder in carbon store > not correct. i.e > # file: /user/test1/orders > # owner: anonymous > # group: users > user::rwx > group::r-x > other::r-x > Expected : Correct permissions for the created table and database folder in > carbon store should be > # file: /user/test1/orders > # owner: anonymous > # group: users > user::rwx > group::--- > other::--- -- This message was sent by Atlassian JIRA (v6.4.14#64029)
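For context, the reported-vs-expected ACLs above can be checked programmatically. Below is a minimal Python sketch — a hypothetical helper, not CarbonData code — that parses `hadoop fs -getfacl` output and flags the permissive group/other entries this issue describes:

```python
# Illustrative sketch (not CarbonData code): parse `hadoop fs -getfacl` output
# and flag group/other entries that grant any access. The function name and
# the embedded sample output are taken from the issue report above.

def find_permissive_entries(getfacl_output: str):
    """Return ACL entries granting group/other any access on the path."""
    permissive = []
    for line in getfacl_output.splitlines():
        line = line.strip()
        # ACL entry lines look like "group::r-x" or "other::---"
        if line.startswith(("group::", "other::")):
            who, _, perms = line.split(":", 2)
            if perms != "---":
                permissive.append((who, perms))
    return permissive

# Actual output reported in the issue: group and other can read/execute.
actual = """# file: /user/test1/orders
# owner: anonymous
# group: users
user::rwx
group::r-x
other::r-x"""

print(find_permissive_entries(actual))  # [('group', 'r-x'), ('other', 'r-x')]
```

On the expected ACLs (`group::---`, `other::---`) the helper would return an empty list.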
[jira] [Created] (CARBONDATA-1748) (Carbon1.3.0- DB creation external path) - Permission of created table and database folder in carbon store not correct
Chetan Bhat created CARBONDATA-1748: --- Summary: (Carbon1.3.0- DB creation external path) - Permission of created table and database folder in carbon store not correct Key: CARBONDATA-1748 URL: https://issues.apache.org/jira/browse/CARBONDATA-1748 Project: CarbonData Issue Type: Bug Components: other Affects Versions: 1.3.0 Environment: 3 node ant cluster Reporter: Chetan Bhat Steps : drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); User checks the permission of the created database and table in carbon store using the bin/hadoop fs -getfacl command. Issue : The Permission of created table and database folder in carbon store not correct. i.e # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::r-x other::r-x Expected : Correct permissions should be # file: /user/test1/orders # owner: anonymous # group: users user::rwx group::--- other::--- -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata pull request #1432: [CARBONDATA-1608]Support Column Comment for C...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1432 ---
[jira] [Assigned] (CARBONDATA-1743) Carbon1.3.0-Pre-AggregateTable - Query returns no value if run at the time of pre-aggregate table creation
[ https://issues.apache.org/jira/browse/CARBONDATA-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor reassigned CARBONDATA-1743: Assignee: Kunal Kapoor > Carbon1.3.0-Pre-AggregateTable - Query returns no value if run at the time of > pre-aggregate table creation > -- > > Key: CARBONDATA-1743 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1743 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.0 > Environment: Test - 3 node ant cluster >Reporter: Ramakrishna S >Assignee: Kunal Kapoor > Labels: DFX > Fix For: 1.3.0 > > > Steps: > 1. Create table and load with large data > create table if not exists lineitem4(L_SHIPDATE string,L_SHIPMODE > string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE > string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER > int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX > double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY > 'org.apache.carbondata.format' TBLPROPERTIES > ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'=''); > load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table > lineitem4 > options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT'); > 2. Create a pre-aggregate table > create datamap agr_lineitem4 ON TABLE lineitem4 USING > "org.apache.carbondata.datamap.AggregateDataMapHandler" as select > L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem4 > group by L_RETURNFLAG, L_LINESTATUS; > 3. 
Run aggregate query at the same time
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus;
> *+Expected:+* aggregate query should fetch data either from main table or pre-aggregate table.
> *+Actual:+* aggregate query does not return data until the pre-aggregate table is created
> 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus;
> +---------------+---------------+------------------+------------------------+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
> +---------------+---------------+------------------+------------------------+
> +---------------+---------------+------------------+------------------------+
> No rows selected (1.74 seconds)
> 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus;
> +---------------+---------------+------------------+------------------------+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
> +---------------+---------------+------------------+------------------------+
> +---------------+---------------+------------------+------------------------+
> No rows selected (0.746 seconds)
> 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus;
> +---------------+---------------+------------------+------------------------+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
> +---------------+---------------+------------------+------------------------+
> | N             | F             | 2.9808092E7      | 4.471079473931997E10   |
> | A             | F             | 1.145546488E9    | 1.717580824169429E12   |
> | N             | O             | 2.31980219E9     | 3.4789002701143467E12  |
> | R             | F             | 1.146403932E9    | 1.7190627928317903E12  |
> +---------------+---------------+------------------+------------------------+
> 4 rows selected (0.8 seconds)
> 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus;
> +---------------+---------------+------------------+------------------------+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)   |
> +---------------+---------------+------------------+------------------------+
> | N             | F             | 2.9808092E7      | 4.471079473931997E10   |
> | A             | F             | 1.145546488E9    | 1.717580824169429E12   |
> | N             | O             | 2.31980219E9
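The expected behavior described above — queries falling back to the main table while the pre-aggregate table is still being built — amounts to a simple routing rule in the planner. Below is an illustrative Python sketch of that rule, with hypothetical names and status values; it is not CarbonData's actual planner code:

```python
# Illustrative sketch (hypothetical names and statuses, not CarbonData's
# planner code): route an aggregate query to the pre-aggregate table only
# once the datamap is fully built; otherwise fall back to the main table.

def choose_table(main_table: str, datamaps: dict) -> str:
    """datamaps maps pre-agg table name -> status ('BUILDING' or 'ENABLED')."""
    for name, status in datamaps.items():
        if status == "ENABLED":
            return name       # safe to answer from the pre-aggregate table
    return main_table         # under construction (or absent): scan main table

# While agr_lineitem4 is still building, queries should hit lineitem4 itself
# rather than returning an empty result from the half-built table.
print(choose_table("lineitem4", {"agr_lineitem4": "BUILDING"}))  # lineitem4
print(choose_table("lineitem4", {"agr_lineitem4": "ENABLED"}))   # agr_lineitem4
```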
[jira] [Created] (CARBONDATA-1747) (Carbon1.3.0- DB creation external path) - Owner name of compacted segment and segment after update is not correct
Chetan Bhat created CARBONDATA-1747: --- Summary: (Carbon1.3.0- DB creation external path) - Owner name of compacted segment and segment after update is not correct Key: CARBONDATA-1747 URL: https://issues.apache.org/jira/browse/CARBONDATA-1747 Project: CarbonData Issue Type: Bug Components: other Affects Versions: 1.3.0 Environment: 3 node ant cluster Reporter: Chetan Bhat Steps : User executes the following queries drop database if exists test_db1 cascade; create database test_db1 location 'hdfs://hacluster/user/test1'; use test_db1; create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); alter table ORDERS compact 'major'; update orders set (O_ORDERKEY)=(1) where O_CUSTKEY=6259021; After compaction and update user checks the Owner name of compacted 
segment and segment name after update in HDFS UI. Issue : In HDFS UI before compaction and update the owner name of the existing segment folders was "anonymous". After compaction and update the owner name of the compacted segment folder and segment which is impacted by update is displayed as "root". Expected : After compaction and update the owner name of the compacted segment folder and segment which is impacted by update should be "anonymous". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
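The owner mismatch described above can be spotted by scanning the HDFS listing of the table's segment directories. Below is a minimal Python sketch — a hypothetical helper, not CarbonData code — that parses `hdfs dfs -ls` output and flags directories whose owner differs from the expected user:

```python
# Illustrative sketch (hypothetical helper, not CarbonData code): parse
# `hdfs dfs -ls` output and flag segment directories whose owner differs
# from the expected user, as reported after compaction/update.

def segments_with_wrong_owner(ls_output: str, expected_owner: str):
    wrong = []
    for line in ls_output.splitlines():
        parts = line.split()
        # ls lines: perms replication owner group size date time path
        if len(parts) >= 8 and parts[0].startswith("d"):
            owner, path = parts[2], parts[-1]
            if owner != expected_owner:
                wrong.append((path, owner))
    return wrong

# Sample listing modeled on the issue: the compacted segment ends up as root.
listing = """drwxr-xr-x   - anonymous users 0 2017-11-16 10:00 /user/test1/orders/Fact/Part0/Segment_0
drwxr-xr-x   - root      users 0 2017-11-16 10:05 /user/test1/orders/Fact/Part0/Segment_0.1"""

print(segments_with_wrong_owner(listing, "anonymous"))
```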
[GitHub] carbondata issue #1504: [CARBONDATA-1732] Add S3 support in FileFactory
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1504 retest this please ---
[GitHub] carbondata pull request #1514: [CARBONDATA-1746] Count star optimization
GitHub user jackylk opened a pull request: https://github.com/apache/carbondata/pull/1514

[CARBONDATA-1746] Count star optimization

Since carbon records the number of rows in its metadata, count star queries can leverage it to improve performance.

- [X] Any interfaces changed? No
- [X] Any backward compatibility impacted? No
- [X] Document update required? No
- [X] Testing done? No testcase added
- [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. MR38

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata count_star

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1514.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1514

commit 09ff30688fe84c199fb86b92b9547c859e05a75c Author: Jacky Li Date: 2017-11-16T09:27:21Z add s3 in filefactory
commit 8d31b0974071bb7cd8ad72aa58990b9f2621b8a1 Author: Jacky Li Date: 2017-11-16T11:41:19Z remove unnecessary path
commit 5a7008e0691daff9bad3c8bf707d0592500a1f24 Author: Jacky Li Date: 2017-11-16T12:56:55Z clean CarbonEnv
commit aeee0e5f7df61f1add1dca30be0722aab0a8d2dd Author: Jacky Li Date: 2017-11-16T13:23:22Z remove AKSK in log
commit fc3cf73e7d13212faa9ca4502d09c637c57ff970 Author: Jacky Li Date: 2017-11-16T13:57:54Z change default metastore path
commit 7a4c77526b97b6cc5e9bd286dd3701f4d1ba86c5 Author: Jacky Li Date: 2017-11-16T14:57:07Z fix testcase
commit 90b1841200c1086a2567e54787750b970208ed13 Author: Jacky Li Date: 2017-11-16T14:53:02Z add count star optimization ---
[jira] [Created] (CARBONDATA-1746) Count Star optimization
Jacky Li created CARBONDATA-1746: Summary: Count Star optimization Key: CARBONDATA-1746 URL: https://issues.apache.org/jira/browse/CARBONDATA-1746 Project: CarbonData Issue Type: New Feature Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 Since carbon records the number of rows in its metadata, count star queries can leverage it to improve performance -- This message was sent by Atlassian JIRA (v6.4.14#64029)
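The idea behind this optimization can be sketched in a few lines: answer `select count(*)` by summing the per-segment row counts already stored in table metadata, rather than scanning data files. Below is an illustrative Python sketch with hypothetical metadata structures; it is not the actual implementation in the pull request:

```python
# Illustrative sketch (hypothetical structures, not the actual PR code):
# answer `select count(*)` from per-segment row counts kept in table
# metadata instead of scanning data files.

def count_star_from_metadata(segments):
    """segments: list of dicts with 'status' and 'row_count' per load segment."""
    # Only successfully loaded segments contribute rows to the table.
    return sum(s["row_count"] for s in segments if s["status"] == "SUCCESS")

segments = [
    {"status": "SUCCESS", "row_count": 1_000_000},
    {"status": "SUCCESS", "row_count": 750_000},
    {"status": "MARKED_FOR_DELETE", "row_count": 500_000},
]
print(count_star_from_metadata(segments))  # 1750000
```

The design choice is the usual metadata-vs-scan trade-off: the counts are maintained at load time, so a count-star query becomes a metadata lookup instead of a full table scan.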