[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...

2017-11-17 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1516#discussion_r151622204
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
@@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
   * @throws IOException
   */
  public static void truncateFile(String path, FileType fileType, long newSize) throws IOException {
-    path = path.replace("\\", "/");
-    FileChannel fileChannel = null;
-    switch (fileType) {
-      case LOCAL:
-        path = getUpdatedFilePath(path, fileType);
-        fileChannel = new FileOutputStream(path, true).getChannel();
-        try {
-          fileChannel.truncate(newSize);
-        } finally {
-          if (fileChannel != null) {
-            fileChannel.close();
-          }
-        }
-        return;
-      case HDFS:
-      case ALLUXIO:
-      case VIEWFS:
-      case S3:
-        Path pt = new Path(path);
-        FileSystem fs = pt.getFileSystem(configuration);
-        fs.truncate(pt, newSize);
--- End diff --

I think it is better to use Java reflection for line 485 only; there is no need to modify the previous file.
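For illustration, a minimal sketch of the reflection approach being suggested here, in plain Java against the Hadoop API (nothing below is CarbonData code; FileSystem.truncate(Path, long) only exists from Hadoop 2.7 on, so it is resolved at runtime and Hadoop < 2.7 surfaces as NoSuchMethodException):

    import java.io.IOException;
    import java.lang.reflect.Method;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public final class ReflectiveTruncate {

      private ReflectiveTruncate() { }

      /**
       * Invokes FileSystem.truncate(Path, long) reflectively.
       * Throws NoSuchMethodException on Hadoop < 2.7, so the caller can fall
       * back to a copy-based truncate, as this PR does.
       */
      public static boolean truncate(Path path, long newSize, Configuration conf)
          throws IOException, ReflectiveOperationException {
        FileSystem fs = path.getFileSystem(conf);
        Method truncateMethod = fs.getClass().getMethod("truncate", Path.class, long.class);
        // HDFS returns true if the file is already at the target length when the call returns
        return (boolean) truncateMethod.invoke(fs, path, newSize);
      }
    }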


---


[GitHub] carbondata issue #1506: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/1506
  
Please check PR#1520.


---


[GitHub] carbondata issue #1511: [CARBONDATA-1741] Remove AKSK in log when saving to ...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1511
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1216/



---


[GitHub] carbondata issue #1511: [CARBONDATA-1741] Remove AKSK in log when saving to ...

2017-11-17 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/carbondata/pull/1511
  
LGTM


---


[GitHub] carbondata pull request #1511: [CARBONDATA-1741] Remove AKSK in log when sav...

2017-11-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1511


---


[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...

2017-11-17 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1516#discussion_r151626078
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java ---
@@ -154,52 +155,68 @@ public boolean delete() {
   * This method will delete the data in file data from a given offset
   */
  @Override public boolean truncate(String fileName, long validDataEndOffset) {
-    DataOutputStream dataOutputStream = null;
-    DataInputStream dataInputStream = null;
     boolean fileTruncatedSuccessfully = false;
-    // if bytes to read less than 1024 then buffer size should be equal to the given offset
-    int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
-        CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
-        (int) validDataEndOffset;
-    // temporary file name
-    String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
-    FileFactory.FileType fileType = FileFactory.getFileType(fileName);
     try {
-      CarbonFile tempFile;
-      // delete temporary file if it already exists at a given path
-      if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
-        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-        tempFile.delete();
-      }
-      // create new temporary file
-      FileFactory.createNewFile(tempWriteFilePath, fileType);
-      tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-      byte[] buff = new byte[bufferSize];
-      dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
-      // read the data
-      int read = dataInputStream.read(buff, 0, buff.length);
-      dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
-      dataOutputStream.write(buff, 0, read);
-      long remaining = validDataEndOffset - read;
-      // anytime we should not cross the offset to be read
-      while (remaining > 0) {
-        if (remaining > bufferSize) {
-          buff = new byte[bufferSize];
-        } else {
-          buff = new byte[(int) remaining];
+      // if hadoop version >= 2.7, it can call method 'truncate' to truncate file,
+      // this method was new in hadoop 2.7
+      FileSystem fs = fileStatus.getPath().getFileSystem(FileFactory.getConfiguration());
+      Method truncateMethod = fs.getClass().getDeclaredMethod("truncate",
+          new Class[]{Path.class, long.class});
+      fileTruncatedSuccessfully = (boolean) truncateMethod.invoke(fs,
+          new Object[]{fileStatus.getPath(), validDataEndOffset});
+    } catch (NoSuchMethodException e) {
+      LOGGER.error("there is no 'truncate' method in FileSystem, the version of hadoop is"
+          + " below 2.7, It needs to implement truncate file by other way.");
+      DataOutputStream dataOutputStream = null;
+      DataInputStream dataInputStream = null;
+      // if bytes to read less than 1024 then buffer size should be equal to the given offset
+      int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
+          CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
+          (int) validDataEndOffset;
+      // temporary file name
+      String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
+      FileFactory.FileType fileType = FileFactory.getFileType(fileName);
+      try {
+        CarbonFile tempFile;
+        // delete temporary file if it already exists at a given path
+        if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
+          tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
+          tempFile.delete();
        }
-        read = dataInputStream.read(buff, 0, buff.length);
+        // create new temporary file
+        FileFactory.createNewFile(tempWriteFilePath, fileType);
+        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
+        byte[] buff = new byte[bufferSize];
+        dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
+        // read the data
+        int read = dataInputStream.read(buff, 0, buff.length);
+        dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
        dataOutputStream.write(buff, 0, read);
-        remaining = remaining - read;
+        long remaining = validDataEndOffset - read;
+        // anytime we should not cross the offset to be read
+        while (remaining > 0) {
+          if (remaining > bufferSize) {
+            buff = new byte[bufferSize];
+          } else {
+            buff = new byte[(int) remaining];


---

[GitHub] carbondata pull request #1512: [CARBONDATA-1742] Fix NullPointerException in...

2017-11-17 Thread xubo245
Github user xubo245 closed the pull request at:

https://github.com/apache/carbondata/pull/1512


---


[GitHub] carbondata issue #1512: [CARBONDATA-1742] Fix NullPointerException in Segmen...

2017-11-17 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/1512
  
@akashrn5 ok


---


[GitHub] carbondata pull request #1513: [CARBONDATA-1745] Use default metastore path ...

2017-11-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1513


---


[GitHub] carbondata pull request #1506: [CARBONDATA-1734] Ignore empty line while rea...

2017-11-17 Thread akashrn5
Github user akashrn5 closed the pull request at:

https://github.com/apache/carbondata/pull/1506


---


[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...

2017-11-17 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1516#discussion_r151628700
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
@@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
   * @throws IOException
   */
  public static void truncateFile(String path, FileType fileType, long newSize) throws IOException {
-    path = path.replace("\\", "/");
--- End diff --

Why was this code removed?


---


[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1217/



---


[GitHub] carbondata issue #1514: [CARBONDATA-1746] Count star optimization

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1514
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1218/



---


[jira] [Created] (CARBONDATA-1754) Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently

2017-11-17 Thread Ajeet Rai (JIRA)
Ajeet Rai created CARBONDATA-1754:
-

 Summary: Carbon1.3.0 Concurrent Load-Compaction: Compaction job 
fails at run time if insert overwrite job is running concurrently
 Key: CARBONDATA-1754
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1754
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.3.0
 Environment: 3 Node ant cluster
Reporter: Ajeet Rai


Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if an insert overwrite job is running concurrently.

Steps: 
1: Create a table
2: Start three loads one by one
3: After the loads are completed, start an insert overwrite and a minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds, but the compaction then fails with the exception below:
| ERROR | [pool-23-thread-49] | Error running hive query:  | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999

6: Ideally the compaction job should fail with a message that an insert overwrite is in progress.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1754) Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently

2017-11-17 Thread Ajeet Rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajeet Rai updated CARBONDATA-1754:
--
Description: 
Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if an insert overwrite job is running concurrently.

Steps: 
1: Create a table
2: Start three loads one by one
3: After the loads are completed, start an insert overwrite and a minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds, but the compaction then fails with the exception below:
| ERROR | [pool-23-thread-49] | Error running hive query:  | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999

6: Ideally the compaction job should fail at the start with a message that an insert overwrite is in progress.

  was:
Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if an insert overwrite job is running concurrently.

Steps: 
1: Create a table
2: Start three loads one by one
3: After the loads are completed, start an insert overwrite and a minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds, but the compaction then fails with the exception below:
| ERROR | [pool-23-thread-49] | Error running hive query:  | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999

6: Ideally the compaction job should fail with a message that an insert overwrite is in progress.


> Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if 
> insert overwrite job is running concurrently
> 
>
> Key: CARBONDATA-1754
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1754
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: 3 Node ant cluster
>Reporter: Ajeet Rai
>  Labels: dfx
>
> Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if 
> an insert overwrite job is running concurrently.
> Steps: 
> 1: Create a table
> 2: Start three loads one by one
> 3: After the loads are completed, start an insert overwrite and a minor compaction 
> concurrently from two different sessions
> 4: Observe that both jobs are running
> 5: Observe that the insert overwrite job succeeds, but the compaction 
> then fails with the exception below:
> | ERROR | [pool-23-thread-49] | Error running hive query:  | 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
> org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: 
> Compaction failed. Please check logs for more info. Exception in compaction 
> java.lang.Exception: Compaction failed to update metadata for table 
> ajeet.flow_carbon_new999
> 6: Ideally the compaction job should fail at the start with a message that an 
> insert overwrite is in progress.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1754) Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at run time if insert overwrite job is running concurrentlyInsert overwrite

2017-11-17 Thread Ajeet Rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajeet Rai updated CARBONDATA-1754:
--
Summary: Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job 
fails at run time if insert overwrite job is running concurrentlyInsert 
overwrite  (was: Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails 
at run time if insert overwrite job is running concurrently)

> Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at 
> run time if insert overwrite job is running concurrentlyInsert overwrite
> 
>
> Key: CARBONDATA-1754
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1754
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: 3 Node ant cluster
>Reporter: Ajeet Rai
>  Labels: dfx
>
> Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if 
> an insert overwrite job is running concurrently.
> Steps: 
> 1: Create a table
> 2: Start three loads one by one
> 3: After the loads are completed, start an insert overwrite and a minor compaction 
> concurrently from two different sessions
> 4: Observe that both jobs are running
> 5: Observe that the insert overwrite job succeeds, but the compaction 
> then fails with the exception below:
> | ERROR | [pool-23-thread-49] | Error running hive query:  | 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
> org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: 
> Compaction failed. Please check logs for more info. Exception in compaction 
> java.lang.Exception: Compaction failed to update metadata for table 
> ajeet.flow_carbon_new999
> 6: Ideally the compaction job should fail at the start with a message that an 
> insert overwrite is in progress.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...

2017-11-17 Thread zzcclp
Github user zzcclp commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1516#discussion_r151632100
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
@@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
   * @throws IOException
   */
  public static void truncateFile(String path, FileType fileType, long newSize) throws IOException {
-    path = path.replace("\\", "/");
-    FileChannel fileChannel = null;
-    switch (fileType) {
-      case LOCAL:
-        path = getUpdatedFilePath(path, fileType);
-        fileChannel = new FileOutputStream(path, true).getChannel();
-        try {
-          fileChannel.truncate(newSize);
-        } finally {
-          if (fileChannel != null) {
-            fileChannel.close();
-          }
-        }
-        return;
-      case HDFS:
-      case ALLUXIO:
-      case VIEWFS:
-      case S3:
-        Path pt = new Path(path);
-        FileSystem fs = pt.getFileSystem(configuration);
-        fs.truncate(pt, newSize);
--- End diff --

According to an offline discussion with @QiangCai, just use the interface 'CarbonFile.truncate' to truncate files uniformly.
@QiangCai, what do you think?
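For reference, a hedged sketch of what the uniform call site could look like, based only on the CarbonFile.truncate(String, long) signature visible in this PR (not necessarily the final shape of the change):

    // FileFactory.truncateFile no longer needs a per-FileType switch if every
    // truncate goes through the CarbonFile abstraction.
    public static void truncateFile(String path, FileFactory.FileType fileType, long newSize)
        throws IOException {
      // each CarbonFile implementation (local, HDFS, Alluxio, ViewFS, S3) decides
      // how to truncate; AbstractDFSCarbonFile.truncate is the one shown in this PR
      FileFactory.getCarbonFile(path, fileType).truncate(path, newSize);
    }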


---


[jira] [Updated] (CARBONDATA-1754) Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at run time if insert overwrite job is running concurrentlyInsert overwrite

2017-11-17 Thread Ajeet Rai (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajeet Rai updated CARBONDATA-1754:
--
Description: 
Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at run time if an insert overwrite job is running concurrently.

Steps: 
1: Create a table
2: Start three loads one by one
3: After the loads are completed, start an insert overwrite and a minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds, but the compaction then fails with the exception below:
| ERROR | [pool-23-thread-49] | Error running hive query:  | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999

6: Ideally the compaction job should fail at the start with a message that an insert overwrite is in progress.

  was:
Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if an insert overwrite job is running concurrently.

Steps: 
1: Create a table
2: Start three loads one by one
3: After the loads are completed, start an insert overwrite and a minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds, but the compaction then fails with the exception below:
| ERROR | [pool-23-thread-49] | Error running hive query:  | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999

6: Ideally the compaction job should fail at the start with a message that an insert overwrite is in progress.


> Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at 
> run time if insert overwrite job is running concurrentlyInsert overwrite
> 
>
> Key: CARBONDATA-1754
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1754
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: 3 Node ant cluster
>Reporter: Ajeet Rai
>  Labels: dfx
>
> Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at 
> run time if an insert overwrite job is running concurrently.
> Steps: 
> 1: Create a table
> 2: Start three loads one by one
> 3: After the loads are completed, start an insert overwrite and a minor compaction 
> concurrently from two different sessions
> 4: Observe that both jobs are running
> 5: Observe that the insert overwrite job succeeds, but the compaction 
> then fails with the exception below:
> | ERROR | [pool-23-thread-49] | Error running hive query:  | 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
> org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: 
> Compaction failed. Please check logs for more info. Exception in compaction 
> java.lang.Exception: Compaction failed to update metadata for table 
> ajeet.flow_carbon_new999
> 6: Ideally the compaction job should fail at the start with a message that an 
> insert overwrite is in progress.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...

2017-11-17 Thread zzcclp
Github user zzcclp commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1516#discussion_r151632600
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
@@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
   * @throws IOException
   */
  public static void truncateFile(String path, FileType fileType, long newSize) throws IOException {
-    path = path.replace("\\", "/");
--- End diff --

We want to use the interface 'CarbonFile.truncate' to truncate files uniformly.


---


[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...

2017-11-17 Thread zzcclp
Github user zzcclp commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1516#discussion_r151633344
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java ---
@@ -154,52 +155,68 @@ public boolean delete() {
   * This method will delete the data in file data from a given offset
   */
  @Override public boolean truncate(String fileName, long validDataEndOffset) {
-    DataOutputStream dataOutputStream = null;
-    DataInputStream dataInputStream = null;
     boolean fileTruncatedSuccessfully = false;
-    // if bytes to read less than 1024 then buffer size should be equal to the given offset
-    int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
-        CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
-        (int) validDataEndOffset;
-    // temporary file name
-    String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
-    FileFactory.FileType fileType = FileFactory.getFileType(fileName);
     try {
-      CarbonFile tempFile;
-      // delete temporary file if it already exists at a given path
-      if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
-        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-        tempFile.delete();
-      }
-      // create new temporary file
-      FileFactory.createNewFile(tempWriteFilePath, fileType);
-      tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
-      byte[] buff = new byte[bufferSize];
-      dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
-      // read the data
-      int read = dataInputStream.read(buff, 0, buff.length);
-      dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
-      dataOutputStream.write(buff, 0, read);
-      long remaining = validDataEndOffset - read;
-      // anytime we should not cross the offset to be read
-      while (remaining > 0) {
-        if (remaining > bufferSize) {
-          buff = new byte[bufferSize];
-        } else {
-          buff = new byte[(int) remaining];
+      // if hadoop version >= 2.7, it can call method 'truncate' to truncate file,
+      // this method was new in hadoop 2.7
+      FileSystem fs = fileStatus.getPath().getFileSystem(FileFactory.getConfiguration());
+      Method truncateMethod = fs.getClass().getDeclaredMethod("truncate",
+          new Class[]{Path.class, long.class});
+      fileTruncatedSuccessfully = (boolean) truncateMethod.invoke(fs,
+          new Object[]{fileStatus.getPath(), validDataEndOffset});
+    } catch (NoSuchMethodException e) {
+      LOGGER.error("there is no 'truncate' method in FileSystem, the version of hadoop is"
+          + " below 2.7, It needs to implement truncate file by other way.");
+      DataOutputStream dataOutputStream = null;
+      DataInputStream dataInputStream = null;
+      // if bytes to read less than 1024 then buffer size should be equal to the given offset
+      int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
+          CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
+          (int) validDataEndOffset;
+      // temporary file name
+      String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
+      FileFactory.FileType fileType = FileFactory.getFileType(fileName);
+      try {
+        CarbonFile tempFile;
+        // delete temporary file if it already exists at a given path
+        if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
+          tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
+          tempFile.delete();
        }
-        read = dataInputStream.read(buff, 0, buff.length);
+        // create new temporary file
+        FileFactory.createNewFile(tempWriteFilePath, fileType);
+        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
+        byte[] buff = new byte[bufferSize];
+        dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
+        // read the data
+        int read = dataInputStream.read(buff, 0, buff.length);
+        dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
        dataOutputStream.write(buff, 0, read);
-        remaining = remaining - read;
+        long remaining = validDataEndOffset - read;
+        // anytime we should not cross the offset to be read
+        while (remaining > 0) {
+          if (remaining > bufferSize) {
+            buff = new byte[bufferSize];
+          } else {
+            buff = new byte[(int) remaining];


---

[GitHub] carbondata issue #1513: [CARBONDATA-1745] Use default metastore path from Hi...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1513
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1219/



---


[jira] [Created] (CARBONDATA-1755) Carbon1.3.0 Concurrent Insert overwrite-update: User is able to run insert overwrite and update job concurrently.

2017-11-17 Thread Ajeet Rai (JIRA)
Ajeet Rai created CARBONDATA-1755:
-

 Summary: Carbon1.3.0 Concurrent Insert overwrite-update: User is 
able to run insert overwrite and update job concurrently.
 Key: CARBONDATA-1755
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1755
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.3.0
 Environment: 3 Node ant cluster
Reporter: Ajeet Rai
Priority: Minor


Carbon1.3.0 Concurrent Insert overwrite-update: the user is able to run insert overwrite and update jobs concurrently.

The updated data will be overwritten by the insert overwrite job, so there is no point in running an update job while an insert overwrite is in progress.
Steps:
1: Create a table
2: Do a data load
3: Run an insert overwrite job.
4: Run an update job while the overwrite job is still running.
5: Observe that the update job finishes and that the overwrite job also finishes after it.
6: All previous segments are marked for delete and the update job has no effect; it uses resources unnecessarily.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1515: [CARBONDATA-1751] Modify sys.err to AnalysisE...

2017-11-17 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1515#discussion_r151634168
  
--- Diff: integration/spark-common-test/src/test/scala/org/apache/spark/sql/execution/command/CarbonTableSchemaCommonSuite.scala ---
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.test.util.QueryTest
+import org.junit.Assert
+import org.scalatest.BeforeAndAfterAll
+
+class CarbonTableSchemaCommonSuite extends QueryTest with BeforeAndAfterAll {
+
+  test("Creating table: Duplicate dimensions found with name, it should throw AnalysisException") {
+    sql("DROP TABLE IF EXISTS carbon_table")
+    try {
+      sql(
+        s"""
+           | CREATE TABLE carbon_table(
+           | BB INT, bb char(10)
+           | )
+           | STORED BY 'carbondata'
+           """.stripMargin)
+    } catch {
+      case ex: AnalysisException => Assert.assertTrue(true)
+      case ex: Exception => Assert.assertTrue(false)
+    }
--- End diff --

If no exception is thrown, the test case should fail.
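For reference, the pattern being asked for, as a minimal JUnit sketch in Java with illustrative names (the real test is the Scala suite above):

    import org.junit.Assert;
    import org.junit.Test;

    public class DuplicateColumnSuiteSketch {

      @Test
      public void duplicateColumnShouldThrow() {
        try {
          createTableWithDuplicateColumn(); // stand-in for the sql("CREATE TABLE ...") call
          // reaching here means no exception was thrown, so the test must fail
          Assert.fail("expected an exception for duplicate column names");
        } catch (IllegalArgumentException expected) {
          // expected path: the duplicate column name was rejected
        }
      }

      // hypothetical operation under test; the real suite issues a CREATE TABLE
      private void createTableWithDuplicateColumn() {
        throw new IllegalArgumentException("Duplicate column name: bb");
      }
    }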


---


[jira] [Created] (CARBONDATA-1756) Improve Boolean data compress rate by changing RLE to SNAPPY algorithm

2017-11-17 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-1756:
---

 Summary: Improve Boolean data compress rate by changing RLE to SNAPPY algorithm
 Key: CARBONDATA-1756
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1756
 Project: CarbonData
  Issue Type: Improvement
  Components: core
Affects Versions: 1.2.0
Reporter: xubo245
Assignee: xubo245
 Fix For: 1.3.0


Improve Boolean data compress rate by changing RLE to the SNAPPY algorithm.

Boolean data compressed with the RLE algorithm has a lower compression rate than with the SNAPPY algorithm in most scenarios.
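For illustration, a self-contained Java sketch of the kind of comparison behind this claim, assuming the xerial snappy-java library on the classpath; the RLE here is a naive (run, value) byte encoding, not CarbonData's encoder:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.Random;

    import org.xerial.snappy.Snappy;

    public class BooleanCompressCompare {

      // naive RLE: a stream of (runLength, value) byte pairs, runs capped at 255
      static byte[] rle(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < input.length) {
          byte value = input[i];
          int run = 1;
          while (i + run < input.length && input[i + run] == value && run < 255) {
            run++;
          }
          out.write(run);
          out.write(value);
          i += run;
        }
        return out.toByteArray();
      }

      public static void main(String[] args) throws IOException {
        // a Boolean column stored one byte per value, with random values,
        // which is the RLE-unfriendly case
        Random random = new Random(42);
        byte[] booleans = new byte[100000];
        for (int i = 0; i < booleans.length; i++) {
          booleans[i] = (byte) (random.nextBoolean() ? 1 : 0);
        }
        System.out.println("raw bytes:    " + booleans.length);
        System.out.println("rle bytes:    " + rle(booleans).length);
        System.out.println("snappy bytes: " + Snappy.compress(booleans).length);
      }
    }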




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1523: [CARBONDATA-1756] Improve Boolean data compre...

2017-11-17 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/1523

[CARBONDATA-1756] Improve Boolean data compress rate by changing RLE to SNAPPY algorithm

Improve Boolean data compress rate by changing RLE to the SNAPPY algorithm, because Boolean data compressed with RLE has a lower compression rate than with SNAPPY in most scenarios.

We also add some test cases for testing the Boolean data compress rate.

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 No
 - [ ] Document update required?
No
 - [ ] Testing done
TestBooleanCompressSuite.scala
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
No


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata RLE2Snappy

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1523.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1523


commit ac266e07bc446e27914cdf17dca45692722e82b5
Author: xubo245 <601450...@qq.com>
Date:   2017-11-17T09:17:43Z

[CARBONDATA-1756] Improve Boolean data compress rate by changing RLE to SNAPPY algorithm




---


[GitHub] carbondata issue #1518: [CARBONDATA-1752] There are some scalastyle error sh...

2017-11-17 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1518
  
LGTM


---


[GitHub] carbondata pull request #1518: [CARBONDATA-1752] There are some scalastyle e...

2017-11-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1518


---


[jira] [Resolved] (CARBONDATA-1752) There are some scalastyle errors that should be fixed in CarbonData

2017-11-17 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-1752.
--
Resolution: Fixed

> There are some scalastyle errors that should be fixed in CarbonData
> -
>
> Key: CARBONDATA-1752
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1752
> Project: CarbonData
>  Issue Type: Bug
>  Components: file-format
>Affects Versions: 1.2.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There are some scalastyle errors that should be fixed in CarbonData, including 
> removing useless imports, optimizing method definitions, and so on



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1471#discussion_r151636955
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapMeta.java ---
@@ -19,15 +19,15 @@
 
 import java.util.List;
 
-import org.apache.carbondata.core.indexstore.schema.FilterType;
+import org.apache.carbondata.core.scan.filter.intf.ExpressionType;
 
 public class DataMapMeta {
 
   private List<String> indexedColumns;
 
-  private FilterType optimizedOperation;
+  private List<ExpressionType> optimizedOperation;
--- End diff --

Currently, the LIKE expression is converted into greater-than and less-than-or-equal-to filters, so there is no LIKE expression in the types.
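For illustration, a hedged Java sketch of this prefix-match rewrite with illustrative names (the real conversion lives in Carbon's filter processing):

    public class LikeToRange {

      // Upper bound for a prefix scan: the prefix with its last character
      // incremented (assumes the last char is below Character.MAX_VALUE).
      static String upperBound(String prefix) {
        char[] chars = prefix.toCharArray();
        chars[chars.length - 1]++;
        return new String(chars);
      }

      public static void main(String[] args) {
        // name LIKE 'abc%' becomes a sort-order range over the column:
        String prefix = "abc";
        System.out.println("name >= '" + prefix + "' AND name < '" + upperBound(prefix) + "'");
      }
    }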


---


[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1471#discussion_r151637482
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java ---
@@ -0,0 +1,110 @@
+/*
--- End diff --

Changed to an abstract class to force the user to pass the needed parameters through the constructor. The concrete method `commitFile` is also added to this class.


---


[GitHub] carbondata pull request #1508: [CARBONDATA-1738] Block direct insert/load on...

2017-11-17 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1508#discussion_r151637632
  
--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -67,6 +67,12 @@
   public static final String VALIDATE_CARBON_INPUT_SEGMENTS = "validate.carbon.input.segments.";
 
   /**
+   * Whether load/insert command is fired internally or by the user.
+   * Used to block load/insert on pre-aggregate if fired by user
+   */
+  public static final String IS_INTERNAL_LOAD_CALL = "is.internal.load.call";
--- End diff --

There seems to be no testcase for this option. Also, the option name should start with `carbon`.


---


[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1471#discussion_r151637852
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ---
@@ -574,7 +482,9 @@ private CopyThread(String fileName) {
  * @throws Exception if unable to compute a result
  */
 @Override public Void call() throws Exception {
-  copyCarbonDataFileToCarbonStorePath(fileName);
+  CarbonUtil.copyCarbonDataFileToCarbonStorePath(fileName,
--- End diff --

ok


---


[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1471#discussion_r151638182
  
--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
@@ -755,7 +758,8 @@ private CarbonInputSplit convertToCarbonInputSplit(ExtendedBlocklet blocklet)
 
     org.apache.carbondata.hadoop.CarbonInputSplit.from(blocklet.getSegmentId(),
         new FileSplit(new Path(blocklet.getPath()), 0, blocklet.getLength(),
            blocklet.getLocations()),
-        ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()));
+        ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()),
+        blocklet.getDataMapWriterPath());
--- End diff --

ok


---


[GitHub] carbondata pull request #1503: [CARBONDATA-1730] Support skip.header.line.co...

2017-11-17 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1503#discussion_r151638177
  
--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonHiveSessionState.scala ---
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.spark.sql._
+import org.apache.spark.sql.catalyst.analysis.Analyzer
+import org.apache.spark.sql.execution.SparkPlanner
+import org.apache.spark.sql.execution.datasources._
+
+
+/**
+ * A class that holds all session-specific state in a given [[SparkSession]] backed by Hive.
+ */
+private[hive] class CarbonHiveSessionState(sparkSession: SparkSession)
--- End diff --

Can you add just the necessary part to support the `skip.header.line.count` option, without copying the whole class from Spark?
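For reference, the property itself only asks readers to drop the first N lines of each file; a minimal hedged sketch in plain Java (not the Spark/Carbon integration):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.Reader;

    public final class HeaderSkippingReader {

      private HeaderSkippingReader() { }

      // Skips the first `skipHeaderLineCount` lines, mirroring what
      // skip.header.line.count=N asks of a CSV reader, then yields data lines.
      public static void readSkippingHeader(Reader source, int skipHeaderLineCount)
          throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
          for (int i = 0; i < skipHeaderLineCount; i++) {
            if (reader.readLine() == null) {
              return; // the file is shorter than the declared header
            }
          }
          String line;
          while ((line = reader.readLine()) != null) {
            System.out.println(line); // hand the line to the actual CSV parser here
          }
        }
      }
    }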


---


[GitHub] carbondata issue #1521: [WIP] [CARBONDATA-1743] fix conurrent pre-agg creati...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1521
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1220/



---


[GitHub] carbondata pull request #1508: [CARBONDATA-1738] Block direct insert/load on...

2017-11-17 Thread kunal642
Github user kunal642 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1508#discussion_r151638473
  
--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -67,6 +67,12 @@
   public static final String VALIDATE_CARBON_INPUT_SEGMENTS = "validate.carbon.input.segments.";
 
   /**
+   * Whether load/insert command is fired internally or by the user.
+   * Used to block load/insert on pre-aggregate if fired by user
+   */
+  public static final String IS_INTERNAL_LOAD_CALL = "is.internal.load.call";
--- End diff --

This option/property will not be exposed to the user. It will be set by the post-load listener to indicate whether the load was fired by the user or is an internal call.

A test case is added in TestPreAggregateLoad.
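For illustration, a hedged sketch of how such a flag can gate a load; the constant comes from this diff, while the CarbonProperties usage, the isPreAggregateTable check, and the error message are illustrative, not the PR's exact code:

    import org.apache.carbondata.core.constants.CarbonCommonConstants;
    import org.apache.carbondata.core.util.CarbonProperties;

    public final class PreAggLoadGuard {

      private PreAggLoadGuard() { }

      // Internal (listener-fired) loads set the flag first, so a user-fired
      // LOAD/INSERT on a pre-aggregate table can be rejected.
      public static void checkLoadAllowed(boolean isPreAggregateTable) {
        boolean internalCall = Boolean.parseBoolean(
            CarbonProperties.getInstance()
                .getProperty(CarbonCommonConstants.IS_INTERNAL_LOAD_CALL, "false"));
        if (isPreAggregateTable && !internalCall) {
          throw new UnsupportedOperationException(
              "Cannot insert/load data directly into a pre-aggregate table");
        }
      }
    }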


---


[GitHub] carbondata issue #1491: [CARBONDATA-1651] [Supported Boolean Type When Savin...

2017-11-17 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1491
  
retest this please


---


[jira] [Created] (CARBONDATA-1757) Carbon 1.3.0- Pre_aggregate: After creating datamap on parent table, avg is not correct.

2017-11-17 Thread Ayushi Sharma (JIRA)
Ayushi Sharma created CARBONDATA-1757:
-

 Summary: Carbon 1.3.0- Pre_aggregate: After creating datamap on 
parent table, avg is not correct.
 Key: CARBONDATA-1757
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1757
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 1.3.0
Reporter: Ayushi Sharma


Steps:
1. create table cust_2 (c_custkey int, c_name string, c_address string, 
c_nationkey bigint, c_phone string,c_acctbal decimal, c_mktsegment string, 
c_comment string) STORED BY 'org.apache.carbondata.format'; 

2. load data  inpath 'hdfs://hacluster/customer/customer3.csv' into table 
cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer3.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer4.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer5.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer6.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer7.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer8.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer9.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer10.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer11.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer12.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer13.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');
load data  inpath 'hdfs://hacluster/customer/customer14.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment');

3. SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP 
BY c_custkey, c_name;

4. set carbon.input.segments.default.cust_2=0,1;

5. SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP 
BY c_custkey, c_name;

6. CREATE DATAMAP tt1 ON TABLE cust_2 USING 
"org.apache.carbondata.datamap.AggregateDataMapHandler" AS SELECT c_custkey, 
c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP BY c_custkey, c_name;

7.  SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP 
BY c_custkey, c_name;

8. set carbon.input.segments.default.cust_2=*;

9. SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP 
BY c_custkey, c_name;

Issue:
After creating the datamap, avg is not correct.

Expected Output:
The avg should be displayed correctly.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1471#discussion_r151639114
  
--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java ---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.datamap.dev;
+
+import java.io.IOException;
+
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.datastore.page.ColumnPage;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+/**
+ * Data Map writer
+ */
+public abstract class AbstractDataMapWriter {
+
+  protected AbsoluteTableIdentifier identifier;
+
+  protected String segmentId;
+
+  protected String writeDirectoryPath;
+
+  public AbstractDataMapWriter(AbsoluteTableIdentifier identifier, String segmentId,
+      String writeDirectoryPath) {
+this.identifier = identifier;
+this.segmentId = segmentId;
+this.writeDirectoryPath = writeDirectoryPath;
+  }
+
+  /**
+   * Start of new block notification.
+   *
+   * @param blockId file name of the carbondata file
+   */
+  public abstract void onBlockStart(String blockId);
+
+  /**
+   * End of block notification
+   */
+  public abstract void onBlockEnd(String blockId);
+
+  /**
+   * Start of new blocklet notification.
+   *
+   * @param blockletId sequence number of blocklet in the block
+   */
+  public abstract void onBlockletStart(int blockletId);
+
+  /**
+   * End of blocklet notification
+   *
+   * @param blockletId sequence number of blocklet in the block
+   */
+  public abstract void onBlockletEnd(int blockletId);
+
+  /**
+   * Add the column pages row to the datamap, order of pages is same as `indexColumns` in
+   * DataMapMeta returned in DataMapFactory.
+   * Implementation should copy the content of `pages` as needed, because `pages` memory
+   * may be freed after this method returns, if using unsafe column page.
+   */
+  public abstract void onPageAdded(int blockletId, int pageId, ColumnPage[] pages);
+
+  /**
+   * This is called during closing of writer. So after this call no more data will be sent
+   * to this class.
+   */
+  public abstract void finish();
+
+  /**
+   * It copies the file from temp folder to actual folder
+   *
+   * @param dataMapFile
+   * @throws IOException
+   */
+  protected void commitFile(String dataMapFile) throws IOException {
--- End diff --

Basically, this method should be used inside the DataMapWriter to commit files once it has finished writing them. It copies files from the temp location to the store. If an error occurs here, it is thrown to the DataMapWriter implementation, and the writer implementation should handle it; otherwise the load fails because the error is thrown to the fact writer.
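For illustration, a hedged sketch of a concrete writer using commitFile the way this comment describes, against the AbstractDataMapWriter API shown above; the index file name and logging are placeholders:

    import java.io.IOException;

    import org.apache.carbondata.core.datamap.dev.AbstractDataMapWriter;
    import org.apache.carbondata.core.datastore.page.ColumnPage;
    import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;

    import org.apache.log4j.Logger;

    public class SketchDataMapWriter extends AbstractDataMapWriter {

      private static final Logger LOGGER = Logger.getLogger(SketchDataMapWriter.class);

      private final String indexFile;

      public SketchDataMapWriter(AbsoluteTableIdentifier identifier, String segmentId,
          String writeDirectoryPath) {
        super(identifier, segmentId, writeDirectoryPath);
        // index output is written under the temp writeDirectoryPath first
        this.indexFile = writeDirectoryPath + "/sketch.dmindex";
      }

      @Override public void onBlockStart(String blockId) { }

      @Override public void onBlockEnd(String blockId) { }

      @Override public void onBlockletStart(int blockletId) { }

      @Override public void onBlockletEnd(int blockletId) { }

      @Override public void onPageAdded(int blockletId, int pageId, ColumnPage[] pages) {
        // copy whatever is needed out of `pages` here; the memory may be freed
        // once this call returns
      }

      @Override public void finish() {
        try {
          // move the finished index file from the temp folder to the store path
          commitFile(indexFile);
        } catch (IOException e) {
          // handled inside the writer, as the comment above suggests, so a
          // datamap copy failure does not fail the whole data load
          LOGGER.error("Failed to commit datamap file " + indexFile, e);
        }
      }
    }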


---


[jira] [Created] (CARBONDATA-1758) (Carbon1.3.0- No Inverted Index) - Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException

2017-11-17 Thread Chetan Bhat (JIRA)
Chetan Bhat created CARBONDATA-1758:
---

 Summary: (Carbon1.3.0- No Inverted Index) - Select column with is 
null for no_inverted_index column throws 
java.lang.ArrayIndexOutOfBoundsException
 Key: CARBONDATA-1758
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1758
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 1.3.0
 Environment: 3 node cluster
Reporter: Chetan Bhat


Steps:
In Beeline, the user executes the queries in sequence.
CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table 
uniqdata_DI_int OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
Select count(CUST_ID) from uniqdata_DI_int;
Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
Select avg(CUST_ID) as average from uniqdata_DI_int;
Select floor(CUST_ID) as average from uniqdata_DI_int;
Select ceil(CUST_ID) as average from uniqdata_DI_int;
Select ceiling(CUST_ID) as average from uniqdata_DI_int;
Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
Select CUST_ID from uniqdata_DI_int where CUST_ID is null;

Issue: selecting a column with 'is null' on a no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException

0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where 
CUST_ID is null;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 
0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in stage 
79.0 (TID 123, BLR114278, executor 18): 
org.apache.spark.util.TaskCompletionListenerException: 
java.util.concurrent.ExecutionException: 
java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
at org.apache.spark.scheduler.Task.run(Task.scala:112)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Driver stacktrace: (state=,code=0)

Expected: selecting a column with 'is null' on a no_inverted_index column should succeed and display the correct result set.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1758) (Carbon1.3.0- No Inverted Index) - Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1758:

Description: 
Steps:
In Beeline, the user executes the queries in sequence.
CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table 
uniqdata_DI_int OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
Select count(CUST_ID) from uniqdata_DI_int;
Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
Select avg(CUST_ID) as average from uniqdata_DI_int;
Select floor(CUST_ID) as average from uniqdata_DI_int;
Select ceil(CUST_ID) as average from uniqdata_DI_int;
Select ceiling(CUST_ID) as average from uniqdata_DI_int;
Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
Select CUST_ID from uniqdata_DI_int where CUST_ID is null;

*Issue: selecting a column with 'is null' on a no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException*

0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where 
CUST_ID is null;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 
0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in stage 
79.0 (TID 123, BLR114278, executor 18): 
org.apache.spark.util.TaskCompletionListenerException: 
java.util.concurrent.ExecutionException: 
java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
at org.apache.spark.scheduler.Task.run(Task.scala:112)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Driver stacktrace: (state=,code=0)

Expected: selecting a column with 'is null' on a no_inverted_index column should succeed and display the correct result set.

  was:
Steps:
In Beeline, the user executes the queries in sequence.
CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table 
uniqdata_DI_int OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
Select count(CUST_ID) from uniqdata_DI_int;
Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
Select avg(CUST_ID) as average from uniqdata_DI_int;
Select floor(CUST_ID) as average from uniqdata_DI_int;
Select ceil(CUST_ID) as average from uniqdata_DI_int;
Select ceiling(CUST_ID) as average from uniqdata_DI_int;
Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
Select CUST_ID from uniqdata_DI_int where CUST_ID is null;

Issue : Select column with is null for no_inverted_index column throws 
java.lang.ArrayIndexOutOfBoundsException

0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where 
CUST_ID is null;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 
0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in stage 
79.0 (TID 123, BLR114278, executor 18): 
org.apache.spark.util.TaskCompletionListenerException: 
java.util.concurrent.ExecutionException: 
java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
at org.apache.spark.scheduler.Task.run(Task.scala:112)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Driver stacktrace: (state=,code=0)

Expected : Select column with is null for no_inverted_index column should be 
successful displaying the correct result set.

[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
retest sdv please


---


[GitHub] carbondata issue #1435: [CARBONDATA-1626]add data size and index size in tab...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1435
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1221/



---


[jira] [Created] (CARBONDATA-1759) Carbon1.3.0 Clean command is not working correctly for segments marked for delete due to insert overwrite job

2017-11-17 Thread Ajeet Rai (JIRA)
Ajeet Rai created CARBONDATA-1759:
-

 Summary: Carbon1.3.0  Clean command is not working correctly for  
segments marked for delete due to insert overwrite job
 Key: CARBONDATA-1759
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1759
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 1.3.0
 Environment: 3 Node ant cluster
Reporter: Ajeet Rai


Carbon1.3.0: Clean command is not working correctly for segments marked for 
delete due to an insert overwrite job.
1: Create a table
CREATE TABLE IF NOT EXISTS flow_carbon_new999(txn_dte String,dt String,txn_bk 
String,txn_br String,own_bk String,own_br String,opp_bk String,bus_opr_cde 
String,opt_prd_cde String,cus_no String,cus_ac String,opp_ac_nme  String,opp_ac 
String,bv_no  String,aco_ac String,ac_dte String,txn_cnt int,jrn_par 
int,mfm_jrn_no String,cbn_jrn_no String,ibs_jrn_no String,vch_no String,vch_seq 
String,srv_cde String,bus_cd_no  String,id_flg String,bv_cde String,txn_time  
String,txn_tlr String,ety_tlr String,ety_bk String,ety_br String,bus_pss_no 
String,chk_flg String,chk_tlr String,chk_jrn_no String,  bus_sys_no 
String,txn_sub_cde String,fin_bus_cde String,fin_bus_sub_cde String,chl  
String,tml_id String,sus_no String,sus_seq String,  cho_seq String,  itm_itm 
String,itm_sub String,itm_sss String,dc_flg String,amt  decimal(15,2),bal  
decimal(15,2),ccy  String,spv_flg String,vch_vld_dte String,pst_bk 
String,pst_br String,ec_flg String,aco_tlr String,gen_flg 
String,his_rec_sum_flg String,his_flg String,vch_typ String,val_dte 
String,opp_ac_flg String,cmb_flg String,ass_vch_flg String,cus_pps_flg 
String,bus_rmk_cde String,vch_bus_rmk String,tec_rmk_cde String,vch_tec_rmk 
String,gems_last_upd_d String,maps_date String,maps_job String)STORED BY 
'org.apache.carbondata.format' 
TBLPROPERTIES('DICTIONARY_INCLUDE'='txn_cnt,jrn_par,amt,bal','No_Inverted_Index'=
 'txn_dte,dt,txn_bk,txn_br,own_bk ,own_br ,opp_bk ,bus_opr_cde ,opt_prd_cde 
,cus_no ,cus_ac ,opp_ac_nme  ,opp_ac ,bv_no  ,aco_ac ,ac_dte ,txn_cnt  ,jrn_par 
 ,mfm_jrn_no ,cbn_jrn_no ,ibs_jrn_no ,vch_no ,vch_seq ,srv_cde ,bus_cd_no  
,id_flg ,bv_cde ,txn_time  ,txn_tlr ,ety_tlr ,ety_bk ,ety_br ,bus_pss_no 
,chk_flg ,chk_tlr ,chk_jrn_no , bus_sys_no ,txn_sub_cde ,fin_bus_cde 
,fin_bus_sub_cde ,chl  ,tml_id ,sus_no ,sus_seq , cho_seq , itm_itm ,itm_sub 
,itm_sss ,dc_flg ,amt,bal,ccy  ,spv_flg ,vch_vld_dte ,pst_bk ,pst_br ,ec_flg 
,aco_tlr ,gen_flg ,his_rec_sum_flg ,his_flg ,vch_typ ,val_dte ,opp_ac_flg 
,cmb_flg ,ass_vch_flg ,cus_pps_flg ,bus_rmk_cde ,vch_bus_rmk ,tec_rmk_cde 
,vch_tec_rmk ,gems_last_upd_d ,maps_date ,maps_job' );

2: Start a data load.
LOAD DATA inpath 'hdfs://hacluster/user/test/20140101_1_1.csv' into 
table flow_carbon_new999 options('DELIMITER'=',', 
'QUOTECHAR'='"','header'='false');
3: Run an insert overwrite job:
insert into table flow_carbon_new999 select * from flow_carbon_new666;
4: Run the show segments query:
show segments for table ajeet.flow_carbon_new999
5: Observe that all previous segments are marked for delete.
6: Run the clean files query:
CLEAN FILES FOR TABLE ajeet.flow_carbon_new999;
7: Again run the show segments query.
8: Observe that all previous segments marked for delete are still shown in the 
result.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1759) (Carbon1.3.0 - Clean Files) Clean command is not working correctly for segments marked for delete due to insert overwrite job

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1759:

Summary: (Carbon1.3.0 - Clean Files) Clean command is not working correctly 
for  segments marked for delete due to insert overwrite job  (was: Carbon1.3.0  
Clean command is not working correctly for  segments marked for delete due to 
insert overwrite job)

> (Carbon1.3.0 - Clean Files) Clean command is not working correctly for  
> segments marked for delete due to insert overwrite job
> --
>
> Key: CARBONDATA-1759
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1759
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: 3 Node ant cluster
>Reporter: Ajeet Rai
>  Labels: dfx
>
> Carbon1.3.0: Clean command is not working correctly for segments marked for 
> delete due to an insert overwrite job.
> 1: Create a table
> CREATE TABLE IF NOT EXISTS flow_carbon_new999(txn_dte String,dt String,txn_bk 
> String,txn_br String,own_bk String,own_br String,opp_bk String,bus_opr_cde 
> String,opt_prd_cde String,cus_no String,cus_ac String,opp_ac_nme  
> String,opp_ac String,bv_no  String,aco_ac String,ac_dte String,txn_cnt 
> int,jrn_par int,mfm_jrn_no String,cbn_jrn_no String,ibs_jrn_no String,vch_no 
> String,vch_seq String,srv_cde String,bus_cd_no  String,id_flg String,bv_cde 
> String,txn_time  String,txn_tlr String,ety_tlr String,ety_bk String,ety_br 
> String,bus_pss_no String,chk_flg String,chk_tlr String,chk_jrn_no String,  
> bus_sys_no String,txn_sub_cde String,fin_bus_cde String,fin_bus_sub_cde 
> String,chl  String,tml_id String,sus_no String,sus_seq String,  cho_seq 
> String,  itm_itm String,itm_sub String,itm_sss String,dc_flg String,amt  
> decimal(15,2),bal  decimal(15,2),ccy  String,spv_flg String,vch_vld_dte 
> String,pst_bk String,pst_br String,ec_flg String,aco_tlr String,gen_flg 
> String,his_rec_sum_flg String,his_flg String,vch_typ String,val_dte 
> String,opp_ac_flg String,cmb_flg String,ass_vch_flg String,cus_pps_flg 
> String,bus_rmk_cde String,vch_bus_rmk String,tec_rmk_cde String,vch_tec_rmk 
> String,gems_last_upd_d String,maps_date String,maps_job String)STORED BY 
> 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='txn_cnt,jrn_par,amt,bal','No_Inverted_Index'=
>  'txn_dte,dt,txn_bk,txn_br,own_bk ,own_br ,opp_bk ,bus_opr_cde ,opt_prd_cde 
> ,cus_no ,cus_ac ,opp_ac_nme  ,opp_ac ,bv_no  ,aco_ac ,ac_dte ,txn_cnt  
> ,jrn_par  ,mfm_jrn_no ,cbn_jrn_no ,ibs_jrn_no ,vch_no ,vch_seq ,srv_cde 
> ,bus_cd_no  ,id_flg ,bv_cde ,txn_time  ,txn_tlr ,ety_tlr ,ety_bk ,ety_br 
> ,bus_pss_no ,chk_flg ,chk_tlr ,chk_jrn_no , bus_sys_no ,txn_sub_cde 
> ,fin_bus_cde ,fin_bus_sub_cde ,chl  ,tml_id ,sus_no ,sus_seq , cho_seq , 
> itm_itm ,itm_sub ,itm_sss ,dc_flg ,amt,bal,ccy  ,spv_flg ,vch_vld_dte ,pst_bk 
> ,pst_br ,ec_flg ,aco_tlr ,gen_flg ,his_rec_sum_flg ,his_flg ,vch_typ ,val_dte 
> ,opp_ac_flg ,cmb_flg ,ass_vch_flg ,cus_pps_flg ,bus_rmk_cde ,vch_bus_rmk 
> ,tec_rmk_cde ,vch_tec_rmk ,gems_last_upd_d ,maps_date ,maps_job' );
> 2: Start a data load.
> LOAD DATA inpath 'hdfs://hacluster/user/test/20140101_1_1.csv' into 
> table flow_carbon_new999 options('DELIMITER'=',', 
> 'QUOTECHAR'='"','header'='false');
> 3: Run an insert overwrite job:
> insert into table flow_carbon_new999 select * from flow_carbon_new666;
> 4: Run the show segments query:
> show segments for table ajeet.flow_carbon_new999
> 5: Observe that all previous segments are marked for delete.
> 6: Run the clean files query:
> CLEAN FILES FOR TABLE ajeet.flow_carbon_new999;
> 7: Again run the show segments query.
> 8: Observe that all previous segments marked for delete are still shown in 
> the result.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1760) Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when parent table name is not correct while creating datamap.

2017-11-17 Thread Ayushi Sharma (JIRA)
Ayushi Sharma created CARBONDATA-1760:
-

 Summary: Carbon 1.3.0- Pre_aggregate: Proper Error message should 
be displayed, when parent table name is not correct while creating datamap.
 Key: CARBONDATA-1760
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1760
 Project: CarbonData
  Issue Type: Bug
  Components: sql
Affects Versions: 1.3.0
Reporter: Ayushi Sharma
Priority: Minor


Steps:
1. CREATE DATAMAP tt3 ON TABLE cust_2 USING 
"org.apache.carbondata.datamap.AggregateDataMapHandler" AS SELECT c_custkey, 
c_name, sum(c_acctbal), avg(c_acctbal), count(c_acctbal) FROM tstcust GROUP BY 
c_custkey, c_name;

Issue:
A proper error message is not displayed; the command throws an "assertion 
failed" error.

Expected:
A proper error message should be displayed if the parent table name is 
ambiguous.
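
For illustration, a hedged sketch of the kind of check that would produce a 
clearer message; every identifier below (sparkSession, parentTableName) is 
hypothetical and not taken from the CarbonData source:

// Sketch only: fail fast with a descriptive error instead of an assertion.
if (!sparkSession.catalog.tableExists(parentTableName)) {
  throw new MalformedCarbonCommandException(
    s"Parent table '$parentTableName' does not exist or is ambiguous")
}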




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1760) Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when parent table name is not correct while creating datamap.

2017-11-17 Thread Ayushi Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayushi Sharma updated CARBONDATA-1760:
--
Labels: dfx  (was: )

> Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when 
> parent table name is not correct while creating datamap.
> 
>
> Key: CARBONDATA-1760
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1760
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
>Priority: Minor
>  Labels: dfx
>
> Steps:
> 1. CREATE DATAMAP tt3 ON TABLE cust_2 USING 
> "org.apache.carbondata.datamap.AggregateDataMapHandler" AS SELECT c_custkey, 
> c_name, sum(c_acctbal), avg(c_acctbal), count(c_acctbal) FROM tstcust GROUP 
> BY c_custkey, c_name;
> Issue:
> A proper error message is not displayed; the command throws an "assertion 
> failed" error.
> Expected:
> A proper error message should be displayed if the parent table name is 
> ambiguous.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1222/



---


[GitHub] carbondata issue #1523: [CARBONDATA-1756] Improve Boolean data compress rate...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1523
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1223/



---


[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
Retest this please


---


[GitHub] carbondata issue #1515: [CARBONDATA-1751] Modify sys.err to AnalysisExceptio...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1515
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1224/



---


[jira] [Created] (CARBONDATA-1761) (Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment is marked for delete if respective id is given in delete segment by id query

2017-11-17 Thread Ajeet Rai (JIRA)
Ajeet Rai created CARBONDATA-1761:
-

 Summary: (Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment 
is marked for delete if respective id is given in delete segment by id query
 Key: CARBONDATA-1761
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1761
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.3.0
 Environment: 3 Node ant cluster
Reporter: Ajeet Rai


(Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment is marked for delete 
if respective id is given in delete segment by id query.
1: Create a table
CREATE TABLE IF NOT EXISTS flow_carbon_new999(txn_dte String,dt String,txn_bk 
String,txn_br String,own_bk String,own_br String,opp_bk String,bus_opr_cde 
String,opt_prd_cde String,cus_no String,cus_ac String,opp_ac_nme String,opp_ac 
String,bv_no String,aco_ac String,ac_dte String,txn_cnt int,jrn_par 
int,mfm_jrn_no String,cbn_jrn_no String,ibs_jrn_no String,vch_no String,vch_seq 
String,srv_cde String,bus_cd_no String,id_flg String,bv_cde String,txn_time 
String,txn_tlr String,ety_tlr String,ety_bk String,ety_br String,bus_pss_no 
String,chk_flg String,chk_tlr String,chk_jrn_no String, bus_sys_no 
String,txn_sub_cde String,fin_bus_cde String,fin_bus_sub_cde String,chl 
String,tml_id String,sus_no String,sus_seq String, cho_seq String, itm_itm 
String,itm_sub String,itm_sss String,dc_flg String,amt decimal(15,2),bal 
decimal(15,2),ccy String,spv_flg String,vch_vld_dte String,pst_bk String,pst_br 
String,ec_flg String,aco_tlr String,gen_flg String,his_rec_sum_flg 
String,his_flg String,vch_typ String,val_dte String,opp_ac_flg String,cmb_flg 
String,ass_vch_flg String,cus_pps_flg String,bus_rmk_cde String,vch_bus_rmk 
String,tec_rmk_cde String,vch_tec_rmk String,gems_last_upd_d String,maps_date 
String,maps_job String)STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES('DICTIONARY_INCLUDE'='txn_cnt,jrn_par,amt,bal','No_Inverted_Index'=
 'txn_dte,dt,txn_bk,txn_br,own_bk ,own_br ,opp_bk ,bus_opr_cde ,opt_prd_cde 
,cus_no ,cus_ac ,opp_ac_nme ,opp_ac ,bv_no ,aco_ac ,ac_dte ,txn_cnt ,jrn_par 
,mfm_jrn_no ,cbn_jrn_no ,ibs_jrn_no ,vch_no ,vch_seq ,srv_cde ,bus_cd_no 
,id_flg ,bv_cde ,txn_time ,txn_tlr ,ety_tlr ,ety_bk ,ety_br ,bus_pss_no 
,chk_flg ,chk_tlr ,chk_jrn_no , bus_sys_no ,txn_sub_cde ,fin_bus_cde 
,fin_bus_sub_cde ,chl ,tml_id ,sus_no ,sus_seq , cho_seq , itm_itm ,itm_sub 
,itm_sss ,dc_flg ,amt,bal,ccy ,spv_flg ,vch_vld_dte ,pst_bk ,pst_br ,ec_flg 
,aco_tlr ,gen_flg ,his_rec_sum_flg ,his_flg ,vch_typ ,val_dte ,opp_ac_flg 
,cmb_flg ,ass_vch_flg ,cus_pps_flg ,bus_rmk_cde ,vch_bus_rmk ,tec_rmk_cde 
,vch_tec_rmk ,gems_last_upd_d ,maps_date ,maps_job' );
2: Start a data load.
LOAD DATA inpath 'hdfs://hacluster/user/test/20140101_1_1.csv' into 
table flow_carbon_new999 options('DELIMITER'=',', 
'QUOTECHAR'='"','header'='false');
3: Run an insert into/overwrite job:
insert into table flow_carbon_new999 select * from flow_carbon_new666;
4: show segments for table flow_carbon_new999;
5: Observe that the load/insert/overwrite job is started with a new segment id.
6: Now run a delete segment by id query with this id:
DELETE FROM TABLE ajeet.flow_carbon_new999 WHERE SEGMENT.ID IN (34)
7: Again run show segments and see that this segment, which is still in 
progress, is marked for delete.
8: Observe that the insert/load job is still running, and after some time (in 
the next load/insert/overwrite job) it fails with the below error:
Error: java.lang.RuntimeException: It seems insert overwrite has been issued 
during load (state=,code=0)
This is not correct behaviour and it should be handled.
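
For illustration, a hedged sketch of the guard this report seems to ask for; 
the status strings and identifiers below are assumptions, not CarbonData's 
actual API:

// Sketch only: refuse to mark an in-progress segment for delete.
def validateSegmentForDelete(segmentId: String, loadStatus: String): Unit = {
  if (loadStatus == "Insert In Progress" || loadStatus == "Insert Overwrite In Progress") {
    throw new MalformedCarbonCommandException(
      s"Cannot delete segment $segmentId: a load on this segment is still in progress")
  }
}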



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1762) Remove existing column level dateformat and support dateformat, timestampformat in the load option

2017-11-17 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-1762:
---

 Summary: Remove existing column level dateformat and support 
dateformat, timestampformat in the load option
 Key: CARBONDATA-1762
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1762
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: Akash R Nilugal






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1763) Carbon1.3.0-Pre-AggregateTable - Recreating a failed pre-aggregate table fails due to table exists

2017-11-17 Thread Ramakrishna S (JIRA)
Ramakrishna S created CARBONDATA-1763:
-

 Summary: Carbon1.3.0-Pre-AggregateTable - Recreating a failed 
pre-aggregate table fails due to table exists
 Key: CARBONDATA-1763
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1763
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.3.0
 Environment: Test - 3 node ant cluster
Reporter: Ramakrishna S
Assignee: Kunal Kapoor
 Fix For: 1.3.0


Steps:
1. Create table and load with large data
create table if not exists lineitem4(L_SHIPDATE string,L_SHIPMODE 
string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
'org.apache.carbondata.format' TBLPROPERTIES 
('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
lineitem4 
options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');

2. Create a pre-aggregate table 
create datamap agr_lineitem4 ON TABLE lineitem4 USING 
"org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem4 
group by  L_RETURNFLAG, L_LINESTATUS;

3. Run aggregate query at the same time
 select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
lineitem4 group by l_returnflag, l_linestatus;

*+Expected:+* The aggregate query should fetch data either from the main table 
or the pre-aggregate table.
*+Actual:+* The aggregate query does not return data until the pre-aggregate 
table is created.


0: jdbc:hive2://10.18.98.48:23040> select 
l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 
group by l_returnflag, l_linestatus;
+---+---+--+---+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
+---+---+--+---+--+
+---+---+--+---+--+
No rows selected (1.74 seconds)
0: jdbc:hive2://10.18.98.48:23040> select 
l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 
group by l_returnflag, l_linestatus;
+---+---+--+---+--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
+---+---+--+---+--+
+---+---+--+---+--+
No rows selected (0.746 seconds)
0: jdbc:hive2://10.18.98.48:23040> select 
l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 
group by l_returnflag, l_linestatus;
+---+---+--++--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
+---+---+--++--+
| N | F | 2.9808092E7  | 4.471079473931997E10   |
| A | F | 1.145546488E9| 1.717580824169429E12   |
| N | O | 2.31980219E9 | 3.4789002701143467E12  |
| R | F | 1.146403932E9| 1.7190627928317903E12  |
+---+---+--++--+
4 rows selected (0.8 seconds)
0: jdbc:hive2://10.18.98.48:23040> select 
l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 
group by l_returnflag, l_linestatus;
+---+---+--++--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
+---+---+--++--+
| N | F | 2.9808092E7  | 4.471079473931997E10   |
| A | F | 1.145546488E9| 1.717580824169429E12   |
| N | O | 2.31980219E9 | 3.4789002701143467E12  |
| R | F | 1.146403932E9| 1.7190627928317903E12  |
+---+---+--++--+




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1514: [CARBONDATA-1746] Count star optimization

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1514
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1225/



---


[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

2017-11-17 Thread akashrn5
GitHub user akashrn5 opened a pull request:

https://github.com/apache/carbondata/pull/1524

[CARBONDATA-1762] Remove existing column level dateformat and support 
dateformat, timestampformat in the load option

(1) Remove column level dateformat option
(2) Support dateformat and timestampformat in load options (table level)
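
For illustration, a minimal sketch of how these new table-level options could 
be used once merged; the option names follow the PR title, and the exact 
syntax is an assumption, not confirmed in this thread:

// Hypothetical usage from a CarbonSession (option names per the PR title, not verified):
carbonSession.sql(
  """LOAD DATA INPATH 'hdfs://hacluster/data/sample.csv' INTO TABLE t1
    |OPTIONS('DATEFORMAT'='yyyy-MM-dd', 'TIMESTAMPFORMAT'='yyyy-MM-dd HH:mm:ss')
  """.stripMargin)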

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [X] Document update required?
 2 new load level properties are added. Document to be updated 
accordingly.
 - [X] Testing done
UT Added
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/akashrn5/incubator-carbondata timeformat

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1524.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1524


commit f7b253c672c4eed21466806cd4bc9990264a8a37
Author: akashrn5 
Date:   2017-11-17T11:25:33Z

[CARBONDATA-1762] Remove existing column level dateformat and support 
dateformat, timestampformat in the load option




---


[jira] [Updated] (CARBONDATA-1763) Carbon1.3.0-Pre-AggregateTable - Recreating a failed pre-aggregate table fails due to table exists

2017-11-17 Thread Ramakrishna S (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramakrishna S updated CARBONDATA-1763:
--
Description: 
Steps:
1. Create a table and load it with data.
2. Run an update query on the table - this will take the table meta lock.
3. In parallel, run the pre-aggregate table create step - this will not be 
allowed due to the table lock.
4. Rerun the pre-aggregate table create step.

*+Expected:+* Pre-aggregate table should be created.
*+Actual:+* Pre-aggregate table creation fails.

+Create, Load & Update+:
0: jdbc:hive2://10.18.98.136:23040> create table if not exists 
lineitem4(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT 
string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY 
string,L_SUPPKEY   string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE 
double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE 
string,L_COMMENT  string) STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES 
('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (0.266 seconds)
0: jdbc:hive2://10.18.98.136:23040> load data inpath 
"hdfs://hacluster/user/test/lineitem.tbl.5" into table lineitem4 
options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (6.331 seconds)
0: jdbc:hive2://10.18.98.136:23040> update lineitem4 set (l_linestatus) = 
('xx');

+Create Datamap:+
0: jdbc:hive2://10.18.98.136:23040> create datamap agr_lineitem4 ON TABLE 
lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as 
select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem4  group by l_returnflag, l_linestatus;
Error: java.lang.RuntimeException: Acquire table lock failed after retry, 
please try after some time (state=,code=0)
0: jdbc:hive2://10.18.98.136:23040> select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem4 group by l_returnflag, l_linestatus;
+---+---+--+-++--+
| l_returnflag  | l_linestatus  | sum(l_quantity)  |   avg(l_quantity)   | 
count(l_quantity)  |
+---+---+--+-++--+
| N | xx| 1.2863213E7  | 25.48745561614304   | 
504688 |
| A | xx| 6318125.0| 25.506342144783375  | 
247708 |
| R | xx| 6321939.0| 25.532459087898417  | 
247604 |
+---+---+--+-++--+
3 rows selected (1.033 seconds)
0: jdbc:hive2://10.18.98.136:23040> create datamap agr_lineitem4 ON TABLE 
lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as 
select 
l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
from lineitem4  group by l_returnflag, l_linestatus;
Error: java.lang.RuntimeException: Table [lineitem4_agr_lineitem4] already 
exists under database [test_db1] (state=,code=0)


  was:
Steps:
1. Create table and load with large data
create table if not exists lineitem4(L_SHIPDATE string,L_SHIPMODE 
string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
'org.apache.carbondata.format' TBLPROPERTIES 
('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
lineitem4 
options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');

2. Create a pre-aggregate table 
create datamap agr_lineitem4 ON TABLE lineitem4 USING 
"org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem4 
group by  L_RETURNFLAG, L_LINESTATUS;

3. Run aggregate query at the same time
 select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
lineitem4 group by l_returnflag, l_linestatus;

*+Expected:+* aggregate query should fetch data either from main table or 
pre-aggregate table.
*+Actual:+* aggregate query does not return data until the pre-aggregate table 
is created

[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1435#discussion_r151665636
  
--- Diff: 
integration/spark2/src/test/scala/org/apache/spark/sql/GetDataSizeAndIndexSizeTest.scala
 ---
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.scalatest.BeforeAndAfterAll
+
+class GetDataSizeAndIndexSizeTest extends QueryTest with BeforeAndAfterAll 
{
--- End diff --

Please add a testcase for the update scenario
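
For reference, a minimal sketch of such a testcase, assuming the QueryTest 
helpers imported above; the table name and the "Table Data Size" / 
"Table Index Size" row labels are assumptions to be adjusted to what this PR 
actually emits:

// Sketch only: verify sizes are still reported after an UPDATE.
test("get data size and index size after update") {
  sql("DROP TABLE IF EXISTS tableSize_upd")
  sql("CREATE TABLE tableSize_upd (name STRING, age INT) STORED BY 'carbondata'")
  sql("INSERT INTO TABLE tableSize_upd SELECT 'a', 1")
  sql("UPDATE tableSize_upd SET (age) = (2) WHERE name = 'a'").collect()
  val res = sql("DESCRIBE FORMATTED tableSize_upd").collect()
  // Assumed row labels; adjust to whatever the describe output uses.
  assert(res.exists(_.getString(0).trim.startsWith("Table Data Size")))
  assert(res.exists(_.getString(0).trim.startsWith("Table Index Size")))
  sql("DROP TABLE IF EXISTS tableSize_upd")
}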


---


[GitHub] carbondata issue #1491: [CARBONDATA-1651] [Supported Boolean Type When Savin...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1491
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1226/



---


[GitHub] carbondata issue #1435: [CARBONDATA-1626]add data size and index size in tab...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1435
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1227/



---


[jira] [Updated] (CARBONDATA-1726) Carbon1.3.0-Streaming - Select query from spark-shell does not execute successfully for streaming table load

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1726:

Description: 
Steps :
// prepare csv file for batch loading
cd /srv/spark2.2Bigdata/install/hadoop/datanode/bin

// generate streamSample.csv

10001,batch_1,city_1,0.1,school_1:school_11$20
10002,batch_2,city_2,0.2,school_2:school_22$30
10003,batch_3,city_3,0.3,school_3:school_33$40
10004,batch_4,city_4,0.4,school_4:school_44$50
10005,batch_5,city_5,0.5,school_5:school_55$60

// put to hdfs /tmp/streamSample.csv
./hadoop fs -put streamSample.csv /tmp

// spark-beeline
cd /srv/spark2.2Bigdata/install/spark/sparkJdbc
bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 
--driver-memory 5G --num-executors 3 --class 
org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
/srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
 "hdfs://hacluster/user/sparkhive/warehouse"

bin/beeline -u jdbc:hive2://10.18.98.34:23040

CREATE TABLE stream_table(
id INT,
name STRING,
city STRING,
salary FLOAT
)
STORED BY 'carbondata'
TBLPROPERTIES('streaming'='true', 'sort_columns'='name');

LOAD DATA LOCAL INPATH 'hdfs://hacluster/chetan/streamSample.csv' INTO TABLE 
stream_table OPTIONS('HEADER'='false');

// spark-shell 
cd /srv/spark2.2Bigdata/install/spark/sparkJdbc
bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 
--driver-memory 5G --num-executors 3 --jars 
/srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar

import java.io.{File, PrintWriter}
import java.net.ServerSocket

import org.apache.spark.sql.{CarbonEnv, SparkSession}
import org.apache.spark.sql.hive.CarbonRelation
import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}

import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}

CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
 "yyyy/MM/dd")

import org.apache.spark.sql.CarbonSession._

val carbonSession = SparkSession.
  builder().
  appName("StreamExample").
  config("spark.sql.warehouse.dir", 
"hdfs://hacluster/user/sparkhive/warehouse").
  config("javax.jdo.option.ConnectionURL", 
"jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8").
  config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver").
  config("javax.jdo.option.ConnectionPassword", "huawei").
  config("javax.jdo.option.ConnectionUserName", "sparksql").
  getOrCreateCarbonSession()
   
carbonSession.sparkContext.setLogLevel("ERROR")

carbonSession.sql("select * from stream_table").show

def writeSocket(serverSocket: ServerSocket): Thread = {
  val thread = new Thread() {
override def run(): Unit = {
  // wait for client to connection request and accept
  val clientSocket = serverSocket.accept()
  val socketWriter = new PrintWriter(clientSocket.getOutputStream())
  var index = 0
  for (_ <- 1 to 1000) {
// write 101 records per iteration
for (_ <- 0 to 100) {
  index = index + 1
  socketWriter.println(index.toString + ",name_" + index
   + ",city_" + index + "," + (index * 
1.00).toString +
   ",school_" + index + ":school_" + index + index 
+ "$" + index)
}
socketWriter.flush()
Thread.sleep(2000)
  }
  socketWriter.close()
  System.out.println("Socket closed")
}
  }
  thread.start()
  thread
}
  
def startStreaming(spark: SparkSession, tablePath: CarbonTablePath): Thread = {
  val thread = new Thread() {
override def run(): Unit = {
  var qry: StreamingQuery = null
  try {
val readSocketDF = spark.readStream
  .format("socket")
  .option("host", "10.18.98.34")
  .option("port", 7071)
  .load()

// Write data from socket stream to carbondata file
qry = readSocketDF.writeStream
  .format("carbondata")
  .trigger(ProcessingTime("5 seconds"))
  .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
  .option("tablePath", tablePath.getPath)
  .start()

qry.awaitTermination()
  } catch {
case _: InterruptedException =>
  println("Done reading and writing streaming data")
  } finally {
qry.stop()
  }
}
  }
  thread.start()
  thread
}

val streamTableName = s"stream_table"

val carbonTable = CarbonEnv.getInstance(carbonSession).carbonMetastore.
  lookupRelation(Some("default"), 
streamTableName)(carbonSession).asInstanceOf[CarbonRelation].
  tableMeta.carbonTable

val tablePath = 
CarbonStorePath.getCarbonTablePath(carbonTable.get


[jira] [Updated] (CARBONDATA-1726) Carbon1.3.0-Streaming - Null pointer exception is thrown when streaming is started in spark-shell

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1726:

Summary: Carbon1.3.0-Streaming - Null pointer exception is thrown when 
streaming is started in spark-shell  (was: Carbon1.3.0-Streaming - Select query 
from spark-shell does not execute successfully for streaming table load)

> Carbon1.3.0-Streaming - Null pointer exception is thrown when streaming is 
> started in spark-shell
> -
>
> Key: CARBONDATA-1726
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1726
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: 3 node ant cluster SUSE 11 SP4
>Reporter: Chetan Bhat
>Priority: Blocker
>  Labels: Functional
>
> Steps :
> // prepare csv file for batch loading
> cd /srv/spark2.2Bigdata/install/hadoop/datanode/bin
> // generate streamSample.csv
> 10001,batch_1,city_1,0.1,school_1:school_11$20
> 10002,batch_2,city_2,0.2,school_2:school_22$30
> 10003,batch_3,city_3,0.3,school_3:school_33$40
> 10004,batch_4,city_4,0.4,school_4:school_44$50
> 10005,batch_5,city_5,0.5,school_5:school_55$60
> // put to hdfs /tmp/streamSample.csv
> ./hadoop fs -put streamSample.csv /tmp
> // spark-beeline
> cd /srv/spark2.2Bigdata/install/spark/sparkJdbc
> bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 
> 5 --driver-memory 5G --num-executors 3 --class 
> org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
> /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
>  "hdfs://hacluster/user/sparkhive/warehouse"
> bin/beeline -u jdbc:hive2://10.18.98.34:23040
> CREATE TABLE stream_table(
> id INT,
> name STRING,
> city STRING,
> salary FLOAT
> )
> STORED BY 'carbondata'
> TBLPROPERTIES('streaming'='true', 'sort_columns'='name');
> LOAD DATA LOCAL INPATH 'hdfs://hacluster/chetan/streamSample.csv' INTO TABLE 
> stream_table OPTIONS('HEADER'='false');
> // spark-shell 
> cd /srv/spark2.2Bigdata/install/spark/sparkJdbc
> bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 
> --driver-memory 5G --num-executors 3 --jars 
> /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar
> import java.io.{File, PrintWriter}
> import java.net.ServerSocket
> import org.apache.spark.sql.{CarbonEnv, SparkSession}
> import org.apache.spark.sql.hive.CarbonRelation
> import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
> import org.apache.carbondata.core.constants.CarbonCommonConstants
> import org.apache.carbondata.core.util.CarbonProperties
> import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath}
> CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
>  "yyyy/MM/dd")
> import org.apache.spark.sql.CarbonSession._
> val carbonSession = SparkSession.
>   builder().
>   appName("StreamExample").
>   config("spark.sql.warehouse.dir", 
> "hdfs://hacluster/user/sparkhive/warehouse").
>   config("javax.jdo.option.ConnectionURL", 
> "jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8").
>   config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver").
>   config("javax.jdo.option.ConnectionPassword", "huawei").
>   config("javax.jdo.option.ConnectionUserName", "sparksql").
>   getOrCreateCarbonSession()
>
> carbonSession.sparkContext.setLogLevel("ERROR")
> carbonSession.sql("select * from stream_table").show
> def writeSocket(serverSocket: ServerSocket): Thread = {
>   val thread = new Thread() {
> override def run(): Unit = {
>   // wait for client to connection request and accept
>   val clientSocket = serverSocket.accept()
>   val socketWriter = new PrintWriter(clientSocket.getOutputStream())
>   var index = 0
>   for (_ <- 1 to 1000) {
> // write 101 records per iteration
> for (_ <- 0 to 100) {
>   index = index + 1
>   socketWriter.println(index.toString + ",name_" + index
>+ ",city_" + index + "," + (index * 
> 1.00).toString +
>",school_" + index + ":school_" + index + 
> index + "$" + index)
> }
> socketWriter.flush()
> Thread.sleep(2000)
>   }
>   socketWriter.close()
>   System.out.println("Socket closed")
> }
>   }
>   thread.start()
>   thread
> }
>   
> def startStreaming(spark: SparkSession, tablePath: CarbonTablePath): Thread = 
> {
>   val thread = new Thread() {
> override def run(): Unit = {
>   var qry: StreamingQuery = null
>   try {
> val readSocketDF = spark.re

[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1524#discussion_r151673677
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
 ---
@@ -52,6 +52,14 @@
   public static final String CARBON_OPTIONS_DATEFORMAT =
   "carbon.options.dateformat";
   public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = "";
+
+  /**
+   * option to specify the load option
+   */
+  @CarbonProperty
+  public static final String CARBON_OPTIONS_TIMESTAMPFORMAT =
+  "carbon.options.dateformat";
--- End diff --

I think `carbon.options.dateformat ` should be 
`carbon.options.timestampformat`


---


[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1524#discussion_r151674183
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala
 ---
@@ -17,35 +17,32 @@
 
 package org.apache.carbondata.spark.load
 
-import scala.collection.JavaConverters._
+import java.text.SimpleDateFormat
 
-import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
-import org.apache.carbondata.processing.loading.model.CarbonLoadModel
 import org.apache.carbondata.processing.loading.sort.SortScopeOptions
 import 
org.apache.carbondata.spark.exception.MalformedCarbonCommandException
 
 object ValidateUtil {
-  def validateDateFormat(dateFormat: String, table: CarbonTable, 
tableName: String): Unit = {
-val dimensions = table.getDimensionByTableName(tableName).asScala
+
+  /**
+   * validates both timestamp and date for illegal values
+   *
+   * @param optionValue
+   * @param optionName
+   */
+  def validateDateTimeFormat(optionValue: String, optionName: String): 
Unit = {
--- End diff --

give proper names to parameters
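
For instance, a minimal sketch with descriptive parameter names, using only 
the imports already visible in this diff (the exact message text is an 
assumption):

def validateDateTimeFormat(dateTimeFormatValue: String, dateTimeFormatOptionName: String): Unit = {
  try {
    // SimpleDateFormat's constructor rejects illegal patterns.
    new SimpleDateFormat(dateTimeFormatValue)
  } catch {
    case _: IllegalArgumentException =>
      throw new MalformedCarbonCommandException(
        s"Error: wrong value '$dateTimeFormatValue' provided for option $dateTimeFormatOptionName")
  }
}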


---


[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...

2017-11-17 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1524#discussion_r151674289
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
 ---
@@ -799,18 +799,19 @@ object CarbonDataRDDFactory {
   throw new DataLoadingException("Partition column not found.")
 }
 
-val dateFormatMap = 
CarbonDataProcessorUtil.getDateFormatMap(carbonLoadModel.getDateFormat)
-val specificFormat = 
Option(dateFormatMap.get(partitionColumn.toLowerCase))
-val timeStampFormat = if (specificFormat.isDefined) {
-  new SimpleDateFormat(specificFormat.get)
+val specificTimestampFormat = carbonLoadModel.getTimestampformat
+val specificDateFormat = carbonLoadModel.getDateFormat
+val timeStampFormat = if (specificTimestampFormat != null &&
+  !specificTimestampFormat.trim.isEmpty) {
--- End diff --

format it properly.
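
For instance, one possible layout of the same expression, behaviour unchanged; 
defaultTimestampFormat below is a placeholder for whatever fallback the PR 
actually uses:

val timeStampFormat =
  if (specificTimestampFormat != null && !specificTimestampFormat.trim.isEmpty) {
    new SimpleDateFormat(specificTimestampFormat)
  } else {
    new SimpleDateFormat(defaultTimestampFormat) // placeholder fallback
  }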


---


[GitHub] carbondata issue #1521: [WIP] [CARBONDATA-1743] fix conurrent pre-agg creati...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1521
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1228/



---


[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1229/



---


[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1471
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1230/



---


[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1524
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1231/



---


[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

2017-11-17 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/1524
  
retest this please


---


[jira] [Updated] (CARBONDATA-1688) Carbon 1.3.0-Partitioning: Hash Partitioning is not working for Date Column

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1688:

Summary: Carbon 1.3.0-Partitioning: Hash Partitioning is not working for 
Date Column  (was: Hash Partitioning is not working for Date Column)

> Carbon 1.3.0-Partitioning: Hash Partitioning is not working for Date Column
> ---
>
> Key: CARBONDATA-1688
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1688
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
> Attachments: Date_hash.PNG, Segment_0.PNG, Segment_1.PNG, 
> Segment_2.PNG
>
>
> On applying hash partitioning on a date column, all the data goes to the 
> default partition.
> create table if not exists date_hash(col_A String) partitioned by (col_F 
> Timestamp) stored by 'carbondata' 
> tblproperties('partition_type'='hash','num_partitions'='5');
> insert into table date_hash select 'ayushi','2016-02-02';
> insert into table date_hash select 'ayushi','2016-02-05';
> insert into table date_hash select 'ayushi','2016-01-05';
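
For context, hash partitioning is expected to route each distinct key value to 
a fixed bucket rather than to the default partition; a minimal sketch of that 
expectation (the hashCode-modulo scheme is an assumption, not confirmed from 
CarbonData's partitioner):

// Sketch only: expected bucket for a partition key under hash routing.
def expectedBucket(partitionKey: Any, numPartitions: Int): Int =
  Math.abs(partitionKey.hashCode()) % numPartitions
// e.g. every row with col_F = '2016-02-02' should land in the same
// non-default bucket out of the 5 configured partitions.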



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1687) Carbon 1.3.0-Partitioning:Hash Partitioning is not working for Date Column

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1687:

Summary: Carbon 1.3.0-Partitioning:Hash Partitioning is not working for 
Date Column  (was: Hash Partitioning is not working for Date Column)

> Carbon 1.3.0-Partitioning:Hash Partitioning is not working for Date Column
> --
>
> Key: CARBONDATA-1687
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1687
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
> Attachments: Date_hash.PNG, Segment_0.PNG, Segment_1.PNG, 
> Segment_2.PNG
>
>
> On applying hash partitioning on a date column, all the data goes to the 
> default partition.
> create table if not exists date_hash(col_A String) partitioned by (col_F 
> Timestamp) stored by 'carbondata' 
> tblproperties('partition_type'='hash','num_partitions'='5');
> insert into table date_hash select 'ayushi','2016-02-02';
> insert into table date_hash select 'ayushi','2016-02-05';
> insert into table date_hash select 'ayushi','2016-01-05';



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1679) Carbon 1.3.0-Partitioning:After Splitting the Partition,no records are displayed

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1679:

Summary: Carbon 1.3.0-Partitioning:After Splitting the Partition,no records 
are displayed  (was: After Splitting the Partition,no records are displayed)

> Carbon 1.3.0-Partitioning:After Splitting the Partition,no records are 
> displayed
> 
>
> Key: CARBONDATA-1679
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1679
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
> Attachments: Split1.PNG
>
>
> create table part_nation_4 (N_NATIONKEY BIGINT,N_REGIONKEY BIGINT,N_COMMENT 
> STRING) partitioned by (N_NAME STRING) stored by 'carbondata' 
> tblproperties('partition_type'='list','list_info'='ALGERIA,ARGENTINA,BRAZIL,CANADA,(EGYPT,ETHIOPIA,FRANCE),JAPAN');
> load data inpath '/spark-warehouse/tpchhive.db/nation/nation.tbl' into table 
> part_nation_4 
> options('DELIMITER'='|','FILEHEADER'='N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT');
> show partitions part_nation_4;
> ALTER TABLE part_nation_4 SPLIT PARTITION(5) 
> INTO('(EGYPT,ETHIOPIA)','FRANCE');
> show partitions part_nation_4;
> select * from part_nation_4 where N_NAME='FRANCE';
> select * from part_nation_4 where N_NAME='EGYPT';
>  select * from part_nation_4 where N_NAME='CANADA';



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1678) Carbon 1.3.0-Partitioning:Show partition throws index out of bounds exception

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1678:

Summary: Carbon 1.3.0-Partitioning:Show partition throws index out of 
bounds exception  (was: Show partition throws index out of bounds exception)

> Carbon 1.3.0-Partitioning:Show partition throws index out of bounds exception
> -
>
> Key: CARBONDATA-1678
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1678
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
> Attachments: Show_part.PNG, Show_part.txt
>
>
> create table part_nation_3 (N_NATIONKEY BIGINT,N_REGIONKEY BIGINT,N_COMMENT 
> STRING) partitioned by (N_NAME STRING) stored by 'carbondata' 
> tblproperties('partition_type'='list','list_info'='ALGERIA,ARGENTINA,BRAZIL,CANADA,(EGYPT,ETHIOPIA,FRANCE),JAPAN');
> ALTER TABLE part_nation_3 ADD PARTITION('SAUDI ARABIA,(VIETNAM,RUSSIA,UNITED 
> KINGDOM,UNITED STATES)');
> show partitions part_nation_3;



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1676) Carbon 1.3.0-Partitioning:No records are displayed for the newly added partition.

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1676:

Summary: Carbon 1.3.0-Partitioning:No records are displayed for the newly 
added partition.  (was: No records are displayed for the newly added partition.)

> Carbon 1.3.0-Partitioning:No records are displayed for the newly added 
> partition.
> -
>
> Key: CARBONDATA-1676
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1676
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
> Attachments: AddPart.PNG, Add_part_logs.txt
>
>
> create table part_nation (N_NATIONKEY BIGINT,N_REGIONKEY BIGINT,N_COMMENT 
> STRING) partitioned by (N_NAME STRING) stored by 'carbondata' 
> tblproperties('partition_type'='list','list_info'='ALGERIA,ARGENTINA,BRAZIL,CANADA,(EGYPT,ETHIOPIA),FRANCE,JAPAN');
> load data inpath '/spark-warehouse/tpchhive.db/nation/nation.tbl' into table 
> part_nation 
> options('DELIMITER'='|','FILEHEADER'='N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT');
> show partitions part_nation
> select * from part_nation where N_NAME='Germany';
> ALTER TABLE part_nation ADD PARTITION('GERMANY');
> select * from part_nation where N_NAME='GERMANY';



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1681) Carbon 1.3.0-Partitioning:After dropping the partition, the data is also getting dropped.

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1681:

Summary: Carbon 1.3.0-Partitioning:After dropping the partition, the data 
is also getting dropped.  (was: After dropping the partition, the data is also 
getting dropped.)

> Carbon 1.3.0-Partitioning:After dropping the partition, the data is also 
> getting dropped.
> -
>
> Key: CARBONDATA-1681
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1681
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
> Attachments: drop_part1.PNG, drop_part_2.PNG
>
>
> create table part_nation_drop (N_NATIONKEY BIGINT,N_REGIONKEY 
> BIGINT,N_COMMENT STRING) partitioned by (N_NAME STRING) stored by 
> 'carbondata' 
> tblproperties('partition_type'='list','list_info'='ALGERIA,ARGENTINA,BRAZIL,CANADA,(EGYPT,ETHIOPIA,FRANCE),JAPAN');
> show partitions part_nation_drop;
> load data inpath '/spark-warehouse/tpchhive.db/nation/nation.tbl' into table 
> part_nation_drop 
> options('DELIMITER'='|','FILEHEADER'='N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT');
> select * from part_nation_drop where N_Name='ALGERIA';
> ALTER TABLE part_nation_drop DROP PARTITION(1);
> select * from part_nation_drop where N_Name='ALGERIA';



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1691) Carbon 1.3.0-Partitioning:Document needs to be updated for Table properties (Sort_Scope) in create table

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1691:

Summary: Carbon 1.3.0-Partitioning:Document needs to be updated for Table 
properties (Sort_Scope) in create table  (was: Document needs to be updated for 
Table properties (Sort_Scope) in create table)

> Carbon 1.3.0-Partitioning:Document needs to be updated for Table properties 
> (Sort_Scope) in create table
> 
>
> Key: CARBONDATA-1691
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1691
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
>Priority: Minor
> Attachments: batch_sort.PNG
>
>
> Document needs to be updated for Table properties (Sort_Scope) in create 
> table.
> As per JIRA-1438, the sort_scope will be supported in the create statement 
> itself, but the same thing is not mentioned in the document.
> Document Site- https://carbondata.apache.org/ddl-operation-on-carbondata.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1674) Carbon 1.3.0-Partitioning:Describe Formatted Should show the type of partition as well.

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1674:

Summary: Carbon 1.3.0-Partitioning:Describe Formatted Should show the type 
of partition as well.  (was: Describe Formatted Should show the type of 
partition as well.)

> Carbon 1.3.0-Partitioning:Describe Formatted Should show the type of 
> partition as well.
> ---
>
> Key: CARBONDATA-1674
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1674
> Project: CarbonData
>  Issue Type: Improvement
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
>Priority: Minor
> Attachments: Jira_req_part1.PNG, jira_req_part2.PNG
>
>
> Describe Formatted should show type of partitions as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1680) Carbon 1.3.0-Partitioning:Show Partition for Hash Partition doesn't display the partition id

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1680:

Summary: Carbon 1.3.0-Partitioning:Show Partition for Hash Partition 
doesn't display the partition id  (was: Show Partition for Hash Partition 
doesn't display the partition id)

> Carbon 1.3.0-Partitioning:Show Partition for Hash Partition doesn't display 
> the partition id
> 
>
> Key: CARBONDATA-1680
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1680
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
>Priority: Minor
> Attachments: Show_part_1_doc.PNG, show_part_1.PNG
>
>
> CREATE TABLE IF NOT EXISTS t9(
>  id Int,
>  logdate Timestamp,
>  phonenumber Int,
>  country String,
>  area String
>  )
>  PARTITIONED BY (vin String)
>  STORED BY 'carbondata'
>  TBLPROPERTIES('PARTITION_TYPE'='HASH','NUM_PARTITIONS'='5');
> show partitions t9;



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1673) Carbon 1.3.0-Partitioning:Show Partition for Range Partition is not showing the correct details.

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1673:

Summary: Carbon 1.3.0-Partitioning:Show Partition for Range Partition is 
not showing the correct details.  (was: Show Partition for Range Partition is 
not showing the correct details.)

> Carbon 1.3.0-Partitioning:Show Partition for Range Partition is not showing 
> the correct details.
> 
>
> Key: CARBONDATA-1673
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1673
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
>Priority: Minor
> Attachments: Range_recording.htm, Range_recording.swf
>
>
> For description, please refer to the attachment.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1672) Carbon 1.3.0-Partitioning:Hash Partition is not working as specified in the document.

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1672:

Summary: Carbon 1.3.0-Partitioning:Hash Partition is not working as 
specified in the document.  (was: Hash Partition is not working as specified in 
the document.)

> Carbon 1.3.0-Partitioning:Hash Partition is not working as specified in the 
> document.
> -
>
> Key: CARBONDATA-1672
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1672
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
>Priority: Minor
> Attachments: Part2.PNG, Partition1.PNG
>
>
> create table Carb_part (P_PARTKEY BIGINT,P_NAME STRING,P_MFGR STRING,P_BRAND 
> STRING,P_TYPE STRING,P_CONTAINER STRING,P_RETAILPRICE DOUBLE,P_COMMENT 
> STRING)PARTITIONED BY (P_SIZE int) STORED BY 'CARBONDATA' 
> TBLPROPERTIES('partition_type'='HASH','partition_num'='3');
> This command displays an error as mentioned below:
> Error: org.apache.carbondata.spark.exception.MalformedCarbonCommandException: 
> Error: Invalid partition definition (state=,code=0)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1751) Modify sys.err to AnalysisException when users run related operations except IUD, compaction and alter

2017-11-17 Thread xubo245 (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xubo245 updated CARBONDATA-1751:

Description: 
Carbon prints an improper error message; for example, it prints a system 
error when users run create table with the same column name, but it should 
print the related exception information.

So we change the sys.error call to throw AnalysisException when users run 
related operations except IUD, compaction and alter.

Make the type of exception and message correct, covering the spark2 and 
spark-common modules.

  was:
Carbon prints an improper error message; for example, it prints a system 
error when users run create table with the same column name, but it should 
print the related exception information.

So we change the sys.error call to throw AnalysisException when users run 
related operations except IUD, compaction and alter.


> Modify sys.err to AnalysisException when users run related operations except 
> IUD, compaction and alter
> 
>
> Key: CARBONDATA-1751
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1751
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.2.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Carbon prints an improper error message; for example, it prints a system 
> error when users run create table with the same column name, but it should 
> print the related exception information.
> So we change the sys.error call to throw AnalysisException when users run 
> related operations except IUD, compaction and alter.
> Make the type of exception and message correct, covering the spark2 and 
> spark-common modules.
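
A minimal Scala sketch of the kind of change described (the object and method 
names here are hypothetical; the file sits in the org.apache.spark.sql package 
only so the protected[sql] AnalysisException constructor is reachable, as 
CarbonData's spark2 integration classes do):

    package org.apache.spark.sql

    object CreateTableValidation {
      def checkDuplicateColumns(columns: Seq[String]): Unit = {
        val duplicated = columns.groupBy(_.toLowerCase).collect {
          case (name, occurrences) if occurrences.size > 1 => name
        }
        if (duplicated.nonEmpty) {
          // Before: sys.error(...) surfaced a bare RuntimeException.
          // After: a typed, user-facing analysis error with a clear message.
          throw new AnalysisException(
            s"Duplicate column name: ${duplicated.mkString(", ")}")
        }
      }
    }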



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1728) Carbon1.3.0- DB creation external path : Delete data with select in where clause not successful for large data

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1728:

Summary: Carbon1.3.0- DB creation external path : Delete data with select 
in where clause not successful for large data  (was: (Carbon1.3.0- DB creation 
external path) - Delete data with select in where clause not successful for 
large data)

> Carbon1.3.0- DB creation external path : Delete data with select in where 
> clause not successful for large data
> --
>
> Key: CARBONDATA-1728
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1728
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: 3 node ant cluster
>Reporter: Chetan Bhat
>  Labels: DFX
>
> Steps :
> 0: jdbc:hive2://10.18.98.34:23040> create database test_db1 location 
> 'hdfs://hacluster/user/test1';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.032 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> use test_db1;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.01 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> create table if not exists 
> ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS 
> string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK 
> string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.174 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> load data inpath 
> "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS 
> options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (27.421 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> create table h_orders as select * from 
> orders;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (9.779 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> Delete from test_db1.orders a where exists 
> (select 1 from test_db1.h_orders b where b.o_ORDERKEY=a.O_ORDERKEY);
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (48.998 seconds)
> select count(*) from test_db1.orders;
> Actual Issue : The select count shows all records still present, which means 
> the records are not deleted.
> 0: jdbc:hive2://10.18.98.34:23040> select count(*) from test_db1.orders;
> +---+--+
> | count(1)  |
> +---+--+
> | 750   |
> +---+--+
> 1 row selected (7.967 seconds)
> This indicates that delete with a select in the where clause is not 
> successful for large data. 
> Expected : The delete with a select in the where clause should succeed for 
> large data. The select count should return 0 records, indicating that the 
> records were deleted successfully.
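
A minimal Scala verification sketch, assuming a CarbonData-enabled 
SparkSession `spark` and the tables from the steps above; once fixed, the 
count after the delete should be 0:

    import org.apache.spark.sql.SparkSession

    object DeleteVerification {
      def run(spark: SparkSession): Unit = {
        // Same delete-with-exists statement as in the report
        spark.sql(
          """DELETE FROM test_db1.orders a
            |WHERE EXISTS (SELECT 1 FROM test_db1.h_orders b
            |              WHERE b.o_orderkey = a.o_orderkey)""".stripMargin)
        val remaining =
          spark.sql("SELECT count(*) FROM test_db1.orders").head.getLong(0)
        assert(remaining == 0L,
          s"delete did not remove matching rows: $remaining left")
      }
    }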



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1747) Carbon1.3.0- DB creation external path : Owner name of compacted segment and segment after update is not correct

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1747:

Summary: Carbon1.3.0- DB creation external path : Owner name of compacted 
segment and segment after update is not correct  (was: (Carbon1.3.0- DB 
creation external path) - Owner name of compacted segment and segment after 
update is not correct)

> Carbon1.3.0- DB creation external path : Owner name of compacted segment and 
> segment after update is not correct
> 
>
> Key: CARBONDATA-1747
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1747
> Project: CarbonData
>  Issue Type: Bug
>  Components: other
>Affects Versions: 1.3.0
> Environment: 3 node ant cluster
>Reporter: Chetan Bhat
>  Labels: security
>
> Steps :
> In spark Beeline user executes the following queries
> drop database if exists test_db1 cascade;
> create database test_db1 location 'hdfs://hacluster/user/test1';
> use test_db1;
> create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY 
> string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE 
> double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 
> 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128');
> load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS 
> options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32');
> load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS 
> options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32');
> load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS 
> options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32');
> load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS 
> options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32');
> alter table ORDERS compact 'major';
> update orders set (O_ORDERKEY)=(1) where O_CUSTKEY=6259021;
> After compaction and update user checks the Owner name of compacted segment 
> and segment name after update in HDFS UI.
> Issue : In the HDFS UI, before compaction and update, the owner name of the 
> existing segment folders was "anonymous". After compaction and update, the 
> owner name of the compacted segment folder and of the segment impacted by 
> the update is displayed as "root".
> Expected : After compaction and update, the owner name of the compacted 
> segment folder and of the segment impacted by the update should be "anonymous".
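
A small Scala sketch for checking this, assuming the usual CarbonData segment 
layout <table path>/Fact/Part0 (the path below is illustrative): print the 
owner of each segment directory before and after compaction/update.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path

    object SegmentOwnerCheck {
      def run(): Unit = {
        val segmentsDir =
          new Path("hdfs://hacluster/user/test1/orders/Fact/Part0")
        val fs = segmentsDir.getFileSystem(new Configuration())
        // Each child directory is a segment (e.g. Segment_0, Segment_0.1)
        fs.listStatus(segmentsDir).foreach { status =>
          println(s"${status.getPath.getName} -> owner=${status.getOwner}")
        }
      }
    }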



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1748) Carbon1.3.0- DB creation external path : Permission of created table and database folder in carbon store not correct

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1748:

Summary: Carbon1.3.0- DB creation external path : Permission of created 
table and database folder in carbon store not correct  (was: (Carbon1.3.0- DB 
creation external path) - Permission of created table and database folder in 
carbon store not correct)

> Carbon1.3.0- DB creation external path : Permission of created table and 
> database folder in carbon store not correct
> 
>
> Key: CARBONDATA-1748
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1748
> Project: CarbonData
>  Issue Type: Bug
>  Components: other
>Affects Versions: 1.3.0
> Environment: 3 node ant cluster
>Reporter: Chetan Bhat
>  Labels: security
>
> Steps : 
> In spark Beeline user executes the following queries.
> drop database if exists test_db1 cascade;
> create database test_db1 location 'hdfs://hacluster/user/test1';
> use test_db1;
> create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY 
> string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE 
> double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 
> 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128');
> The user checks the permissions of the created database and table in the 
> carbon store using the bin/hadoop fs -getfacl command.
> Issue : The permissions of the created table and database folders in the 
> carbon store are not correct, i.e. 
> # file: /user/test1/orders
> # owner: anonymous
> # group: users
> user::rwx
> group::r-x
> other::r-x
> Expected : The correct permissions for the created table and database 
> folders in the carbon store should be 
> # file: /user/test1/orders
> # owner: anonymous
> # group: users
> user::rwx
> group::---
> other::---
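
A hedged Scala sketch of the expected behaviour (an assumed approach using 
the Hadoop permission API, not the actual patch): restrict the table folder 
to the owner only, i.e. rwx------.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.fs.permission.{FsAction, FsPermission}

    object TableFolderPermission {
      def restrict(): Unit = {
        val tableDir = new Path("hdfs://hacluster/user/test1/orders")
        val fs = tableDir.getFileSystem(new Configuration())
        // owner: rwx, group: ---, other: --- (matches the expected getfacl output)
        fs.setPermission(tableDir,
          new FsPermission(FsAction.ALL, FsAction.NONE, FsAction.NONE))
      }
    }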



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1731) Carbon1.3.0- DB creation external path: Update fails incorrectly with error for table created in external db location

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1731:

Summary: Carbon1.3.0- DB creation external path: Update fails incorrectly 
with error for table created in external db location  (was: (Carbon1.3.0- DB 
creation external path) Update fails incorrectly with error for table created 
in external db location)

> Carbon1.3.0- DB creation external path: Update fails incorrectly with error 
> for table created in external db location
> -
>
> Key: CARBONDATA-1731
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1731
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: 3 node ant cluster
>Reporter: Chetan Bhat
>  Labels: DFX
>
> Steps :
> 0: jdbc:hive2://10.18.98.34:23040> drop database if exists test_db1 cascade;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.279 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> create database test_db1 location 
> 'hdfs://hacluster/user/test1';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.04 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> use test_db1;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.011 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> create table if not exists 
> ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS 
> string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK 
> string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.15 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> load data inpath 
> "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS 
> options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (23.228 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> update test_Db1.ORDERS set (o_comment) = 
> ('yyy');
> Issue : Update fails incorrectly with error for table created in external db 
> location.
> 0: jdbc:hive2://10.18.98.34:23040> update test_Db1.ORDERS set (o_comment) = 
> ('yyy');
> *Error: java.lang.RuntimeException: Update operation failed. Multiple input 
> rows matched for same row. (state=,code=0)*
> Expected : The update should succeed for a table created in an external db 
> location.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1758) Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1758:

Summary: Carbon1.3.0- No Inverted Index : Select column with is null for 
no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException  (was: 
(Carbon1.3.0- No Inverted Index) - Select column with is null for 
no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException)

> Carbon1.3.0- No Inverted Index : Select column with is null for 
> no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
> 
>
> Key: CARBONDATA-1758
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1758
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: 3 node cluster
>Reporter: Chetan Bhat
>  Labels: Functional
>
> Steps :
> In Beeline user executes the queries in sequence.
> CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table 
> uniqdata_DI_int OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> Select count(CUST_ID) from uniqdata_DI_int;
> Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
> Select avg(CUST_ID) as average from uniqdata_DI_int;
> Select floor(CUST_ID) as average from uniqdata_DI_int;
> Select ceil(CUST_ID) as average from uniqdata_DI_int;
> Select ceiling(CUST_ID) as average from uniqdata_DI_int;
> Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> *Issue : Select column with is null for no_inverted_index column throws 
> java.lang.ArrayIndexOutOfBoundsException*
> 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where 
> CUST_ID is null;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 79.0 (TID 123, BLR114278, executor 18): 
> org.apache.spark.util.TaskCompletionListenerException: 
> java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 0
> at 
> org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
> at org.apache.spark.scheduler.Task.run(Task.scala:112)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)
> Expected : Select column with is null for no_inverted_index column should be 
> successful displaying the correct result set.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1749) Carbon1.3.0- DB creation external path : mdt file is not created in directory as per configuration in carbon.properties

2017-11-17 Thread Chetan Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-1749:

Summary: Carbon1.3.0- DB creation external path : mdt file is not created 
in directory as per configuration in carbon.properties  (was: (Carbon1.3.0- DB 
creation external path) - mdt file is not created in directory as per 
configuration in carbon.properties)

> Carbon1.3.0- DB creation external path : mdt file is not created in directory 
> as per configuration in carbon.properties
> ---
>
> Key: CARBONDATA-1749
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1749
> Project: CarbonData
>  Issue Type: Bug
>  Components: other
>Affects Versions: 1.3.0
> Environment: 3 node cluster
>Reporter: Chetan Bhat
>  Labels: Functional
>
> Steps :
> In carbon.properties the mdt file directory path is configured as 
> Carbon.update.sync.folder=hdfs://hacluster/user/test1 or /tmp/test1/
> In beeline user creates a database by specifying the carbon store path and 
> creates a carbon table in the db.
> drop database if exists test_db1 cascade;
> create database test_db1 location 'hdfs://hacluster/user/test1';
> use test_db1;
> create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY 
> string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE 
> double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 
> 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128');
> The user checks in the HDFS UI whether the mdt file is created in the 
> directory specified (hdfs://hacluster/user/test1) as per the configuration 
> in carbon.properties.
> Issue : The mdt file is not created in the directory specified 
> (hdfs://hacluster/user/test1) as per the configuration in carbon.properties. 
> Also, the folder is not created if the user configures the folder path as 
> Carbon.update.sync.folder=/tmp/test1/
> Expected : The mdt file should be created in the directory specified 
> (hdfs://hacluster/user/test1) or /tmp/test1/ as per the configuration in 
> carbon.properties. 
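
A quick Scala check sketch; the lower-case property key is assumed from 
CarbonData's usual naming, and since java.util.Properties lookups are 
case-sensitive, the capitalised "Carbon.update.sync.folder" key in the report 
is itself worth ruling out.

    import org.apache.carbondata.core.util.CarbonProperties

    object SyncFolderCheck {
      def run(): Unit = {
        // Print what the system actually resolved for the sync folder
        val syncFolder = CarbonProperties.getInstance()
          .getProperty("carbon.update.sync.folder", "<not configured>")
        println(s"carbon.update.sync.folder = $syncFolder")
      }
    }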



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1520: [CARBONDATA-1734] Ignore empty line while rea...

2017-11-17 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1520#discussion_r151698402
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
 ---
@@ -45,6 +45,9 @@
   "carbon.options.is.empty.data.bad.record";
   public static final String 
CARBON_OPTIONS_IS_EMPTY_DATA_BAD_RECORD_DEFAULT = "false";
 
+  @CarbonProperty public static final String 
CARBON_OPTIONS_SKIP_EMPTY_LINE =
--- End diff --

Please add documentation 
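
    For reference, a hedged sketch of what such documentation could show; 
    the 'SKIP_EMPTY_LINE' option name is assumed from the constant above and 
    may differ in the merged change:

        // assumes `spark` is a CarbonData-enabled SparkSession
        spark.sql(
          """LOAD DATA INPATH 'hdfs://hacluster/data/sample.csv' INTO TABLE t1
            |OPTIONS('DELIMITER'=',', 'SKIP_EMPTY_LINE'='true')""".stripMargin)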


---


[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
LGTM except one small comment


---


[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...

2017-11-17 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/1525

[CARBONDATA-1751] Make the type of exception and message correct

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 No
 - [ ] Document update required?
No
 - [ ] Testing done
 change old test cases to adapt to the changed message
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
MR55


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata msgYaDong

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1525.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1525


commit 80ea7c95b00680c43fd8d16f5aa51a30311a8930
Author: xubo245 <601450...@qq.com>
Date:   2017-11-17T14:36:46Z

[CARBONDATA-1751] Make the type of exception and message correctly




---


[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...

2017-11-17 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/1508
  
retest this please


---


[jira] [Created] (CARBONDATA-1764) Fix issue when creating a table with the short data type

2017-11-17 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-1764:
---

 Summary: Fix issue when creating a table with the short data type
 Key: CARBONDATA-1764
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1764
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 1.2.0
Reporter: xubo245
Assignee: xubo245
 Fix For: 1.3.0


Fix the issue when creating a table with the short data type.
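
A self-contained Scala sketch of the kind of converter this fix targets; 
`CarbonType` here is a hypothetical stand-in for the project's real metadata 
types, and the point is that "short"/"smallint" needs an explicit case instead 
of falling through to the unsupported-type error:

    sealed trait CarbonType
    object CarbonType {
      case object StringType extends CarbonType
      case object ShortType  extends CarbonType
      case object IntType    extends CarbonType
      case object LongType   extends CarbonType
    }

    def convertToCarbonType(name: String): CarbonType = name.toLowerCase match {
      case "string"             => CarbonType.StringType
      case "short" | "smallint" => CarbonType.ShortType // the missing mapping
      case "int" | "integer"    => CarbonType.IntType
      case "long" | "bigint"    => CarbonType.LongType
      case other =>
        throw new IllegalArgumentException(s"Unsupported data type: $other")
    }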



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1526: [CARBONDATA-1764] Fix issue of when create ta...

2017-11-17 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/1526

[CARBONDATA-1764] Fix issue of when create table with short data type

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 No
 - [ ] Document update required?
No
 - [ ] Testing done
   add DataTypeConverterUtilSuite.scala test case for 
DataTypeConverterUtil
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
   MR37


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata shortTable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1526.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1526


commit de8bf19a36a11e3ca32bb98bf4ff1453ae21c630
Author: xubo245 <601450...@qq.com>
Date:   2017-11-17T15:23:31Z

[CARBONDATA-1764] Fix issue of when create table with short data type




---


[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
retest this please


---


[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
retest this please


---


[GitHub] carbondata issue #1435: [CARBONDATA-1626]add data size and index size in tab...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1435
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1232/



---


[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...

2017-11-17 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/carbondata/pull/1520
  
retest this please


---


[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...

2017-11-17 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1524
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1233/



---

