[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1516#discussion_r151622204

    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
    @@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
        * @throws IOException
        */
       public static void truncateFile(String path, FileType fileType, long newSize)
           throws IOException {
    -    path = path.replace("\\", "/");
    -    FileChannel fileChannel = null;
    -    switch (fileType) {
    -      case LOCAL:
    -        path = getUpdatedFilePath(path, fileType);
    -        fileChannel = new FileOutputStream(path, true).getChannel();
    -        try {
    -          fileChannel.truncate(newSize);
    -        } finally {
    -          if (fileChannel != null) {
    -            fileChannel.close();
    -          }
    -        }
    -        return;
    -      case HDFS:
    -      case ALLUXIO:
    -      case VIEWFS:
    -      case S3:
    -        Path pt = new Path(path);
    -        FileSystem fs = pt.getFileSystem(configuration);
    -        fs.truncate(pt, newSize);
    --- End diff --

    I think it is better to use java reflection for line 485 only, no need to modify previous file
---
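As an aside, the removed LOCAL branch closes the channel by hand in a finally block; on Java 7+ the same local-file truncation can be written with try-with-resources. A minimal standalone sketch using only the JDK (no CarbonData types; the demo file in main is created just for illustration):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class LocalTruncate {
    // Truncate a local file to newSize bytes; the channel is closed automatically.
    // Opening the stream in append mode keeps the existing content intact.
    static void truncateFile(String path, long newSize) throws IOException {
        try (FileChannel channel = new FileOutputStream(path, true).getChannel()) {
            channel.truncate(newSize);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("truncate-demo", ".bin");
        Files.write(p, new byte[100]);
        truncateFile(p.toString(), 10);
        System.out.println(Files.size(p)); // prints 10
        Files.delete(p);
    }
}
```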
[GitHub] carbondata issue #1506: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user dhatchayani commented on the issue: https://github.com/apache/carbondata/pull/1506 Please check PR#1520. ---
[GitHub] carbondata issue #1511: [CARBONDATA-1741] Remove AKSK in log when saving to ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1511 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1216/ ---
[GitHub] carbondata issue #1511: [CARBONDATA-1741] Remove AKSK in log when saving to ...
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/1511 LGTM ---
[GitHub] carbondata pull request #1511: [CARBONDATA-1741] Remove AKSK in log when sav...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1511 ---
[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1516#discussion_r151626078

    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java ---
    @@ -154,52 +155,68 @@ public boolean delete() {
        * This method will delete the data in file data from a given offset
        */
       @Override public boolean truncate(String fileName, long validDataEndOffset) {
    -    DataOutputStream dataOutputStream = null;
    -    DataInputStream dataInputStream = null;
         boolean fileTruncatedSuccessfully = false;
    -    // if bytes to read less than 1024 then buffer size should be equal to the given offset
    -    int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
    -        CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
    -        (int) validDataEndOffset;
    -    // temporary file name
    -    String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
    -    FileFactory.FileType fileType = FileFactory.getFileType(fileName);
         try {
    -      CarbonFile tempFile;
    -      // delete temporary file if it already exists at a given path
    -      if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
    -        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
    -        tempFile.delete();
    -      }
    -      // create new temporary file
    -      FileFactory.createNewFile(tempWriteFilePath, fileType);
    -      tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
    -      byte[] buff = new byte[bufferSize];
    -      dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
    -      // read the data
    -      int read = dataInputStream.read(buff, 0, buff.length);
    -      dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
    -      dataOutputStream.write(buff, 0, read);
    -      long remaining = validDataEndOffset - read;
    -      // anytime we should not cross the offset to be read
    -      while (remaining > 0) {
    -        if (remaining > bufferSize) {
    -          buff = new byte[bufferSize];
    -        } else {
    -          buff = new byte[(int) remaining];
    +      // if hadoop version >= 2.7, it can call method 'truncate' to truncate file,
    +      // this method was new in hadoop 2.7
    +      FileSystem fs = fileStatus.getPath().getFileSystem(FileFactory.getConfiguration());
    +      Method truncateMethod = fs.getClass().getDeclaredMethod("truncate",
    +          new Class[]{Path.class, long.class});
    +      fileTruncatedSuccessfully = (boolean) truncateMethod.invoke(fs,
    +          new Object[]{fileStatus.getPath(), validDataEndOffset});
    +    } catch (NoSuchMethodException e) {
    +      LOGGER.error("there is no 'truncate' method in FileSystem, the version of hadoop is"
    +          + " below 2.7, It needs to implement truncate file by other way.");
    +      DataOutputStream dataOutputStream = null;
    +      DataInputStream dataInputStream = null;
    +      // if bytes to read less than 1024 then buffer size should be equal to the given offset
    +      int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
    +          CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
    +          (int) validDataEndOffset;
    +      // temporary file name
    +      String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
    +      FileFactory.FileType fileType = FileFactory.getFileType(fileName);
    +      try {
    +        CarbonFile tempFile;
    +        // delete temporary file if it already exists at a given path
    +        if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
    +          tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
    +          tempFile.delete();
             }
    -        read = dataInputStream.read(buff, 0, buff.length);
    +        // create new temporary file
    +        FileFactory.createNewFile(tempWriteFilePath, fileType);
    +        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
    +        byte[] buff = new byte[bufferSize];
    +        dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
    +        // read the data
    +        int read = dataInputStream.read(buff, 0, buff.length);
    +        dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
             dataOutputStream.write(buff, 0, read);
    -        remaining = remaining - read;
    +        long remaining = validDataEndOffset - read;
    +        // anytime we should not cross the offset to be read
    +        while (remaining > 0) {
    +          if (remaining > bufferSize) {
    +            buff = new byte[bufferSize];
    +          } else {
    +            buff = new byte[(int)
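The version-probing pattern in the diff above (reflectively look up `truncate(Path, long)`, which Hadoop added in 2.7, and fall back on `NoSuchMethodException`) can be sketched in isolation. This is a minimal standalone illustration, not CarbonData's code: `OldFs` and `NewFs` are hypothetical stand-ins for a pre-2.7 and a 2.7+ `FileSystem`, with `RandomAccessFile.setLength` playing the role of the real HDFS truncate:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.reflect.Method;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a pre-2.7 FileSystem: no truncate(Path, long) method.
class OldFs {
}

// Hypothetical stand-in for a 2.7+ FileSystem that does expose truncate(Path, long).
class NewFs {
    public boolean truncate(Path p, long newLen) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(p.toFile(), "rw")) {
            raf.setLength(newLen);
        }
        return true;
    }
}

public class TruncateProbe {
    /** Probe for truncate(Path, long) reflectively; return false when the method is absent,
        so the caller can switch to the copy-based fallback. */
    static boolean tryTruncate(Object fs, Path p, long newLen) throws Exception {
        try {
            Method m = fs.getClass().getDeclaredMethod("truncate", Path.class, long.class);
            return (Boolean) m.invoke(fs, p, newLen);
        } catch (NoSuchMethodException e) {
            return false; // runtime too old: no such method on this class
        }
    }

    public static void main(String[] args) throws Exception {
        Path p = Files.createTempFile("probe", ".bin");
        Files.write(p, new byte[1024]);
        System.out.println(tryTruncate(new NewFs(), p, 100)); // prints true
        System.out.println(tryTruncate(new OldFs(), p, 100)); // prints false
        System.out.println(Files.size(p));                    // prints 100
        Files.delete(p);
    }
}
```

Probing `fs.getClass()` rather than a fixed class is what lets the same jar run against both old and new Hadoop versions.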
[GitHub] carbondata pull request #1512: [CARBONDATA-1742] Fix NullPointerException in...
Github user xubo245 closed the pull request at: https://github.com/apache/carbondata/pull/1512 ---
[GitHub] carbondata issue #1512: [CARBONDATA-1742] Fix NullPointerException in Segmen...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1512 @akashrn5 ok ---
[GitHub] carbondata pull request #1513: [CARBONDATA-1745] Use default metastore path ...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1513 ---
[GitHub] carbondata pull request #1506: [CARBONDATA-1734] Ignore empty line while rea...
Github user akashrn5 closed the pull request at: https://github.com/apache/carbondata/pull/1506 ---
[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1516#discussion_r151628700

    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
    @@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
        * @throws IOException
        */
       public static void truncateFile(String path, FileType fileType, long newSize)
           throws IOException {
    -    path = path.replace("\\", "/");
    --- End diff --

    Why remove this code?
---
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1520 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1217/ ---
[GitHub] carbondata issue #1514: [CARBONDATA-1746] Count star optimization
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1514 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1218/ ---
[jira] [Created] (CARBONDATA-1754) Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently
Ajeet Rai created CARBONDATA-1754:
-------------------------------------

             Summary: Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently
                 Key: CARBONDATA-1754
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1754
             Project: CarbonData
          Issue Type: Bug
          Components: data-load
    Affects Versions: 1.3.0
         Environment: 3 Node ant cluster
            Reporter: Ajeet Rai

Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently.
Steps:
1: Create a table
2: Start three loads one by one
3: After load is completed, start insert overwrite and minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds but after that compaction fails with the below exception:
| ERROR | [pool-23-thread-49] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info.
Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999
7: Ideally the compaction job should give an error with a message that insert overwrite is in progress.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1754) Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently
[ https://issues.apache.org/jira/browse/CARBONDATA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajeet Rai updated CARBONDATA-1754:
----------------------------------
    Description: 
Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently.
Steps:
1: Create a table
2: Start three loads one by one
3: After load is completed, start insert overwrite and minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds but after that compaction fails with the below exception:
| ERROR | [pool-23-thread-49] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info.
Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999
7: Ideally the compaction job should give an error at the start with a message that insert overwrite is in progress.

  was:
Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently.
Steps:
1: Create a table
2: Start three loads one by one
3: After load is completed, start insert overwrite and minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds but after that compaction fails with the below exception:
| ERROR | [pool-23-thread-49] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info.
Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999
7: Ideally the compaction job should give an error with a message that insert overwrite is in progress.


> Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if
> insert overwrite job is running concurrently
> ---------------------------------------------------------------------------
>
>                 Key: CARBONDATA-1754
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1754
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.3.0
>         Environment: 3 Node ant cluster
>            Reporter: Ajeet Rai
>              Labels: dfx
>
> Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently.
> Steps:
> 1: Create a table
> 2: Start three loads one by one
> 3: After load is completed, start insert overwrite and minor compaction concurrently from two different sessions
> 4: Observe that both jobs are running
> 5: Observe that the insert overwrite job succeeds but after that compaction fails with the below exception:
> | ERROR | [pool-23-thread-49] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
> org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999
> 7: Ideally the compaction job should give an error at the start with a message that insert overwrite is in progress.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1754) Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at run time if insert overwrite job is running concurrentlyInsert overwrite
[ https://issues.apache.org/jira/browse/CARBONDATA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajeet Rai updated CARBONDATA-1754:
----------------------------------
    Summary: Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at run time if insert overwrite job is running concurrentlyInsert overwrite  (was: Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently)

> Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at
> run time if insert overwrite job is running concurrentlyInsert overwrite
> ---------------------------------------------------------------------------
>
>                 Key: CARBONDATA-1754
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1754
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.3.0
>         Environment: 3 Node ant cluster
>            Reporter: Ajeet Rai
>              Labels: dfx
>
> Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently.
> Steps:
> 1: Create a table
> 2: Start three loads one by one
> 3: After load is completed, start insert overwrite and minor compaction concurrently from two different sessions
> 4: Observe that both jobs are running
> 5: Observe that the insert overwrite job succeeds but after that compaction fails with the below exception:
> | ERROR | [pool-23-thread-49] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
> org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999
> 7: Ideally the compaction job should give an error at the start with a message that insert overwrite is in progress.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1516#discussion_r151632100

    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
    @@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
        * @throws IOException
        */
       public static void truncateFile(String path, FileType fileType, long newSize)
           throws IOException {
    -    path = path.replace("\\", "/");
    -    FileChannel fileChannel = null;
    -    switch (fileType) {
    -      case LOCAL:
    -        path = getUpdatedFilePath(path, fileType);
    -        fileChannel = new FileOutputStream(path, true).getChannel();
    -        try {
    -          fileChannel.truncate(newSize);
    -        } finally {
    -          if (fileChannel != null) {
    -            fileChannel.close();
    -          }
    -        }
    -        return;
    -      case HDFS:
    -      case ALLUXIO:
    -      case VIEWFS:
    -      case S3:
    -        Path pt = new Path(path);
    -        FileSystem fs = pt.getFileSystem(configuration);
    -        fs.truncate(pt, newSize);
    --- End diff --

    According to discussion with @QiangCai offline, just use the interface 'CarbonFile.truncate' to truncate file uniformly. @QiangCai what do you think about this?
---
[jira] [Updated] (CARBONDATA-1754) Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at run time if insert overwrite job is running concurrentlyInsert overwrite
[ https://issues.apache.org/jira/browse/CARBONDATA-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajeet Rai updated CARBONDATA-1754:
----------------------------------
    Description: 
Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently.
Steps:
1: Create a table
2: Start three loads one by one
3: After load is completed, start insert overwrite and minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds but after that compaction fails with the below exception:
| ERROR | [pool-23-thread-49] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info.
Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999
7: Ideally the compaction job should give an error at the start with a message that insert overwrite is in progress.

  was:
Carbon1.3.0 Concurrent Load-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently.
Steps:
1: Create a table
2: Start three loads one by one
3: After load is completed, start insert overwrite and minor compaction concurrently from two different sessions
4: Observe that both jobs are running
5: Observe that the insert overwrite job succeeds but after that compaction fails with the below exception:
| ERROR | [pool-23-thread-49] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info.
Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999
7: Ideally the compaction job should give an error at the start with a message that insert overwrite is in progress.


> Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at
> run time if insert overwrite job is running concurrentlyInsert overwrite
> ---------------------------------------------------------------------------
>
>                 Key: CARBONDATA-1754
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1754
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.3.0
>         Environment: 3 Node ant cluster
>            Reporter: Ajeet Rai
>              Labels: dfx
>
> Carbon1.3.0 Concurrent Insert overwrite-Compaction: Compaction job fails at run time if insert overwrite job is running concurrently.
> Steps:
> 1: Create a table
> 2: Start three loads one by one
> 3: After load is completed, start insert overwrite and minor compaction concurrently from two different sessions
> 4: Observe that both jobs are running
> 5: Observe that the insert overwrite job succeeds but after that compaction fails with the below exception:
> | ERROR | [pool-23-thread-49] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:167)
> org.apache.hive.service.cli.HiveSQLException: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction failed to update metadata for table ajeet.flow_carbon_new999
> 7: Ideally the compaction job should give an error at the start with a message that insert overwrite is in progress.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1516#discussion_r151632600

    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java ---
    @@ -462,39 +461,8 @@ public static DataOutputStream getDataOutputStreamUsingAppend(String path, FileT
        * @throws IOException
        */
       public static void truncateFile(String path, FileType fileType, long newSize)
           throws IOException {
    -    path = path.replace("\\", "/");
    --- End diff --

    We want to use the interface 'CarbonFile.truncate' to truncate files uniformly.
---
[GitHub] carbondata pull request #1516: [CARBONDATA-1729]Fix the compatibility issue ...
Github user zzcclp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1516#discussion_r151633344

    --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java ---
    @@ -154,52 +155,68 @@ public boolean delete() {
        * This method will delete the data in file data from a given offset
        */
       @Override public boolean truncate(String fileName, long validDataEndOffset) {
    -    DataOutputStream dataOutputStream = null;
    -    DataInputStream dataInputStream = null;
         boolean fileTruncatedSuccessfully = false;
    -    // if bytes to read less than 1024 then buffer size should be equal to the given offset
    -    int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
    -        CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
    -        (int) validDataEndOffset;
    -    // temporary file name
    -    String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
    -    FileFactory.FileType fileType = FileFactory.getFileType(fileName);
         try {
    -      CarbonFile tempFile;
    -      // delete temporary file if it already exists at a given path
    -      if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
    -        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
    -        tempFile.delete();
    -      }
    -      // create new temporary file
    -      FileFactory.createNewFile(tempWriteFilePath, fileType);
    -      tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
    -      byte[] buff = new byte[bufferSize];
    -      dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
    -      // read the data
    -      int read = dataInputStream.read(buff, 0, buff.length);
    -      dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
    -      dataOutputStream.write(buff, 0, read);
    -      long remaining = validDataEndOffset - read;
    -      // anytime we should not cross the offset to be read
    -      while (remaining > 0) {
    -        if (remaining > bufferSize) {
    -          buff = new byte[bufferSize];
    -        } else {
    -          buff = new byte[(int) remaining];
    +      // if hadoop version >= 2.7, it can call method 'truncate' to truncate file,
    +      // this method was new in hadoop 2.7
    +      FileSystem fs = fileStatus.getPath().getFileSystem(FileFactory.getConfiguration());
    +      Method truncateMethod = fs.getClass().getDeclaredMethod("truncate",
    +          new Class[]{Path.class, long.class});
    +      fileTruncatedSuccessfully = (boolean) truncateMethod.invoke(fs,
    +          new Object[]{fileStatus.getPath(), validDataEndOffset});
    +    } catch (NoSuchMethodException e) {
    +      LOGGER.error("there is no 'truncate' method in FileSystem, the version of hadoop is"
    +          + " below 2.7, It needs to implement truncate file by other way.");
    +      DataOutputStream dataOutputStream = null;
    +      DataInputStream dataInputStream = null;
    +      // if bytes to read less than 1024 then buffer size should be equal to the given offset
    +      int bufferSize = validDataEndOffset > CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR ?
    +          CarbonCommonConstants.BYTE_TO_KB_CONVERSION_FACTOR :
    +          (int) validDataEndOffset;
    +      // temporary file name
    +      String tempWriteFilePath = fileName + CarbonCommonConstants.TEMPWRITEFILEEXTENSION;
    +      FileFactory.FileType fileType = FileFactory.getFileType(fileName);
    +      try {
    +        CarbonFile tempFile;
    +        // delete temporary file if it already exists at a given path
    +        if (FileFactory.isFileExist(tempWriteFilePath, fileType)) {
    +          tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
    +          tempFile.delete();
             }
    -        read = dataInputStream.read(buff, 0, buff.length);
    +        // create new temporary file
    +        FileFactory.createNewFile(tempWriteFilePath, fileType);
    +        tempFile = FileFactory.getCarbonFile(tempWriteFilePath, fileType);
    +        byte[] buff = new byte[bufferSize];
    +        dataInputStream = FileFactory.getDataInputStream(fileName, fileType);
    +        // read the data
    +        int read = dataInputStream.read(buff, 0, buff.length);
    +        dataOutputStream = FileFactory.getDataOutputStream(tempWriteFilePath, fileType);
             dataOutputStream.write(buff, 0, read);
    -        remaining = remaining - read;
    +        long remaining = validDataEndOffset - read;
    +        // anytime we should not cross the offset to be read
    +        while (remaining > 0) {
    +          if (remaining > bufferSize) {
    +            buff = new byte[bufferSize];
    +          } else {
    +            buff = new byte[(int) remai
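The copy-based fallback quoted above (copy only the valid prefix of the file into a temporary file, then replace the original) can be sketched with plain JDK I/O. This is a minimal local-file illustration under stated assumptions, not CarbonData's implementation: the real code goes through FileFactory so it also works on HDFS-like stores, and uses its own temp-file naming:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyTruncate {
    /** Copy the first validEnd bytes of src into a temp file, then swap it over src.
        Mirrors the pre-Hadoop-2.7 fallback: small offsets use a buffer of exactly
        validEnd bytes, larger ones a 1 KB buffer. */
    static void truncateByCopy(Path src, long validEnd) throws IOException {
        Path tmp = src.resolveSibling(src.getFileName() + ".tmp");
        int bufferSize = validEnd > 1024 ? 1024 : (int) validEnd;
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(tmp)) {
            byte[] buff = new byte[bufferSize];
            long remaining = validEnd;
            // never read past the valid-data offset
            while (remaining > 0) {
                int toRead = (int) Math.min(buff.length, remaining);
                int read = in.read(buff, 0, toRead);
                if (read < 0) break; // source shorter than expected
                out.write(buff, 0, read);
                remaining -= read;
            }
        }
        // replace the original file with the truncated copy
        Files.move(tmp, src, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Unlike an in-place truncate this rewrites up to `validEnd` bytes, which is why the reflective `FileSystem.truncate` path is preferred when the runtime supports it.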
[GitHub] carbondata issue #1513: [CARBONDATA-1745] Use default metastore path from Hi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1513 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1219/ ---
[jira] [Created] (CARBONDATA-1755) Carbon1.3.0 Concurrent Insert overwrite-update: User is able to run insert overwrite and update job concurrently.
Ajeet Rai created CARBONDATA-1755:
-------------------------------------

             Summary: Carbon1.3.0 Concurrent Insert overwrite-update: User is able to run insert overwrite and update job concurrently.
                 Key: CARBONDATA-1755
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1755
             Project: CarbonData
          Issue Type: Bug
          Components: data-load
    Affects Versions: 1.3.0
         Environment: 3 Node ant cluster
            Reporter: Ajeet Rai
            Priority: Minor

Carbon1.3.0 Concurrent Insert overwrite-update: User is able to run insert overwrite and update job concurrently. Updated data will be overwritten by the insert overwrite job, so there is no point in running an update job while insert overwrite is in progress.
Steps:
1: Create a table
2: Do a data load
3: Run an insert overwrite job.
4: Run an update job while the overwrite job is still running.
5: Observe that the update job finishes and after that the overwrite job also finishes.
6: All previous segments are marked for delete and there is no impact of the update job. The update job will use resources unnecessarily.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata pull request #1515: [CARBONDATA-1751] Modify sys.err to AnalysisE...
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1515#discussion_r151634168

    --- Diff: integration/spark-common-test/src/test/scala/org/apache/spark/sql/execution/command/CarbonTableSchemaCommonSuite.scala ---
    @@ -0,0 +1,69 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.execution.command
    +
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.junit.Assert
    +import org.scalatest.BeforeAndAfterAll
    +
    +class CarbonTableSchemaCommonSuite extends QueryTest with BeforeAndAfterAll {
    +
    +  test("Creating table: Duplicate dimensions found with name, it should throw AnalysisException") {
    +    sql("DROP TABLE IF EXISTS carbon_table")
    +    try {
    +      sql(
    +        s"""
    +           | CREATE TABLE carbon_table(
    +           |   BB INT, bb char(10)
    +           | )
    +           | STORED BY 'carbondata'
    +         """.stripMargin)
    +    } catch {
    +      case ex: AnalysisException => Assert.assertTrue(true)
    +      case ex: Exception => Assert.assertTrue(false)
    +    }
    --- End diff --

    If no exception, testcase should fail
---
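The review point is that the try/catch above silently passes when the DDL throws nothing; the test must fail in that case (in ScalaTest, `intercept[AnalysisException]` does exactly this). A minimal standalone sketch of the expected-exception pattern, in Java for illustration; `AnalysisException` here is a hypothetical stand-in for Spark's class, and the lambdas stand in for the `sql(...)` call:

```java
// Hypothetical stand-in for org.apache.spark.sql.AnalysisException.
class AnalysisException extends RuntimeException {
    AnalysisException(String msg) { super(msg); }
}

public class ExpectedExceptionPattern {
    /** Run an action that must throw AnalysisException; fail loudly when it does not. */
    static AnalysisException expectAnalysisException(Runnable action) {
        try {
            action.run();
        } catch (AnalysisException ex) {
            return ex; // expected path: the exception was thrown
        }
        // reaching here means no exception: report a test failure
        throw new AssertionError("expected AnalysisException but none was thrown");
    }

    public static void main(String[] args) {
        // A throwing action passes:
        AnalysisException ex = expectAnalysisException(() -> {
            throw new AnalysisException("Duplicate column name: bb");
        });
        System.out.println(ex.getMessage());
        // A non-throwing action is now reported as a failure:
        try {
            expectAnalysisException(() -> { });
        } catch (AssertionError e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The key difference from the reviewed code is the failure statement placed after the action, so the "no exception" path cannot pass silently.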
[jira] [Created] (CARBONDATA-1756) Improve Boolean data compress rate by changing RLE to SNNAPY algorithm
xubo245 created CARBONDATA-1756:
-----------------------------------

             Summary: Improve Boolean data compress rate by changing RLE to SNNAPY algorithm
                 Key: CARBONDATA-1756
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1756
             Project: CarbonData
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.2.0
            Reporter: xubo245
            Assignee: xubo245
             Fix For: 1.3.0

Improve Boolean data compress rate by changing RLE to SNNAPY algorithm, because the Boolean data compress rate of the RLE algorithm is lower than that of the SNNAPY algorithm in most scenarios.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata pull request #1523: [CARBONDATA-1756] Improve Boolean data compre...
GitHub user xubo245 opened a pull request:

    https://github.com/apache/carbondata/pull/1523

    [CARBONDATA-1756] Improve Boolean data compress rate by changing RLE to SNNAPY algorithm

    Improve Boolean data compress rate by changing RLE to SNNAPY algorithm, because the Boolean data compress rate of the RLE algorithm is lower than that of the SNNAPY algorithm in most scenarios. We also add some test cases for testing the Boolean data compress rate.

    Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:

     - [ ] Any interfaces changed? No
     - [ ] Any backward compatibility impacted? No
     - [ ] Document update required? No
     - [ ] Testing done: TestBooleanCompressSuite.scala
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. No

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata RLE2Snappy

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1523.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1523

commit ac266e07bc446e27914cdf17dca45692722e82b5
Author: xubo245 <601450...@qq.com>
Date:   2017-11-17T09:17:43Z

    [CARBONDATA-1756] Improve Boolean data compress rate by changing RLE to SNNAPY algorithm

---
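To see why run-length encoding can compress boolean columns poorly, the scheme being replaced can be sketched as (run length, value) pairs. This is an illustrative toy encoder only, not CarbonData's actual RLE codec:

```java
import java.io.ByteArrayOutputStream;

public class BooleanRle {
    /** Encode a byte[] of 0/1 boolean values as (runLength, value) pairs,
        with run lengths capped at 255 so each fits in one byte. */
    static byte[] encode(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < data.length) {
            byte v = data[i];
            int run = 1;
            // extend the run while the value repeats
            while (i + run < data.length && data[i + run] == v && run < 255) {
                run++;
            }
            out.write(run); // run length
            out.write(v);   // the repeated value
            i += run;
        }
        return out.toByteArray();
    }
}
```

Long uniform runs compress well (1024 identical values become 5 pairs), but an alternating column degenerates to one pair per value, i.e. output twice the input size, which matches the PR's motivation that a general-purpose codec handles such data better.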
[GitHub] carbondata issue #1518: [CARBONDATA-1752] There are some scalastyle error sh...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1518 LGTM ---
[GitHub] carbondata pull request #1518: [CARBONDATA-1752] There are some scalastyle e...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1518 ---
[jira] [Resolved] (CARBONDATA-1752) There are some scalastyle error should be optimized in CarbonData
[ https://issues.apache.org/jira/browse/CARBONDATA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1752.
----------------------------------
    Resolution: Fixed

> There are some scalastyle error should be optimized in CarbonData
> -----------------------------------------------------------------
>
>                 Key: CARBONDATA-1752
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1752
>             Project: CarbonData
>          Issue Type: Bug
>          Components: file-format
>    Affects Versions: 1.2.0
>            Reporter: xubo245
>            Assignee: xubo245
>            Priority: Minor
>             Fix For: 1.3.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> There are some scalastyle errors that should be fixed in CarbonData, including removing useless imports, optimizing method definitions, and so on.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151636955 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapMeta.java --- @@ -19,15 +19,15 @@ import java.util.List; -import org.apache.carbondata.core.indexstore.schema.FilterType; +import org.apache.carbondata.core.scan.filter.intf.ExpressionType; public class DataMapMeta { private List indexedColumns; - private FilterType optimizedOperation; + private List optimizedOperation; --- End diff -- Currently, the LIKE expression is converted to greater-than and less-than-or-equal-to filters, so there is no LIKE expression in the types. ---
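The rewrite described in the reply above — a prefix LIKE turned into a pair of range filters — can be illustrated independently of the CarbonData expression tree. A minimal Python sketch of the general technique (function name is illustrative; one common variant uses `>=` and `<` rather than `>` and `<=`):

```python
def prefix_like_to_range(prefix):
    """Rewrite `col LIKE 'prefix%'` as the range `lower <= col < upper`.

    The upper bound increments the last character of the prefix, so every
    string that starts with the prefix falls inside [lower, upper).
    """
    lower = prefix
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return lower, upper

lower, upper = prefix_like_to_range("abc")
print(lower, upper)               # abc abd
print(lower <= "abcdef" < upper)  # True: starts with "abc"
print(lower <= "abd" < upper)     # False: prefix does not match
```

Because the predicate becomes plain comparisons, an index that supports range filters can serve prefix LIKE queries without a dedicated LIKE expression type.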
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151637482 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java --- @@ -0,0 +1,110 @@ +/* --- End diff -- Changed to an abstract class to force the user to pass the needed parameters through the constructor. The concrete method `commitFile` is also added to this class. ---
[GitHub] carbondata pull request #1508: [CARBONDATA-1738] Block direct insert/load on...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1508#discussion_r151637632 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java --- @@ -67,6 +67,12 @@ public static final String VALIDATE_CARBON_INPUT_SEGMENTS = "validate.carbon.input.segments."; /** + * Whether load/insert command is fired internally or by the user. + * Used to block load/insert on pre-aggregate if fired by user + */ + public static final String IS_INTERNAL_LOAD_CALL = "is.internal.load.call"; --- End diff -- Seems no testcase for this option. And the option name should start with `carbon` ---
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151637852 --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java --- @@ -574,7 +482,9 @@ private CopyThread(String fileName) { * @throws Exception if unable to compute a result */ @Override public Void call() throws Exception { - copyCarbonDataFileToCarbonStorePath(fileName); + CarbonUtil.copyCarbonDataFileToCarbonStorePath(fileName, --- End diff -- ok ---
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151638182 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java --- @@ -755,7 +758,8 @@ private CarbonInputSplit convertToCarbonInputSplit(ExtendedBlocklet blocklet) org.apache.carbondata.hadoop.CarbonInputSplit.from(blocklet.getSegmentId(), new FileSplit(new Path(blocklet.getPath()), 0, blocklet.getLength(), blocklet.getLocations()), -ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber())); +ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()), +blocklet.getDataMapWriterPath()); --- End diff -- ok ---
[GitHub] carbondata pull request #1503: [CARBONDATA-1730] Support skip.header.line.co...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1503#discussion_r151638177 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonHiveSessionState.scala --- @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import org.apache.spark.sql._ +import org.apache.spark.sql.catalyst.analysis.Analyzer +import org.apache.spark.sql.execution.SparkPlanner +import org.apache.spark.sql.execution.datasources._ + + +/** + * A class that holds all session-specific state in a given [[SparkSession]] backed by Hive. + */ +private[hive] class CarbonHiveSessionState(sparkSession: SparkSession) --- End diff -- Can you just add the necessary part to support `skip.header.line.count` option without copying the whole class from spark? ---
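The `skip.header.line.count` behavior under discussion can be summarized independently of the Spark/Hive session machinery: the reader simply drops the first N physical lines before parsing. A minimal Python sketch (the function name and default are illustrative, not any CarbonData or Hive API):

```python
import csv

def read_csv_skipping_header(text, skip_header_line_count=1):
    """Drop the first N physical lines, then parse the rest as CSV,
    mirroring the effect of Hive's skip.header.line.count property."""
    lines = text.splitlines()[skip_header_line_count:]
    return [row for row in csv.reader(lines)]

rows = read_csv_skipping_header("id,name\n1,a\n2,b\n")
print(rows)  # [['1', 'a'], ['2', 'b']]
```

The reviewer's point stands either way: only this small skipping step needs to be wired in, not a full copy of Spark's session state class.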
[GitHub] carbondata issue #1521: [WIP] [CARBONDATA-1743] fix conurrent pre-agg creati...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1521 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1220/ ---
[GitHub] carbondata pull request #1508: [CARBONDATA-1738] Block direct insert/load on...
Github user kunal642 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1508#discussion_r151638473 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java --- @@ -67,6 +67,12 @@ public static final String VALIDATE_CARBON_INPUT_SEGMENTS = "validate.carbon.input.segments."; /** + * Whether load/insert command is fired internally or by the user. + * Used to block load/insert on pre-aggregate if fired by user + */ + public static final String IS_INTERNAL_LOAD_CALL = "is.internal.load.call"; --- End diff -- This option/property will not be exposed to the user. It will be set by the post-load listener to determine whether the load was fired by the user or is an internal call. A test case is added in TestPreAggregateLoad ---
[GitHub] carbondata issue #1491: [CARBONDATA-1651] [Supported Boolean Type When Savin...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/1491 retest this please ---
[jira] [Created] (CARBONDATA-1757) Carbon 1.3.0- Pre_aggregate: After creating datamap on parent table, avg is not correct.
Ayushi Sharma created CARBONDATA-1757: - Summary: Carbon 1.3.0- Pre_aggregate: After creating datamap on parent table, avg is not correct. Key: CARBONDATA-1757 URL: https://issues.apache.org/jira/browse/CARBONDATA-1757 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 1.3.0 Reporter: Ayushi Sharma Steps: 1. create table cust_2 (c_custkey int, c_name string, c_address string, c_nationkey bigint, c_phone string,c_acctbal decimal, c_mktsegment string, c_comment string) STORED BY 'org.apache.carbondata.format'; 2. load data inpath 'hdfs://hacluster/customer/customer3.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer3.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer4.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer5.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer6.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer7.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer8.csv' into table cust_2 
options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer9.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer10.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer11.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer12.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer13.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); load data inpath 'hdfs://hacluster/customer/customer14.csv' into table cust_2 options('DELIMITER'='|','QUOTECHAR'='"','FILEHEADER'='c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment'); 3. SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP BY c_custkey, c_name; 4. set carbon.input.segments.default.cust_2=0,1; 5. SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP BY c_custkey, c_name; 6. CREATE DATAMAP tt1 ON TABLE cust_2 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" AS SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP BY c_custkey, c_name; 7. SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP BY c_custkey, c_name; 8. 
set carbon.input.segments.default.cust_2=*; 9. SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal) FROM cust_2 GROUP BY c_custkey, c_name; Issue: After creating datamap, avg is not correct Expected Output: Avg should have been displayed correctly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
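The avg defect reported above is consistent with the classic pre-aggregation pitfall: avg cannot be rolled up from stored per-group averages, because segments with different row counts would be weighted equally. An aggregate table has to carry sum and count and divide only at query time. A small Python illustration (the numbers are made up):

```python
# Two "segments" holding c_acctbal values for the same group key.
seg1 = [100.0, 200.0]   # sum 300, count 2, avg 150
seg2 = [400.0]          # sum 400, count 1, avg 400

# Wrong rollup: averaging stored per-segment averages weights each
# segment equally, regardless of how many rows it holds.
avg_of_avgs = (sum(seg1) / len(seg1) + sum(seg2) / len(seg2)) / 2

# Correct rollup: keep sum and count in the aggregate table and
# compute avg only at query time.
true_avg = (sum(seg1) + sum(seg2)) / (len(seg1) + len(seg2))

print(avg_of_avgs)  # 275.0
print(true_avg)     # 233.33... (700 / 3)
```

This is why the datamap definition includes sum(c_acctbal) alongside avg(c_acctbal): the correct answer is only recoverable if sum and count survive the rollup.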
[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1471#discussion_r151639114 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java --- @@ -0,0 +1,110 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.datamap.dev; + +import java.io.IOException; + +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.datastore.page.ColumnPage; +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.path.CarbonTablePath; + +/** + * Data Map writer + */ +public abstract class AbstractDataMapWriter { + + protected AbsoluteTableIdentifier identifier; + + protected String segmentId; + + protected String writeDirectoryPath; + + public AbstractDataMapWriter(AbsoluteTableIdentifier identifier, String segmentId, + String writeDirectoryPath) { +this.identifier = identifier; +this.segmentId = segmentId; +this.writeDirectoryPath = writeDirectoryPath; + } + + /** + * Start of new block notification. 
+ * + * @param blockId file name of the carbondata file + */ + public abstract void onBlockStart(String blockId); + + /** + * End of block notification + */ + public abstract void onBlockEnd(String blockId); + + /** + * Start of new blocklet notification. + * + * @param blockletId sequence number of blocklet in the block + */ + public abstract void onBlockletStart(int blockletId); + + /** + * End of blocklet notification + * + * @param blockletId sequence number of blocklet in the block + */ + public abstract void onBlockletEnd(int blockletId); + + /** + * Add the column pages row to the datamap, order of pages is same as `indexColumns` in + * DataMapMeta returned in DataMapFactory. + * Implementation should copy the content of `pages` as needed, because `pages` memory + * may be freed after this method returns, if using unsafe column page. + */ + public abstract void onPageAdded(int blockletId, int pageId, ColumnPage[] pages); + + /** + * This is called during closing of writer.So after this call no more data will be sent to this + * class. + */ + public abstract void finish(); + + /** + * It copies the file from temp folder to actual folder + * + * @param dataMapFile + * @throws IOException + */ + protected void commitFile(String dataMapFile) throws IOException { --- End diff -- Basically, this method should be used inside the DataMapWriter implementation to copy files once it finishes writing them. It is used for copying from the temp location to the store. If an error occurs here, it will be thrown to the DataMapWriter implementation, and the writer implementation should handle it; otherwise the load fails, because the error gets thrown to the fact writer ---
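The abstract methods quoted in the diff define an event protocol: the fact writer notifies the datamap writer as blocks, blocklets, and pages are produced, then calls finish() when the load completes. A rough Python sketch of that call order (class name, file name, and the simulation driver are all hypothetical, not the CarbonData API):

```python
# Mirrors the notification methods in the AbstractDataMapWriter diff above.
class RecordingWriter:
    def __init__(self):
        self.events = []

    def on_block_start(self, block_id):
        self.events.append(("block_start", block_id))

    def on_blocklet_start(self, blocklet_id):
        self.events.append(("blocklet_start", blocklet_id))

    def on_page_added(self, blocklet_id, page_id, pages):
        # Real implementations must copy `pages` here: the memory may be
        # freed after this callback returns.
        self.events.append(("page", blocklet_id, page_id))

    def on_blocklet_end(self, blocklet_id):
        self.events.append(("blocklet_end", blocklet_id))

    def on_block_end(self, block_id):
        self.events.append(("block_end", block_id))

    def finish(self):
        self.events.append(("finish",))

def simulate_load(writer, n_blocklets=2, pages_per_blocklet=2):
    """Drive the writer with the nesting order a fact writer would use."""
    writer.on_block_start("part-0.carbondata")
    for blocklet in range(n_blocklets):
        writer.on_blocklet_start(blocklet)
        for page in range(pages_per_blocklet):
            writer.on_page_added(blocklet, page, pages=[])
        writer.on_blocklet_end(blocklet)
    writer.on_block_end("part-0.carbondata")
    writer.finish()

w = RecordingWriter()
simulate_load(w)
print(w.events[0], w.events[-1])
```

After finish(), a concrete writer would call commitFile to move its output from the temp location to the store, handling any IOException itself so the failure does not propagate to the fact writer.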
[jira] [Created] (CARBONDATA-1758) (Carbon1.3.0- No Inverted Index) - Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
Chetan Bhat created CARBONDATA-1758: --- Summary: (Carbon1.3.0- No Inverted Index) - Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException Key: CARBONDATA-1758 URL: https://issues.apache.org/jira/browse/CARBONDATA-1758 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 1.3.0 Environment: 3 node cluster Reporter: Chetan Bhat Steps : In Beeline user executes the queries in sequence. CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id'); LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table uniqdata_DI_int OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); Select count(CUST_ID) from uniqdata_DI_int; Select count(CUST_ID)*10 as multiple from uniqdata_DI_int; Select avg(CUST_ID) as average from uniqdata_DI_int; Select floor(CUST_ID) as average from uniqdata_DI_int; Select ceil(CUST_ID) as average from uniqdata_DI_int; Select ceiling(CUST_ID) as average from uniqdata_DI_int; Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int; Select CUST_ID from uniqdata_DI_int where CUST_ID is null; Issue : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where CUST_ID is null; Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in 
stage 79.0 (TID 123, BLR114278, executor 18): org.apache.spark.util.TaskCompletionListenerException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105) at org.apache.spark.scheduler.Task.run(Task.scala:112) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: (state=,code=0) Expected : Select column with is null for no_inverted_index column should be successful displaying the correct result set. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1758) (Carbon1.3.0- No Inverted Index) - Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1758: Description: Steps : In Beeline user executes the queries in sequence. CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id'); LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table uniqdata_DI_int OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); Select count(CUST_ID) from uniqdata_DI_int; Select count(CUST_ID)*10 as multiple from uniqdata_DI_int; Select avg(CUST_ID) as average from uniqdata_DI_int; Select floor(CUST_ID) as average from uniqdata_DI_int; Select ceil(CUST_ID) as average from uniqdata_DI_int; Select ceiling(CUST_ID) as average from uniqdata_DI_int; Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int; Select CUST_ID from uniqdata_DI_int where CUST_ID is null; *Issue : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException* 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where CUST_ID is null; Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in stage 79.0 (TID 123, BLR114278, executor 18): org.apache.spark.util.TaskCompletionListenerException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 0 at 
org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105) at org.apache.spark.scheduler.Task.run(Task.scala:112) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: (state=,code=0) Expected : Select column with is null for no_inverted_index column should be successful displaying the correct result set.
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user dhatchayani commented on the issue: https://github.com/apache/carbondata/pull/1520 retest sdv please ---
[GitHub] carbondata issue #1435: [CARBONDATA-1626]add data size and index size in tab...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1435 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1221/ ---
[jira] [Created] (CARBONDATA-1759) Carbon1.3.0 Clean command is not working correctly for segments marked for delete due to insert overwrite job
Ajeet Rai created CARBONDATA-1759: - Summary: Carbon1.3.0 Clean command is not working correctly for segments marked for delete due to insert overwrite job Key: CARBONDATA-1759 URL: https://issues.apache.org/jira/browse/CARBONDATA-1759 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 1.3.0 Environment: 3 Node ant cluster Reporter: Ajeet Rai Carbon1.3.0 Clean command is not working correctly for segments marked for delete due to insert overwrite job. 1: Create a table CREATE TABLE IF NOT EXISTS flow_carbon_new999(txn_dte String,dt String,txn_bk String,txn_br String,own_bk String,own_br String,opp_bk String,bus_opr_cde String,opt_prd_cde String,cus_no String,cus_ac String,opp_ac_nme String,opp_ac String,bv_no String,aco_ac String,ac_dte String,txn_cnt int,jrn_par int,mfm_jrn_no String,cbn_jrn_no String,ibs_jrn_no String,vch_no String,vch_seq String,srv_cde String,bus_cd_no String,id_flg String,bv_cde String,txn_time String,txn_tlr String,ety_tlr String,ety_bk String,ety_br String,bus_pss_no String,chk_flg String,chk_tlr String,chk_jrn_no String, bus_sys_no String,txn_sub_cde String,fin_bus_cde String,fin_bus_sub_cde String,chl String,tml_id String,sus_no String,sus_seq String, cho_seq String, itm_itm String,itm_sub String,itm_sss String,dc_flg String,amt decimal(15,2),bal decimal(15,2),ccy String,spv_flg String,vch_vld_dte String,pst_bk String,pst_br String,ec_flg String,aco_tlr String,gen_flg String,his_rec_sum_flg String,his_flg String,vch_typ String,val_dte String,opp_ac_flg String,cmb_flg String,ass_vch_flg String,cus_pps_flg String,bus_rmk_cde String,vch_bus_rmk String,tec_rmk_cde String,vch_tec_rmk String,gems_last_upd_d String,maps_date String,maps_job String)STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='txn_cnt,jrn_par,amt,bal','No_Inverted_Index'= 'txn_dte,dt,txn_bk,txn_br,own_bk ,own_br ,opp_bk ,bus_opr_cde ,opt_prd_cde ,cus_no ,cus_ac ,opp_ac_nme ,opp_ac ,bv_no ,aco_ac ,ac_dte ,txn_cnt ,jrn_par 
,mfm_jrn_no ,cbn_jrn_no ,ibs_jrn_no ,vch_no ,vch_seq ,srv_cde ,bus_cd_no ,id_flg ,bv_cde ,txn_time ,txn_tlr ,ety_tlr ,ety_bk ,ety_br ,bus_pss_no ,chk_flg ,chk_tlr ,chk_jrn_no , bus_sys_no ,txn_sub_cde ,fin_bus_cde ,fin_bus_sub_cde ,chl ,tml_id ,sus_no ,sus_seq , cho_seq , itm_itm ,itm_sub ,itm_sss ,dc_flg ,amt,bal,ccy ,spv_flg ,vch_vld_dte ,pst_bk ,pst_br ,ec_flg ,aco_tlr ,gen_flg ,his_rec_sum_flg ,his_flg ,vch_typ ,val_dte ,opp_ac_flg ,cmb_flg ,ass_vch_flg ,cus_pps_flg ,bus_rmk_cde ,vch_bus_rmk ,tec_rmk_cde ,vch_tec_rmk ,gems_last_upd_d ,maps_date ,maps_job' ); 2: start a data load. LOAD DATA inpath 'hdfs://hacluster/user/test/20140101_1_1.csv' into table flow_carbon_new999 options('DELIMITER'=',', 'QUOTECHAR'='"','header'='false'); 3: run a insert overwrite job insert into table flow_carbon_new999 select * from flow_carbon_new666; 4: run show segment query: show segments for table ajeet.flow_carbon_new999 5: Observe that all previous segments are marked for delete 6: run clean query CLEAN FILES FOR TABLE ajeet.flow_carbon_new999; 7: again run show segment query 8: Observe that still all previous segments which are marked for delete are shown as result. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
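The expected behavior in the report above amounts to a small state machine over segment statuses: insert overwrite marks all prior segments for delete, and CLEAN FILES should then drop those entries so SHOW SEGMENTS no longer lists them. A hypothetical Python sketch of that expectation (status strings and structure are illustrative only, not CarbonData's table status format):

```python
# Status values are illustrative strings, not CarbonData's actual enums.
SUCCESS, MARKED_FOR_DELETE = "Success", "Marked for Delete"

segments = {0: SUCCESS, 1: SUCCESS}   # two earlier loads

def insert_overwrite(segments):
    """Overwrite marks every existing segment for delete, then adds one."""
    new_id = max(segments) + 1
    for sid in segments:
        segments[sid] = MARKED_FOR_DELETE
    segments[new_id] = SUCCESS
    return new_id

def clean_files(segments):
    """CLEAN FILES should drop entries marked for delete."""
    for sid in [s for s, st in segments.items() if st == MARKED_FOR_DELETE]:
        del segments[sid]

insert_overwrite(segments)
clean_files(segments)
print(sorted(segments))  # [2] -- only the overwrite segment remains
```

The bug report says step 8 still shows the marked-for-delete segments, i.e. the clean step leaves the status entries in place instead of removing them as sketched here.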
[jira] [Updated] (CARBONDATA-1759) (Carbon1.3.0 - Clean Files) Clean command is not working correctly for segments marked for delete due to insert overwrite job
[ https://issues.apache.org/jira/browse/CARBONDATA-1759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1759: Summary: (Carbon1.3.0 - Clean Files) Clean command is not working correctly for segments marked for delete due to insert overwrite job (was: Carbon1.3.0 Clean command is not working correctly for segments marked for delete due to insert overwrite job) > (Carbon1.3.0 - Clean Files) Clean command is not working correctly for > segments marked for delete due to insert overwrite job > -- > > Key: CARBONDATA-1759 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1759 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.3.0 > Environment: 3 Node ant cluster >Reporter: Ajeet Rai > Labels: dfx > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1760) Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when parent table name is not correct while creating datamap.
Ayushi Sharma created CARBONDATA-1760: - Summary: Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when parent table name is not correct while creating datamap. Key: CARBONDATA-1760 URL: https://issues.apache.org/jira/browse/CARBONDATA-1760 Project: CarbonData Issue Type: Bug Components: sql Affects Versions: 1.3.0 Reporter: Ayushi Sharma Priority: Minor Steps: 1. CREATE DATAMAP tt3 ON TABLE cust_2 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" AS SELECT c_custkey, c_name, sum(c_acctbal), avg(c_acctbal), count(c_acctbal) FROM tstcust GROUP BY c_custkey, c_name; Issue: Proper error message is not displayed. It throws "assertion failed" error. Expected: Proper error message should be displayed, if parent table name has any ambiguity. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1760) Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when parent table name is not correct while creating datamap.
[ https://issues.apache.org/jira/browse/CARBONDATA-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayushi Sharma updated CARBONDATA-1760: -- Labels: dfx (was: ) > Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when > parent table name is not correct while creating datamap. > > Key: CARBONDATA-1760 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1760 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma >Priority: Minor > Labels: dfx > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1520 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1222/ ---
[GitHub] carbondata issue #1523: [CARBONDATA-1756] Improve Boolean data compress rate...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1523 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1223/ ---
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user dhatchayani commented on the issue: https://github.com/apache/carbondata/pull/1520 Retest this please ---
[GitHub] carbondata issue #1515: [CARBONDATA-1751] Modify sys.err to AnalysisExceptio...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1515 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1224/ ---
[jira] [Created] (CARBONDATA-1761) (Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment is marked for delete if respective id is given in delete segment by id query
Ajeet Rai created CARBONDATA-1761: - Summary: (Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment is marked for delete if respective id is given in delete segment by id query Key: CARBONDATA-1761 URL: https://issues.apache.org/jira/browse/CARBONDATA-1761 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 1.3.0 Environment: 3 Node ant cluster Description Reporter: Ajeet Rai (Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment is marked for delete if respective id is given in delete segment by id query. 1: Create a table CREATE TABLE IF NOT EXISTS flow_carbon_new999(txn_dte String,dt String,txn_bk String,txn_br String,own_bk String,own_br String,opp_bk String,bus_opr_cde String,opt_prd_cde String,cus_no String,cus_ac String,opp_ac_nme String,opp_ac String,bv_no String,aco_ac String,ac_dte String,txn_cnt int,jrn_par int,mfm_jrn_no String,cbn_jrn_no String,ibs_jrn_no String,vch_no String,vch_seq String,srv_cde String,bus_cd_no String,id_flg String,bv_cde String,txn_time String,txn_tlr String,ety_tlr String,ety_bk String,ety_br String,bus_pss_no String,chk_flg String,chk_tlr String,chk_jrn_no String, bus_sys_no String,txn_sub_cde String,fin_bus_cde String,fin_bus_sub_cde String,chl String,tml_id String,sus_no String,sus_seq String, cho_seq String, itm_itm String,itm_sub String,itm_sss String,dc_flg String,amt decimal(15,2),bal decimal(15,2),ccy String,spv_flg String,vch_vld_dte String,pst_bk String,pst_br String,ec_flg String,aco_tlr String,gen_flg String,his_rec_sum_flg String,his_flg String,vch_typ String,val_dte String,opp_ac_flg String,cmb_flg String,ass_vch_flg String,cus_pps_flg String,bus_rmk_cde String,vch_bus_rmk String,tec_rmk_cde String,vch_tec_rmk String,gems_last_upd_d String,maps_date String,maps_job String)STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='txn_cnt,jrn_par,amt,bal','No_Inverted_Index'= 'txn_dte,dt,txn_bk,txn_br,own_bk ,own_br ,opp_bk ,bus_opr_cde ,opt_prd_cde ,cus_no ,cus_ac 
,opp_ac_nme ,opp_ac ,bv_no ,aco_ac ,ac_dte ,txn_cnt ,jrn_par ,mfm_jrn_no ,cbn_jrn_no ,ibs_jrn_no ,vch_no ,vch_seq ,srv_cde ,bus_cd_no ,id_flg ,bv_cde ,txn_time ,txn_tlr ,ety_tlr ,ety_bk ,ety_br ,bus_pss_no ,chk_flg ,chk_tlr ,chk_jrn_no , bus_sys_no ,txn_sub_cde ,fin_bus_cde ,fin_bus_sub_cde ,chl ,tml_id ,sus_no ,sus_seq , cho_seq , itm_itm ,itm_sub ,itm_sss ,dc_flg ,amt,bal,ccy ,spv_flg ,vch_vld_dte ,pst_bk ,pst_br ,ec_flg ,aco_tlr ,gen_flg ,his_rec_sum_flg ,his_flg ,vch_typ ,val_dte ,opp_ac_flg ,cmb_flg ,ass_vch_flg ,cus_pps_flg ,bus_rmk_cde ,vch_bus_rmk ,tec_rmk_cde ,vch_tec_rmk ,gems_last_upd_d ,maps_date ,maps_job' ); 2: start a data load. LOAD DATA inpath 'hdfs://hacluster/user/test/20140101_1_1.csv' into table flow_carbon_new999 options('DELIMITER'=',', 'QUOTECHAR'='"','header'='false'); 3: run a insert into/overwrite job insert into table flow_carbon_new999 select * from flow_carbon_new666; 4: show segments for table flow_carbon_new999; 5: Observe that load/insert/overwrite job is started with new segment id 6: now run a delete segment by id query with this id. DELETE FROM TABLE ajeet.flow_carbon_new999 WHERE SEGMENT.ID IN (34) 7: again run show segment and see this segment which is still in progress is marked for delete. 8: Observe that insert/load job is still running and after some time(in next job of load/insert/overwrite), this job fails with below error: Error: java.lang.RuntimeException: It seems insert overwrite has been issued during load (state=,code=0) This is not correct behaviour and it should be handled. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
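The guard the report calls for can be sketched as follows (the enum values and helper are assumptions modeled on the described behaviour, not the real CarbonData segment-status code): a delete-segment-by-id request should skip, and report back, any segment whose load is still in progress instead of marking it for delete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SegmentDeleteGuard {
    public enum Status { SUCCESS, INSERT_IN_PROGRESS, MARKED_FOR_DELETE }

    // Marks the requested segment ids for delete, except those whose load is
    // still running; returns the ids that were skipped so the caller can
    // report them to the user.
    public static List<String> markForDelete(Map<String, Status> segments, List<String> ids) {
        List<String> skipped = new ArrayList<>();
        for (String id : ids) {
            Status status = segments.get(id);
            if (status == Status.INSERT_IN_PROGRESS) {
                skipped.add(id);  // cannot delete a segment mid-load
            } else if (status != null) {
                segments.put(id, Status.MARKED_FOR_DELETE);
            }
        }
        return skipped;
    }
}
```

In the reported scenario, segment 34 is INSERT_IN_PROGRESS, so it would be skipped rather than marked, and the concurrent insert job would not fail.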
[jira] [Created] (CARBONDATA-1762) Remove existing column level dateformat and support dateformat, timestampformat in the load option
dhatchayani created CARBONDATA-1762: --- Summary: Remove existing column level dateformat and support dateformat, timestampformat in the load option Key: CARBONDATA-1762 URL: https://issues.apache.org/jira/browse/CARBONDATA-1762 Project: CarbonData Issue Type: Improvement Reporter: dhatchayani Assignee: Akash R Nilugal -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1763) Carbon1.3.0-Pre-AggregateTable - Recreating a failed pre-aggregate table fails due to table exists
Ramakrishna S created CARBONDATA-1763: - Summary: Carbon1.3.0-Pre-AggregateTable - Recreating a failed pre-aggregate table fails due to table exists Key: CARBONDATA-1763 URL: https://issues.apache.org/jira/browse/CARBONDATA-1763 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 1.3.0 Environment: Test - 3 node ant cluster Reporter: Ramakrishna S Assignee: Kunal Kapoor Fix For: 1.3.0 Steps: 1. Create table and load with large data create table if not exists lineitem4(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'=''); load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem4 options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT'); 2. Create a pre-aggregate table create datamap agr_lineitem4 ON TABLE lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as select L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem4 group by L_RETURNFLAG, L_LINESTATUS; 3. Run aggregate query at the same time select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus; *+Expected:+*: aggregate query should fetch data either from main table or pre-aggregate table. 
*+Actual:+* aggregate query does not return data until the pre-aggregate table is created 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus; +---+---+--+---+--+ | l_returnflag | l_linestatus | sum(l_quantity) | sum(l_extendedprice) | +---+---+--+---+--+ +---+---+--+---+--+ No rows selected (1.74 seconds) 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus; +---+---+--+---+--+ | l_returnflag | l_linestatus | sum(l_quantity) | sum(l_extendedprice) | +---+---+--+---+--+ +---+---+--+---+--+ No rows selected (0.746 seconds) 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus; +---+---+--++--+ | l_returnflag | l_linestatus | sum(l_quantity) | sum(l_extendedprice) | +---+---+--++--+ | N | F | 2.9808092E7 | 4.471079473931997E10 | | A | F | 1.145546488E9| 1.717580824169429E12 | | N | O | 2.31980219E9 | 3.4789002701143467E12 | | R | F | 1.146403932E9| 1.7190627928317903E12 | +---+---+--++--+ 4 rows selected (0.8 seconds) 0: jdbc:hive2://10.18.98.48:23040> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus; +---+---+--++--+ | l_returnflag | l_linestatus | sum(l_quantity) | sum(l_extendedprice) | +---+---+--++--+ | N | F | 2.9808092E7 | 4.471079473931997E10 | | A | F | 1.145546488E9| 1.717580824169429E12 | | N | O | 2.31980219E9 | 3.4789002701143467E12 | | R | F | 1.146403932E9| 1.7190627928317903E12 | +---+---+--++--+ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata issue #1514: [CARBONDATA-1746] Count star optimization
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1514 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1225/ ---
[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...
GitHub user akashrn5 opened a pull request: https://github.com/apache/carbondata/pull/1524 [CARBONDATA-1762] Remove existing column level dateformat and support dateformat, timestampformat in the load option (1) Remove column level dateformat option (2) Support dateformat and timestampformat in load options(table level) Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [X] Document update required? 2 new load level properties are added. Document to be updated accordingly. - [X] Testing done UT Added - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/akashrn5/incubator-carbondata timeformat Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1524.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1524 commit f7b253c672c4eed21466806cd4bc9990264a8a37 Author: akashrn5 Date: 2017-11-17T11:25:33Z [CARBONDATA-1762] Remove existing column level dateformat and support dateformat, timestampformat in the load option ---
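The table-level precedence this PR describes can be sketched like so (the key names follow the carbon.options.* convention from the codebase, but the resolver itself is hypothetical): a format passed in the LOAD OPTIONS wins over the session-level carbon.options.* default, and an empty value means "no explicit format".

```java
import java.util.Map;

public class LoadFormatResolver {
    // Resolves a dateformat/timestampformat for one load: the per-load option
    // takes precedence; otherwise fall back to the session property; null
    // means no explicit format was configured anywhere.
    public static String resolve(Map<String, String> loadOptions,
                                 Map<String, String> sessionProps,
                                 String optionKey, String propertyKey) {
        String fromLoad = loadOptions.get(optionKey);
        if (fromLoad != null && !fromLoad.trim().isEmpty()) {
            return fromLoad;
        }
        String fromSession = sessionProps.get(propertyKey);
        return (fromSession == null || fromSession.trim().isEmpty()) ? null : fromSession;
    }
}
```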
[jira] [Updated] (CARBONDATA-1763) Carbon1.3.0-Pre-AggregateTable - Recreating a failed pre-aggregate table fails due to table exists
[ https://issues.apache.org/jira/browse/CARBONDATA-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna S updated CARBONDATA-1763: -- Description: Steps: 1. Create table and load with data 2. Run update query on the table - this will take table metalock 3. In parallel run the pre-aggregate table create step - this will not be allowed due to table lock 4. Rerun pre-aggegate table create step *+Expected:+* Pre-aggregate table should be created *+Actual:+* Pre-aggregate table creation fails +Create, Load & Update+: 0: jdbc:hive2://10.18.98.136:23040> create table if not exists lineitem4(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'=''); +-+--+ | Result | +-+--+ +-+--+ No rows selected (0.266 seconds) 0: jdbc:hive2://10.18.98.136:23040> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table lineitem4 options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT'); +-+--+ | Result | +-+--+ +-+--+ No rows selected (6.331 seconds) 0: jdbc:hive2://10.18.98.136:23040> update lineitem4 set (l_linestatus) = ('xx'); +Create Datamap:+ 0: jdbc:hive2://10.18.98.136:23040> create datamap agr_lineitem4 ON TABLE lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem4 group by l_returnflag, l_linestatus; 
Error: java.lang.RuntimeException: Acquire table lock failed after retry, please try after some time (state=,code=0) 0: jdbc:hive2://10.18.98.136:23040> select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem4 group by l_returnflag, l_linestatus; +---+---+--+-++--+ | l_returnflag | l_linestatus | sum(l_quantity) | avg(l_quantity) | count(l_quantity) | +---+---+--+-++--+ | N | xx| 1.2863213E7 | 25.48745561614304 | 504688 | | A | xx| 6318125.0| 25.506342144783375 | 247708 | | R | xx| 6321939.0| 25.532459087898417 | 247604 | +---+---+--+-++--+ 3 rows selected (1.033 seconds) 0: jdbc:hive2://10.18.98.136:23040> create datamap agr_lineitem4 ON TABLE lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as select l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) from lineitem4 group by l_returnflag, l_linestatus; Error: java.lang.RuntimeException: Table [lineitem4_agr_lineitem4] already exists under database [test_db1] (state=,code=0) was: Steps: 1. Create table and load with large data create table if not exists lineitem4(L_SHIPDATE string,L_SHIPMODE string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'=''); load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem4 options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT'); 2. 
Create a pre-aggregate table create datamap agr_lineitem4 ON TABLE lineitem4 USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as select L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem4 group by L_RETURNFLAG, L_LINESTATUS; 3. Run aggregate query at the same time select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 group by l_returnflag, l_linestatus; *+Expected:+*: aggregate query should fetch data either from main table or pre-aggr
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151665636 --- Diff: integration/spark2/src/test/scala/org/apache/spark/sql/GetDataSizeAndIndexSizeTest.scala --- @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql + +import org.apache.spark.sql.test.util.QueryTest +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.scalatest.BeforeAndAfterAll + +class GetDataSizeAndIndexSizeTest extends QueryTest with BeforeAndAfterAll { --- End diff -- Please add a testcase for the update scenario ---
[GitHub] carbondata issue #1491: [CARBONDATA-1651] [Supported Boolean Type When Savin...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1491 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1226/ ---
[GitHub] carbondata issue #1435: [CARBONDATA-1626]add data size and index size in tab...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1435 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1227/ ---
[jira] [Updated] (CARBONDATA-1726) Carbon1.3.0-Streaming - Select query from spark-shell does not execute successfully for streaming table load
[ https://issues.apache.org/jira/browse/CARBONDATA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1726: Description: Steps : // prepare csv file for batch loading cd /srv/spark2.2Bigdata/install/hadoop/datanode/bin // generate streamSample.csv 10001,batch_1,city_1,0.1,school_1:school_11$20 10002,batch_2,city_2,0.2,school_2:school_22$30 10003,batch_3,city_3,0.3,school_3:school_33$40 10004,batch_4,city_4,0.4,school_4:school_44$50 10005,batch_5,city_5,0.5,school_5:school_55$60 // put to hdfs /tmp/streamSample.csv ./hadoop fs -put streamSample.csv /tmp // spark-beeline cd /srv/spark2.2Bigdata/install/spark/sparkJdbc bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar "hdfs://hacluster/user/sparkhive/warehouse" bin/beeline -u jdbc:hive2://10.18.98.34:23040 CREATE TABLE stream_table( id INT, name STRING, city STRING, salary FLOAT ) STORED BY 'carbondata' TBLPROPERTIES('streaming'='true', 'sort_columns'='name'); LOAD DATA LOCAL INPATH 'hdfs://hacluster/chetan/streamSample.csv' INTO TABLE stream_table OPTIONS('HEADER'='false'); // spark-shell cd /srv/spark2.2Bigdata/install/spark/sparkJdbc bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G --num-executors 3 --jars /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar import java.io.{File, PrintWriter} import java.net.ServerSocket import org.apache.spark.sql.{CarbonEnv, SparkSession} import org.apache.spark.sql.hive.CarbonRelation import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.util.CarbonProperties import 
org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") import org.apache.spark.sql.CarbonSession._ val carbonSession = SparkSession. builder(). appName("StreamExample"). config("spark.sql.warehouse.dir", "hdfs://hacluster/user/sparkhive/warehouse"). config("javax.jdo.option.ConnectionURL", "jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8"). config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver"). config("javax.jdo.option.ConnectionPassword", "huawei"). config("javax.jdo.option.ConnectionUserName", "sparksql"). getOrCreateCarbonSession() carbonSession.sparkContext.setLogLevel("ERROR") carbonSession.sql("select * from stream_table").show def writeSocket(serverSocket: ServerSocket): Thread = { val thread = new Thread() { override def run(): Unit = { // wait for client to connection request and accept val clientSocket = serverSocket.accept() val socketWriter = new PrintWriter(clientSocket.getOutputStream()) var index = 0 for (_ <- 1 to 1000) { // write 5 records per iteration for (_ <- 0 to 100) { index = index + 1 socketWriter.println(index.toString + ",name_" + index + ",city_" + index + "," + (index * 1.00).toString + ",school_" + index + ":school_" + index + index + "$" + index) } socketWriter.flush() Thread.sleep(2000) } socketWriter.close() System.out.println("Socket closed") } } thread.start() thread } def startStreaming(spark: SparkSession, tablePath: CarbonTablePath): Thread = { val thread = new Thread() { override def run(): Unit = { var qry: StreamingQuery = null try { val readSocketDF = spark.readStream .format("socket") .option("host", "10.18.98.34") .option("port", 7071) .load() // Write data from socket stream to carbondata file qry = readSocketDF.writeStream .format("carbondata") .trigger(ProcessingTime("5 seconds")) .option("checkpointLocation", tablePath.getStreamingCheckpointDir) .option("tablePath", 
tablePath.getPath) .start() qry.awaitTermination() } catch { case _: InterruptedException => println("Done reading and writing streaming data") } finally { qry.stop() } } } thread.start() thread } val streamTableName = s"stream_table" val carbonTable = CarbonEnv.getInstance(carbonSession).carbonMetastore. lookupRelation(Some("default"), streamTableName)(carbonSession).asInstanceOf[CarbonRelation]. tableMeta.carbonTable val tablePath = CarbonStorePath.getCarbonTablePath(carbonTable.get
[jira] [Updated] (CARBONDATA-1726) Carbon1.3.0-Streaming - Null pointer exception is thrown when streaming is started in spark-shell
[ https://issues.apache.org/jira/browse/CARBONDATA-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1726: Summary: Carbon1.3.0-Streaming - Null pointer exception is thrown when streaming is started in spark-shell (was: Carbon1.3.0-Streaming - Select query from spark-shell does not execute successfully for streaming table load) > Carbon1.3.0-Streaming - Null pointer exception is thrown when streaming is > started in spark-shell > - > > Key: CARBONDATA-1726 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1726 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.3.0 > Environment: 3 node ant cluster SUSE 11 SP4 >Reporter: Chetan Bhat >Priority: Blocker > Labels: Functional > > Steps : > // prepare csv file for batch loading > cd /srv/spark2.2Bigdata/install/hadoop/datanode/bin > // generate streamSample.csv > 10001,batch_1,city_1,0.1,school_1:school_11$20 > 10002,batch_2,city_2,0.2,school_2:school_22$30 > 10003,batch_3,city_3,0.3,school_3:school_33$40 > 10004,batch_4,city_4,0.4,school_4:school_44$50 > 10005,batch_5,city_5,0.5,school_5:school_55$60 > // put to hdfs /tmp/streamSample.csv > ./hadoop fs -put streamSample.csv /tmp > // spark-beeline > cd /srv/spark2.2Bigdata/install/spark/sparkJdbc > bin/spark-submit --master yarn-client --executor-memory 10G --executor-cores > 5 --driver-memory 5G --num-executors 3 --class > org.apache.carbondata.spark.thriftserver.CarbonThriftServer > /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar > "hdfs://hacluster/user/sparkhive/warehouse" > bin/beeline -u jdbc:hive2://10.18.98.34:23040 > CREATE TABLE stream_table( > id INT, > name STRING, > city STRING, > salary FLOAT > ) > STORED BY 'carbondata' > TBLPROPERTIES('streaming'='true', 'sort_columns'='name'); > LOAD DATA LOCAL INPATH 'hdfs://hacluster/chetan/streamSample.csv' INTO TABLE > stream_table OPTIONS('HEADER'='false'); > // 
spark-shell > cd /srv/spark2.2Bigdata/install/spark/sparkJdbc > bin/spark-shell --master yarn-client --executor-memory 10G --executor-cores 5 > --driver-memory 5G --num-executors 3 --jars > /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar > import java.io.{File, PrintWriter} > import java.net.ServerSocket > import org.apache.spark.sql.{CarbonEnv, SparkSession} > import org.apache.spark.sql.hive.CarbonRelation > import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} > import org.apache.carbondata.core.constants.CarbonCommonConstants > import org.apache.carbondata.core.util.CarbonProperties > import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} > CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, > "/MM/dd") > import org.apache.spark.sql.CarbonSession._ > val carbonSession = SparkSession. > builder(). > appName("StreamExample"). > config("spark.sql.warehouse.dir", > "hdfs://hacluster/user/sparkhive/warehouse"). > config("javax.jdo.option.ConnectionURL", > "jdbc:mysql://10.18.98.34:3306/sparksql?characterEncoding=UTF-8"). > config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver"). > config("javax.jdo.option.ConnectionPassword", "huawei"). > config("javax.jdo.option.ConnectionUserName", "sparksql"). 
> getOrCreateCarbonSession() > > carbonSession.sparkContext.setLogLevel("ERROR") > carbonSession.sql("select * from stream_table").show > def writeSocket(serverSocket: ServerSocket): Thread = { > val thread = new Thread() { > override def run(): Unit = { > // wait for client to connection request and accept > val clientSocket = serverSocket.accept() > val socketWriter = new PrintWriter(clientSocket.getOutputStream()) > var index = 0 > for (_ <- 1 to 1000) { > // write 5 records per iteration > for (_ <- 0 to 100) { > index = index + 1 > socketWriter.println(index.toString + ",name_" + index >+ ",city_" + index + "," + (index * > 1.00).toString + >",school_" + index + ":school_" + index + > index + "$" + index) > } > socketWriter.flush() > Thread.sleep(2000) > } > socketWriter.close() > System.out.println("Socket closed") > } > } > thread.start() > thread > } > > def startStreaming(spark: SparkSession, tablePath: CarbonTablePath): Thread = > { > val thread = new Thread() { > override def run(): Unit = { > var qry: StreamingQuery = null > try { > val readSocketDF = spark.re
[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1524#discussion_r151673677 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java --- @@ -52,6 +52,14 @@ public static final String CARBON_OPTIONS_DATEFORMAT = "carbon.options.dateformat"; public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = ""; + + /** + * option to specify the load option + */ + @CarbonProperty + public static final String CARBON_OPTIONS_TIMESTAMPFORMAT = + "carbon.options.dateformat"; --- End diff -- I think `carbon.options.dateformat ` should be `carbon.options.timestampformat` ---
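With the reviewer's fix applied, the constant pair would read as follows (the option values come from the diff and the reviewer's comment; the class here is trimmed down to just these options):

```java
public final class LoadOptionKeys {
    // existing date option, unchanged
    public static final String CARBON_OPTIONS_DATEFORMAT = "carbon.options.dateformat";
    public static final String CARBON_OPTIONS_DATEFORMAT_DEFAULT = "";

    // timestamp option with its own key, fixing the copy-paste slip where it
    // duplicated the dateformat key
    public static final String CARBON_OPTIONS_TIMESTAMPFORMAT = "carbon.options.timestampformat";
    public static final String CARBON_OPTIONS_TIMESTAMPFORMAT_DEFAULT = "";

    private LoadOptionKeys() { }
}
```

Without the fix, both constants would resolve to the same property, so setting a timestamp format would silently overwrite the date format.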
[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1524#discussion_r151674183 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/load/ValidateUtil.scala --- @@ -17,35 +17,32 @@ package org.apache.carbondata.spark.load -import scala.collection.JavaConverters._ +import java.text.SimpleDateFormat -import org.apache.carbondata.core.constants.CarbonCommonConstants import org.apache.carbondata.core.metadata.schema.table.CarbonTable -import org.apache.carbondata.processing.loading.model.CarbonLoadModel import org.apache.carbondata.processing.loading.sort.SortScopeOptions import org.apache.carbondata.spark.exception.MalformedCarbonCommandException object ValidateUtil { - def validateDateFormat(dateFormat: String, table: CarbonTable, tableName: String): Unit = { -val dimensions = table.getDimensionByTableName(tableName).asScala + + /** + * validates both timestamp and date for illegal values + * + * @param optionValue + * @param optionName + */ + def validateDateTimeFormat(optionValue: String, optionName: String): Unit = { --- End diff -- give proper names to parameters ---
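A minimal sketch of what validateDateTimeFormat could look like with the more descriptive parameter names the reviewer asks for; formatPattern and optionName are illustrative, not the actual CarbonData signature. SimpleDateFormat rejects illegal pattern letters at construction time, so probing the constructor is enough to validate the option value:

```java
import java.text.SimpleDateFormat;

class DateTimeFormatValidator {
    // Validates a user-supplied date/timestamp format option; throws with
    // the option name so the user knows which load option is wrong.
    static void validateDateTimeFormat(String formatPattern, String optionName) {
        if (formatPattern == null || formatPattern.trim().isEmpty()) {
            return; // empty means "fall back to the configured default"
        }
        try {
            new SimpleDateFormat(formatPattern);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                "Error: option " + optionName
                + " has an invalid format: " + formatPattern, e);
        }
    }
}
```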
[GitHub] carbondata pull request #1524: [CARBONDATA-1762] Remove existing column leve...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1524#discussion_r151674289 --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala --- @@ -799,18 +799,19 @@ object CarbonDataRDDFactory { throw new DataLoadingException("Partition column not found.") } -val dateFormatMap = CarbonDataProcessorUtil.getDateFormatMap(carbonLoadModel.getDateFormat) -val specificFormat = Option(dateFormatMap.get(partitionColumn.toLowerCase)) -val timeStampFormat = if (specificFormat.isDefined) { - new SimpleDateFormat(specificFormat.get) +val specificTimestampFormat = carbonLoadModel.getTimestampformat +val specificDateFormat = carbonLoadModel.getDateFormat +val timeStampFormat = if (specificTimestampFormat != null && + !specificTimestampFormat.trim.isEmpty) { --- End diff -- format it properly. ---
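The selection logic in this diff (prefer the load option's own timestamp format when it is non-empty, otherwise fall back to a default pattern) can be sketched as below; pickFormat and the patterns are illustrative names, not the CarbonDataRDDFactory code itself:

```java
import java.text.SimpleDateFormat;

class FormatPicker {
    // Mirrors the diff's intent: a non-blank per-load format wins,
    // otherwise the configured default pattern is used.
    static SimpleDateFormat pickFormat(String specificFormat, String defaultPattern) {
        if (specificFormat != null && !specificFormat.trim().isEmpty()) {
            return new SimpleDateFormat(specificFormat);
        }
        return new SimpleDateFormat(defaultPattern);
    }
}
```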
[GitHub] carbondata issue #1521: [WIP] [CARBONDATA-1743] fix conurrent pre-agg creati...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1521 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1228/ ---
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1520 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1229/ ---
[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1471 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1230/ ---
[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1524 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1231/ ---
[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...
Github user dhatchayani commented on the issue: https://github.com/apache/carbondata/pull/1524 retest this please ---
[jira] [Updated] (CARBONDATA-1688) Carbon 1.3.0-Partitioning: Hash Partitioning is not working for Date Column
[ https://issues.apache.org/jira/browse/CARBONDATA-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1688: Summary: Carbon 1.3.0-Partitioning: Hash Partitioning is not working for Date Column (was: Hash Partitioning is not working for Date Column) > Carbon 1.3.0-Partitioning: Hash Partitioning is not working for Date Column > --- > > Key: CARBONDATA-1688 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1688 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma > Attachments: Date_hash.PNG, Segment_0.PNG, Segment_1.PNG, > Segment_2.PNG > > > On applying hash partition on date column, all the data by default goes to > Default partition. > create table if not exists date_hash(col_A String) partitioned by (col_F > Timestamp) stored by 'carbondata' > tblproperties('partition_type'='hash','num_partitions'='5'); > insert into table date_hash select 'ayushi','2016-02-02'; > insert into table date_hash select 'ayushi','2016-02-05'; > insert into table date_hash select 'ayushi','2016-01-05'; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
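For context on the behaviour this bug report expects: hash partitioning routes each row by hashing the partition column value modulo num_partitions, so with 'num_partitions'='5' the inserted rows should spread over partitions 0..4 instead of all landing in the default partition. The sketch below illustrates the routing rule only; the hash function and null handling are assumptions, not CarbonData's exact implementation:

```java
class HashPartitioner {
    final int numPartitions;

    HashPartitioner(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // Route a column value to a partition id in [0, numPartitions).
    // floorMod keeps the result non-negative even for negative hash codes.
    int partitionFor(Object columnValue) {
        if (columnValue == null) {
            return 0; // null values typically fall into a default partition
        }
        return Math.floorMod(columnValue.hashCode(), numPartitions);
    }
}
```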
[jira] [Updated] (CARBONDATA-1687) Carbon 1.3.0-Partitioning:Hash Partitioning is not working for Date Column
[ https://issues.apache.org/jira/browse/CARBONDATA-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1687: Summary: Carbon 1.3.0-Partitioning:Hash Partitioning is not working for Date Column (was: Hash Partitioning is not working for Date Column) > Carbon 1.3.0-Partitioning:Hash Partitioning is not working for Date Column > -- > > Key: CARBONDATA-1687 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1687 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma > Attachments: Date_hash.PNG, Segment_0.PNG, Segment_1.PNG, > Segment_2.PNG > > > On applying hash partition on date column, all the data by default goes to > Default partition. > create table if not exists date_hash(col_A String) partitioned by (col_F > Timestamp) stored by 'carbondata' > tblproperties('partition_type'='hash','num_partitions'='5'); > insert into table date_hash select 'ayushi','2016-02-02'; > insert into table date_hash select 'ayushi','2016-02-05'; > insert into table date_hash select 'ayushi','2016-01-05'; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1679) Carbon 1.3.0-Partitioning:After Splitting the Partition,no records are displayed
[ https://issues.apache.org/jira/browse/CARBONDATA-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1679: Summary: Carbon 1.3.0-Partitioning:After Splitting the Partition,no records are displayed (was: After Splitting the Partition,no records are displayed) > Carbon 1.3.0-Partitioning:After Splitting the Partition,no records are > displayed > > > Key: CARBONDATA-1679 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1679 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma > Attachments: Split1.PNG > > > create table part_nation_4 (N_NATIONKEY BIGINT,N_REGIONKEY BIGINT,N_COMMENT > STRING) partitioned by (N_NAME STRING) stored by 'carbondata' > tblproperties('partition_type'='list','list_info'='ALGERIA,ARGENTINA,BRAZIL,CANADA,(EGYPT,ETHIOPIA,FRANCE),JAPAN'); > load data inpath '/spark-warehouse/tpchhive.db/nation/nation.tbl' into table > part_nation_4 > options('DELIMITER'='|','FILEHEADER'='N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT'); > show partitions part_nation_4; > ALTER TABLE part_nation_4 SPLIT PARTITION(5) > INTO('(EGYPT,ETHIOPIA)','FRANCE'); > show partitions part_nation_4; > select * from part_nation_4 where N_NAME='FRANCE'; > select * from part_nation_4 where N_NAME='EGYPT'; > select * from part_nation_4 where N_NAME='CANADA'; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1678) Carbon 1.3.0-Partitioning:Show partition throws index out of bounds exception
[ https://issues.apache.org/jira/browse/CARBONDATA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1678: Summary: Carbon 1.3.0-Partitioning:Show partition throws index out of bounds exception (was: Show partition throws index out of bounds exception) > Carbon 1.3.0-Partitioning:Show partition throws index out of bounds exception > - > > Key: CARBONDATA-1678 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1678 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma > Attachments: Show_part.PNG, Show_part.txt > > > create table part_nation_3 (N_NATIONKEY BIGINT,N_REGIONKEY BIGINT,N_COMMENT > STRING) partitioned by (N_NAME STRING) stored by 'carbondata' > tblproperties('partition_type'='list','list_info'='ALGERIA,ARGENTINA,BRAZIL,CANADA,(EGYPT,ETHIOPIA,FRANCE),JAPAN'); > ALTER TABLE part_nation_3 ADD PARTITION('SAUDI ARABIA,(VIETNAM,RUSSIA,UNITED > KINGDOM,UNITED STATES)'); > show partitions part_nation_3; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1676) Carbon 1.3.0-Partitioning:No records are displayed for the newly added partition.
[ https://issues.apache.org/jira/browse/CARBONDATA-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1676: Summary: Carbon 1.3.0-Partitioning:No records are displayed for the newly added partition. (was: No records are displayed for the newly added partition.) > Carbon 1.3.0-Partitioning:No records are displayed for the newly added > partition. > - > > Key: CARBONDATA-1676 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1676 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma > Attachments: AddPart.PNG, Add_part_logs.txt > > > create table part_nation (N_NATIONKEY BIGINT,N_REGIONKEY BIGINT,N_COMMENT > STRING) partitioned by (N_NAME STRING) stored by 'carbondata' > tblproperties('partition_type'='list','list_info'='ALGERIA,ARGENTINA,BRAZIL,CANADA,(EGYPT,ETHIOPIA),FRANCE,JAPAN'); > load data inpath '/spark-warehouse/tpchhive.db/nation/nation.tbl' into table > part_nation > options('DELIMITER'='|','FILEHEADER'='N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT'); > show partitions part_nation > select * from part_nation where N_NAME='Germany'; > ALTER TABLE part_nation ADD PARTITION('GERMANY'); > select * from part_nation where N_NAME='GERMANY'; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1681) Carbon 1.3.0-Partitioning:After dropping the partition, the data is also getting dropped.
[ https://issues.apache.org/jira/browse/CARBONDATA-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1681: Summary: Carbon 1.3.0-Partitioning:After dropping the partition, the data is also getting dropped. (was: After dropping the partition, the data is also getting dropped.) > Carbon 1.3.0-Partitioning:After dropping the partition, the data is also > getting dropped. > - > > Key: CARBONDATA-1681 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1681 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma > Attachments: drop_part1.PNG, drop_part_2.PNG > > > create table part_nation_drop (N_NATIONKEY BIGINT,N_REGIONKEY > BIGINT,N_COMMENT STRING) partitioned by (N_NAME STRING) stored by > 'carbondata' > tblproperties('partition_type'='list','list_info'='ALGERIA,ARGENTINA,BRAZIL,CANADA,(EGYPT,ETHIOPIA,FRANCE),JAPAN'); > show partitions part_nation_drop; > load data inpath '/spark-warehouse/tpchhive.db/nation/nation.tbl' into table > part_nation_drop > options('DELIMITER'='|','FILEHEADER'='N_NATIONKEY,N_NAME,N_REGIONKEY,N_COMMENT'); > select * from part_nation_drop where N_Name='ALGERIA'; > ALTER TABLE part_nation_drop DROP PARTITION(1); > select * from part_nation_drop where N_Name='ALGERIA'; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1691) Carbon 1.3.0-Partitioning:Document needs to be updated for Table properties (Sort_Scope) in create table
[ https://issues.apache.org/jira/browse/CARBONDATA-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1691: Summary: Carbon 1.3.0-Partitioning:Document needs to be updated for Table properties (Sort_Scope) in create table (was: Document needs to be updated for Table properties (Sort_Scope) in create table) > Carbon 1.3.0-Partitioning:Document needs to be updated for Table properties > (Sort_Scope) in create table > > > Key: CARBONDATA-1691 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1691 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma >Priority: Minor > Attachments: batch_sort.PNG > > > Document needs to be updated for Table properties (Sort_Scope) in create > table. > As per JIRA-1438, the sort_scope will be supported in the create statement > itself, but the same thing is not mentioned in the document. > Document Site- https://carbondata.apache.org/ddl-operation-on-carbondata.html -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1674) Carbon 1.3.0-Partitioning:Describe Formatted Should show the type of partition as well.
[ https://issues.apache.org/jira/browse/CARBONDATA-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1674: Summary: Carbon 1.3.0-Partitioning:Describe Formatted Should show the type of partition as well. (was: Describe Formatted Should show the type of partition as well.) > Carbon 1.3.0-Partitioning:Describe Formatted Should show the type of > partition as well. > --- > > Key: CARBONDATA-1674 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1674 > Project: CarbonData > Issue Type: Improvement > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma >Priority: Minor > Attachments: Jira_req_part1.PNG, jira_req_part2.PNG > > > Describe Formatted should show type of partitions as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1680) Carbon 1.3.0-Partitioning:Show Partition for Hash Partition doesn't display the partition id
[ https://issues.apache.org/jira/browse/CARBONDATA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1680: Summary: Carbon 1.3.0-Partitioning:Show Partition for Hash Partition doesn't display the partition id (was: Show Partition for Hash Partition doesn't display the partition id) > Carbon 1.3.0-Partitioning:Show Partition for Hash Partition doesn't display > the partition id > > > Key: CARBONDATA-1680 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1680 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma >Priority: Minor > Attachments: Show_part_1_doc.PNG, show_part_1.PNG > > > CREATE TABLE IF NOT EXISTS t9( > id Int, > logdate Timestamp, > phonenumber Int, > country String, > area String > ) > PARTITIONED BY (vin String) > STORED BY 'carbondata' > TBLPROPERTIES('PARTITION_TYPE'='HASH','NUM_PARTITIONS'='5'); > show partitions t9; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1673) Carbon 1.3.0-Partitioning:Show Partition for Range Partition is not showing the correct details.
[ https://issues.apache.org/jira/browse/CARBONDATA-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1673: Summary: Carbon 1.3.0-Partitioning:Show Partition for Range Partition is not showing the correct details. (was: Show Partition for Range Partition is not showing the correct details.) > Carbon 1.3.0-Partitioning:Show Partition for Range Partition is not showing > the correct details. > > > Key: CARBONDATA-1673 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1673 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma >Priority: Minor > Attachments: Range_recording.htm, Range_recording.swf > > > For description, please refer to the attachment. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1672) Carbon 1.3.0-Partitioning:Hash Partition is not working as specified in the document.
[ https://issues.apache.org/jira/browse/CARBONDATA-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1672: Summary: Carbon 1.3.0-Partitioning:Hash Partition is not working as specified in the document. (was: Hash Partition is not working as specified in the document.) > Carbon 1.3.0-Partitioning:Hash Partition is not working as specified in the > document. > - > > Key: CARBONDATA-1672 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1672 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 >Reporter: Ayushi Sharma >Priority: Minor > Attachments: Part2.PNG, Partition1.PNG > > > create table Carb_part (P_PARTKEY BIGINT,P_NAME STRING,P_MFGR STRING,P_BRAND > STRING,P_TYPE STRING,P_CONTAINER STRING,P_RETAILPRICE DOUBLE,P_COMMENT > STRING)PARTITIONED BY (P_SIZE int) STORED BY 'CARBONDATA' > TBLPROPERTIES('partition_type'='HASH','partition_num'='3'); > This command displays error as mentioned below: > Error: org.apache.carbondata.spark.exception.MalformedCarbonCommandException: > Error: Invalid partition definition (state=,code=0) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1751) Modify sys.err to AnalysisException when uses run related operation except IUD,compaction and alter
[ https://issues.apache.org/jira/browse/CARBONDATA-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated CARBONDATA-1751: Description: carbon printout improper error message, for example, it printout system error when users run create table with the same column name, but it should printout related exception information So we modify sys.error method to AnalysisException when uses run related operation except IUD,compaction and alter Make the type of exception and message correctly,including Spark2 and spark-common module was: carbon printout improper error message, for example, it printout system error when users run create table with the same column name, but it should printout related exception information So we modify sys.error method to AnalysisException when uses run related operation except IUD,compaction and alter > Modify sys.err to AnalysisException when uses run related operation except > IUD,compaction and alter > > > Key: CARBONDATA-1751 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1751 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.2.0 >Reporter: xubo245 >Assignee: xubo245 >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 1h > Remaining Estimate: 0h > > carbon printout improper error message, for example, it printout system error > when users run create table with the same column name, but it should printout > related exception information > So we modify sys.error method to AnalysisException when uses run related > operation except IUD,compaction and alter > Make the type of exception and message correctly,including Spark2 and > spark-common module -- This message was sent by Atlassian JIRA (v6.4.14#64029)
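The change this issue describes (surfacing user mistakes such as a duplicate column name as an analysis-style exception rather than a bare sys.error) can be sketched as below; the exception class and helper are illustrative stand-ins, not the actual Spark AnalysisException or CarbonData validation code:

```java
import java.util.HashSet;
import java.util.Set;

// Dedicated exception type so callers can distinguish user-facing
// analysis errors from genuine internal failures.
class AnalysisException extends RuntimeException {
    AnalysisException(String message) {
        super(message);
    }
}

class CreateTableValidator {
    // Reject create-table requests that repeat a column name
    // (case-insensitively), reporting which name is duplicated.
    static void checkDuplicateColumns(String[] columnNames) {
        Set<String> seen = new HashSet<>();
        for (String name : columnNames) {
            if (!seen.add(name.toLowerCase())) {
                throw new AnalysisException("Duplicate column name: " + name);
            }
        }
    }
}
```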
[jira] [Updated] (CARBONDATA-1728) Carbon1.3.0- DB creation external path : Delete data with select in where clause not successful for large data
[ https://issues.apache.org/jira/browse/CARBONDATA-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1728: Summary: Carbon1.3.0- DB creation external path : Delete data with select in where clause not successful for large data (was: (Carbon1.3.0- DB creation external path) - Delete data with select in where clause not successful for large data) > Carbon1.3.0- DB creation external path : Delete data with select in where > clause not successful for large data > -- > > Key: CARBONDATA-1728 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1728 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels: DFX > > Steps : > 0: jdbc:hive2://10.18.98.34:23040> create database test_db1 location > 'hdfs://hacluster/user/test1'; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.032 seconds) > 0: jdbc:hive2://10.18.98.34:23040> use test_db1; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.01 seconds) > 0: jdbc:hive2://10.18.98.34:23040> create table if not exists > ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS > string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK > string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY > 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.174 seconds) > 0: jdbc:hive2://10.18.98.34:23040> load data inpath > "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS > options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (27.421 seconds) > 0: jdbc:hive2://10.18.98.34:23040> create table h_orders as select * from > orders; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows 
selected (9.779 seconds) > 0: jdbc:hive2://10.18.98.34:23040> Delete from test_db1.orders a where exists > (select 1 from test_db1.h_orders b where b.o_ORDERKEY=a.O_ORDERKEY); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (48.998 seconds) > select count(*) from test_db1.orders; > Actual Issue : Select count displays shows all records present which means > the records are not deleted. > 0: jdbc:hive2://10.18.98.34:23040> select count(*) from test_db1.orders; > +---+--+ > | count(1) | > +---+--+ > | 750 | > +---+--+ > 1 row selected (7.967 seconds) > This indicates Delete data with select in where clause not successful for > large data. > Expected : The Delete data with select in where clause should be successful > for large data. The select count should return 0 records which indicates that > the records are deleted successfully. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1747) Carbon1.3.0- DB creation external path : Owner name of compacted segment and segment after update is not correct
[ https://issues.apache.org/jira/browse/CARBONDATA-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1747: Summary: Carbon1.3.0- DB creation external path : Owner name of compacted segment and segment after update is not correct (was: (Carbon1.3.0- DB creation external path) - Owner name of compacted segment and segment after update is not correct) > Carbon1.3.0- DB creation external path : Owner name of compacted segment and > segment after update is not correct > > > Key: CARBONDATA-1747 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1747 > Project: CarbonData > Issue Type: Bug > Components: other >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels: security > > Steps : > In spark Beeline user executes the following queries > drop database if exists test_db1 cascade; > create database test_db1 location 'hdfs://hacluster/user/test1'; > use test_db1; > create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY > string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE > double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY > 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); > load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS > options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); > load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS > options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); > load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS > 
options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); > load data inpath "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS > options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); > alter table ORDERS compact 'major'; > update orders set (O_ORDERKEY)=(1) where O_CUSTKEY=6259021; > After compaction and update user checks the Owner name of compacted segment > and segment name after update in HDFS UI. > Issue : In HDFS UI before compaction and update the owner name of the > existing segment folders was "anonymous". After compaction and update the > owner name of the compacted segment folder and segment which is impacted by > update is displayed as "root". > Expected : After compaction and update the owner name of the compacted > segment folder and segment which is impacted by update should be "anonymous". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1748) Carbon1.3.0- DB creation external path : Permission of created table and database folder in carbon store not correct
[ https://issues.apache.org/jira/browse/CARBONDATA-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1748: Summary: Carbon1.3.0- DB creation external path : Permission of created table and database folder in carbon store not correct (was: (Carbon1.3.0- DB creation external path) - Permission of created table and database folder in carbon store not correct) > Carbon1.3.0- DB creation external path : Permission of created table and > database folder in carbon store not correct > > > Key: CARBONDATA-1748 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1748 > Project: CarbonData > Issue Type: Bug > Components: other >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels: security > > Steps : > In spark Beeline user executes the following queries. > drop database if exists test_db1 cascade; > create database test_db1 location 'hdfs://hacluster/user/test1'; > use test_db1; > create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY > string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE > double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY > 'org.apache.carbondata.format'TBLPROPERTIES ('table_blocksize'='128'); > User checks the permission of the created database and table in carbon store > using the bin/hadoop fs -getfacl command. > Issue : The Permission of created table and database folder in carbon store > not correct. i.e > # file: /user/test1/orders > # owner: anonymous > # group: users > user::rwx > group::r-x > other::r-x > Expected : Correct permissions for the created table and database folder in > carbon store should be > # file: /user/test1/orders > # owner: anonymous > # group: users > user::rwx > group::--- > other::--- -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1731) Carbon1.3.0- DB creation external path: Update fails incorrectly with error for table created in external db location
[ https://issues.apache.org/jira/browse/CARBONDATA-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-1731: Summary: Carbon1.3.0- DB creation external path: Update fails incorrectly with error for table created in external db location (was: (Carbon1.3.0- DB creation external path) Update fails incorrectly with error for table created in external db location) > Carbon1.3.0- DB creation external path: Update fails incorrectly with error > for table created in external db location > - > > Key: CARBONDATA-1731 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1731 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels: DFX > > Steps : > 0: jdbc:hive2://10.18.98.34:23040> drop database if exists test_db1 cascade; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.279 seconds) > 0: jdbc:hive2://10.18.98.34:23040> create database test_db1 location > 'hdfs://hacluster/user/test1'; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.04 seconds) > 0: jdbc:hive2://10.18.98.34:23040> use test_db1; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.011 seconds) > 0: jdbc:hive2://10.18.98.34:23040> create table if not exists > ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS > string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK > string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY > 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.15 seconds) > 0: jdbc:hive2://10.18.98.34:23040> load data inpath > "hdfs://hacluster/chetan/orders.tbl.1" into table ORDERS > options('DELIMITER'='|','FILEHEADER'='O_ORDERKEY,O_CUSTKEY,O_ORDERSTATUS,O_TOTALPRICE,O_ORDERDATE,O_ORDERPRIORITY,O_CLERK,O_SHIPPRIORITY,O_COMMENT','batch_sort_size_inmb'='32'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No 
rows selected (23.228 seconds) > 0: jdbc:hive2://10.18.98.34:23040> update test_Db1.ORDERS set (o_comment) = > ('yyy'); > Issue : Update fails incorrectly with error for table created in external db > location. > 0: jdbc:hive2://10.18.98.34:23040> update test_Db1.ORDERS set (o_comment) = > ('yyy'); > *Error: java.lang.RuntimeException: Update operation failed. Multiple input > rows matched for same row. (state=,code=0)* > Expected : The update should be success for table created in external db > location. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1758) Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CARBONDATA-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-1758:
Summary: Carbon1.3.0- No Inverted Index : Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
(was: (Carbon1.3.0- No Inverted Index) - Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException)

> Carbon1.3.0- No Inverted Index : Select column with is null for
> no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException
>
> Key: CARBONDATA-1758
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1758
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.3.0
> Environment: 3 node cluster
> Reporter: Chetan Bhat
> Labels: Functional
>
> Steps:
> In Beeline the user executes the queries in sequence.
> CREATE TABLE uniqdata_DI_int (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='cust_id','NO_INVERTED_INDEX'='cust_id');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/3000_UniqData.csv' into table uniqdata_DI_int OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> Select count(CUST_ID) from uniqdata_DI_int;
> Select count(CUST_ID)*10 as multiple from uniqdata_DI_int;
> Select avg(CUST_ID) as average from uniqdata_DI_int;
> Select floor(CUST_ID) as average from uniqdata_DI_int;
> Select ceil(CUST_ID) as average from uniqdata_DI_int;
> Select ceiling(CUST_ID) as average from uniqdata_DI_int;
> Select CUST_ID*integer_column1 as multiple from uniqdata_DI_int;
> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
>
> *Issue: Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException*
> 0: jdbc:hive2://10.18.98.34:23040> Select CUST_ID from uniqdata_DI_int where CUST_ID is null;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 79.0 failed 4 times, most recent failure: Lost task 0.3 in stage 79.0 (TID 123, BLR114278, executor 18): org.apache.spark.util.TaskCompletionListenerException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 0
>         at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:105)
>         at org.apache.spark.scheduler.Task.run(Task.scala:112)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)
>
> Expected: Select column with is null for no_inverted_index column should be successful, displaying the correct result set.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
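A hedged note on the failure class reported above (an assumption about the bug, not an analysis of the CarbonData source): when a column is stored with NO_INVERTED_INDEX, its per-page inverted-index array can be empty, and code that unconditionally indexes into it throws exactly an `ArrayIndexOutOfBoundsException: 0`. A minimal self-contained Java sketch of the defensive lookup; all names are illustrative:

```java
// Illustrative sketch only -- not CarbonData source code. Shows how a row
// lookup can guard against an empty inverted-index array, one way an
// ArrayIndexOutOfBoundsException: 0 of the reported kind can arise.
public class InvertedIndexGuard {

  // Map a logical row to its physical position. With an inverted index the
  // mapping goes through the array; without one (NO_INVERTED_INDEX), data is
  // stored in load order, so the logical row is already the physical row.
  static int resolvePhysicalRow(int[] invertedIndex, int logicalRow) {
    if (invertedIndex.length == 0) {
      return logicalRow; // no reordering was applied at load time
    }
    return invertedIndex[logicalRow];
  }

  public static void main(String[] args) {
    // Unguarded code would evaluate invertedIndex[5] here and throw.
    System.out.println(resolvePhysicalRow(new int[0], 5));
    System.out.println(resolvePhysicalRow(new int[] {2, 0, 1}, 1));
  }
}
```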
[jira] [Updated] (CARBONDATA-1749) Carbon1.3.0- DB creation external path : mdt file is not created in directory as per configuration in carbon.properties
[ https://issues.apache.org/jira/browse/CARBONDATA-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-1749:
Summary: Carbon1.3.0- DB creation external path : mdt file is not created in directory as per configuration in carbon.properties
(was: (Carbon1.3.0- DB creation external path) - mdt file is not created in directory as per configuration in carbon.properties)

> Carbon1.3.0- DB creation external path : mdt file is not created in directory
> as per configuration in carbon.properties
>
> Key: CARBONDATA-1749
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1749
> Project: CarbonData
> Issue Type: Bug
> Components: other
> Affects Versions: 1.3.0
> Environment: 3 node cluster
> Reporter: Chetan Bhat
> Labels: Functional
>
> Steps:
> In carbon.properties the mdt file directory path is configured as carbon.update.sync.folder=hdfs://hacluster/user/test1 or /tmp/test1/.
> In Beeline the user creates a database by specifying the carbon store path and creates a carbon table in the database:
> drop database if exists test_db1 cascade;
> create database test_db1 location 'hdfs://hacluster/user/test1';
> use test_db1;
> create table if not exists ORDERS(O_ORDERDATE string,O_ORDERPRIORITY string,O_ORDERSTATUS string,O_ORDERKEY string,O_CUSTKEY string,O_TOTALPRICE double,O_CLERK string,O_SHIPPRIORITY int,O_COMMENT string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ('table_blocksize'='128');
> The user checks in the HDFS UI whether the mdt file is created in the directory specified (hdfs://hacluster/user/test1) as per the configuration in carbon.properties.
>
> Issue: the mdt file is not created in the directory specified (hdfs://hacluster/user/test1) as per the configuration in carbon.properties. The folder is also not created if the user configures the folder path as carbon.update.sync.folder=/tmp/test1/.
>
> Expected: the mdt file should be created in the directory specified (hdfs://hacluster/user/test1 or /tmp/test1/) as per the configuration in carbon.properties.
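The behaviour the report expects can be sketched in a few lines of plain Java. This is an illustration of the expected contract only, not CarbonData source code; the file name and method names are assumptions:

```java
import java.io.File;
import java.io.IOException;

// Illustrative sketch only -- not CarbonData source. The report expects:
// the directory named by carbon.update.sync.folder is created if absent,
// then the .mdt status file is written inside it. "modifiedTime.mdt" is a
// placeholder name for that status file.
public class SyncFolderSketch {

  static File ensureModifiedTimeFile(String configuredFolder) throws IOException {
    File folder = new File(configuredFolder);
    // The reported bug is that this creation step appears to be skipped
    // for a configured external path.
    if (!folder.exists() && !folder.mkdirs()) {
      throw new IOException("Could not create sync folder: " + configuredFolder);
    }
    File mdtFile = new File(folder, "modifiedTime.mdt");
    mdtFile.createNewFile(); // no-op if the file already exists
    return mdtFile;
  }

  public static void main(String[] args) throws IOException {
    String tmp = System.getProperty("java.io.tmpdir") + File.separator + "test1";
    System.out.println(ensureModifiedTimeFile(tmp).exists());
  }
}
```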
[GitHub] carbondata pull request #1520: [CARBONDATA-1734] Ignore empty line while rea...
Github user kumarvishal09 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1520#discussion_r151698402

--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java ---
@@ -45,6 +45,9 @@
       "carbon.options.is.empty.data.bad.record";
   public static final String CARBON_OPTIONS_IS_EMPTY_DATA_BAD_RECORD_DEFAULT = "false";
+  @CarbonProperty public static final String CARBON_OPTIONS_SKIP_EMPTY_LINE =
--- End diff --

Please add documentation.

---
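The constant being added in this diff toggles whether blank CSV lines are dropped during load. A hedged sketch of the semantics the option implies; this is not the actual CarbonData CSV reader, and treating whitespace-only lines as empty is an assumption:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only -- not the CarbonData CSV reader. Shows the
// behaviour the new CARBON_OPTIONS_SKIP_EMPTY_LINE load option implies:
// when enabled, lines containing nothing (or only whitespace -- an
// assumption) are dropped instead of producing null/bad records.
public class SkipEmptyLineSketch {

  static List<String> readRows(List<String> rawLines, boolean skipEmptyLine) {
    List<String> rows = new ArrayList<>();
    for (String line : rawLines) {
      if (skipEmptyLine && line.trim().isEmpty()) {
        continue; // ignore the empty line entirely
      }
      rows.add(line);
    }
    return rows;
  }

  public static void main(String[] args) {
    List<String> input = List.of("1,aaa", "", "   ", "2,bbb");
    System.out.println(readRows(input, true));  // empty lines dropped
    System.out.println(readRows(input, false)); // all four lines kept
  }
}
```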
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/1520 LGTM except one small comment ---
[GitHub] carbondata pull request #1525: [CARBONDATA-1751] Make the type of exception ...
GitHub user xubo245 opened a pull request:
https://github.com/apache/carbondata/pull/1525

[CARBONDATA-1751] Make the type of exception and message correctly

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed? No
- [ ] Any backward compatibility impacted? No
- [ ] Document update required? No
- [ ] Testing done: changed old test cases to adapt to the changed message
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. MR55

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata msgYaDong

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1525.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1525

commit 80ea7c95b00680c43fd8d16f5aa51a30311a8930
Author: xubo245 <601450...@qq.com>
Date: 2017-11-17T14:36:46Z

    [CARBONDATA-1751] Make the type of exception and message correctly

---
[GitHub] carbondata issue #1508: [CARBONDATA-1738] Block direct insert/load on pre-ag...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/1508 retest this please ---
[jira] [Created] (CARBONDATA-1764) Fix issue of when create table with short data type
xubo245 created CARBONDATA-1764:
---

Summary: Fix issue of when create table with short data type
Key: CARBONDATA-1764
URL: https://issues.apache.org/jira/browse/CARBONDATA-1764
Project: CarbonData
Issue Type: Bug
Components: spark-integration
Affects Versions: 1.2.0
Reporter: xubo245
Assignee: xubo245
Fix For: 1.3.0

Fix issue of when create table with short data type
[GitHub] carbondata pull request #1526: [CARBONDATA-1764] Fix issue of when create ta...
GitHub user xubo245 opened a pull request:
https://github.com/apache/carbondata/pull/1526

[CARBONDATA-1764] Fix issue of when create table with short data type

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily:
- [ ] Any interfaces changed? No
- [ ] Any backward compatibility impacted? No
- [ ] Document update required? No
- [ ] Testing done: added DataTypeConverterUtilSuite.scala test case for DataTypeConverterUtil
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. MR37

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata shortTable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1526.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1526

commit de8bf19a36a11e3ca32bb98bf4ff1453ae21c630
Author: xubo245 <601450...@qq.com>
Date: 2017-11-17T15:23:31Z

    [CARBONDATA-1764] Fix issue of when create table with short data type

---
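The PR above fixes SHORT handling in the data-type name conversion for CREATE TABLE. A hedged, self-contained sketch of the kind of mapping a DataTypeConverterUtil performs; the switch cases here are assumptions for illustration, not the actual CarbonData table:

```java
import java.util.Locale;

// Illustrative sketch only -- not CarbonData's DataTypeConverterUtil.
// Maps a SQL type name from CREATE TABLE to an internal type tag; the bug
// class the PR fixes is a missing case (e.g. "short") in such a mapping.
public class DataTypeNameSketch {

  static String convertToCarbonType(String dataType) {
    switch (dataType.toLowerCase(Locale.ROOT)) {
      case "short":
      case "smallint":
        return "SHORT"; // the case that was presumably missing before the fix
      case "int":
      case "integer":
        return "INT";
      case "long":
      case "bigint":
        return "LONG";
      case "string":
        return "STRING";
      default:
        throw new IllegalArgumentException("Unsupported data type: " + dataType);
    }
  }

  public static void main(String[] args) {
    System.out.println(convertToCarbonType("SHORT"));
    System.out.println(convertToCarbonType("bigint"));
  }
}
```

The mapping is case-insensitive on purpose: DDL type names arrive in whatever case the user typed.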
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/1520 rest this please ---
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/1520 retest this please ---
[GitHub] carbondata issue #1435: [CARBONDATA-1626]add data size and index size in tab...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1435 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1232/ ---
[GitHub] carbondata issue #1520: [CARBONDATA-1734] Ignore empty line while reading CS...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/1520 retest this please ---
[GitHub] carbondata issue #1524: [CARBONDATA-1762] Remove existing column level datef...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1524 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1233/ ---