[GitHub] carbondata issue #2919: [CARBONDATA-3097] Optimize getVersionDetails
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2919 @KanakaKumar Please review it again. ---
[GitHub] carbondata issue #3037: [CARBONDATA-3190] Open example module code style che...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/3037 @KanakaKumar @ravipesala @jackylk @QiangCai @kunal642 Please review it. ---
[GitHub] carbondata issue #3010: [CARBONDATA-3189] Fix PreAggregate Datamap Issue
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/3010 LGTM ---
[GitHub] carbondata pull request #3010: [CARBONDATA-3189] Fix PreAggregate Datamap Is...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3010 ---
[GitHub] carbondata issue #3046: [CARBONDATA-3231] Fix OOM exception when dictionary ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3046 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2188/ ---
[GitHub] carbondata pull request #3045: [CARBONDATA-3222]Fix dataload failure after c...
Github user shardul-cr7 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3045#discussion_r245585973 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/preaaggregate/PreAggregateTableHelper.scala --- @@ -110,22 +109,42 @@ case class PreAggregateTableHelper( // Datamap table name and columns are automatically added prefix with parent table name // in carbon. For convenient, users can type column names same as the ones in select statement // when config dmproperties, and here we update column names with prefix. -val longStringColumn = tableProperties.get(CarbonCommonConstants.LONG_STRING_COLUMNS) +// If longStringColumn is not present in dmproperties then we take long_string_columns from +// the parent table. +var longStringColumn = tableProperties.get(CarbonCommonConstants.LONG_STRING_COLUMNS) +val longStringColumnInParents = parentTable.getTableInfo.getFactTable.getTableProperties.asScala + .getOrElse(CarbonCommonConstants.LONG_STRING_COLUMNS, "").split(",").map(_.trim) +var varcharDatamapFields = "" +fieldRelationMap foreach (fields => { + val aggFunc = fields._2.aggregateFunction + if (aggFunc == "") { --- End diff -- Done! ---
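The fallback this diff implements — prefer LONG_STRING_COLUMNS from the datamap's own dmproperties, otherwise inherit it from the parent table's properties — can be sketched in plain Java (a minimal sketch with hypothetical names; the actual CarbonData code is Scala and lives in PreAggregateTableHelper):

```java
import java.util.Map;
import java.util.Optional;

public class LongStringFallback {
    static final String LONG_STRING_COLUMNS = "long_string_columns";

    // Prefer the datamap's own dmproperties; otherwise fall back to the
    // parent table's table properties, defaulting to an empty string.
    static String resolveLongStringColumns(Map<String, String> dmProperties,
                                           Map<String, String> parentTableProperties) {
        return Optional.ofNullable(dmProperties.get(LONG_STRING_COLUMNS))
            .orElseGet(() -> parentTableProperties.getOrDefault(LONG_STRING_COLUMNS, ""));
    }
}
```

With this resolution order the child table inherits long_string_columns automatically, which is exactly the dataload failure the PR fixes.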
[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2189/ ---
[GitHub] carbondata issue #2996: [WIP] Fix Rename-Fail & Datamap-creation-Fail
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2996 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2190/ ---
[GitHub] carbondata issue #3046: [CARBONDATA-3231] Fix OOM exception when dictionary ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3046 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10444/ ---
[GitHub] carbondata issue #3046: [CARBONDATA-3231] Fix OOM exception when dictionary ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3046 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2404/ ---
[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10445/ ---
[GitHub] carbondata issue #3029: [CARBONDATA-3200] No-Sort compaction
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3029 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2191/ ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2192/ ---
[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2405/ ---
[GitHub] carbondata issue #2996: [WIP] Fix Rename-Fail & Datamap-creation-Fail
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2996 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2406/ ---
[GitHub] carbondata pull request #3053: [WIP]JVM crash issue in snappy compressor
GitHub user akashrn5 opened a pull request: https://github.com/apache/carbondata/pull/3053 [WIP]JVM crash issue in snappy compressor Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/akashrn5/incubator-carbondata jvmcrash Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/3053.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3053 commit b50d1c8b9c69565231cabf1d5dd507a006312a19 Author: akashrn5 Date: 2019-01-07T11:04:48Z JVM crash issue in snappy compressor ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2408/ ---
[GitHub] carbondata issue #3029: [CARBONDATA-3200] No-Sort compaction
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3029 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2407/ ---
[GitHub] carbondata issue #2996: [WIP] Fix Rename-Fail & Datamap-creation-Fail
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2996 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10446/ ---
[GitHub] carbondata issue #3029: [CARBONDATA-3200] No-Sort compaction
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3029 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10447/ ---
[GitHub] carbondata issue #3053: [WIP]JVM crash issue in snappy compressor
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3053 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2193/ ---
[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2194/ ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10448/ ---
[GitHub] carbondata issue #2971: [CARBONDATA-3219] Support range partition the input ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2971 @QiangCai we should restrict changing that property from table properties. I was just explaining how we can do compaction on a range column; since there are similarities with partitioning, I mentioned it here. I feel the range boundaries can be recalculated during compaction using the min/max of the range column, and then we can go for the merge sort. ---
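One way to recalculate range boundaries from the range column's min/max, as suggested above, is an equal-width split. This is only a sketch under that assumption (hypothetical names, not CarbonData's actual compaction code):

```java
public class RangeBoundaries {
    // Compute the interior split points that divide [min, max] into
    // numRanges equal-width ranges; segments falling in the same range
    // can then be merge-sorted together during compaction.
    static long[] computeBoundaries(long min, long max, int numRanges) {
        long[] bounds = new long[numRanges - 1];
        double width = (double) (max - min) / numRanges;
        for (int i = 1; i < numRanges; i++) {
            bounds[i - 1] = min + (long) Math.floor(i * width);
        }
        return bounds;
    }
}
```

An equal-width split is the simplest choice; boundaries could equally be picked from per-segment min/max statistics to balance data volume per range.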
[GitHub] carbondata pull request #3046: [CARBONDATA-3231] Fix OOM exception when dict...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3046#discussion_r245630539 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java --- @@ -1491,6 +1491,27 @@ private void validateSortMemorySpillPercentage() { } } + public int getMaxDictionaryThreshold() { +int localDictionaryMaxThreshold = Integer.parseInt(carbonProperties + .getProperty(CarbonCommonConstants.CARBON_LOCAL_DICTIONARY_MAX_SIZE_THRESHOLD, + CarbonCommonConstants.CARBON_LOCAL_DICTIONARY_MAX_SIZE_THRESHOLD_DEFAULT)); +if (localDictionaryMaxThreshold --- End diff -- add min check also ---
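The "min check" the reviewer asks for could look like the following sketch: fall back to the default whenever the configured value is outside an allowed range. The constants and names here are hypothetical, not CarbonData's actual ones:

```java
public class DictionaryThreshold {
    // Hypothetical bounds for illustration only.
    static final int DEFAULT_THRESHOLD = 10000;
    static final int MIN_THRESHOLD = 1000;
    static final int MAX_THRESHOLD = 100000;

    // Parse the configured local-dictionary threshold; reject values that
    // are non-numeric, below the minimum, or above the maximum.
    static int getMaxDictionaryThreshold(String configured) {
        int value;
        try {
            value = Integer.parseInt(configured);
        } catch (NumberFormatException e) {
            return DEFAULT_THRESHOLD;
        }
        if (value < MIN_THRESHOLD || value > MAX_THRESHOLD) {
            return DEFAULT_THRESHOLD;
        }
        return value;
    }
}
```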
[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2963#discussion_r245633754 --- Diff: pom.xml --- @@ -527,6 +526,7 @@ examples/spark2 datamap/lucene datamap/bloom +datamap/example --- End diff -- Excluding this will leave the datamap example module outdated, with potentially unfixed bugs later, which was the previous status of this module. Maybe we can exclude it from the assembly jar instead ---
[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2410/ ---
[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/3045 LGTM ---
[GitHub] carbondata pull request #3045: [CARBONDATA-3222]Fix dataload failure after c...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3045 ---
[jira] [Resolved] (CARBONDATA-3222) Fix dataload failure after creation of preaggregate datamap on main table with long_string_columns
[ https://issues.apache.org/jira/browse/CARBONDATA-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Kapoor resolved CARBONDATA-3222.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.2

> Fix dataload failure after creation of preaggregate datamap on main table
> with long_string_columns
> --------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3222
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3222
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Shardul Singh
>            Priority: Minor
>             Fix For: 1.5.2
>
>          Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Fix dataload failure after creation of preaggregate datamap on main table
> with long_string_columns.
> Dataload fails because the child table properties are not getting
> modified according to the parent table for long_string_columns.
> This occurs only when long_string_columns is not specified in
> dmproperties for the preaggregate datamap.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #3054: [CARBONDATA-3232] Optimize carbonData using a...
GitHub user xubo245 opened a pull request: https://github.com/apache/carbondata/pull/3054 [CARBONDATA-3232] Optimize carbonData using alluxio Optimize carbonData using alluxio: 1.Add doc 2.optimize the example Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? Yes - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. optimize the example - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. No You can merge this pull request into a Git repository by running: $ git pull https://github.com/xubo245/carbondata CARBONDATA-3232_OptimizeSupportAlluxio Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/3054.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3054 commit 4ccc9fefaf590f092fa64978ebc3ce0b8533d437 Author: xubo245 Date: 2019-01-07T12:27:37Z [CARBONDATA-3232] Optimize carbonData using alluxio ---
[GitHub] carbondata issue #3053: [WIP]JVM crash issue in snappy compressor
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3053 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10449/ ---
[jira] [Created] (CARBONDATA-3232) Optimize carbonData using alluxio
xubo245 created CARBONDATA-3232:
-----------------------------------
             Summary: Optimize carbonData using alluxio
                 Key: CARBONDATA-3232
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3232
             Project: CarbonData
          Issue Type: Improvement
    Affects Versions: 1.5.1
            Reporter: xubo245
            Assignee: xubo245

Optimize carbonData using alluxio:
1. Add doc
2. Optimize the example

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #3054: [CARBONDATA-3232] Optimize carbonData using a...
Github user kevinjmh commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3054#discussion_r245639533 --- Diff: docs/Integration/alluxio-guide.md --- @@ -0,0 +1,44 @@ + + + +# Presto guide +This tutorial provides a quick introduction to using Alluxio. + +## How to use Alluxio for CarbonData? +### Install and start Alluxio +Please refer to [https://www.alluxio.org/docs/1.8/en/Getting-Started.html#starting-alluxio](https://www.alluxio.org/docs/1.8/en/Getting-Started.html#starting-alluxio) +Access the Alluxio web: [http://localhost:1/home](http://localhost:1/home) +By command, for example: +```$xslt +./bin/alluxio fs ls / +``` +Result: +``` +drwxr-xr-x xubo staff1 NOT_PERSISTED 01-07-2019 15:39:24:960 DIR /carbondata +-rw-r--r-- xubo staff50686 NOT_PERSISTED 01-07-2019 11:37:48:924 100% /data.csv +``` +### Upload Alluxio jar to CarbonData +Upload the jar "/alluxio_path/client/alluxio-YOUR-VERSION-client.jar" to CarbonData --- End diff -- "Upload to CarbonData" is confusing. What we need to do is to add the alluxio client jar to classpath, right? ---
[GitHub] carbondata pull request #3054: [CARBONDATA-3232] Optimize carbonData using a...
Github user kevinjmh commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3054#discussion_r245639998 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/AlluxioExample.scala --- @@ -26,48 +30,88 @@ import org.apache.carbondata.examples.util.ExampleUtils /** - * configure alluxio: - * 1.start alluxio - * 2.upload the jar :"/alluxio_path/core/client/target/ - * alluxio-core-client-YOUR-VERSION-jar-with-dependencies.jar" - * 3.Get more detail at:http://www.alluxio.org/docs/master/en/Running-Spark-on-Alluxio.html - */ + * configure alluxio: + * 1.start alluxio + * 2.upload the jar: "/alluxio_path/core/client/target/ + * alluxio-core-client-YOUR-VERSION-jar-with-dependencies.jar" + * 3.Get more detail at:http://www.alluxio.org/docs/master/en/Running-Spark-on-Alluxio.html + */ object AlluxioExample { - def main(args: Array[String]) { -val spark = ExampleUtils.createCarbonSession("AlluxioExample") -exampleBody(spark) -spark.close() + def main (args: Array[String]) { +val carbon = ExampleUtils.createCarbonSession("AlluxioExample", + storePath = "alluxio://localhost:19998/carbondata") +exampleBody(carbon) +carbon.close() } - def exampleBody(spark : SparkSession): Unit = { + def exampleBody (spark: SparkSession): Unit = { +val rootPath = new File(this.getClass.getResource("/").getPath + + "../../../..").getCanonicalPath spark.sparkContext.hadoopConfiguration.set("fs.alluxio.impl", "alluxio.hadoop.FileSystem") FileFactory.getConfiguration.set("fs.alluxio.impl", "alluxio.hadoop.FileSystem") // Specify date format based on raw data CarbonProperties.getInstance() .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") -spark.sql("DROP TABLE IF EXISTS alluxio_table") +val mFsShell = new FileSystemShell() +val localFile = rootPath + "/hadoop/src/test/resources/data.csv" +val remotePath = "/carbon_alluxio.csv" +val remoteFile = "alluxio://localhost:19998/carbon_alluxio.csv" +mFsShell.run("rm", remotePath) --- End diff -- As an example, I think we should not do this operation ---
[GitHub] carbondata issue #3053: [WIP]JVM crash issue in snappy compressor
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3053 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2409/ ---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Optimize carbonData using alluxio
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2195/ ---
[GitHub] carbondata pull request #3054: [CARBONDATA-3232] Optimize carbonData using a...
Github user kevinjmh commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3054#discussion_r245641130 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/util/ExampleUtils.scala --- @@ -30,13 +30,17 @@ object ExampleUtils { .getCanonicalPath val storeLocation: String = currentPath + "/target/store" - def createCarbonSession(appName: String, workThreadNum: Int = 1): SparkSession = { + def createCarbonSession (appName: String, workThreadNum: Int = 1, + storePath: String = null): SparkSession = { val rootPath = new File(this.getClass.getResource("/").getPath -+ "../../../..").getCanonicalPath -val storeLocation = s"$rootPath/examples/spark2/target/store" + + "../../../..").getCanonicalPath +var storeLocation = s"$rootPath/examples/spark2/target/store" val warehouse = s"$rootPath/examples/spark2/target/warehouse" val metaStoreDB = s"$rootPath/examples/spark2/target" +if (storePath != null) { + storeLocation = storePath; +} --- End diff -- ```suggestion val storeLocation = if (null != storePath) { storePath } else { s"$rootPath/examples/spark2/target/store" } ``` ---
[GitHub] carbondata issue #3048: [CARBONDATA-3224] Support SDK/CSDK validate the impr...
Github user KanakaKumar commented on the issue: https://github.com/apache/carbondata/pull/3048 LGTM ---
[GitHub] carbondata issue #3045: [CARBONDATA-3222]Fix dataload failure after creation...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3045 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10450/ ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2196/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2963 @jackylk Actually, after applying the above commit, the size of the shaded jar decreased from 40652 bytes to 40620 bytes ---
[GitHub] carbondata issue #3048: [CARBONDATA-3224] Support SDK/CSDK validate the impr...
Github user ajantha-bhat commented on the issue: https://github.com/apache/carbondata/pull/3048 LGTM ---
[GitHub] carbondata pull request #3046: [CARBONDATA-3231] Fix OOM exception when dict...
Github user kunal642 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3046#discussion_r245653091 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/DecoderBasedFallbackEncoder.java --- @@ -57,10 +57,7 @@ public DecoderBasedFallbackEncoder(EncodedColumnPage encodedColumnPage, int page int pageSize = encodedColumnPage.getActualPage().getPageSize(); int offset = 0; -int[] reverseInvertedIndex = new int[pageSize]; --- End diff -- In case of No_Sort where inverted index is not there, this variable was being created unnecessarily. ---
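The optimization described — skipping the reverse-inverted-index allocation when a No_Sort page has no inverted index — can be illustrated with a simplified sketch (hypothetical types and method; the real code operates on EncodedColumnPage):

```java
public class FallbackEncoderSketch {
    // Build the reverse mapping only when an inverted index actually
    // exists; for No_Sort pages (null index) skip the allocation entirely.
    static int[] buildReverseInvertedIndex(int[] invertedIndex) {
        if (invertedIndex == null) {
            return null;
        }
        int[] reverse = new int[invertedIndex.length];
        for (int i = 0; i < invertedIndex.length; i++) {
            // invertedIndex maps sorted position -> original row id,
            // so reverse maps original row id -> sorted position.
            reverse[invertedIndex[i]] = i;
        }
        return reverse;
    }
}
```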
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Optimize carbonData using alluxio
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10451/ ---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Optimize carbonData using alluxio
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3054 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2411/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2197/ ---
[jira] [Created] (CARBONDATA-3233) JVM is getting crashed during dataload while compressing in snappy
Akash R Nilugal created CARBONDATA-3233:
-------------------------------------------
             Summary: JVM is getting crashed during dataload while compressing in snappy
                 Key: CARBONDATA-3233
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3233
             Project: CarbonData
          Issue Type: Bug
            Reporter: Akash R Nilugal
            Assignee: Akash R Nilugal

When a huge dataload is done, sometimes the dataload fails and the JVM crashes during snappy compression. Below are the logs:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.xerial.snappy.SnappyNative.rawCompress(JJJ)J+0
j  org.apache.carbondata.core.datastore.compression.SnappyCompressor.rawCompress(JIJ)J+9
j  org.apache.carbondata.core.datastore.page.UnsafeFixLengthColumnPage.compress(Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+50
j  org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveCodec.encodeAndCompressPage(Lorg/apache/carbondata/core/datastore/page/ColumnPage;Lorg/apache/carbondata/core/datastore/page/ColumnPageValueConverter;Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+85
j  org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveDeltaIntegralCodec$1.encodeData(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)[B+45
j  org.apache.carbondata.core.datastore.page.encoding.ColumnPageEncoder.encode(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+2
j  org.apache.carbondata.processing.store.TablePage.encodeAndCompressMeasures()[Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+54
j  org.apache.carbondata.processing.store.TablePage.encode()V+6
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.processDataRows(Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+86
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.access$500(Lorg/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar;Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+2
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Void;+8
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Object;+1
j  java.util.concurrent.FutureTask.run()V+42
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
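A common cause of JVM crashes inside a native compressor, as in the rawCompress frame above, is an undersized destination buffer: snappy's documented worst-case output size for n input bytes is 32 + n + n/6. The sketch below is stdlib-only (it is not the snappy-java API and not the actual fix in the PR) and shows the kind of capacity guard that keeps the JNI call from writing past the buffer:

```java
public class SnappyBufferSizing {
    // Snappy's documented worst-case compressed length for n input bytes.
    static int maxCompressedLength(int n) {
        return 32 + n + n / 6;
    }

    // Guard before handing raw addresses to a native compressor: the
    // destination must hold the worst case, or the native call can write
    // past the buffer and crash the JVM instead of throwing.
    static void checkDestCapacity(int inputLen, int destCapacity) {
        int required = maxCompressedLength(inputLen);
        if (destCapacity < required) {
            throw new IllegalArgumentException(
                "destination buffer too small: need " + required
                + " bytes, have " + destCapacity);
        }
    }
}
```

Failing fast in Java with an exception is recoverable; an out-of-bounds write in native code is not.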
[jira] [Updated] (CARBONDATA-3233) JVM is getting crashed during dataload while compressing in snappy
[ https://issues.apache.org/jira/browse/CARBONDATA-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal updated CARBONDATA-3233:
----------------------------------------
    Description:
When a huge dataload is done, sometimes the dataload fails and the JVM crashes during snappy compression. Below are the logs:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.xerial.snappy.SnappyNative.rawCompress(JJJ)J+0
j  org.apache.carbondata.core.datastore.compression.SnappyCompressor.rawCompress(JIJ)J+9
j  org.apache.carbondata.core.datastore.page.UnsafeFixLengthColumnPage.compress(Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+50
j  org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveCodec.encodeAndCompressPage(Lorg/apache/carbondata/core/datastore/page/ColumnPage;Lorg/apache/carbondata/core/datastore/page/ColumnPageValueConverter;Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+85
j  org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveDeltaIntegralCodec$1.encodeData(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)[B+45
j  org.apache.carbondata.core.datastore.page.encoding.ColumnPageEncoder.encode(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+2
j  org.apache.carbondata.processing.store.TablePage.encodeAndCompressMeasures()[Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+54
j  org.apache.carbondata.processing.store.TablePage.encode()V+6
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.processDataRows(Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+86
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.access$500(Lorg/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar;Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+2
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Void;+8
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Object;+1
j  java.util.concurrent.FutureTask.run()V+42
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub

  was:
When a huge dataload is done, sometimes the dataload fails and the JVM crashes during snappy compression. Below are the logs:

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.xerial.snappy.SnappyNative.rawCompress(JJJ)J+0
j  org.apache.carbondata.core.datastore.compression.SnappyCompressor.rawCompress(JIJ)J+9
j  org.apache.carbondata.core.datastore.page.UnsafeFixLengthColumnPage.compress(Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+50
j  org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveCodec.encodeAndCompressPage(Lorg/apache/carbondata/core/datastore/page/ColumnPage;Lorg/apache/carbondata/core/datastore/page/ColumnPageValueConverter;Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+85
j  org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveDeltaIntegralCodec$1.encodeData(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)[B+45
j  org.apache.carbondata.core.datastore.page.encoding.ColumnPageEncoder.encode(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+2
j  org.apache.carbondata.processing.store.TablePage.encodeAndCompressMeasures()[Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+54
j  org.apache.carbondata.processing.store.TablePage.encode()V+6
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.processDataRows(Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+86
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.access$500(Lorg/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar;Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+2
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Void;+8
j  org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Object;+1
j  java.util.concurrent.FutureTask.run()V+42
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub

> JVM is getting crashed during dataload while compressing in snappy
> ------------------------------------------------------------------
>
>                 Key: CARBONDATA-3233
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3233
>             Project: CarbonData
>             Issu
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/3014 LGTM ---
[GitHub] carbondata issue #3053: [CARBONDATA-3233]Fix JVM crash issue in snappy compr...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3053 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2198/ ---
[GitHub] carbondata pull request #3054: [CARBONDATA-3232] Optimize carbonData using a...
Github user qiuchenjian commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3054#discussion_r245661158 --- Diff: docs/Integration/alluxio-guide.md --- @@ -0,0 +1,44 @@ + + + +# Presto guide +This tutorial provides a quick introduction to using Alluxio. --- End diff -- ```suggestion This tutorial provides a brief introduction to using Alluxio. ``` ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10453/ ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2413/ ---
[GitHub] carbondata issue #3051: [CARBONDATA-3221] Fix the error of SDK don't support...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/3051 LGTM ---
[GitHub] carbondata pull request #2996: [WIP] Fix Rename-Fail & Datamap-creation-Fail
Github user NamanRastogi commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2996#discussion_r245669672 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala --- @@ -165,15 +167,22 @@ private[sql] case class CarbonAlterTableRenameCommand( case e: ConcurrentOperationException => throw e case e: Exception => +if (hiveRenameSuccess) { + sparkSession.sessionState.catalog.asInstanceOf[CarbonSessionCatalog].alterTableRename( +TableIdentifier(newTableName, Some(oldDatabaseName)), +TableIdentifier(oldTableName, Some(oldDatabaseName)), --- End diff -- Done ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user NamanRastogi commented on the issue: https://github.com/apache/carbondata/pull/3014 retest this please ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10454/ ---
[GitHub] carbondata pull request #3054: [CARBONDATA-3232] Optimize carbonData using a...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3054#discussion_r245676210 --- Diff: README.md --- @@ -68,8 +68,8 @@ CarbonData is built using Apache Maven, to [build CarbonData](https://github.com * [FAQs](https://github.com/apache/carbondata/blob/master/docs/faq.md) ## Integration -* [Hive](https://github.com/apache/carbondata/blob/master/docs/hive-guide.md) -* [Presto](https://github.com/apache/carbondata/blob/master/docs/presto-guide.md) +* [Hive](https://github.com/apache/carbondata/blob/master/docs/Integration/hive-guide.md) --- End diff -- Don't suggest creating many folders under docs. ---
[GitHub] carbondata issue #3054: [CARBONDATA-3232] Optimize carbonData using alluxio
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/3054 The PR title is not consistent with the PR content. How about: "Add example for alluxio integration"? ---

[GitHub] carbondata issue #3024: [CARBONDATA-3230] Add alter test case for datasource
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/3024 LGTM ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2199/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2414/ ---
[GitHub] carbondata issue #3053: [CARBONDATA-3233]Fix JVM crash issue in snappy compr...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3053 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2415/ ---
[GitHub] carbondata issue #2996: [WIP] Fix Rename-Fail & Datamap-creation-Fail
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2996 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2200/ ---
[GitHub] carbondata issue #3053: [CARBONDATA-3233]Fix JVM crash issue in snappy compr...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3053 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10455/ ---
[jira] [Created] (CARBONDATA-3234) Unable to read data from carbondata table stored in S3 from Presto running on EMR
charles horrell created CARBONDATA-3234: --- Summary: Unable to read data from carbondata table stored in S3 from Presto running on EMR Key: CARBONDATA-3234 URL: https://issues.apache.org/jira/browse/CARBONDATA-3234 Project: CarbonData Issue Type: Bug Reporter: charles horrell -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #3055: [HOTFIX] presto can't read dictionary include...
GitHub user ajantha-bhat opened a pull request: https://github.com/apache/carbondata/pull/3055 [HOTFIX] presto can't read dictionary include decimal column

Problem: a decimal column with dictionary include cannot be read in Presto. Cause: int is typecast to decimal for dictionary columns in the decimal stream reader. Solution: keep the original data type as well as the new data type for the decimal stream reader.

Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? NA - [ ] Any backward compatibility impacted? NA - [ ] Document update required? NA - [ ] Testing done. done - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA

You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajantha-bhat/carbondata issue_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/3055.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3055

commit 205189d121f80aa87598a6c5f5e34562036c03c5 Author: ajantha-bhat Date: 2019-01-07T09:20:11Z dictionary include decimal column type cast issue problem: decimal column with dictionary include cannot be read in presto cause: int is typecasted to decimal for dictionary columns. solution: keep original data type as well as new data type for decimal slice reader ---
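The fix described in the PR body — keeping both the stored (dictionary surrogate) type and the target decimal type instead of blindly casting — can be sketched roughly as below. The class and method names here are hypothetical illustrations, not CarbonData's actual stream-reader API:

```java
import java.math.BigDecimal;

// Hypothetical sketch: a reader that tracks whether the column is
// dictionary-encoded, so the stored value is decoded by the right path
// instead of typecasting an int surrogate straight to decimal.
class DecimalSliceReader {
    private final int scale;                 // target decimal scale (assumed)
    private final boolean dictionaryEncoded; // "original" vs "new" type switch

    DecimalSliceReader(int scale, boolean dictionaryEncoded) {
        this.scale = scale;
        this.dictionaryEncoded = dictionaryEncoded;
    }

    // Dictionary columns resolve a surrogate to the decimal's string form;
    // direct columns store an unscaled long that is rescaled on read.
    BigDecimal read(Object raw) {
        if (dictionaryEncoded) {
            return new BigDecimal((String) raw);
        }
        return BigDecimal.valueOf((Long) raw, scale);
    }
}
```

The design point is simply that the decoding path must branch on the column's original encoding rather than assume one representation for all decimal columns.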
[jira] [Updated] (CARBONDATA-3234) Unable to read data from carbondata table stored in S3 from Presto running on EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] charles horrell updated CARBONDATA-3234: Environment: Amazon EMR 5.19 Description: Once creating a carbondata table stored in S3 we are unable to query with presto and get the following error: {code:java} presto:default> select count(*) from test_table; Query 20190107_135333_00026_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation presto:default> select * test_table; Query 20190107_135610_00028_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation {code} The catalog appears to have been picked up okay as show tables works as expected as does describing the table it is just when actually trying to access the data that we see the error. We configured presto as per the examples here: [http://carbondata.apache.org/quick-start-guide.html] Querying from Spark works okay however it is vital for our use case that presto also works and with S3. 
Amazon EMR version 5.19 Spark 2.3.2 Hadoop 2.8.5 Presto 0.212 Component/s: presto-integration > Unable to read data from carbondata table stored in S3 from Presto running on > EMR > - > > Key: CARBONDATA-3234 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3234 > Project: CarbonData > Issue Type: Bug > Components: presto-integration > Environment: Amazon EMR 5.19 >Reporter: charles horrell >Priority: Major > > Once creating a carbondata table stored in S3 we are unable to query with > presto and get the following error: > {code:java} > presto:default> select count(*) from test_table; > Query 20190107_135333_00026_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > > presto:default> select * test_table; > Query 20190107_135610_00028_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > {code} > The catalog appears to have been picked up okay as show tables works as > expected as does describing the table it is just when actually trying to > access the data that we see the error. > We configured presto as per the examples here: > [http://carbondata.apache.org/quick-start-guide.html] > Querying from Spark works okay however it is vital for our use case that > presto also works and with S3. > Amazon EMR version 5.19 > Spark 2.3.2 > Hadoop 2.8.5 > Presto 0.212 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-3234) Unable to read data from carbondata table stored in S3 with Presto running on EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] charles horrell updated CARBONDATA-3234: Description: After carbondata table stored in S3 we are unable to query with presto and get the following error: {code:java} presto:default> select count(*) from test_table; Query 20190107_135333_00026_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation presto:default> select * test_table; Query 20190107_135610_00028_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation {code} The catalog appears to have been picked up okay as show tables works as expected as does describing the table it is just when actually trying to access the data that we see the error. We configured presto as per the examples here: [http://carbondata.apache.org/quick-start-guide.html] Querying from Spark works okay however it is vital for our use case that presto also works and with S3. 
Amazon EMR version 5.19 Spark 2.3.2 Hadoop 2.8.5 Presto 0.212 was: Once creating a carbondata table stored in S3 we are unable to query with presto and get the following error: {code:java} presto:default> select count(*) from test_table; Query 20190107_135333_00026_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation presto:default> select * test_table; Query 20190107_135610_00028_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation {code} The catalog appears to have been picked up okay as show tables works as expected as does describing the table it is just when actually trying to access the data that we see the error. We configured presto as per the examples here: [http://carbondata.apache.org/quick-start-guide.html] Querying from Spark works okay however it is vital for our use case that presto also works and with S3. 
Amazon EMR version 5.19 Spark 2.3.2 Hadoop 2.8.5 Presto 0.212 Summary: Unable to read data from carbondata table stored in S3 with Presto running on EMR (was: Unable to read data from carbondata table stored in S3 from Presto running on EMR) > Unable to read data from carbondata table stored in S3 with Presto running on > EMR > - > > Key: CARBONDATA-3234 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3234 > Project: CarbonData > Issue Type: Bug > Components: presto-integration > Environment: Amazon EMR 5.19 >Reporter: charles horrell >Priority: Major > > After carbondata table stored in S3 we are unable to query with presto and > get the following error: > {code:java} > presto:default> select count(*) from test_table; > Query 20190107_135333_00026_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > > presto:default> select * test_table; > Query 20190107_135610_00028_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > {code} > The catalog appears to have been picked up okay as show tables works as > expected as does describing the table it is just when actually trying to > access the data that we see the error. > We configured presto as per the examples here: > [http://carbondata.apache.org/quick-start-guide.html] > Querying from Spark works okay however it is vital for our use case that > presto also works and with S3. > Amazon EMR version 5.19 > Spark 2.3.2 > Hadoop 2.8.5 > Presto 0.212 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-3234) Unable to read data from carbondata table stored in S3 using Presto running on EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] charles horrell updated CARBONDATA-3234: Summary: Unable to read data from carbondata table stored in S3 using Presto running on EMR (was: Unable to read data from carbondata table stored in S3 with Presto running on EMR) > Unable to read data from carbondata table stored in S3 using Presto running > on EMR > -- > > Key: CARBONDATA-3234 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3234 > Project: CarbonData > Issue Type: Bug > Components: presto-integration > Environment: Amazon EMR 5.19 >Reporter: charles horrell >Priority: Major > > We are unable to use presto to query a carbondata table stored in S3. > {code:java} > presto:default> select count(*) from test_table; > Query 20190107_135333_00026_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > > presto:default> select * from test_table; > Query 20190107_135610_00028_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > {code} > The catalog appears to have been picked up okay as show tables works as > expected as does describing the table it is just when actually trying to > access the data that we see the error. > We configured presto as per the examples here: > [http://carbondata.apache.org/quick-start-guide.html] > Querying from Spark works okay however it is vital for our use case that > presto also works and with S3. > Amazon EMR version 5.19 > Spark 2.3.2 > Hadoop 2.8.5 > Presto 0.212 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-3234) Unable to read data from carbondata table stored in S3 with Presto running on EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] charles horrell updated CARBONDATA-3234: Description: We are unable to use presto to query a carbondata table stored in S3. {code:java} presto:default> select count(*) from test_table; Query 20190107_135333_00026_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation presto:default> select * from test_table; Query 20190107_135610_00028_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation {code} The catalog appears to have been picked up okay as show tables works as expected as does describing the table it is just when actually trying to access the data that we see the error. We configured presto as per the examples here: [http://carbondata.apache.org/quick-start-guide.html] Querying from Spark works okay however it is vital for our use case that presto also works and with S3. Amazon EMR version 5.19 Spark 2.3.2 Hadoop 2.8.5 Presto 0.212 was: We are unable to use presto to query a carbondata table stored in S3. 
{code:java} presto:default> select count(*) from test_table; Query 20190107_135333_00026_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation presto:default> select * test_table; Query 20190107_135610_00028_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation {code} The catalog appears to have been picked up okay as show tables works as expected as does describing the table it is just when actually trying to access the data that we see the error. We configured presto as per the examples here: [http://carbondata.apache.org/quick-start-guide.html] Querying from Spark works okay however it is vital for our use case that presto also works and with S3. Amazon EMR version 5.19 Spark 2.3.2 Hadoop 2.8.5 Presto 0.212 > Unable to read data from carbondata table stored in S3 with Presto running on > EMR > - > > Key: CARBONDATA-3234 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3234 > Project: CarbonData > Issue Type: Bug > Components: presto-integration > Environment: Amazon EMR 5.19 >Reporter: charles horrell >Priority: Major > > We are unable to use presto to query a carbondata table stored in S3. 
> {code:java} > presto:default> select count(*) from test_table; > Query 20190107_135333_00026_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > > presto:default> select * from test_table; > Query 20190107_135610_00028_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > {code} > The catalog appears to have been picked up okay as show tables works as > expected as does describing the table it is just when actually trying to > access the data that we see the error. > We configured presto as per the examples here: > [http://carbondata.apache.org/quick-start-guide.html] > Querying from Spark works okay however it is vital for our use case that > presto also works and with S3. > Amazon EMR version 5.19 > Spark 2.3.2 > Hadoop 2.8.5 > Presto 0.212 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-3234) Unable to read data from carbondata table stored in S3 using Presto running on EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] charles horrell updated CARBONDATA-3234: Description: We are unable to use presto to query a carbondata table stored in S3. {code:java} presto:default> select count(*) from test_table; Query 20190107_135333_00026_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation presto:default> select * from test_table; Query 20190107_135610_00028_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation {code} The catalog appears to have been picked up okay as show tables works as expected as does describing the table it is just when actually trying to access the data that we see the error. We configured presto as per the examples here: [http://carbondata.apache.org/quick-start-guide.html] Querying from Spark works okay however it is vital for our use case that presto also works and with S3. 
Amazon EMR version 5.19 Spark 2.3.2 Hadoop 2.8.5 Presto 0.212 Stack trace from presto server log {code:java} 2019-01-07T12:19:57.562Z WARN statement-response-4 com.facebook.presto.server.ThrowableMapper Request failed for /v1/statement/20190107_121957_4_k6t5p/1 java.lang.IllegalAccessError: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation at org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:194) at org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:216) at org.apache.hadoop.fs.s3a.S3AInstrumentation.(S3AInstrumentation.java:139) at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:174) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295) at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.(AbstractDFSCarbonFile.java:74) at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.(AbstractDFSCarbonFile.java:66) at org.apache.carbondata.core.datastore.filesystem.HDFSCarbonFile.(HDFSCarbonFile.java:41) at org.apache.carbondata.core.datastore.filesystem.S3CarbonFile.(S3CarbonFile.java:41) at org.apache.carbondata.core.datastore.impl.DefaultFileTypeProvider.getCarbonFile(DefaultFileTypeProvider.java:53) at org.apache.carbondata.core.datastore.impl.FileFactory.getCarbonFile(FileFactory.java:102) at org.apache.carbondata.presto.impl.CarbonTableReader.updateCarbonFile(CarbonTableReader.java:202) at org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaList(CarbonTableReader.java:216) at 
org.apache.carbondata.presto.impl.CarbonTableReader.getSchemaNames(CarbonTableReader.java:189) at org.apache.carbondata.presto.CarbondataMetadata.listSchemaNamesInternal(CarbondataMetadata.java:86) at org.apache.carbondata.presto.CarbondataMetadata.getTableMetadata(CarbondataMetadata.java:135) at org.apache.carbondata.presto.CarbondataMetadata.getTableMetadataInternal(CarbondataMetadata.java:240) at org.apache.carbondata.presto.CarbondataMetadata.getTableMetadata(CarbondataMetadata.java:232) at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorMetadata.getTableMetadata(ClassLoaderSafeConnectorMetadata.java:145) at com.facebook.presto.metadata.MetadataManager.getTableMetadata(MetadataManager.java:388) at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:850) at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitTable(StatementAnalyzer.java:258) at com.facebook.presto.sql.tree.Table.accept(Table.java:53) at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27) at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:270) at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.analyzeFrom(StatementAnalyzer.java:1772) at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:954) at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visitor.visitQuerySpecification(StatementAnalyzer.java:258) at com.facebook.presto.sql.tree.QuerySpecification.accept(QuerySpecification.java:127) at com.facebook.presto.sql.tree.AstVisitor.process(AstVisitor.java:27) at com.facebook.presto.sql.analyzer.StatementAnalyzer$Visi
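An `IllegalAccessError` on `MutableCounterLong.<init>` like the one in this trace is commonly caused by a classpath mixing a `hadoop-aws` jar from one Hadoop release with `hadoop-common` from another, so the constructor visibility `S3AInstrumentation` was compiled against no longer matches what is loaded at runtime. A generic diagnostic (not a CarbonData tool) is to print which jar each of the two classes actually loads from:

```java
import java.security.CodeSource;

// Generic classpath diagnostic: report where a class is loaded from.
// Run it on the Presto plugin classpath with, e.g.,
// org.apache.hadoop.metrics2.lib.MutableCounterLong and
// org.apache.hadoop.fs.s3a.S3AInstrumentation as arguments; if the two
// jars come from different Hadoop versions, that explains the error.
public class WhichJar {
    static String locate(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "bootstrap" : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the reported jars disagree (for example Hadoop 2.8.x `hadoop-common` next to a `hadoop-aws` built for a different line), aligning them to the same Hadoop version is the usual fix; this is a general S3A symptom, not specific to CarbonData.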
[GitHub] carbondata pull request #3048: [CARBONDATA-3224] Support SDK/CSDK validate t...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3048 ---
[jira] [Updated] (CARBONDATA-3234) Unable to read data from carbondata table stored in S3 with Presto running on EMR
[ https://issues.apache.org/jira/browse/CARBONDATA-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] charles horrell updated CARBONDATA-3234: Description: We are unable to use presto to query a carbondata table stored in S3. {code:java} presto:default> select count(*) from test_table; Query 20190107_135333_00026_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation presto:default> select * test_table; Query 20190107_135610_00028_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation {code} The catalog appears to have been picked up okay as show tables works as expected as does describing the table it is just when actually trying to access the data that we see the error. We configured presto as per the examples here: [http://carbondata.apache.org/quick-start-guide.html] Querying from Spark works okay however it is vital for our use case that presto also works and with S3. 
Amazon EMR version 5.19 Spark 2.3.2 Hadoop 2.8.5 Presto 0.212 was: After carbondata table stored in S3 we are unable to query with presto and get the following error: {code:java} presto:default> select count(*) from test_table; Query 20190107_135333_00026_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation presto:default> select * test_table; Query 20190107_135610_00028_8r2c8 failed: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation {code} The catalog appears to have been picked up okay as show tables works as expected as does describing the table it is just when actually trying to access the data that we see the error. We configured presto as per the examples here: [http://carbondata.apache.org/quick-start-guide.html] Querying from Spark works okay however it is vital for our use case that presto also works and with S3. Amazon EMR version 5.19 Spark 2.3.2 Hadoop 2.8.5 Presto 0.212 > Unable to read data from carbondata table stored in S3 with Presto running on > EMR > - > > Key: CARBONDATA-3234 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3234 > Project: CarbonData > Issue Type: Bug > Components: presto-integration > Environment: Amazon EMR 5.19 >Reporter: charles horrell >Priority: Major > > We are unable to use presto to query a carbondata table stored in S3. 
> {code:java} > presto:default> select count(*) from test_table; > Query 20190107_135333_00026_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > > presto:default> select * test_table; > Query 20190107_135610_00028_8r2c8 failed: tried to access method > org.apache.hadoop.metrics2.lib.MutableCounterLong.(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V > from class org.apache.hadoop.fs.s3a.S3AInstrumentation > {code} > The catalog appears to have been picked up okay as show tables works as > expected as does describing the table it is just when actually trying to > access the data that we see the error. > We configured presto as per the examples here: > [http://carbondata.apache.org/quick-start-guide.html] > Querying from Spark works okay however it is vital for our use case that > presto also works and with S3. > Amazon EMR version 5.19 > Spark 2.3.2 > Hadoop 2.8.5 > Presto 0.212 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2996: [WIP] Fix Rename-Fail & Datamap-creation-Fail
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2996 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2416/ ---
[GitHub] carbondata issue #2996: [WIP] Fix Rename-Fail & Datamap-creation-Fail
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2996 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10456/ ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2417/ ---
[GitHub] carbondata pull request #3050: [CARBONDATA-3211] Optimize the documentation
Github user bbinwang commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3050#discussion_r245696590 --- Diff: datamap/lucene/src/main/java/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapFactory.java --- @@ -57,7 +57,7 @@ public LuceneFineGrainDataMapFactory(CarbonTable carbonTable, DataMapSchema data DataMapWriter.getDefaultDataMapPath(tableIdentifier.getTablePath(), segment.getSegmentNo(), dataMapName), segment.getConfiguration())); } catch (MemoryException e) { - LOGGER.error("failed to get lucene datamap , detail is {}" + e.getMessage()); + LOGGER.error("failed to get lucene datamap, detail is {}" + e.getMessage()); --- End diff -- fixed ---
[GitHub] carbondata issue #3055: [HOTFIX] presto can't read dictionary include decima...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3055 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2201/ ---
[GitHub] carbondata issue #3014: [CARBONDATA-3201] Added load level SORT_SCOPE
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3014 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10457/ ---
[GitHub] carbondata issue #3055: [HOTFIX] presto can't read dictionary include decima...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3055 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10458/ ---
[GitHub] carbondata issue #3055: [HOTFIX] presto can't read dictionary include decima...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3055 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2418/ ---
[GitHub] carbondata issue #3052: [CARBONDATA-3227] Fix some spell errors in the proje...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3052 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2203/ ---
[GitHub] carbondata issue #3050: [CARBONDATA-3211] Optimize the documentation
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3050 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2202/ ---
[GitHub] carbondata issue #3050: [CARBONDATA-3211] Optimize the documentation
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3050 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10459/ ---
[GitHub] carbondata issue #3052: [CARBONDATA-3227] Fix some spell errors in the proje...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3052 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2420/ ---
[GitHub] carbondata issue #3050: [CARBONDATA-3211] Optimize the documentation
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3050 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2419/ ---
[GitHub] carbondata issue #3052: [CARBONDATA-3227] Fix some spell errors in the proje...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/3052 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10460/ ---
[GitHub] carbondata pull request #3053: [CARBONDATA-3233]Fix JVM crash issue in snapp...
Github user qiuchenjian commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3053#discussion_r245848950 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java --- @@ -369,7 +367,7 @@ public BigDecimal getDecimal(int rowId) { @Override public double[] getDoublePage() { -double[] data = new double[getPageSize()]; +double[] data = new double[getEndLoop()]; --- End diff -- The return values of getPageSize() and getEndLoop() seem to be the same; when do they differ? ---
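The distinction the diff above hinges on can be sketched in isolation: a page buffer is typically allocated to a fixed capacity (what a getPageSize()-style accessor returns) but only partially filled (what a getEndLoop()-style counter tracks). This is a minimal standalone sketch with hypothetical names; it is not the actual CarbonData implementation, only an illustration of why copying by capacity instead of by fill count returns spurious trailing rows.

```java
import java.util.Arrays;

public class PageCopySketch {
    // Analogous to getPageSize(): the allocated capacity of the page.
    static final int PAGE_CAPACITY = 8;
    static double[] buffer = new double[PAGE_CAPACITY];
    // Analogous to getEndLoop(): the number of rows actually written.
    static int rowsWritten = 0;

    static void putDouble(double v) {
        buffer[rowsWritten++] = v;
    }

    // Capacity-sized copy: pads the result with uninitialized (zero) rows.
    static double[] getDoublePageByCapacity() {
        return Arrays.copyOf(buffer, PAGE_CAPACITY);
    }

    // Fill-count-sized copy: returns only the rows that were written.
    static double[] getDoublePageByFillCount() {
        return Arrays.copyOf(buffer, rowsWritten);
    }
}
```

With three rows written, the capacity-sized copy has length 8 while the fill-count copy has length 3; the two accessors only agree when the page is completely full, which may be why the diff's author swapped them.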
[GitHub] carbondata issue #3053: [CARBONDATA-3233]Fix JVM crash issue in snappy compr...
Github user qiuchenjian commented on the issue: https://github.com/apache/carbondata/pull/3053 I think the performance of rawCompress is better than compressLong and compressInt. Can we find the root cause of the JVM crash? ---
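One reason a rawCompress-style path can outperform typed paths like compressLong/compressInt is that the typed path must first materialize the primitive array as a byte[] before handing it to the compressor, paying an extra full copy. This is a hedged sketch of that extra step, using the JDK's Deflater as a stand-in because Snappy is not in the standard library; the method names here are illustrative, not CarbonData's API.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.Deflater;

public class CompressPathSketch {
    // The conversion a compressLong-style path performs before compression:
    // an O(n) copy of the whole page into a fresh on-heap byte[].
    static byte[] longsToBytes(long[] input) {
        ByteBuffer buf = ByteBuffer.allocate(input.length * Long.BYTES);
        for (long v : input) {
            buf.putLong(v);
        }
        return buf.array();
    }

    // The compression step itself; a raw-compress API could read the source
    // memory directly and skip the longsToBytes() copy above entirely.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] out = new byte[input.length + 64];
        int n = deflater.deflate(out);
        deflater.end();
        return Arrays.copyOf(out, n);
    }
}
```

So the performance question and the crash question are separable: the copy-free path is faster, but if the crash comes from reading off-heap memory with a wrong length (as in the getPageSize()/getEndLoop() discussion above), fixing the length is the root-cause fix regardless of which compress entry point is used.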
[GitHub] carbondata pull request #3032: [CARBONDATA-3210] Merge common method into Ca...
Github user xiaohui0318 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/3032#discussion_r245852864 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/S3Example.scala --- @@ -18,52 +18,50 @@ package org.apache.carbondata.examples import java.io.File -import org.apache.hadoop.fs.s3a.Constants.{ACCESS_KEY, ENDPOINT, SECRET_KEY} import org.apache.spark.sql.{Row, SparkSession} import org.slf4j.{Logger, LoggerFactory} -import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.spark.util.CarbonSparkUtil object S3Example { - /** - * This example demonstrate usage of - * 1. create carbon table with storage location on object based storage - * like AWS S3, Huawei OBS, etc - * 2. load data into carbon table, the generated file will be stored on object based storage - * query the table. - * - * @param args require three parameters "Access-key" "Secret-key" - * "table-path on s3" "s3-endpoint" "spark-master" - */ + /** +* This example demonstrate usage of +* 1. create carbon table with storage location on object based storage +* like AWS S3, Huawei OBS, etc +* 2. load data into carbon table, the generated file will be stored on object based storage +* query the table. +* +* @param args require three parameters "Access-key" "Secret-key" +* "table-path on s3" "s3-endpoint" "spark-master" +*/ --- End diff -- checked and fixed ---
[GitHub] carbondata issue #3052: [CARBONDATA-3227] Fix some spell errors in the proje...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/3052 LGTM! Thanks for your contribution! ---
[GitHub] carbondata issue #3050: [CARBONDATA-3211] Optimize the documentation
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/3050 LGTM! Thanks for your contribution! ---
[GitHub] carbondata issue #3052: [CARBONDATA-3227] Fix some spell errors in the proje...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/3052 LGTM ---
[GitHub] carbondata issue #3050: [CARBONDATA-3211] Optimize the documentation
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/3050 LGTM ---
[GitHub] carbondata issue #3032: [CARBONDATA-3210] Merge common method into CarbonSpa...
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/3032 LGTM ---
[GitHub] carbondata pull request #3052: [CARBONDATA-3227] Fix some spell errors in th...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3052 ---
[jira] [Resolved] (CARBONDATA-3227) There are some spell errors in the project
[ https://issues.apache.org/jira/browse/CARBONDATA-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 resolved CARBONDATA-3227. - Resolution: Fixed > There are some spell errors in the project > -- > > Key: CARBONDATA-3227 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3227 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.5.1 >Reporter: xubo245 >Assignee: Bellamy Yi >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > There are some spell errors in the project: > escapechar > optionlist > hivedefaultpartition > pvalue > errormsg > isDetectAsDimentionDatatype > Please fix them if there are other spell errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
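A sweep like the one this ticket asks for can be sketched as a table of known misspellings mapped to their corrections, applied over source text. This is a hypothetical helper, not part of the actual fix in PR #3052; the single mapping shown is the one misspelling from the ticket whose correction is unambiguous.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SpellSweep {
    // Known-wrong identifier -> corrected spelling. Illustrative entry only;
    // the real PR decided the correct form for each name in the ticket.
    static final Map<String, String> FIXES = new LinkedHashMap<>();
    static {
        FIXES.put("isDetectAsDimentionDatatype", "isDetectAsDimensionDatatype");
    }

    // Apply every known correction to a chunk of source text.
    static String fix(String source) {
        for (Map.Entry<String, String> e : FIXES.entrySet()) {
            source = source.replace(e.getKey(), e.getValue());
        }
        return source;
    }
}
```

Running such a sweep over the codebase (and keeping the table in one place) also makes it easy to add new entries when, as the ticket notes, further spell errors turn up.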
[GitHub] carbondata pull request #3050: [CARBONDATA-3211] Optimize the documentation
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/3050 ---