[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task distribution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1756/ ---
[GitHub] carbondata pull request #2954: [CARBONDATA-3128]HiveExample has some excepti...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2954#discussion_r236555834 --- Diff: pom.xml --- @@ -100,7 +100,6 @@ processing hadoop integration/spark-common -integration/spark-datasource --- End diff -- please add it to PR content ---
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task distribution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1546/ ---
[GitHub] carbondata issue #2955: [CARBONDATA-3133] update build document
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2955 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9805/ ---
[GitHub] carbondata issue #2955: [CARBONDATA-3133] update build document
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2955 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1757/ ---
[jira] [Created] (CARBONDATA-3134) Wrong result when a column is dropped and added using alter with blocklet cache.
Kunal Kapoor created CARBONDATA-3134: Summary: Wrong result when a column is dropped and added using alter with blocklet cache. Key: CARBONDATA-3134 URL: https://issues.apache.org/jira/browse/CARBONDATA-3134 Project: CarbonData Issue Type: Bug Reporter: Kunal Kapoor Assignee: Kunal Kapoor

*Steps to reproduce:*
spark.sql("drop table if exists tile")
spark.sql("create table tile(b int, s int,bi bigint, t timestamp) partitioned by (i int) stored by 'carbondata' TBLPROPERTIES ('DICTIONARY_EXCLUDE'='b,s,i,bi,t','SORT_COLUMS'='b,s,i,bi,t', 'cache_level'='blocklet')")
spark.sql("load data inpath 'C:/Users/k00475610/Documents/en_all.csv' into table tile options('fileheader'='b,s,i,bi,t','DELIMITER'=',')")
spark.sql("select * from tile")
spark.sql("alter table tile drop columns(t)")
spark.sql("alter table tile add columns(t timestamp)")
spark.sql("load data inpath 'C:/Users/k00475610/Documents/en_all.csv' into table tile options('fileheader'='b,s,i,bi,t','DELIMITER'=',')")
spark.sql("select * from tile").show()

*Result:*
+---+---+-----------+----+----+
|  b|  s|         bi|   t|   i|
+---+---+-----------+----+----+
|100|  2|93405673097|null|1644|
|100|  2|93405673097|null|1644|
+---+---+-----------+----+----+

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2955: [CARBONDATA-3133] update build document
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2955#discussion_r236563139 --- Diff: build/README.md --- @@ -29,10 +29,40 @@ Build with different supported versions of Spark, by default using Spark 2.2.1 t ``` mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package mvn -DskipTests -Pspark-2.2 -Dspark.version=2.2.1 clean package +mvn -DskipTests -Pspark-2.3 -Dspark.version=2.3.2 clean package ``` Note: If you are working in Windows environment, remember to add `-Pwindows` while building the project. +## MV Feature Build +Add mv module and sourceDirectory to the spark profile corresponding to the parent pom.xml file and recompile. +The compile command is the same as the command in the previous section +``` + --- End diff -- Do we really need this? Currently we can use `-Pmv` to include the MV feature while compiling. ---
[GitHub] carbondata pull request #2955: [CARBONDATA-3133] update build document
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2955#discussion_r236563388 --- Diff: build/README.md --- @@ -29,10 +29,40 @@ Build with different supported versions of Spark, by default using Spark 2.2.1 t ``` mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package mvn -DskipTests -Pspark-2.2 -Dspark.version=2.2.1 clean package +mvn -DskipTests -Pspark-2.3 -Dspark.version=2.3.2 clean package ``` Note: If you are working in Windows environment, remember to add `-Pwindows` while building the project. +## MV Feature Build --- End diff -- No need for such a long description; just add `-Pmv`. For example: If you want to use MV, remember to add `-Pmv` while building the project. ---
[GitHub] carbondata pull request #2955: [CARBONDATA-3133] update build document
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2955#discussion_r236563581 --- Diff: build/README.md --- @@ -29,10 +29,40 @@ Build with different supported versions of Spark, by default using Spark 2.2.1 t ``` mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package mvn -DskipTests -Pspark-2.2 -Dspark.version=2.2.1 clean package +mvn -DskipTests -Pspark-2.3 -Dspark.version=2.3.2 clean package ``` Note: If you are working in Windows environment, remember to add `-Pwindows` while building the project. +## MV Feature Build --- End diff -- in the parent pom.xml, there is already mv profile. ---
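The reviewers' point in the three comments above is that the MV feature is already exposed through an existing `-Pmv` profile in the parent pom.xml, so no pom.xml edits are needed. A build invocation along those lines might look like the following (profile names and versions taken from the quoted README; treat the exact combination as an assumption, not a verified command):

```shell
# Build against Spark 2.2 with the MV feature enabled via the existing -Pmv profile
mvn -DskipTests -Pspark-2.2 -Dspark.version=2.2.1 -Pmv clean package

# On Windows, additionally add the -Pwindows profile, as the README notes
mvn -DskipTests -Pspark-2.2 -Dspark.version=2.2.1 -Pmv -Pwindows clean package
```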
[GitHub] carbondata pull request #2956: [CARBONDATA-3134] fixed null values when cach...
GitHub user kunal642 opened a pull request: https://github.com/apache/carbondata/pull/2956 [CARBONDATA-3134] fixed null values when cachelevel is set as blocklet **Problem:** For each blocklet an object of SegmentPropertiesAndSchemaHolder is created to store the schema used for query. This object is created only if no other blocklet has the same schema. To check the schema we compare `List<ColumnSchema>`; since the equals method in ColumnSchema does not check columnUniqueId, this check passes incorrectly and the new restructured blocklet uses the schema of the old blocklet. Due to this the newly added column is ignored, because the old blocklet schema marks the column as deleted (alter drop). **Solution:** Instead of checking equality through the existing equals and hashCode, write new implementations of both that compare based on columnUniqueId. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running: $ git pull https://github.com/kunal642/carbondata bug/CARBONDATA-3134 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2956.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2956 commit e74b3f22d285f5b5c37b17a11248b5e7977326b2 Author: kunal642 Date: 2018-11-27T08:43:27Z [CARBONDATA-3134] fixed null values when cachelevel is set as blocklet Problem: For each blocklet an object of SegmentPropertiesAndSchemaHolder is created to store the schema used for query. This object is created only if no other blocklet has the same schema. To check the schema we are comparing List, as the equals method in ColumnSchema does not check for columnUniqueId therefore this check is failing and the new restructured blocklet is using the schema of the old blocklet. Due to this the newly added column is being ignored as the old blocklet schema specifies that the column is delete(alter drop). Solution: Instead of checking the equality through equals and hashcode, write a new implementation for both and check based on columnUniqueId. ---
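The idea behind the fix described above can be sketched as follows. This is a minimal illustration, not the actual CarbonData classes: `ColumnKey`, `sameSchema`, and the method names are stand-ins for `ColumnSchema`'s new `equalsWithColumnId`/`hashCodeWithColumnId` comparison, assuming equality keyed on the column's unique id.

```java
import java.util.List;
import java.util.Objects;

// Illustrative stand-in for CarbonData's ColumnSchema: equality includes
// columnUniqueId, so a dropped-and-re-added column (same name, new unique id)
// no longer matches the old blocklet's schema.
final class ColumnKey {
    private final String columnName;
    private final String columnUniqueId;

    ColumnKey(String columnName, String columnUniqueId) {
        this.columnName = columnName;
        this.columnUniqueId = columnUniqueId;
    }

    boolean equalsWithColumnId(ColumnKey other) {
        return columnName.equals(other.columnName)
            && Objects.equals(columnUniqueId, other.columnUniqueId);
    }

    int hashCodeWithColumnId() {
        return Objects.hash(columnName, columnUniqueId);
    }
}

public class SchemaEqualityDemo {
    // Two schemas match only if every column matches, including its unique id.
    static boolean sameSchema(List<ColumnKey> a, List<ColumnKey> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).equalsWithColumnId(b.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<ColumnKey> old = List.of(new ColumnKey("t", "id-1"));
        List<ColumnKey> readded = List.of(new ColumnKey("t", "id-2"));
        System.out.println(sameSchema(old, old));      // true
        System.out.println(sameSchema(old, readded));  // false: same name, new unique id
    }
}
```

With name-only equality both lists would compare equal and the old schema would be reused, which is exactly the failure the PR describes.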
[GitHub] carbondata pull request #2936: [CARBONDATA-3118] Parallelize block pruning o...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2936#discussion_r236565449 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java --- @@ -1399,6 +1399,17 @@ private CarbonCommonConstants() { public static final String CARBON_PUSH_ROW_FILTERS_FOR_VECTOR_DEFAULT = "false"; + /** + * max driver threads used for block pruning [1 to 4 threads] + */ + @CarbonProperty public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING = + "carbon.max.driver.threads.for.block.pruning"; + + public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING_DEFAULT = "4"; + + // block prune in multi-thread if files size more than 100K files. + public static final int CARBON_DRIVER_PRUNING_MULTI_THREAD_ENABLE_FILES_COUNT = 10; --- End diff -- Why add this constraint? ---
[GitHub] carbondata pull request #2936: [CARBONDATA-3118] Parallelize block pruning o...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2936#discussion_r236564769 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java --- @@ -63,6 +75,8 @@ private SegmentPropertiesFetcher segmentPropertiesFetcher; + private static final Log LOG = LogFactory.getLog(TableDataMap.class); --- End diff -- We do not use Apache commons-logging in the CarbonData project! Please take care of this ---
[GitHub] carbondata pull request #2936: [CARBONDATA-3118] Parallelize block pruning o...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2936#discussion_r236568719 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java --- @@ -487,6 +487,8 @@ private int getBlockCount(List blocklets) { // First prune using default datamap on driver side. TableDataMap defaultDataMap = DataMapStoreManager.getInstance().getDefaultDataMap(carbonTable); List prunedBlocklets = null; +// This is to log the event, so user will know what is happening by seeing logs. +LOG.info("Started block pruning ..."); --- End diff -- Instead of adding these logs, I think we'd better add the time consumed for pruning in statistics. ---
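The reviewer's suggestion above, recording the time consumed by pruning as a statistic rather than emitting free-form log lines, could be sketched generically like this. The class and key names here are hypothetical, not CarbonData's actual statistics recorder API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: wrap a unit of work, store its duration under a named
// key, and let callers query the collected statistics afterwards.
public class PruningStats {
    private final Map<String, Long> statsMillis = new HashMap<>();

    // Runs the work, records elapsed milliseconds under the key, returns the result.
    public <T> T timed(String key, Supplier<T> work) {
        long start = System.nanoTime();
        T result = work.get();
        statsMillis.put(key, (System.nanoTime() - start) / 1_000_000);
        return result;
    }

    public long get(String key) {
        return statsMillis.getOrDefault(key, -1L);
    }

    public static void main(String[] args) {
        PruningStats stats = new PruningStats();
        // Stand-in for the real block-pruning call
        int prunedBlocklets = stats.timed("block_pruning_ms", () -> 42);
        System.out.println(prunedBlocklets + " blocklets, pruning took "
            + stats.get("block_pruning_ms") + " ms");
    }
}
```

The benefit over an info log is that the duration ends up attached to the query's statistics and can be aggregated, rather than being scattered across log files.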
[GitHub] carbondata pull request #2936: [CARBONDATA-3118] Parallelize block pruning o...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2936#discussion_r236565153 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java --- @@ -1399,6 +1399,17 @@ private CarbonCommonConstants() { public static final String CARBON_PUSH_ROW_FILTERS_FOR_VECTOR_DEFAULT = "false"; + /** + * max driver threads used for block pruning [1 to 4 threads] + */ + @CarbonProperty public static final String CARBON_MAX_DRIVER_THREADS_FOR_BLOCK_PRUNING = + "carbon.max.driver.threads.for.block.pruning"; --- End diff -- I think it's better to use the name `carbon.query.pruning.parallelism.driver` ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1547/ ---
[GitHub] carbondata pull request #2949: [WIP] support parallel block pruning for non-...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2949#discussion_r236571984 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java --- @@ -70,4 +70,6 @@ void init(DataMapModel dataMapModel) */ void finish(); + // can return , number of records information that are stored in datamap. --- End diff -- "can return"? What does this mean? ---
[GitHub] carbondata issue #2942: [CARBONDATA-3121] Improvement of CarbonReader build ...
Github user kunal642 commented on the issue: https://github.com/apache/carbondata/pull/2942 LGTM ---
[GitHub] carbondata issue #2955: [CARBONDATA-3133] update build document
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2955 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1548/ ---
[GitHub] carbondata pull request #2942: [CARBONDATA-3121] Improvement of CarbonReader...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2942 ---
[jira] [Resolved] (CARBONDATA-3121) CarbonReader build time is huge
[ https://issues.apache.org/jira/browse/CARBONDATA-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor resolved CARBONDATA-3121. -- Resolution: Fixed Fix Version/s: 1.5.1 > CarbonReader build time is huge > --- > > Key: CARBONDATA-3121 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3121 > Project: CarbonData > Issue Type: Improvement > Components: core >Reporter: Naman Rastogi >Assignee: Naman Rastogi >Priority: Minor > Fix For: 1.5.1 > > Time Spent: 3.5h > Remaining Estimate: 0h > > CarbonReader build is fetching data and triggering I/O operation instead of > only initializing the iterator, thus large build time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task distribution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9806/ ---
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task distribution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1758/ ---
[GitHub] carbondata pull request #2957: [DOCUMENT] Added filter push handling paramet...
GitHub user ravipesala opened a pull request: https://github.com/apache/carbondata/pull/2957 [DOCUMENT] Added filter push handling parameter in documents. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? - [ ] Any backward compatibility impacted? - [ ] Document update required? - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata filter-push-doc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2957.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2957 commit ef07cb8d607d276205c79cb04cc0b4195cd477a1 Author: ravipesala Date: 2018-11-27T09:46:57Z Added rowpush doc ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]HiveExample has some exception
Github user SteNicholas commented on the issue: https://github.com/apache/carbondata/pull/2954 Add RunHiveExample into CI test. ---
[jira] [Created] (CARBONDATA-3135) Remove sub-module profile, which can avoid multiple version jar
xubo245 created CARBONDATA-3135: --- Summary: Remove sub-module profile, which can avoid multiple version jar Key: CARBONDATA-3135 URL: https://issues.apache.org/jira/browse/CARBONDATA-3135 Project: CarbonData Issue Type: Improvement Reporter: xubo245 Assignee: xubo245 When I run HiveExample with spark-2.1 or spark-2.3, two Spark versions end up on the classpath (2.1+2.2 or 2.3+2.2), and HiveExample fails. For more details: https://issues.apache.org/jira/browse/CARBONDATA-3128 The root cause is that the user can't select a sub-module profile in IDEA, so the sub-module uses its default version while the other modules use the parent profile's version, leaving two versions of the jar. Removing the sub-module profile avoids the multiple-version jar problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2957: [DOCUMENT] Added filter push handling parameter in d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2957 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1549/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9807/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1759/ ---
[GitHub] carbondata issue #2955: [CARBONDATA-3133] update build document
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2955 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1760/ ---
[GitHub] carbondata issue #2955: [CARBONDATA-3133] update build document
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2955 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9808/ ---
[GitHub] carbondata issue #2957: [DOCUMENT] Added filter push handling parameter in d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2957 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9809/ ---
[GitHub] carbondata issue #2957: [DOCUMENT] Added filter push handling parameter in d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2957 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1761/ ---
[GitHub] carbondata issue #2955: [CARBONDATA-3133] update build document
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2955 Please optimize the title: Update build document ---
[GitHub] carbondata pull request #2954: [CARBONDATA-3128]HiveExample has some excepti...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2954#discussion_r236622117 --- Diff: integration/hive/src/test/scala/org/apache/carbondata/hiveexampleCI/RunHiveExample.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.hiveexampleCI + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.carbondata.hiveexample.HiveExample +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +/** + * Test suite for examples + */ + --- End diff -- Remove this line ---
[GitHub] carbondata pull request #2954: [CARBONDATA-3128]HiveExample has some excepti...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2954#discussion_r236622183 --- Diff: integration/hive/src/test/scala/org/apache/carbondata/hiveexampleCI/RunHiveExample.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.hiveexampleCI + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.carbondata.hiveexample.HiveExample +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +/** + * Test suite for examples + */ + +class RunHiveExample extends QueryTest with BeforeAndAfterAll { + + private val spark = sqlContext.sparkSession + + override def beforeAll: Unit = { +CarbonProperties.getInstance().addProperty( + CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, + CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT) +CarbonProperties.getInstance().addProperty( + CarbonCommonConstants.CARBON_DATE_FORMAT, + CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT) + } + + override def afterAll { +sql("USE default") + --- End diff -- please remove this line ---
[GitHub] carbondata issue #2957: [DOCUMENT] Added filter push handling parameter in d...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/2957 LGTM ---
[GitHub] carbondata pull request #2954: [CARBONDATA-3128]HiveExample has some excepti...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2954#discussion_r236623474 --- Diff: integration/hive/src/test/scala/org/apache/carbondata/hiveexampleCI/RunHiveExample.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.hiveexampleCI + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.carbondata.hiveexample.HiveExample +import org.apache.spark.sql.test.util.QueryTest --- End diff -- Please move the two lines to before org.apache.carbondata.* ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]HiveExample has some exception
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2954 retest this please ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]HiveExample has some exception
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2954 Please optimize the title, like: Fix the HiveExample exception ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1551/ ---
[GitHub] carbondata pull request #2954: [CARBONDATA-3128]Fix HiveExample Exception
Github user SteNicholas commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2954#discussion_r236627192 --- Diff: integration/hive/src/test/scala/org/apache/carbondata/hiveexampleCI/RunHiveExample.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.hiveexampleCI + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.carbondata.hiveexample.HiveExample +import org.apache.spark.sql.test.util.QueryTest --- End diff -- Do it ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2954 add to whitelist ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/2954 retest this please ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/2954 add to whitelist ---
[GitHub] carbondata pull request #2954: [CARBONDATA-3128]Fix the HiveExample exceptio...
Github user zzcclp commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2954#discussion_r236634194 --- Diff: integration/hive/src/test/scala/org/apache/carbondata/hiveexampleCI/RunHiveExample.scala --- @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.hiveexampleCI + +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.carbondata.hiveexample.HiveExample + +class RunHiveExample extends QueryTest with BeforeAndAfterAll { + + private val spark = sqlContext.sparkSession + + override def beforeAll: Unit = { +CarbonProperties.getInstance().addProperty( + CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, + CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT) +CarbonProperties.getInstance().addProperty( + CarbonCommonConstants.CARBON_DATE_FORMAT, + CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT) + } + + test("HiveExample") { +HiveExample.main(null) --- End diff -- Don't call HiveExample.main directly; it would be better to write some real test cases. ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/2954 There are still some dependencies on spark-core that include a version tag; please remove them too, in modules carbondata-examples-spark2, carbondata-mv-core, and so on. ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/2954 add to whitelist ---
[GitHub] carbondata pull request #2953: [CARBONDATA-3132]Correct the task distributio...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2953#discussion_r236640610 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala --- @@ -401,7 +401,11 @@ class CarbonMergerRDD[K, V]( .add(new CarbonInputSplitTaskInfo(entry._1, entry._2).asInstanceOf[Distributable]) ) -val nodeBlockMap = CarbonLoaderUtil.nodeBlockMapping(taskInfoList, -1) +// get all the active nodes of cluster and prepare the nodeBlockMap based on these nodes +val activeNodes = DistributionUtil + .ensureExecutorsAndGetNodeList(taskInfoList.asScala, sparkContext) + +val nodeBlockMap = CarbonLoaderUtil.nodeBlockMapping(taskInfoList, -1, activeNodes.asJava) --- End diff -- Below code is redundant, please remove it ---
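The change under review passes the cluster's active node list into `nodeBlockMapping`, so that compaction tasks are assigned only to live executors. A toy sketch of that idea, with hypothetical names standing in for `CarbonLoaderUtil.nodeBlockMapping` (the real method also considers locality, which this round-robin version ignores):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NodeBlockMappingDemo {
    // Toy stand-in: distribute task ids round-robin across the active nodes only,
    // so no work is assigned to a node without a live executor.
    static Map<String, List<Integer>> nodeBlockMapping(List<Integer> taskIds,
                                                       List<String> activeNodes) {
        Map<String, List<Integer>> mapping = new LinkedHashMap<>();
        for (String node : activeNodes) {
            mapping.put(node, new ArrayList<>());
        }
        for (int i = 0; i < taskIds.size(); i++) {
            mapping.get(activeNodes.get(i % activeNodes.size())).add(taskIds.get(i));
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> m =
            nodeBlockMapping(List.of(1, 2, 3, 4, 5), List.of("node-a", "node-b"));
        System.out.println(m);  // {node-a=[1, 3, 5], node-b=[2, 4]}
    }
}
```

Without the active-node filter, tasks could be mapped to nodes that no longer have executors, which is the distribution problem the PR title refers to.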
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1553/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9811/ ---
[GitHub] carbondata pull request #2956: [CARBONDATA-3134] fixed null values when cach...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2956#discussion_r236646004 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java --- @@ -332,13 +334,42 @@ public void clear() { } SegmentPropertiesAndSchemaHolder.SegmentPropertiesWrapper other = (SegmentPropertiesAndSchemaHolder.SegmentPropertiesWrapper) obj; - return tableIdentifier.equals(other.tableIdentifier) && columnsInTable - .equals(other.columnsInTable) && Arrays + return tableIdentifier.equals(other.tableIdentifier) && checkColumnSchemaEquality( + columnsInTable, other.columnsInTable) && Arrays .equals(columnCardinality, other.columnCardinality); } +private boolean checkColumnSchemaEquality(List<ColumnSchema> obj1, List<ColumnSchema> obj2) { + List<ColumnSchema> clonedObj1 = new ArrayList<>(obj1); + List<ColumnSchema> clonedObj2 = new ArrayList<>(obj2); + clonedObj1.addAll(obj1); + clonedObj2.addAll(obj2); + Collections.sort(clonedObj1, new Comparator<ColumnSchema>() { +@Override public int compare(ColumnSchema o1, ColumnSchema o2) { + return o1.getColumnUniqueId().compareTo(o2.getColumnUniqueId()); +} + }); + Collections.sort(clonedObj2, new Comparator<ColumnSchema>() { +@Override public int compare(ColumnSchema o1, ColumnSchema o2) { + return o1.getColumnUniqueId().compareTo(o2.getColumnUniqueId()); +} + }); + boolean exists = true; + for (int i = 0; i < obj1.size(); i++) { +if (!clonedObj1.get(i).equalsWithColumnId(clonedObj2.get(i))) { + exists = false; + break; +} + } + return exists; +} + @Override public int hashCode() { - return tableIdentifier.hashCode() + columnsInTable.hashCode() + Arrays + int hashCode = 0; + for (ColumnSchema columnSchema: columnsInTable) { +hashCode += columnSchema.hashCodeWithColumnId(); --- End diff -- rename the variable to `allColumnsHashCode` ---
[GitHub] carbondata pull request #2956: [CARBONDATA-3134] fixed null values when cach...
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2956#discussion_r236644573
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java --- (same hunk as quoted above)
+    private boolean checkColumnSchemaEquality(List<ColumnSchema> obj1, List<ColumnSchema> obj2) {
+      List<ColumnSchema> clonedObj1 = new ArrayList<>(obj1);
--- End diff --
You can add a length check in the first line of the method: if the sizes of the two lists differ, we can return false right away.
---
[GitHub] carbondata pull request #2953: [CARBONDATA-3132]Correct the task disrtibutio...
Github user akashrn5 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2953#discussion_r236646439
--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala ---
@@ -401,7 +401,11 @@ class CarbonMergerRDD[K, V](
         .add(new CarbonInputSplitTaskInfo(entry._1, entry._2).asInstanceOf[Distributable])
     )
-    val nodeBlockMap = CarbonLoaderUtil.nodeBlockMapping(taskInfoList, -1)
+    // get all the active nodes of cluster and prepare the nodeBlockMap based on these nodes
+    val activeNodes = DistributionUtil
+      .ensureExecutorsAndGetNodeList(taskInfoList.asScala, sparkContext)
+
+    val nodeBlockMap = CarbonLoaderUtil.nodeBlockMapping(taskInfoList, -1, activeNodes.asJava)
--- End diff --
done
---
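For context, the change above passes the cluster's active nodes into `CarbonLoaderUtil.nodeBlockMapping` so compaction tasks are distributed over live executors instead of arbitrary hosts. A toy, purely illustrative sketch of locality-first assignment with a round-robin fallback follows; all names here are hypothetical and the real CarbonLoaderUtil algorithm is considerably more involved:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: map each block to an active node, preferring a node
// that already holds a replica of the block (data locality), falling back to
// round-robin over active nodes when none of the preferred hosts is alive.
class NodeBlockAssign {
  // blockLocations: block id -> preferred host names; activeNodes must be non-empty.
  static Map<String, List<String>> assign(Map<String, List<String>> blockLocations,
      List<String> activeNodes) {
    Map<String, List<String>> nodeToBlocks = new LinkedHashMap<>();
    for (String node : activeNodes) {
      nodeToBlocks.put(node, new ArrayList<String>());
    }
    int rr = 0; // round-robin cursor for blocks with no active preferred node
    for (Map.Entry<String, List<String>> e : blockLocations.entrySet()) {
      String chosen = null;
      for (String preferred : e.getValue()) {
        if (nodeToBlocks.containsKey(preferred)) { // preferred host is active
          chosen = preferred;
          break;
        }
      }
      if (chosen == null) {
        chosen = activeNodes.get(rr++ % activeNodes.size());
      }
      nodeToBlocks.get(chosen).add(e.getKey());
    }
    return nodeToBlocks;
  }
}
```

The point of the PR is the fallback path: without the active-node list, blocks whose preferred hosts are dead would be mapped to nodes that cannot run tasks.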
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task disrtibution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1554/ ---
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task disrtibution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9813/ ---
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task disrtibution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1765/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1763/ ---
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task disrtibution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1555/ ---
[GitHub] carbondata pull request #2954: [CARBONDATA-3128]Fix the HiveExample exceptio...
Github user SteNicholas commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2954#discussion_r236660688
--- Diff: integration/hive/src/test/scala/org/apache/carbondata/hiveexampleCI/RunHiveExample.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.hiveexampleCI
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hiveexample.HiveExample
+
+class RunHiveExample extends QueryTest with BeforeAndAfterAll {
+
+  private val spark = sqlContext.sparkSession
+
+  override def beforeAll: Unit = {
+    CarbonProperties.getInstance().addProperty(
+      CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+      CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
+    CarbonProperties.getInstance().addProperty(
+      CarbonCommonConstants.CARBON_DATE_FORMAT,
+      CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT)
+  }
+
+  test("HiveExample") {
+    HiveExample.main(null)
--- End diff --
Please explain this; it appears to follow the examples/spark2/src/test/scala/org/apache/carbondata/examplesCI/RunExamples.scala implementation.
---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1764/ ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9812/ ---
[GitHub] carbondata issue #2915: [CARBONDATA-3095] Optimize the documentation of SDK/...
Github user KanakaKumar commented on the issue: https://github.com/apache/carbondata/pull/2915 LGTM ---
[jira] [Created] (CARBONDATA-3136) JVM crash with preaggregate datamap
Ajantha Bhat created CARBONDATA-3136: Summary: JVM crash with preaggregate datamap Key: CARBONDATA-3136 URL: https://issues.apache.org/jira/browse/CARBONDATA-3136 Project: CarbonData Issue Type: Improvement Reporter: Ajantha Bhat Assignee: Ajantha Bhat JVM crash with preaggregate datamap. callstack: Stack: [0x7efebd49a000,0x7efebd59b000], sp=0x7efebd598dc8, free space=1019k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x7b2b50] J 7620 sun.misc.Unsafe.copyMemory(Ljava/lang/Object;JLjava/lang/Object;JJ)V (0 bytes) @ 0x7eff4a3479e1 [0x7eff4a347900+0xe1] j org.apache.spark.unsafe.Platform.copyMemory(Ljava/lang/Object;JLjava/lang/Object;JJ)V+34 j org.apache.spark.sql.catalyst.expressions.UnsafeRow.getBinary(I)[B+54 j org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(III)Lorg/apache/spark/sql/types/Decimal;+30 j org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Lorg/apache/spark/sql/catalyst/InternalRow;)Lorg/apache/spark/sql/catalyst/expressions/UnsafeRow;+36 j org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Ljava/lang/Object;)Ljava/lang/Object;+5 J 3104 C1 scala.collection.Iterator$$anon$11.next()Ljava/lang/Object; (19 bytes) @ 0x7eff49154724 [0x7eff49154560+0x1c4] j org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(Lscala/collection/Iterator;)Lscala/collection/Iterator;+78 j org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(Ljava/lang/Object;)Ljava/lang/Object;+5 J 14007 C1 org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; (17 bytes) @ 0x7eff4a6ed204 [0x7eff4a6ecc40+0x5c4] J 11684 C1 org.apache.spark.rdd.MapPartitionsRDD.compute(Lorg/apache/spark/Partition;Lorg/apache/spark/TaskContext;)Lscala/collection/Iterator; (36 bytes) @ 0x7eff4ad11274 [0x7eff4ad10f60+0x314] J 13771 C1 
org.apache.spark.rdd.RDD.iterator(Lorg/apache/spark/Partition;Lorg/apache/spark/TaskContext;)Lscala/collection/Iterator; (46 bytes) @ 0x7eff4b39dd3c [0x7eff4b39d160+0xbdc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task disrtibution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9814/ ---
[GitHub] carbondata pull request #2956: [CARBONDATA-3134] fixed null values when cach...
Github user qiuchenjian commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2956#discussion_r236685568
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java --- (same hunk as quoted above)
+      Collections.sort(clonedObj2, new Comparator<ColumnSchema>() {
--- End diff --
You can optimize away the duplicated comparator code; the two comparators are the same.
---
[GitHub] carbondata issue #2953: [CARBONDATA-3132]Correct the task disrtibution in ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2953 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1766/ ---
[GitHub] carbondata pull request #2956: [CARBONDATA-3134] fixed null values when cach...
Github user qiuchenjian commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2956#discussion_r236689465
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java --- (same hunk as quoted above)
+    private boolean checkColumnSchemaEquality(List<ColumnSchema> obj1, List<ColumnSchema> obj2) {
+      List<ColumnSchema> clonedObj1 = new ArrayList<>(obj1);
--- End diff --
I think the checkColumnSchemaEquality method needs to consider `obj2 == null`.
---
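The suggestions in this review thread (an early size check, a single shared comparator, null handling) can be sketched in isolation as follows. This is a minimal standalone sketch, not the actual patch: `ColumnSchema` here is a stripped-down stand-in for CarbonData's real class, keeping only the two methods the comparison uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Minimal stand-in for org.apache.carbondata...ColumnSchema (hypothetical).
class ColumnSchema {
  private final String columnUniqueId;
  ColumnSchema(String columnUniqueId) { this.columnUniqueId = columnUniqueId; }
  String getColumnUniqueId() { return columnUniqueId; }
  boolean equalsWithColumnId(ColumnSchema other) {
    return columnUniqueId.equals(other.columnUniqueId);
  }
}

class SchemaEquality {
  // Review suggestion: one shared comparator instead of two identical anonymous classes.
  private static final Comparator<ColumnSchema> BY_UNIQUE_ID =
      Comparator.comparing(ColumnSchema::getColumnUniqueId);

  static boolean checkColumnSchemaEquality(List<ColumnSchema> a, List<ColumnSchema> b) {
    // Review suggestions: bail out early on null or differing sizes.
    if (a == null || b == null || a.size() != b.size()) {
      return false;
    }
    // Sort copies by column unique id so a reordered (drop + re-add) schema
    // still compares equal to the original column set.
    List<ColumnSchema> sortedA = new ArrayList<>(a);
    List<ColumnSchema> sortedB = new ArrayList<>(b);
    sortedA.sort(BY_UNIQUE_ID);
    sortedB.sort(BY_UNIQUE_ID);
    for (int i = 0; i < sortedA.size(); i++) {
      if (!sortedA.get(i).equalsWithColumnId(sortedB.get(i))) {
        return false;
      }
    }
    return true;
  }
}
```

For example, `checkColumnSchemaEquality(Arrays.asList(c1, c2), Arrays.asList(c2, c1))` is true when both lists contain the same column unique ids, which is exactly the property the blocklet-cache fix relies on.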
[GitHub] carbondata pull request #2954: [CARBONDATA-3128]Fix the HiveExample exceptio...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2954#discussion_r236695520 --- Diff: pom.xml --- @@ -200,12 +199,29 @@ + --- End diff -- Please add scope for spark-core ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/2954 @ravipesala @jackylk please review this, thanks. ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2954 retest this please ---
[jira] [Updated] (CARBONDATA-3136) JVM crash with preaggregate datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ajantha Bhat updated CARBONDATA-3136:
-
Description: JVM crash with preaggregate datamap. callstack: (same native-frame stack trace as in the original report above), plus the reproducing test:

  test("Test Pre_aggregate with decimal column with order by") {
    sql("drop table if exists maintable")
    sql("create table maintable(name string, decimal_col decimal(30,16)) stored by 'carbondata'")
    sql("insert into table maintable select 'abc',452.564")
    sql(
      "create datamap ag1 on table maintable using 'preaggregate' as select name,avg(decimal_col)" +
      " from maintable group by name")
    checkAnswer(sql("select avg(decimal_col) from maintable group by name order by name"),
      Seq(Row(452.5640)))
  }

was: the same description without the reproducing test.

> JVM crash with preaggregate datamap
> ---
>
> K
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1557/ ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1558/ ---
[GitHub] carbondata pull request #2958: [CARBONDATA-3136] JVM crash with preaggregate...
GitHub user ajantha-bhat opened a pull request: https://github.com/apache/carbondata/pull/2958 [CARBONDATA-3136] JVM crash with preaggregate datamap when the average of a decimal column is taken with order by.
Problem: JVM crash with preaggregate datamap when the average of a decimal column is taken with order by.
Cause: When preparing the plan with a preaggregate datamap, decimal is cast to double in the average expression. This led to a JVM crash in Spark because the value was filled with the wrong precision (callstack mentioned in the JIRA).
Solution: The division result of average should be cast to decimal instead of double for the decimal datatype.
Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? NA - [ ] Any backward compatibility impacted? NA - [ ] Document update required? NA - [ ] Testing done. Yes, added UT - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA
You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajantha-bhat/carbondata issue_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2958.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2958 commit 8d95838e5d5991d7c355944d40a54972ea1c1424 Author: ajantha-bhat Date: 2018-11-27T14:07:49Z jvm crash when query pre-aggreagte table with avg(decimal_column) and order by ---
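The cause/solution described in this PR (keep the average's division result as decimal rather than detouring through double) can be illustrated with plain `BigDecimal` arithmetic. This is only an analogy using the value from the JIRA's test case, not CarbonData's actual plan rewrite; the method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustration: averaging a decimal(30,16) column as sum/count.
class DecimalAvg {
  // Correct direction of the fix: the quotient stays decimal and keeps
  // the declared scale of 16, matching the column type.
  static BigDecimal avgAsDecimal(BigDecimal sum, long count) {
    return sum.divide(BigDecimal.valueOf(count), 16, RoundingMode.HALF_UP);
  }

  // Problematic direction: going through double discards the declared
  // precision/scale, so downstream code that still expects a decimal(30,16)
  // layout reads the value with the wrong precision.
  static double avgAsDouble(BigDecimal sum, long count) {
    return sum.doubleValue() / count;
  }
}
```

For the JIRA's single-row example (`452.564`), `avgAsDecimal` yields `452.5640000000000000` at scale 16, while `avgAsDouble` yields a plain `452.564` double with no scale information left for the decimal code path to rely on.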
[GitHub] carbondata issue #2958: [CARBONDATA-3136] JVM crash with preaggregate datama...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2958 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1559/ ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9815/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1769/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9816/ ---
[GitHub] carbondata issue #2958: [CARBONDATA-3136] JVM crash with preaggregate datama...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2958 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1770/ ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1768/ ---
[GitHub] carbondata pull request #2949: [WIP] support parallel block pruning for non-...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2949#discussion_r236746764 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMap.java --- @@ -70,4 +70,6 @@ void init(DataMapModel dataMapModel) */ void finish(); + // can return , number of records information that are stored in datamap. --- End diff -- ok, changed to just "returns" ---
[GitHub] carbondata issue #2958: [CARBONDATA-3136] JVM crash with preaggregate datama...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2958 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9817/ ---
[GitHub] carbondata issue #2949: [CARBONDATA-3118] support parallel block pruning for...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1560/ ---
[GitHub] carbondata pull request #2956: [CARBONDATA-3134] fixed null values when cach...
Github user kunal642 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2956#discussion_r236768643
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java --- (same hunk as quoted above)
+      int hashCode = 0;
+      for (ColumnSchema columnSchema : columnsInTable) {
+        hashCode += columnSchema.hashCodeWithColumnId();
--- End diff --
done
---
[GitHub] carbondata pull request #2956: [CARBONDATA-3134] fixed null values when cach...
Github user kunal642 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2956#discussion_r236768636
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java --- (same hunk as quoted above)
+      Collections.sort(clonedObj2, new Comparator<ColumnSchema>() {
--- End diff --
done
---
[GitHub] carbondata pull request #2956: [CARBONDATA-3134] fixed null values when cach...
Github user kunal642 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2956#discussion_r236768667
--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java --- (same hunk as quoted above)
+    private boolean checkColumnSchemaEquality(List<ColumnSchema> obj1, List<ColumnSchema> obj2) {
+      List<ColumnSchema> clonedObj1 = new ArrayList<>(obj1);
--- End diff --
done
---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1561/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9819/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1772/ ---
[GitHub] carbondata issue #2949: [CARBONDATA-3118] support parallel block pruning for...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9818/ ---
[GitHub] carbondata issue #2949: [CARBONDATA-3118] support parallel block pruning for...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2949 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1771/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1562/ ---
[GitHub] carbondata issue #2956: [CARBONDATA-3134] fixed null values when cachelevel ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2956 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1773/ ---
[GitHub] carbondata pull request #2949: [CARBONDATA-3118] support parallel block prun...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2949#discussion_r236907320 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java --- @@ -205,26 +195,53 @@ public BlockletDetailsFetcher getBlockletDetailsFetcher() { final FilterResolverIntf filterExp, final List partitions, List blocklets, final Map> dataMaps, int totalFiles) { +/* + * + * Below is the example of how this part of code works. + * consider a scenario of having 5 segments, 10 datamaps in each segment, --- End diff -- Also what does the 'record' mean below? ---
[GitHub] carbondata pull request #2949: [CARBONDATA-3118] support parallel block prun...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2949#discussion_r236907065
--- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java --- (same hunk as quoted above)
+ * consider a scenario of having 5 segments, 10 datamaps in each segment,
--- End diff --
What do you mean by saying '10 datamaps in each segment'? Do you mean '10 index files or merged index files or blocklet or something else?'
---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2954 retest this please ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1563/ ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9821/ ---
[GitHub] carbondata issue #2954: [CARBONDATA-3128]Fix the HiveExample exception
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2954 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1774/ ---
[GitHub] carbondata issue #2914: [WIP][CARBONDATA-3093] Provide property builder for ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2914 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1564/ ---