[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052532 --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java --- @@ -528,8 +528,7 @@ protected void fillBlockIndexInfoDetails(long numberOfRows, String filePath, org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex blockletIndex = new org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex(btree, minmax); BlockIndexInfo blockIndexInfo = -new BlockIndexInfo(numberOfRows, filePath.substring(0, filePath.lastIndexOf('.')), -currentPosition, blockletIndex); +new BlockIndexInfo(numberOfRows, filePath, currentPosition, blockletIndex); --- End diff -- Can you explain why this change was made? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052521 --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v1/CarbonFactDataWriterImplV1.java --- @@ -373,7 +373,7 @@ protected void writeBlockletInfoToFile(FileChannel channel, String filePath) FileFooter convertFileMeta = CarbonMetadataUtil .convertFileFooter(blockletInfoList, localCardinality.length, localCardinality, thriftColumnSchemaList, dataWriterVo.getSegmentProperties()); - fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), filePath, currentPosition); + fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), carbonDataFileName, currentPosition); --- End diff -- Please align the parameter name (filePath) of fillBlockIndexInfoDetails in AbstractFactDataWriter.java with this change. For example: protected void fillBlockIndexInfoDetails(long numberOfRows, String carbonDataFileName, long currentPosition). Please update all affected places accordingly.
[jira] [Updated] (CARBONDATA-820) Redundant BitSet created in data load
[ https://issues.apache.org/jira/browse/CARBONDATA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-820: Request participants: (was: ) Description: In CarbonFactDataHandlerColumnar.getMeasureNullValueIndexBitSet method > Redundant BitSet created in data load > - > > Key: CARBONDATA-820 > URL: https://issues.apache.org/jira/browse/CARBONDATA-820 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.0.0-incubating >Reporter: Jacky Li >Priority: Minor > Fix For: 1.1.0-incubating > > > In CarbonFactDataHandlerColumnar.getMeasureNullValueIndexBitSet method -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] incubator-carbondata pull request #698: [CARBONDATA-820] Remove redundant cr...
GitHub user jackylk opened a pull request: https://github.com/apache/incubator-carbondata/pull/698 [CARBONDATA-820] Remove redundant creation Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[CARBONDATA-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - What manual testing you have done? - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/jackylk/incubator-carbondata hotfix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/698.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #698 commit a430472d39b261b9fe85316a496ae27783d5b2bc Author: jackylk Date: 2017-03-26T05:47:02Z remove redundant object creation
[jira] [Created] (CARBONDATA-820) Redundant BitSet created in data load
Jacky Li created CARBONDATA-820: --- Summary: Redundant BitSet created in data load Key: CARBONDATA-820 URL: https://issues.apache.org/jira/browse/CARBONDATA-820 Project: CarbonData Issue Type: Bug Affects Versions: 1.0.0-incubating Reporter: Jacky Li Priority: Minor Fix For: 1.1.0-incubating
[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...
Github user chenliang613 commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052363 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestDataLoadWithFileName.scala --- @@ -0,0 +1,111 @@ +package org.apache.carbondata.spark.testsuite.dataload --- End diff -- Please add the license header.
[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/696 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1336/
[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...
Github user chenliang613 commented on the issue: https://github.com/apache/incubator-carbondata/pull/696 retest this please
[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...
Github user Sephiroth-Lin commented on the issue: https://github.com/apache/incubator-carbondata/pull/696 LGTM
[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Store one SegmentPr...
Github user QiangCai commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/659#discussion_r108033699 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentTaskIndex.java --- @@ -16,30 +16,52 @@ */ package org.apache.carbondata.core.datastore.block; +import java.util.HashMap; import java.util.List; +import java.util.Map; import org.apache.carbondata.core.datastore.BTreeBuilderInfo; import org.apache.carbondata.core.datastore.BtreeBuilder; import org.apache.carbondata.core.datastore.impl.btree.BlockBTreeBuilder; +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier; import org.apache.carbondata.core.metadata.blocklet.DataFileFooter; /** * Class which is responsible for loading the b+ tree block. This class will * persist all the detail of a table segment */ public class SegmentTaskIndex extends AbstractIndex { + private static MapsegmentPropertiesCached = --- End diff -- Why not use TableSegmentUniqueIdentifier as the key?
[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...
Github user QiangCai commented on the issue: https://github.com/apache/incubator-carbondata/pull/696 LGTM
[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/659 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1335/
[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/659 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1334/
[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...
Github user watermen commented on the issue: https://github.com/apache/incubator-carbondata/pull/659 @jackylk @QiangCai I have already modified the code to use the "Store one SegmentProperties object each segment" solution.
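For readers following the thread, the "one SegmentProperties object per segment" idea can be sketched roughly as below. This is only an illustration under assumptions, not the code in the pull request: the class name PerSegmentCacheSketch is hypothetical, K stands in for whatever segment identifier is chosen (for example a TableSegmentUniqueIdentifier), and V stands in for SegmentProperties.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch, not CarbonData code: K stands in for a segment
// identifier, V for the SegmentProperties object shared within a segment.
final class PerSegmentCacheSketch<K, V> {
  private final Map<K, V> cache = new ConcurrentHashMap<>();
  private final Function<K, V> loader;

  PerSegmentCacheSketch(Function<K, V> loader) {
    this.loader = loader;
  }

  // Builds the value on first access for a segment, then reuses it.
  V get(K segmentKey) {
    return cache.computeIfAbsent(segmentKey, loader);
  }

  int size() {
    return cache.size();
  }

  public static void main(String[] args) {
    PerSegmentCacheSketch<String, Object> cache =
        new PerSegmentCacheSketch<>(key -> new Object());
    Object first = cache.get("Segment_0");
    Object second = cache.get("Segment_0");
    // Both lookups for the same segment return the same instance.
    System.out.println(first == second);
  }
}
```

The key type decides how many objects are kept: keying by segment rather than by task means every task index in a segment shares one instance.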
[jira] [Resolved] (CARBONDATA-801) [Documentation] Examples format to be fixed
[ https://issues.apache.org/jira/browse/CARBONDATA-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Chen resolved CARBONDATA-801. --- Resolution: Fixed Fix Version/s: 1.1.0-incubating > [Documentation] Examples format to be fixed > --- > > Key: CARBONDATA-801 > URL: https://issues.apache.org/jira/browse/CARBONDATA-801 > Project: CarbonData > Issue Type: Bug >Reporter: Gururaj Shetty >Assignee: Srinath Thota >Priority: Minor > Fix For: 1.1.0-incubating > > Time Spent: 1h > Remaining Estimate: 0h > > Some examples provided in DDL are enclosed in “” which might not work in some > scenarios. Need to replace the “” in the examples to ‘’.
[jira] [Assigned] (CARBONDATA-801) [Documentation] Examples format to be fixed
[ https://issues.apache.org/jira/browse/CARBONDATA-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Chen reassigned CARBONDATA-801: - Assignee: Srinath Thota (was: Gururaj Shetty) > [Documentation] Examples format to be fixed > --- > > Key: CARBONDATA-801 > URL: https://issues.apache.org/jira/browse/CARBONDATA-801 > Project: CarbonData > Issue Type: Bug >Reporter: Gururaj Shetty >Assignee: Srinath Thota >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > Some examples provided in DDL are enclosed in “” which might not work in some > scenarios. Need to replace the “” in the examples to ‘’.
[GitHub] incubator-carbondata issue #672: [CARBONDATA-815] add hive integration for c...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/672 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1333/
[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/672#discussion_r108030714 --- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java --- @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.hive; + + +import java.io.IOException; +import java.util.Properties; + +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hive.ql.exec.FileSinkOperator; +import org.apache.hadoop.hive.ql.io.HiveOutputFormat; +import org.apache.hadoop.io.Writable; +import org.apache.hadoop.mapred.FileOutputFormat; +import org.apache.hadoop.mapred.JobConf; +import org.apache.hadoop.mapred.RecordWriter; +import org.apache.hadoop.util.Progressable; + + +public class MapredCarbonOutputFormat extends FileOutputFormat --- End diff -- MapredCarbonOutputFormat also needs to implement HiveOutputFormat.
[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/696 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1332/
[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...
Github user watermen commented on the issue: https://github.com/apache/incubator-carbondata/pull/696 @QiangCai The carbonindex now stores fileName instead of filePath. Please review it again.
[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/672#discussion_r108030455 --- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java --- @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.carbondata.hive; + +import java.io.IOException; +import java.util.List; + +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier; +import org.apache.carbondata.core.metadata.schema.table.CarbonTable; +import org.apache.carbondata.core.scan.expression.Expression; +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf; +import org.apache.carbondata.core.scan.model.CarbonQueryPlan; +import org.apache.carbondata.core.scan.model.QueryModel; +import org.apache.carbondata.hadoop.CarbonInputFormat; +import org.apache.carbondata.hadoop.CarbonInputSplit; +import org.apache.carbondata.hadoop.readsupport.CarbonReadSupport; +import org.apache.carbondata.hadoop.util.CarbonInputFormatUtil; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; +import org.apache.hadoop.io.ArrayWritable; +import org.apache.hadoop.mapred.InputFormat; +import org.apache.hadoop.mapred.InputSplit; +import org.apache.hadoop.mapred.JobConf; +import org.apache.hadoop.mapred.RecordReader; +import org.apache.hadoop.mapred.Reporter; +import org.apache.hadoop.mapreduce.Job; + + +public class MapredCarbonInputFormat extends CarbonInputFormat +implements InputFormat, CombineHiveInputFormat.AvoidSplitCombination { + + @Override + public InputSplit[] getSplits(JobConf jobConf, int numSplits) throws IOException { +org.apache.hadoop.mapreduce.JobContext jobContext = Job.getInstance(jobConf); +List splitList = super.getSplits(jobContext); --- End diff -- Are invalid segments only useful for CarbonMultiBlockSplit?
[jira] [Updated] (CARBONDATA-818) The file_name stored in carbonindex is wrong
[ https://issues.apache.org/jira/browse/CARBONDATA-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated CARBONDATA-818: - Description: The file_name stored in carbonindex is a local path that is used on the executor as a temp dir {code} /tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1490345609845.carbondata {code} But I think we want to store the actual carbondata path like {code} part-0-0_batchno0-0-1490345609845.carbondata {code} was: The file_name stored in carbonindex is a local path that is used on the executor as a temp dir {code} /tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1490345609845.carbondata {code} But I think we want to store the actual carbondata path like {code} Segment_0/part-0-0_batchno0-0-1490345609845.carbondata {code} > The file_name stored in carbonindex is wrong > > > Key: CARBONDATA-818 > URL: https://issues.apache.org/jira/browse/CARBONDATA-818 > Project: CarbonData > Issue Type: Bug >Reporter: Yadong Qi > Time Spent: 50m > Remaining Estimate: 0h > > The file_name stored in carbonindex is a local path that is used on the executor as > a temp dir > {code} > /tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1490345609845.carbondata > {code} > But I think we want to store the actual carbondata path like > {code} > part-0-0_batchno0-0-1490345609845.carbondata > {code}
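To make the intended result of CARBONDATA-818 concrete, here is a minimal sketch of reducing the executor's temporary path to the bare carbondata file name. It is an illustration only and not the code from pull request #696: the class CarbonIndexFileNameSketch and the method toFileName are hypothetical names, and the sketch assumes '/' is the path separator, as in the example path from the issue.

```java
// Hypothetical helper, not part of CarbonData: keeps only the last path
// segment so that the index would record a file name, not a temp-dir path.
final class CarbonIndexFileNameSketch {
  private CarbonIndexFileNameSketch() {}

  // Assumes '/' separates path segments; a path without '/' is returned as-is.
  static String toFileName(String filePath) {
    int lastSlash = filePath.lastIndexOf('/');
    return lastSlash < 0 ? filePath : filePath.substring(lastSlash + 1);
  }

  public static void main(String[] args) {
    String tempPath =
        "/tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/"
            + "part-0-0_batchno0-0-1490345609845.carbondata";
    // Prints part-0-0_batchno0-0-1490345609845.carbondata
    System.out.println(toFileName(tempPath));
  }
}
```

Storing only the file name keeps the index entry valid after the data file is moved from the executor's temp directory to its final segment location.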