[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

2017-03-25 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052532
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java ---
@@ -528,8 +528,7 @@ protected void fillBlockIndexInfoDetails(long numberOfRows, String filePath,
 org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex blockletIndex =
 new org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex(btree, minmax);
 BlockIndexInfo blockIndexInfo =
-new BlockIndexInfo(numberOfRows, filePath.substring(0, filePath.lastIndexOf('.')),
-currentPosition, blockletIndex);
+new BlockIndexInfo(numberOfRows, filePath, currentPosition, blockletIndex);
--- End diff --

Can you explain why this change was made?
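
For reference, the difference between the removed and added expressions in the diff above can be sketched as follows. This is only an illustration: the class and method names are hypothetical and not part of CarbonData, the sample path is the one quoted in CARBONDATA-818, and the guard for a missing '.' is an addition of this sketch.

```java
// Illustrative sketch only: class and method names are hypothetical, not CarbonData code.
final class BlockIndexPathDemo {

  // Mirrors the removed expression filePath.substring(0, filePath.lastIndexOf('.')).
  // The guard for a missing '.' is an addition for this sketch; without it,
  // substring(0, -1) would throw StringIndexOutOfBoundsException.
  static String stripExtension(String filePath) {
    int dot = filePath.lastIndexOf('.');
    return dot < 0 ? filePath : filePath.substring(0, dot);
  }

  public static void main(String[] args) {
    String filePath = "/tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/"
        + "part-0-0_batchno0-0-1490345609845.carbondata";
    // Value the old code passed to BlockIndexInfo: the path without its extension.
    System.out.println("before: " + stripExtension(filePath));
    // Value the new code passes: the string exactly as received.
    System.out.println("after:  " + filePath);
  }
}
```

In other words, the old code dropped the `.carbondata` suffix before storing the value, while the new code stores whatever string the caller supplies.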


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

2017-03-25 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052521
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v1/CarbonFactDataWriterImplV1.java ---
@@ -373,7 +373,7 @@ protected void writeBlockletInfoToFile(FileChannel channel, String filePath)
   FileFooter convertFileMeta = CarbonMetadataUtil
   .convertFileFooter(blockletInfoList, localCardinality.length, localCardinality,
   thriftColumnSchemaList, dataWriterVo.getSegmentProperties());
-  fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), filePath, currentPosition);
+  fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), carbonDataFileName, currentPosition);
--- End diff --

Please rename the parameter (filePath) of fillBlockIndexInfoDetails in
AbstractFactDataWriter.java to match this change.
For example:
 protected void fillBlockIndexInfoDetails(long numberOfRows,
String carbonDataFileName, long currentPosition)

Please update all other occurrences accordingly.




[jira] [Updated] (CARBONDATA-820) Redundant BitSet created in data load

2017-03-25 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-820:

Request participants:   (was: )
 Description: In 
CarbonFactDataHandlerColumnar.getMeasureNullValueIndexBitSet method

> Redundant BitSet created in data load
> -
>
> Key: CARBONDATA-820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-820
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.0.0-incubating
>Reporter: Jacky Li
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>
> In CarbonFactDataHandlerColumnar.getMeasureNullValueIndexBitSet method



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] incubator-carbondata pull request #698: [CARBONDATA-820] Remove redundant cr...

2017-03-25 Thread jackylk
GitHub user jackylk opened a pull request:

https://github.com/apache/incubator-carbondata/pull/698

[CARBONDATA-820] Remove redundant  creation

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata hotfix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/698.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #698


commit a430472d39b261b9fe85316a496ae27783d5b2bc
Author: jackylk 
Date:   2017-03-26T05:47:02Z

remove redundant object creation






[jira] [Created] (CARBONDATA-820) Redundant BitSet created in data load

2017-03-25 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-820:
---

 Summary: Redundant BitSet created in data load
 Key: CARBONDATA-820
 URL: https://issues.apache.org/jira/browse/CARBONDATA-820
 Project: CarbonData
  Issue Type: Bug
Affects Versions: 1.0.0-incubating
Reporter: Jacky Li
Priority: Minor
 Fix For: 1.1.0-incubating








[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

2017-03-25 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052363
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestDataLoadWithFileName.scala
 ---
@@ -0,0 +1,111 @@
+package org.apache.carbondata.spark.testsuite.dataload
--- End diff --

Please add license header




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

2017-03-25 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/696
  
Build Success with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1336/





[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

2017-03-25 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/incubator-carbondata/pull/696
  
retest this please




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

2017-03-25 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the issue:

https://github.com/apache/incubator-carbondata/pull/696
  
LGTM




[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Store one SegmentPr...

2017-03-25 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/659#discussion_r108033699
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentTaskIndex.java
 ---
@@ -16,30 +16,52 @@
  */
 package org.apache.carbondata.core.datastore.block;
 
+import java.util.HashMap;
 import java.util.List;
+import java.util.Map;
 
 import org.apache.carbondata.core.datastore.BTreeBuilderInfo;
 import org.apache.carbondata.core.datastore.BtreeBuilder;
 import org.apache.carbondata.core.datastore.impl.btree.BlockBTreeBuilder;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
 
 /**
  * Class which is responsible for loading the b+ tree block. This class 
will
  * persist all the detail of a table segment
  */
 public class SegmentTaskIndex extends AbstractIndex {
+  private static Map 
segmentPropertiesCached =
--- End diff --

why not use TableSegmentUniqueIdentifier?




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

2017-03-25 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/696
  
LGTM




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

2017-03-25 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/659
  
Build Success with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1335/





[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

2017-03-25 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/659
  
Build Failed  with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1334/





[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

2017-03-25 Thread watermen
Github user watermen commented on the issue:

https://github.com/apache/incubator-carbondata/pull/659
  
@jackylk @QiangCai I have updated the code to use the "store one 
SegmentProperties object per segment" solution.




[jira] [Resolved] (CARBONDATA-801) [Documentation] Examples format to be fixed

2017-03-25 Thread Liang Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-801.
---
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> [Documentation] Examples format to be fixed
> ---
>
> Key: CARBONDATA-801
> URL: https://issues.apache.org/jira/browse/CARBONDATA-801
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Gururaj Shetty
>Assignee: Srinath Thota
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Some DDL examples are enclosed in “”, which might not work in some 
> scenarios. The “” in the examples should be replaced with ‘’.





[jira] [Assigned] (CARBONDATA-801) [Documentation] Examples format to be fixed

2017-03-25 Thread Liang Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen reassigned CARBONDATA-801:
-

Assignee: Srinath Thota  (was: Gururaj Shetty)

> [Documentation] Examples format to be fixed
> ---
>
> Key: CARBONDATA-801
> URL: https://issues.apache.org/jira/browse/CARBONDATA-801
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Gururaj Shetty
>Assignee: Srinath Thota
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Some DDL examples are enclosed in “”, which might not work in some 
> scenarios. The “” in the examples should be replaced with ‘’.





[GitHub] incubator-carbondata issue #672: [CARBONDATA-815] add hive integration for c...

2017-03-25 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/672
  
Build Success with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1333/





[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-25 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r108030714
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
 ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+
+import java.io.IOException;
+import java.util.Properties;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapred.FileOutputFormat;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordWriter;
+import org.apache.hadoop.util.Progressable;
+
+
+public class MapredCarbonOutputFormat extends FileOutputFormat
--- End diff --

MapredCarbonOutputFormat also needs to implement HiveOutputFormat.




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

2017-03-25 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/696
  
Build Success with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1332/





[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

2017-03-25 Thread watermen
Github user watermen commented on the issue:

https://github.com/apache/incubator-carbondata/pull/696
  
@QiangCai The carbonindex now stores fileName instead of filePath. Please 
review it again.




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-25 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r108030455
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
 ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+import org.apache.carbondata.core.scan.model.CarbonQueryPlan;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import org.apache.carbondata.hadoop.CarbonInputFormat;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.readsupport.CarbonReadSupport;
+import org.apache.carbondata.hadoop.util.CarbonInputFormatUtil;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.mapred.InputFormat;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordReader;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.hadoop.mapreduce.Job;
+
+
+public class MapredCarbonInputFormat extends 
CarbonInputFormat
+implements InputFormat, 
CombineHiveInputFormat.AvoidSplitCombination {
+
+  @Override
+  public InputSplit[] getSplits(JobConf jobConf, int numSplits) throws 
IOException {
+org.apache.hadoop.mapreduce.JobContext jobContext = 
Job.getInstance(jobConf);
+List splitList = 
super.getSplits(jobContext);
--- End diff --

Are invalid segments only useful for CarbonMultiBlockSplit?




[jira] [Updated] (CARBONDATA-818) The file_name stored in carbonindex is wrong

2017-03-25 Thread Yadong Qi (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yadong Qi updated CARBONDATA-818:
-
Description: 
The file_name stored in carbonindex is a local path that is used on the executor 
as a temp dir 
{code}
/tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1490345609845.carbondata
{code}
But I think we want to store the actual carbondata path like
{code}
part-0-0_batchno0-0-1490345609845.carbondata
{code}

  was:
The file_name stored in carbonindex is a local path that is used on the executor 
as a temp dir 
{code}
/tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1490345609845.carbondata
{code}
But I think we want to store the actual carbondata path like
{code}
Segment_0/part-0-0_batchno0-0-1490345609845.carbondata
{code}


> The file_name stored in carbonindex is wrong
> 
>
> Key: CARBONDATA-818
> URL: https://issues.apache.org/jira/browse/CARBONDATA-818
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Yadong Qi
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The file_name stored in carbonindex is a local path that is used on the executor 
> as a temp dir 
> {code}
> /tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1490345609845.carbondata
> {code}
> But I think we want to store the actual carbondata path like
> {code}
> part-0-0_batchno0-0-1490345609845.carbondata
> {code}
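
The intended result described above, keeping only the carbondata file name rather than the executor-local temp path, could be sketched as follows. This is a minimal illustration, not the code from the pull request: the class and method names are hypothetical, and it assumes '/' is the path separator.

```java
// Illustrative sketch only: names are hypothetical and '/' is assumed as the separator.
final class CarbonIndexFileNameDemo {

  // Returns the part of the path after the last '/'.
  // If there is no '/', lastIndexOf returns -1 and the whole string is returned.
  static String fileNameOf(String path) {
    return path.substring(path.lastIndexOf('/') + 1);
  }

  public static void main(String[] args) {
    String localPath = "/tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/"
        + "part-0-0_batchno0-0-1490345609845.carbondata";
    System.out.println(fileNameOf(localPath));
  }
}
```

Storing only the file name keeps the index entry independent of the temp directory chosen on a given executor.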


