[jira] [Commented] (CARBONDATA-308) Use CarbonInputFormat in CarbonScanRDD compute

2016-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15627358#comment-15627358
 ] 

ASF GitHub Bot commented on CARBONDATA-308:
---

Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/262#discussion_r86058188
  
--- Diff: 
hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java ---
@@ -22,28 +22,44 @@
 import java.io.DataOutput;
 import java.io.IOException;
 import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.carbondata.core.carbon.datastore.block.BlockletInfos;
+import org.apache.carbondata.core.carbon.datastore.block.Distributable;
+import org.apache.carbondata.core.carbon.datastore.block.TableBlockInfo;
+import org.apache.carbondata.core.carbon.path.CarbonTablePath;
 
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.Writable;
 import org.apache.hadoop.mapreduce.lib.input.FileSplit;
 
+
 /**
  * Carbon input split to allow distributed read of CarbonInputFormat.
  */
-public class CarbonInputSplit extends FileSplit implements Serializable, 
Writable {
+public class CarbonInputSplit extends FileSplit implements Distributable, 
Serializable, Writable {
 
   private static final long serialVersionUID = 3520344046772190207L;
   private String segmentId;
-  /**
+  public String taskId = "0";
+
+  /*
* Number of BlockLets in a block
*/
   private int numberOfBlocklets = 0;
 
-  public CarbonInputSplit() {
-super(null, 0, 0, new String[0]);
+  public  CarbonInputSplit() {
   }
 
-  public CarbonInputSplit(String segmentId, Path path, long start, long 
length,
+  private void parserPath(Path path) {
+String[] nameParts = path.getName().split("-");
+if (nameParts != null && nameParts.length >= 3) {
+  this.taskId = nameParts[2];
+}
+  }
+
+  private CarbonInputSplit(String segmentId, Path path, long start, long 
length,
--- End diff --

please initialize taskId
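
One possible reading of this comment, sketched under assumptions: the task ID is derived from the third `-`-separated part of the file name, as in `parserPath` in the diff above, and falls back to an explicit default when the name has fewer parts. The class name `TaskIdSketch`, the constant `DEFAULT_TASK_ID`, and the sample file names are hypothetical and are not taken from the pull request.

```java
// Sketch only: mirrors the parserPath logic from the diff with an explicit default.
// TaskIdSketch, DEFAULT_TASK_ID and the sample file names are hypothetical.
final class TaskIdSketch {
  // Value used when the file name has fewer than three '-' separated parts.
  static final String DEFAULT_TASK_ID = "0";

  static String parseTaskId(String fileName) {
    // String.split never returns null, so only the length needs checking.
    String[] nameParts = fileName.split("-");
    return nameParts.length >= 3 ? nameParts[2] : DEFAULT_TASK_ID;
  }

  public static void main(String[] args) {
    System.out.println(parseTaskId("part-0-7-1478000000.carbondata")); // prints 7
    System.out.println(parseTaskId("data.csv"));                       // prints 0
  }
}
```

Returning the value instead of mutating a field makes it easy for every constructor to assign `taskId` exactly once.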


> Use CarbonInputFormat in CarbonScanRDD compute
> --
>
> Key: CARBONDATA-308
> URL: https://issues.apache.org/jira/browse/CARBONDATA-308
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Jacky Li
> Fix For: 0.2.0-incubating
>
>
> Take CarbonScanRDD as the target RDD and modify it as follows:
> 1. On the driver side, only getSplit is required, so only the filter 
> condition is needed. There is no need to create a full QueryModel object, 
> so creation of the QueryModel can move from the driver side to the executor side.
> 2. Use CarbonInputFormat.createRecordReader in CarbonScanRDD.compute instead 
> of using QueryExecutor directly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CARBONDATA-353) Update doc for dateformat option

2016-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15627280#comment-15627280
 ] 

ASF GitHub Bot commented on CARBONDATA-353:
---

Github user lion-x commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/272#discussion_r86058866
  
--- Diff: docs/DML-Operations-on-Carbon.md ---
@@ -91,12 +91,17 @@ Following are the options that can be used in load data:
 ```ruby
 OPTIONS('ALL_DICTIONARY_PATH'='/opt/alldictionary/data.dictionary')
 ```
-- **COLUMNDICT:** dictionary file path for single column.
+- **COLUMNDICT:** Dictionary file path for each column.
 
 ```ruby
 OPTIONS('COLUMNDICT'='column1:dictionaryFilePath1, 
column2:dictionaryFilePath2')
 ```
 Note: ALL_DICTIONARY_PATH and COLUMNDICT can't be used together.
+- **DATEFORMAT:** Date format for each column.
+
+```ruby
+OPTIONS('DATEFORMAT'='column1:dateFormat1, column2:dateFormat2')
--- End diff --

I added a note that refers to the Java SimpleDateFormat class documentation, 
which provides more details.
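
For readers unfamiliar with `java.text.SimpleDateFormat`, the following is a minimal sketch of how a pattern string drives parsing and formatting. The patterns and sample values are illustrative assumptions; which patterns CarbonData accepts for DATEFORMAT is not stated in this thread.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch only: shows SimpleDateFormat patterns in isolation, not CarbonData's loader.
final class DateFormatSketch {
  // Parses value with the given pattern, rejecting impossible dates such as month 13.
  static Date parseStrict(String pattern, String value) {
    SimpleDateFormat format = new SimpleDateFormat(pattern);
    format.setLenient(false);
    try {
      return format.parse(value);
    } catch (ParseException e) {
      throw new IllegalArgumentException("Value does not match pattern " + pattern, e);
    }
  }

  // Re-renders a date string from one pattern into another.
  static String reformat(String inPattern, String outPattern, String value) {
    return new SimpleDateFormat(outPattern).format(parseStrict(inPattern, value));
  }

  public static void main(String[] args) {
    System.out.println(reformat("yyyy/MM/dd", "yyyy-MM-dd", "2016/11/01")); // prints 2016-11-01
  }
}
```

Lenient parsing is switched off here because the default lenient mode silently rolls an out-of-range field over into the next unit instead of reporting an error.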


> Update doc for dateformat option
> 
>
> Key: CARBONDATA-353
> URL: https://issues.apache.org/jira/browse/CARBONDATA-353
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Lionx
>Assignee: Lionx
>Priority: Minor
>
> Update doc for dateformat option





[jira] [Created] (CARBONDATA-355) Remove unnecessary method argument columnIdentifier of PathService.getCarbonTablePath

2016-11-01 Thread He Xiaoqiao (JIRA)
He Xiaoqiao created CARBONDATA-355:
--

 Summary: Remove unnecessary method argument columnIdentifier of 
PathService.getCarbonTablePath
 Key: CARBONDATA-355
 URL: https://issues.apache.org/jira/browse/CARBONDATA-355
 Project: CarbonData
  Issue Type: Improvement
  Components: core
Affects Versions: 0.2.0-incubating
Reporter: He Xiaoqiao
Assignee: He Xiaoqiao
Priority: Minor


Remove one of the method arguments of PathService#getCarbonTablePath, since it 
is not necessary to pass columnIdentifier when getting the table path.





[jira] [Commented] (CARBONDATA-353) Update doc for dateformat option

2016-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15625795#comment-15625795
 ] 

ASF GitHub Bot commented on CARBONDATA-353:
---

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/272#discussion_r85959032
  
--- Diff: docs/DML-Operations-on-Carbon.md ---
@@ -91,12 +91,17 @@ Following are the options that can be used in load data:
 ```ruby
 OPTIONS('ALL_DICTIONARY_PATH'='/opt/alldictionary/data.dictionary')
 ```
-- **COLUMNDICT:** dictionary file path for single column.
+- **COLUMNDICT:** Dictionary file path for each column.
 
 ```ruby
 OPTIONS('COLUMNDICT'='column1:dictionaryFilePath1, 
column2:dictionaryFilePath2')
 ```
 Note: ALL_DICTIONARY_PATH and COLUMNDICT can't be used together.
+- **DATEFORMAT:** Date format for each column.
+
+```ruby
+OPTIONS('DATEFORMAT'='column1:dateFormat1, column2:dateFormat2')
--- End diff --

Please give an example of the date format.
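
To make the shape of the option value concrete, here is a hedged sketch that splits a `column:pattern` list into a map. It is not CarbonData's parser: the class name, the column names, and the rule "split each entry at its first colon, with no commas allowed inside a pattern" are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a hypothetical parser for a value shaped like
// 'column1:dateFormat1, column2:dateFormat2'. Not CarbonData's implementation.
final class DateFormatOptionSketch {
  // Assumes patterns contain no commas; each entry is split at its FIRST colon
  // so that patterns such as HH:mm:ss survive intact.
  static Map<String, String> parse(String option) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String entry : option.split(",")) {
      int colon = entry.indexOf(':');
      if (colon < 0) {
        throw new IllegalArgumentException("Missing ':' in entry: " + entry);
      }
      result.put(entry.substring(0, colon).trim(), entry.substring(colon + 1).trim());
    }
    return result;
  }

  public static void main(String[] args) {
    // Column names and patterns below are made up for the example.
    System.out.println(parse("date1:yyyy/MM/dd, time1:yyyy-MM-dd HH:mm:ss"));
  }
}
```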


> Update doc for dateformat option
> 
>
> Key: CARBONDATA-353
> URL: https://issues.apache.org/jira/browse/CARBONDATA-353
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Lionx
>Assignee: Lionx
>Priority: Minor
>
> Update doc for dateformat option





[jira] [Commented] (CARBONDATA-276) Add trim option

2016-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15625774#comment-15625774
 ] 

ASF GitHub Bot commented on CARBONDATA-276:
---

Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/200#discussion_r85957803
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/CarbonCSVBasedSeqGenStep.java
 ---
@@ -472,6 +475,7 @@ public boolean processRow(StepMetaInterface smi, 
StepDataInterface sdi) throws K
   break;
   }
 }
+<<< HEAD
--- End diff --

Does this file have a merge conflict?


> Add trim option
> ---
>
> Key: CARBONDATA-276
> URL: https://issues.apache.org/jira/browse/CARBONDATA-276
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Lionx
>Assignee: Lionx
>Priority: Minor
>
> Fix a bug and add trim option.
> Bug: When a string contains leading or trailing whitespace, the query 
> result is null. This is because the dictionary ignores leading and 
> trailing whitespace and the csvInput does not.
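
The mismatch described above can be reproduced in miniature. This is a sketch under assumptions: a plain `HashMap` stands in for the dictionary and a raw string stands in for the CSV field, and none of the names below come from the CarbonData code base.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a HashMap stands in for the dictionary, a raw String for the CSV field.
final class TrimMismatchSketch {
  // Builds a value -> surrogate key map, trimming each value as the dictionary is said to do.
  static Map<String, Integer> buildDictionary(String... values) {
    Map<String, Integer> dictionary = new HashMap<>();
    for (String value : values) {
      dictionary.putIfAbsent(value.trim(), dictionary.size() + 1);
    }
    return dictionary;
  }

  // Looks up a raw field; returns null when the key is absent.
  static Integer lookup(Map<String, Integer> dictionary, String rawField, boolean trim) {
    return dictionary.get(trim ? rawField.trim() : rawField);
  }

  public static void main(String[] args) {
    Map<String, Integer> dictionary = buildDictionary(" apple ", "pear");
    System.out.println(lookup(dictionary, " apple ", false)); // prints null
    System.out.println(lookup(dictionary, " apple ", true));  // prints 1
  }
}
```

Applying the same trimming rule on both sides, which is what a trim option would allow, is what makes the lookup succeed.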





[jira] [Commented] (CARBONDATA-276) Add trim option

2016-11-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15625769#comment-15625769
 ] 

ASF GitHub Bot commented on CARBONDATA-276:
---

Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/200#discussion_r85957411
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/CarbonCSVBasedSeqGenMeta.java
 ---
@@ -1694,5 +1699,19 @@ public void setTableOption(String tableOption) {
   public TableOptionWrapper getTableOptionWrapper() {
 return tableOptionWrapper;
   }
+
+  public String getIsUseTrim() {
+return isUseTrim;
+  }
+
+  public void setIsUseTrim(Boolean[] isUseTrim) {
+for (Boolean flag: isUseTrim) {
+  if (flag) {
+this.isUseTrim += "T";
--- End diff --

Use TRUE/FALSE for better readability.
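
A sketch of what this suggestion could look like, under assumptions: the flags are written as the words TRUE and FALSE joined by commas. The comma delimiter, the class name, and the `decode` helper are hypothetical; the diff above is truncated, so the actual encoding in the pull request is not known.

```java
import java.util.StringJoiner;

// Sketch only: one hypothetical way to encode per-column trim flags readably.
// The comma delimiter and both method names are assumptions.
final class TrimFlagsSketch {
  // Builds a fresh string on every call, so repeated calls cannot accumulate,
  // and treats a null flag as FALSE instead of failing on unboxing.
  static String encode(Boolean[] flags) {
    StringJoiner joiner = new StringJoiner(",");
    for (Boolean flag : flags) {
      joiner.add(Boolean.TRUE.equals(flag) ? "TRUE" : "FALSE");
    }
    return joiner.toString();
  }

  // Inverse of encode; Boolean.parseBoolean is case-insensitive.
  static boolean[] decode(String encoded) {
    if (encoded.isEmpty()) {
      return new boolean[0];
    }
    String[] parts = encoded.split(",");
    boolean[] flags = new boolean[parts.length];
    for (int i = 0; i < parts.length; i++) {
      flags[i] = Boolean.parseBoolean(parts[i].trim());
    }
    return flags;
  }

  public static void main(String[] args) {
    System.out.println(encode(new Boolean[] {true, false, null})); // prints TRUE,FALSE,FALSE
  }
}
```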


> Add trim option
> ---
>
> Key: CARBONDATA-276
> URL: https://issues.apache.org/jira/browse/CARBONDATA-276
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Lionx
>Assignee: Lionx
>Priority: Minor
>
> Fix a bug and add trim option.
> Bug: When a string contains leading or trailing whitespace, the query 
> result is null. This is because the dictionary ignores leading and 
> trailing whitespace and the csvInput does not.





[jira] [Created] (CARBONDATA-354) Query execute successfully even not argument given in count function

2016-11-01 Thread Prabhat Kashyap (JIRA)
Prabhat Kashyap created CARBONDATA-354:
--

 Summary: Query execute successfully even not argument given in 
count function
 Key: CARBONDATA-354
 URL: https://issues.apache.org/jira/browse/CARBONDATA-354
 Project: CarbonData
  Issue Type: Bug
Reporter: Prabhat Kashyap
Priority: Minor


When I execute the following command:
select count() from tableName;

it reports no error and executes successfully, but the same query in Hive 
fails with the following exception:

FAILED: UDFArgumentException Argument expected


