date:20161010

[jira] [Created] (CARBONDATA-294) Timestamp datatype Error

2016-10-10 Thread Lionx (JIRA)

Lionx created CARBONDATA-294:


 Summary: Timestamp datatype Error
 Key: CARBONDATA-294
 URL: https://issues.apache.org/jira/browse/CARBONDATA-294
 Project: CarbonData
  Issue Type: Bug
Reporter: Lionx
Assignee: Lionx
Priority: Critical


In CarbonExample, when 2015/7/23 is loaded as a Timestamp, querying it returns
2015-01-23 xx:xx:xx:xx. Six months have been stolen.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: Discussion regrading design of data load after kettle removal.

2016-10-10 Thread Jacky Li

Hi Ravindra,

It seems the picture is missing, can you post it in a URL and share the
link?

Regards,
Jacky



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Discussion-regrading-design-of-data-load-after-kettle-removal-tp1672p1725.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.

[GitHub] incubator-carbondata pull request #225: Abstract Snappy interface and sepera...

2016-10-10 Thread Zhangshunyu

GitHub user Zhangshunyu opened a pull request:

https://github.com/apache/incubator-carbondata/pull/225

Abstract Snappy interface and seperate it from Compressor interface

## Why raise this pr?
Currently, the Snappy compressor is the only one that extends the Compressor 
interface. For future expansion, we need to abstract a Snappy interface and 
separate it from the Compressor interface. That is, `the Compressor interface 
is the parent of all compressors; SnappyCompressor and every other compressor's 
interface should extend the Compressor interface; and each data type handled by 
a given compressor would extend that compressor's own interface.`
## How to test?
Pass all the test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Zhangshunyu/incubator-carbondata 
compress_interface

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/225.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #225


commit 98536737d786c40192d197a3af3e52254949d4fd
Author: Zhangshunyu 
Date:   2016-10-10T09:17:31Z

Abstract Snappy interface and seperate it from Compressor interface




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Re: Discussion regrading design of data load after kettle removal.

2016-10-10 Thread Jacky Li

Hi Ravindra,

I have following questions:

1. How does the DataLoadProcessorStep interface work? Does each step call
its child step to execute and then apply its own logic to the iterator the
child returns? And how does it map to OutputFormat in the Hadoop interface?

2. This step interface relies on an iterator to do the encoding row by row.
Will it be convenient to add batch encoder support, now or later?

3. For the dictionary part, besides the generator, I think it is better to also
consider the interface for reading the dictionary while querying. Are
you planning to use the same interface? If so, it is not just a Generator.
If the dictionary interface is well designed, other developers can also add
new dictionary types. For example:
- a dictionary that assigns values based on usage frequency, for better
compression, similar to Huffman encoding
- an order-preserving dictionary, which can apply range filters on dictionary
values directly
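
To make the order-preserving idea concrete, here is a minimal sketch. The class name `OrderPreservingDictionary` and its methods are hypothetical and are not part of any CarbonData interface discussed in this thread. It assumes the full set of distinct values is known up front, so that keys can be assigned in sorted order; a range filter on values then becomes a range filter on integer keys.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical illustration only: keys are positions in the sorted value array,
// so comparing two keys gives the same result as comparing the two values.
public class OrderPreservingDictionary {
  private final String[] sortedValues;

  public OrderPreservingDictionary(Collection<String> values) {
    // TreeSet removes duplicates and sorts in natural String order.
    this.sortedValues = new TreeSet<>(values).toArray(new String[0]);
  }

  /** Returns the key of an existing value, or a negative number if absent. */
  public int keyOf(String value) {
    return Arrays.binarySearch(sortedValues, value);
  }

  /** Smallest key whose value is >= bound (may equal size if none). */
  public int lowerKey(String bound) {
    int idx = Arrays.binarySearch(sortedValues, bound);
    return idx >= 0 ? idx : -idx - 1;
  }

  /** Largest key whose value is <= bound, or -1 if none. */
  public int upperKey(String bound) {
    int idx = Arrays.binarySearch(sortedValues, bound);
    return idx >= 0 ? idx : -idx - 2;
  }

  public static void main(String[] args) {
    OrderPreservingDictionary dict = new OrderPreservingDictionary(
        Arrays.asList("delhi", "austin", "cairo", "berlin", "austin"));
    // The filter "b" <= value <= "cz" becomes a filter on keys 1..2.
    System.out.println(dict.lowerKey("b") + ".." + dict.upperKey("cz"));
  }
}
```

A frequency-based dictionary would trade this ordering property for smaller keys on hot values, which is why the two are listed as separate dictionary types.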

Regards,
Jacky



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Discussion-regrading-design-of-data-load-after-kettle-removal-tp1672p1726.html

[jira] [Created] (CARBONDATA-295) Abstract Snappy interface and seperate it from Compressor interface

2016-10-10 Thread zhangshunyu (JIRA)

zhangshunyu created CARBONDATA-295:
--

 Summary: Abstract Snappy interface and seperate it from Compressor 
interface
 Key: CARBONDATA-295
 URL: https://issues.apache.org/jira/browse/CARBONDATA-295
 Project: CarbonData
  Issue Type: Improvement
  Components: data-load
Affects Versions: 0.1.1-incubating
Reporter: zhangshunyu
Assignee: zhangshunyu
Priority: Minor
 Fix For: 0.2.0-incubating


Currently, the Snappy compressor is the only one that extends the Compressor 
interface. For future expansion, we need to abstract a Snappy interface and 
separate it from the Compressor interface. That is, the Compressor interface is 
the parent of all compressors; the SnappyCompressor interface and every other 
compressor's interface (or abstract class) should extend the Compressor 
interface; and each data type handled by a given compressor would extend that 
compressor's own interface or abstract class.
For example: Compressor -> SnappyCompressor -> SnappyDoubleCompression.
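
The proposed layering could look roughly like the sketch below. Only the three type names come from the issue text; the generic parameter, the method names `compress`/`uncompress`/`codecName`, and the wrapper class are assumptions made for illustration. The leaf class does not call Snappy at all: it merely packs doubles into bytes as a stand-in so the example runs without external libraries.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class CompressorHierarchySketch {
  /** Parent of all compressors (method names are assumed, not from the PR). */
  public interface Compressor<T> {
    byte[] compress(T input);
    T uncompress(byte[] input);
  }

  /** Codec-level interface; another codec would add a sibling interface here. */
  public interface SnappyCompressor<T> extends Compressor<T> {
    default String codecName() {
      return "snappy";
    }
  }

  /** Data-type-level leaf. Placeholder body: packs doubles, no real compression. */
  public static class SnappyDoubleCompression implements SnappyCompressor<double[]> {
    @Override public byte[] compress(double[] input) {
      ByteBuffer buf = ByteBuffer.allocate(input.length * Double.BYTES);
      for (double d : input) {
        buf.putDouble(d);
      }
      return buf.array();
    }

    @Override public double[] uncompress(byte[] input) {
      ByteBuffer buf = ByteBuffer.wrap(input);
      double[] out = new double[input.length / Double.BYTES];
      for (int i = 0; i < out.length; i++) {
        out[i] = buf.getDouble();
      }
      return out;
    }
  }

  public static void main(String[] args) {
    // Callers depend only on the top-level interface.
    Compressor<double[]> compressor = new SnappyDoubleCompression();
    double[] roundTrip = compressor.uncompress(compressor.compress(new double[] {1.5, -2.0}));
    System.out.println(Arrays.toString(roundTrip));
  }
}
```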




[GitHub] incubator-carbondata pull request #226: [CARBONDATA-294]Fix timestamp data e...

2016-10-10 Thread lion-x

GitHub user lion-x opened a pull request:

https://github.com/apache/incubator-carbondata/pull/226

[CARBONDATA-294]Fix timestamp data error

# Why raise this PR?
In some examples and test cases, 
**CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT** is assigned a wrong timestamp 
format, "yyyy/mm/dd". In this pattern "mm" means minutes, not month, so the 
month falls back to its default value, 1.

For example, 2015/07/23 is read as 2015/01/23 00:07:xx.xxx.

The right timestamp format is yyyy/MM/dd. This PR fixes the wrong usages 
in some example files and test case files. 
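
The effect can be reproduced with plain `java.text.SimpleDateFormat`, independent of CarbonData. The sketch below assumes the two patterns in question are `yyyy/mm/dd` and `yyyy/MM/dd`; the class and method names are made up for the demonstration, and UTC plus `Locale.US` are fixed only so the printed result does not depend on the machine.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class TimestampPatternDemo {
  /** Parses value with the given pattern and prints it back in a fixed layout. */
  public static String reparse(String pattern, String value) throws ParseException {
    SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.US);
    parser.setTimeZone(TimeZone.getTimeZone("UTC"));
    SimpleDateFormat printer = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
    printer.setTimeZone(TimeZone.getTimeZone("UTC"));
    Date parsed = parser.parse(value);
    return printer.format(parsed);
  }

  public static void main(String[] args) throws ParseException {
    // Lowercase "mm" is minute-of-hour: "07" becomes 7 minutes, month stays January.
    System.out.println(reparse("yyyy/mm/dd", "2015/07/23")); // 2015-01-23 00:07:00
    // Uppercase "MM" is month-of-year, which is what the data means.
    System.out.println(reparse("yyyy/MM/dd", "2015/07/23")); // 2015-07-23 00:00:00
  }
}
```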



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lion-x/incubator-carbondata timeError

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #226


commit 4ae904f881766ab0990132e8be6fc6d7cfaf72a8
Author: lion-x 
Date:   2016-10-10T11:17:35Z

Fixtimestamperror





Re: Discussion regrading design of data load after kettle removal.

2016-10-10 Thread Ravindra Pesala

Hi Jacky,

https://drive.google.com/open?id=0B4TWTVbFSTnqeElyWko5NDlBZkdxS3NrMW1PZndzMG5ZM2Y0


1. Yes, it calls the child step to execute and applies its logic to the
returned iterator, just like Spark SQL. For CarbonOutputFormat it will use
RecordBufferedWriterIterator and collect the data in batches.
https://drive.google.com/open?id=0B4TWTVbFSTnqTF85anlDOUQ5S1BqYzFpLWcwZnBLSVVqSWpj

2. Yes, this interface relies on processing row by row, but we can also
execute in batches within the iterator.

3. Yes, the dictionary interface is used for reading the dictionary while
querying. I have added this interface based on my current understanding; we
can discuss it further and update the interface.
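
The child-iterator pattern described in point 1 might be sketched as follows. Only the name `DataLoadProcessorStep` comes from this thread; the `execute()` signature, the `Object[]` row type, and the two example steps are assumptions for illustration and may differ from the actual design document.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

public class StepChainSketch {
  /** Assumed shape: each step returns an iterator over its output rows. */
  public interface DataLoadProcessorStep {
    Iterator<Object[]> execute();
  }

  /** Leaf step: serves rows from memory, standing in for an input reader. */
  public static class InputStep implements DataLoadProcessorStep {
    private final List<Object[]> rows;

    public InputStep(List<Object[]> rows) {
      this.rows = rows;
    }

    @Override public Iterator<Object[]> execute() {
      return rows.iterator();
    }
  }

  /** Wraps the child's iterator and transforms each row lazily as it is pulled. */
  public static class UpperCaseStep implements DataLoadProcessorStep {
    private final DataLoadProcessorStep child;

    public UpperCaseStep(DataLoadProcessorStep child) {
      this.child = child;
    }

    @Override public Iterator<Object[]> execute() {
      final Iterator<Object[]> in = child.execute();
      return new Iterator<Object[]>() {
        @Override public boolean hasNext() {
          return in.hasNext();
        }

        @Override public Object[] next() {
          Object[] row = in.next().clone();
          row[0] = row[0].toString().toUpperCase(Locale.ROOT);
          return row;
        }
      };
    }
  }

  public static void main(String[] args) {
    List<Object[]> rows = Arrays.<Object[]>asList(new Object[] {"ab", 1}, new Object[] {"cd", 2});
    Iterator<Object[]> out = new UpperCaseStep(new InputStep(rows)).execute();
    while (out.hasNext()) {
      System.out.println(Arrays.toString(out.next()));
    }
  }
}
```

Because nothing runs until the final consumer pulls from the outermost iterator, a step could also buffer several rows internally and process them as a batch without changing this interface.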


Regards,
Ravi

On 10 October 2016 at 14:56, Jacky Li  wrote:

> Hi Ravindra,
>
> I have following questions:
>
> 1. How does the DataLoadProcessorStep interface work? Does each step call
> its child step to execute and then apply its own logic to the iterator the
> child returns? And how does it map to OutputFormat in the Hadoop interface?
>
> 2. This step interface relies on an iterator to do the encoding row by row.
> Will it be convenient to add batch encoder support, now or later?
>
> 3. For the dictionary part, besides the generator, I think it is better to
> also consider the interface for reading the dictionary while querying. Are
> you planning to use the same interface? If so, it is not just a Generator.
> If the dictionary interface is well designed, other developers can also add
> new dictionary types. For example:
> - a dictionary that assigns values based on usage frequency, for better
> compression, similar to Huffman encoding
> - an order-preserving dictionary, which can apply range filters on
> dictionary values directly
>
> Regards,
> Jacky



-- 
Thanks & Regards,
Ravi

RE: Discussion about using multi local directorys to improve dataloading perfomance

2016-10-10 Thread Jihong Ma

Agreed, this will help boost performance.

Jenny

-Original Message-
From: Jacky Li [mailto:jacky.li...@qq.com] 
Sent: Saturday, October 08, 2016 9:09 AM
To: dev@carbondata.incubator.apache.org
Subject: Re: Discussion about using multi local directorys to improve 
dataloading perfomance

Yes, I think it is a good feature to have. Please feel free to create JIRA 
issue and Pull Request. 

Regards,
Jacky

> On October 9, 2016, at 12:04 AM, caiqiang wrote:
> 
> Hi All,
>  For each data load, we write the sorted temp files into a single local 
> directory (a different one per load). I think this is a bottleneck of data 
> loading. It is necessary to use multiple local directories on multiple disks 
> for each data load to improve data loading performance.
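
One simple way to spread sort temp files over several disks is round-robin selection, sketched below. The class `TempDirRotator` and the directory paths are hypothetical, and the thread does not say which selection policy the eventual implementation would use.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: hands out configured local directories in rotation so
// that consecutive sort temp files land on different disks.
public class TempDirRotator {
  private final String[] dirs;
  private final AtomicInteger next = new AtomicInteger();

  public TempDirRotator(String... dirs) {
    if (dirs.length == 0) {
      throw new IllegalArgumentException("need at least one directory");
    }
    this.dirs = dirs.clone();
  }

  /** Returns the directory for the next temp file; safe to call from several threads. */
  public String nextDir() {
    // floorMod keeps the index valid even after the counter overflows.
    return dirs[Math.floorMod(next.getAndIncrement(), dirs.length)];
  }

  public static void main(String[] args) {
    TempDirRotator rotator = new TempDirRotator("/data1/tmp", "/data2/tmp");
    for (int i = 0; i < 3; i++) {
      System.out.println(rotator.nextDir());
    }
  }
}
```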

[GitHub] incubator-carbondata pull request #225: [CARBONDATA-295]Abstract Compressor ...

2016-10-10 Thread Zhangshunyu

Github user Zhangshunyu closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/225



[jira] [Created] (CARBONDATA-296) 1.Add CSVInputFormat to read csv files.

2016-10-10 Thread Ravindra Pesala (JIRA)

Ravindra Pesala created CARBONDATA-296:
--

 Summary: 1.Add CSVInputFormat to read csv files.
 Key: CARBONDATA-296
 URL: https://issues.apache.org/jira/browse/CARBONDATA-296
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add CSVInputFormat to read CSV files. It should use the Univocity parser to 
read CSV files for optimal performance. 




[jira] [Created] (CARBONDATA-297) 2. Add interfaces for data loading.

2016-10-10 Thread Ravindra Pesala (JIRA)

Ravindra Pesala created CARBONDATA-297:
--

 Summary: 2. Add interfaces for data loading.
 Key: CARBONDATA-297
 URL: https://issues.apache.org/jira/browse/CARBONDATA-297
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add the major interface classes for data loading so that the following JIRAs 
can use these interfaces in their implementations.




[jira] [Created] (CARBONDATA-298) 3. Add InputProcessorStep which should iterate recordreader and parse the data as per the data type.

2016-10-10 Thread Ravindra Pesala (JIRA)

Ravindra Pesala created CARBONDATA-298:
--

 Summary: 3. Add InputProcessorStep which should iterate 
recordreader and parse the data as per the data type.
 Key: CARBONDATA-298
 URL: https://issues.apache.org/jira/browse/CARBONDATA-298
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add InputProcessorStep, which should iterate over the 
recordreader/RecordBufferedWriter and parse the data according to the data types.




[jira] [Created] (CARBONDATA-299) 4. Add dictionary generator interfaces and give implementation for pre created dictionary.

2016-10-10 Thread Ravindra Pesala (JIRA)

Ravindra Pesala created CARBONDATA-299:
--

 Summary: 4. Add dictionary generator interfaces and give 
implementation for pre created dictionary.
 Key: CARBONDATA-299
 URL: https://issues.apache.org/jira/browse/CARBONDATA-299
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add dictionary generator interfaces and provide an implementation for a 
pre-created dictionary (one that is generated separately).





[jira] [Created] (CARBONDATA-300) 5. Add EncodeProcessorStep which encodes the data with dictionary.

2016-10-10 Thread Ravindra Pesala (JIRA)

Ravindra Pesala created CARBONDATA-300:
--

 Summary: 5. Add EncodeProcessorStep which encodes the data with 
dictionary.
 Key: CARBONDATA-300
 URL: https://issues.apache.org/jira/browse/CARBONDATA-300
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add EncodeProcessorStep, which encodes the data with the dictionary. This 
dictionary can be obtained from the dictionary interface.




[jira] [Created] (CARBONDATA-301) 6. Add SortProcessorStep which sorts the data as per dimension order and write the sorted files to temp location.

2016-10-10 Thread Ravindra Pesala (JIRA)

Ravindra Pesala created CARBONDATA-301:
--

 Summary: 6. Add SortProcessorStep which sorts the data as per 
dimension order and write the sorted files to temp location.
 Key: CARBONDATA-301
 URL: https://issues.apache.org/jira/browse/CARBONDATA-301
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add SortProcessorStep, which sorts the data in dimension order and writes the 
sorted files to a temp location.




[jira] [Created] (CARBONDATA-302) 7. Add DataWriterProcessorStep which reads the data from sort temp files and creates carbondata files.

2016-10-10 Thread Ravindra Pesala (JIRA)

Ravindra Pesala created CARBONDATA-302:
--

 Summary: 7. Add DataWriterProcessorStep which reads the data from 
sort temp files and creates carbondata files.
 Key: CARBONDATA-302
 URL: https://issues.apache.org/jira/browse/CARBONDATA-302
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add DataWriterProcessorStep, which reads the data from the sort temp files, 
merge-sorts it, applies the mdk generator to the key, and creates carbondata files.
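
The merge-sort part of this step is a k-way merge of already-sorted runs. The sketch below shows the general technique with a priority queue; it uses in-memory integer lists in place of sort temp files, and the class `SortedRunMerger` is a hypothetical name, not the class the sub-task will add.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class SortedRunMerger {
  /** One heap entry: the current smallest unread value of a run, plus the rest of it. */
  private static final class Head {
    final int value;
    final Iterator<Integer> rest;

    Head(int value, Iterator<Integer> rest) {
      this.value = value;
      this.rest = rest;
    }
  }

  /** Merges runs that are each sorted ascending into one ascending list. */
  public static List<Integer> merge(List<List<Integer>> sortedRuns) {
    PriorityQueue<Head> heap =
        new PriorityQueue<>((a, b) -> Integer.compare(a.value, b.value));
    for (List<Integer> run : sortedRuns) {
      Iterator<Integer> it = run.iterator();
      if (it.hasNext()) {
        heap.add(new Head(it.next(), it));
      }
    }
    List<Integer> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      // Emit the globally smallest head, then refill from the same run.
      Head smallest = heap.poll();
      out.add(smallest.value);
      if (smallest.rest.hasNext()) {
        heap.add(new Head(smallest.rest.next(), smallest.rest));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<Integer>> runs = Arrays.asList(
        Arrays.asList(1, 4, 9), Arrays.asList(2, 3, 10), Arrays.asList(5));
    System.out.println(merge(runs));
  }
}
```

Only one value per run is held in the heap at a time, which is what makes the approach suitable for runs that live on disk.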




[jira] [Created] (CARBONDATA-303) 8. Add CarbonTableOutpuFormat to write data to carbon.

2016-10-10 Thread Ravindra Pesala (JIRA)

Ravindra Pesala created CARBONDATA-303:
--

 Summary: 8. Add CarbonTableOutpuFormat to write data to carbon.
 Key: CARBONDATA-303
 URL: https://issues.apache.org/jira/browse/CARBONDATA-303
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Add CarbonTableOutpuFormat to write data to carbon. It should use 
DataProcessorStep interface to load the data.




[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-10 Thread Zhangshunyu

Github user Zhangshunyu commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r82720156
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/CarbonCSVBasedSeqGenStep.java
 ---
@@ -1171,6 +1171,14 @@ else if(isComplexTypeColumn[j]) {
   DirectDictionaryGenerator directDictionaryGenerator1 =
   DirectDictionaryKeyGeneratorFactory
   
.getDirectDictionaryGenerator(details.getColumnType());
+  String[] timeformats = meta.timeFormat.split(",");
+  for(String timeformat:timeformats){
--- End diff --

Style: a space is needed, 'for(' => 'for (', and likewise '){' => ') {'. 
The same applies in some other places.



[GitHub] incubator-carbondata pull request #215: [WIP][CARBONDATA-2] Remove kettle fr...

2016-10-10 Thread ravipesala

Github user ravipesala closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/215



[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-10 Thread QiangCai

Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r82719262
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/TimeStampDirectDictionaryGenerator.java
 ---
@@ -117,15 +117,24 @@ private TimeStampDirectDictionaryGenerator() {
* @return dictionary value
*/
   @Override public int generateDirectSurrogateKey(String memberStr) {
-SimpleDateFormat timeParser = new 
SimpleDateFormat(CarbonProperties.getInstance()
-.getProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
-CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT));
+String timeString;
+String formatString;
+if (memberStr.contains(CarbonCommonConstants.COLON_SPC_CHARACTER)){
--- End diff --

Why would the data contain COLON_SPC_CHARACTER?



[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-10 Thread QiangCai

Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r82720651
  
--- Diff: 
hadoop/src/test/java/org/apache/carbondata/hadoop/test/util/StoreCreator.java 
---
@@ -356,6 +356,7 @@ public static void executeGraph(LoadModel loadModel, 
String storeLocation, Strin
 schmaModel.setEscapeCharacter("\\");
 schmaModel.setQuoteCharacter("\"");
 schmaModel.setCommentCharacter("#");
+
schmaModel.setTimeFormat(CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
--- End diff --

No need to modify this file



[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-10 Thread QiangCai

Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r82720457
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/CarbonCSVBasedSeqGenStep.java
 ---
@@ -1171,6 +1171,14 @@ else if(isComplexTypeColumn[j]) {
   DirectDictionaryGenerator directDictionaryGenerator1 =
   DirectDictionaryKeyGeneratorFactory
   
.getDirectDictionaryGenerator(details.getColumnType());
--- End diff --

If the column type is TimeStamp, please provide the date format to the 
KeyGenerator. It would be better to provide a different key generator for each 
date format.



[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-10 Thread QiangCai

Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r82719466
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/TimeStampDirectDictionaryGenerator.java
 ---
@@ -117,15 +117,24 @@ private TimeStampDirectDictionaryGenerator() {
* @return dictionary value
*/
   @Override public int generateDirectSurrogateKey(String memberStr) {
-SimpleDateFormat timeParser = new 
SimpleDateFormat(CarbonProperties.getInstance()
-.getProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
-CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT));
+String timeString;
--- End diff --

Please use the word "date" instead of "time".



[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-10 Thread QiangCai

Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r82720605
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/surrogatekeysgenerator/csvbased/CarbonCSVBasedSeqGenStep.java
 ---
@@ -1171,6 +1171,14 @@ else if(isComplexTypeColumn[j]) {
   DirectDictionaryGenerator directDictionaryGenerator1 =
   DirectDictionaryKeyGeneratorFactory
   
.getDirectDictionaryGenerator(details.getColumnType());
+  String[] timeformats = meta.timeFormat.split(",");
+  for(String timeformat:timeformats){
+if(timeformat.startsWith(details.getColumnName())){
+  timeformat = timeformat.replaceFirst(":",
+  CarbonCommonConstants.COLON_SPC_CHARACTER);
+  tuple = timeformat.replace(details.getColumnName(), 
tuple);
+}
+  }
--- End diff --

Better not to modify the tuple value.



[GitHub] incubator-carbondata pull request #219: [CARBONDATA-37]Support different tim...

2016-10-10 Thread lion-x

Github user lion-x commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/219#discussion_r82722044
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/keygenerator/directdictionary/timestamp/TimeStampDirectDictionaryGenerator.java
 ---
@@ -117,15 +117,24 @@ private TimeStampDirectDictionaryGenerator() {
* @return dictionary value
*/
   @Override public int generateDirectSurrogateKey(String memberStr) {
-SimpleDateFormat timeParser = new 
SimpleDateFormat(CarbonProperties.getInstance()
-.getProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
-CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT));
+String timeString;
+String formatString;
+if (memberStr.contains(CarbonCommonConstants.COLON_SPC_CHARACTER)){
--- End diff --

Because some formats, like -XX-XX 00:00:00.000, contain a colon, which would 
cause mistakes when splitting the member string.
For example, a member string like 2016-08-11 00:00:00.000:-MM-dd 
HH.mm.ss.SSS



[ANNOUNCE] Apache CarbonData 0.1.1-incubating Release

2016-10-10 Thread Liang Big data

Hi,

The Apache CarbonData team would like to announce the release of Apache
 CarbonData 0.1.1-incubating.

Apache CarbonData(incubating) is a new big data file format for faster
interactive query using advanced columnar storage, index, compression and
encoding techniques to improve computing efficiency.

The release artifacts can be downloaded here:
https://dist.apache.org/repos/dist/release/incubator/carbondata/0.1.1-incubating/

Maven artifacts have been made available here:
https://repository.apache.org/content/repositories/releases/org/apache/carbondata

The release notes can be found here:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220&version=12338021

Regards
Liang

[GitHub] incubator-carbondata pull request #224: [CARBONDATA-239]Add scan_blocklet_nu...

2016-10-10 Thread sujith71955

Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/224#discussion_r82729006
  
--- Diff: 
core/src/main/java/org/apache/carbondata/scan/processor/AbstractDataBlockIterator.java
 ---
@@ -127,11 +133,15 @@ protected boolean updateScanner() {
 }
   }
 
-  private AbstractScannedResult getNextScannedResult() throws 
QueryExecutionException {
+  private AbstractScannedResult 
getNextScannedResult(QueryStatisticsRecorder recorder,
--- End diff --

Why do we need to change the parameters of the getNextScannedResult() 
method? If it is required, please pass a statistics model; this will ensure 
that the method's parameter list won't grow.


