[ANNOUNCE] Apache CarbonData 1.1.0 Released

2017-05-19 Thread Liang Chen
Hi All,

The Apache CarbonData PMC team is happy to announce the release of Apache
CarbonData version 1.1.0.
   The key features of this release are highlighted below.

   -  Introduced a new data format, V3, improving scan performance by
   ~20 to 50%.
   -  Alter table support in CarbonData (for Spark 2.1).
   -  Supported batch sort to improve data loading performance.
   -  Improved single-pass load by upgrading to the latest Netty framework
   and launching a dictionary client for each load.
   -  Supported range filters, combining between filters into a single
   filter to improve filter performance.
   -  Many improvements on large clusters, especially in query processing.
   -  Fixed more than 160 bugs and made many other improvements in this
   release.
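
Several of the features above are used through Spark SQL DDL. A minimal sketch of creating a CarbonData table and exercising the new alter-table support (the table and column names are illustrative, and the statements assume a Spark 2.1 session with CarbonData configured):

```sql
-- Create a CarbonData table from the Spark SQL shell
CREATE TABLE IF NOT EXISTS sales (
  id INT,
  name STRING,
  city STRING,
  amount DOUBLE
)
STORED BY 'carbondata';

-- Alter-table support (Spark 2.1): add a column, then rename the table
ALTER TABLE sales ADD COLUMNS (country STRING);
ALTER TABLE sales RENAME TO sales_history;
```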


The release notes are available at:
https://cwiki.apache.org/confluence/display/CARBONDATA/Apache+CarbonData+1.1.0+Release

You can follow this document to use these artifacts:
https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md


You can find the latest CarbonData document and learn more at:
http://carbondata.apache.org 

Thanks
The Apache CarbonData team


[ANNOUNCE] Apache CarbonData 1.2.0 released

2017-09-29 Thread Liang Chen
CarbonData is a new big data file format for fast interactive queries. It
uses advanced columnar storage, indexing, compression, and encoding
techniques to improve computing efficiency, which in turn helps speed up
queries by an order of magnitude over petabytes of data.

The Apache CarbonData PMC team is happy to announce the release of version
1.2.0. The community put significant effort into this release: more than
50 contributors completed 200+ pull requests for improvements and bug
fixes.

1. Release Notes:
https://cwiki.apache.org/confluence/display/CARBONDATA/Apache+CarbonData+1.2.0+Release

2. Some key improvements in this release:
   1) Sort columns: users can choose to sort only the required columns
   (those used in query filters) while loading data, which improves
   loading speed.
   2) Support for 4 sort scopes when creating a table: local sort, batch
   sort, global sort, and no sort.
   3) Support for partitioning.
   4) Optimized data update and delete for Spark 2.1.
   5) Further performance improvements by optimizing measure filters.
   6) DataMap framework for adding custom indexes.
   7) Ecosystem feature 1: Presto integration.
   8) Ecosystem feature 2: Hive integration.
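
The sort columns and sort scope features above are set as table properties in the CREATE TABLE DDL. A minimal sketch (the table, columns, and chosen values are illustrative; SORT_SCOPE accepts LOCAL_SORT, BATCH_SORT, GLOBAL_SORT, or NO_SORT):

```sql
-- Sort only the columns used in query filters, using batch sort at load time
CREATE TABLE IF NOT EXISTS user_events (
  event_time TIMESTAMP,
  user_id STRING,
  country STRING,
  clicks INT
)
STORED BY 'carbondata'
TBLPROPERTIES (
  'SORT_COLUMNS' = 'user_id,country',
  'SORT_SCOPE'   = 'BATCH_SORT'
);
```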

You can follow this document to use these artifacts:
https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md


We welcome your help and feedback. You can find more CarbonData
documentation and learn more at: http://carbondata.apache.org/

Thanks
The Apache CarbonData team