date:20161114

[GitHub] incubator-carbondata pull request #317: added test case for file system

2016-11-14 Thread anubhav100

GitHub user anubhav100 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/317

added test case for file system

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[CARBONDATA-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).
 - [ ] Testing done
 
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- What manual testing you have done?
- Any additional information to help reviewers in testing this 
change.
 
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
 
---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anubhav100/incubator-carbondata CARBONDATA-410

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/317.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #317


commit 14ebd6ddae9621580582e1b42dc5e47457a3455c
Author: anubhav100 
Date:   2016-11-15T06:18:51Z

added test case for file system




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[jira] [Created] (CARBONDATA-410) Implement test cases for core.datastore.file system

2016-11-14 Thread SWATI RAO (JIRA)

SWATI RAO created CARBONDATA-410:


             Summary: Implement test cases for core.datastore.file system
                 Key: CARBONDATA-410
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-410
             Project: CarbonData
          Issue Type: Task
            Reporter: SWATI RAO






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[Feature] proposal for update and delete support in Carbon data

2016-11-14 Thread Vinod KC

Hi All,
I would like to propose the following new features in CarbonData:
1) An Update statement to modify existing records in a CarbonData table
2) A Delete statement to remove records from a CarbonData table

A) Update operation: Update support can be added to CarbonData using
intermediate delta files (delete and update delta files), with little
impact on existing code.
An update can be treated as a 'delete' followed by an 'insert' operation.
Once an update has been applied to a CarbonData file, the CarbonData store
reader can, during a select query, use the delete delta data cache to
exclude the deleted records in that segment and then include the records
from the newly added update delta files.

B) Delete operation: For a delete operation, a delete delta file is added
to each segment that contains matching records. During a select query, the
CarbonData reader excludes those deleted records from the result set.
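
The read path described in A) and B) can be sketched as follows. This is only
an illustration under stated assumptions, not CarbonData code: `Segment`,
`delete_delta`, `update_delta` and the integer row ids are hypothetical names,
and the delta "files" are modelled as in-memory collections. It shows that a
delete only records row ids, that an update is a delete plus an insert into
the update delta, and that a scan filters base rows through the delete delta
before adding the update delta rows.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical in-memory stand-in for one segment and its delta files."""
    base_rows: dict                                   # row_id -> record, as originally loaded
    delete_delta: set = field(default_factory=set)    # row ids marked as deleted
    update_delta: dict = field(default_factory=dict)  # row_id -> record written by updates
    next_row_id: int = 0

    def __post_init__(self):
        # New rows written by updates get ids after the highest base row id.
        self.next_row_id = max(self.base_rows, default=-1) + 1

    def _visible(self):
        """Yield (row_id, record) pairs that a reader would return."""
        for row_id, record in self.base_rows.items():
            if row_id not in self.delete_delta:
                yield row_id, record
        for row_id, record in self.update_delta.items():
            if row_id not in self.delete_delta:
                yield row_id, record

    def delete(self, predicate):
        # A delete only records row ids in the delete delta; base data is untouched.
        doomed = [row_id for row_id, rec in self._visible() if predicate(rec)]
        self.delete_delta.update(doomed)
        return len(doomed)

    def update(self, predicate, change):
        # An update is a delete of the old version plus an insert of the new one.
        matched = [(row_id, rec) for row_id, rec in self._visible() if predicate(rec)]
        for row_id, rec in matched:
            self.delete_delta.add(row_id)
            self.update_delta[self.next_row_id] = change(dict(rec))
            self.next_row_id += 1
        return len(matched)

    def scan(self):
        # Select query: base rows minus deleted rows, plus rows from the update delta.
        return [rec for _, rec in self._visible()]


seg = Segment(base_rows={
    0: {"id": 1, "city": "A"},
    1: {"id": 2, "city": "B"},
    2: {"id": 3, "city": "A"},
})
seg.delete(lambda r: r["id"] == 2)
seg.update(lambda r: r["id"] == 3, lambda r: {**r, "city": "C"})
print(seg.scan())
```

Keeping the base rows immutable is the point of the sketch: both operations
only append to delta structures, which is why the proposal expects little
impact on the existing write path.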

Please share your suggestions and thoughts on the design and functional
aspects of this feature. I'll share a detailed design document covering the
above later.

Regards
Vinod

[GitHub] incubator-carbondata pull request #316: [WIP]create agg table segment for ev...

2016-11-14 Thread Jay357089

Github user Jay357089 closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/316



Re: [RESULT][VOTE] Apache CarbonData 0.2.0-incubating release

2016-11-14 Thread Uma gangumalla

Sorry for coming late to this.

Here is my +1 (binding) too.

I will carry my +1 to the incubator list as well.

Regards,
Uma

On Sun, Nov 13, 2016 at 6:20 AM, Liang Chen  wrote:

> Hi
>
> PPMC vote has passed, the result as below:
> +1(binding) : 6
> +1(non-binding) : 6
> Thanks all for your vote.
>
> Regards
> Liang
>
> Liang Chen wrote
> > Hi all,
> >
> > I submit the CarbonData 0.2.0-incubating to your vote.
> >
> > Release Notes:
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220&version=12337896
> >
> > Staging Repository:
> > https://repository.apache.org/content/repositories/orgapachecarbondata-1006
> >
> > Git Tag:
> > carbondata-0.2.0-incubating
> >
> > Please vote to approve this release:
> > [ ] +1 Approve the release
> > [ ] -1 Don't approve the release (please provide specific comments)
> >
> > This vote will be open for at least 72 hours. If this vote passes (we
> > need at least 3 binding votes, meaning three votes from the PPMC), I will
> > forward to general@.apache for the IPMC votes.
> >
> > Here is my vote : +1 (binding)
> >
> > Regards
> > Liang
>
>
>
>
>
> --
> View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/VOTE-Apache-CarbonData-0-2-0-incubating-release-tp2823p2881.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>

RE: Single Pass Data Load Design

2016-11-14 Thread Jihong Ma

Hi Ravindra,

Thank you for putting together a proposal for improving the data load process!

Please find my comments inline in the Google doc.

Jihong

-Original Message-
From: Ravindra Pesala [mailto:ravi.pes...@gmail.com] 
Sent: Sunday, November 13, 2016 4:24 AM
To: dev
Subject: Single Pass Data Load Design

Hi All,

Please find the proposed solutions for single pass data load.

https://docs.google.com/document/d/1_sSN9lccCZo4E_X3pNP5PchQACqif3AOXKTuG-YJAcc/edit?usp=sharing
-- 
Thanks & Regards,
Ravindra

[GitHub] incubator-carbondata pull request #316: [WIP]create agg table segment for ev...

2016-11-14 Thread Jay357089

GitHub user Jay357089 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/316

[WIP]create agg table segment for every fact table single segment

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Jay357089/incubator-carbondata createAggTable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/316.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #316


commit 92da6a1313ef5eaa532e10d46828d0e6791a4119
Author: Jay357089 
Date:   2016-11-10T07:01:37Z

create agg table segment for every fact table single segment





Re: [Vote] Please provide valuable feedback's and vote for Like filter query performance optimization

2016-11-14 Thread Ravindra Pesala

+1

On Mon, Nov 14, 2016, 3:54 PM sujith chacko 
wrote:

> Hi Liang,
> Yes, it's for high cardinality columns.
> Thanks,
> Sujith
>
> On Nov 14, 2016 2:01 PM, "Liang Chen"  wrote:
>
> > Hi
> >
> > I have one query: for no-dictionary columns with high cardinality,
> > such as phone numbers, is the pruning cost high or not?
> >
> > Regards
> > Liang
> >
> > 2016-11-14 15:18 GMT+08:00 sujith chacko :
> >
> > > Hi All,
> > >
> > > I am going to optimize the LIKE filter query flow for no-dictionary
> > > columns. Please find the details below.
> > >
> > > *Current design:*
> > > For LIKE filter queries, no push down happens to the carbon layer.
> > > Because of this, no block/blocklet level pruning can happen before
> > > the LIKE filters are applied. This can add overhead while scanning,
> > > since the system has to scan all the blocks and blocklets in order
> > > to apply the filters.
> > >
> > > *Proposed design/solution:*
> > > LIKE filters (startsWith, endsWith, contains) can be pushed to the
> > > carbon engine layer so that carbon can perform block and blocklet
> > > level pruning before applying the filters.
> > > Block level pruning will happen on the driver side and blocklet
> > > level pruning will be done in the executor, as per the existing
> > > design.
> > >
> > > Requesting all to please provide valuable feedback and vote for
> > > implementing the above solution in order to improve LIKE filter
> > > queries.
> > >
> > > Thanks,
> > > Sujith
> > >
> >
> >
> >
> > --
> > Regards
> > Liang
> >
>

Re: [Vote] Please provide valuable feedback's and vote for Like filter query performance optimization

2016-11-14 Thread sujith chacko

Hi Liang,
Yes, it's for high cardinality columns.
Thanks,
Sujith

On Nov 14, 2016 2:01 PM, "Liang Chen"  wrote:

> Hi
>
> I have one query: for no-dictionary columns with high cardinality,
> such as phone numbers, is the pruning cost high or not?
>
> Regards
> Liang
>
> 2016-11-14 15:18 GMT+08:00 sujith chacko :
>
> > Hi All,
> >
> > I am going to optimize the LIKE filter query flow for no-dictionary
> > columns. Please find the details below.
> >
> > *Current design:*
> > For LIKE filter queries, no push down happens to the carbon layer.
> > Because of this, no block/blocklet level pruning can happen before
> > the LIKE filters are applied. This can add overhead while scanning,
> > since the system has to scan all the blocks and blocklets in order
> > to apply the filters.
> >
> > *Proposed design/solution:*
> > LIKE filters (startsWith, endsWith, contains) can be pushed to the
> > carbon engine layer so that carbon can perform block and blocklet
> > level pruning before applying the filters.
> > Block level pruning will happen on the driver side and blocklet
> > level pruning will be done in the executor, as per the existing
> > design.
> >
> > Requesting all to please provide valuable feedback and vote for
> > implementing the above solution in order to improve LIKE filter
> > queries.
> >
> > Thanks,
> > Sujith
> >
>
>
>
> --
> Regards
> Liang
>

[jira] [Created] (CARBONDATA-409) Drop non-existing macro executes successfully while it must give an error.

2016-11-14 Thread Sangeeta Gulia (JIRA)

Sangeeta Gulia created CARBONDATA-409:

             Summary: Drop non-existing macro executes successfully while it must give an error.
                 Key: CARBONDATA-409
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-409
             Project: CarbonData
          Issue Type: Bug
          Components: data-query
            Reporter: Sangeeta Gulia


I created a macro:

CREATE TEMPORARY MACRO simple_add (x int, y int) x + y;

Then I dropped the macro, after which calling it fails as expected:

hive> drop temporary macro simple_add;
OK
Time taken: 0.038 seconds
hive> select simple_add(2,3);
FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'simple_add'

Then I tried to drop the same macro again, and the statement again executed
without any exception:

hive> drop temporary macro simple_add;
OK
Time taken: 0.016 seconds




Re: [Vote] Please provide valuable feedback's and vote for Like filter query performance optimization

2016-11-14 Thread Kumar Vishal

+1
Hi Liang,
Pruning cost won't be high, as block pruning will be done at the complete
B-tree level, and it will improve query performance for no-dictionary columns.
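
The kind of pruning discussed in this thread can be sketched as below. This is
a minimal illustration under assumptions that the thread does not confirm: each
block (or blocklet) is assumed to keep a min and max value for the column, the
values are compared as plain Python strings, and `Block`,
`may_contain_prefix` and `like_starts_with` are hypothetical names. It covers
only the startsWith case; how endsWith and contains would be pruned is not
stated in the thread.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical block (or blocklet) with min/max metadata for one column."""
    name: str
    min_value: str
    max_value: str
    values: list


def may_contain_prefix(block, prefix):
    """Return False only when no value in [min_value, max_value] can start with prefix."""
    # Any string that starts with prefix sorts at or after prefix itself,
    # so a max_value below prefix rules the whole block out.
    if block.max_value < prefix:
        return False
    # Any string that starts with prefix has prefix as its first characters,
    # so a min_value whose leading characters sort after prefix rules it out.
    if block.min_value[:len(prefix)] > prefix:
        return False
    return True


def like_starts_with(blocks, prefix):
    """Prune blocks using min/max first, then apply the filter row by row."""
    scanned = [b for b in blocks if may_contain_prefix(b, prefix)]
    rows = [v for b in scanned for v in b.values if v.startswith(prefix)]
    return [b.name for b in scanned], rows


blocks = [
    Block("b0", "13000000000", "13599999999", ["13000000000", "13512345678", "13599999999"]),
    Block("b1", "13600000000", "13999999999", ["13600000000", "13712345678", "13999999999"]),
    Block("b2", "15000000000", "18999999999", ["15000000000", "18999999999"]),
]
scanned, rows = like_starts_with(blocks, "136")
print(scanned, rows)
```

The check costs two string comparisons per block regardless of how many
distinct values the column holds, which is one way to read the point that
cardinality alone does not make pruning expensive.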

-Regards
Kumar Vishal

On Nov 14, 2016 14:01, "Liang Chen"  wrote:

> Hi
>
> I have one query: for no-dictionary columns with high cardinality,
> such as phone numbers, is the pruning cost high or not?
>
> Regards
> Liang
>
> 2016-11-14 15:18 GMT+08:00 sujith chacko :
>
> > Hi All,
> >
> > I am going to optimize the LIKE filter query flow for no-dictionary
> > columns. Please find the details below.
> >
> > *Current design:*
> > For LIKE filter queries, no push down happens to the carbon layer.
> > Because of this, no block/blocklet level pruning can happen before
> > the LIKE filters are applied. This can add overhead while scanning,
> > since the system has to scan all the blocks and blocklets in order
> > to apply the filters.
> >
> > *Proposed design/solution:*
> > LIKE filters (startsWith, endsWith, contains) can be pushed to the
> > carbon engine layer so that carbon can perform block and blocklet
> > level pruning before applying the filters.
> > Block level pruning will happen on the driver side and blocklet
> > level pruning will be done in the executor, as per the existing
> > design.
> >
> > Requesting all to please provide valuable feedback and vote for
> > implementing the above solution in order to improve LIKE filter
> > queries.
> >
> > Thanks,
> > Sujith
> >
>
>
>
> --
> Regards
> Liang
>

Re: Single Pass Data Load Design

2016-11-14 Thread Liang Chen

Hi

Yes, good feature. This change would significantly improve data load
performance.
Can you provide a sequence diagram for the whole data load process?

Regards
Liang

2016-11-14 15:42 GMT+08:00 Jacky Li :

> Hi Ravindra,
>
> Thanks for proposing this design. It would be really exciting if CarbonData
> could do a 1-pass solution for loading. I have left some comments in the
> design document.
>
> Regards,
> Jacky
>
>
>
> --
> View this message in context: http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Single-Pass-Data-Load-Design-tp2875p2894.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>



-- 
Regards
Liang

[GitHub] incubator-carbondata pull request #313: [CARBONDATA-405]Fixed Data load fail...

2016-11-14 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/313



Re: [Vote] Please provide valuable feedback's and vote for Like filter query performance optimization

2016-11-14 Thread Liang Chen

Hi

I have one query: for no-dictionary columns with high cardinality,
such as phone numbers, is the pruning cost high or not?

Regards
Liang

2016-11-14 15:18 GMT+08:00 sujith chacko :

> Hi All,
>
> I am going to optimize the LIKE filter query flow for no-dictionary
> columns. Please find the details below.
>
> *Current design:*
> For LIKE filter queries, no push down happens to the carbon layer.
> Because of this, no block/blocklet level pruning can happen before
> the LIKE filters are applied. This can add overhead while scanning,
> since the system has to scan all the blocks and blocklets in order
> to apply the filters.
>
> *Proposed design/solution:*
> LIKE filters (startsWith, endsWith, contains) can be pushed to the
> carbon engine layer so that carbon can perform block and blocklet
> level pruning before applying the filters.
> Block level pruning will happen on the driver side and blocklet
> level pruning will be done in the executor, as per the existing
> design.
>
> Requesting all to please provide valuable feedback and vote for
> implementing the above solution in order to improve LIKE filter
> queries.
>
> Thanks,
> Sujith
>



-- 
Regards
Liang
