[GitHub] incubator-carbondata issue #620: [WIP]Added batch sort to improve the loadin...

2017-03-01 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/620
  
Build Failed  with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/991/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata issue #620: [WIP]Added batch sort to improve the loadin...

2017-03-01 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/incubator-carbondata/pull/620
  
retest this please




[jira] [Commented] (CARBONDATA-704) data mismatch between hive and carbondata after loading for bigint values

2017-03-01 Thread anubhav tarar (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891726#comment-15891726
 ] 

anubhav tarar commented on CARBONDATA-704:
--

The reason for this issue is byte compression of the data.
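
The comment does not say which compression step is involved. As a purely hypothetical illustration (the class and method names below are invented and are not CarbonData code), one way a range-based byte compression scheme can go wrong at BIGINT boundaries is that the `max - min` range of a column spanning the full `long` domain overflows:

```java
// Hypothetical sketch, not CarbonData's implementation: shows why a
// range computation over boundary BIGINT values needs an overflow check.
final class BigintRangeSketch {

    // Naive range: silently wraps around when the true range exceeds Long.MAX_VALUE.
    static long naiveRange(long min, long max) {
        return max - min;
    }

    // Overflow-aware check: Math.subtractExact throws ArithmeticException on overflow.
    static boolean rangeFitsInLong(long min, long max) {
        try {
            Math.subtractExact(max, min);
            return true;
        } catch (ArithmeticException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Boundary values like the ones in the attached test data.
        System.out.println(naiveRange(Long.MIN_VALUE, Long.MAX_VALUE));      // prints -1
        System.out.println(rangeFitsInLong(Long.MIN_VALUE, Long.MAX_VALUE)); // prints false
        System.out.println(rangeFitsInLong(1234L, 4567L));                   // prints true
    }
}
```

A compressor that picks a narrower storage type from the naive range would see `-1` here and could choose a type that cannot hold the actual values; whether this is what happens in CARBONDATA-704 is an assumption.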

> data mismatch between hive and carbondata after loading for bigint values
> -
>
> Key: CARBONDATA-704
> URL: https://issues.apache.org/jira/browse/CARBONDATA-704
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: SWATI RAO
>Assignee: anubhav tarar
> Attachments: Test_Data1 (4).csv
>
>
> carbondata
> 0: jdbc:hive2://localhost:1> create table Test_Boundary (c1_int 
> int,c2_Bigint Bigint,c3_Decimal Decimal(38,30),c4_double double,c5_string 
> string,c6_Timestamp Timestamp,c7_Datatype_Desc string) STORED BY 
> 'org.apache.carbondata.format' ;
> 0: jdbc:hive2://localhost:1>  LOAD DATA INPATH 
> 'hdfs://localhost:54310/Test_Data1.csv' INTO table Test_Boundary OPTIONS  
>   
> ('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='');
> 0: jdbc:hive2://localhost:1> select c2_Bigint from Test_Boundary;
> +----------------------+--+
> |      c2_Bigint       |
> +----------------------+--+
> | NULL                 |
> | NULL                 |
> | NULL                 |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> | 9223372036854775807  |
> +----------------------+--+
> but in hive
> create table Test_Boundary_hive (c1_int int,c2_Bigint Bigint,c3_Decimal 
> Decimal(38,30),c4_double double,c5_string string,c6_Timestamp 
> Timestamp,c7_Datatype_Desc string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY 
> ",";
> LOAD DATA LOCAL INPATH 'Test_Data1.csv' into table Test_Boundary_hive;
> select c2_Bigint from Test_Boundary_hive;
> +-----------------------+--+
> |       c2_Bigint       |
> +-----------------------+--+
> | 1234                  |
> | 2345                  |
> | 3456                  |
> | 4567                  |
> | 9223372036854775807   |
> | -9223372036854775808  |
> | -9223372036854775807  |
> | -9223372036854775806  |
> | -9223372036854775805  |
> | 0                     |
> | 9223372036854775807   |
> | 9223372036854775807   |
> | 9223372036854775807   |
> | NULL                  |
> | NULL                  |
> | NULL                  |
> +-----------------------+--+



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] incubator-carbondata issue #620: [WIP]Added batch sort to improve the loadin...

2017-03-01 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/620
  
Build Failed  with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/990/





[GitHub] incubator-carbondata issue #620: [WIP]Added batch sort to improve the loadin...

2017-03-01 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/620
  
Build Failed  with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/989/





[GitHub] incubator-carbondata issue #620: [WIP]Added batch sort to improve the loadin...

2017-03-01 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/620
  
Build Failed  with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/988/





[jira] [Created] (CARBONDATA-737) Add Map datatype support as Hive

2017-03-01 Thread Crabo Yang (JIRA)
Crabo Yang created CARBONDATA-737:
-

 Summary: Add Map datatype support as Hive
 Key: CARBONDATA-737
 URL: https://issues.apache.org/jira/browse/CARBONDATA-737
 Project: CarbonData
  Issue Type: New Feature
Reporter: Crabo Yang


Due to the lack of "alter ... add column ..." syntax support, we badly need 
a "Map datatype" in the complex types.

From the Hive docs:
Maps

Maps in Hive are similar to Java Maps.
Syntax: MAP





[GitHub] incubator-carbondata pull request #620: [WIP]Added batch sort to improve the...

2017-03-01 Thread ravipesala
GitHub user ravipesala opened a pull request:

https://github.com/apache/incubator-carbondata/pull/620

[WIP]Added batch sort to improve the loading performance



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravipesala/incubator-carbondata batch-sort

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/620.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #620


commit e440bce45913ea3a643ac647d245b130f73db3dd
Author: ravipesala 
Date:   2017-03-01T16:27:32Z

Added batch sort to improve the loading performance






[GitHub] incubator-carbondata issue #614: [CARBONDATA-714]Documented how to handle ba...

2017-03-01 Thread PallaviSingh1992
Github user PallaviSingh1992 commented on the issue:

https://github.com/apache/incubator-carbondata/pull/614
  
@chenliang613 please review




[GitHub] incubator-carbondata issue #619: [CARBONDATA-735]Dictionary performance issu...

2017-03-01 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/619
  
Build Success with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/986/





[GitHub] incubator-carbondata pull request #619: [CARBONDATA-735]Dictionary performan...

2017-03-01 Thread kumarvishal09
GitHub user kumarvishal09 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/619

[CARBONDATA-735]Dictionary performance issue with multiple task in same 
executor

**Problem:**
Currently, when more than one task is launched on the same node for a query, 
every task tries to load the dictionary data itself, which degrades 
dictionary loading performance.
**Solution:**
Add a monitor around dictionary loading: one task loads the dictionary 
while the other tasks wait and then share the same dictionary data.
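
The monitor idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this pull request: `SharedDictionaryCache`, its `loader` function, and the `Map<Integer, String>` dictionary shape are all hypothetical names chosen for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.IntStream;

// Hypothetical sketch: one monitor object per dictionary column, so that
// only the first task loads a dictionary and later tasks reuse it.
final class SharedDictionaryCache {
    private final ConcurrentHashMap<String, Map<Integer, String>> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();
    private final Function<String, Map<Integer, String>> loader;
    private final AtomicInteger loadCount = new AtomicInteger();

    SharedDictionaryCache(Function<String, Map<Integer, String>> loader) {
        this.loader = loader;
    }

    Map<Integer, String> get(String columnId) {
        Map<Integer, String> dictionary = cache.get(columnId);
        if (dictionary != null) {
            return dictionary; // already loaded by some task
        }
        Object lock = locks.computeIfAbsent(columnId, id -> new Object());
        synchronized (lock) {
            // Re-check inside the monitor: another task may have finished loading
            // while this one was waiting.
            dictionary = cache.get(columnId);
            if (dictionary == null) {
                dictionary = loader.apply(columnId);
                loadCount.incrementAndGet();
                cache.put(columnId, dictionary);
            }
            return dictionary;
        }
    }

    // Number of times the loader actually ran (for demonstration only).
    int loadCount() {
        return loadCount.get();
    }

    public static void main(String[] args) {
        SharedDictionaryCache cache =
            new SharedDictionaryCache(id -> Collections.singletonMap(1, id + "-value"));
        // Several concurrent "tasks" ask for the same column dictionary.
        IntStream.range(0, 16).parallel().forEach(i -> cache.get("country"));
        System.out.println("loads: " + cache.loadCount()); // prints "loads: 1"
    }
}
```

Locking per column rather than on the whole cache lets tasks that need different dictionaries load them without waiting on each other.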

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kumarvishal09/incubator-carbondata 
DictionaryLoadingissuewithmultipletaksinsameexecutor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/619.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #619


commit c79bd5255a3dac9ea7db698f23046d56e95d399f
Author: kumarvishal 
Date:   2017-02-28T10:28:02Z

Dictionary performance issue with multiple task in same executor






[jira] [Created] (CARBONDATA-736) Dictionary Loading issue in Decoder

2017-03-01 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-736:
---

 Summary: Dictionary Loading issue in Decoder
 Key: CARBONDATA-736
 URL: https://issues.apache.org/jira/browse/CARBONDATA-736
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal


Problem:
Currently the Carbon dictionary decoder loads the dictionary files with the 
get API. When the number of columns is high, it can use the getAll API to load 
the dictionary data concurrently.

Solution:
Use the getAll API.
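
The difference between the two calls can be sketched roughly as below. The class, the `get`/`getAll` signatures, and the list-of-strings dictionary are assumptions made for illustration; the actual CarbonData cache API may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: get() loads one column dictionary, getAll() submits
// one load per column to a thread pool so the loads run concurrently.
final class DictionaryLoaderSketch {
    private final Function<String, List<String>> loadOne;

    DictionaryLoaderSketch(Function<String, List<String>> loadOne) {
        this.loadOne = loadOne;
    }

    // Loads the dictionary of a single column on the calling thread.
    List<String> get(String columnId) {
        return loadOne.apply(columnId);
    }

    // Loads all requested dictionaries in parallel; results keep the input order.
    Map<String, List<String>> getAll(List<String> columnIds, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (String id : columnIds) {
                Callable<List<String>> task = () -> get(id);
                futures.add(pool.submit(task));
            }
            Map<String, List<String>> result = new LinkedHashMap<>();
            for (int i = 0; i < columnIds.size(); i++) {
                result.put(columnIds.get(i), futures.get(i).get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        DictionaryLoaderSketch loader =
            new DictionaryLoaderSketch(id -> Arrays.asList(id + "_a", id + "_b"));
        System.out.println(loader.getAll(Arrays.asList("c1", "c2", "c3"), 2).keySet());
        // prints [c1, c2, c3]
    }
}
```

With many dictionary columns, the decoder would issue a single `getAll` call instead of a loop of `get` calls, so the elapsed time is bounded by the slowest loads rather than their sum.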





[jira] [Created] (CARBONDATA-735) Dictionary Loading performance issue with multiple task in single node

2017-03-01 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-735:
---

 Summary: Dictionary Loading performance issue with multiple task 
in single node
 Key: CARBONDATA-735
 URL: https://issues.apache.org/jira/browse/CARBONDATA-735
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal


Problem:
Currently, when more than one task is launched on the same node for a query, 
every task tries to load the dictionary data itself, which degrades 
dictionary loading performance.
Solution:
Add a monitor around dictionary loading: one task loads the dictionary 
while the other tasks wait and then share the same dictionary data.






[jira] [Closed] (CARBONDATA-391) [Documentation] Information missing about how to load data from data source other than CSV

2017-03-01 Thread Prabhat Kashyap (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhat Kashyap closed CARBONDATA-391.
--
Resolution: Fixed

> [Documentation] Information missing about how to load data from data source 
> other than CSV
> --
>
> Key: CARBONDATA-391
> URL: https://issues.apache.org/jira/browse/CARBONDATA-391
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Prabhat Kashyap
>Priority: Trivial
>






[jira] [Closed] (CARBONDATA-354) Query execute successfully even not argument given in count function

2017-03-01 Thread Prabhat Kashyap (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhat Kashyap closed CARBONDATA-354.
--
Resolution: Invalid

> Query execute successfully even not argument given in count function
> 
>
> Key: CARBONDATA-354
> URL: https://issues.apache.org/jira/browse/CARBONDATA-354
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Prabhat Kashyap
>Assignee: Akash R Nilugal
>Priority: Minor
>
> When I execute the following command:
> select count() from tableName;
> it runs successfully without any error, but the same query fails in Hive 
> with the following exception:
> FAILED: UDFArgumentException Argument expected





[GitHub] incubator-carbondata issue #616: [CARBONDATA-708] Fixed Between Filter Issue...

2017-03-01 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/616
  
Build Success with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/985/


