[jira] [Created] (CARBONDATA-3155) DataFrame support read CarbonSession/SDK written data

2018-12-07 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-3155:
---

 Summary: DataFrame support read CarbonSession/SDK written data
 Key: CARBONDATA-3155
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3155
 Project: CarbonData
  Issue Type: Improvement
Reporter: xubo245
Assignee: xubo245


Support reading data written by CarbonSession/SDK through the Spark DataFrame API.
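
A minimal usage sketch of what this improvement would enable; the "carbon" datasource name and the output path are assumptions for illustration, not code from this issue:

{code:java}
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ReadCarbonSdkOutputSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("read-carbon-sdk-output")
        .master("local[*]")
        .getOrCreate();

    // Files previously written by CarbonSession or the CarbonData SDK writer;
    // both the path and the "carbon" format name are assumed here.
    Dataset<Row> df = spark.read()
        .format("carbon")
        .load("/tmp/carbon/sdk_output");

    df.printSchema();
    df.show();
    spark.stop();
  }
}
{code}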



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2981
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9921/



---


[GitHub] carbondata issue #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2981
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1873/



---


[GitHub] carbondata issue #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2981
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1661/



---


[GitHub] carbondata issue #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread xubo245
Github user xubo245 commented on the issue:

https://github.com/apache/carbondata/pull/2981
  
retest this please


---


[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support

2018-12-07 Thread qiuchenjian
Github user qiuchenjian commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2980#discussion_r239991055
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/RowParserImpl.java ---
@@ -34,8 +37,12 @@
   private int numberOfColumns;
 
   public RowParserImpl(DataField[] output, CarbonDataLoadConfiguration configuration) {
-    String[] complexDelimiters =
+    String[] tempComplexDelimiters =
         (String[]) configuration.getDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS);
+    Queue<String> complexDelimiters = new LinkedList<>();
+    for (int i = 0; i < 4; i++) {
--- End diff --

“i < 4”: the meaning of the 4 is not clear; it would be better to replace it with a named constant.
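
A minimal sketch of that suggestion; the constant name and the class holding it are hypothetical (it could just as well live in an existing constants class):

```java
// Hypothetical constants holder; only the value 4 comes from the diff above.
public final class ComplexDelimiterConstants {

  /** Number of complex-type delimiter levels read from COMPLEX_DELIMITERS. */
  public static final int NUMBER_OF_COMPLEX_DELIMITERS = 4;

  private ComplexDelimiterConstants() {
  }
}
```

The loop would then read `for (int i = 0; i < ComplexDelimiterConstants.NUMBER_OF_COMPLEX_DELIMITERS; i++)`, which documents why the bound is 4.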


---


[GitHub] carbondata issue #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2981
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1872/



---


[GitHub] carbondata issue #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2981
  
Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9920/



---


[GitHub] carbondata issue #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2981
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1660/



---


[GitHub] carbondata issue #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2981
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1659/



---


[GitHub] carbondata pull request #2981: [CARBONDATA-3154] Fix spark-2.1 test error

2018-12-07 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/2981

[CARBONDATA-3154] Fix spark-2.1 test error

[CARBONDATA-3154] Fix spark-2.1 test error
This PR fixes the spark-2.1 test errors, including:
1. fixing 6 errors in 
org.apache.spark.sql.carbondata.datasource.SparkCarbonDataSourceTest

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 No
 - [ ] Document update required?
No
 - [ ] Testing done
   fix test code
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
No


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata 
CARBONDATA-3154_FixSpark2_1_0TestError

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2981.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2981


commit 82ce986f500e8412c7cc515814f0fffb84b26890
Author: xubo245 
Date:   2018-12-07T16:01:43Z

[CARBONDATA-3154] Fix spark-2.1 test error




---


[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9918/



---


[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1870/



---


[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2980
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1869/



---


[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2980
  
Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9917/



---


[GitHub] carbondata issue #2979: [CARBONDATA-3153] Complex delimiters change

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2979
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1868/



---


[GitHub] carbondata issue #2979: [CARBONDATA-3153] Complex delimiters change

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2979
  
Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9916/



---


[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2980
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1657/



---


[GitHub] carbondata issue #2978: [WIP] Added lazy load and direct vector fill support...

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1658/



---


[jira] [Created] (CARBONDATA-3154) Fix spark-2.1 test error

2018-12-07 Thread xubo245 (JIRA)
xubo245 created CARBONDATA-3154:
---

 Summary: Fix spark-2.1 test error
 Key: CARBONDATA-3154
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3154
 Project: CarbonData
  Issue Type: Bug
Affects Versions: 1.5.1
Reporter: xubo245
Assignee: xubo245
 Fix For: 1.5.2


Currently the CI does not run the Spark 2.1 UTs (it only compiles), so there are 
some errors when testing with Spark 2.1.

for example:

command:

{code:java}
-Pspark-2.1  clean install
{code}

error1:


{code:java}
2018-12-07 21:47:20 INFO  HiveMetaStore:746 - 0: get_database: global_temp
2018-12-07 21:47:20 INFO  audit:371 - ugi=xubo  ip=unknown-ip-addr  
cmd=get_database: global_temp   
2018-12-07 21:47:20 WARN  ObjectStore:568 - Failed to get database global_temp, 
returning NoSuchObjectException
*** RUN ABORTED ***
  org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 
'location' expecting {<EOF>, '(', '.', 'SELECT', 'FROM', 'AS', 'WITH', 
'VALUES', 'TABLE', 'INSERT', 'MAP', 'REDUCE', 'OPTIONS', 'CLUSTERED', 
'PARTITIONED'}(line 1, pos 150)

== SQL ==
create table par_table(male boolean, age int, height double, name string, 
address string,salary long, floatField float, bytefield byte) using parquet 
location 
'/Users/xubo/Desktop/xubo/git/carbondata1/integration/spark-datasource/target/warehouse2'
--^^^
  at 
org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:197)
  at 
org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:99)
  at 
org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:45)
  at 
org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:53)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
  at 
org.apache.spark.sql.carbondata.datasource.SparkCarbonDataSourceTest.createParquetTable(SparkCarbonDataSourceTest.scala:1126)
  at 
org.apache.spark.sql.carbondata.datasource.SparkCarbonDataSourceTest.beforeAll(SparkCarbonDataSourceTest.scala:1359)
  at 
org.scalatest.BeforeAndAfterAll$class.beforeAll(BeforeAndAfterAll.scala:187)
  at 
org.apache.spark.sql.carbondata.datasource.SparkCarbonDataSourceTest.beforeAll(SparkCarbonDataSourceTest.scala:38)
  at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:253)
  ...
[INFO] 
[INFO] Reactor Summary:
{code}

There are another 5 errors in 
org.apache.spark.sql.carbondata.datasource.SparkCarbonDataSourceTest.
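
For context, a minimal sketch of the parser difference behind the failure; the OPTIONS form shown is one Spark-2.1-compatible spelling and is not necessarily the fix applied in the PR:

{code:java}
import org.apache.spark.sql.SparkSession;

public class Spark21DdlSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("spark-2.1-ddl-sketch")
        .master("local[*]")
        .getOrCreate();

    String path = "/tmp/warehouse2/par_table"; // hypothetical path

    // Rejected by the Spark 2.1 parser (the error above): a LOCATION clause
    // on a "USING" table is only accepted from Spark 2.2 onwards.
    //   create table par_table(age int, name string) using parquet location '<path>'

    // Parses on Spark 2.1 as well as 2.2/2.3:
    spark.sql("create table par_table(age int, name string) "
        + "using parquet options (path '" + path + "')");

    spark.stop();
  }
}
{code}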




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support

2018-12-07 Thread manishnalla1994
GitHub user manishnalla1994 opened a pull request:

https://github.com/apache/carbondata/pull/2980

[CARBONDATA-3017] Map DDL Support

Support CREATE TABLE DDL for the map type.
This PR is dependent on PR #2979 for the change of delimiters.
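
A minimal usage sketch of the DDL this PR targets; the STORED AS carbondata syntax, table name and map value are assumptions for illustration:

```java
import org.apache.spark.sql.SparkSession;

public class MapDdlSketch {
  public static void main(String[] args) {
    // Assumes a session with the CarbonData extensions on the classpath.
    SparkSession spark = SparkSession.builder()
        .appName("map-ddl-sketch")
        .master("local[*]")
        .getOrCreate();

    // Create a table with a map<string, string> column.
    spark.sql("CREATE TABLE map_sample (id INT, props MAP<STRING, STRING>) "
        + "STORED AS carbondata");

    // Insert and query a map value.
    spark.sql("INSERT INTO map_sample SELECT 1, map('color', 'red')");
    spark.sql("SELECT id, props['color'] FROM map_sample").show();

    spark.stop();
  }
}
```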

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [x] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manishnalla1994/carbondata MapDDL5Dec

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2980.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2980


commit 322b52a64c840317a6905664e8de16327e3635e0
Author: Manish Nalla 
Date:   2018-10-16T09:48:08Z

MapDDLSupport

commit 3d119888a80e7d8f9cab59e477984b56af1309f6
Author: manishnalla1994 
Date:   2018-12-07T08:18:31Z

Added Testcases and Local Dict Support

commit 5fe06801360fc04bab9c1239ea8d007f37bc69d4
Author: manishnalla1994 
Date:   2018-12-07T13:28:54Z

Test Files for Map

commit 4cc8ba13b234a13b9a3cef541e37f492153e7d1b
Author: manishnalla1994 
Date:   2018-12-07T14:44:12Z

Changed TestCases and Supported 2 new delimiters




---


[GitHub] carbondata issue #2979: [CARBONDATA-3153] Complex delimiters change

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2979
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1656/



---


[GitHub] carbondata pull request #2979: [CARBONDATA-3153] Complex delimiters change

2018-12-07 Thread manishnalla1994
GitHub user manishnalla1994 opened a pull request:

https://github.com/apache/carbondata/pull/2979

[CARBONDATA-3153] Complex delimiters change

Changed the two complex-type delimiters used to '\001' and '\002'.
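
A minimal sketch of where complex delimiters surface to users, assuming the existing COMPLEX_DELIMITER_LEVEL_1/COMPLEX_DELIMITER_LEVEL_2 load options; the file, table and delimiter values are illustrative only:

```java
import org.apache.spark.sql.SparkSession;

public class ComplexDelimiterLoadSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("complex-delimiter-load-sketch")
        .master("local[*]")
        .getOrCreate();

    // User-facing delimiters for array/struct values in the CSV; internally
    // this PR switches the delimiters CarbonData uses to '\001' and '\002'.
    spark.sql("LOAD DATA INPATH '/tmp/complexdata.csv' INTO TABLE complex_sample "
        + "OPTIONS('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':')");

    spark.stop();
  }
}
```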

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [x] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manishnalla1994/carbondata ComplexDelimiters

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2979.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2979


commit 7cfa05fbf65b5b176fe94ce6c36e4deb10a2a437
Author: manishnalla1994 
Date:   2018-12-07T09:25:58Z

Delimiters changed

commit bcf265316627f49862a994292bb37169afe40403
Author: manishnalla1994 
Date:   2018-12-07T13:46:57Z

Change of 2 complex delimiters




---


[jira] [Created] (CARBONDATA-3153) Change of Complex Delimiters

2018-12-07 Thread MANISH NALLA (JIRA)
MANISH NALLA created CARBONDATA-3153:


 Summary: Change of Complex Delimiters
 Key: CARBONDATA-3153
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3153
 Project: CarbonData
  Issue Type: Bug
Reporter: MANISH NALLA
Assignee: MANISH NALLA






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3005) Supporting Gzip as Column Compressor

2018-12-07 Thread Shardul Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shardul Singh updated CARBONDATA-3005:
--
Summary: Supporting Gzip as Column Compressor  (was: Proposing Gzip 
Compression support)

> Supporting Gzip as Column Compressor
> 
>
> Key: CARBONDATA-3005
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3005
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Shardul Singh
>Assignee: Shardul Singh
>Priority: Minor
>
> Currently CarbonData uses Snappy as the default codec to compress its columnar 
> files; besides Snappy, CarbonData supports zstd. This issue is targeted to 
> support:
> 1. The Gzip compression codec.
> Benefits of Gzip:
>  # Gzip offers reduced file size compared to other codecs like Snappy, but at 
> the cost of processing speed.
>  # Gzip is suitable for users who have cold data, i.e. data which is stored 
> permanently and will be queried rarely.
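
If gzip is wired in the same way as the existing snappy/zstd codecs, selecting it would be a one-line property change; a minimal sketch assuming the existing carbon.column.compressor property accepts a "gzip" value:

{code:java}
import org.apache.carbondata.core.util.CarbonProperties;

public class GzipCompressorSketch {
  public static void main(String[] args) {
    // System-wide default column compressor; a per-table TBLPROPERTIES
    // override may also apply, mirroring the snappy/zstd behaviour.
    CarbonProperties.getInstance()
        .addProperty("carbon.column.compressor", "gzip");
  }
}
{code}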



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2977: [WIP] [CARBONDATA-3147] Fixed concurrent load issue

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1867/



---


[GitHub] carbondata issue #2977: [WIP] [CARBONDATA-3147] Fixed concurrent load issue

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Failed  with Spark 2.3.2, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9915/



---


[GitHub] carbondata issue #2975: [WIP][CARBONDATA-3145] Read improvement for complex ...

2018-12-07 Thread qiuchenjian
Github user qiuchenjian commented on the issue:

https://github.com/apache/carbondata/pull/2975
  
@dhatchayani  I read DimensionRawColumnChunk; this class already caches the 
decoded DimensionColumnPage. Is the Map you added still useful?

  public DimensionColumnPage decodeColumnPage(int pageNumber) {
    assert pageNumber < pagesCount;
    if (dataChunks == null) {
      dataChunks = new DimensionColumnPage[pagesCount];
    }
    if (dataChunks[pageNumber] == null) {
      try {
        dataChunks[pageNumber] = chunkReader.decodeColumnPage(this, pageNumber, null);
      } catch (IOException | MemoryException e) {
        throw new RuntimeException(e);
      }
    }
    return dataChunks[pageNumber];
  }



---


[GitHub] carbondata issue #2975: [WIP][CARBONDATA-3145] Read improvement for complex ...

2018-12-07 Thread dhatchayani
Github user dhatchayani commented on the issue:

https://github.com/apache/carbondata/pull/2975
  
@qiuchenjian Actually, DimensionColumnPage is decoded from 
DimensionRawColumnChunk, so it is meaningless to cache it in 
DimensionRawColumnChunk. Both serve different purposes.


---


[GitHub] carbondata issue #2977: [WIP] [CARBONDATA-3147] Fixed concurrent load issue

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1655/



---


[GitHub] carbondata issue #2975: [WIP][CARBONDATA-3145] Read improvement for complex ...

2018-12-07 Thread qiuchenjian
Github user qiuchenjian commented on the issue:

https://github.com/apache/carbondata/pull/2975
  
Could the decoded DimensionColumnPage be cached in DimensionRawColumnChunk, so 
that the cache can be reused by other code? For example, DimensionRawColumnChunk 
could contain a Map<Integer, DimensionColumnPage> cacheDimPages; when a page is 
decoded, it would be cached in this map.


---


[GitHub] carbondata issue #2977: [WIP] [CARBONDATA-3147] Fixed concurrent load issue

2018-12-07 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2977
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1654/



---