[jira] [Commented] (KYLIN-3070) Add a config property for flat table storage format

2017-12-08 Thread Shaofeng SHI (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284471#comment-16284471
 ] 

Shaofeng SHI commented on KYLIN-3070:
-

Yeah, I will review it soon; thanks for the reminder.

> Add a config property for flat table storage format
> ---
>
> Key: KYLIN-3070
> URL: https://issues.apache.org/jira/browse/KYLIN-3070
> Project: Kylin
> Issue Type: Improvement
> Components: Job Engine
> Affects Versions: v2.2.0
> Environment: HDP 2.5.6, Kylin 2.2.0
> Reporter: Vsevolod Ostapenko
> Assignee: Vsevolod Ostapenko
> Priority: Minor
> Labels: newbie
> Attachments: KYLIN-3070.master.001.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Flat table storage format is currently hard-coded as SEQUENCEFILE in 
> core-job/src/main/java/org/apache/kylin/job/JoinedFlatTable.java.
> That prevents using Impala as a SQL engine while using the beeline CLI (via a 
> custom JDBC URL), as Impala cannot write sequence files.
> Adding a parameter to kylin.properties to override the default setting would 
> address the issue.
> Removing the hard-coded value for the storage format might be a good idea in 
> and of itself.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KYLIN-3070) Add a config property for flat table storage format

2017-12-08 Thread Vsevolod Ostapenko (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284311#comment-16284311
 ] 

Vsevolod Ostapenko commented on KYLIN-3070:
---

[~yimingliu] or [~Shaofengshi], could one of you review my changes and 
provide feedback or, if the changes are OK, commit them to master?



[jira] [Commented] (KYLIN-3070) Add a config property for flat table storage format

2017-12-06 Thread Vsevolod Ostapenko (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281336#comment-16281336
 ] 

Vsevolod Ostapenko commented on KYLIN-3070:
---

The patch file is attached; please review. Let me know if you have any 
questions or comments.



[jira] [Commented] (KYLIN-3070) Add a config property for flat table storage format

2017-12-06 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281268#comment-16281268
 ] 

Billy Liu commented on KYLIN-3070:
--

[~seva_ostapenko] it's your turn now. 

> Add a config property for flat table storage format
> ---
>
> Key: KYLIN-3070
> URL: https://issues.apache.org/jira/browse/KYLIN-3070
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Affects Versions: v2.2.0
> Environment: HDP 2.5.6, Kylin 2.2.0
>Reporter: Vsevolod Ostapenko
>Assignee: Vsevolod Ostapenko
>Priority: Minor
>  Labels: newbie
>   Original Estimate: 24h
>  Remaining Estimate: 24h


[jira] [Commented] (KYLIN-3070) Add a config property for flat table storage format

2017-12-06 Thread Vsevolod Ostapenko (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281003#comment-16281003
 ] 

Vsevolod Ostapenko commented on KYLIN-3070:
---

I made a fix and tested it on my copy of the master branch.
My version of the fix introduces two new parameters in kylin.properties:
* kylin.source.hive.flat-table-storage-format, which defaults to SEQUENCEFILE
* kylin.source.hive.flat-table-field-delimiter, which defaults to \u001F (Unit 
separator, the same default field separator that Hive uses)

I tested my changes internally and confirmed that they work as expected.
Btw, while making the change I found a problem with the existing handling of 
TEXTFILE field separators: the value was always fetched from 
kylin.source.jdbc.field-delimiter (apparently a kludge), which technically has 
no direct relation to the flat table, so introducing 
kylin.source.hive.flat-table-field-delimiter seems warranted.
If you don't have changes ready, please reassign this JIRA ticket to me.
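
For readers following the thread, here is a rough sketch of how the two 
properties described above could feed the storage clause of the flat table 
DDL. It is not the code from the attached patch: the class and method names 
are hypothetical, plain java.util.Properties stands in for Kylin's own 
configuration class, and only the two property names and their defaults come 
from the comment above.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch; not the code from KYLIN-3070.master.001.patch.
final class FlatTableDdlSketch {
    // Property names and defaults as described in the comment above.
    static final String FORMAT_KEY = "kylin.source.hive.flat-table-storage-format";
    static final String DELIMITER_KEY = "kylin.source.hive.flat-table-field-delimiter";
    static final String DEFAULT_FORMAT = "SEQUENCEFILE";
    static final String DEFAULT_DELIMITER = "\u001F"; // ASCII unit separator

    // Returns the configured storage format, upper-cased, or the default.
    static String storageFormat(Properties props) {
        return props.getProperty(FORMAT_KEY, DEFAULT_FORMAT).trim().toUpperCase(Locale.ROOT);
    }

    // Returns the configured field delimiter, or the default.
    static String fieldDelimiter(Properties props) {
        return props.getProperty(DELIMITER_KEY, DEFAULT_DELIMITER);
    }

    // Builds the trailing storage clause of a CREATE TABLE statement.
    // A field delimiter is only emitted for TEXTFILE tables.
    static String storageClause(Properties props) {
        String format = storageFormat(props);
        StringBuilder ddl = new StringBuilder();
        if ("TEXTFILE".equals(format)) {
            ddl.append("ROW FORMAT DELIMITED FIELDS TERMINATED BY '")
               .append(fieldDelimiter(props))
               .append("'\n");
        }
        ddl.append("STORED AS ").append(format).append(";");
        return ddl.toString();
    }

    public static void main(String[] args) {
        // No overrides: falls back to the SEQUENCEFILE default.
        System.out.println(storageClause(new Properties()));

        // Overrides as they might appear in kylin.properties.
        Properties overrides = new Properties();
        overrides.setProperty(FORMAT_KEY, "TEXTFILE");
        overrides.setProperty(DELIMITER_KEY, "|");
        System.out.println(storageClause(overrides));
    }
}
```

With no overrides the sketch yields "STORED AS SEQUENCEFILE;", so existing 
deployments would keep their current behaviour; a non-sequence format such as 
TEXTFILE is only used when it is set explicitly.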

> Add a config property for flat table storage format
> ---
>
> Key: KYLIN-3070
> URL: https://issues.apache.org/jira/browse/KYLIN-3070
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Affects Versions: v2.2.0
> Environment: HDP 2.5.6, Kylin 2.2.0
>Reporter: Vsevolod Ostapenko
>Assignee: Rong H
>Priority: Minor
>  Labels: newbie
>   Original Estimate: 24h
>  Remaining Estimate: 24h


[jira] [Commented] (KYLIN-3070) Add a config property for flat table storage format

2017-12-01 Thread Shaofeng SHI (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274200#comment-16274200
 ] 

Shaofeng SHI commented on KYLIN-3070:
-

Hi Vsevolod, we all use the Sandbox (HDP 2.4) for development at the moment; 
or, if the change is minor (no debugging needed), you can change the code and 
then make a binary build to verify it in a non-sandbox environment.

Rong, please go ahead.

> Add a config property for flat table storage format
> ---
>
> Key: KYLIN-3070
> URL: https://issues.apache.org/jira/browse/KYLIN-3070
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Affects Versions: v2.2.0
> Environment: HDP 2.5.6, Kylin 2.2.0
>Reporter: Vsevolod Ostapenko
>Priority: Minor
>  Labels: newbie
>   Original Estimate: 24h
>  Remaining Estimate: 24h


[jira] [Commented] (KYLIN-3070) Add a config property for flat table storage format

2017-11-30 Thread RongH (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273837#comment-16273837
 ] 

RongH commented on KYLIN-3070:
--

I want to do it. Could you assign this issue to me?



[jira] [Commented] (KYLIN-3070) Add a config property for flat table storage format

2017-11-30 Thread Vsevolod Ostapenko (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273081#comment-16273081
 ] 

Vsevolod Ostapenko commented on KYLIN-3070:
---

I don't mind working on this issue, but I'm new to the process and the existing 
instructions for setting up a dev environment seem a bit outdated.
If there is an updated version for a non-sandbox HDP 2.5.x install and Kylin 
2.2.x, I'd like to have it so that I can set up my environment correctly.

> Add a config property for flat table storage format
> ---
>
> Key: KYLIN-3070
> URL: https://issues.apache.org/jira/browse/KYLIN-3070
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.2.0
> Environment: HDP 2.5.6, Kylin 2.2.0
>Reporter: Vsevolod Ostapenko
>Assignee: Dong Li
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h